Commit Graph

384015 Commits

Author SHA1 Message Date
Tony 850fcedb27 [NFC][AMDGPU] Corrections to AMD GPU initial kernel launch documentation
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D99223
2021-03-26 02:05:45 +00:00
Lang Hames 19e402d2b3 [JITLink][MachO] Use full <segment>,<section> names for MachO jitlink::Sections.
JITLink now requires section names to be unique. In MachO section names are only
guaranteed to be unique within their containing segment (e.g. a '__const' section
in the '__DATA' segment does not clash with a '__const' section in the '__TEXT'
segment), so we need to use the fully qualified <segment>,<section> section
names (e.g. '__DATA,__const' or '__TEXT,__const') when constructing
jitlink::Sections for MachO objects.
2021-03-25 18:31:18 -07:00
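The naming scheme described in 19e402d2b3 can be sketched as follows. This is a minimal illustration, not JITLink's actual code: `qualifiedSectionName` is a hypothetical helper that only shows how a '<segment>,<section>' key keeps two '__const' sections distinct.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of JITLink): build the fully qualified
// "<segment>,<section>" name so that section names stay unique across
// segments.
std::string qualifiedSectionName(const std::string &Segment,
                                 const std::string &Section) {
  assert(!Segment.empty() && !Section.empty() && "both parts are required");
  return Segment + "," + Section;
}
```

Calling it with ("__DATA", "__const") and ("__TEXT", "__const") yields two different keys, which is the property the commit relies on.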
Stella Laurenzo 594e0ba969 [mlir][python] Add docs for op class extension mechanism.
Differential Revision: https://reviews.llvm.org/D99387
2021-03-25 18:27:26 -07:00
Richard Smith 4f3ea27dac Stop this test from dropping a .s file in the current directory. 2021-03-25 18:22:18 -07:00
Richard Smith ed8d76ec60 Explicitly enable the new pass manager in this test.
Otherwise it fails under -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF.
2021-03-25 18:10:36 -07:00
Craig Topper 9b3c0f9a54 [RISCV] Add Zbb+Zbt command lines to the signed saturating add/sub tests.
This will enable cmov to be used for select. I improve the codegen
of select_cc in D99021, but that patch doesn't work for cmov.
2021-03-25 17:25:36 -07:00
Amara Emerson 55533203d7 [GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates.
Differential Revision: https://reviews.llvm.org/D99383
2021-03-25 17:23:30 -07:00
Jessica Paquette 23f657c165 [AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.

This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.

To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.

The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.

This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.

Code size improvements (Darwin):

- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)

Differential Revision: https://reviews.llvm.org/D99358
2021-03-25 17:14:25 -07:00
Richard Smith 11bf268864 Add a target triple to fix test failure on targets that don't support
__int128.
2021-03-25 17:05:36 -07:00
Richard Smith 040c60d9b6 Fix a miscompile introduced by 99203f2.
getPointersDiff would previously round down the difference between two
pointers to a multiple of the element size of the pointee, which could
result in a pointer value being decreased a little.

Alexey Bataev has graciously agreed to add a testcase for this;
submitting the bugfix now to unblock.
2021-03-25 16:53:58 -07:00
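The rounding problem described in 040c60d9b6 can be illustrated with plain integer arithmetic. The sketch below assumes C++17 and is not LLVM's getPointersDiff: `exactElementDiff` is a hypothetical helper, and returning "no answer" for inexact distances is an assumption about one reasonable way to avoid the rounding, not a description of the actual fix.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not LLVM's getPointersDiff: convert a byte distance
// into an element distance only when the conversion is exact.
std::optional<int64_t> exactElementDiff(int64_t ByteDiff, int64_t ElemSize) {
  assert(ElemSize > 0 && "element size must be positive");
  if (ByteDiff % ElemSize != 0)
    return std::nullopt; // Not a whole number of elements; do not round.
  return ByteDiff / ElemSize;
}
```

With a byte distance of 6 and an element size of 4, plain division gives 1 element, which maps back to only 4 bytes; the helper reports no result in that case instead of silently shrinking the distance.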
Rahman Lavaee cf62b6d3b2 Add missing 'CHECK' prefix to basic block labels test.
The `CHECK` prefix was dropped in e0bf234930. This led to all CHECK
lines having no effect.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D99316
2021-03-25 16:41:41 -07:00
Muhammad Omair Javaid c3152536fd [LLDB] Skip TestVSCode_launch.test_progress_events arm/linux
TestVSCode_launch.test_progress_events is mysteriously failing on arm
linux. I am marking it skipped for the buildbot while looking into the
failure.
2021-03-26 04:38:31 +05:00
Fangrui Song ed956554f9 [Triple][Driver] Add muslx32 environment and use /lib/ld-musl-x32.so.1 for -dynamic-linker
Differential Revision: https://reviews.llvm.org/D99308
2021-03-25 16:25:47 -07:00
Yonghong Song 886f9ff531 BPF: add extern func to data sections if specified
This permits an extern function (BTF_KIND_FUNC) to be added
to BTF_KIND_DATASEC if a section name is specified.
For example,

-bash-4.4$ cat t.c
void foo(int) __attribute__((section(".kernel.funcs")));
int test(void) {
  foo(5);
  return 0;
}

The extern function foo (BTF_KIND_FUNC) will be put into
BTF_KIND_DATASEC with name ".kernel.funcs".

This will help to differentiate two kinds of external functions,
functions in kernel and functions defined in other bpf programs.

Differential Revision: https://reviews.llvm.org/D93563
2021-03-25 16:03:29 -07:00
Jingu Kang 3fd64cc7a3 [ValueTracking] Handle two PHIs in isKnownNonEqual()
loop:
  %cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop ]
  %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ]
  ...
  %inc = add i32 %cmp.0, 1
  br label %loop

In the above example, %pos.0 takes the previous iteration's %cmp.0 along the
backedge, per the definition of the PHI instruction. If %inc is not the same
across iterations, we can say the two PHIs are not equal.

Differential Revision: https://reviews.llvm.org/D98422
2021-03-25 22:56:05 +00:00
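The reasoning in 3fd64cc7a3 can be checked by simulating the loop from the commit message. This is only an illustrative sketch: `phisEverEqual` is a hypothetical function that mirrors the two PHIs with ordinary integers, and it says nothing about how isKnownNonEqual() is implemented.

```cpp
#include <cassert>

// Simulate the loop from the commit message for a given number of iterations
// and report whether %cmp.0 and %pos.0 ever hold the same value.
bool phisEverEqual(int Iterations) {
  assert(Iterations >= 0 && "iteration count must be non-negative");
  int Cmp = 3; // %cmp.0 on entry
  int Pos = 1; // %pos.0 on entry
  for (int I = 0; I < Iterations; ++I) {
    if (Cmp == Pos)
      return true;
    int Inc = Cmp + 1; // %inc = add i32 %cmp.0, 1
    Pos = Cmp;         // backedge value of %pos.0 is the old %cmp.0
    Cmp = Inc;         // backedge value of %cmp.0 is %inc
  }
  return Cmp == Pos;
}
```

Because %pos.0 always trails %cmp.0 by one step after the first iteration, the two values stay different for as long as the simulation runs without overflow.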
Jonas Devlieghere bbb419151c [lldb] Add IsFullyInitialized to DynamicLoader
On Darwin based systems, lldb will get notified by dyld before dyld itself has
finished initializing, at which point it's not safe to call certain APIs
or SPIs. Add a method to the DynamicLoader to query that.

Differential revision: https://reviews.llvm.org/D99314
2021-03-25 15:44:37 -07:00
Leonard Chan 36eaeaf728 [llvm][hwasan] Add Fuchsia shadow mapping configuration
Ensure that Fuchsia shadow memory starts at zero.

Differential Revision: https://reviews.llvm.org/D99380
2021-03-25 15:28:59 -07:00
Stella Laurenzo ec294eb87b [mlir][linalg] Add an InitTensorOp python builder.
* This has the API I want but I am not thrilled with the implementation. There are various things that could be improved both about the way that Python builders are mapped and the way the Linalg ops are factored to increase code sharing between C++/Python.
* Landing this as-is since it at least makes the InitTensorOp usable with the right API. Will refactor underneath in follow-ons.

Differential Revision: https://reviews.llvm.org/D99000
2021-03-25 15:17:48 -07:00
Guozhi Wei 3240910f00 [DAE] Adjust param/arg attributes when changing parameter to undef
In the DeadArgumentElimination pass, if a function's argument is never used, the corresponding parameter in the caller can be changed to undef. If the param/arg has the noundef attribute or other related attributes, the LLVM LangRef (https://llvm.org/docs/LangRef.html#parameter-attributes) says the behavior is undefined. SimplifyCFG (D97244) takes advantage of this behavior and performs a bad transformation on valid code.

To avoid this undefined behavior when changing the caller's parameter to undef, this patch removes the noundef attribute, and other attributes that imply noundef, from the param/arg.

Differential Revision: https://reviews.llvm.org/D98899
2021-03-25 14:53:22 -07:00
Philip Reames 4f5e92cc05 Mark gc.relocate and gc.result as readnone (try 2)
As noted in the LangRef, these are semantically readnone projections from the result value of the associated statepoint. However, it turned out we had a few latent bugs being covered up by the fact we were only marking them readonly (see PR49607 for context).

As of this change, all known issues are resolved. This is a deliberately minimal patch to make it easy to test downstream and revert with minimal change if that turns out to be necessary.

Differential Revision: https://reviews.llvm.org/D98729
2021-03-25 14:50:07 -07:00
Philip Reames e7ebb87222 [deref] Handle byval/byref/sret/inalloc/preallocated arguments for deref-at-point semantics
All of these are scoped allocations which remain dereferenceable during the lifetime of the callee.

Differential Revision: https://reviews.llvm.org/D99310
2021-03-25 14:47:31 -07:00
Philip Reames 67e28173f1 Autogen test to account for tool output format change 2021-03-25 14:41:08 -07:00
Philip Reames 88d0f47b4f [test] Add test for hoisting to custom allocation function using allocsize
The first is currently demonstrating a miscompile.
2021-03-25 14:31:51 -07:00
David Stone 4b5baa5b82 Handle 128-bit IntegerLiterals in StmtPrinter
This fixes PR35677: "int128_t or uint128_t as non-type template
parameter causes crash when considering invalid constructor".
2021-03-25 17:27:13 -04:00
Vedant Kumar 414412d3dc [lldb/Commands] Fix spelling of target.move-to-nearest-code in helptext 2021-03-25 14:25:10 -07:00
Matt Morehouse 8e0bb21931 [HWASan] Mention x86_64 aliasing mode in design doc.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D98892
2021-03-25 14:22:20 -07:00
Craig Topper 5797feaa55 [RISCV] Reorder checks in RISCVTTIImpl::getGatherScatterOpCost to avoid calling getMinRVVVectorSizeInBits() when V extension is not enabled.
getMinRVVVectorSizeInBits() asserts if the V extension isn't
enabled. So check that gather/scatter is legal first since it
already contains a check for V extension being enabled. It
also already checks getMinRVVVectorSizeInBits for fixed length
vectors so we don't need a check in getGatherScatterOpCost.
2021-03-25 14:20:47 -07:00
Andrew Savonichev bba25a9cd8 [MCA] Support carry-over instructions for in-order processors
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.

The patch fixes PR49712.

Differential Revision: https://reviews.llvm.org/D99339
2021-03-26 00:06:19 +03:00
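The carry-over behaviour described in bba25a9cd8 amounts to a ceiling division in the simplest case. The sketch below is an assumption-laden illustration, not llvm-mca's code: `issueCycles` is a hypothetical helper, and it assumes the instruction starts issuing at the beginning of a cycle and that no other instruction shares its issue slots.

```cpp
#include <cassert>

// Hypothetical helper, not llvm-mca's implementation: number of cycles needed
// to issue NumMicroOps uops when at most IssueWidth uops issue per cycle.
unsigned issueCycles(unsigned NumMicroOps, unsigned IssueWidth) {
  assert(IssueWidth > 0 && "issue width must be positive");
  // Ceiling division: any leftover uops carry over into one more cycle.
  return (NumMicroOps + IssueWidth - 1) / IssueWidth;
}
```

For example, an instruction with 5 uops on a processor with an IssueWidth of 2 would take 3 cycles to issue under these assumptions.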
Xun Li f490a5969b [OpenMP][InstrProfiling] Fix a missing instr profiling counter
When emitting a function body there needs to be an instr profiling counter emitted. Otherwise instr profiling won't work for this function.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98135
2021-03-25 13:52:36 -07:00
Richard Smith 622f8de4f2 PR49724: Fix deduction of null member pointers.
Previously we created an implicit cast of the wrong kind, which we'd
later fail to constant-evaluate, resulting in deduction failure.
2021-03-25 13:47:22 -07:00
Vy Nguyen dee5787d3e Reland [lld-macho][nfc] minor clean up, follow up to D98559
This reverts commit 77b4230ed9.

New change: Fixed tests on Windows

Differential Revision: https://reviews.llvm.org/D99210
2021-03-25 16:46:37 -04:00
Xun Li c7a39c833a [Coroutine][Clang] Force emit lifetime intrinsics for Coroutines
tl;dr Correct implementation of Coroutines requires having lifetime intrinsics available.

Coroutine functions are functions that can be suspended and resumed later. To do so, data that needs to stay alive after suspension must be put on the heap (i.e. the coroutine frame).
The optimizer is responsible for analyzing each AllocaInst and figuring out whether it should be put on the stack or the frame.
In most cases, for data whose lifetime we are unable to analyze accurately, we can just conservatively put it on the heap.
Unfortunately, there exist a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze that data's lifetime.

To dig into more detail, there exist cases where, at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access is allowed beyond that point.
The following is a common code pattern called "Symmetric Transfer" in coroutines:
```
auto tmp = await_suspend();
__builtin_coro_resume(tmp.address());
return;
```
In the above code example, `await_suspend()` returns a new coroutine handle, whose address we obtain and then use to resume that coroutine. This essentially "transfers" control from the current coroutine to a different coroutine.
During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards.
However, when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on `tmp`. The `address` function is a member function of the coroutine handle, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases the `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler think that `tmp` might escape.
To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived.
I made an attempt in D98638, but it appears to be way too complex and basically does the same thing as inserting lifetime intrinsics in coroutines.

Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893

Differential Revision: https://reviews.llvm.org/D99227
2021-03-25 13:46:20 -07:00
Nico Weber a60ffee3f4 Revert "[InlineCost] Enable the cost benefit analysis on FDO"
This reverts commit ef69aa961d.
Makes clang assert in PGO builds, see repro tgz in
https://bugs.chromium.org/p/chromium/issues/detail?id=1192783#c6
2021-03-25 16:42:19 -04:00
Leonard Chan 1abaadb30d [clang][driver] Support HWASan in the Fuchsia toolchain
These contain clang driver changes for supporting HWASan on Fuchsia.
This includes hwasan multilibs and the dylib path change.

Differential Revision: https://reviews.llvm.org/D99361
2021-03-25 13:36:23 -07:00
Roman Lebedev 1c55dcbca7 [NFCI][SimplifyCFG] Don't pay for a Small{Map,Set}Vector when plain SmallSet will suffice
This *only* changes the cases where we *really* don't care
about the iteration order of the underlying container,
namely when we will use the values from it to form DTU updates.
2021-03-25 23:25:40 +03:00
Nikita Popov 93a636d9f6 [IR] Lift attribute handling for assume bundles into CallBase
Rather than special-casing assume in BasicAA getModRefBehavior(),
do this one level higher, in the attribute handling of CallBase.

For assumes with operand bundles, the inaccessiblememonly attribute
applies regardless of operand bundles.
2021-03-25 21:15:39 +01:00
Sanjay Patel ad8010e598 [PowerPC] auto-generate complete testchecks; NFC
The full checks demonstrate a problem that comes up in:
https://llvm.org/PR49610
2021-03-25 15:52:39 -04:00
peter klausler d811c829af [flang] fix spurious runtime crash on TRIM('')
The standard interoperability routine CFI_establish() does not
accept a zero-length CHARACTER type.  Since these can be valid
results of intrinsic function references, work around the design
of CFI_establish() in the wrapper routine that calls it.

Differential Revision: https://reviews.llvm.org/D99296
2021-03-25 12:36:50 -07:00
Markus Böck c6047101ad [Support][Windows] Make sure only executables are found by sys::findProgramByName
The function utilizes Windows' SearchPathW function, which, as I found out today, may also return directories. After looking at the Unix implementation of the file, I found that it contains a check whether the found path is also executable. While fixing the Windows implementation, I also learned that sys::fs::access returns successfully when querying whether directories are executable, which the Unix version does not.

This patch makes both of these functions equivalent to their Unix implementations and ensures that any path returned by sys::findProgramByName on Windows is executable, just like the Unix implementation.

The Unix counterparts of the additions I have made to the Windows implementation are here:
sys::findProgramByName: 39ecfe6143/llvm/lib/Support/Unix/Program.inc (L90)
sys::fs::access: c2a84771bb/llvm/lib/Support/Unix/Path.inc (L608)

I encountered this issue when running the LLVM testsuite. Commands of the form not test ... would fail to correctly execute test.exe, which is part of GnuWin32, as it actually tried to execute a folder called test, which happened to be in a directory on my PATH.

Differential Revision: https://reviews.llvm.org/D99357
2021-03-25 20:29:43 +01:00
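The kind of filtering described in c6047101ad can be sketched with the C++17 standard library. This is not the code from the patch and does not use LLVM's sys::fs API: `isRunnableCandidate` is a hypothetical helper that only rejects directories and missing paths, and it deliberately leaves out any execute-permission check.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

// Hypothetical check, not the code from the patch: accept a search result
// only if it names an existing regular file, so directories are rejected.
bool isRunnableCandidate(const std::filesystem::path &Candidate) {
  assert(!Candidate.empty() && "candidate path must not be empty");
  std::error_code EC;
  bool IsFile = std::filesystem::is_regular_file(Candidate, EC);
  return !EC && IsFile;
}
```

A directory named test on the search path would be rejected by such a check, so the lookup could continue on to a real test.exe.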
Mircea Trofin 20ad206b60 [NFC] Module::getInstructionCount() is const 2021-03-25 12:29:19 -07:00
Yaxun (Sam) Liu cc9477166a [CUDA][HIP] add __builtin_get_device_side_mangled_name
Add builtin function __builtin_get_device_side_mangled_name
to get device side mangled name for functions and global
variables, which can be used to get symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D99301
2021-03-25 15:25:29 -04:00
Stanislav Mekhanoshin dc928e9c37 [AMDGPU] Refactoring mfma intrinsic definitions. NFC.
Differential Revision: https://reviews.llvm.org/D99366
2021-03-25 12:22:52 -07:00
Vy Nguyen e2f34cc330 [lld-macho][nfc] Removed unnecessary static_cast
Differential Revision: https://reviews.llvm.org/D99365
2021-03-25 15:07:46 -04:00
Andrzej Warzynski fcf629d76a [flang][driver] Fix typos and inconsistent comments (nfc) 2021-03-25 19:01:40 +00:00
Krzysztof Parzyszek a5b7d38c57 [Hexagon] Limit virtual register reuse range in FI elimination 2021-03-25 13:59:36 -05:00
Jez Ng 0113cf00b6 [lld-macho] Add support for --threads
Code and test are largely identical to the LLD-ELF equivalents.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99312
2021-03-25 14:51:31 -04:00
Jez Ng 4bcaafeb0e [lld-macho] Add more TimeTraceScopes
I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome://tracing`:

ef5e8234f3/tracing.png

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D99311
2021-03-25 14:51:31 -04:00
Jez Ng 53fd1ada76 [lld-macho] Fix typo in diagnostic message 2021-03-25 14:51:31 -04:00
Lang Hames 7d1c503080 [JITLink][MachO/x86-64] Remove stale commented-out code.
This commented-out code was accidentally left in during the transition from
MachO-specific to generic x86-64 edge kinds (ecf6466f01).
2021-03-25 11:47:24 -07:00
Mehdi Amini fcdf142ed5 Remove unused function, fix warning (NFC)
The `mayNotHaveTerminator` was initially on Block but moved to the
verifier before landing and wasn't removed from its original place
where it is unused.
2021-03-25 18:37:57 +00:00