Commit Graph

371756 Commits

Author SHA1 Message Date
Wang, Pengfei 881b4d20f6 [CodeGen][X86] Remove unused check-prefixes from vector shift tests. NFCI. 2020-11-11 14:19:14 +08:00
Wang, Pengfei 0e582781f3 [CodeGen][X86] Remove unused check-prefixes from vector shuffle tests. NFCI. 2020-11-11 13:48:03 +08:00
Wang, Pengfei f7826b7729 [CodeGen][X86] Remove unused check-prefixes from vector tzcnt tests. NFCI. 2020-11-11 13:48:03 +08:00
Faisal Vali e4d27932a5 [NFC, Refactor] Rename the (scoped) enum DeclaratorContext's enumerators to remove duplication
Since these are scoped enumerators, they have to be prefixed by DeclaratorContext, so lets remove Context from the name, and return some characters to the multiverse.

Patch was reviewed here: https://reviews.llvm.org/D91011

Thank you to aaron, bruno, wyatt and barry for indulging me.
2020-11-10 23:40:12 -06:00
Xun Li b8a8ef3276 [SafeStack] Make sure SafeStack does not break musttail call contract
SafeStack instrumentation should not insert anything inbetween musttail call and return instruction.
For every ReturnInst that needs to be instrumented, we adjust the insertion point to the musttail call if exists.

Differential Revision: https://reviews.llvm.org/D90702
2020-11-10 20:46:05 -08:00
Max Kazantsev 7dcc889917 [SCEV] Generalize no-self-wrap check in isLoopInvariantExitCondDuringFirstIterations
Lift limitation on step being `+/- 1`. In fact, the only thing it is needed for
is proving no-self-wrap. We can instead check this flag directly.

Theoretically it can increase the scope of the transform, but I could not
construct such test easily.

Differential Revision: https://reviews.llvm.org/D91126
Reviewed By: apilipenko
2020-11-11 11:17:13 +07:00
Wang, Pengfei b7067480d2 [CodeGen][X86] Remove unused check-prefixes from vector shuffle tests. NFCI. 2020-11-11 12:00:52 +08:00
Wang, Pengfei 8b87fdb207 [CodeGen][X86] Remove unused check-prefixes from vector shift tests. NFCI. 2020-11-11 12:00:52 +08:00
Wang, Pengfei a28eaafc23 [CodeGen][X86] Remove unused check-prefixes from vector reduce tests. NFCI. 2020-11-11 12:00:52 +08:00
Wang, Pengfei 5f96fd06ac [CodeGen][X86] Remove unused check-prefixes from vector popcnt tests. NFCI. 2020-11-11 12:00:52 +08:00
Roland McGrath cf36142d34 [clang] Add missing header guard in <cpuid.h>
This header has long lacked a standard multiple inclusion guard
like other headers have, for no apparent reason.  The GCC header
of the same name likewise lacks one up through release 10.1, but
trunk GCC (release 11, and perhaps future 10.x) has fixed it
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96238).

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D91226
2020-11-10 19:34:25 -08:00
Kazu Hirata 21fbe2ee68 Revert "[BranchProbabilityInfo] Use SmallVector (NFC)"
This reverts commit 2f1038c7b6.
2020-11-10 19:17:13 -08:00
Wang, Pengfei 659230ff80 [CodeGen][X86] Remove unused check-prefixes from mask tests. NFCI. 2020-11-11 11:09:41 +08:00
Gaurav Jain a0b7129767 [NFC] Use [MC]Register in TwoAddressInstructionPass
Differential Revision: https://reviews.llvm.org/D90902
2020-11-10 19:01:56 -08:00
Akira Hatanaka d9258a21f0 Fix the data layout mangling specification for 'arm64-pc-win32-macho'
rdar://problem/70410504
2020-11-10 18:52:12 -08:00
Chen Zheng 724a0e53de NFC - use script to update testcases and add new testcases. 2020-11-10 21:39:42 -05:00
zoecarver 31dfaff3b3 [libc++] Change requirements on linear_congruential_engine.
This patch changes how linear_congruential_engine picks its randomization
algorithm. It adds two restrictions, `_OverflowOK` and `_SchrageOK`.
`_OverflowOK` means that m is a power of two so using the classic
`(a * x + c) % m` will create a meaningless overflow. The second checks
that Schrage's algorithm will produce results that are in bounds of min
and max. This patch fixes https://llvm.org/PR27839.

Differential Revision: D65041
2020-11-10 18:23:22 -08:00
Chen Zheng 4eb8359e74 [EarlyCSE] delete abs/nabs handling
delete abs/nabs handling in earlycse pass to avoid bugs related to
hashing values. After abs/nabs is canonicalized to intrinsics in D87188,
we should get CSE ability for abs/nabs back.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D90734
2020-11-10 21:10:58 -05:00
Sam Clegg 29a3056bb5 [lld][WebAssembly] Allow references to __tls_base without shared memory
Previously we limited the use of atomics and TLS to programs
linked with `--shared-memory`.

However, as of https://reviews.llvm.org/D79530 we now allow
programs that use atomic to be linked without `--shared-memory`.
For this to be useful we also want to all TLS usage in such
programs.  In this case, since we know we are single threaded
we simply include the TLS data as a regular active segment
and create an immutable `__tls_base` global that point to the
start of this segment.

Fixes: https://github.com/emscripten-core/emscripten/issues/12489

Differential Revision: https://reviews.llvm.org/D91115
2020-11-10 17:58:06 -08:00
Evgenii Stepanov e1eeb026e6 [hwasan] Fix Thread reuse.
HwasanThreadList::DontNeedThread clobbers Thread::next_, breaking the
freelist. As a result, only the top of the freelist ever gets reused,
and the rest of it is lost.

Since the Thread object its associated ring buffer is only 8Kb, this is
typically only noticable in long running processes, such as fuzzers.

Fix the problem by switching from an intrusive linked list to a vector.

Differential Revision: https://reviews.llvm.org/D91208
2020-11-10 17:24:24 -08:00
Wang, Pengfei 07ba0662da [CodeGen][X86] Remove unused check-prefixes from bitcast tests. NFCI. 2020-11-11 09:16:57 +08:00
Vedant Kumar 9922dded47 [test] Delete redundant lldbutil import, NFC 2020-11-10 16:37:47 -08:00
Vedant Kumar ae3640e386 [ThreadPlan] Add a test for `thread step-in -r`, NFC
Adds test coverage for ThreadPlanStepInRange::SetAvoidRegexp.

See:
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/lldb/source/Target/ThreadPlanStepInRange.cpp.html#L309

Differential Revision: https://reviews.llvm.org/D91220
2020-11-10 16:37:47 -08:00
peter klausler b670189975 [flang] Fix CheckSpecificationExpr handling of associated names
Avoid a spurious error message about a dummy procedure reference
in a specification expression by restructuring the handling of
use-associated and host-associated symbols.

Differential revision: https://reviews.llvm.org/D91209
2020-11-10 16:19:13 -08:00
Vedant Kumar 04cd6c6217 [ThreadPlan] Delete unused ThreadPlanStepInRange code, NFC 2020-11-10 16:15:03 -08:00
Vedant Kumar 34d56b05fd [ThreadPlan] Reflow docs to fit the 80 column limit, NFC 2020-11-10 16:14:52 -08:00
Vedant Kumar ba21376883 [Command] Fix accidental word concatenation in Options.td
Split up words that appear to have been accidentally concatenated.

This looks to be exhaustive: to find these in vim, use:

/\v[^ ]"\n +"[^ ]
2020-11-10 16:13:39 -08:00
Peter Collingbourne 0ae2ea8f83 hwasan: Bring back operator {new,delete} interceptors on Android.
It turns out that we can't remove the operator new and delete
interceptors on Android without breaking ABI, so bring them back
as forwards to the malloc and free functions.

Differential Revision: https://reviews.llvm.org/D91219
2020-11-10 16:05:24 -08:00
Lang Hames e7a63df88c [ORC] Add debugging output for ResourceTracker to be used in JITDylib::define. 2020-11-11 10:58:22 +11:00
Richard Smith c6d86b6b45 Properly collect template arguments from a class-scope function template
specialization.

Fixes a crash-on-valid if further template parameters are introduced
within the specialization (by a generic lambda).
2020-11-10 15:55:19 -08:00
Peter Collingbourne c052510c0b gn build: (manually) Port ae032e27 and 21f83113.
__register_frame and __deregister_frame are associated with the
.eh_frame section, which I think is used on all of our platforms
except Windows and 32-bit ARM (which uses the ARM EHABI).

Also add a file that was added to lld/MachO.
2020-11-10 15:50:40 -08:00
Gaurav Jain 3726b14428 [NFC] Use [MC]Register for x86 target
Differential Revision: https://reviews.llvm.org/D91161
2020-11-10 15:49:39 -08:00
Tue Ly d41280467d [libc] Add implementations of fdim[f|l].
Implementing fdim, fdimf, and fdiml for llvm-libc.

Differential Revision: https://reviews.llvm.org/D90906
2020-11-10 18:48:11 -05:00
Stephen Kelly 872633b285 Add utility for testing if we're matching nodes AsIs
Differential Revision: https://reviews.llvm.org/D91144
2020-11-10 23:32:30 +00:00
Kazushi (Jam) Marukawa dd6f607ea8 [VE] Implement FoldImmediate
Implement FoldImmediate for only integer aritihmetic operations.
Add regression tests also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91150
2020-11-11 08:08:32 +09:00
Florian Hahn c8d73d939f Revert "[VPlan] Use VPValue def for VPWidenSelectRecipe."
This reverts commit a8e50f1c6e.

This reportedly breaks building the Linux kernel.
  https://bugs.llvm.org/show_bug.cgi?id=48142
2020-11-10 22:50:46 +00:00
Richard Smith c43f8c7728 Add PrintingPolicy overload to APValue::printPretty. NFC. 2020-11-10 14:48:56 -08:00
Stephen Kelly ed2baaac56 Revert "Add utility for testing if we're matching nodes AsIs"
This reverts commit e73296d3b9.

This may have caused build bot failure.
2020-11-10 22:29:58 +00:00
Zequan Wu 78b48426a2 [llvm-cov] Add a test for c75a0a1e 2020-11-10 14:25:43 -08:00
Akira Hatanaka 874b0a0b9d [CodeGen] Mark calls to objc_autorelease as tail
This enables a method sending an autorelease message to an object and
returning the object in MRR to avoid adding the object to an autorelease
pool if a call to objc_retainAutoreleasedReturnValue in the caller
function accepts the hand off of the retain count.

rdar://problem/50678052

Differential Revision: https://reviews.llvm.org/D91111
2020-11-10 13:48:25 -08:00
Sean Silva 53a0d45db6 [mlir] Add pass to convert elementwise ops to linalg.
This patch converts elementwise ops on tensors to linalg.generic ops
with the same elementwise op in the payload (except rewritten to
operate on scalars, obviously). This is a great form for later fusion to
clean up.

E.g.

```
// Compute: %arg0 + %arg1 - %arg2
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
  %0 = addf %arg0, %arg1 : tensor<?xf32>
  %1 = subf %0, %arg2 : tensor<?xf32>
  return %1 : tensor<?xf32>
}
```

Running this through
`mlir-opt -convert-std-to-linalg -linalg-fusion-for-tensor-ops` we get:

```
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
  %0 = linalg.generic {indexing_maps = [#map0, #map0, #map0, #map0], iterator_types = ["parallel"]} ins(%arg0, %arg1, %arg2 : tensor<?xf32>, tensor<?xf32>, tensor<?xf32>) {
  ^bb0(%arg3: f32, %arg4: f32, %arg5: f32):  // no predecessors
    %1 = addf %arg3, %arg4 : f32
    %2 = subf %1, %arg5 : f32
    linalg.yield %2 : f32
  } -> tensor<?xf32>
  return %0 : tensor<?xf32>
}
```

So the elementwise ops on tensors have nicely collapsed into a single
linalg.generic, which is the form we want for further transformations.

Differential Revision: https://reviews.llvm.org/D90354
2020-11-10 13:44:44 -08:00
Sean Silva b4fa28b408 [mlir] Add ElementwiseMappable trait and apply it to std elementwise ops.
This patch adds an `ElementwiseMappable` trait as discussed in the RFC
here:
https://llvm.discourse.group/t/rfc-std-elementwise-ops-on-tensors/2113/23

This trait can power a number of transformations and analyses.
A subsequent patch adds a convert-elementwise-to-linalg pass exhibits
how this trait allows writing generic transformations.
See https://reviews.llvm.org/D90354 for that patch.

This trait slightly changes some verifier messages, but the diagnostics
are usually about as good. I fiddled with the ordering of the trait in
the .td file trait lists to minimize the changes here.

Differential Revision: https://reviews.llvm.org/D90731
2020-11-10 13:44:44 -08:00
Pirama Arumuga Nainar 8262e94a6d [ARM] Fix PR 47980: Use constrainRegClass during foldImmediate opt.
Previously we used setRegClass to rgpr, which may expand the register
domain if the result was already in a constrained class (tcgpr in the
above PR).

Differential Revision: https://reviews.llvm.org/D91192
2020-11-10 13:38:11 -08:00
Michael Kruse e408935bb5 [Polly][ScopBuilder] Use only modeled instructions to compute statement granularity.
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
To compute independence, non-modeled instructions were used in some
parts of determining instruction independence, which could lead to the
re-introduction of non-model instructions.

In particular, required invariant loads could be added to instruction
list, which then led to redundant MemoryAccesses for such a load.

This fixes llvm.org/PR48059.
2020-11-10 15:30:16 -06:00
Xun Li 19f0770923 [Coroutine][Sema] Cleanup temporaries as early as possible
The original bug was discovered in T75057860. Clang front-end emits an AST that looks like this for an co_await expression:
|- ExprWithCleanups
  |- -CoawaitExpr
    |- -MaterializeTemporaryExpr ... Awaiter
      ...
    |- -CXXMemberCallExpr ... .await_ready
      ...
    |- -CallExpr ... __builtin_coro_resume
      ...
    |- -CXXMemberCallExpr ... .await_resume
      ...

ExprWithCleanups is responsible for cleaning up (including calling dtors) for the temporaries generated in the wrapping expression).
In the above structure, the __builtin_coro_resume part (which corresponds to the code for the suspend case in the co_await with symmetric transfer), the pseudocode looks like this:
  __builtin_coro_resume(
   awaiter.await_suspend(
     from_address(
       __builtin_coro_frame())).address());

One of the temporaries that's generated as part of this code is the coroutine handle returned from awaiter.await_suspend() call. The call returns a handle  which is a prvalue (since it's a returned value on the fly). In order to call the address() method on it, it needs to be converted into an xvalue. Hence a materialized temp is created to hold it. This temp will need to be cleaned up eventually. Now, since all cleanups happen at the end of the entire co_await expression, which is after the <coro.suspend> suspension point, the compiler will think that such a temp needs to live across suspensions, and need to be put on the coroutine frame, even though it's only used temporarily just to call address() method.
Such a phenomena not only unnecessarily increases the frame size, but can lead to ASAN failures, if the coroutine was already destroyed as part of the await_suspend() call. This is because if the coroutine was already destroyed, the frame no longer exists, and one can not store anything into it. But if the temporary object is considered to need to live on the frame, it will be stored into the frame after await_suspend() returns.

A fix attempt was done in https://reviews.llvm.org/D87470. Unfortunately it is incorrect. The reason is that cleanups in Clang works more like linearly than nested. There is one current state indicating whether it needs cleanup, and an ExprWithCleanups resets that state. This means that an ExprWithCleanups must be capable of cleaning up all temporaries created  in the wrapping expression, otherwise there will be dangling temporaries cleaned up at the wrong place.
I eventually found a walk-around (https://reviews.llvm.org/D89066) that doesn't break any existing tests while fixing the issue. But it targets the final co_await only. If we ever have a co_await that's not on the final awaiter and the frame gets destroyed after suspend, we are in trouble. Hence we need a proper fix.

This patch is the proper fix. It does the folllowing things to fully resolve the issue:
1. The AST has to be generated in the order according to their nesting relationship. We should not generate AST out of order because then the code generator would incorrectly track the state of temporaries and when a cleanup is needed. So the code in buildCoawaitCalls is reorganized so that we will be generating the AST for each coawait member call in order along with their child AST.
2. await_ready() call is wrapped with an ExprWithCleanups so that temporaries in it gets cleaned up as early as possible to avoid living across suspension.
3. await_suspend() call is wrapped with an ExprWithCleanups if it's not a symmetric transfer. In the case of a symmetric transfer, in order to maintain the musttail call contract, the ExprWithCleanups is wraaped before the resume call.
4. In the end, we mark again that it needs a cleanup, so that the entire CoawaitExpr will be wrapped with a ExprWithCleanups which will clean up the Awaiter object associated with the await expression.

Differential Revision: https://reviews.llvm.org/D90990
2020-11-10 13:27:42 -08:00
AndreyChurbanov 33da6bd7f5 [OpenMP] Fixes for shared memory cleanup when aborts occur
Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D90974
2020-11-11 00:16:23 +03:00
Stanislav Mekhanoshin 544ef42e40 [AMDGPU] Set default op_sel_hi on accvgpr read/write
These are opsel opcodes with op_sel actually being ignored.
As a such op_sel_hi needs to be set to default 1 even though
these bits are ignored. This is compatibility change.

Differential Revision: https://reviews.llvm.org/D91202
2020-11-10 13:07:29 -08:00
Richard Smith 438a27f2e5 Move code to determine the type of an LValueBase out of ExprConstant and
into a member function on LValueBase. NFC.
2020-11-10 13:03:57 -08:00
Bruno Cardoso Lopes dc14542a71 [Coroutines] Add missing llvm.dbg.declare's to cover for more allocas
Tracking local variables across suspend points is still somewhat incomplete.
Consider this coroutine snippet:

```
resumable foo() {
  int x[10] = {};
  int a = 3;
  co_await std::experimental::suspend_always();
  a++;
  x[0] = 1;
  a += 2;
  x[1] = 2;
  a += 3;
  x[2] = 3;
}
```

Can't manage to print `a` or `x` if they turn out to be allocas during
CoroSplit (which happens if you build this code with `-O0` prior to this
commit):

```
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x0000000100003729 main-noprint`foo() at main-noprint.cpp:43:5
   40     co_await std::experimental::suspend_always();
   41     a++;
   42     x[0] = 1;
-> 43     a += 2;
   44     x[1] = 2;
   45     a += 3;
   46     x[2] = 3;
(lldb) p x
error: <user expression 21>:1:1: use of undeclared identifier 'x'
x
^
```

The generated IR contains a `llvm.dbg.declare` for `x` in it's initialization
basic block. After CoroSplit, the `llvm.dbg.declare` might not dominate all of
`x` uses and we lose debugging quality.

Add `llvm.dbg.value`s to all relevant basic blocks such that if later
transformations break the dominance the reliable debug info is already in
place. For instance, this BB:

```
await.ready:
  ...
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 0, !dbg !760
  ...
  %arrayidx19 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 1, !dbg !763
  ...
  %arrayidx21 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 2, !dbg !766
```

becomes:

```
await.ready:
  ...
  call void @llvm.dbg.value(metadata [10 x i32]* %x.reload.addr, metadata !751, metadata !DIExpression()), !dbg !753
  ...
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 0, !dbg !760
  ...
  %arrayidx19 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 1, !dbg !763
  ...
  %arrayidx21 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 2, !dbg !766
```

Differential Revision: https://reviews.llvm.org/D90772
2020-11-10 12:36:07 -08:00
Sjoerd Meijer 2ef47910d5 [LoopFlatten] Run it earlier, just before IndVarSimplify
This is a prep step for widening induction variables in LoopFlatten if this is
posssible (D90640), to avoid having to perform certain overflow checks. Since
IndVarSimplify may already widen induction variables, we want to run
LoopFlatten just before IndVarSimplify. This is a minor reshuffle as both
passes were already close after each other.

Differential Revision: https://reviews.llvm.org/D90402
2020-11-10 20:22:41 +00:00