Commit Graph

141440 Commits

Author SHA1 Message Date
Liu, Chen3 776f92e067 [X86] Add support for vex, vex2, vex3, and evex for MASM
For MASM syntax, the prefixes are not enclosed in braces.
The assembly code should like:
  "evex vcvtps2pd xmm0, xmm1"

Differential Revision: https://reviews.llvm.org/D90441
2020-11-20 16:20:19 +08:00
Georgii Rymar 9a99d23a1b [lib/Object] - Generalize the RelocationResolver API.
This allows to reuse the RelocationResolver from the code
that doesn't want to deal with `RelocationRef` class.

I am going to use it in llvm-readobj. See the description
of D91530 for more details.

Differential revision: https://reviews.llvm.org/D91533
2020-11-20 10:32:49 +03:00
Arthur Eubanks b77436047a [PGO] Make -disable-preinline work with NPM
Fixes cspgo_profile_summary.ll under NPM.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D91826
2020-11-19 22:58:55 -08:00
Kazu Hirata 2583d8eb08 [CodeGen] Use llvm::is_contained (NFC) 2020-11-19 22:07:56 -08:00
Bill Wendling b2f6630739 [PowerPC] Allow a '%' prefix for registers in CFI directives
Clang generates a '%' prefix for some registers in CFI directives. E.g.
".cfi_register lr, r12" becomes ".cfi_register lr, %r12" after
processing.

Differential Revision: https://reviews.llvm.org/D91735
2020-11-19 18:19:51 -08:00
Arthur Eubanks 513d165b80 Port -lower-matrix-intrinsics-minimal to NPM
This reuses the existing lower-matrix-intrinsics pass rather than going
the legacy pass route of creating a new pass.

Use this new variant in the NPM -O0 pipeline.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91811
2020-11-19 17:42:48 -08:00
Florian Hahn 7fa14a7c69
[ConstraintElimination] Decompose GEP with arbitrary offsets.
This patch decomposes `GEP %x, %offset` as  0 + 1 * %x + 1 * %off.
2020-11-19 22:49:21 +00:00
Geoffrey Martin-Noble b156514f8d Remove unused private fields
Unused since https://reviews.llvm.org/D91762 and triggering
-Wunused-private-field

```
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:365:13: error: private field 'GetArgTLS' is not used [-Werror,-Wunused-private-field]
  Constant *GetArgTLS;
            ^
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:366:13: error: private field 'GetRetvalTLS' is not used [-Werror,-Wunused-private-field]
  Constant *GetRetvalTLS;
```

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D91820
2020-11-19 13:54:54 -08:00
Roman Lebedev a91e96702a
[InstCombine] Fold `and(shl(zext(x), width(SIGNMASK) - width(%x)), SIGNMASK)` to `and(sext(%x), SIGNMASK)`
One less instruction and reducing use count of zext.
As alive2 confirms, we're fine with all the weird combinations of
undef elts in constants, but unless the shift amount was undef
for a lane, we must sanitize undef mask to zero, since sign bits
are no longer zeros.

https://rise4fun.com/Alive/d7r
```
----------------------------------------
Optimization: zz
Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2))
  %o0 = zext %x
  %o1 = shl %o0, C1
  %r = and %o1, C2
=>
  %n0 = sext %x
  %r = and %n0, C2

Done: 2016
Optimization is correct!
```
2020-11-20 00:31:27 +03:00
Nikita Popov e8dc6e9a32 [MemLoc] Use hasValue() method more (NFC)
Followup to 7de7c40898. I previously
removed a number of == comparisons to LocationSize::unknown(), but
missed these != comparisons.
2020-11-19 22:29:44 +01:00
Jianzhou Zhao 6c1c308c0e Remove deadcode from DFSanFunction::get*TLS*()
clean more deadcode after D84704

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D91762
2020-11-19 21:10:37 +00:00
Nikita Popov 7de7c40898 [MemLoc] Use hasValue() method (NFC)
Instead of comparing to LocationSize::unknown(), prefer calling
the hasValue() method instead, which is less reliant on
implementation details.
2020-11-19 21:53:50 +01:00
Nikita Popov 393b9e9db3 [MemLoc] Require LocationSize argument (NFC)
When constructing a MemoryLocation by hand, require that a
LocationSize is explicitly specified. D91649 will split up
LocationSize::unknown() into two different states, and callers
should make an explicit choice regarding the kind of MemoryLocation
they want to have.
2020-11-19 21:45:52 +01:00
Artur Pilipenko 887c7660bd [BasicAA] Deoptimize intrinsics don't modify memory
Similarly to assumes and guards deoptimize intrinsics are
marked as writing to ensure proper control dependencies
but they never modify any particular memory location.

Differential Revision: https://reviews.llvm.org/D91658
2020-11-19 12:08:33 -08:00
Nikita Popov 22ec72f803 [Lint] Use MemoryLocation
Instead of separately passing pointer and size, make use of
MemoryLocation. This allows us to also reuse all the existing
logic for determining the MemoryLocation correponding to an
instruction or call argument.

Not quite NFC because used locations may be more precise in some
cases.
2020-11-19 20:55:25 +01:00
Arthur Eubanks 72badbcdcc [NPM] Move more O0 pass building into PassBuilder
This moves handling of alwaysinline, coroutines, matrix lowering, PGO,
and LTO-required passes into PassBuilder. Much of this is replicated
between Clang and opt. Other out-of-tree users also replicate some of
this, such as Rust [1] replicating the alwaysinline, LTO, and PGO
passes.

The LTO passes are also now run in
build(Thin)LTOPreLinkDefaultPipeline() since they are semantically
required for (Thin)LTO.

[1]: f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D91585
2020-11-19 11:22:23 -08:00
Jorge Gorbe Moya 314a0d73a8 Fix crash after looking up dwo_id=0 in CU index.
In the current state, if getFromHash(0) is called and there's no CU with
dwo_id=0, the lookup will stop at an empty slot, then the check
`Rows[H].getSignature() != S` won't cause the lookup to fail and return
a nullptr (as it should), because the empty slot has a 0 in the
signature field, and a pointer to the empty slot will be incorrectly
returned.

This patch fixes this by using the index field in the hash entry to
check for empty slots: signature = 0 can match a valid hash but
according to the spec the index for an occupied slot will always be
non-zero.

Differential Revision: https://reviews.llvm.org/D91670
2020-11-19 11:15:01 -08:00
Leonard Chan a97f62837f [llvm][IR] Add dso_local_equivalent Constant
The `dso_local_equivalent` constant is a wrapper for functions that represents a
value which is functionally equivalent to the global passed to this. That is, if
this accepts a function, calling this constant should have the same effects as
calling the function directly. This could be a direct reference to the function,
the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the
resolved function as a call target.

When lowered, the returned address must have a constant offset at link time from
some other symbol defined within the same binary. The address of this value is
also insignificant. The name is leveraged from `dso_local` where use of a function
or variable is resolved to a symbol in the same linkage unit.

In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take
  advantage of `dso_local_equivalent` for relative references

This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959)
which makes vtables readonly. This works by replacing the dynamic relocations for
function pointers in them with static relocations that represent the offset between
the vtable and virtual functions. If a function is externally defined,
`dso_local_equivalent` can be used as a generic wrapper for the function to still
allow for this static offset calculation to be done.

See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.

Differential Revision: https://reviews.llvm.org/D77248
2020-11-19 10:26:17 -08:00
Fraser Cormack 1ac9b54831 [RISCV] Lower GREVI and GORCI as custom nodes
This moves the recognition of GREVI and GORCI from TableGen patterns
into a DAGCombine. This is done primarily to match "deeper" patterns in
the future, like (grevi (grevi x, 1) 2) -> (grevi x, 3).

TableGen is not best suited to matching patterns such as these as the compile
time of the DAG matchers quickly gets out of hand due to the expansion of
commutative permutations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D91259
2020-11-19 18:11:42 +00:00
Adhemerval Zanella 807320119f [AArch64] Lower fptrunc/fpext from/to FP128t to/from FP16
The compiler-rt part which adds the emitted symbols is handled in
a subsequent patch.

Differential Revision: https://reviews.llvm.org/D91731
2020-11-19 15:14:50 -03:00
Sander de Smalen 41c9f4c1ce [LoopVectorize] NFC: Fix unused variable warning for MaxSafeDepDist
rGf571fe6df585127d8b045f8e8f5b4e59da9bbb73 led to a warning of an unused
variable for MaxSafeDepDist (written but not used). It seems this
variable and assignment can be safely removed.
2020-11-19 17:41:35 +00:00
Sam Tebbs 8ecb015ed5 [ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition
This converts the intermediate VPR use assertion to a condition in the if-statement to protect against assertion failures in case behaviuour is changed.

This is a follow-up to https://reviews.llvm.org/D90935 and implements the post-approval comments.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91790
2020-11-19 17:15:45 +00:00
Joseph Huber da8bec47ab [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging
Summary:
Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values.

Reviewers: jdoerfert grokos

Differential Revision: https://reviews.llvm.org/D87946
2020-11-19 12:01:53 -05:00
diggerlin ab77fa5155 [AIX][XCOFF][Patch2] decode vector information and extent long table of the traceback table of the xcoff.
SUMMARY:

1. decode the Vector extension if has_vec is set
2. decode long table fields, if longtbtable is set.

There is conflict on the bit order of HasVectorInfoMask and HasExtensionTableMask between AIX os header and IBM aix compiler XLC.
In the /usr/include/sys/debug.h defines
static constexpr uint32_t HasVectorInfoMask = 0x0040'0000;
static constexpr uint32_t HasExtensionTableMask = 0x0080'0000;
but the XLC defines as

static constexpr uint32_t HasVectorInfoMask = 0x0080'0000;
static constexpr uint32_t HasExtensionTableMask = 0x0040'0000;
we follows the definition of the IBM AIX compiler XLC here.

Reviewer: Jason Liu

Differential Revision: https://reviews.llvm.org/D86461
2020-11-19 10:23:43 -05:00
Simon Pilgrim fceaff41d6 [ValueTracking] computeKnownBitsFromShiftOperator - move shift amount analysis to top of the function. NFCI.
These are all lightweight to compute and helps avoid issues with Known being used to hold both the shift amount and then the shifted result.

Minor cleanup for D90479.
2020-11-19 13:50:49 +00:00
David Green 006b3bdedd [ARM] Deliberately prevent inline asm in low overhead loops. NFC
This was already something that was handled by one of the "else"
branches in maybeLoweredToCall, so this patch is an NFC but makes it
explicit and adds a test. We may in the future want to support this
under certain situations but for the moment just don't try and create
low overhead loops with inline asm in them.

Differential Revision: https://reviews.llvm.org/D91257
2020-11-19 13:28:21 +00:00
Simon Pilgrim 14ae02fb33 [X86][AVX] Only share broadcasts of different widths from the same SDValue of the same SDNode (PR48215)
D57663 allowed us to reuse broadcasts of the same scalar value by extracting low subvectors from the widest type.

Unfortunately we weren't ensuring the broadcasts were from the same SDValue, just the same SDNode - which failed on multiple-value nodes like ISD::SDIVREM

FYI: I intend to request this be merged into the 11.x release branch.

Differential Revision: https://reviews.llvm.org/D91709
2020-11-19 12:15:18 +00:00
Simon Moll a1de391dae [LV][NFC-ish] Allow vector widths over 256 elements
The assertion that vector widths are <= 256 elements was hard wired in the LV code. Eg, VE allows for vectors up to 512 elements. Test again the TTI vector register bit width instead - this is an NFC for non-asserting builds.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D91518
2020-11-19 10:58:29 +01:00
Florian Hahn 1983acce7c
[SelDAGBuilder] Do not require simple VTs for constraints.
In some cases, the values passed to `asm sideeffect` calls cannot be
mapped directly to simple MVTs. Currently, we crash in the backend if
that happens. An example can be found in the @test_vector_too_large_r_m
test case, where we pass <9 x float> vectors. In practice, this can
happen in cases like the simple C example below.

using vec = float __attribute__((ext_vector_type(9)));
void f1 (vec m) {
  asm volatile("" : "+r,m"(m) : : "memory");
}

One case that use "+r,m" constraints for arbitrary data types in
practice is google-benchmark's DoNotOptimize.

This patch updates visitInlineAsm so that it use MVT::Other for
constraints with complex VTs. It looks like the rest of the backend
correctly deals with that and properly legalizes the type.

And we still report an error if there are no registers to satisfy the
constraint.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D91710
2020-11-19 09:31:54 +00:00
Max Kazantsev 515105f46b [NFC] Remove comment (commited ahead of time by mistake) 2020-11-19 16:28:34 +07:00
Max Kazantsev 7c601d09a7 [NFC] Move code earlier as preparation for further changes 2020-11-19 16:27:23 +07:00
Simon Moll ffe6c97f6b [VE] VEC_BROADCAST, lowering and isel
This defines the vec_broadcast SDNode along with lowering and isel code.
We also remove unused type mappings for the vector register classes (all vector MVTs that are not used in the ISA go).

We will implement support for short vectors later by intercepting nodes with illegal vector EVTs before LLVM has had a chance to widen them.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D91646
2020-11-19 09:44:56 +01:00
Sam Clegg 1827005cfc [WebAssembly] Add support for named globals in the object format.
Differential Revision: https://reviews.llvm.org/D91769
2020-11-19 00:17:22 -08:00
Andrew Wei ea7ab5a42c [IndVarSimplify] Notify top most loop to drop cached exit counts
Some nested loops may share the same ExitingBB, so after we finishing FoldExit,
we need to notify OuterLoop and SCEV to drop any stored trip count.

Patched by: guopeilin
Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D91325
2020-11-19 15:37:54 +08:00
Mircea Trofin 8ab2353a4c [NFC][TFUtils] also include output specs lookup logic in loadOutputSpecs
The lookup logic is also reusable.

Also refactored the API to return the loaded vector - this makes it more
clear what state it is in in the case of error (as it won't be
returned).

Differential Revision: https://reviews.llvm.org/D91759
2020-11-18 21:20:21 -08:00
Kazu Hirata 43c0e4f665 [Transforms] Use llvm::is_contained (NFC) 2020-11-18 20:42:22 -08:00
Mircea Trofin b51e844f7a [NFC][TFUtils] Extract out the output spec loader
It's generic for the 'development mode', not specific to the inliner
case.

Differential Revision: https://reviews.llvm.org/D91751
2020-11-18 20:03:20 -08:00
Craig Topper 6b0fc1f3c1 [RISCV] Add MemOperand to the instruction created by storeRegToStackSlot/loadRegFromStackSlot
Differential Revision: https://reviews.llvm.org/D91730
2020-11-18 19:20:03 -08:00
Duncan P. N. Exon Smith 90966daac3 Support: Avoid SmallVector::assign with a range from to-be-replaced vector in Windows GetExecutableName
This code wasn't valid, and 5abf76fbe3
started asserting. This is a speculative fix since I don't have a
Windows machine handy.
2020-11-18 17:55:49 -08:00
Duncan P. N. Exon Smith 5abf76fbe3 ADT: Add assertions to SmallVector::insert, etc., for reference invalidation
2c196bbc6b asserted that
`SmallVector::push_back` doesn't invalidate the parameter when it needs
to grow. Do the same for `resize`, `append`, `assign`, `insert`, and
`emplace_back`.

Differential Revision: https://reviews.llvm.org/D91744
2020-11-18 17:36:28 -08:00
snek 803af31e5b [WebAssembly] Support fp reg class in r constraint
Patch by snek

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D90978
2020-11-18 17:05:58 -08:00
Arthur Eubanks 67f16e9e91 [NPM] Remove -enable-npm-optnone flag
It has been on by default for a couple months without complaint.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91743
2020-11-18 15:49:16 -08:00
Scott Linder 2980933d85 [YAMLIO] Support non-null-terminated inputs
In some places the parser guards against dereferencing `End`, while in
others it relies on the presence of a trailing `'\0'` to elide checks.

Add the remaining guards needed to ensure the parser never attempts to
dereference `End`, making it safe to not require a null-terminated input
buffer.

Update the parser fuzzer harness so that it tests with buffers that are
guaranteed to be non-null-terminated, null-terminated, and 1-terminated,
additionally ensuring the result of the parse is the same in each case.

Some of the regression tests were written by inspection, and some are
cases caught by the fuzzer which required additional fixes in the
parser.

Differential Revision: https://reviews.llvm.org/D84050
2020-11-18 23:06:03 +00:00
Kazushi (Jam) Marukawa 132d6d73ea [VE] Add vmv intrinsic instructions
Add vmv intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91700
2020-11-19 08:05:35 +09:00
Hsiangkai Wang 44cd03ad04 [RISCV] Use register class VR for V instruction operands directly.
@tangxingxin1008 found a bug that regard vadd.vv v1, v3, a0 as a valid V
instruction. We should remove the VRegAsmOperand operand class and use
VR register class directly.

Patched by: tangxingxin1008, Hsiangkai
Differential Revision: https://reviews.llvm.org/D91712
2020-11-19 05:59:46 +08:00
Fangrui Song 96d40df71e MCExpr::evaluateAsRelocatableImpl : allow evaluation of non-VK_None MCSymbolRefExpr when MCAsmLayout is available
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4acf8c78e659833be8be047ba2f8561386a11d4b
(1994) introduced this behavior:
if a fixup symbol is equated to an expression with an undefined symbol, convert
the fixup to be against the target symbol. glibc relies on this behavior to perform
assembly level indirection

```
asm("memcpy = __GI_memcpy"); // from sysdeps/generic/symbol-hacks.h

...
  // call memcpy@PLT
  // The relocation references __GI_memcpy in GNU as, but memcpy in MC (without the patch)
  memcpy (...);
```

(1) It complements `extern __typeof(memcpy) memcpy asm("__GI_memcpy");` The frontend asm label does not redirect synthesized memcpy in the middle-end. (See D88712 for details)
(2) `asm("memcpy = __GI_memcpy");` is in every translation unit, but the memcpy declaration may not be visible in the translation unit where memcpy is synthesized.

MC already redirects `memcpy = __GI_memcpy; call memcpy` but not `memcpy = __GI_memcpy; call memcpy@plt`.
This patch fixes the latter by allowing MCExpr::evaluateAsRelocatableImpl to
evaluate a non-VK_None MCSymbolRefExpr, which is only done after the layout is available.

GNU as allows `memcpy = __GI_memcpy+1; call memcpy@PLT` which seems nonsensical, so we don't allow it.

`MC/PowerPC/pr38945.s` `NUMBER = 0x6ffffff9; cmpwi 8,NUMBER@l` requires the
`symbol@l` form in AsmMatcher, so evaluation needs to be deferred. This is the
place whether future simplification may be possible.

Note, if we suppress the VM_None evaluation when MCAsmLayout is nullptr, we may
lose the `invalid reassignment of non-absolute variable` diagnostic
(`ARM/thumb_set-diagnostics.s` and `MC/AsmParser/variables-invalid.s`).
We know that this diagnostic is troublesome in some cases
(https://github.com/ClangBuiltLinux/linux/issues/1008), so we can consider
making simplification in the future.

Reviewed By: jyknight

Differential Revision: https://reviews.llvm.org/D88625
2020-11-18 13:52:33 -08:00
Jamie Schmeiser cff479b145 Revert "Revert "Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug."""
This reverts commit e29292969b.

This apparently causes a regression in compile time (ie, it slows down).
2020-11-18 16:07:16 -05:00
Baptiste Saleil 18db29ea6f [PowerPC] Add peephole to remove redundant accumulator prime/unprime instructions
In some situations, the compiler may insert an accumulator prime instruction and
an accumulator unprime instruction with no use of that accumulator between the two.
That's for example the case when we store an accumulator after assembling it or
restoring it. This patch adds a peephole to remove these prime and unprime instructions.

Differential Revision: https://reviews.llvm.org/D91386
2020-11-18 15:01:07 -06:00
Roman Lebedev 7bf89c2174
[NFC][Reassociate] Delay checking isLoadCombineCandidate() until after ShouldConvertOrWithNoCommonBitsToAdd() but before haveNoCommonBitsSet()
This appears to improve -O3 compile-time performance somewhat:
https://llvm-compile-time-tracker.com/compare.php?from=87369c626114ae17f4c637635c119e6de0856a9a&to=c04b8271e1609b0dfb20609b40844b0c4324517e&stat=instructions
It doesn't look like delaying it until after haveNoCommonBitsSet() is better:
https://llvm-compile-time-tracker.com/compare.php?from=c04b8271e1609b0dfb20609b40844b0c4324517e&to=b2943d450eaf41b5f76d2dc7350f0a279f64cd99&stat=instructions
2020-11-18 23:57:12 +03:00
Nikita Popov cd3c22c47e [BasicAA] Generalize base offset modulus handling
The GEP aliasing implementation currently has two pieces of code
that solve two different subsets of the same basic problem: If you
have GEPs with offsets 4*x + 0 and 4*y + 1 (assuming access size 1),
then they do not alias regardless of whether x and y are the same.

One implementation is in aliasSameBasePointerGEPs(), which looks at
this in a limited structural way. It requires both GEP base pointers
to be exactly the same, then (optionally) a number of equal indexes,
then an unknown index, then a non-equal index into a struct. This
set of limitations works, but it's overly restrictive and hides the
core property we're trying to exploit.

The second implementation is part of aliasGEP() itself and tries to
find a common modulus in the scales, so it can then check that the
constant offset doesn't overlap under modular arithmetic. The second
implementation has the right idea of what the general problem is,
but effectively only considers power of two factors in the scales
(while aliasSameBasePointerGEPs also works with non-pow2 struct sizes.)

What this patch does is to adjust the aliasGEP() implementation to
instead find the largest common factor in all the scales (i.e. the GCD)
and use that as the modulus.

Differential Revision: https://reviews.llvm.org/D91027
2020-11-18 21:48:49 +01:00
Jamie Schmeiser e29292969b Revert "Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.""
This reverts commit 562addba65.

Reverted change too quickly, the failing test cases passed on the next build.
So reverting revert (to include the changes).
2020-11-18 15:33:02 -05:00
Florian Hahn 2fead1ac61
[ConstraintElimination] Decompose add nuw/sub nuw.
Make use of the more flexible constraint handling added in
a8a79c9069 to decompose add nuw/sub nuw.
2020-11-18 20:29:30 +00:00
Joseph Huber 97e55cfef5 [OpenMP] Add Passing in Original Declaration Names To Mapper API
Summary:
This patch adds support for passing in the original delcaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information in passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;"

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D89802
2020-11-18 15:28:39 -05:00
Nikita Popov f4a3969bff [Inline] Fix incorrectly dropped noalias metadata
This is the same fix as 23aeadb89d,
just for CloneScopedAliasMetadata rather than PropagateCallSiteMetadata.

In this case the previous outcome was incorrectly dropped metadata,
as it was not part of the computed metadata map.

The real change in the test is that the first load now retains
metadata, the rest of the changes are due to changes in metadata
numbering.
2020-11-18 21:22:50 +01:00
Jamie Schmeiser 562addba65 Revert "Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug."
This reverts commit d4ba28bddc.
2020-11-18 15:17:53 -05:00
Nikita Popov 23aeadb89d [Inline] Fix incorrect noalias metadata application (PR48209)
The VMap also contains a mapping from Argument => Instruction,
where the instruction is part of the original function, not the
inlined one. The code was assuming that all the instructions in
the VMap were inlined.

This was a pre-existing problem for the loop access metadata, but
was extended to the more common noalias metadata by
27f647d117, thus causing miscompiles.

There is a similar assumption inside CloneAliasScopeMetadata(), so
that one likely needs to be fixed as well.
2020-11-18 20:52:58 +01:00
Jamie Schmeiser d4ba28bddc Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.
Summary:
Expand existing loopsink testing to also test loopsinking using new pass
manager.  Enable memoryssa for loopsink with new pass manager.  This
combination exposed a bug that was previously fixed for loopsink
without memoryssa.  When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation.  This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.

Respond to review comments.  Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control.  Update tests accordingly.

Respond to review comments.  Add options controlling whether memoryssa is
used for loop sink, defaulting to off.  Expand testing based on these
options.

Respond to review comments.  Properly indicated preserved analyses.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
2020-11-18 14:08:42 -05:00
Nikita Popov 85ccdcaa50 [BasicAA] Remove assert in AA evaluator
As reported in https://reviews.llvm.org/D91383#2401825, this
assert breaks external -aa-eval tests. We'll have to fix this
case before re-enabling it.
2020-11-18 20:04:38 +01:00
serge-sans-paille 733f7b5084 Revert "[build] normalize components dependencies"
This reverts commit c6ef6e1690.

Basically, publicly linked libraries have a different semantic than components,
which link libraries privately.

Differential Revision: https://reviews.llvm.org/D91461
2020-11-18 19:23:11 +01:00
Sebastian Neubauer 72ccec1bbc [AMDGPU] Fix v3f16 interaction with image store workaround
In some cases, the wrong amount of registers was reserved.

Also enable more v3f16 tests.

Differential Revision: https://reviews.llvm.org/D90847
2020-11-18 18:21:04 +01:00
Simon Pilgrim 480ad4afc2 HazardRecognizer - Fix definition/declaration argument name mismatches. NFCI.
Consistently use SUnit *SU (or drop the argname entirely if not used like the other HazardRecognizer methods).

Silences cppcheck warnings.
2020-11-18 16:50:52 +00:00
Gaurav Jain 06fcc4f06f [NFC] Use [MC]Register for Hexagon target
Differential Revision: https://reviews.llvm.org/D91160
2020-11-18 08:17:07 -08:00
Piotr Sobczak b3b9be4ae7 SpeculativeExecution: Allow speculating more instruction types
Support more instructions in SpeculativeExecution pass:
- ExtractValue
- InsertValue
- Trunc
- Freeze

Differential Revision: https://reviews.llvm.org/D91688
2020-11-18 17:00:19 +01:00
Jay Foad 7ecf19697e [AMDGPU] Fix and extend vccz workarounds
We have workarounds for two different cases where vccz can get out of
sync with the value in vcc. This fixes them in two ways:

1. Fix the case where the def of vcc was in a previous basic block, by
pessimistically assuming that vccz might be incorrect at a basic block
boundary.

2. Fix the handling of pre-existing waitcnt instructions by calling
generateWaitcntInstBefore before examining ScoreBrackets to determine
whether there's an outstanding smem read operation.

Differential Revision: https://reviews.llvm.org/D91636
2020-11-18 15:26:06 +00:00
Roman Lebedev 34ff90ad5d
[Reassociate] Don't convert add-like-or's into add's if they appear to be part of load-combining idiom
As Wei Mi is reporting in post-commit review
  https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20201116/853479.html
teaching -reassociate about add-like-or's (70472f3) results in breaking apart
load widening patterns, and reassociating them.

For now, simply exclude any such `or` that appears to be a root of
load widening idiom from the or->add transformation.

Note that the heuristic is greedy, it doesn't ensure that loads
can *actually* be widened into a single load.
2020-11-18 17:55:02 +03:00
Mikhail Goncharov f45c052c9e Fix unused variables in release build
Differential Revision: https://reviews.llvm.org/D91705
2020-11-18 15:18:31 +01:00
Jay Foad 9f69c1bc54 [AMDGPU] Rename pseudo S_WAITCNT_IDLE to S_WAIT_IDLE. NFC. 2020-11-18 14:03:43 +00:00
Florian Hahn a8a79c9069
[ConstraintElimination] Refactor constraint extraction (NFC).
This patch generalizes the extraction of a constraint for a given
condition. It allows decompose to return a vector of c * X pairs, which
allows de-composing multiple instructions in the future.

It also adds more clarifying comments.
2020-11-18 13:59:18 +00:00
Jonas Paulsson 45b8e37afc [SystemZ] Use ISD::ABS opcode during isel.
The SystemZISD::IABS node is no longer needed since ISD::ABS can be used
instead.

Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D91697
2020-11-18 14:43:55 +01:00
Sam Tebbs da2e4728c7 [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks
This patch adds support for combining a VPST with a dangling VCMP from a
previous VPT block.

Differential Revision: https://reviews.llvm.org/D90935
2020-11-18 12:54:16 +00:00
Benjamin Kramer 4dbe12e866 [SLP] Use the minimum alignment of the load bundle when forming a masked.gather
Instead of the first load. That works when vectorizing contiguous loads,
but not for gathers.

Fixes a miscompile introduced in fcad8d3635.
2020-11-18 12:53:39 +01:00
Max Kazantsev f33118c61c [IndVars] Support different types of ExitCount when optimizing exit conds
In some cases we can handle IV and iter count of different types. It's a typical situation
after IV have been widened. This patch adds support for such cases, when legal.

Differential Revision: https://reviews.llvm.org/D88528
Reviewed By: skatkov
2020-11-18 18:20:05 +07:00
Georgii Rymar 9aa7898200 Reland "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types." (https://reviews.llvm.org/D90930).
This reverts reverting commit fc40a03323
and fixes LLD (MachO/wasm) tests that failed previously.
2020-11-18 13:08:46 +03:00
Simon Pilgrim eef203dbdf [Analysis] CGSCCPassManager.cpp - fix Wshadow warnings. NFCI. 2020-11-18 09:59:31 +00:00
Georgii Rymar fc40a03323 Revert "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types."
This reverts commit 65fd17c241.

It breaks LLD/MachO tests that seems use obj2yaml the check the output.
2020-11-18 11:55:03 +03:00
Piotr Sobczak c173f1b8eb SpeculativeExecution: Allow speculating more instruction types
Support more instructions in SpeculativeExecution pass:
- ExtractElement
- InsertElement
- ShuffleVector

Differential Revision: https://reviews.llvm.org/D91633
2020-11-18 09:46:43 +01:00
Georgii Rymar 65fd17c241 [lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types.
When we produce an YAML output, we also print leading zeroes currently.
An output might look like this:

```
- Name:    .dynsym
  Type:    SHT_DYNSYM
  Address: 0x0000000000001000
  EntSize: 0x0000000000000018
```

There are probably no reason to print leading zeroes.
It just makes harder to read values. This patch stops printing them.
The output becomes like:

```
- Name:    .dynsym
  Type:    SHT_DYNSYM
  Address: 0x1000
  EntSize: 0x18
```

This affects obj2yaml mostly, but also dsymutil and llvm-xray tools output.

Differential revision: https://reviews.llvm.org/D90930
2020-11-18 11:31:00 +03:00
Craig Topper f0b0bab34d [X86] Use GF2P8AFFINEQB to implement vector bitreverse.
We can use GF2P8AFFINEQB to reverse bits in a byte. Shuffles are needed to reverse the bytes in elements larger than i8. LegalizeVectorOps takes care of inserting the shuffle for the larger element size.

We already have Custom lowering for v16i8 with SSSE3, v32i8 with AVX, and v64i8 with AVX512BW.

I think we might be able to use this for scalars too by moving into a vector and back. But I'll save that for a follow up as its a little more involved.

Reviewed By: RKSimon, pengfei

Differential Revision: https://reviews.llvm.org/D91515
2020-11-17 23:49:06 -08:00
Arthur Eubanks 9e3b4f4941 [JumpThreading] Make -print-lvi-after-jump-threading work with NPM 2020-11-17 23:15:20 -08:00
Arthur Eubanks ee7d315cd9 [DCE] Always get TargetLibraryInfo
I don't see any reason not to unconditionally retrieve TLI, it's fairly
cheap.

Fixes calls-errno.ll under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91476
2020-11-17 20:41:05 -08:00
Yevgeny Rouban cba3e78338 [NewPM] Disable PreservedCFGChecker and add regression unit tests
The design of the PreservedCFG Checker (landed with the commit
28012e00d8) has a fundamental flaw which makes it incorrect.
The checker is based on the PreservedAnalyses result returned
by functional passes: if CFGAnalyses is in the returned
PreservedAnalyses set, then the checker asserts that the CFG
snapshot saved before the pass is equal to the CFG snapshot
taken after the the pass. The problem is in passes that change
CFG and invalidate CFGAnalyses on their own. Such passes do not
return CFGanalyses in the returned PreservedAnalyses. So the
checker mistakenly expects CFG unchanged. As an example see the
class TestSimplifyCFGInvalidatingAnalysisPass in the new tests.

It is interesting that the bug was not found in LLVM. That is
because the CFG checker ran only if CFGAnalyses was checked
incorrectly:
  if (!PassPA.allAnalysesInSetPreserved<CFGAnalyses>())
    return;

but must be checked as follows:
  auto PAC = PA.getChecker<PreservedCFGCheckerAnalysis>();
  if (!(PAC.preserved() ||
        PAC.preservedSet<AllAnalysesOn<Function>>() ||
        PAC.preservedSet<CFGAnalyses>())
    return;

A fully redesigned checker will be sent as a separate follow-up
patch.

Reviewed By: Serguei Katkov, Jakub Kuderski

Differential Revision: https://reviews.llvm.org/D91324
2020-11-18 10:02:47 +07:00
Andrew Paverd 0139c8af8d [CFGuard] Add address-taken IAT tables and delay-load support
This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table.
Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D87544
2020-11-17 18:24:45 -08:00
Nick Desaulniers f4c6080ab8 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Nick Desaulniers dd6087cac0 Revert "[BitCode] decode nossp fn attr"
This reverts commit 0b11d018cc.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Amy Huang bc98034040 [llvm-symbolizer] Add inline stack traces for Windows.
This adds inline stack frames for symbolizing on Windows.

Differential Revision: https://reviews.llvm.org/D88988
2020-11-17 13:19:13 -08:00
Christopher Tetreault 792f8e1114 [SVE] Take constant fold fast path for splatted vscale vectors
This should be a perfectly reasonable operation for scalable vectors.
Currently, it only works for zeroinitializer values of
ScalableVectorType, but the fundamental operation is sound and it should
be possible to make it work for other splats

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D77442
2020-11-17 12:45:31 -08:00
Sanjay Patel 08834979e3 [SLP] avoid unreachable code crash/infloop
Example based on the post-commit comments for D88735.
2020-11-17 15:10:23 -05:00
Jon Roelofs a461e76b6f [MachineScheduler] Inform pass infra of post-ra scheduler's dependencies
Differential Revision: https://reviews.llvm.org/D91561
2020-11-17 10:56:12 -08:00
Sanjay Patel 4a66a1d17a [InstCombine] allow vectors for masked-add -> xor fold
https://rise4fun.com/Alive/I4Ge

  Name: add with pow2 mask
  Pre: isPowerOf2(C2) && (C1 & C2) != 0 && (C1 & (C2-1)) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %n = and i8 %x, C2
  %r = xor i8 %n, C2
2020-11-17 13:36:08 -05:00
Simon Pilgrim f7ebdec987 [InstCombine] visitAnd - remove unnecessary Value *X, *Y shadow variables. NFCI.
Fixes a number of Wshadow warnings.
2020-11-17 17:59:21 +00:00
Simon Pilgrim abf29d9862 [InstCombine] visitAnd - use m_SpecificInt instead of m_APInt + comparison. NFCI.
m_SpecificInt has the same 'no undef element' behaviour as m_APInt so no change there, and anyway we have test coverage for undef elements in the fold.

Noticed while fixing a Wshadow warning about shadow Value *X, *Y variables.
2020-11-17 17:37:10 +00:00
Wei Wang 3279347da0 [BPI] Look through bitcasts in calcZeroHeuristic
Constant hoisting may hide the constant value behind bitcast for And's
operand. Track down the constant to make the BFI result consistent
regardless of hoisting.

Differential Revision: https://reviews.llvm.org/D91450
2020-11-17 09:33:05 -08:00
Sanjay Patel f791ad7e1e [InstCombine] remove scalar constraint for mask-of-add fold
https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:45 -05:00
Sanjay Patel 433696911a [InstCombine] relax constraints on mask-of-add
There are 2 changes:
1. Remove the unnecessary one-use check.
2. Remove the unnecessary power-of-2 check.

https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:44 -05:00
Fangrui Song f4d9d80fe4 [ARC] Correct ARCInstPrinter::getMnemonic after D90039 2020-11-17 09:08:36 -08:00
Nikita Popov cb4fc25c91 [BasicAA] Make alias GEP positive offset handling symmetric
aliasGEP() currently implements some special handling for the case
where all variable offsets are positive, in which case the constant
offset can be taken as the minimal offset. However, it does not
perform the same handling for the all-negative case. This means that
the alias-analysis result between two GEPs is asymmetric:
If GEP1 - GEP2 is all-positive, then GEP2 - GEP1 is all-negative,
and the first will result in NoAlias, while the second will result
in MayAlias.

Apart from producing sub-optimal results for one order, this also
violates our caching assumption. In particular, if BatchAA is used,
the cached result depends on the order of the GEPs in the first query.
This results in an inconsistency in BatchAA and AA results, which
is how I noticed this issue in the first place.

Differential Revision: https://reviews.llvm.org/D91383
2020-11-17 18:05:34 +01:00
Simon Pilgrim 5f3a8074a4 [PPC] Fix dead store value clang static analyzer warning. NFCI.
Simplify the SplatBits 2-byte -> 4-byte 'splat'.
2020-11-17 16:27:45 +00:00
Florian Hahn 52f3714dae [VPlan] Add VPDef class.
This patch introduces a new VPDef class, which can be used to
manage VPValues defined by recipes/VPInstructions.

The idea here is to mirror VPUser for values defined by a recipe. A
VPDef can produce either zero (e.g. a store recipe), one (most recipes)
or multiple (VPInterleaveRecipe) result VPValues.

To traverse the def-use chain from a VPDef to its users, one has to
traverse the users of all values defined by a VPDef.

VPValues now contain a pointer to their corresponding VPDef, if one
exists. To traverse the def-use chain upwards from a VPValue, we first
need to check if the VPValue is defined by a VPDef. If it does not have
a VPDef, this means we have a VPValue that is not directly defined
iniside the plan and we are done.

If we have a VPDef, it is defined inside the region by a recipe, which
is a VPUser, and the upwards def-use chain traversal continues by
traversing all its operands.

Note that we need to add an additional field to to VPVAlue to link them
to their defs. The space increase is going to be offset by being able to
remove the SubclassID field in future patches.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D90558
2020-11-17 16:18:11 +00:00
Simon Pilgrim 7e30989dab [IR] ShuffleVectorInst::isIdentityWithPadding - bail on non-fixed-type vector shuffles.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27416
2020-11-17 16:16:51 +00:00
Andy Wingo 2a473db573 [WebAssembly] Fix parsing of linking section for named global imports
Differential Revision: https://reviews.llvm.org/D91635
2020-11-17 08:14:29 -08:00
Matt Arsenault c5ce6036c1 Linker: Fix linking of byref types
This wasn't properly remapping the type like with the other
attributes, so this would end up hitting a verifier error after
linking different modules using byref.
2020-11-17 11:02:04 -05:00
Anton Afanasyev 0a1d315f9f [SLPVectorizer] Fix assert 2020-11-17 18:46:31 +03:00
Anton Afanasyev fcad8d3635 [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic
For the scattered operands of load instructions it makes sense
to use gathering load intrinsic, which can lower to native instruction
for X86/AVX512 and ARM/SVE. This also enables building
vectorization tree with entries containing scattered operands.
The next step is to add scattered store.

Fixes PR47629 and PR47623

Differential Revision: https://reviews.llvm.org/D90445
2020-11-17 18:11:45 +03:00
Andy Wingo b90228e411 [WebAssembly][MC] Remove useless overrides in MCWasmStreamer
Differential Revision: https://reviews.llvm.org/D91604
2020-11-17 07:09:49 -08:00
Florian Hahn 13042da5cb
[ConstraintElimination] Add support for And.
When processing conditional branches, if the condition is an AND of 2 compares
and the true successor only has the current block as predecessor, queue both
conditions for the true successor.
2020-11-17 14:12:15 +00:00
Sander de Smalen f571fe6df5 Reland [LoopVectorizer] NFCI: Calculate register usage based on TLI.getTypeLegalizationCost.
This relands https://reviews.llvm.org/D91059 and reverts commit
30fded75b4.

GetRegUsage now returns 0 when Ty is not a valid vector element type.
2020-11-17 13:45:10 +00:00
Kazushi (Jam) Marukawa f4517bbd73 [VE] Implement JumpTable
Implement JumpTable to make BRIND work on VE.  Update an existing
br_jt regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91582
2020-11-17 22:43:10 +09:00
Kazushi (Jam) Marukawa 5872cab849 [VE] Correct getMnemonic
https://reviews.llvm.org/D90039 breaks VE backend.  So, fix it.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91619
2020-11-17 22:33:29 +09:00
Luke Drummond 537cbd90c4 Escape command line arguments in backtraces
A common routine is to have the compiler crash, and attempt to rerun the
cc1 command-line by copying and pasting the arguments printed by
`llvm::Support::PrettyStackProgram::print`. However, these arguments are
not quoted or escaped which means they must be manually edited before
working correctly. This patch ensures that shell-unfriendly characters
are C-escaped, and arguments with spaces are double-quoted reducing the
frustration of running cc1 inside a debugger.

As the quoting is C, this is "best effort for most shells", but should
be fine for at least bash, zsh, csh, and cmd.exe.

Reviewed by: jhenderson

Differential Revision: https://reviews.llvm.org/D90759
2020-11-17 12:16:13 +00:00
Florian Hahn a9adb62a64
[AsmPrinter] Use getMnemonic for instruction-mix remark.
This patch uses the new `getMnemonic` helper from D90039
to display mnemonics instead of the internal opcodes.

The main motivation behind using the mnemonics is that they
are more user-friendly and more directly related to the assembly
the users will be presented.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D90040
2020-11-17 12:12:47 +00:00
Kazushi (Jam) Marukawa 3a5c0ea895 [VE] Add vbrd intrinsic instructions
Add vbrd intrinsic instructions and a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91569
2020-11-17 19:04:18 +09:00
Ben Shi 9f8f8db339 [AVR] Optimize the 16-bit NEGW pseudo instruction
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D88658
2020-11-17 17:51:58 +08:00
Florian Hahn b2f4c5fddc
[AsmWriter] Factor out mnemonic generation to accessible getMnemonic.
This patch factors out the part of printInstruction that gets the
mnemonic string for a given MCInst. This is intended to be used
subsequently for the instruction-mix remarks to display the final
mnemonic (D90040).

Unfortunately making `getMnemonic` available to the AsmPrinter
seems to require making it virtual. Not sure if there's a way around
that with the current layering of the AsmPrinters.

Reviewed By: Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D90039
2020-11-17 09:47:38 +00:00
serge-sans-paille c6ef6e1690 [build] normalize components dependencies
Use LINK_COMPONENTS instead of explicit target_link_libraries for components.
This avoids redundancy and potential inconsistencies.

Differential Revision: https://reviews.llvm.org/D91461
2020-11-17 10:42:34 +01:00
Yevgeny Rouban a57fe210ff [JumpThreading] Fix branch probabilities in DuplicateCondBranchOnPHIIntoPred()
When instructions are cloned from block BB to PredBB in the method
DuplicateCondBranchOnPHIIntoPred() number of successors of PredBB
changes from 1 to number of successors of BB. So we have to copy
branch probabilities from BB to PredBB.

Reviewed By: Kazu Hirata

Differential Revision: https://reviews.llvm.org/D90841
2020-11-17 14:40:50 +07:00
Max Kazantsev 63dd1734b2 [NFC] Collect ext users into vector instead of finding them twice 2020-11-17 14:01:43 +07:00
Kazu Hirata 1da60f1d44 [Transforms] Use pred_empty (NFC) 2020-11-16 22:09:14 -08:00
Kazu Hirata 5935952c31 [SanitizerCoverage] Use [&] for lambdas (NFC) 2020-11-16 21:45:21 -08:00
Arthur Eubanks 7de6dcd246 [Debugify] Skip debugifying on special/immutable passes
With a function pass manager, it would insert debuginfo metadata before
getting to function passes while processing the pass manager, causing
debugify to skip while running the function passes.

Skip special passes + verifier + printing passes. Compared to the legacy
implementation of -debugify-each, this additionally skips verifier
passes. Probably no need to update the legacy version since it will be
obsolete soon.

This fixes 2 instcombine tests using -debugify-each under NPM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D91558
2020-11-16 20:39:46 -08:00
Fangrui Song fac0622ae0 ELFAsmParser: Remove non-SHF_ALLOC or non-executable sections' line info/address ranges contribution for -g
I filed the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26850 ,
which was acknowledged and fixed in GNU binutils 2.36

This patch adds the similar behavior to MC.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D91505
2020-11-16 20:02:25 -08:00
Lang Hames 22e44358d3 [ORC] Include config.h in RegisterEHFrames.cpp.
RegisterEHFrames.cpp needs access to the HAVE_REGISTER_FRAME /
HAVE_DEREGISTER_FRAME defines.

rdar://71458921
2020-11-17 14:18:04 +11:00
Philip Reames 0f41a2fe83 test commit for new client 2020-11-16 17:26:52 -08:00
Sjoerd Meijer fa5cb4b936 [LoopFlatten] Disable IV widening
Disable widening of the IV in LoopFlatten while I investigate an assertion
failures. Please note that the pass is also not yet enabled by default.
2020-11-16 22:30:52 +00:00
Jessica Paquette 5bc0bd05e6 [AArch64][GlobalISel] Fold G_XOR x, -1 into G_SELECT and select CSINV
When we see

```
xor = G_XOR xor_lhs, -1
select = G_SELECT cc, tval, xor
```

Fold this into

```
select = CSINV tval, xor_lhs, cc
```

Update select-select.mir to reflect the changes.

For now, only handle the case where the G_XOR is the false-value for the
G_SELECT. It may make more sense to handle the true-value case in post-legalizer
lowering.

Differential Revision: https://reviews.llvm.org/D90774
2020-11-16 14:14:14 -08:00
Michael Liao f375885ab8 [InferAddrSpace] Teach to handle assumed address space.
- In certain cases, a generic pointer could be assumed as a pointer to
  the global memory space or other spaces. With a dedicated target hook
  to query that address space from a given value, infer-address-space
  pass could infer and propagate that to all its users.

Differential Revision: https://reviews.llvm.org/D91121
2020-11-16 17:06:33 -05:00
Kazushi (Jam) Marukawa 38621c45a8 [VE] Add lvm/svm intrinsic instructions
Add lvm/svm intrinsic instructions and a regression test.  Change
RegisterInfo to specify that VM0/VMP0 are constant and reserved
registers.  This modifies a vst regression test, so update it.
Also add pseudo instructions for VM512 register classes
and mechanism to expand them after register allocation.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91541
2020-11-17 07:05:36 +09:00
Florian Hahn 5a4ca8b550 [ConstraintElimination] Add support for Or.
When processing conditional branches, if the condition is an OR of 2 compares
and the false successor only has the current block as predecessor, queue both
negated conditions for the false successor
2020-11-16 21:48:38 +00:00
Philip Reames 2240d3d054 [LoopVec] Introduce an api for detecting uniform memory ops
Split off D91398 at request of reviewer.
2020-11-16 13:30:48 -08:00
Philip Reames 257d33c815 [SCEV] Factor out part of wrap flag detection logic [NFC](try 2)
This is a cut down version of 1ec6e1 which was reverted due to a compile time issue.  The key changes made from that patch: 1) only infer the flags needed along each path, 2) be careful to preserve order of checks, and 3) avoid computing NW flags at all since we need to prove the stronger property (does not cross 0) in the caller anyways.

Assuming this doesn't trip regressions, I'm going to try weakening (1).  My end objective is to move flag inference into addrec construction.  If I can't weaken (1) without compile time impact, I'll have a problem.
2020-11-16 12:07:21 -08:00
Arnold Schwaighofer d861cc0e43 [coro] Async coroutines: Make sure we can handle control flow in suspend point dispatch function
Create a valid basic block with a terminator before we call
InlineFunction.

Differential Revision: https://reviews.llvm.org/D91547
2020-11-16 11:59:02 -08:00
Sanjay Patel 4e68bc0999 Revert "[InstCombine] add multi-use demanded bits fold for add with low-bit mask"
This reverts commit e56103d250.
There is a stage2 msan failure blamed on this commit:
http://lab.llvm.org:8011/#/builders/74/builds/888/steps/9/logs/stdio
2020-11-16 14:48:09 -05:00
Amara Emerson 0b6090699a [AArch64][GlobalISel] Look through a G_ZEXT when trying to match shift-extended register offsets.
The G_ZEXT in these cases seems to actually come from a combine that we do but
SelectionDAG doesn't. Looking through it allows us to match "uxtw #2" addressing
modes.

Differential Revision: https://reviews.llvm.org/D91475
2020-11-16 10:50:46 -08:00
Scott Linder b877c35d4b [YAMLIO] Correctly diagnose empty alias/anchor
The `Range` of an alias/anchor token includes the leading `&` or `*`,
but it is skipped while parsing the name. The check for an empty name
fails to account for the skipped leading character and so the error is
never hit.

Fix the off-by-one and add a couple regression tests.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D91462
2020-11-16 18:45:05 +00:00
Jameson Nash bf6ed355c8 Reland "[AsmPrinter] fix -disable-debug-info option"
This reverts commit 105ed27ed8, and
removes the offending line from the tests.
2020-11-16 13:34:47 -05:00
Craig Topper 124c93c528 [RISCV] When matching SROIW, check all 64 bits of the OR mask
We need to make sure the upper 32 bits are all ones to ensure the result is properly sign extended. Previously we only checked the lower 32 bits of the mask. I've also added a check that the shift amount is less than 32. Without that the original code asserts inside maskLeadingOnes if the SROI check is removed or the SROIW pattern is checked first. I've refactored the code to use early outs to reduce nesting.

I've also updated SLOIW matching with the same changes, but I couldn't find a broken test case with the existing code.

Differential Revision: https://reviews.llvm.org/D90961
2020-11-16 10:08:15 -08:00
Arthur Eubanks aeb0fdff35 [SimplifyCFG] Respect optforfuzzing in NPM pass
Regression caused by refactoring in
cdd006eec9.

See discussion in https://reviews.llvm.org/D89917.

Reviewed By: arsenm, morehouse

Differential Revision: https://reviews.llvm.org/D91473
2020-11-16 09:56:37 -08:00
Xun Li 985c524001 [Coroutine] Allocas used by StoreInst does not always escape
In the existing logic, for a given alloca, as long as its pointer value is stored into another location, it's considered as escaped.
This is a bit too conservative. Specifically, in non-optimized build mode, it's often to have patterns of code that first store an alloca somewhere and then load it right away.
These used should be handled without conservatively marking them escaped.

This patch tracks how the memory location where an alloca pointer is stored into is being used. As long as we only try to load from that location and nothing else, we can still
consider the original alloca not escaping and keep it on the stack instead of putting it on the frame.

Differential Revision: https://reviews.llvm.org/D91305
2020-11-16 09:14:44 -08:00
Matt Arsenault d2e52eec51 AMDGPU: Select global saddr mode from SGPR pointer
Use the 64-bit SGPR base with a 0 offset, since it's 1 fewer
instruction to materialize the 0 vs. the 64-bit copy.
2020-11-16 11:51:06 -05:00
Mirko Brkusanin 4cf6dd518e [AMDGPU][GlobalISel] Fix lowerShlSat
RegBankSelect would crash on G_SELECT when type is not s1.

Differential Revision: https://reviews.llvm.org/D91437
2020-11-16 17:43:31 +01:00
Matt Arsenault a6e353b1d0 AMDGPU: Split large offsets when selecting global saddr mode
When the offset doesn't fit in the immediate field, move some to
voffset.
2020-11-16 11:36:01 -05:00
Victor Huang 6bb2ceac90 Fix the compilation assertion due to unreachable BB pruning not deleting the associated BB from the jump tables
This patch is added to remove the unreachable MBBs reference in the jump table.

Differential Revisien: https://reviews.llvm.org/D90498
Reviewed by: amyk, bsaleil
2020-11-16 10:35:31 -06:00
Jay Foad a6ecb2eb3d [AMDGPU] Add comments. NFC. 2020-11-16 16:34:13 +00:00
Kazushi (Jam) Marukawa 44a4f93925 [VE] Optimize leaf functions
Optimize leaf functions by not generating save/restore for callee saved
registers.  Update regression tests also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91539
2020-11-17 00:38:01 +09:00
Simon Moll a598c08ac8 [VE] fastcc and vreg-to-vreg copy
This defines a 'fastcc' for the VE target and implements vreg-to-vreg
copy for parameter passing.  The 'fastcc' extends the standard CC for
SX-Aurora with register passing of vector-typed parameters and return
values.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D90842
2020-11-16 16:24:22 +01:00
Yonghong Song 4369223ea7 BPF: make __builtin_btf_type_id() return 64bit int
Linux kernel recently added support for kernel modules
  https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org/

In such cases, a type id in the kernel needs to be presented
as (btf id for modules, btf type id for this module).
Change __builtin_btf_type_id() to return 64bit value
so libbpf can do the above encoding.

Differential Revision: https://reviews.llvm.org/D91489
2020-11-16 07:08:41 -08:00
Florian Hahn 8dbe44cb29 Add pass to add !annotate metadata from @llvm.global.annotations.
This patch adds a new pass to add !annotation metadata for entries in
@llvm.global.anotations, which is generated  using
__attribute__((annotate("_name"))) on functions in Clang.

This has been discussed on llvm-dev as part of
    RFC: Combining Annotation Metadata and Remarks
    http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D91195
2020-11-16 14:57:11 +00:00
Kazushi (Jam) Marukawa 37e7a80aed [VE] Add lsv/lvs intrinsic instructions
Add lsv/lvs intrinsic instructions and a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91526
2020-11-16 23:42:51 +09:00
Dmitry Preobrazhensky 65f3e121fe [AMDGPU][MC] Corrected error position for some operands and modifiers
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91412
2020-11-16 16:11:23 +03:00
Caroline Concatto 6c4d8f4651 [AArch64] Add check for widening instruction for SVE.
This patch fixes the function isWideningInstruction for scalable vectors.
Now the cost model can check the widening pattern for SVE.

Differential Revision: https://reviews.llvm.org/D91260
2020-11-16 12:30:08 +00:00
Dmitry Preobrazhensky 0bee8c784b [AMDGPU][MC] Corrected error position for swizzle()
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91408
2020-11-16 14:37:57 +03:00
Dmitry Preobrazhensky 89df8fc0d7 [AMDGPU][MC] Corrected error position for hwreg() and sendmsg()
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91407
2020-11-16 14:25:07 +03:00
Kazushi (Jam) Marukawa e0c92c6c03 [VE] Add pfchv intrinsic instructions
Add pfchv intrinsic instructions and a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91522
2020-11-16 20:10:44 +09:00
Benjamin Kramer 2e7455f00a [LoopFlatten] Fold variable into assert. NFC. 2020-11-16 11:51:39 +01:00
Florian Hahn ca2e7e5999 [IRGen] Add !annotation metadata for auto-init stores.
This patch updates Clang's IRGen to add !annotation nodes with an
"auto-init" annotation to all stores for auto-initialization.

As discussed in 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)
this allows using optimization remarks to track down where auto-init
code was inserted (and not removed by optimizations).

There are a few cases in the tests where !annotation gets dropped by
optimizations. Those optimizations will be updated in subsequent
patches.

This patch is based on a patch by Francis Visoiu Mistrih.

Reviewed By: thegameg, paquette

Differential Revision: https://reviews.llvm.org/D91417
2020-11-16 10:37:02 +00:00
Sjoerd Meijer 9aa773381b [LoopFlatten] Widen the IV
Widen the IV to the widest available and legal integer type, which makes this
transformations always safe so that we can skip overflow checks.

Motivation is to let this pass trigger on 64-bit targets too, and this is the
last patch in a serie to achieve this: D90402 moves pass LoopFlatten to just
before IndVarSimplify so that IVs are not already widened, D90421 factors out
widening from IndVarSimplify into Utils/SimplifyIndVar so that we can also use
it in LoopFlatten.

Differential Revision: https://reviews.llvm.org/D90640
2020-11-16 10:20:13 +00:00
David Penry 48b43c9d4f [ARM] Cortex-M7 schedule
This patch adds the SchedMachineModel for Cortex-M7. It
also adds test cases for the scheduling information.

Details of the pipeline and descriptions are in comments
in file ARMScheduleM7.td included in this patch.

Differential Revision: https://reviews.llvm.org/D91355
2020-11-16 10:16:07 +00:00
Fraser Cormack fe9dc2e54a [RISCV] Use a macro to simplify getTargetNodeName
Similar to the X86 and AMDGPU targets, this uses a macro to cut down on
repetitive and error-prone code when converting RISCVISD node names to
strings in getTargetNodeName.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D91414
2020-11-16 09:33:47 +00:00
Kazushi (Jam) Marukawa 15a2bacab6 [VE] Change variable capitalization
Change dl to DL in VEFrameLowering.cpp.  And clean some comments.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91490
2020-11-16 18:25:23 +09:00
Simon Moll 0007d8ed2c [VP][NFC] Rename to HANDLE_VP_TO_OPC
Use the less surprising shorthand OPC instead of OC.
2020-11-16 10:24:18 +01:00
Lang Hames f62e5f4569 [MCJIT] Profile the code generated by MCJIT engine using Intel VTune profiler
Patch by Elena Kovanova. Thanks Elena!

Problem:

LLVM already has a feature to profile the JIT-compiled code with VTune. This is
done using Intel JIT Profiling API (https://github.com/intel/ittapi). Function
information is captured by VTune as soon as the function is JIT-compiled. We
tried to use the same approach to report the function information generated by
the MCJIT engine – read parsing the debug information for in-memory ELF module
and report it using JIT API. As the results, we figured out that it did not work
properly for the following cases: inline functions, the functions located in
multiple source files, the functions having several bodies (address ranges).

Solution:

To overcome limitations described above, we have introduced new APIs as a part
of Intel ITT APIs to report the entire in-memory ELF module to be further
processed as regular ELF binaries with debug information.

This patch

1. Switches LLVM to open source version of Intel ITT/JIT APIs
(https://github.com/intel/ittapi) to keep it always up to date.

2. Adds support of profiling the code generated by MCJIT engine using Intel
VTune profiler

Another separate patch will get rid of obsolete Intel ITT APIs stuff, having
LLVM already switched to https://github.com/intel/ittapi.

Differential Revision: https://reviews.llvm.org/D86435
2020-11-16 19:28:14 +11:00
Simon Moll 1c00d096a6 [VE] LVLGen sets VL before vector insts
The VE backend represents vector instructions with an explicit 'i32'
vector length operand.  In the VE ISA, the vector length is always read
from the VL hardware register.  The LVLGen pass inserts 'lvl'
instructions as necessary to set VL to the right value before each
vector instruction.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D91416
2020-11-16 09:19:14 +01:00
Max Kazantsev b4624f65cf Recommit "[NFC] Move code between functions as a preparation step for further improvement"
The bug should be fixed now.
2020-11-16 14:30:34 +07:00
Kazu Hirata 147ccc848a [JumpThreading] Call eraseBlock when folding a conditional branch
This patch teaches the jump threading pass to call BPI->eraseBlock
when it folds a conditional branch.

Without this patch, BranchProbabilityInfo could end up with stale edge
probabilities for the basic block containing the conditional branch --
one edge probability with less than 1.0 and the other for a removed
edge.

This patch is one of the steps before we can safely re-apply D91017.

Differential Revision: https://reviews.llvm.org/D91511
2020-11-15 22:29:30 -08:00
Kazu Hirata aa06951377 [IR] Use llvm::is_contained in BasicBlock::removePredecessor (NFC) 2020-11-15 21:15:31 -08:00
Kazu Hirata 0888eaf3fd [Loop Fusion] Use pred_empty and succ_empty (NFC) 2020-11-15 20:32:57 -08:00
Kazu Hirata 0c03d1328c [ADCE] Use succ_empty (NFC) 2020-11-15 19:52:59 -08:00
Kazu Hirata c5cc2d8b94 [BranchProbabilityInfo] Use predecessors(BB) and successors(BB) (NFC) 2020-11-15 19:26:38 -08:00
Kazu Hirata 43a6a1e928 [TRE] Use successors(BB) (NFC) 2020-11-15 19:12:49 -08:00
Craig Topper 57c0c4a275 [X86] Fix crash with i64 bitreverse on 32-bit targets with XOP.
We unconditionally marked i64 as Custom, but did not install a
handler in ReplaceNodeResults when i64 isn't legal type. This
leads to ReplaceNodeResults asserting.

We have two options to fix this. Only mark i64 as Custom on
64-bit targets and let it expand to two i32 bitreverses which
each need a VPPERM. Or the other option is to add the Custom
handling to ReplaceNodeResults. This is what I went with.
2020-11-15 19:02:34 -08:00
Kazu Hirata 918e3439e2 [SanitizerCoverage] Use llvm::all_of (NFC) 2020-11-15 19:01:20 -08:00
Serguei Katkov 400f6edce7 [IRCE] Use the same min runtime iteration threshold for BPI and BFI checks
In the last change to IRCE the BPI is ignored if BFI is present, however
BFI and BPI have a different thresholds. Specifically BPI approach checks only
latch exit probability so it is expected if the loop has only one exit block (latch)
the behavior with BFI and BPI should be the same,

BPI approach by default uses threshold 10, so it considers the loop with estimated
number of iterations less then 10 should not be considered for IRCE optimization.
BFI approach uses the default value 3 and this is inconsistent.

The CL modifies the code to use the same threshold for both approaches..

The test is updated due to it has two side-exits (except latch) and each of them has a
probability 1/16, so BFI estimates the number of runtime iteration is about to 7
(1/16 + 1/16 + some for latch) and test fails.

Reviewers: mkazantsev, ebrevnov
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D91230
2020-11-16 09:21:50 +07:00
Sanjay Patel 6ddc237766 [InstCombine] reduce code for flip of masked bit; NFC
There are 1-2 potential follow-up NFC commits to reduce
this further on the way to generalizing this for vectors.

The operand replacing path should be dead code because demanded
bits handles that more generally (D91415).
2020-11-15 15:43:34 -05:00
Sanjay Patel e56103d250 [InstCombine] add multi-use demanded bits fold for add with low-bit mask
I noticed an add example like the one from D91343, so here's a similar patch.
The logic is based on existing code for the single-use demanded bits fold.
But I only matched a constant instead of using compute known bits on the
operands because that was the motivating patterni that I noticed.

I think this will allow removing a special-case (but incomplete) dedicated
fold within visitAnd(), but I need to untangle the existing code to be sure.

https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2

Differential Revision: https://reviews.llvm.org/D91415
2020-11-15 15:09:49 -05:00
Nikita Popov 3b7f84d97f [AA] Add missing AAQI parameter
This alias() call did not pass on the AAQueryInfo.
2020-11-15 20:29:53 +01:00
Florian Hahn 0c119ba8a8 [VPlan] Use VPValue def for VPWidenGEPRecipe.
This patch turns VPWidenGEPRecipe into a VPValue and uses it
during VPlan construction and codegeneration instead of the plain IR
reference where possible.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D84683
2020-11-15 15:12:47 +00:00
Nikita Popov 9ace4b337f Revert "[SCEV] Factor out part of wrap flag detection logic [NFC-ish]"
This reverts commit 1ec6e1eb8a.

This change causes a significant compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=dd0b8b94d0796bd895cc998dd163b4fbebceb0b8&to=1ec6e1eb8a084bffae8a40236eb9925d8026dd07&stat=instructions

I assume that this is due to the non-NFC part of the change, which
now performs expensive nowrap inference even for nowrap flags that
are not used by the particular code.
2020-11-15 10:19:44 +01:00
Philip Reames 1ec6e1eb8a [SCEV] Factor out part of wrap flag detection logic [NFC-ish]
In an effort to make code around flag determination more readable, and (possibly) prepare for a follow up change, factor out some of the flag detection logic.  In the process, reduce the number of locations we mutate wrap flags by a couple.

Note that this isn't NFC.  The old code tried for NSW xor (NUW || NW).  This is, two different paths computed different sets of wrap flags.  The new code will try for all three.  The result is that some expressions end up with a few extra flags set.
2020-11-14 19:21:05 -08:00
Arthur Eubanks 6e04da0a5a [DCE] Port -redundant-dbg-inst-elim to NPM
This is used to test RemoveRedundantDbgInstrs(), which is used by other
passes.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D91477
2020-11-14 16:55:20 -08:00
Florian Hahn a70b511e78 Recommit "[VPlan] Use VPValue def for VPWidenSelectRecipe."
This reverts the revert commit c8d73d939f.

It includes a fix for cases where we missed inserting VPValues
for some selects, which should fix PR48142.
2020-11-14 20:00:25 +00:00
Paul C. Anagnostopoulos 9671790b4f [TableGen] Fix missing braces in if statement 2020-11-14 12:38:44 -05:00
Nathan James 5a2febb31a
[llvm][NFC] Remove unnecessary vector creation in Annotations 2020-11-14 15:55:09 +00:00
Nikita Popov 0b72444211 [BasicAA] Remove unnecessary size limitation
We're dropping a common offset from both GEPs here. It's not
necessary for the access sizes to be the same as well.
2020-11-14 16:51:31 +01:00
Nico Weber 237dcfe2e6 Fix build after 54f9ee334 2020-11-14 10:23:22 -05:00
Paul C. Anagnostopoulos 54f9ee3341 [TableGen] Add frontend/backend phase timing capability.
Describe in the BackEnd Developer's Guide. Instrument a few backends.

Remove an old unused timing facility. Add a null backend for timing
the parser.

Differential Revision: https://reviews.llvm.org/D91388
2020-11-14 10:10:29 -05:00
Arnold Schwaighofer 8fb73cecfd [Coroutines] Make sure that async coroutine context size is a multiple of the alignment requirement
This simplifies the code the allocator has to executed

Differential Revision: https://reviews.llvm.org/D91471
2020-11-14 04:56:56 -08:00
Roman Lebedev 6861d938e5
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the implementation is known-broken for certain inputs,
the bugreport was up for a significant amount of timer,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.

I suspect, fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from clean slate may be better anyways.

This reverts commit 7bdad08429,
and some of it's follow-ups, that don't stand on their own.
2020-11-14 13:12:38 +03:00
Mehdi Amini ac06b1af40 Revert "Switch libLLVMFrontendOpenACC to be a regular CMake library and not a "component""
This reverts commit e7ed276532.

Build is broken with  -DLLVM_LINK_LLVM_DYLIB=ON
2020-11-14 04:14:17 +00:00
Mehdi Amini e7ed276532 Switch libLLVMFrontendOpenACC to be a regular CMake library and not a "component"
This library is only used in Flang at the moment and not tested withing LLVM.
Having it as a component is breaking llvm-config:

  $ bin/llvm-config --shared-mode
  llvm-config: error: component libraries and shared library

  llvm-config: error: missing: [...]/lib/libLLVMFrontendOpenACC.a

This will reverted when unit-tests are provided for it.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D91470
2020-11-14 02:18:25 +00:00
Stanislav Mekhanoshin c9821cec74 [AMDGPU] Mark sin/cos load folding as modifying the function.
When the load value is folded into the sin/cos operation, the
AMDGPU library calls simplifier could still mark the function
as unmodified. Instead ensure if there is an early return,
return whether the load was folded into the sin/cos call.

Authored by MJDSys

Differential Revision: https://reviews.llvm.org/D91401
2020-11-13 14:49:33 -08:00
Sam Clegg a083b28a31 [WebAssembly] Move GlobalTLSAddress handling to WebAssemblyISelLowering. NFC
I'm not why it was added to DAGToDAG oringally but it seems
to make sense alongside the non-TLS version:  LowerGlobalAddress

Differential Revision: https://reviews.llvm.org/D91432
2020-11-13 14:35:51 -08:00
Akira Hatanaka 2ed3a76745 [ObjC][ARC] Add and use a function which finds and returns the single
dependency. NFC

Use findSingleDependency in place of FindDependencies and stop passing a
set of Instructions around. Modify FindDependencies to return a boolean
flag which indicates whether the dependencies it has found are all
valid.
2020-11-13 14:02:58 -08:00
Akira Hatanaka 00d0974e62 Move variable declarations to functions in which they are used. NFC 2020-11-13 14:02:58 -08:00
Nikita Popov 9a85643cd3 [KnownBits] Combine abs() implementations
ValueTracking was using a more powerful abs() implementation. Roll
it into KnownBits::abs(). Also add an exhaustive test for abs(),
in both the poisoning and non-poisoning variants.
2020-11-13 22:23:50 +01:00
Heejin Ahn 902ea588ea [WebAssembly] Rename atomic.notify and *.atomic.wait
- atomic.notify -> memory.atomic.notify
- i32.atomic.wait -> memory.atomic.wait32
- i64.atomic.wait -> memory.atomic.wait64

See https://github.com/WebAssembly/threads/pull/149.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D91447
2020-11-13 12:04:48 -08:00
Guozhi Wei a20220d25b [AlwaysInliner] Call mergeAttributesForInlining after inlining
Like inlineCallIfPossible and InlinerPass, after inlining mergeAttributesForInlining
should be called to merge callee's attributes to caller. But it is not called in
AlwaysInliner, causes caller's attributes inconsistent with inlined code.

Attached test case demonstrates that attribute "min-legal-vector-width"="512" is
not merged into caller without this patch, and it causes failure in SelectionDAG
when lowering the inlined AVX512 intrinsic.

Differential Revision: https://reviews.llvm.org/D91446
2020-11-13 12:01:35 -08:00
Jianzhou Zhao 06c9b4aaa9 Extend the dfsan store/load callback with write/read address
This helped debugging.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D91236
2020-11-13 19:46:32 +00:00
Baptiste Saleil 3f78605a8c [PowerPC] Add paired vector load and store builtins and intrinsics
This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs.

Differential Revision: https://reviews.llvm.org/D90799
2020-11-13 12:35:10 -06:00
Jessica Paquette 9a8bfe3835 [AArch64][GlobalISel] Select G_SELECT cc, t, (G_SUB 0, x) -> CSNEG t, x, cc
When we see

```
%sub = G_SUB 0, %x
%select = G_SELECT %cc, %t, %sub
```

Fold away the G_SUB by producing

```
%select = CSNEG %t, %x, cc
```

Simple IR example: https://godbolt.org/z/K8TEnh

This is valid on both sides of the select, but for now, just handle one side.
It may make more sense to handle swapping sides during post-legalizer lowering.

Differential Revision: https://reviews.llvm.org/D90723
2020-11-13 10:12:51 -08:00
Jessica Paquette 6c20c1da1e [AArch64][GlobalISel] NFC: Use CmpInst::isUnsigned instead of static helper
Reducing some code duplication.

We had a helper for checking if a predicate is unsigned. Remove that and use
the existing function in Instructions.cpp.

Differential Revision: https://reviews.llvm.org/D91288
2020-11-13 09:35:42 -08:00
Wouter van Oortmerssen 16f02431dc [WebAssembly] Added R_WASM_FUNCTION_OFFSET_I64 for use with DWARF DW_AT_low_pc
Needed for wasm64, see discussion in https://reviews.llvm.org/D91203

Differential Revision: https://reviews.llvm.org/D91395
2020-11-13 09:32:31 -08:00