Commit Graph

177488 Commits

Author SHA1 Message Date
Fangrui Song f42990e687 [Mem2Reg] Don't call LBI.deleteValue on AllocInst/DbgVariableIntrinsic
Only StoreInst/LoadInst are assigned numbers. Other types of instructions are not in LBI.

llvm-svn: 358350
2019-04-14 06:27:07 +00:00
Fangrui Song 8f9bb2250b [Mem2Reg] Simplify rewriteSingleStoreAlloca
llvm-svn: 358349
2019-04-14 05:48:13 +00:00
Fangrui Song dabd80047e [ConstantRange] Fix unittest after rL358347
llvm-svn: 358348
2019-04-14 05:19:15 +00:00
Fangrui Song 43d110bd27 [ConstantRange] Delete unused getSetSize
getSetSize returns an APInt that is 1 bit wider. The APInt is typically 65-bit and requires memory allocation. isSizeStrictlySmallerThan and isSizeLargerThan are preferred. The last use of this helper method was removed by rL302385.

llvm-svn: 358347
2019-04-14 04:45:04 +00:00
Craig Topper 476dd06854 [X86] Update bool_reduction_v8f32 test cases from vector-compare-any_of.ll and vector-compare-all_of.ll to be proper reductions.
One of the shuffles was used twice. While the intended shuffle wasn't connected.

llvm-svn: 358346
2019-04-14 04:20:42 +00:00
Craig Topper fdcdf74b0e [X86] Remove some unused tablegen multiclasses. NFC
llvm-svn: 358345
2019-04-14 04:20:38 +00:00
Philip Reames 0eeb2cd491 [Tests] Add tests for D60659, and make adjustments to others to make diff clear
Three related changes:
1) auto-gen several test files
2) Add the new tests at the bottom of said files
3) Adjust a couple of other test files not to use stores to constants when trying to test constexpr address handling

llvm-svn: 358344
2019-04-13 22:12:56 +00:00
Bill Wendling 191f1487b6 [X86] Use PC-relative mode for the kernel code model
Summary:
The Linux kernel uses PC-relative mode, so allow that when the code model is
"kernel".

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits, kees, nickdesaulniers

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60643

llvm-svn: 358343
2019-04-13 21:39:28 +00:00
Nikita Popov 040871db48 [CVP] Add tests for range of with.overflow result; NFC
Test range of with.overflow result in the no-overflow branch.

llvm-svn: 358341
2019-04-13 19:43:51 +00:00
Nikita Popov a96480ebc1 [ConstantRange] Disallow NUW | NSW in makeGuaranteedNoWrapRegion()
As motivated in D60598, this drops support for specifying both NUW and
NSW in makeGuaranteedNoWrapRegion(). None of the users of this function
currently make use of this.

When both NUW and NSW are specified, the exact nowrap region has two
disjoint parts and makeGNWR() returns one of them. This result doesn't
seem to be useful for anything, but makes the semantics of the function
fuzzier.

Differential Revision: https://reviews.llvm.org/D60632

llvm-svn: 358340
2019-04-13 19:43:45 +00:00
Nikita Popov 95e5f28337 [InstCombine] Remove redundant/bogus mul_with_overflow combines
As pointed out in D60518 folding mulo(%x, undef) to {undef, undef}
isn't correct. As a correct version of this already exists in
InstructionSimplify (bd8056ef32/lib/Analysis/InstructionSimplify.cpp (L4750-L4757)) this is just
dead code though. Drop it together with the mul(%x, 0) -> {0, false}
fold that is also already handled by InstSimplify.

Differential Revision: https://reviews.llvm.org/D60649

llvm-svn: 358339
2019-04-13 19:43:35 +00:00
Craig Topper 55b0d987fd [X86] Use int64_t and isInt<N> instead of APInt operations in foldLoadStoreIntoMemOperand. NFC
We know all our values are limited to 64 bits here so we don't need an APInt.

This should save some generated code checking between large and small size.

llvm-svn: 358338
2019-04-13 18:57:41 +00:00
Don Hinton 7d2021defc [CommandLineParser] Add DefaultOption flag
Summary: Add DefaultOption flag to CommandLineParser which provides a
default option or alias, but allows users to override it for some
other purpose as needed.

Also, add `-h` as a default alias to `-help`, which can be seamlessly
overridden by applications like llvm-objdump and llvm-readobj which
use `-h` as an alias for other options.

Reviewers: alexfh, klimek

Reviewed By: klimek

Subscribers: MaskRay, mehdi_amini, inglorion, dexonsmith, hiraditya, llvm-commits, jhenderson, arphaman, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D59746

llvm-svn: 358337
2019-04-13 16:55:28 +00:00
Heejin Ahn 5f3a04510a [WebAssembly] Use Function::hasOptSize() (NFC)
Summary: Use member function.

Reviewers: aheejin

Subscribers: sunfish, hiraditya, sbc100, jgravelle-google, dschuff, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60651

Patch by Hideto Ueno (uenoku)

llvm-svn: 358336
2019-04-13 16:54:39 +00:00
Fangrui Song 49f8776f0c [CallingConvLower] Use SmallVectorImpl::swap
llvm-svn: 358335
2019-04-13 15:58:48 +00:00
Fangrui Song 8540486974 [Mem2Reg] Delete unused AllocaPointerVal
It is no longer used after the AliasSetTracker updating logic was removed.

llvm-svn: 358334
2019-04-13 15:41:42 +00:00
Fangrui Song 67c29e2294 [ADT] Fix OwningArrayRef's move ctor
llvm-svn: 358332
2019-04-13 13:52:11 +00:00
Nikita Popov 41e284b9c3 [CVP] Fix inverted predicates in test; NFC
Checked the wrong direction in the umul tests... fix predicated to
line up with the test name.

llvm-svn: 358331
2019-04-13 11:47:36 +00:00
Nikita Popov 25c1aa15a7 [CVP] Add tests for with.overflow used as condition; NFC
llvm-svn: 358330
2019-04-13 11:40:16 +00:00
Chen Zheng 87dd0e06dc [InstCombine] Canonicalize (-X srem Y) to -(X srem Y).
Differential Revision: https://reviews.llvm.org/D60647

llvm-svn: 358328
2019-04-13 09:21:22 +00:00
Chen Zheng fc59a0326b [InstCombine] [NFC] add testcases for canonicalizing (-X srem Y) to -(X srem Y).
llvm-svn: 358327
2019-04-13 07:34:55 +00:00
Philip Reames e03301a3b3 [StackMaps] Update llvm-readobj to parse V3 Stackmaps
This updates the StackMap parser in the llvm-readobj tool to parse version 3 StackMaps, which were bumped in https://reviews.llvm.org/D32629.

Version 3 StackMaps differ in that they have a uint16 sized "location size" field which was added to the Location block in a StackMap record. The record has additional padding for alignment. This was a backwards incompatible change resulting in a StackMap version bump.

Patch By: jacob.hughes@kcl.ac.uk (with a rewrite of tests by me)
Differential Revision: https://reviews.llvm.org/D59020

llvm-svn: 358325
2019-04-13 03:55:13 +00:00
Philip Reames eea989a909 [StackMaps] Add location size to llvm-readobj -stackmap output
The size field of a location can be different for each entry, so it is useful to have this displayed in the output of llvm-readobj -stackmap. Below is an example of how the output would look:

Record ID: 2882400000, instruction offset: 16
   3 locations:
     #1: Constant 1, size: 8
     #2: Constant 2, size: 8
     #3: Constant 3, size: 8
   0 live-outs: [ ]

Patch By: jacob.hughes@kcl.ac.uk (with heavy modification by me)
Differential Revision: https://reviews.llvm.org/D59169

llvm-svn: 358324
2019-04-13 03:08:45 +00:00
Philip Reames f7acef9c88 [llvm-readobj] Minor style tweak for consistency sake [NFC]
llvm-svn: 358323
2019-04-13 02:23:08 +00:00
Philip Reames 377f507a9f [StackMaps] Remove format version from the class name [NFC]
Motivation is to reduce silly diffs when we change the format.  For instance, this causes most of D59020 to disappear.

llvm-svn: 358322
2019-04-13 02:02:56 +00:00
Philip Reames cebf0b3ab5 [StackMaps] Add explicit location size accessor to the stackmap parser
The reserved uint8 field in the location block of the stackmap record is used to denote the size of the location.

Patch By: jacob.hughes@kcl.ac.uk
Differential Revision: https://reviews.llvm.org/D59167

llvm-svn: 358319
2019-04-13 01:50:50 +00:00
Amara Emerson 93e58d2396 [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash.
This enables the simple copy combine that already exists in the CombinerHelper.
However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a
set of MIs to process, and so would end up causing a crash when deleted MIs were
being added to the combiner worklist again.

Differential Revision: https://reviews.llvm.org/D60579

llvm-svn: 358318
2019-04-13 00:33:25 +00:00
Thomas Lively fef8de66a6 [WebAssembly] Add DataCount section to object files
Summary:
This ensures that object files will continue to validate as
WebAssembly modules in the presence of bulk memory operations. Engines
that don't support bulk memory operations will not recognize the
DataCount section and will report validation errors, but that's ok
because object files aren't supposed to be run directly anyway.

Reviewers: aheejin, dschuff, sbc100

Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60623

llvm-svn: 358315
2019-04-12 22:27:48 +00:00
Amara Emerson bdb5e4e4ca [GlobalISel] Fix a crash when handling an invalid MVT during call lowering.
This crash was introduced in r358032 as we try to construct an EVT from an MVT
in order to find the register type for the calling conv. Fall back instead of
trying to do this with an invalid MVT coming from i256.

llvm-svn: 358314
2019-04-12 22:05:46 +00:00
Alina Sbirlea f9f073a861 [MemorySSA] Add previous def to cache when found, even if trivial.
Summary:
When inserting a new Def, MemorySSA may be have non-minimal number of Phis.
While inserting, the walk to find the previous definition may cleanup minimal Phis.
When the last definition is trivial to obtain, we do not cache it.

It is possible while getting the previous definition for a Def to get two different answers:
- one that was straight-forward to find when walking the first path (a trivial phi in this case), and
- another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one.
While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards.

A way to fix this problem is to cache the straight-forward answer we got on the first walk.
The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache.
Resolves PR40749.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60634

llvm-svn: 358313
2019-04-12 21:58:52 +00:00
Amara Emerson 2806fd01a1 [AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an undef mask element.
If a shufflevector's mask vector has an element with "undef" then the generic
instruction defining that element register is a G_IMPLICT_DEF instead of G_CONSTANT.
This fixes the selector to handle this case, and for now assumes that undef just means
zero. In future we'll optimize this case properly.

llvm-svn: 358312
2019-04-12 21:31:21 +00:00
Thomas Lively 9e27514996 [WebAssembly] Add mutable-globals to bleeding-edge CPU
Summary: This brings the backend in line with Clang.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60594

llvm-svn: 358310
2019-04-12 20:39:53 +00:00
Nikita Popov 3dc7c7ca31 [ConstantRange] Clarify makeGuaranteedNoWrapRegion() guarantees; NFC
makeGuaranteedNoWrapRegion() is actually makeExactNoWrapRegion() as
long as only one of NUW or NSW is specified. This is not obvious from
the current documentation, and some code seems to think that it is
only exact for single-element ranges. Clarify docs and add tests to
be more confident this really holds.

There are currently no users of makeGuaranteedNoWrapRegion() that
pass both NUW and NSW. I think it would be best to drop support for
this entirely and then rename the function to makeExactNoWrapRegion().

Knowing that the no-wrap region is exact is useful, because we can
backwards-constrain values. What I have in mind in particular is
that LVI should be able to constrain values on edges where the
with.overflow overflow flag is false.

Differential Revision: https://reviews.llvm.org/D60598

llvm-svn: 358305
2019-04-12 19:36:47 +00:00
Alina Sbirlea 2312a06c87 [SCEV] Add option to forget everything in SCEV.
Summary:
Create a method to forget everything in SCEV.
Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll.

Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch.
Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build.
Testcase showcasing this cannot be opensourced and is fairly large.

The option disabled by default, but it may be desirable to enable by
default. Evidence in favor (two difference runs on different days/ToT state):

File Before (s) After (s)
clang-9.bc 7267.91 6639.14
llvm-as.bc 194.12 194.12
llvm-dis.bc 62.50 62.50
opt.bc 1855.85 1857.53

File Before (s) After (s)
clang-9.bc 8588.70 7812.83
llvm-as.bc 196.20 194.78
llvm-dis.bc 61.55 61.97
opt.bc 1739.78 1886.26

Reviewers: sanjoy

Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60144

llvm-svn: 358304
2019-04-12 19:16:07 +00:00
Alina Sbirlea 57769382b1 [MemorySSA] Small fix for the clobber limit.
Summary:
After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry.
The test included triggered that assert.

The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this:
```
while (true) {
   if (getBlockingAccess()) { // calls walkToPhiOrClobber
   }
   for (...) {
     walkToPhiOrClobber();
   }
}
```

The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowence of 1.
This approach not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced.
The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60479

llvm-svn: 358303
2019-04-12 18:48:46 +00:00
Philip Reames b091cc081d [InstCombine] Fix a nasty miscompile introduced w/masked.gather demanded elts
This fixes a miscompile which was introduced in r356510 (https://reviews.llvm.org/D57372).

The problem is that the original patch removed pointer operands where the load results we're demanded, but without considering the legality of the load itself.  If the masked.gather had active, but undemanded, lanes, then we could end up creating a load which loaded from an undef address.  The result could be a segfault, or, in theory, an arbitrary read from a random memory location into an used register.  

llvm-svn: 358299
2019-04-12 18:26:56 +00:00
Nikita Popov 00a0d5d1de [CVP] Set NSW/NUW flags when simplifying with.overflow
When CVP determines that a with.overflow intrinsic cannot overflow,
it currently inserts a simple add/sub. As we already determined that
there can be no overflow, we should add the appropriate NUW/NSW flag.

Differential Revision: https://reviews.llvm.org/D60585

llvm-svn: 358298
2019-04-12 18:18:17 +00:00
Nikita Popov 7671fc71f6 [KnownBits] Add computeForAddCarry()
This is for D60460. computeForAddSub() essentially already supports
carries because it has to deal with subtractions. This revision
extracts a lower-level computeForAddCarry() function, which allows
computing the known bits for add (carry known zero), sub (carry known
one) and addcarry (carry unknown).

As we don't seem to have any yet, I've added a unit test file for
KnownBits and exhaustive tests for the new computeForAddCarry()
functionality, as well the existing computeForAddSub() function.

Differential Revision: https://reviews.llvm.org/D60522

llvm-svn: 358297
2019-04-12 18:18:08 +00:00
Philip Reames 7a60cd38af [Tests] Checkin a test demonstrating a miscompile so that patch which fixes it shows a clear diff
llvm-svn: 358296
2019-04-12 18:11:58 +00:00
Lang Hames c7c1f21525 Simplify decoupling between RuntimeDyld/RuntimeDyldChecker, add 'got_addr' util.
This patch reduces the number of functions in the interface between RuntimeDyld
and RuntimeDyldChecker by combining "GetXAddress" and "GetXContent" functions
into "GetXInfo" functions that return a struct describing both the address and
content. The GetStubOffset function is also replaced with a pair of utilities,
GetStubInfo and GetGOTInfo, that fit the new scheme. For RuntimeDyld both of
these functions will return the same result, but for the new JITLink linker
(https://reviews.llvm.org/D58704) these will provide the addresses of PLT stubs
and GOT entries respectively.

For JITLink's use, a 'got_addr' utility has been added to the rtdyld-check
language, and the syntax of 'got_addr' and 'stub_addr' has been changed: both
functions now take two arguments, a 'stub container name' and a target symbol
name. For llvm-rtdyld/RuntimeDyld the stub container name is the object file
name and section name, separated by a slash. E.g.:

rtdyld-check: *{8}(stub_addr(foo.o/__text, y)) = y

For the upcoming llvm-jitlink utility, which creates stubs on a per-file basis
rather than a per-section basis, the container name is just the file name. E.g.:

jitlink-check: *{8}(got_addr(foo.o, y)) = y
llvm-svn: 358295
2019-04-12 18:07:28 +00:00
Brendon Cahoon 4df216cd62 [Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass
The Hexagon Vector Loop Carried Reuse pass was allowing reuse between
two shufflevectors with different masks. The reason is that the masks
are not instruction objects, so the code that checks each operand
just skipped over the operands.

This patch fixes the bug by checking if the operands are the same
when they are not instruction objects. If the objects are not the
same, then the code assumes that reuse cannot occur.

Differential Revision: https://reviews.llvm.org/D60019

llvm-svn: 358292
2019-04-12 16:37:12 +00:00
Sanjay Patel 5e4ad39af7 [DAGCombiner] narrow shuffle of concatenated vectors
// shuffle (concat X, undef), (concat Y, undef), Mask -->
// concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1)

The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements.

The x86 changes look neutral or better. There's one test with an
extra instruction, but that could be reversed for a subtarget with
the right attributes. But by default, we want to avoid the 256-bit
op when possible (in my motivating benchmark, a handful of ymm ops
sprinkled into a sequence of xmm ops are triggering frequency
throttling on Haswell resulting in significantly worse perf).

Differential Revision: https://reviews.llvm.org/D60545

llvm-svn: 358291
2019-04-12 16:31:56 +00:00
Zachary Turner e1bc9758cb [PDB Docs] Add some prose describing public and global symbols.
llvm-svn: 358289
2019-04-12 15:51:40 +00:00
Hiroshi Yamauchi c27ff0d32d Add options for MaxLoadsPerMemcmp(OptSize).
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60587

llvm-svn: 358287
2019-04-12 15:05:46 +00:00
Simon Pilgrim 6c8f4ada36 [X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns
Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions.

This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result.

This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation.

Differential Revision: https://reviews.llvm.org/D60610

llvm-svn: 358286
2019-04-12 14:22:57 +00:00
Hans Wennborg 4e6b857922 Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter."
It causes clang to crash while building Chromium. See https://crbug.com/952230
for reproducer.

> The PrologEpilogInserter need to insert a DW_OP_deref_size before
> prepending a memory location expression to an already implicit
> expression to avoid having the existing expression act on the memory
> address instead of the value behind it.
>
> The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
> big-endian targets need to read the right size as simply truncating a
> larger read would yield the wrong result (LSB bytes are not at the lower
> address).
>
> Differential Revision: https://reviews.llvm.org/D59687

llvm-svn: 358281
2019-04-12 12:54:52 +00:00
Eugene Leviant 88089fed9c [llvm-objcopy] Fill .symtab_shndx section correctly
Differential revision: https://reviews.llvm.org/D60555

llvm-svn: 358278
2019-04-12 11:59:30 +00:00
Fangrui Song fb79ff6ab5 Use llvm::upper_bound. NFC
llvm-svn: 358277
2019-04-12 11:31:16 +00:00
Kang Zhang 2446f843ae [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options need pass-name as the parameters.
But if we use the pass-name ppc-early-ret, we will get below error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
Below pass-names have the pass is not registered error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248

llvm-svn: 358271
2019-04-12 09:59:40 +00:00
Jeremy Morse 32afe6a1f8 [DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc
Bug: https://bugs.llvm.org/show_bug.cgi?id=41175

In the bug test case the DSE pass is shortening the range of memory that a
memset is working on. A getelementptr is generated so that the new
starting address can be passed to memset. This instruction was not given
a DebugLoc.

To fix the bug, copy the DebugLoc from the memset instruction.

Patch by Orlando Cazalet-Hyams!

Differential Revision: https://reviews.llvm.org/D60556

llvm-svn: 358270
2019-04-12 09:47:35 +00:00