Commit Graph

113570 Commits

Author SHA1 Message Date
Lang Hames 502f81e37e [ORC] Preserve Materializing symbol flag during resolution.
llvm-svn: 332899
2018-05-21 21:11:22 +00:00
Lang Hames 0b0b41fcce [ORC] Lookup now returns an error if any symbols are not found.
Also tightens the behavior of ExecutionSession::failQuery. Queries can usually
only be failed by marking a symbol as failed-to-materialize, but
ExecutionSession::failQuery provides a second route, and both routes may be
executed from different threads. In the case that a query has already been
failed due to a materialization error, ExecutionSession::failQuery will
direct the error to ExecutionSession::reportError instead.

llvm-svn: 332898
2018-05-21 21:11:21 +00:00
Lang Hames add9b6805c [ORC] Remove the optional MaterializationResponsibility argument from lookup.
The lookup function provides blocking symbol resolution for JIT clients (not
layers themselves) so it does not need to track symbol dependencies via a
MaterializationResponsibility.

llvm-svn: 332897
2018-05-21 21:11:21 +00:00
Lang Hames 1cf9987f6e [ORC] Add IRLayer and ObjectLayer interfaces and related MaterializationUnits.
llvm-svn: 332896
2018-05-21 21:11:13 +00:00
Craig Topper 25444c852a [DAGCombiner] Use computeKnownBits to match rotate patterns that have had their amount masking modified by simplifyDemandedBits
SimplifyDemandedBits can remove bits from the masks for the shift amounts we need to see to detect rotates.

This patch uses zeroes from computeKnownBits to fill in some of these mask bits to make the match work.

As currently written this calls computeKnownBits even when the mask hasn't been simplified because it made the code simpler. If we're worried about compile time performance we can improve this.

I know we're talking about making a rotate intrinsic, but hopefully we can go ahead and do this change and just make sure the rotate intrinsic also handles it.

Differential Revision: https://reviews.llvm.org/D47116

llvm-svn: 332895
2018-05-21 21:09:18 +00:00
Reid Kleckner 537917d13c [X86] Simplify some X86 address mode folding code, NFCI
This code should really do exactly the same thing for 32-bit x86 and
64-bit small code models, with the exception that RIP-relative
addressing can't use base and index registers.

llvm-svn: 332893
2018-05-21 21:03:19 +00:00
Craig Topper aad3aefaeb [X86] Remove masking from vpternlog intrinsics. Use a select in IR instead.
This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics.

Differential Revision: https://reviews.llvm.org/D47124

llvm-svn: 332890
2018-05-21 20:58:09 +00:00
Peter Collingbourne 274c4f7ab4 Fix a make_unique ambiguity.
llvm-svn: 332889
2018-05-21 20:56:28 +00:00
Sanjay Patel b8346e3f07 [InstCombine] remove fptrunc (select) code; NFCI
This pattern is handled within commonCastTransforms(),
so the code here is dead AFAICT.

llvm-svn: 332887
2018-05-21 20:39:35 +00:00
Peter Collingbourne c5a9765cea LTO: Replace split dwarf implementation that uses objcopy with one that uses direct emission.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47091

llvm-svn: 332884
2018-05-21 20:26:49 +00:00
Peter Collingbourne 9a45114b3c CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881
2018-05-21 20:16:41 +00:00
Peter Collingbourne 63062d9d0f MC: Introduce an ELF dwo object writer and teach llvm-mc about it.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47051

llvm-svn: 332875
2018-05-21 19:44:54 +00:00
Jonas Devlieghere c111382aa8 [DebugInfo] Use absolute addresses in location lists
Rather than relying on the user to do the address calculating in
DW_AT_location we should just dump the absolute address.

rdar://problem/38513870

Differential revision: https://reviews.llvm.org/D47152

llvm-svn: 332873
2018-05-21 19:36:54 +00:00
Peter Collingbourne f0226e62a8 MC: Extract a derived class from ELFObjectWriter. NFCI.
This class will be used to create regular, non-split ELF files.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47049

llvm-svn: 332870
2018-05-21 19:30:59 +00:00
Peter Collingbourne dcd7d6c331 MC: Separate creating a generic object writer from creating a target object writer. NFCI.
With this we gain a little flexibility in how the generic object
writer is created.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47045

llvm-svn: 332868
2018-05-21 19:20:29 +00:00
Peter Collingbourne a29fe579f4 MC: Extract ELFObjectWriter's ELF writing functionality into an ELFWriter class. NFCI.
The idea is that we will be able to use this class to create multiple
files.

Differential Revision: https://reviews.llvm.org/D47048

llvm-svn: 332867
2018-05-21 19:18:28 +00:00
Peter Collingbourne 2602a0d40c Fix ubsan bounds check failure.
llvm-svn: 332866
2018-05-21 19:09:47 +00:00
Craig Topper f14e62c9a5 [EarlyCSE] Improve EarlyCSE of some absolute value cases.
Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous.

Prevously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0i, 1 or -1 constant. This isn't a very meaningful thing to return for any one. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should evebntually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same.

Differential Revision: https://reviews.llvm.org/D47037

llvm-svn: 332865
2018-05-21 18:42:42 +00:00
Peter Collingbourne 59a6fc469f MC: Remove stream and output functions from MCObjectWriter. NFCI.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47043

llvm-svn: 332864
2018-05-21 18:28:57 +00:00
Peter Collingbourne 438390fae1 MC: Have the object writers return the number of bytes written. NFCI.
This removes the last external use of the stream.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47042

llvm-svn: 332863
2018-05-21 18:23:50 +00:00
Stanislav Mekhanoshin 9badad2051 [AMDGPU] Add divergence analysis as a dependency for ISel
AMDGPUDAGToDAGISel adds DivergenceAnalysis in getAnalysisUsage
but does not list it in pass dependencies which may lead to
crash.

Differential Revision: https://reviews.llvm.org/D47151

llvm-svn: 332862
2018-05-21 18:18:52 +00:00
Peter Collingbourne f17b149d8c MC: Change object writers to use endian::Writer. NFCI.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47040

llvm-svn: 332861
2018-05-21 18:17:42 +00:00
Diego Caballero 168d04d544 [VPlan] Reland r332654 and silence unused func warning
r332654 was reverted due to an unused function warning in
release build. This commit includes the same code with the
warning silenced.

Differential Revision: https://reviews.llvm.org/D44338

llvm-svn: 332860
2018-05-21 18:14:23 +00:00
Peter Collingbourne 147db3e628 MC: Change MCAssembler::writeSectionData and writeFragmentPadding to take a raw_ostream. NFCI.
Also clean up a couple of hacks where we were writing the section
contents to another stream by setting the object writer's stream,
writing and setting it back.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47038

llvm-svn: 332858
2018-05-21 18:11:35 +00:00
Peter Collingbourne 571a3301ae MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI.
To make this work I needed to add an endianness field to MCAsmBackend
so that writeNopData() implementations know which endianness to use.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47035

llvm-svn: 332857
2018-05-21 17:57:19 +00:00
Tom Stellard a91ce17b5f AMDGPU/GlobalISel: Address post-commit review comments for r332379
MCRegisterInfo::getPhysRegSize() will be deprecated.

llvm-svn: 332856
2018-05-21 17:49:31 +00:00
Alexey Bataev 7c9ad0db3d [InstCombine] Fix PR37526: MinMax patterns produce an infinite loop.
Summary:
This patch fixes PR37526 by simplifying the newly generated LoadInst
instructions. If the pointer address is a bitcast from the pointer to
the NewType, we can just remove this extra bitcast instead of creating
the new one. This fixes the PR37526 + may speed up the whole compilation
process.

Reviewers: spatel, RKSimon, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47144

llvm-svn: 332855
2018-05-21 17:46:34 +00:00
Andrea Di Biagio b5757abefb [X86][BtVer2] Add a 'J' prefix to the PRF/RCU defs. NFC
This is to keep the Jaguar model's naming convention. Processor resources all
have a 'J' prefix in the BtVer2 scheduling model.

llvm-svn: 332851
2018-05-21 16:30:26 +00:00
Robert Widmann 38fa750b7a [LLVM-C] Add DIBuilder Bindings For ObjC Classes
Summary: Add LLVMDIBuilderCreateObjCIVar, LLVMDIBuilderCreateObjCProperty, and LLVMDIBuilderCreateInheritance to allow declaring metadata for Objective-C class hierarchies and their associated properties and instance variables.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: harlanhaskins, llvm-commits

Differential Revision: https://reviews.llvm.org/D47123

llvm-svn: 332850
2018-05-21 16:27:35 +00:00
Lama Saba 9417f7ff2e [X86] - Avoid SFB pass - fix bug in updating the offsets for newly created copies
Change-Id: I169ab6fe7e187727c0298c2a1e2868a683f3e688
llvm-svn: 332849
2018-05-21 16:23:16 +00:00
James Henderson 004b729ed1 [DWARF] Refactor callback usage for .debug_line error handling
Change the "recoverable" error callback to take an Error instaed of a
string.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D46831

llvm-svn: 332845
2018-05-21 15:30:54 +00:00
Simon Pilgrim a8869e68a9 [X86][SSE] Add an assert to ensure that rotation amount is converted to a scale
Missed in rL332832 where we added SSE v4i32 rotations for PR37426.

llvm-svn: 332844
2018-05-21 15:17:23 +00:00
Tim Northover 4e3eec39fa ARM: be conservative when asked load/store alignment of weird type.
Chances are we'll be asked again after type legalization, but before that point
it's better to claim misaligned accesses aren't allowed than to assert.

llvm-svn: 332840
2018-05-21 12:43:54 +00:00
Nico Weber e4a12cfa2f revert r332610, it breaks cfi, see D46326
llvm-svn: 332838
2018-05-21 11:44:39 +00:00
Aleksandar Beserminji 4977705727 [mips] Revert Merge MipsLongBranch and MipsHazardSchedule passes
Revert this patch due buildbot failure.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332837
2018-05-21 11:38:52 +00:00
David Green 8ceab61c75 [CVP] Require DomTree for new Pass Manager
We were previously using a DT in CVP through SimplifyQuery, but not requiring it in
the new pass manager. Hence it would crash if DT was not already available. This now
gets DT directly and plumbs it through to where it is used (instead of using it
through SQ).

llvm-svn: 332836
2018-05-21 11:06:28 +00:00
Eric Christopher 563d0b9cb9 Fix up a few grammar issues.
llvm-svn: 332835
2018-05-21 10:27:36 +00:00
Aleksandar Beserminji de7be5e46f [mips] Merge MipsLongBranch and MipsHazardSchedule passes
MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass
because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it
potentially breaks some jumps, so they have to be expanded to long
branches. When some branch is expanded to long branch, it potentially
creates a hazard situation, which should be fixed by adding nops.
New pass is called MipsBranchExpansion, it combines these two passes,
and runs them alternately until one of them reports no changes were made.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332834
2018-05-21 10:20:02 +00:00
Simon Pilgrim 5aa7cdfd70 [X86][SSE] Support v4i32 rotations (PR37426)
As suggested by Fabian on PR37426, we can use PMULUDQ to perform v4i32 vector rotations as the upper 32bits of the multiply will contain the 'wrapped' bits of the rotation.

v8i16/v16i8 rotations would be straightforward to add to lowerRotate in the future - ideally we'd mostly share code with the vector shifts lowering.

Differential Revision: https://reviews.llvm.org/D46954

llvm-svn: 332832
2018-05-21 09:45:59 +00:00
Robert Widmann 360d6e35e6 [LLVM-C] Improve Bindings For Aliases
Summary: Add wrappers for a module's alias iterators and a getter and setter for the aliasee value.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D46808

llvm-svn: 332826
2018-05-20 23:49:08 +00:00
Craig Topper e4c045b7df [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all intrinsics.

llvm-svn: 332824
2018-05-20 23:34:04 +00:00
Nico Weber 41597b92b1 Revert 332750, llvm part (see comment on D46910).
llvm-svn: 332823
2018-05-20 23:03:17 +00:00
Simon Dardis 777afc7fbd [mips] Add microMIPSR6 ll/sc instructions.
Previously the compiler was using the microMIPSR3 variants, incorrectly.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46948

llvm-svn: 332820
2018-05-20 17:21:00 +00:00
Sanjay Patel a003c728a5 [InstCombine] choose 1 form of abs and nabs as canonical
We already do this for min/max (see the blob above the diff), 
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR 
transforms and it's likely the best for codegen.

This might solve the motivating cases for D47037 and D47041, 
but I think those patches still make sense. We can't guarantee 
this canonicalization if the icmp has more than one use.

Differential Revision: https://reviews.llvm.org/D47076

llvm-svn: 332819
2018-05-20 14:23:23 +00:00
Haicheng Wu 69ba0613f2 [GlobalMerge] Exit early if only one global is to be merged
To save some compilation time and prevent some unnecessary changes.

Differential Revision: https://reviews.llvm.org/D46640

llvm-svn: 332813
2018-05-19 18:00:02 +00:00
Brian Gesiak 9968e0dd49 Re-revert "[Option] Fix PR37006 prefix choice in findNearest"
Summary:
Reverting due to a test failure in an llvm-mt test on some buildbots, namely
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/26020/.

llvm-svn: 332812
2018-05-19 16:21:01 +00:00
Robert Widmann 025c78f5d7 [LLVM-C] Use Length-Providing Value Name Getters and Setters
Summary:
- Provide LLVMGetValueName2 and LLVMSetValueName2 that return and take the length of the provided C string respectively
- Deprecate LLVMGetValueName and LLVMSetValueName

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D46890

llvm-svn: 332810
2018-05-19 15:08:36 +00:00
Max Kazantsev c0b268f90c [IRCE] Fix miscompile with range checks against negative values
In the patch rL329547, we have lifted the over-restrictive limitation on collected range
checks, allowing to work with range checks with the end of their range not being
provably non-negative. However it appeared that the non-negativity of this value was
assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based
on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative.

The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK)
and the second time with `X = IRC.getEnd()`, where we may now see the problem if
the end is actually a negative value. In this case, we may sometimes miscompile.

This patch is the conservative fix of the miscompile problem. Rather than rejecting
non-provably non-negative `getEnd()` values, we will check it for non-negativity in
runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X`
is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe
iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original
`[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked
correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative.

So we in fact prohibit execution of the main loop if at least one of range checks was
made against a negative value (and we figured it out in runtime). It is still better than
what we have before (non-negativity had to be proved in compile time) and prevents
us from miscompile, however it is sometiles too restrictive for unsigned range checks
against a negative value (which in fact can be eliminated).

Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly,
this limitation can be lifted, too.

Differential Revision: https://reviews.llvm.org/D46860
Reviewed By: samparker

llvm-svn: 332809
2018-05-19 13:06:37 +00:00
Benjamin Kramer a76b64ff80 [MergeICmps] Don't crash when memcmp is not available
Fixes clang crashing with -fno-builtin, PR37527.

llvm-svn: 332808
2018-05-19 12:51:59 +00:00
Simon Pilgrim ede0e4073e Fix MSVC unused variable warning. NFCI.
AMDGPURegisterInfo::getSubRegFromChannel is a static method - we don't need to get the AMDGPURegisterInfo instance.

llvm-svn: 332807
2018-05-19 12:46:02 +00:00
Brian Gesiak 8cfb4b6d41 Un-revert "[Option] Fix PR37006 prefix choice in findNearest"
Summary:
In https://reviews.llvm.org/rL332804 I loosed the assertion in
the Clang driver test that forced me to revert
https://reviews.llvm.org/rL332299. Once this lands I should be
able to narrow down what caused PS4 buildbots to fail, and
reinstate the check in that test.

Test Plan: check-llvm & check-clang

llvm-svn: 332805
2018-05-19 12:03:26 +00:00
Yaxun Liu ea988f1fd9 Fix evaluator for non-zero alloca addr space
The evaluator goes through BB and creates global vars as temporary values to evaluate
results of LLVM instructions. It creates undef for alloca, however it assumes alloca
in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid
cast instruction.

This patch let the temp global var have an address space matching alloca addr space,
so that the valuation can be done.

Differential Revision: https://reviews.llvm.org/D47081

llvm-svn: 332794
2018-05-19 02:58:16 +00:00
Piotr Padlewski 5642a42442 Propagate nonnull and dereferenceable throught launder
Summary:
invariant.group.launder should not stop propagation
of nonnull and dereferenceable, because e.g. we would not be
able to hoist loads speculatively.

Reviewers: rsmith, amharc, kuhar, xbolva00, hfinkel

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46972

llvm-svn: 332788
2018-05-18 23:54:33 +00:00
Piotr Padlewski ce358262eb Dissallow non-empty metadata for invariant.group
Summary:
This feature is not needed, but it might be usefull in the future
to use metadata to mark what which function should support it
(and strip it when not).

Reviewers: rsmith, sanjoy, amharc, kuhar

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45419

llvm-svn: 332787
2018-05-18 23:53:46 +00:00
Piotr Padlewski a26a08cb52 Constant fold launder of null and undef
Summary:
This might be useful because clang will add
some barriers for pointer comparisons.

Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc,
kuhar

Subscribers: davide, amharc, llvm-commits

Differential Revision: https://reviews.llvm.org/D32423

llvm-svn: 332786
2018-05-18 23:52:57 +00:00
Piotr Padlewski 153fe60079 [MemDep] Fixed handling of invariant.group
Summary:
Memdep had funny bug related to invariant.groups - because it did not
invalidated cache, in some very rare cases it was possible to show memory
dependence of the instruction that was deleted, but because other
instruction took it's place it resulted in call to vtable!
Thanks @amharc for repro!.

Reviewers: dberlin, kuhar, amharc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45320

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 332781
2018-05-18 22:40:34 +00:00
Matt Arsenault 9fc8593a77 DAG: Fix crash on shift with large shift amounts
Fixes bug 37521.

llvm-svn: 332774
2018-05-18 21:54:16 +00:00
Wolfgang Pieb 20e1546655 Fixing buildbot error introduced with r332759.
llvm-svn: 332772
2018-05-18 21:44:28 +00:00
Matt Arsenault 372d796ab1 AMDGPU: Add pass to optimize reqd_work_group_size
Eliminate loads from the dispatch packet when they will have
a known value.

Also pattern match the code used by the library to handle partial
workgroup dispatches, which isn't necessary if reqd_work_group_size
is used.

llvm-svn: 332771
2018-05-18 21:35:00 +00:00
Craig Topper 0198b73769 [InstCombine] Qualify a select pattern based transform to restrct to only min/max and ignore abs/nabs.
llvm-svn: 332770
2018-05-18 21:21:56 +00:00
Sam Clegg 4bbc6b55e7 [WebAssembly] Object: Add more error checking for object file reading
This should address some the assert failures the fuzzer has been
finding such as:
  https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719

Differential Revision: https://reviews.llvm.org/D47046

llvm-svn: 332769
2018-05-18 21:08:26 +00:00
Wolfgang Pieb 401b5ecfea Addressing a couple of compiler warnings introduced with r332759.
llvm-svn: 332766
2018-05-18 20:51:16 +00:00
Wolfgang Pieb da71639cdb Fixing build error introduced with r332759.
llvm-svn: 332762
2018-05-18 20:35:13 +00:00
Evgeniy Stepanov 28f330fd6f [msan] Don't check divisor shadow in fdiv.
Summary:
Floating point division by zero or even undef does not have undefined
behavior and may occur due to optimizations.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37523.

Reviewers: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47085

llvm-svn: 332761
2018-05-18 20:19:53 +00:00
Wolfgang Pieb ad60559be7 [DWARF v5] Improved support for .debug_rnglists (consumer). Enables any consumer to
extract DWARF v5 encoded rangelists.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D45549

llvm-svn: 332759
2018-05-18 20:12:54 +00:00
Jessica Paquette c604817493 [NFC] Change cast from r332739 to a static cast
The casts in the delta computation for size remarks should have
been static casts. This fixes that.

Thanks to Dávid Bolvanský for pointing that out.

llvm-svn: 332758
2018-05-18 20:04:21 +00:00
Peter Collingbourne e3f652973e Support: Simplify endian stream interface. NFCI.
Provide some free functions to reduce verbosity of endian-writing
a single value, and replace the endianness template parameter with
a field.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47032

llvm-svn: 332757
2018-05-18 19:46:24 +00:00
Konstantin Zhuravlyov caa8251971 AMDGPU/NFC: Set symbol's type that is coming from an argument in
EmitAMDGPUSymbolType, instead of hard-coding it to STT_AMDGPU_HSA_KERNEL.

llvm-svn: 332753
2018-05-18 18:41:37 +00:00
Petr Hosek 24b61ac832 [Support] Avoid normalization in sys::getDefaultTargetTriple
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.

Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.

Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.

Differential Revision: https://reviews.llvm.org/D46910

llvm-svn: 332750
2018-05-18 18:33:07 +00:00
Peter Collingbourne f7b81db715 MC: Change the streamer ctors to take an object writer instead of a stream. NFCI.
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47050

llvm-svn: 332749
2018-05-18 18:26:45 +00:00
Brendon Cahoon e5ed563cc5 [Hexagon] Generate post-increment for floating point types
The code that generates post-increments for Hexagon considered
integer values only. This patch adds support to generate them for
floating point values, f32 and f64.

Differential Revision: https://reviews.llvm.org/D47036

llvm-svn: 332748
2018-05-18 18:14:44 +00:00
Galina Kistanova 083ea389d6 Reverted r332654 as it has broken some buildbots and left unfixed for a long time.
The introduced problem is:
llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function]
static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) {
            ^

llvm-svn: 332747
2018-05-18 18:14:06 +00:00
Simon Pilgrim 1273f4ad93 [X86] Add GPR<->XMM Schedule Tags
BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMm->GPR delays (see rL332737)

The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1:
SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM)
Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner)

llvm-svn: 332745
2018-05-18 17:58:36 +00:00
Craig Topper f94ed26ea9 [X86] Directly legalize v16i16/v8i16 vselect to vXi8 vselect to use VPBLENDVB
The intrinsic legalization for masked truncate uses ISD::TRUNCATE which can be constant folded by getNode. This prevents getVectorMaskingNode from seeing the ISD::TRUNCATE special case where it should emit X86ISD::SELECT instead of ISD::VSELECT. This causes a vselect with a v16i1 or v8i1 condition to be emitted during vector legalization. but vector legalization doesn't revisit nodes it creates. DAG combine will then promote this condition to match the result type. Then op legalization will try to legalize it, but the custom lowering hook returned SDValue(). But op legalization doesn't have an Expand for VSELECT because it expects vector legalization to have taken care of it. So the operation sticks around and fails in isel.

This patch adds a custom legalization hook to morph it to a vXi8 vselect instead.

This also simplifies the normal vXi16 vselect handling because vector legalization was normally expanding to AND/ANDN/OR and DAG combine was turning that into VBLENDVB. So we can skip a step by doing it directly.

Fixes PR37499

Differential Revision: https://reviews.llvm.org/D47025

llvm-svn: 332743
2018-05-18 17:48:06 +00:00
Than McIntosh 3c639dbd0d Revert changes from D46265.
This is a revert of the changes from https://reviews.llvm.org/D46265;
the new test introduced (test/CodeGen/X86/PR37310.mir) causes buildbot
failures.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47061

llvm-svn: 332742
2018-05-18 17:47:10 +00:00
Nirav Dave 588fad4d3b [MC] Relax .fill size requirements
Avoid requirement that number of values must be known at assembler
time.

Fixes PR33586.

Reviewers: rnk, peter.smith, echristo, jyknight

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46703

llvm-svn: 332741
2018-05-18 17:45:48 +00:00
Jessica Paquette e49374d009 Add remarks describing when a pass changes the IR instruction count of a module
This patch adds a remark which tells the user when a pass changes the number of
IR instructions in a module.

It can be enabled by using -Rpass-analysis=size-info.

The point of this is to make it easier to collect statistics on how passes
modify programs in terms of code size. This is similar in concept to timing
reports, but using a remark-based interface makes it easy to diff changes over
multiple compilations of the same program.

By adding functionality like this, we can see
  * Which passes impact code size the most
  * How passes impact code size at different optimization levels
  * Which pass might have contributed the most to an overall code size
    regression

The patch lives in the legacy pass manager, but since it's simply emitting
remarks, it shouldn't be too difficult to adapt the functionality to the new
pass manager as well. This can also be adapted to handle MachineInstr counts in
code gen passes.

https://reviews.llvm.org/D38768

llvm-svn: 332739
2018-05-18 17:26:39 +00:00
Simon Pilgrim 007b50fd35 [X86][BtVer2] Improve simulation of (V)PINSR values
Include the 6cy delay transferring from the GPR to FPU.

llvm-svn: 332737
2018-05-18 17:09:41 +00:00
Simon Pilgrim 3ecb0b80f6 [X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency
llvm-svn: 332722
2018-05-18 14:22:22 +00:00
Simon Pilgrim c4b8d367a8 [X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332718
2018-05-18 14:08:01 +00:00
Simon Pilgrim e819199e2a [X86][AVX] VEXTRACTF128mr store is a WriteFStoreX not WriteFStore
llvm-svn: 332715
2018-05-18 13:17:51 +00:00
Simon Pilgrim d749b321b2 [X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPD/MOVHPD/MOVLPD/MOVLPS etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332714
2018-05-18 13:13:59 +00:00
Clement Courbet 8892c7db08 [ExynosM3] Fix scheduling info.
Differential Revision: https://reviews.llvm.org/D46356

llvm-svn: 332713
2018-05-18 13:10:41 +00:00
Simon Pilgrim a325dffd36 [X86][ZnVer1] Cleanup more single match instregexs
llvm-svn: 332712
2018-05-18 13:05:26 +00:00
Than McIntosh 4c21a363af StackColoring: better handling of statically unreachable code
Summary:
Avoid assert/crash during liveness calculation in situations where the
incoming machine function has statically unreachable BBs.

Fixes PR37130.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46265

llvm-svn: 332707
2018-05-18 12:25:30 +00:00
Jonas Paulsson b51ccaf4d4 [SystemZ] Fix commit message of previous commit.
Sorry, the commit comment for r332703 is completely broken.
My mind slipped - the right description would be:

In SystemZDAGToDAGISel::Select(), in the handling for SELECT_CCMASK:

Check if UpdateNodeOperands() returns a different SDNode and in that
case call ReplaceNode.

Review: Ulrich Weigand.
llvm-svn: 332706
2018-05-18 12:07:16 +00:00
Alexander Ivchenko 5c54742da4 [X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part)
This patch aims to match the changes introduced in gcc by
https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The
IBT feature definition is removed, with the IBT instructions
being freely available on all X86 targets. The shadow stack
instructions are also being made freely available, and the
use of all these CET instructions is controlled by the module
flags derived from the -fcf-protection clang option. The hasSHSTK
option remains since clang uses it to determine availability of
shadow stack instruction intrinsics, but it is no longer directly used.

Comes with a clang patch (D46881).

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46882

llvm-svn: 332705
2018-05-18 11:58:25 +00:00
Jonas Paulsson de54c058a6 [SystemZ] Fold AHIMux in foldMemoryOperandImpl.
AHIMux can be folded the same way as AHI.

Review: Ulrich Weigand
llvm-svn: 332703
2018-05-18 11:54:04 +00:00
David Stenberg 0af67e5b65 [SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest()
Summary:
Fix a case where FoldBranchToCommonDest() would bail out from doing CSE
when encountering a debug intrinsic. Handle that by skipping past the
debug intrinsics.

Also, as a minor refactoring, rename checkCSEInPredecessor() to
tryCSEWithPredecessor() to make it a bit more clear that the function
may remove instructions.

Reviewers: fhahn, craig.topper, dblaikie, xbolva00

Reviewed By: fhahn, xbolva00

Subscribers: vsk, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D46635

llvm-svn: 332698
2018-05-18 08:52:15 +00:00
Shiva Chen 6e07dfb148 [RISCV] Add WasForced parameter to MCAsmBackend::fixupNeedsRelaxationAdvanced
For RISCV branch instructions, we need to preserve relocation types when linker
relaxation enabled, so then linker could modify offset when the branch offsets
changed.

We preserve relocation types by define shouldForceRelocation.
IsResolved return by evaluateFixup will always false when shouldForceRelocation
return true. It will make RISCV MC Branch Relaxation always relax 16-bit
branches to 32-bit form, even if the symbol actually could be resolved.

To avoid 16-bit branches always relax to 32-bit form when linker relaxation
enabled, we add a new parameter WasForced to indicate that the symbol actually
couldn't be resolved and not forced by shouldForceRelocation return true.

RISCVAsmBackend::fixupNeedsRelaxationAdvanced could relax branches with
unresolved symbols by (!IsResolved && !WasForced).

RISCV MC Branch Relaxation is needed because RISCV could perform 32-bit
to 16-bit transformation in MC layer.

Differential Revision: https://reviews.llvm.org/D46350

llvm-svn: 332696
2018-05-18 06:42:21 +00:00
Serguei Katkov 5095883fe9 [LICM] Extend the MustExecute scope
CanProveNotTakenFirstIteration utility does not handle the case when
condition of the branch is a constant. Add its handling.

Reviewers: reames, anna, mkazantsev
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46996

llvm-svn: 332695
2018-05-18 04:56:28 +00:00
Walter Lee cdbb207bd1 [asan] Add instrumentation support for Myriad
1. Define Myriad-specific ASan constants.

2. Add code to generate an outer loop that checks that the address is
   in DRAM range, and strip the cache bit from the address.  The
   former is required because Myriad has no memory protection, and it
   is up to the instrumentation to range-check before using it to
   index into the shadow memory.

3. Do not add an unreachable instruction after the error reporting
   function; on Myriad such function may return if the run-time has
   not been initialized.

4. Add a test.

Differential Revision: https://reviews.llvm.org/D46451

llvm-svn: 332692
2018-05-18 04:10:38 +00:00
Eric Christopher 68f2218e1e Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission.""
This reapplies commits: r330271, r330592, r330779.

    [DEBUG] Initial adaptation of NVPTX target for debug info emission.

    Summary:
    Patch adds initial emission of the debug info for NVPTX target.
    Currently, only .file and .loc directives are emitted, everything else is
    commented out to not break the compilation of Cuda.

llvm-svn: 332689
2018-05-18 03:13:08 +00:00
Eric Christopher f31e91e4a8 Tidy comment up a bit.
llvm-svn: 332687
2018-05-18 02:39:57 +00:00
Eli Friedman d268bf0a4d Fix unused lambda capture.
llvm-svn: 332686
2018-05-18 02:11:25 +00:00
Eli Friedman 4081a57af7 [MachineOutliner] Count savings from outlining in bytes.
Counting the number of instructions is both unintuitive and inaccurate.
On AArch64, this only affects the generated remarks and certain rare
pseudo-instructions, but it will have a bigger impact on other targets.

Differential Revision: https://reviews.llvm.org/D46921

llvm-svn: 332685
2018-05-18 01:52:16 +00:00
Keno Fischer e07153a859 [X86DomainReassignment] Don't compare stack-allocated values by address
Summary:
The Closure allocated in the main loop is allocated on the stack. However,
later in the code its address is taken (and used for comparisons). This
obviously doesn't work. In fact, the Closure will get the same stack address
during every loop iteration, rendering the check that intended to identify
Closure conflicts entirely ineffective. Fix this bug by giving every Closure
a unique ID and using that for comparison. Alternatively, we could heap
allocate the closure object.

Fixes PR37396
Fixes JuliaLang/julia#27032

Reviewers: craig.topper, guyblank

Reviewed By: craig.topper

Subscribers: vchuravy, llvm-commits

Differential Revision: https://reviews.llvm.org/D46800

llvm-svn: 332682
2018-05-18 01:03:01 +00:00
Keno Fischer 66ab99c3ee [X86DomainReassignment] Don't delete IMPLICIT_DEF nodes
Summary:
We cannot simply delete IMPLICIT_DEF nodes. They may be used
later (e.g. by a PHI) and deleting them will cause later passes (e.g.
LiveVariables) to crash. However, it seems fine to ignore them for
purposes of the domain reassignment (as we do with PHI).

Fixes PR37430
Fixes JuliaLang/julia#27080

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D46797

llvm-svn: 332680
2018-05-18 00:40:52 +00:00
Zachary Turner c762666e87 Resubmit [pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
This fixes the remaining failing tests, so resubmitting with no
functional change.

llvm-svn: 332676
2018-05-17 22:55:15 +00:00
Peter Collingbourne 070777dbdd Support: Add a raw_ostream::write_zeros() function. NFCI.
This will eventually replace MCObjectWriter::WriteZeros.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47033

llvm-svn: 332675
2018-05-17 22:11:43 +00:00
George Burgess IV c6526176cf Revert r332657: "[AA] cfl-anders-aa with field sensitivity"
I don't believe the person who LGTMed this review has appropriate
context on this code. I apologize if I'm wrong.

llvm-svn: 332674
2018-05-17 21:56:39 +00:00
Changpeng Fang 860d460063 AMDGPU/SI: Don't promote alloca to vector for atomic load/store
Summary:
  Don't promote alloca to vector for atomic load/store

Reviewer:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D46085

llvm-svn: 332673
2018-05-17 21:49:44 +00:00
Zachary Turner 1de9fce151 Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
A few tests haven't been properly updated, so reverting while
I have time to investigate proper fixes.

llvm-svn: 332672
2018-05-17 21:49:25 +00:00
Zachary Turner 3c4c8a0937 [pdb] Change /DEBUG:GHASH to emit 8 byte hashes.
Previously we emitted 20-byte SHA1 hashes.  This is overkill
for identifying debug info records, and has the negative side
effect of making object files bigger and links slower.  By
using only the last 8 bytes of a SHA1, we get smaller object
files and ~10% faster links.

This modifies the format of the .debug$H section by adding a new
value for the hash algorithm field, so that the linker will still
work when its object files have an old format.

Differential Revision: https://reviews.llvm.org/D46855

llvm-svn: 332669
2018-05-17 21:22:48 +00:00
Heejin Ahn b4be38fcdd [WebAssembly] Add Wasm personality and isScopedEHPersonality()
Summary:
- Add wasm personality function
- Re-categorize the existing `isFuncletEHPersonality()` function into
two different functions: `isFuncletEHPersonality()` and
`isScopedEHPersonality(). This becomes necessary as wasm EH uses scoped
EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not
outlined funclets.
- Changed some callsites of `isFuncletEHPersonality()` to
`isScopedEHPersonality()` if they are related to scoped EH IR-level
stuff.

Reviewers: majnemer, dschuff, rnk

Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D45559

llvm-svn: 332667
2018-05-17 20:52:03 +00:00
Lang Hames ecb3e50041 [ORC] Consolidate materialization errors, and generate them in VSO's
notifyFailed method rather than passing in an error generator.

VSO::notifyFailed is responsible for notifying queries that they will not
succeed due to error. In practice the queries don't care about the details
of the failure, just the fact that a failure occurred for some symbols.
Having VSO::notifyFailed take care of this simplifies the interface.

llvm-svn: 332666
2018-05-17 20:48:58 +00:00
Reid Kleckner f40f85868e [codeview] Include record prefix in global type hashing
The prefix includes type kind, which is important to preserve. Two
different type leafs can easily have the same interior record contents
as another type.

We ran into this issue in PR37492 where a bitfield type record collided
with a const modifier record. Their contents were bitwise identical, but
their kinds were different.

llvm-svn: 332664
2018-05-17 20:47:22 +00:00
Peter Collingbourne 0d8fa1b6fd ARC, Nios2: Silence build warnings. NFCI.
llvm-svn: 332663
2018-05-17 20:46:01 +00:00
David Bolvansky b1c59e3f30 [AA] cfl-anders-aa with field sensitivity
Summary:
There was some unfinished work started for offset tracking in CFLGraph by the author of implementation of Andersen algorithm. This work was completed and support for field sensitivity was added to the core of Andersen algorithm.

The performance results seem promising.

SPEC2006 int_base score was increased by 1.1 % (I  compared clang 6.0 with clang 6.0 with this patch). The avergae compile time was increased by +- 1 % according my measures with small and medium C/C++ projects (I did not tested it on the large projects with milions of lines of code)

Reviewers: chandlerc, george.burgess.iv, rja

Reviewed By: rja

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46282

llvm-svn: 332657
2018-05-17 20:23:33 +00:00
Diego Caballero f58ad3129c [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.
Patch #3 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Expected to be NFC for the current inner loop vectorization path. It
introduces the basic algorithm to build the VPlan plain CFG (single-level
CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization
path using VPInstructions. It includes:
  - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
  - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
  - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.

Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.

Differential Revision: https://reviews.llvm.org/D44338

llvm-svn: 332654
2018-05-17 19:24:47 +00:00
Xinliang David Li bc471c39ee Add a limit for phi folding instcombine
Differential Revision: http://reviews.llvm.org/D47023

llvm-svn: 332653
2018-05-17 19:24:03 +00:00
Sameer AbuAsal 1dc0a8fb18 [RISCV] Separate base from offset in lowerGlobalAddress
Summary:
When lowering global address, lower the base as a TargetGlobal first then
 create an SDNode for the offset separately and chain it to the address calculation

 This optimization will create a DAG where the base address of a global access will
 be reused between different access. The offset can later be folded into the immediate
 part of the memory access instruction.

  With this optimization we generate:

    lui a0, %hi(s)
    addi a0, a0, %lo(s) ; shared base address.

    addi a1, zero, 20 ; 2 instructions per access.
    sw a1, 44(a0)

    addi a1, zero, 10
    sw a1, 8(a0)

    addi a1, zero, 30
    sw a1, 80(a0)

    Instead of:

    lui a0, %hi(s+44) ; 3 instructions per access.
    addi a1, zero, 20
    sw a1, %lo(s+44)(a0)

    lui a0, %hi(s+8)
    addi a1, zero, 10
    sw a1, %lo(s+8)(a0)

    lui a0, %hi(s+80)
    addi a1, zero, 30
    sw a1, %lo(s+80)(a0)

    Which will save one instruction per access.

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, apazos, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D46989

llvm-svn: 332641
2018-05-17 18:14:53 +00:00
Mandeep Singh Grang ef0ebf2806 [RISCV] Implement MC layer support for the tail pseudoinstruction
Summary:
This patch implements MC support for tail psuedo instruction.
A follow-up patch implements the codegen support as well as handling of the indirect tail pseudo instruction.

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, llvm-commits

Differential Revision: https://reviews.llvm.org/D46221

llvm-svn: 332634
2018-05-17 17:31:27 +00:00
Sam Clegg c0d41195b5 [WebAssembly] MC: Fix typo in comment
llvm-svn: 332632
2018-05-17 17:15:15 +00:00
Simon Pilgrim 2782a19fad [X86] Split WriteCMOV + WriteCMOV2 scheduler classes
Handle SNB+ targets which treat CMOVA/CMOVBE specially due to partial EFLAGS handling.

llvm-svn: 332626
2018-05-17 16:47:30 +00:00
Changpeng Fang 391bcf8893 AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops.
Summary:
  The current StructurizeCFG pass only works for CFG with one exit. AMDGPUUnifyDivergentExitNodes combines multiple "return" blocks and/or "unreachable" blocks
to one exit block for the Structurizer to work. However, infinite loop is another kind of special "exit", and if we don't handle it, the case of multiple exits will prevent the structurizer from working.

In this work, for each infinite loop, we add a dummy edge to the "return" block, and thus the AMDGPUUnifyDivergentExitNodes pass will work with infinite loops.
This will make CFG with infinite loops be structurized.

Reviewer:
  nhaehnle

Differential Revision:
  https://reviews.llvm.org/D46340

llvm-svn: 332625
2018-05-17 16:45:01 +00:00
Petar Jovanovic daf5169398 [mips] Add support for Global INValidate ASE
This includes

  Instructions: ginvi, ginvt,

  Assembler directives: .set ginv, .set noginv, .module ginv, .module noginv

  Attribute: ginv

  .MIPS.abiflags: GINV (0x20000)

Patch by Vladimir Stefanovic.

Differential Revision: https://reviews.llvm.org/D46268

llvm-svn: 332624
2018-05-17 16:30:32 +00:00
Craig Topper bd332588bd [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version.
According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB.

The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far.

Differential Revision: https://reviews.llvm.org/D46988

llvm-svn: 332623
2018-05-17 16:29:52 +00:00
Alex Bradbury 6a53023b4e [RISCV] Set isReMaterializable on ADDI and LUI instructions
The isReMaterlizable flag is somewhat confusing, unlike most other instruction 
flags it is currently interpreted as a hint (mightBeRematerializable would be 
a better name). While LUI is always rematerialisable, for an instruction like 
ADDI it depends on its operands. TargetInstrInfo::isTriviallyReMaterializable 
will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn 
calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on 
the logic in the latter to pick out instances of ADDI that really are 
rematerializable.

The isReMaterializable flag does make a difference on a variety of test 
programs. The recently committed remat.ll test case demonstrates how stack 
usage is reduce and a unnecessary lw/sw can be removed. Stack usage in the 
Proc0 function in dhrystone reduces from 192 bytes to 112 bytes.

For the sake of completeness, this patch also implements 
RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of 
places, it doesn't seem to result in different codegen for any programs I've 
thrown at it. However, it is called in the rematerialisation codepath and it 
seems sensible to implement something correct here.

Differential Revision: https://reviews.llvm.org/D46182

llvm-svn: 332617
2018-05-17 15:51:37 +00:00
Simon Pilgrim b5741f5c3d [X86][BtVer2] ADC/SBB take 2cy on an ALU pipe, not 1cy like ADD/SUB
llvm-svn: 332616
2018-05-17 15:43:23 +00:00
Dmitry Mikulin 3c6b4e35bd In thin and full LTO + CFI, direct function calls may go through jump table
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets.

Differential Revision: https://reviews.llvm.org/D46326

llvm-svn: 332610
2018-05-17 14:29:07 +00:00
Alex Bradbury 5e41fc83c5 [Hexagon] Use addAliasForDirective for data directives
Data directives such as .word, .half, .hword are currently parsed using 
HexagonAsmParser::ParseDirectiveValue which effectively duplicates logic from 
AsmParser::parseDirectiveValue. This patch deletes that duplicated logic in 
favour of using addAliasForDirective.

Differential Revision: https://reviews.llvm.org/D46999

llvm-svn: 332607
2018-05-17 13:21:18 +00:00
Simon Pilgrim 0c0336e003 [X86] Split WriteADC/WriteADCRMW scheduler classes
For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX)

llvm-svn: 332605
2018-05-17 12:43:42 +00:00
Jonas Paulsson caafed5570 [SystemZ] Commenting (NFC)
Some minor commenting in scheduler files.

Review: Ulrich Weigand
llvm-svn: 332599
2018-05-17 11:53:56 +00:00
Simon Pilgrim ceb4933dc1 [X86][SNB] Minor scheduler cleanup
Merge 2 instregex and explain the VMOVDQArr/MOVDQArr difference

llvm-svn: 332591
2018-05-17 10:36:29 +00:00
Sander de Smalen 75cfa34156 [AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+scalar) store instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46680

llvm-svn: 332584
2018-05-17 09:05:41 +00:00
Mikael Holmen 2ca16899ec Require DominatorTree when requiring/preserving LoopInfo in the old pass manager
Summary:
Require DominatorTree when requiring/preserving LoopInfo in the old pass manager

BreakCriticalEdges tries to keep LoopInfo and DominatorTree updated if they
exist. However, since commit r321653 and r321805, to update LoopInfo we
must have a DominatorTree, or we will hit an assert.

To fix this we now make a couple of passes that only required/preserved
LoopInfo also require DominatorTree.

This solves PR37334.

Reviewers: eli.friedman, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D46829

llvm-svn: 332583
2018-05-17 09:05:40 +00:00
Martin Storsjo c10788728b [Analysis] Only use _unlocked stdio functions on linux
The existing comment said that the functions were available only
on GNU/Linux (and on certain Android versions), but only checked
T.isGNUEnvironment() which also is true on MinGW (for arch-windows-gnu
triplets), which doesn't have such functions.

Existing checks in the initialize function in TargetLibraryInfo.cpp
also use only T.isOSLinux() to check for glibc features.

This fixes use of stdio on MinGW.

Differential Revision: https://reviews.llvm.org/D47002

llvm-svn: 332581
2018-05-17 08:16:08 +00:00
Bjorn Pettersson 81a76a388a [SROA] Handle PHI with multiple duplicate predecessors
Summary:
The verifier accepts PHI nodes with multiple entries for the
same basic block, as long as the value is the same.

As seen in PR37203, SROA did not handle such PHI nodes properly
when speculating loads over the PHI, since it inserted multiple
loads in the predecessor block and changed the PHI into having
multiple entries for the same basic block, but with different
values.

This patch teaches SROA to reuse the same speculated load for
each PHI duplicate entry in such situations.

Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203

Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma

Reviewed By: efriedma

Subscribers: dberlin, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D46426

llvm-svn: 332577
2018-05-17 07:21:41 +00:00
Hiroshi Inoue f5c0e6c285 [SROA] pr37267: fix assertion failure in integer widening
The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad).
This patch adds explicit checks for this case in isIntegerWideningViableForSlice.
Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition.

Differential Revision: https://reviews.llvm.org/D46750

llvm-svn: 332575
2018-05-17 06:32:17 +00:00
Alex Bradbury cea6db0480 [RISCV] Add support for .half, .hword, .word, .dword directives
These directives are recognised by gas. Support is added through the use of 
addAliasForDirective.

Also match RISC-V gcc in preferring .half and .word for 16-bit and 32-bit data 
directives.

llvm-svn: 332574
2018-05-17 05:58:08 +00:00
Craig Topper a2c5264718 [X86] Add OptForSize to a couple load folding patterns. Remove some bad FIXME comments.
The FIXME comments were about preventing load folding to avoid a partial xmm update. But these instructions use GPR as input when the load isn't folded. This won't help prevent a partial xmm update.

llvm-svn: 332573
2018-05-17 05:41:11 +00:00
Dan Gohman aef674102c [WebAssembly] Fix the opcode number for i64.load16_u.
Fixes PR37488.

llvm-svn: 332561
2018-05-17 00:14:13 +00:00
Craig Topper 342273a139 [CodeGen] Use MachineInstr::getOperand(0) instead of gets the defs iterator_range and calling begin. NFC
Defs are well defined to come first in MachineInstr operand list. No need for a more complex indirection.

llvm-svn: 332559
2018-05-16 23:39:27 +00:00
Greg Clayton f81f3a838a Revert 332508 as it caused problems in the clang test suite.
llvm-svn: 332555
2018-05-16 23:29:36 +00:00
Vedant Kumar 5a0872c2b7 [STLExtras] Add size() for ranges, and remove distance()
r332057 introduced distance() for ranges. Based on post-commit feedback,
this renames distance() to size(). The new size() is also only enabled
when the operation is O(1).

Differential Revision: https://reviews.llvm.org/D46976

llvm-svn: 332551
2018-05-16 23:20:42 +00:00
JF Bastien ddc84bf7d1 [NFC] WebAssembly build break #2
Summary:
Same as r332530, move WasmSymbol::dump to an implementation file to avoid linker
issues when the dump function is seen in the header, doesn't get eliminated, and
then linking fails because of the missing dependency.

<rdar://problem/40258137>

Reviewers: sbc100, ncw, paquette, vsk, dschuff

Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D46985

llvm-svn: 332542
2018-05-16 22:31:42 +00:00
Lang Hames d261e1258c [ORC] Rewrite the VSO symbol table yet again. Update related utilities.
VSOs now track dependencies for materializing symbols. Each symbol must have its
dependencies registered with the VSO prior to finalization. Usually this will
involve registering the dependencies returned in
AsynchronousSymbolQuery::ResolutionResults for queries made while linking the
symbols being materialized.

Queries against symbols are notified that a symbol is ready once it and all of
its transitive dependencies are finalized, allowing compilation work to be
broken up and moved between threads without queries returning until their
symbols fully safe to access / execute.

Related utilities (VSO, MaterializationUnit, MaterializationResponsibility) are
updated to support dependence tracking and more explicitly track responsibility
for symbols from the point of definition until they are finalized.

llvm-svn: 332541
2018-05-16 22:24:30 +00:00
Simon Pilgrim 820433f533 [X86][SNB] Remove unnecessary CVT InstRW overrides
llvm-svn: 332536
2018-05-16 22:14:29 +00:00
Sam Clegg 6a32560886 [WebAssembly] Remove unused headers in MCWasmObjectWriter
Differential Revision: https://reviews.llvm.org/D46969

llvm-svn: 332535
2018-05-16 22:13:18 +00:00
Benjamin Kramer 8ac15bf4dc [InstCombine] Fix the signature of fgets_unlocked.
It returns a pointer, not an int. This miscompiles all code that uses
the return value of fgets.

llvm-svn: 332531
2018-05-16 21:45:39 +00:00
JF Bastien 659932b0b2 [NFC] WebAssembly build fix
Summary:
r332305 added a use of llvm::wasm::toString in llvm::object::WasmSymbol::print,
which is in a header file. It also moves toString to BinaryFormat. This has the
unintended side-effect that any inclusion of Object/Wasm.h now relies on
toString, and needs to required_libraries = BinaryFormat. Thankfully most builds
don't fail with this because print just isn't used and gets eliminated, dropping
the required dependency in the process. Not all builds are so lucky.

Fix this issue by moving print to the corresponding .cpp file.

<rdar://problem/40258137>

Reviewers: sbc100, ncw, paquette

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D46977

llvm-svn: 332530
2018-05-16 21:24:03 +00:00
Eli Friedman ddbf6d6514 [MachineOutliner] Don't outline instructions that modify SP.
This breaks the code which saves and restores LR, so we can't outline
without doing something more complicated for stack adjustment.

Found by inspection; we get lucky in most cases because getMemOpInfo
only handles STRWpost, not any other pre/post-increment forms. But it
hits a couple of artificial testcases in the tree.

Differential Revision: https://reviews.llvm.org/D46920

llvm-svn: 332529
2018-05-16 21:20:16 +00:00
Krzysztof Parzyszek f18009dbc6 [Hexagon] Fix the order of operands when selecting QCAT
llvm-svn: 332526
2018-05-16 21:02:43 +00:00
Krzysztof Parzyszek e8a0ae7346 [Hexagon] Mark HVX vector predicate bitwise ops as legal, add patterns
llvm-svn: 332525
2018-05-16 21:00:24 +00:00
Simon Pilgrim 2e0f6c9b21 [X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)
As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Some of this ideally would be done by combineTargetShuffle but its tricky to do as most of the shuffles are sharing inputs.

Differential Revision: https://reviews.llvm.org/D46959

llvm-svn: 332524
2018-05-16 20:52:52 +00:00
Konstantin Zhuravlyov c72ece6c2c AMDGPU : Recalculate SGPRs when trap handler is supported
Differential Revision: https://reviews.llvm.org/D29911

llvm-svn: 332523
2018-05-16 20:47:48 +00:00
Eric Christopher 1f5eb86b51 Fix small grammar-o.
llvm-svn: 332522
2018-05-16 20:34:00 +00:00
Eric Christopher fb923d28a9 Fix up a misleading format warning.
llvm-svn: 332521
2018-05-16 20:33:59 +00:00
Sam Clegg 6ccb59b3e9 [WebAssembly] MC: Ensure that FUNCTION_OFFSET relocations are always against function symbols.
The getAtom() method wasn't doing what we needed in all cases. We want
the symbols for the function which defines that section. We can compute
this easily enough and we know that we have at most one function in each
section.

Once this lands I will revert rL331412 which is no longer needed.

Fixes PR37409

Differential Revision: https://reviews.llvm.org/D46970

llvm-svn: 332517
2018-05-16 20:09:05 +00:00