Commit Graph

53347 Commits

Author SHA1 Message Date
Nico Weber e4a12cfa2f revert r332610, it breaks cfi, see D46326
llvm-svn: 332838
2018-05-21 11:44:39 +00:00
Aleksandar Beserminji 4977705727 [mips] Revert Merge MipsLongBranch and MipsHazardSchedule passes
Revert this patch due buildbot failure.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332837
2018-05-21 11:38:52 +00:00
David Green 8ceab61c75 [CVP] Require DomTree for new Pass Manager
We were previously using a DT in CVP through SimplifyQuery, but not requiring it in
the new pass manager. Hence it would crash if DT was not already available. This now
gets DT directly and plumbs it through to where it is used (instead of using it
through SQ).

llvm-svn: 332836
2018-05-21 11:06:28 +00:00
Aleksandar Beserminji de7be5e46f [mips] Merge MipsLongBranch and MipsHazardSchedule passes
MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass
because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it
potentially breaks some jumps, so they have to be expanded to long
branches. When some branch is expanded to long branch, it potentially
creates a hazard situation, which should be fixed by adding nops.
New pass is called MipsBranchExpansion, it combines these two passes,
and runs them alternately until one of them reports no changes were made.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332834
2018-05-21 10:20:02 +00:00
Simon Pilgrim 5aa7cdfd70 [X86][SSE] Support v4i32 rotations (PR37426)
As suggested by Fabian on PR37426, we can use PMULUDQ to perform v4i32 vector rotations as the upper 32bits of the multiply will contain the 'wrapped' bits of the rotation.

v8i16/v16i8 rotations would be straightforward to add to lowerRotate in the future - ideally we'd mostly share code with the vector shifts lowering.

Differential Revision: https://reviews.llvm.org/D46954

llvm-svn: 332832
2018-05-21 09:45:59 +00:00
Nico Weber d418776e04 win: try more to fix dia tests with newer msvc versions
llvm-svn: 332828
2018-05-21 02:55:41 +00:00
Nico Weber da5513b9c4 win: try to fix dia tests with newer msvc versions
llvm-svn: 332827
2018-05-21 02:09:57 +00:00
Robert Widmann 360d6e35e6 [LLVM-C] Improve Bindings For Aliases
Summary: Add wrappers for a module's alias iterators and a getter and setter for the aliasee value.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D46808

llvm-svn: 332826
2018-05-20 23:49:08 +00:00
Craig Topper e4c045b7df [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all intrinsics.

llvm-svn: 332824
2018-05-20 23:34:04 +00:00
Simon Dardis 777afc7fbd [mips] Add microMIPSR6 ll/sc instructions.
Previously the compiler was using the microMIPSR3 variants, incorrectly.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46948

llvm-svn: 332820
2018-05-20 17:21:00 +00:00
Sanjay Patel a003c728a5 [InstCombine] choose 1 form of abs and nabs as canonical
We already do this for min/max (see the blob above the diff), 
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR 
transforms and it's likely the best for codegen.

This might solve the motivating cases for D47037 and D47041, 
but I think those patches still make sense. We can't guarantee 
this canonicalization if the icmp has more than one use.

Differential Revision: https://reviews.llvm.org/D47076

llvm-svn: 332819
2018-05-20 14:23:23 +00:00
Craig Topper 9ed890b0db [X86] Add test cases to show missed rotate opportunities due to SimplifyDemandedBits.
llvm-svn: 332815
2018-05-20 02:32:45 +00:00
Haicheng Wu 69ba0613f2 [GlobalMerge] Exit early if only one global is to be merged
To save some compilation time and prevent some unnecessary changes.

Differential Revision: https://reviews.llvm.org/D46640

llvm-svn: 332813
2018-05-19 18:00:02 +00:00
Max Kazantsev c0b268f90c [IRCE] Fix miscompile with range checks against negative values
In the patch rL329547, we have lifted the over-restrictive limitation on collected range
checks, allowing to work with range checks with the end of their range not being
provably non-negative. However it appeared that the non-negativity of this value was
assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based
on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative.

The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK)
and the second time with `X = IRC.getEnd()`, where we may now see the problem if
the end is actually a negative value. In this case, we may sometimes miscompile.

This patch is the conservative fix of the miscompile problem. Rather than rejecting
non-provably non-negative `getEnd()` values, we will check it for non-negativity in
runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X`
is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe
iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original
`[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked
correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative.

So we in fact prohibit execution of the main loop if at least one of range checks was
made against a negative value (and we figured it out in runtime). It is still better than
what we have before (non-negativity had to be proved in compile time) and prevents
us from miscompile, however it is sometiles too restrictive for unsigned range checks
against a negative value (which in fact can be eliminated).

Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly,
this limitation can be lifted, too.

Differential Revision: https://reviews.llvm.org/D46860
Reviewed By: samparker

llvm-svn: 332809
2018-05-19 13:06:37 +00:00
Benjamin Kramer a76b64ff80 [MergeICmps] Don't crash when memcmp is not available
Fixes clang crashing with -fno-builtin, PR37527.

llvm-svn: 332808
2018-05-19 12:51:59 +00:00
Yaxun Liu ea988f1fd9 Fix evaluator for non-zero alloca addr space
The evaluator goes through BB and creates global vars as temporary values to evaluate
results of LLVM instructions. It creates undef for alloca, however it assumes alloca
in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid
cast instruction.

This patch let the temp global var have an address space matching alloca addr space,
so that the valuation can be done.

Differential Revision: https://reviews.llvm.org/D47081

llvm-svn: 332794
2018-05-19 02:58:16 +00:00
Piotr Padlewski 5642a42442 Propagate nonnull and dereferenceable throught launder
Summary:
invariant.group.launder should not stop propagation
of nonnull and dereferenceable, because e.g. we would not be
able to hoist loads speculatively.

Reviewers: rsmith, amharc, kuhar, xbolva00, hfinkel

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46972

llvm-svn: 332788
2018-05-18 23:54:33 +00:00
Piotr Padlewski ce358262eb Dissallow non-empty metadata for invariant.group
Summary:
This feature is not needed, but it might be usefull in the future
to use metadata to mark what which function should support it
(and strip it when not).

Reviewers: rsmith, sanjoy, amharc, kuhar

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45419

llvm-svn: 332787
2018-05-18 23:53:46 +00:00
Piotr Padlewski a26a08cb52 Constant fold launder of null and undef
Summary:
This might be useful because clang will add
some barriers for pointer comparisons.

Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc,
kuhar

Subscribers: davide, amharc, llvm-commits

Differential Revision: https://reviews.llvm.org/D32423

llvm-svn: 332786
2018-05-18 23:52:57 +00:00
Piotr Padlewski 153fe60079 [MemDep] Fixed handling of invariant.group
Summary:
Memdep had funny bug related to invariant.groups - because it did not
invalidated cache, in some very rare cases it was possible to show memory
dependence of the instruction that was deleted, but because other
instruction took it's place it resulted in call to vtable!
Thanks @amharc for repro!.

Reviewers: dberlin, kuhar, amharc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45320

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 332781
2018-05-18 22:40:34 +00:00
Sanjay Patel b41a5affea [x86] add more FP with FMF simplification tests; NFC
llvm-svn: 332780
2018-05-18 22:31:43 +00:00
Matt Arsenault 9fc8593a77 DAG: Fix crash on shift with large shift amounts
Fixes bug 37521.

llvm-svn: 332774
2018-05-18 21:54:16 +00:00
Matt Arsenault 372d796ab1 AMDGPU: Add pass to optimize reqd_work_group_size
Eliminate loads from the dispatch packet when they will have
a known value.

Also pattern match the code used by the library to handle partial
workgroup dispatches, which isn't necessary if reqd_work_group_size
is used.

llvm-svn: 332771
2018-05-18 21:35:00 +00:00
Evgeniy Stepanov 28f330fd6f [msan] Don't check divisor shadow in fdiv.
Summary:
Floating point division by zero or even undef does not have undefined
behavior and may occur due to optimizations.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37523.

Reviewers: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47085

llvm-svn: 332761
2018-05-18 20:19:53 +00:00
Wolfgang Pieb ad60559be7 [DWARF v5] Improved support for .debug_rnglists (consumer). Enables any consumer to
extract DWARF v5 encoded rangelists.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D45549

llvm-svn: 332759
2018-05-18 20:12:54 +00:00
Michael Berg 1fa76cc3ea adding baseline fp fold tests for unsafe on and off
llvm-svn: 332756
2018-05-18 19:30:49 +00:00
Amara Emerson 08099c7edd Delete a test that was missed in the revert r332747.
r332747 originally reverted r332654 which added this test.

llvm-svn: 332755
2018-05-18 19:21:40 +00:00
Brendon Cahoon e5ed563cc5 [Hexagon] Generate post-increment for floating point types
The code that generates post-increments for Hexagon considered
integer values only. This patch adds support to generate them for
floating point values, f32 and f64.

Differential Revision: https://reviews.llvm.org/D47036

llvm-svn: 332748
2018-05-18 18:14:44 +00:00
Simon Pilgrim 1273f4ad93 [X86] Add GPR<->XMM Schedule Tags
BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMm->GPR delays (see rL332737)

The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1:
SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM)
Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner)

llvm-svn: 332745
2018-05-18 17:58:36 +00:00
Craig Topper f94ed26ea9 [X86] Directly legalize v16i16/v8i16 vselect to vXi8 vselect to use VPBLENDVB
The intrinsic legalization for masked truncate uses ISD::TRUNCATE which can be constant folded by getNode. This prevents getVectorMaskingNode from seeing the ISD::TRUNCATE special case where it should emit X86ISD::SELECT instead of ISD::VSELECT. This causes a vselect with a v16i1 or v8i1 condition to be emitted during vector legalization. but vector legalization doesn't revisit nodes it creates. DAG combine will then promote this condition to match the result type. Then op legalization will try to legalize it, but the custom lowering hook returned SDValue(). But op legalization doesn't have an Expand for VSELECT because it expects vector legalization to have taken care of it. So the operation sticks around and fails in isel.

This patch adds a custom legalization hook to morph it to a vXi8 vselect instead.

This also simplifies the normal vXi16 vselect handling because vector legalization was normally expanding to AND/ANDN/OR and DAG combine was turning that into VBLENDVB. So we can skip a step by doing it directly.

Fixes PR37499

Differential Revision: https://reviews.llvm.org/D47025

llvm-svn: 332743
2018-05-18 17:48:06 +00:00
Than McIntosh 3c639dbd0d Revert changes from D46265.
This is a revert of the changes from https://reviews.llvm.org/D46265;
the new test introduced (test/CodeGen/X86/PR37310.mir) causes buildbot
failures.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47061

llvm-svn: 332742
2018-05-18 17:47:10 +00:00
Nirav Dave 588fad4d3b [MC] Relax .fill size requirements
Avoid requirement that number of values must be known at assembler
time.

Fixes PR33586.

Reviewers: rnk, peter.smith, echristo, jyknight

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46703

llvm-svn: 332741
2018-05-18 17:45:48 +00:00
Craig Topper fdde9f363c [X86] Update fast-isel test cases for _mm256_mask_cvtepi16_epi8 to match clang r332738.
llvm-svn: 332740
2018-05-18 17:29:47 +00:00
Jessica Paquette e49374d009 Add remarks describing when a pass changes the IR instruction count of a module
This patch adds a remark which tells the user when a pass changes the number of
IR instructions in a module.

It can be enabled by using -Rpass-analysis=size-info.

The point of this is to make it easier to collect statistics on how passes
modify programs in terms of code size. This is similar in concept to timing
reports, but using a remark-based interface makes it easy to diff changes over
multiple compilations of the same program.

By adding functionality like this, we can see
  * Which passes impact code size the most
  * How passes impact code size at different optimization levels
  * Which pass might have contributed the most to an overall code size
    regression

The patch lives in the legacy pass manager, but since it's simply emitting
remarks, it shouldn't be too difficult to adapt the functionality to the new
pass manager as well. This can also be adapted to handle MachineInstr counts in
code gen passes.

https://reviews.llvm.org/D38768

llvm-svn: 332739
2018-05-18 17:26:39 +00:00
Simon Pilgrim 007b50fd35 [X86][BtVer2] Improve simulation of (V)PINSR values
Include the 6cy delay transferring from the GPR to FPU.

llvm-svn: 332737
2018-05-18 17:09:41 +00:00
Sanjay Patel 56e09c6928 [InstCombine] add tests for lack of abs/nabs canonicalization; NFC
llvm-svn: 332726
2018-05-18 15:26:38 +00:00
Sanjay Patel fa3e4601c6 [InstCombine] regenerate checks; NFC
There were a combination of auto-generated styles in use
here because the scripts have evolved. 

llvm-svn: 332725
2018-05-18 15:22:19 +00:00
Simon Pilgrim 3ecb0b80f6 [X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency
llvm-svn: 332722
2018-05-18 14:22:22 +00:00
Simon Pilgrim c4b8d367a8 [X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332718
2018-05-18 14:08:01 +00:00
Simon Pilgrim d749b321b2 [X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPD/MOVHPD/MOVLPD/MOVLPS etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332714
2018-05-18 13:13:59 +00:00
Than McIntosh 4c21a363af StackColoring: better handling of statically unreachable code
Summary:
Avoid assert/crash during liveness calculation in situations where the
incoming machine function has statically unreachable BBs.

Fixes PR37130.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46265

llvm-svn: 332707
2018-05-18 12:25:30 +00:00
Alexander Ivchenko 5c54742da4 [X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part)
This patch aims to match the changes introduced in gcc by
https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The
IBT feature definition is removed, with the IBT instructions
being freely available on all X86 targets. The shadow stack
instructions are also being made freely available, and the
use of all these CET instructions is controlled by the module
flags derived from the -fcf-protection clang option. The hasSHSTK
option remains since clang uses it to determine availability of
shadow stack instruction intrinsics, but it is no longer directly used.

Comes with a clang patch (D46881).

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46882

llvm-svn: 332705
2018-05-18 11:58:25 +00:00
Jonas Paulsson de54c058a6 [SystemZ] Fold AHIMux in foldMemoryOperandImpl.
AHIMux can be folded the same way as AHI.

Review: Ulrich Weigand
llvm-svn: 332703
2018-05-18 11:54:04 +00:00
David Stenberg 0af67e5b65 [SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest()
Summary:
Fix a case where FoldBranchToCommonDest() would bail out from doing CSE
when encountering a debug intrinsic. Handle that by skipping past the
debug intrinsics.

Also, as a minor refactoring, rename checkCSEInPredecessor() to
tryCSEWithPredecessor() to make it a bit more clear that the function
may remove instructions.

Reviewers: fhahn, craig.topper, dblaikie, xbolva00

Reviewed By: fhahn, xbolva00

Subscribers: vsk, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D46635

llvm-svn: 332698
2018-05-18 08:52:15 +00:00
Shiva Chen 6e07dfb148 [RISCV] Add WasForced parameter to MCAsmBackend::fixupNeedsRelaxationAdvanced
For RISCV branch instructions, we need to preserve relocation types when linker
relaxation enabled, so then linker could modify offset when the branch offsets
changed.

We preserve relocation types by define shouldForceRelocation.
IsResolved return by evaluateFixup will always false when shouldForceRelocation
return true. It will make RISCV MC Branch Relaxation always relax 16-bit
branches to 32-bit form, even if the symbol actually could be resolved.

To avoid 16-bit branches always relax to 32-bit form when linker relaxation
enabled, we add a new parameter WasForced to indicate that the symbol actually
couldn't be resolved and not forced by shouldForceRelocation return true.

RISCVAsmBackend::fixupNeedsRelaxationAdvanced could relax branches with
unresolved symbols by (!IsResolved && !WasForced).

RISCV MC Branch Relaxation is needed because RISCV could perform 32-bit
to 16-bit transformation in MC layer.

Differential Revision: https://reviews.llvm.org/D46350

llvm-svn: 332696
2018-05-18 06:42:21 +00:00
Serguei Katkov 5095883fe9 [LICM] Extend the MustExecute scope
CanProveNotTakenFirstIteration utility does not handle the case when
condition of the branch is a constant. Add its handling.

Reviewers: reames, anna, mkazantsev
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46996

llvm-svn: 332695
2018-05-18 04:56:28 +00:00
Keno Fischer 1e11fc1ccb [X86DomainReassignment] Hopefully fix buildbot failure
The Darwin build bot failed with:
```
llc -mcpu=skylake-avx512 -mtriple=x86_64-unknown-linux-gnu domain-reassignment-test.ll -o - | llvm-mc
--
Exit Code: 134

Command Output (stderr):
--
Assertion failed: (MAI->hasSingleParameterDotFile()), function EmitFileDirective, file lib/MC/MCAsmStreamer.cpp, line 1087.
```

Looks like this is because the `llvm-mc` command was missing a triple
directive and defaulting to MachO. Add the triple option.

llvm-svn: 332694
2018-05-18 04:36:38 +00:00
Walter Lee cdbb207bd1 [asan] Add instrumentation support for Myriad
1. Define Myriad-specific ASan constants.

2. Add code to generate an outer loop that checks that the address is
   in DRAM range, and strip the cache bit from the address.  The
   former is required because Myriad has no memory protection, and it
   is up to the instrumentation to range-check before using it to
   index into the shadow memory.

3. Do not add an unreachable instruction after the error reporting
   function; on Myriad such function may return if the run-time has
   not been initialized.

4. Add a test.

Differential Revision: https://reviews.llvm.org/D46451

llvm-svn: 332692
2018-05-18 04:10:38 +00:00
Eric Christopher 68f2218e1e Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission.""
This reapplies commits: r330271, r330592, r330779.

    [DEBUG] Initial adaptation of NVPTX target for debug info emission.

    Summary:
    Patch adds initial emission of the debug info for NVPTX target.
    Currently, only .file and .loc directives are emitted, everything else is
    commented out to not break the compilation of Cuda.

llvm-svn: 332689
2018-05-18 03:13:08 +00:00
Eli Friedman 4081a57af7 [MachineOutliner] Count savings from outlining in bytes.
Counting the number of instructions is both unintuitive and inaccurate.
On AArch64, this only affects the generated remarks and certain rare
pseudo-instructions, but it will have a bigger impact on other targets.

Differential Revision: https://reviews.llvm.org/D46921

llvm-svn: 332685
2018-05-18 01:52:16 +00:00
Keno Fischer e07153a859 [X86DomainReassignment] Don't compare stack-allocated values by address
Summary:
The Closure allocated in the main loop is allocated on the stack. However,
later in the code its address is taken (and used for comparisons). This
obviously doesn't work. In fact, the Closure will get the same stack address
during every loop iteration, rendering the check that intended to identify
Closure conflicts entirely ineffective. Fix this bug by giving every Closure
a unique ID and using that for comparison. Alternatively, we could heap
allocate the closure object.

Fixes PR37396
Fixes JuliaLang/julia#27032

Reviewers: craig.topper, guyblank

Reviewed By: craig.topper

Subscribers: vchuravy, llvm-commits

Differential Revision: https://reviews.llvm.org/D46800

llvm-svn: 332682
2018-05-18 01:03:01 +00:00
Keno Fischer 66ab99c3ee [X86DomainReassignment] Don't delete IMPLICIT_DEF nodes
Summary:
We cannot simply delete IMPLICIT_DEF nodes. They may be used
later (e.g. by a PHI) and deleting them will cause later passes (e.g.
LiveVariables) to crash. However, it seems fine to ignore them for
purposes of the domain reassignment (as we do with PHI).

Fixes PR37430
Fixes JuliaLang/julia#27080

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D46797

llvm-svn: 332680
2018-05-18 00:40:52 +00:00
Zachary Turner c762666e87 Resubmit [pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
This fixes the remaining failing tests, so resubmitting with no
functional change.

llvm-svn: 332676
2018-05-17 22:55:15 +00:00
George Burgess IV c6526176cf Revert r332657: "[AA] cfl-anders-aa with field sensitivity"
I don't believe the person who LGTMed this review has appropriate
context on this code. I apologize if I'm wrong.

llvm-svn: 332674
2018-05-17 21:56:39 +00:00
Changpeng Fang 860d460063 AMDGPU/SI: Don't promote alloca to vector for atomic load/store
Summary:
  Don't promote alloca to vector for atomic load/store

Reviewer:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D46085

llvm-svn: 332673
2018-05-17 21:49:44 +00:00
Zachary Turner 1de9fce151 Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
A few tests haven't been properly updated, so reverting while
I have time to investigate proper fixes.

llvm-svn: 332672
2018-05-17 21:49:25 +00:00
Zachary Turner 3c4c8a0937 [pdb] Change /DEBUG:GHASH to emit 8 byte hashes.
Previously we emitted 20-byte SHA1 hashes.  This is overkill
for identifying debug info records, and has the negative side
effect of making object files bigger and links slower.  By
using only the last 8 bytes of a SHA1, we get smaller object
files and ~10% faster links.

This modifies the format of the .debug$H section by adding a new
value for the hash algorithm field, so that the linker will still
work when its object files have an old format.

Differential Revision: https://reviews.llvm.org/D46855

llvm-svn: 332669
2018-05-17 21:22:48 +00:00
Reid Kleckner f40f85868e [codeview] Include record prefix in global type hashing
The prefix includes type kind, which is important to preserve. Two
different type leafs can easily have the same interior record contents
as another type.

We ran into this issue in PR37492 where a bitfield type record collided
with a const modifier record. Their contents were bitwise identical, but
their kinds were different.

llvm-svn: 332664
2018-05-17 20:47:22 +00:00
David Bolvansky b1c59e3f30 [AA] cfl-anders-aa with field sensitivity
Summary:
There was some unfinished work started for offset tracking in CFLGraph by the author of implementation of Andersen algorithm. This work was completed and support for field sensitivity was added to the core of Andersen algorithm.

The performance results seem promising.

SPEC2006 int_base score was increased by 1.1 % (I  compared clang 6.0 with clang 6.0 with this patch). The avergae compile time was increased by +- 1 % according my measures with small and medium C/C++ projects (I did not tested it on the large projects with milions of lines of code)

Reviewers: chandlerc, george.burgess.iv, rja

Reviewed By: rja

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46282

llvm-svn: 332657
2018-05-17 20:23:33 +00:00
Diego Caballero f58ad3129c [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.
Patch #3 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Expected to be NFC for the current inner loop vectorization path. It
introduces the basic algorithm to build the VPlan plain CFG (single-level
CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization
path using VPInstructions. It includes:
  - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
  - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
  - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.

Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.

Differential Revision: https://reviews.llvm.org/D44338

llvm-svn: 332654
2018-05-17 19:24:47 +00:00
Sanjay Patel ed58fed9ef [x86] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in https://reviews.llvm.org/D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332648
2018-05-17 18:43:44 +00:00
Anastasis Grammenos d6c6678766 [Debugify] Print the output to stderr
Currently debugify prints it's output to stdout,
with this patch all the output generated goes to stderr.

This change lets us use debugify without taking away
the ability to pipe the output to other llvm tools.

llvm-svn: 332642
2018-05-17 18:19:58 +00:00
Sameer AbuAsal 1dc0a8fb18 [RISCV] Separate base from offset in lowerGlobalAddress
Summary:
When lowering global address, lower the base as a TargetGlobal first then
 create an SDNode for the offset separately and chain it to the address calculation

 This optimization will create a DAG where the base address of a global access will
 be reused between different access. The offset can later be folded into the immediate
 part of the memory access instruction.

  With this optimization we generate:

    lui a0, %hi(s)
    addi a0, a0, %lo(s) ; shared base address.

    addi a1, zero, 20 ; 2 instructions per access.
    sw a1, 44(a0)

    addi a1, zero, 10
    sw a1, 8(a0)

    addi a1, zero, 30
    sw a1, 80(a0)

    Instead of:

    lui a0, %hi(s+44) ; 3 instructions per access.
    addi a1, zero, 20
    sw a1, %lo(s+44)(a0)

    lui a0, %hi(s+8)
    addi a1, zero, 10
    sw a1, %lo(s+8)(a0)

    lui a0, %hi(s+80)
    addi a1, zero, 30
    sw a1, %lo(s+80)(a0)

    Which will save one instruction per access.

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, apazos, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D46989

llvm-svn: 332641
2018-05-17 18:14:53 +00:00
Sanjay Patel 87621174ac [x86] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in https://reviews.llvm.org/D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332640
2018-05-17 18:13:58 +00:00
Sanjay Patel 5c48b73fff [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in https://reviews.llvm.org/D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332638
2018-05-17 18:09:56 +00:00
Sanjay Patel 38892e5ce1 [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in https://reviews.llvm.org/D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

Follow-up to:
https://reviews.llvm.org/rL332538
...because that change wasn't enough.

llvm-svn: 332637
2018-05-17 18:08:27 +00:00
Sanjay Patel ffd349577d [AArch64] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

Follow-up to:
https://reviews.llvm.org/rL332534
...because that change wasn't enough.

llvm-svn: 332636
2018-05-17 18:07:02 +00:00
Mandeep Singh Grang ef0ebf2806 [RISCV] Implement MC layer support for the tail pseudoinstruction
Summary:
This patch implements MC support for tail psuedo instruction.
A follow-up patch implements the codegen support as well as handling of the indirect tail pseudo instruction.

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, llvm-commits

Differential Revision: https://reviews.llvm.org/D46221

llvm-svn: 332634
2018-05-17 17:31:27 +00:00
Changpeng Fang 391bcf8893 AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops.
Summary:
  The current StructurizeCFG pass only works for CFG with one exit. AMDGPUUnifyDivergentExitNodes combines multiple "return" blocks and/or "unreachable" blocks
to one exit block for the Structurizer to work. However, infinite loop is another kind of special "exit", and if we don't handle it, the case of multiple exits will prevent the structurizer from working.

In this work, for each infinite loop, we add a dummy edge to the "return" block, and thus the AMDGPUUnifyDivergentExitNodes pass will work with infinite loops.
This will make CFG with infinite loops be structurized.

Reviewer:
  nhaehnle

Differential Revision:
  https://reviews.llvm.org/D46340

llvm-svn: 332625
2018-05-17 16:45:01 +00:00
Petar Jovanovic daf5169398 [mips] Add support for Global INValidate ASE
This includes

  Instructions: ginvi, ginvt,

  Assembler directives: .set ginv, .set noginv, .module ginv, .module noginv

  Attribute: ginv

  .MIPS.abiflags: GINV (0x20000)

Patch by Vladimir Stefanovic.

Differential Revision: https://reviews.llvm.org/D46268

llvm-svn: 332624
2018-05-17 16:30:32 +00:00
Craig Topper bd332588bd [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version.
According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB.

The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far.

Differential Revision: https://reviews.llvm.org/D46988

llvm-svn: 332623
2018-05-17 16:29:52 +00:00
Simon Pilgrim e389ea0e3e [llvm-mca][X86] Add CMOV test files
llvm-svn: 332622
2018-05-17 16:29:12 +00:00
Alex Bradbury 6a53023b4e [RISCV] Set isReMaterializable on ADDI and LUI instructions
The isReMaterlizable flag is somewhat confusing, unlike most other instruction 
flags it is currently interpreted as a hint (mightBeRematerializable would be 
a better name). While LUI is always rematerialisable, for an instruction like 
ADDI it depends on its operands. TargetInstrInfo::isTriviallyReMaterializable 
will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn 
calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on 
the logic in the latter to pick out instances of ADDI that really are 
rematerializable.

The isReMaterializable flag does make a difference on a variety of test 
programs. The recently committed remat.ll test case demonstrates how stack 
usage is reduce and a unnecessary lw/sw can be removed. Stack usage in the 
Proc0 function in dhrystone reduces from 192 bytes to 112 bytes.

For the sake of completeness, this patch also implements 
RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of 
places, it doesn't seem to result in different codegen for any programs I've 
thrown at it. However, it is called in the rematerialisation codepath and it 
seems sensible to implement something correct here.

Differential Revision: https://reviews.llvm.org/D46182

llvm-svn: 332617
2018-05-17 15:51:37 +00:00
Simon Pilgrim b5741f5c3d [X86][BtVer2] ADC/SBB take 2cy on an ALU pipe, not 1cy like ADD/SUB
llvm-svn: 332616
2018-05-17 15:43:23 +00:00
Dmitry Mikulin 3c6b4e35bd In thin and full LTO + CFI, direct function calls may go through jump table
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets.

Differential Revision: https://reviews.llvm.org/D46326

llvm-svn: 332610
2018-05-17 14:29:07 +00:00
Alex Bradbury 5e41fc83c5 [Hexagon] Use addAliasForDirective for data directives
Data directives such as .word, .half, .hword are currently parsed using 
HexagonAsmParser::ParseDirectiveValue which effectively duplicates logic from 
AsmParser::parseDirectiveValue. This patch deletes that duplicated logic in 
favour of using addAliasForDirective.

Differential Revision: https://reviews.llvm.org/D46999

llvm-svn: 332607
2018-05-17 13:21:18 +00:00
Simon Pilgrim 0c0336e003 [X86] Split WriteADC/WriteADCRMW scheduler classes
For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX)

llvm-svn: 332605
2018-05-17 12:43:42 +00:00
Andrea Di Biagio 650b5fc6cb [llvm-mca] add flag -all-views and flag -all-stats.
Flag -all-views enables all the views.
Flag -all-stats enables all the views that print hardware statistics.

llvm-svn: 332602
2018-05-17 12:27:03 +00:00
Simon Pilgrim b4fd145fc3 [llvm-mca][X86] Add ADX test files
llvm-svn: 332595
2018-05-17 11:32:38 +00:00
Sander de Smalen 75cfa34156 [AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+scalar) store instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46680

llvm-svn: 332584
2018-05-17 09:05:41 +00:00
Mikael Holmen 2ca16899ec Require DominatorTree when requiring/preserving LoopInfo in the old pass manager
Summary:
Require DominatorTree when requiring/preserving LoopInfo in the old pass manager

BreakCriticalEdges tries to keep LoopInfo and DominatorTree updated if they
exist. However, since commit r321653 and r321805, to update LoopInfo we
must have a DominatorTree, or we will hit an assert.

To fix this we now make a couple of passes that only required/preserved
LoopInfo also require DominatorTree.

This solves PR37334.

Reviewers: eli.friedman, efriedma

Reviewed By: efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D46829

llvm-svn: 332583
2018-05-17 09:05:40 +00:00
Martin Storsjo c10788728b [Analysis] Only use _unlocked stdio functions on linux
The existing comment said that the functions were available only
on GNU/Linux (and on certain Android versions), but only checked
T.isGNUEnvironment() which also is true on MinGW (for arch-windows-gnu
triplets), which doesn't have such functions.

Existing checks in the initialize function in TargetLibraryInfo.cpp
also use only T.isOSLinux() to check for glibc features.

This fixes use of stdio on MinGW.

Differential Revision: https://reviews.llvm.org/D47002

llvm-svn: 332581
2018-05-17 08:16:08 +00:00
Bjorn Pettersson 81a76a388a [SROA] Handle PHI with multiple duplicate predecessors
Summary:
The verifier accepts PHI nodes with multiple entries for the
same basic block, as long as the value is the same.

As seen in PR37203, SROA did not handle such PHI nodes properly
when speculating loads over the PHI, since it inserted multiple
loads in the predecessor block and changed the PHI into having
multiple entries for the same basic block, but with different
values.

This patch teaches SROA to reuse the same speculated load for
each PHI duplicate entry in such situations.

Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203

Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma

Reviewed By: efriedma

Subscribers: dberlin, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D46426

llvm-svn: 332577
2018-05-17 07:21:41 +00:00
Hiroshi Inoue f5c0e6c285 [SROA] pr37267: fix assertion failure in integer widening
The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad).
This patch adds explicit checks for this case in isIntegerWideningViableForSlice.
Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition.

Differential Revision: https://reviews.llvm.org/D46750

llvm-svn: 332575
2018-05-17 06:32:17 +00:00
Alex Bradbury cea6db0480 [RISCV] Add support for .half, .hword, .word, .dword directives
These directives are recognised by gas. Support is added through the use of 
addAliasForDirective.

Also match RISC-V gcc in preferring .half and .word for 16-bit and 32-bit data 
directives.

llvm-svn: 332574
2018-05-17 05:58:08 +00:00
Sanjay Patel 2e50cec5e3 [Thumb2] fix typo in test from r332548
llvm-svn: 332569
2018-05-17 03:24:25 +00:00
Stanislav Mekhanoshin 595fdcf43b [AMDGPU] Move lsr test. NFC.
llvm-svn: 332562
2018-05-17 01:30:51 +00:00
Sanjay Patel ae83159530 [Hexagon] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332550
2018-05-16 22:49:08 +00:00
Sanjay Patel 354842abc5 [PowerPC] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332549
2018-05-16 22:48:48 +00:00
Sanjay Patel 68c83a24d4 [Thumb] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332548
2018-05-16 22:47:51 +00:00
Sanjay Patel eedf265a2c [Thumb] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332547
2018-05-16 22:47:42 +00:00
Simon Pilgrim 2dc00a64a2 [X86] Update SNB/generic scheduler tests missed from rL332536
llvm-svn: 332540
2018-05-16 22:24:22 +00:00
Sanjay Patel 2c1846de2d [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332539
2018-05-16 22:20:33 +00:00
Sanjay Patel ce20ac0bc5 [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332538
2018-05-16 22:20:26 +00:00
Sanjay Patel 066309f4ed [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332537
2018-05-16 22:20:11 +00:00
Sanjay Patel 60fe206793 [AArch64] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332534
2018-05-16 21:57:57 +00:00
Sanjay Patel 6dcda28ddc [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332533
2018-05-16 21:57:19 +00:00
Sanjay Patel 82eca55953 [ARM] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141.

And as we did there, I'm trying to reduce the patch by
changing tests that would probably become meaningless
once we correct FP undef folding.

llvm-svn: 332532
2018-05-16 21:57:00 +00:00
Benjamin Kramer 8ac15bf4dc [InstCombine] Fix the signature of fgets_unlocked.
It returns a pointer, not an int. This miscompiles all code that uses
the return value of fgets.

llvm-svn: 332531
2018-05-16 21:45:39 +00:00
Eli Friedman ddbf6d6514 [MachineOutliner] Don't outline instructions that modify SP.
This breaks the code which saves and restores LR, so we can't outline
without doing something more complicated for stack adjustment.

Found by inspection; we get lucky in most cases because getMemOpInfo
only handles STRWpost, not any other pre/post-increment forms. But it
hits a couple of artificial testcases in the tree.

Differential Revision: https://reviews.llvm.org/D46920

llvm-svn: 332529
2018-05-16 21:20:16 +00:00
Krzysztof Parzyszek e8a0ae7346 [Hexagon] Mark HVX vector predicate bitwise ops as legal, add patterns
llvm-svn: 332525
2018-05-16 21:00:24 +00:00
Simon Pilgrim 2e0f6c9b21 [X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)
As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Some of this ideally would be done by combineTargetShuffle but its tricky to do as most of the shuffles are sharing inputs.

Differential Revision: https://reviews.llvm.org/D46959

llvm-svn: 332524
2018-05-16 20:52:52 +00:00
Konstantin Zhuravlyov c72ece6c2c AMDGPU : Recalculate SGPRs when trap handler is supported
Differential Revision: https://reviews.llvm.org/D29911

llvm-svn: 332523
2018-05-16 20:47:48 +00:00
Sam Clegg 6ccb59b3e9 [WebAssembly] MC: Ensure that FUNCTION_OFFSET relocations are always against function symbols.
The getAtom() method wasn't doing what we needed in all cases. We want
the symbols for the function which defines that section. We can compute
this easily enough and we know that we have at most one function in each
section.

Once this lands I will revert rL331412 which is no longer needed.

Fixes PR37409

Differential Revision: https://reviews.llvm.org/D46970

llvm-svn: 332517
2018-05-16 20:09:05 +00:00
Eli Friedman 02709bcb78 [MachineOutliner] Don't save/restore LR for tail calls.
The cost computation assumes we do this correctly, but the actual
lowering was wrong.

Differential Revision: https://reviews.llvm.org/D46923

llvm-svn: 332514
2018-05-16 19:49:01 +00:00
Simon Pilgrim d5d77dcb46 [X86] Fix typo in instregex for CVTSI642SDrr
llvm-svn: 332510
2018-05-16 18:31:17 +00:00
Sanjay Patel 332bbb0fea [x86] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141. 

And as we did there, I'm trying to reduce the patch by 
changing tests that would probably become meaningless
once we make those fixes.

llvm-svn: 332501
2018-05-16 17:58:50 +00:00
Sanjay Patel 502d115505 [x86] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141. 

And as we did there, I'm trying to reduce the patch by 
changing tests that would probably become meaningless
once we make those fixes.

llvm-svn: 332500
2018-05-16 17:58:08 +00:00
Sanjay Patel a7874a52c9 [x86] preserve test intent by removing undef
We need to clean up the DAG floating-point undef logic.
This process is similar to how we handled integer undef
logic in D43141. 

And as we did there, I'm trying to reduce the patch by 
changing tests that would probably become meaningless
once we make those fixes.

llvm-svn: 332499
2018-05-16 17:57:35 +00:00
Craig Topper 67aa726f8c [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets
As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead.

Fixes PR3163.

Differential Revision: https://reviews.llvm.org/D43441

llvm-svn: 332498
2018-05-16 17:40:07 +00:00
Vedant Kumar 5c6b3fb8fb [Debugify] Tighten up the test for -debugify-each, NFC
In post-commit review for r332416, Paul Robinson pointed out that the
test for -debugify-each is not checking what it needs to.

This commit tightens up the test.

llvm-svn: 332497
2018-05-16 17:30:58 +00:00
Sanjay Patel 84caa9659e [x86] add run with unsafe global param; NFC
llvm-svn: 332486
2018-05-16 16:23:41 +00:00
Tony Tye 43259df44a [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution.
No longer require the queue pointer to be passed in in fixed SGPRs.

Differential Revision: https://reviews.llvm.org/D46769

llvm-svn: 332485
2018-05-16 16:19:34 +00:00
Sanjay Patel b3ac148cb4 [x86] add tests for DAG FP undef operands; NFC
llvm-svn: 332484
2018-05-16 16:16:48 +00:00
Sander de Smalen 22176a2242 [AArch64][SVE] Improve diagnostics for vectors with incorrect element-size.
For regular SVE vector operands, this patch introduces a more
sensible diagnostic when the vector has a wrong suffix (e.g. z0.s vs z0.b).

For example:
  add z0.s, z1.s, z2.b      -> invalid element width
               ^_____^
               mismatch

For the vector-with-shift/extend (e.g. z0.s, uxtw #2) this patch takes
a slightly different approach and instead returns a 'invalid operand'
if the element size is not as expected. This is because the diagnostics
are more specificied to suggest using the right shift/extend suffix. This
is a trade-off not to introduce more operand classes and still provide
useful diagnostics for LD1 and PRF instructions.

For example:
  ld1w z1.s, p0/z, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw|sxtw)'
  ld1w z1.d, p0/z, [x0, z0.s] -> invalid operand
          ^________________^
               mismatch

For gather prefetches, both 'z0.s' and 'z0.d' would be allowed:
  prfw #0, p0, [x0, z0.s]   -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw|sxtw) #2'
  prfw #0, p0, [x0, z0.d]   -> invalid shift/extend specified, expected 'z[0..31].d, (lsl|uxtw|sxtw) #2'

Without this change, the diagnostic would unnecessarily suggest a
different element size:
  prfw #0, p0, [x0, z0.s]   -> invalid shift/extend specified, expected 'z[0..31].d, (lsl|uxtw|sxtw) #2'

Reviewers: SjoerdMeijer, aemerson, fhahn, samparker, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46688

llvm-svn: 332483
2018-05-16 15:45:17 +00:00
Sirish Pande cabe50a308 [AArch64] Gangup loads and stores for pairing.
Keep loads and stores together (target defines how many loads
and stores to gang up), such that it will help in pairing
and vectorization.

Differential Revision https://reviews.llvm.org/D46477

llvm-svn: 332482
2018-05-16 15:36:52 +00:00
Sanjay Patel 2eb3512090 [InstCombine] allow more binop (shuffle X), C transforms
The canonicalization was restricted to shuffle masks with
a 1-to-1 mapping to the constant vector, but that disqualifies
the common splat pattern. This is part of solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 332479
2018-05-16 15:15:22 +00:00
Sander de Smalen bbc4e9a4e3 [AArch64][SVE] Asm: Support for gather PRF prefetch instructions
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46686

llvm-svn: 332472
2018-05-16 14:16:01 +00:00
Krzysztof Pszeniczny 2ba8fd4914 [BasicAA] Fix handling of invariant group launders
Summary:
A recent patch ([[ https://reviews.llvm.org/rL331587 | rL331587 ]]) to Capture Tracking taught it that the `launder_invariant_group` intrinsic captures its argument only by returning it. Unfortunately, BasicAA still considered every call instruction as a possible escape source and hence concluded that the result of a `launder_invariant_group` call cannot alias any local non-escaping value. This led to [[ https://bugs.llvm.org/show_bug.cgi?id=37458 | bug 37458 ]].

This patch updates the relevant check for escape sources in BasicAA.

Reviewers: Prazek, kuhar, rsmith, hfinkel, sanjoy, xbolva00

Reviewed By: hfinkel, xbolva00

Subscribers: JDevlieghere, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46900

llvm-svn: 332466
2018-05-16 13:16:54 +00:00
Matt Arsenault 67a9815a5c AMDGPU: Custom lower v4i16/v4f16 vector operations
Avoids stack access.

Also handle extract hi elt pattern from truncate + shift
to avoid a couple test regressions.

llvm-svn: 332453
2018-05-16 11:47:30 +00:00
David Bolvansky ca22d427b9 [SimplifyLibcalls] Replace locked IO with unlocked IO
Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed,

Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja

Reviewed By: rja

Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D45736

llvm-svn: 332452
2018-05-16 11:39:52 +00:00
Amara Emerson 0d6a26dffc [GlobalISel][IRTranslator] Split aggregates during IR translation.
We currently handle all aggregates by creating one large LLT, and letting the
legalizer deal with splitting them up. However using this approach means that
we can't support big endian code correctly.

This patch changes the way that the IRTranslator deals with aggregate values,
by splitting them up into their constituent element values. To do this, parts
of the translator need to be modified to deal with multiple VRegs for a single
Value.

A new Value to VReg mapper is introduced to help keep compile time under
control, currently there is no measurable impact on CTMark despite the extra
code being generated in some cases.

Patch is based on the original work of Tim Northover.

Differential Revision: https://reviews.llvm.org/D46018

llvm-svn: 332449
2018-05-16 10:32:02 +00:00
Andrea Di Biagio 45ccdd1785 [llvm-mca] Regenerate tests after r332381 and r332361. NFC
llvm-svn: 332447
2018-05-16 10:12:06 +00:00
Simon Dardis 5cf9de4b72 [mips] Add support for isBranchOffsetInRange and use it for MipsLongBranch
Add support for this target hook, covering MIPS, microMIPS and MIPSR6, along
with some tests. Also add missing getOppositeBranchOpc() cases exposed by the
tests.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46794

llvm-svn: 332446
2018-05-16 10:03:05 +00:00
Peter Smith c811758da6 [AArch64] Support "S" inline assembler constraint
This patch re-introduces the "S" inline assembler constraint. This matches
an absolute symbolic address or a label reference. The primary use case is

asm("adrp %0, %1\n\t"
    "add %0, %0, :lo12:%1" : "=r"(addr) : "S"(&var));

I say re-introduces as it seems like "S" was implemented in the original
AArch64 backend, but it looks like it wasn't carried forward to the merged
backend. The original implementation had A and L modifiers that could be
used to print ":lo12:" to the string. It looks like gcc doesn't use these
and :lo12: is expected to be written in the inline assembly string so I've
not implemented A and L. Clang already supports the S modifier.

Fixes PR37180

Differential Revision: https://reviews.llvm.org/D46745

llvm-svn: 332444
2018-05-16 09:33:25 +00:00
Sander de Smalen a680f558be [AArch64][SVE] Asm: Support for structured LD2, LD3 and LD4 (scalar+scalar) load instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46679

llvm-svn: 332442
2018-05-16 09:16:20 +00:00
Alexander Richardson 8f44579d0b Emit a left-shift instead of a power-of-two multiply for jump-tables
Summary:
SelectionDAGLegalize::ExpandNode() inserts an ISD::MUL when lowering a
BR_JT opcode. While many backends optimize this multiply into a shift, e.g.
the MIPS backend currently always lowers this into a sequence of
load-immediate+multiply+mflo in MipsSETargetLowering::lowerMulDiv().

I initially changed the multiply to a shift in the MIPS backend but it
turns out that would not have handled the MIPSR6 case and was a lot more
code than doing it in LegalizeDAG.
I believe performing this simple optimization in LegalizeDAG instead of
each individual backend is the better solution since this also fixes other
backeds such as MSP430 which calls the multiply runtime function
__mspabi_mpyi without this patch.

Reviewers: sdardis, atanasyan, pftbest, asl

Reviewed By: sdardis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45760

llvm-svn: 332439
2018-05-16 08:58:26 +00:00
Simon Pilgrim 5df1ef7a8c [X86][SSE] Fix tests for vector rotates by splat variable.
We weren't correctly splatting the offset shift

llvm-svn: 332435
2018-05-16 08:23:47 +00:00
Sander de Smalen 67f9154964 [AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46682

llvm-svn: 332433
2018-05-16 07:50:09 +00:00
Shoaib Meenai 074728a2a9 [ObjCARC] Prevent code motion into a catchswitch
A catchswitch must be the only non-phi instruction in its basic block;
attempting to move a retain or release into a catchswitch basic block
will result in invalid IR. Explicitly mark a CFG hazard in this case to
prevent the code motion.

Differential Revision: https://reviews.llvm.org/D46482

llvm-svn: 332430
2018-05-16 04:52:18 +00:00
Evgeny Stupachenko bff9302c3d Fix LSR compile time hang.
Summary:
Limit number of reassociations in GenerateReassociationsImpl.

Reviewers: qcolombet, mkazantsev

Differential Revision: https://reviews.llvm.org/D46039

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 332426
2018-05-16 02:48:50 +00:00
Anastasis Grammenos 66f13e0ba2 [Debugify] Fix test failing after r332416
I missed a test that needed an update.

Failing bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/30071

llvm-svn: 332418
2018-05-16 00:11:52 +00:00
Anastasis Grammenos b4344c66aa [Debugfiy] Print the pass name next to the result
CheckDebugify now prints the pass name right next to the result of the check.

Differential Revision: https://reviews.llvm.org/D46908

llvm-svn: 332416
2018-05-15 23:38:05 +00:00
Eli Friedman 25bef201c5 [MachineOutliner] Add optsize markings to outlined functions.
It doesn't matter much this late in the pipeline, but one place that
does check for it is the function alignment code.

Differential Revision: https://reviews.llvm.org/D46373

llvm-svn: 332415
2018-05-15 23:36:46 +00:00
Simon Pilgrim de13589625 [X86][SSE] Add tests for vector rotates by splat variable.
llvm-svn: 332410
2018-05-15 22:11:51 +00:00
Stanislav Mekhanoshin 57d341c27a [AMDGPU] Fix handling of void types in isLegalAddressingMode
It is legal for the type passed to isLegalAddressingMode to be
unsized or, more specifically, VoidTy. In this case, we must
check the legality of load / stores for all legal types. Directly
trying to call getTypeStoreSize is incorrect, and leads to breakage
in e.g. Loop Strength Reduction. This change guards against that
behaviour.

Differential Revision: https://reviews.llvm.org/D40405

llvm-svn: 332409
2018-05-15 22:07:51 +00:00
Sanjay Patel 919882638e [InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check uses
llvm-svn: 332407
2018-05-15 22:00:37 +00:00
Marek Olsak 37b9f55cc6 AMDGPU: Add a missing test for the 128-bit local addr space option
This should have been pushed with:
  "AMDGPU: enable 128-bit for local addr space under an option"

llvm-svn: 332404
2018-05-15 21:41:57 +00:00
Marek Olsak 3c5fd145c5 StructurizeCFG: fix inverting conditions
Author: Samuel Pitoiset

Without this patch, it appears to me that we are selecting
the wrong operand when inverting conditions. In the attached
test, it will select %tmp3 instead of %tmp4. To fix it, just
use 'A' as everywhere.

This fixes a regression introduced by
"[PatternMatch] define m_Not using m_Xor and cst_pred_ty"

https://reviews.llvm.org/D46351

llvm-svn: 332403
2018-05-15 21:41:55 +00:00
Evgeniy Stepanov 091fed94ae [msan] Instrument masked.store, masked.load intrinsics.
Summary: Instrument masked store/load intrinsics.

Reviewers: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D46785

llvm-svn: 332402
2018-05-15 21:28:25 +00:00
Jake Ehrlich e40398ad98 [llvm-objcopy] Add --only-keep-debug as a noop
This option just keeps being a problem and really needs to be implemented
in some fashion. Implementing it properly requires some kind of
"replaceSectionReference" method because all the existing links need to be
maintained. The desired behavior is just for allocated sections to become
NOBITS but actually implementing that is rather tricky due to the current
design of llvm-objcopy. However converting allocated sections to NOBITS is
just an optimization and not something debuggers need. Debuggers can debug
a stripped executable and take an unstripped executable for that stripped
executable as input. Additionally allocated sections account for a very
small part of debug binaries so this optimization is quite small. I propose
that for the time being we implement this as a NOP so that people can use
llvm-objcopy where they need to, just in a sub-optimal way.

This option has already blocked a lot of people and its currently blocking me.

llvm-svn: 332396
2018-05-15 20:53:53 +00:00
Evandro Menezes 8d522d811a [AArch64] Improve single vector lane unscaled stores
When storing the 0th lane of a vector, use a simpler and usually more
efficient scalar store instead.  In this case, also using the unscaled
offset.

Differential revision: https://reviews.llvm.org/D46762

llvm-svn: 332394
2018-05-15 20:41:12 +00:00
Sanjay Patel cc0fbdb6e4 [InstCombine] add more tests for binop-shuffle; NFC
The splat pattern is part of PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 332393
2018-05-15 20:34:09 +00:00
Chandler Carruth 5ecd81aab0 [x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to
specially handle SETB_C* pseudo instructions.

Summary:
While the logic here is somewhat similar to the arithmetic lowering, it
is different enough that it made sense to have its own function.
I actually tried a bunch of different optimizations here and none worked
well so I gave up and just always do the arithmetic based lowering.

Looking at code from the PR test case, we actually pessimize a bunch of
code when generating these. Because SETB_C* pseudo instructions clobber
EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple
SETB_C* pseudos from a single set of EFLAGS. This in turn causes the
lowering code to ruin all the clever code generation that SETB_C* was
hoping to achieve. None of this is needed. Whenever we're generating
multiple SETB_C* instructions from a single set of EFLAGS we should
instead generate a single maximally wide one and extract subregs for all
the different desired widths. That would result in substantially better
code generation. But this patch doesn't attempt to address that.

The test case from the PR is included as well as more directed testing
of the specific lowering pattern used for these pseudos.

Reviewers: craig.topper

Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D46799

llvm-svn: 332389
2018-05-15 20:16:57 +00:00
Konstantin Zhuravlyov f13c9969fc AMDGPU: Fix v_dot{4, 8}* instruction encoding
Differential Revision: https://reviews.llvm.org/D46848

llvm-svn: 332387
2018-05-15 19:32:47 +00:00
Martin Storsjo e241ce6f65 [llvm-rc] Add support for the optional CLASS statement for dialogs
Differential Revision: https://reviews.llvm.org/D46875

llvm-svn: 332386
2018-05-15 19:21:28 +00:00
Michael Zolotukhin 67cfbaac89 [MemorySSA] Don't sort IDF blocks.
Summary:
After r332167 we started to sort the IDF blocks inside IDF calculation, so
there is no need to re-sort them on the user site. The test changes are due to
a slightly different order we're using now (originally we used DFSInNumber and
now the blocks are sorted by a pair (LevelFromRoot, DFSInNumber)).

Reviewers: dberlin, mgrang

Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D46899

llvm-svn: 332385
2018-05-15 18:40:29 +00:00
Tom Stellard e182b28ae4 AMDGPU/GlobalISel: Implement select() for G_FCONSTANT
Summary: Also clean up G_CONSTANT selection.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46170

llvm-svn: 332379
2018-05-15 17:57:09 +00:00
Konstantin Zhuravlyov 603a43fcd5 AMDGPU: Add disasm tests for deep learning instructions + fix v_fmac_f32 disasm
Differential Revision: https://reviews.llvm.org/D46853

llvm-svn: 332377
2018-05-15 17:39:13 +00:00
Simon Pilgrim be9a206883 [X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes
BtVer2 - Fixes schedules for (V)CVTPS2PD instructions

A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first

llvm-svn: 332376
2018-05-15 17:36:49 +00:00