Commit Graph

56613 Commits

Author SHA1 Message Date
Daniel Sanders 34eac35a60 Add the missing new files from r343654
llvm-svn: 343655
2018-10-03 02:21:30 +00:00
Daniel Sanders c973ad1878 Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

The previous commit failed portions of the test-suite on GreenDragon due to
duplicate COPY instructions and iterator invalidation. Both issues have now
been fixed. To assist with this, a helper (cloneVirtualRegister) has been added
to MachineRegisterInfo that can be used to get another register that has the same
type and class/bank as an existing one.

llvm-svn: 343654
2018-10-03 02:12:17 +00:00
Thomas Lively 9075cd607d [WebAssembly] any_true and all_true intrinsics and instructions
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52755

llvm-svn: 343649
2018-10-03 00:19:39 +00:00
Sanjay Patel abcacf9753 [InstCombine] add icmp+logic tests with commuted ops; NFC
The transform in question is located in foldICmpAndConstConst(),
but as shown here, it doesn't work if operands are commuted.

llvm-svn: 343646
2018-10-02 22:53:37 +00:00
Reid Kleckner 9c0baa524c Relax dbg-declare-inalloca.ll test more
We don't need to match the precise type index number here. It's not
important. The type name is what matters to make this test useful.

llvm-svn: 343642
2018-10-02 22:28:10 +00:00
Sam Clegg b2486f118d [WebAssembly] Stop generating helper functions in WebAssemblyLowerEmscriptenEHSjLj
Previously we were creating weakly defined helper function in
each translation unit:

-  setThrew
-  setTempRet0

Instead we now assume these will be provided at link time.  In
emscripten they are provided in compiler-rt:
 https://github.com/kripken/emscripten/pull/7203

Additionally we previously created three global variable which are
also now required to exist at link time instead.

- __THREW__
- _threwValue
- __tempRet0

Differential Revision: https://reviews.llvm.org/D49208

llvm-svn: 343640
2018-10-02 22:12:15 +00:00
Fangrui Song e5652fc682 [CodeView] Try fixing DebugInfo/X86/dbg-declare-inalloca.ll
llvm-svn: 343639
2018-10-02 22:03:31 +00:00
Daniel Sanders f430d941e9 [globalisel] Attempt to fix llvm-clang-x86_64-expensive-checks-win
The behaviour of this bot indicates that -verify-machineinstrs has been forced
on and is therefore inserting the verifier on builds that don't expect it.
Explicitly specify whether it's enabled or disabled for each test.

llvm-svn: 343633
2018-10-02 20:51:27 +00:00
Aaron Smith da0602c154 [CodeView] Only add the Scoped flag for an enum type when it has an immediate function scope to match MSVC
Reviewers: rnk, zturner, llvm-commits

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D52706

llvm-svn: 343627
2018-10-02 20:28:15 +00:00
Aaron Smith 802b033d78 [CodeView] Emit function options for subprogram and member functions
Summary:
Use the newly added DebugInfo (DI) Trivial flag, which indicates if a C++ record is trivial or not, to determine Codeview::FunctionOptions.

Clang and MSVC generate slightly different Codeview for C++ records. For example, here is the C++ code for a class with a defaulted ctor,

       class C {
       public:
         C() = default;
       };

Clang will produce a LF for the defaulted ctor while MSVC does not. For more details, refer to FIXMEs in the test cases in "function-options.ll" included with this set of changes.


Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov

Reviewed By: rnk

Subscribers: Hui, JDevlieghere

Differential Revision: https://reviews.llvm.org/D45123

llvm-svn: 343626
2018-10-02 20:21:05 +00:00
Matt Morehouse 4b1ec17fb0 Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions"
This reverts r343520 due to breakage of HWASan tests on Android.

llvm-svn: 343616
2018-10-02 18:35:44 +00:00
Craig Topper 49225d0915 [X86][Disassembler] Add bizarro versions of the MOVSXD instruction that sign extend from a GR32 to GR32 or GR16.
The 0x63 opcodes in 64-bit mode have a fixed source size of 32-bits, but the destination size is controlled by REX.W and the 0x66 opsize prefix. This instruction is normally used with a REX.W prefix which provides desired behavior. The other encodings are interpretted as valid by the processor, but aren't useful.

This patch makes us recognize them for the disassembler to match objdump.

llvm-svn: 343614
2018-10-02 18:16:19 +00:00
Reid Kleckner d5e4ec74e3 [codeview] Fix 32-bit x86 variable locations in realigned stack frames
Add the .cv_fpo_stackalign directive so that we can define $T0, or the
VFRAME virtual register, with it. This was overlooked in the initial
implementation because unlike MSVC, we push CSRs before allocating stack
space, so this value is only needed to describe local variable
locations. Variables that the compiler now addresses via ESP are instead
described as being stored at offsets from VFRAME, which for us is ESP
after alignment in the prologue.

This adds tests that show that we use the VFRAME register properly in
our S_DEFRANGE records, and that we emit the correct FPO data to define
it.

Fixes PR38857

llvm-svn: 343603
2018-10-02 16:43:52 +00:00
Simon Pilgrim 860cb5c071 [X86][Btver2] Fix BLENDV and AESDEC schedules
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343597
2018-10-02 15:13:18 +00:00
Krzysztof Parzyszek 528aff3372 [Hexagon] Fix extracting subvectors of non-HVX vNi1
Patch by Brendon Cahoon.

llvm-svn: 343596
2018-10-02 15:05:43 +00:00
Sanjay Patel e2cd6384b7 [InstCombine] add tests with undef elements; NFC
See discussion in D52747.

llvm-svn: 343595
2018-10-02 15:00:56 +00:00
Diogo N. Sampaio eb9ca5ab18 [ARM] Emmit data symbol for constant pool data
The ARM elf emitter would omit printing data
symbol when constant data. This patch
overrides the emitFill method as to enforce that
the symbol is correctly printed.

Differential revision: https://reviews.llvm.org/D52737

llvm-svn: 343594
2018-10-02 14:55:48 +00:00
Roman Lebedev ea2046bea9 [NFC][CodeGen][X86] fma.ll, lwp-intrinsics.ll: actually spell --check-prefixes correctly :/
llvm-svn: 343588
2018-10-02 13:34:50 +00:00
Sanjay Patel 6dbecb4162 [InstCombine] add more insert/extract vector tests with FP types; NFC
These are candidates for the same fold that was implemented in
D52439, but FP types require bitcasting (and that changes the
extra uses profitability calculation).

llvm-svn: 343587
2018-10-02 13:34:05 +00:00
Roman Lebedev 5412be4b7a [NFC][CodeGen][X86] lwp-intrinsics.ll: fix check prefixes
llvm-svn: 343585
2018-10-02 13:11:08 +00:00
Roman Lebedev 8b253f0b54 [NFC][CodeGen][X86] fma.ll: fix check prefixes for -mcpu=bdver2
llvm-svn: 343584
2018-10-02 13:10:55 +00:00
Oliver Stannard c41902807e [AArch64][v8.5A] Add Memory Tagging instructions
This adds new instructions to manipluate tagged pointers, and to load
and store the tags associated with memory.

Patch by Pablo Barrio, David Spickett and Oliver Stannard!

Differential revision: https://reviews.llvm.org/D52490

llvm-svn: 343572
2018-10-02 10:04:39 +00:00
Oliver Stannard 2a5fcba94b [AArch64][v8.5A] Add Memory Tagging system registers
This adds new system registers introduced by the Memory Tagging
extension.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52488

llvm-svn: 343571
2018-10-02 09:54:35 +00:00
Oliver Stannard 4493f421ac [AArch64][v8.5A] Add MTE system instructions
The Memory Tagging Extension adds system instructions for data cache
maintenance, implemented as new operands to the DC instruction.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52487

llvm-svn: 343570
2018-10-02 09:48:43 +00:00
David Green 1e44c3b62c [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
This is an attempt to get out of a local-minimum that instcombine currently
gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a
and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at
least neutral. This involves using IsFreeToInvert, which has been expanded a
little to include selects that can be easily inverted.

This is trying to fix PR35875, using the ideas from Sanjay. It is a large
improvement to one of our rgb to cmy kernels.

Differential Revision: https://reviews.llvm.org/D52177

llvm-svn: 343569
2018-10-02 09:48:34 +00:00
Simon Pilgrim ad23f270db [X86] Standardize floating point assembly comments
Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well.

Differential Revision: https://reviews.llvm.org/D52702

llvm-svn: 343562
2018-10-02 09:08:51 +00:00
David Green c066a92657 [InstCombine] Tests for ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A. NFC
llvm-svn: 343561
2018-10-02 09:06:49 +00:00
Matt Arsenault ab41193312 AMDGPU: Expand atomicrmw nand in IR
llvm-svn: 343559
2018-10-02 03:50:56 +00:00
Thomas Lively 6f77811a21 [WebAssembly] Restore slashes in SIMD conversion names
Summary: Depends on D52372 and D52442.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52512

llvm-svn: 343558
2018-10-02 01:52:21 +00:00
Fangrui Song 99d4f74d01 [AArch64][DAGCombiner]: change -stop-after=isel to instruction-select
"isel" is registered by AMDGPU. The test will break if the AMDGPU target
is not built.

llvm-svn: 343553
2018-10-02 00:22:51 +00:00
Daniel Sanders 33f42f97af Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
There's a strange assertion on two of the Green Dragon bots that goes away when
this is reverted. The assertion is in RegBankAlloc and if it is this commit then
-verify-machine-instrs should have caught it earlier in the pipeline.

llvm-svn: 343546
2018-10-01 22:32:08 +00:00
Reid Kleckner 9ea2c01264 [codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
Summary:
Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit.  Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Reviewers: hans, zturner, javed.absar

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D52217

llvm-svn: 343543
2018-10-01 21:59:45 +00:00
Craig Topper 42cd8cd862 Recommit r343499 "[X86] Enable load folding in the test shrinking code"
Original message:
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

llvm-svn: 343540
2018-10-01 21:35:28 +00:00
Craig Topper f06a57fc89 Recommit r343498 "[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated."
This includes a fix to prevent i16 compares with i32/i64 ands from being shrunk if bit 15 of the and is set and the sign bit is used.

Original commit message:
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

llvm-svn: 343539
2018-10-01 21:35:26 +00:00
Sanjay Patel de5e8b93f4 [InstCombine] add inverse test for vector trunc canonical form; NFC
llvm-svn: 343529
2018-10-01 20:25:49 +00:00
Sanjay Patel 746eb09127 [InstCombine] regenerate test checks; NFC
These files used an old version of the script.
We regex more now.

llvm-svn: 343527
2018-10-01 20:22:28 +00:00
Stefan Pintilie 5d32a86f44 [PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads.
Going from XForm Load to DSForm Load requires that the immediate be 4 byte
aligned.
If we are not aligned we must leave the load as LDX (XForm).
This bug is causing a compile-time failure in the benchmark h264ref.

Differential Revision: https://reviews.llvm.org/D51988

llvm-svn: 343525
2018-10-01 20:16:27 +00:00
Eric Christopher dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Daniel Sanders 9659bfda5a [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 343521
2018-10-01 18:56:47 +00:00
Matthias Braun 3e081703c3 X86, AArch64, ARM: Do not attach debug location to spill/reload instructions
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).

Differential Revision: https://reviews.llvm.org/D52125

llvm-svn: 343520
2018-10-01 18:56:39 +00:00
Craig Topper 1346b5b7cf [X86] Add more test shrinking with truncate and sign bit usage tests. NFC
llvm-svn: 343519
2018-10-01 18:52:19 +00:00
Craig Topper e072934d28 Revert r343499 and r343498. X86 test improvements
There's a subtle bug in the handling of truncate from i32/i64 to i32 without minsize.

I'll be adding more test cases and trying to find a fix.

llvm-svn: 343516
2018-10-01 18:40:44 +00:00
Krzysztof Parzyszek 6d569a2cc4 [Hexagon] Remove incorrect pattern for swiz
The pattern had a couple of problems:
- It was checking for loads of bytes in the reverse order to what it
  should have been looking for.
- It would replace loads of bytes with a load of a word without making
  sure that the alignment was correct.

Thanks to Eli Friedman for pointing it out.

llvm-svn: 343514
2018-10-01 18:24:40 +00:00
Zachary Turner a5e3e02602 [PDB] Add support for dumping Typedef records.
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types.  So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.

llvm-svn: 343507
2018-10-01 17:55:38 +00:00
Matthias Braun 7159daa68e MIRParser: Check that instructions only reference DILocation metadata
llvm-svn: 343505
2018-10-01 17:50:52 +00:00
Wouter van Oortmerssen 0c83c3ff38 [WebAssembly] Fixed AsmParser not allowing instructions with /
Summary:
The AsmParser Lexer regards these as a seperate token.
Here we expand the instruction name with them if they are
adjacent (no whitespace).

Tested: the basic-assembly.s test case has one case with a / in it.
The currently are also instructions with : in them, which we intend
to rename rather than fix them here.

Reviewers: tlively, dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52442

llvm-svn: 343501
2018-10-01 17:20:31 +00:00
Craig Topper aa84e1bba2 [X86] Enable load folding in the test shrinking code
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

Differential Revision: https://reviews.llvm.org/D52699

llvm-svn: 343499
2018-10-01 17:10:50 +00:00
Craig Topper 2b587ad071 [X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused.

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

Differential Revision: https://reviews.llvm.org/D52669

llvm-svn: 343498
2018-10-01 17:10:45 +00:00
Simon Pilgrim e0d2019052 [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343494
2018-10-01 16:31:30 +00:00
Matthias Braun 004fe6bf83 DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types
This fixes a case of bad index calculation when merging mismatching
vector types. This changes the existing code to just use the existing
extract_{subvector|element} and a bitcast (instead of bitcast first and
then newly created extract_xxx) so we don't need to adjust any indices
in the first place.

rdar://44584718

Differential Revision: https://reviews.llvm.org/D52681

llvm-svn: 343493
2018-10-01 16:25:50 +00:00
Sanjay Patel 5187efcfab [x86] add tests for 256- and 512-bit vector types for scalar-to-vector transform; NFC
llvm-svn: 343491
2018-10-01 16:17:18 +00:00
Jesper Antonsson c954b86391 [InstCombine] Handle vector compares in foldGEPIcmp(), take 2
Summary:
This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.)

This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type.

The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger.

Reviewers: lebedev.ri, majnemer, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52494

llvm-svn: 343486
2018-10-01 14:59:25 +00:00
Simon Atanasyan 1ea206be73 [mips] Generate tests expectations using update_llc_test_checks. NFC
Generate tests expectations using update_llc_test_checks and reduce
number of "check prefixes" used in the tests.

llvm-svn: 343485
2018-10-01 14:43:07 +00:00
Simon Pilgrim 6ddc4e821c [X86][Btver2] Fix BTmr schedule uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343484
2018-10-01 14:42:16 +00:00
Sanjay Patel 31b07198f1 [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343482
2018-10-01 14:40:00 +00:00
Sanjay Patel 22ae8dabb5 [InstCombine] add more insert-extract tests for D52439; NFC
The first attempt at this transform:
rL343407
...was reverted:
rL343458
...because it did not handle the case where we bitcast to FP. 
The patch was already limited to avoid the case where we
bitcast from FP, but we might want to transform that too.

llvm-svn: 343480
2018-10-01 14:29:09 +00:00
Simon Pilgrim a982236e59 [X86][Btver2] Fix masked load schedule
JFPU01 resource usage should match JFPX

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343468
2018-10-01 13:12:05 +00:00
Hans Wennborg a60aa91374 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343458
2018-10-01 12:07:45 +00:00
Puyan Lotfi 06e65cae4a [NFC] Adding "REQUIRES: zlib" to a llvm-objcopy test for bots without zlib.
M    test/tools/llvm-objcopy/compress-and-decompress-debug-sections-error.test

llvm-svn: 343454
2018-10-01 10:50:23 +00:00
Andrea Di Biagio 24ea163007 [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.
This patch adds another variant class to identify zero-idiom VPERM2F128rr
instructions.

On Jaguar, a VPERM wih bit 3 and 7 of the mask set, is a zero-idiom.

Differential Revision: https://reviews.llvm.org/D52663

llvm-svn: 343452
2018-10-01 10:35:13 +00:00
Puyan Lotfi af048648d3 [llvm-objcopy] Adding support for decompressing zlib compressed dwarf sections.
Summary: I had added support for compressing dwarf sections in a prior commit,
         this one adds support for decompressing. Usage is:

         llvm-objcopy --decompress-debug-sections input.o output.o

Reviewers: jakehehrlich, jhenderson, alexshap	

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D51841

llvm-svn: 343451
2018-10-01 10:29:41 +00:00
Clement Courbet a933fb237e [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.
Summary:
    While looking at PR35606, I found out that the scheduling info is incorrect.

    One can check that it's really a P5+P6 and not a 2*P56 with:
    echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=-
    (vandps executes on P5 only)

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D52541

llvm-svn: 343447
2018-10-01 08:37:48 +00:00
Carlos Alberto Enciso 81d8ef2196 [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.
When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed.  It causes the debugger to display wrong variable value.

Differential Revision: https://reviews.llvm.org/D52614

llvm-svn: 343445
2018-10-01 08:14:44 +00:00
Clement Courbet ce4caff0de [CodeGen][NFC] Add tests for heterogeneous types in MergeConsecutiveStores
Reviewers: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52643

llvm-svn: 343444
2018-10-01 07:16:22 +00:00
Craig Topper 67d9dbdbdd [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers.
We can only copy between a k-register and a GR32/GR64 register.

This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure.

This probably isn't the best fix, and we should probably figure out how to handle this correctly.

Fixes PR38803.

llvm-svn: 343443
2018-10-01 07:08:41 +00:00
Simon Pilgrim f21083870d [X86] Fix scheduler class for BTmi instructions
This wasn't treated as a folded load instruction

llvm-svn: 343424
2018-09-30 20:19:16 +00:00
Simon Pilgrim b1108399bd [LLVM-MCA][X86] Add missing VCMPESTR/VCMPESTR tests
llvm-svn: 343421
2018-09-30 18:19:00 +00:00
Bjorn Pettersson c2fc53ac90 [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF
Summary:
The lowering of PHI nodes used to detect if all inputs originated
from IMPLICIT_DEF's. If so the PHI node was replaced by an
IMPLICIT_DEF. Now we also consider undef uses when checking the
inputs. So if all inputs are implicitly defined or undef we
lower the PHI to an IMPLICIT_DEF. This makes
PHIElimination::LowerPHINode more consistent as it checks
both implicit and undef properties at later stages.

Reviewers: MatzeB, tstellar

Reviewed By: MatzeB

Subscribers: jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52558

llvm-svn: 343417
2018-09-30 17:26:58 +00:00
Bjorn Pettersson 4af7f57bdf [PHIElimination] Update the regression test for PR16508
Summary:
When PR16508 was solved (in rL185363) a regression test was
added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll.
I discovered that the test case no longer reproduced the
scenario from PR16508. This problem could have been amended
by adding an extra RUN line with "-O1" (or possibly "-O0"),
but instead I added a mir-reproducer
  test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir
to get a reproducer that is less sensitive to changes in
earlier passes (including O-level).

While being at it I also corrected a code comment in
PHIElimination::EliminatePHINodes that has been incorrect
since the related bugfix from rL185363.

Reviewers: MatzeB, hfinkel

Reviewed By: MatzeB

Subscribers: nemanjai, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52553

llvm-svn: 343416
2018-09-30 17:23:21 +00:00
Simon Pilgrim 20623f2343 [LLVM-MCA][X86] Add some AVX512 tests
These are going to be necessary to check I don't mess up when I start cleaning up all the remaining vector integer overrides

llvm-svn: 343414
2018-09-30 17:01:59 +00:00
Simon Pilgrim 4f5693ac8d [X86][Btver2] Fix PCmpIStrI/PCmpIStrM schedules
Missing JFPU0 pipe and double JFPU1 pipe (to match JVALU1) resources

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343413
2018-09-30 16:38:38 +00:00
Zachary Turner 518cb2d560 [PDB] Add native support for dumping array types.
llvm-svn: 343412
2018-09-30 16:19:18 +00:00
Sanjay Patel 1e0f1f645a [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343407
2018-09-30 14:34:01 +00:00
Sanjay Patel 26c119a9c2 [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.

llvm-svn: 343406
2018-09-30 13:50:42 +00:00
Roman Lebedev 0496477c5d [NFC][CodeGen][X86][AArch64] Add 64-bit constant bit field extract pattern tests
llvm-svn: 343404
2018-09-30 12:42:08 +00:00
Simon Pilgrim 84e280ae42 [X86] Regenerate MMX coalescing test
Exposes another extractelement(bitcast(scalartovector())) pattern

llvm-svn: 343403
2018-09-30 09:42:04 +00:00
Zachary Turner 9be3b6a18b [PDB] Fix this test for real.
I was able to test this fix on an actual Windows machine
so this should get the bot green again.

llvm-svn: 343400
2018-09-30 03:57:49 +00:00
Craig Topper 1709829fed [X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're on compiling for a CPU with single uop BEXTR
Summary:
This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction.

The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL its one port 0/6 uop and one port 1/5 uop. Despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution wise than the shift+and which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop.

For now I've limited this transform to AMD CPUs which have a single uop BEXTR. If may also might make sense if we can fold a load or if the and immediate is larger than 32-bits and can't be encoded as a sign extended 32-bit value or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at it doesn't look load folding or large immediates were occurring so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case.

Reviewers: RKSimon, spatel, lebedev.ri, andreadb

Reviewed By: RKSimon

Subscribers: llvm-commits, andreadb, RKSimon

Differential Revision: https://reviews.llvm.org/D52570

llvm-svn: 343399
2018-09-30 03:01:46 +00:00
Zachary Turner 6e6d545d24 Only dump the types we need in the test.
We added support for dumping pointers but pointers to arrays
won't correctly dump until we add support for dumping arrays.
Instead of trying to dump everything, which this test isn't
even interested in, just dump enums and typedefs.

llvm-svn: 343398
2018-09-30 00:51:54 +00:00
Zachary Turner a1e79e326a Fix some tests on Windows.
I don't actually have a Windows machine at the present moment,
so hopefully this fixes it.

llvm-svn: 343397
2018-09-30 00:22:21 +00:00
Lang Hames 98440293fb [ORC] Add partitioning support to CompileOnDemandLayer2.
CompileOnDemandLayer2 now supports user-supplied partition functions (the
original CompileOnDemandLayer already supported these).

Partition functions are called with the list of requested global values
(i.e. global values that currently have queries waiting on them) and have an
opportunity to select extra global values to materialize at the same time.

Also adds testing infrastructure for the new feature to lli.

llvm-svn: 343396
2018-09-29 23:49:57 +00:00
Zachary Turner 6ca6a03c51 [PDB] Better native API support for pointers.
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info.  This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.

llvm-svn: 343393
2018-09-29 23:28:19 +00:00
David Bolvansky 09fd8172df [DAGCombiner][NFC] Tests for X div/rem Y single bit fold
llvm-svn: 343392
2018-09-29 21:00:37 +00:00
Simon Pilgrim c4e7c347cd [X86][AVX2] Cleanup shuffle combining tests - add common prefixes
llvm-svn: 343391
2018-09-29 20:34:16 +00:00
Simon Pilgrim a2efe82b81 [X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target shuffles before simplifying inputs
By removing demanded target shuffles that simplify to zero/undef/identity before simplifying its inputs we improve chances of further simplification, as only the immediate parent user of the combined is added back to the work list - this still doesn't help us if its passed through other ops though (bitcasts....).

llvm-svn: 343390
2018-09-29 18:15:26 +00:00
Craig Topper 845789e823 [X86] Add fast-isel test cases for unaligned load/store intrinsics recently added to clang
This adds tests for:
_mm_loadu_si16
_mm_loadu_si32
_mm_loadu_si16
_mm_storeu_si64
_mm_storeu_si32
_mm_storeu_si16

llvm-svn: 343389
2018-09-29 18:03:52 +00:00
Simon Pilgrim d633e290c8 [X86] getTargetConstantBitsFromNode - add support for rearranging constant bits via shuffles
Exposed an issue that recursive calls to getTargetConstantBitsFromNode don't handle changes to EltSizeInBits yet.

llvm-svn: 343384
2018-09-29 17:01:55 +00:00
Sanjay Patel 20c64510cb [InstCombine] add test for vector widening of insertelements; NFC
The test shows a potential overreach with the fix from D52548.

llvm-svn: 343378
2018-09-29 15:01:45 +00:00
Simon Pilgrim 43e4e648ef [X86] Regenerate fma comments.
llvm-svn: 343376
2018-09-29 14:31:00 +00:00
Simon Pilgrim 22d51014af [X86] getTargetConstantBitsFromNode - add support for peeking through ISD::EXTRACT_SUBVECTOR
llvm-svn: 343375
2018-09-29 14:17:32 +00:00
Simon Pilgrim aa77033a6b [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets
The shift amount might have peeked through a extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats.

llvm-svn: 343373
2018-09-29 13:25:22 +00:00
Eli Friedman 5ab09a684f [ARM] Fix correctness checks in promoteToConstantPool.
Correctly check for relocations in the constant to promote. And don't
allow promoting a constant multiple times.

This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ;
it's not a complete fix because we also need to prevent
ARMConstantIslands from cloning the constant.

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51472

llvm-svn: 343361
2018-09-28 20:27:31 +00:00
Eli Friedman bb993be56b [ARM] Use preferred alignment for constants in promoteToConstantPool.
This mostly affects IR generated by non-clang frontends because clang
generally sets the alignment of globals explicitly.

Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 .

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51469

llvm-svn: 343359
2018-09-28 20:21:51 +00:00
Craig Topper 98aa643420 [X86] Add test cases for failures to use narrow test with immediate instructions when a truncate is beteen the CMP and the AND and the sign flag is used.
The code in X86ISelDAGToDAG only looks through truncates if the sign flag isn't used, but that is overly restrictive. A future patch will improve this.

llvm-svn: 343355
2018-09-28 19:06:28 +00:00
Evandro Menezes fc1852ff1c [AArch64] Split zero cycle feature more granularly
Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp`
and `zcz-fp`, respectively, while retaining the original feature option to
mean both.

Differential revision: https://reviews.llvm.org/D52621

llvm-svn: 343354
2018-09-28 19:05:09 +00:00
Andrea Di Biagio 6e218d0a57 [llvm-mca] Add a test for zero-idiom VPERM2F128rr. NFC
We don't correctly model the latency and resource usage information for
zero-idiom VPERM2F128rr on Jaguar.

This is demonstrated by the incorrect numbers in the resource pressure view, and
the timeline view.
A follow up patch will fix this problem.

llvm-svn: 343346
2018-09-28 17:47:09 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
Greg Bedwell becbbe0383 [utils] Stricter checking from update_mca_test_checks.py
If any prefixes have been specified on the RUN lines that do not end up
ever actually getting printed, raise an Error. This is either an
indication that the run lines just need cleaning up, or that something
is more fundamentally wrong with the test.

Also raise an Error if there are any blocks which cannot be checked
because they are not uniquely covered by a prefix.

Fixed up a couple of tests where the extra checking flagged up issues.

Differential Revision: https://reviews.llvm.org/D48276

llvm-svn: 343332
2018-09-28 15:39:09 +00:00
Greg Bedwell 2f528f8c1e [utils] Allow better identification of matching blocks in update_mca_test_checks.py
Insert empty blocks to cause the positions of matching blocks to match
across lists where possible so that later stages of the algorithm can
actually identify them as being identical.

Regenerated all tests with this change.

Differential Revision: https://reviews.llvm.org/D52560

llvm-svn: 343331
2018-09-28 15:38:56 +00:00
Robert Widmann 9cba4eced8 [LLVM-C] Add more debug information accessors to GlobalObject and Instruction
Summary: Adds missing debug information accessors to GlobalObject.  This puts the finishing touches on cloning debug info in the echo tests.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D51522

llvm-svn: 343330
2018-09-28 15:35:18 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Sanjay Patel 699ee504f6 [InstCombine] adjust shuffle undef propagation tests; NFC
These are the updated baseline tests for D52548 -
I'm putting the tests next to the tests where the transform 
functions as expected, so we can see the intended/unintended
consequences.

Patch by: @sheredom (Neil Henning)

llvm-svn: 343328
2018-09-28 15:20:06 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Simon Pilgrim 428c1196d8 [X86][Btver2] PSUBS/PSUBUS instructions are zero-idioms
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343321
2018-09-28 14:20:42 +00:00
Simon Pilgrim 3216fd3602 [X86][Btver2] Add zero-idiom tests for PSUBS/PSUBUS instructions
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343319
2018-09-28 13:53:11 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Petar Jovanovic ff1bc621a0 [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409

llvm-svn: 343315
2018-09-28 13:28:47 +00:00
Simon Pilgrim 66da1ed29d [X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe
We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe)

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343314
2018-09-28 13:19:22 +00:00
Jonas Devlieghere f1c414cd0d Split invocations in CodeGen/X86/cpus.ll among multiple tests. (NFC)
On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan
and UBSan enabled. With the same configuration on my machine, the test
passes but takes more than 3 minutes to do so. I could increase the
timeout, but I believe it makes more sense to split up the test because
it allows for more parallelism.

Differential revision: https://reviews.llvm.org/D52603

llvm-svn: 343313
2018-09-28 12:08:51 +00:00
Simon Pilgrim 17e5981ebf [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343311
2018-09-28 10:26:48 +00:00
David Spickett ea605913be [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551

llvm-svn: 343302
2018-09-28 08:55:19 +00:00
Oliver Stannard 5f34e9e265 [ARM][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52484

llvm-svn: 343300
2018-09-28 08:27:56 +00:00
Simon Pilgrim 280af1c7f0 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

llvm-svn: 343299
2018-09-28 08:21:39 +00:00
Hiroshi Inoue 69bfa40200 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Craig Topper bb50c38635 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Craig Topper fdf4c76ca0 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper 92b992164d [ScalarizeMaskedMemIntrin] Add test cases for masked store expansion. Increase alignment of one of the masked load test cases.
The masked store alignment is being miscalculated, but masked load is correct.

llvm-svn: 343283
2018-09-28 01:06:09 +00:00
Craig Topper 1b29615330 [X86] Add the test case from PR38986.
The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch.

llvm-svn: 343281
2018-09-27 23:25:10 +00:00
Craig Topper 6911bfe263 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper 45ad631b4c [ScalarizeMaskedMemIntrin] Add some IR only test cases for masked gather expansion.
llvm-svn: 343272
2018-09-27 21:28:55 +00:00
Craig Topper 7d234d6628 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper dfc0f289fa [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper a6478ac5d4 [ScalarizeMaskedMemIntrin] Add dedicated IR only tests for masked load expansion so I can begin making modifications.
llvm-svn: 343269
2018-09-27 21:28:43 +00:00
Stanislav Mekhanoshin b080adfc0c [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

llvm-svn: 343249
2018-09-27 18:55:20 +00:00
Craig Topper 0423681d4a [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Simon Pilgrim 86c7b07ecd [X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
2018-09-27 17:13:57 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Simon Pilgrim dd744f158a [X86][Btver2] BTC/BTR/BTS instructions take 2uops not 1
llvm-svn: 343234
2018-09-27 16:39:52 +00:00
Oliver Stannard a4f68bf4ad [AArch64][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52483

llvm-svn: 343229
2018-09-27 16:09:05 +00:00
Sanjay Patel c3f50ff92e [InstCombine] Without infinites, fold (C / X) < 0.0 --> (X < 0)
When C is not zero and infinites are not allowed (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.

E.g.:
  foo(double X) {
    if ((-2.0 / X) <= 0) ...
  }
 =>
  foo(double X) {
    if (X >= 0) ...
  }

Patch by: @marels (Martin Elshuber)

Differential Revision: https://reviews.llvm.org/D51942

llvm-svn: 343228
2018-09-27 15:59:24 +00:00
Simon Pilgrim c2a88ea64e [X86][Btver2] BLSI/BLSMSK/BLSR instructions take 2uops not 1 (same as TZCNT)
llvm-svn: 343227
2018-09-27 14:57:57 +00:00
Teresa Johnson f24136f17a [WPD] Fix incorrect devirtualization after indirect call promotion
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.

Before this patch the code was incorrectly devirtualizing the fallback
indirect call.

See the new test and the example described there for more details.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52514

llvm-svn: 343226
2018-09-27 14:55:32 +00:00
Oliver Stannard a9a5eee169 [AArch64][v8.5A] Add Branch Target Identification instructions
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52485

llvm-svn: 343225
2018-09-27 14:54:33 +00:00
Sanjay Patel 95a816b34a [InstCombine] add tests for FP sign-bit cmp optimization with fdiv; NFC
These are baseline tests for D51942.
Patch by: @marels (Martin Elshuber)

llvm-svn: 343222
2018-09-27 14:24:29 +00:00
Oliver Stannard 8459d34e82 [AArch64][v8.5A] Add speculation restriction system registers
This adds some new system registers which can be used to restrict
certain types of speculative execution.

Patch by Pablo Barrio and David Spickett!

Differential revision: https://reviews.llvm.org/D52482

llvm-svn: 343218
2018-09-27 14:05:46 +00:00
Oliver Stannard dc837e3f1f [AArch64][v8.5A] Add Armv8.5-A random number instructions
This adds two new system registers, used to generate random numbers.

This is an optional extension to v8.5-A, and will be controlled by the
"+rng" modifier of the -march= and -mcpu= options.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52481

llvm-svn: 343217
2018-09-27 14:01:40 +00:00
Oliver Stannard 6930b12d53 [AArch64][v8.5A] Add Armv8.5-A "DC CVADP" instruction
This adds a new variant of the DC system instruction for persistent
memory.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52480

llvm-svn: 343216
2018-09-27 13:53:35 +00:00
Oliver Stannard 224428c06a [AArch64][v8.5A] Add prediction invalidation instructions to AArch64
This adds new system instructions which act as barriers to speculative
execution based on earlier execution within a particular execution
context.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52479

llvm-svn: 343214
2018-09-27 13:47:40 +00:00
Oliver Stannard 382c935c42 [ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction sets
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52477

llvm-svn: 343213
2018-09-27 13:41:14 +00:00
Oliver Stannard e481f1d95a [AArch64][v8.5A] Add speculation barrier to AArch64 instruction set
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52476

llvm-svn: 343211
2018-09-27 13:39:06 +00:00
Daniel Cederman 0c05bdea2b [Sparc] Remove the support for builtin setjmp/longjmp
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
needs to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.

Reviewers: jyknight, joerg, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51487

llvm-svn: 343210
2018-09-27 13:32:54 +00:00
Oliver Stannard ddb7d46aa5 [AArch64][v8.5A] Add FRINT[32,64][Z,X] instructions
These are some new variants of the "Floating-point Round to Integral"
family of instructions, which round to the nearest floating-point value
which fits in a 32- or 64-bit integer.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52475

llvm-svn: 343209
2018-09-27 13:32:06 +00:00
Clement Courbet 30183093ab [llvm-exegesis] Fix PR39096.
Summary: The key is now the resource name, not the resource id.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52607

llvm-svn: 343208
2018-09-27 13:26:37 +00:00
Daniel Cederman b35d3a2733 [Sparc] Add unimp alias
Summary: Use 0 as the default immediate for the UNIMP instruction.
This matches the behavior in gas.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51526

llvm-svn: 343203
2018-09-27 12:34:53 +00:00
Daniel Cederman c1968ba5d3 [Sparc] Add support for the partial write PSR instruction
Summary:
Partial write %PSR (WRPSR) is a SPARC V8e option that allows WRPSR
instructions to only affect the %PSR.ET field. It is supported by
the GR740 and GR716.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48644

llvm-svn: 343202
2018-09-27 12:34:48 +00:00
Simon Pilgrim 98f503a326 [X86][Btver2] TZCNT instructions take 2uops not 1
llvm-svn: 343200
2018-09-27 12:28:47 +00:00
Luke Cheeseman f6844b307a Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
2018-09-27 10:39:20 +00:00
Nicola Zaghen 436c012702 [InstCombine] Add new tests in preparation for a combine of icmp (mul nsw/nuw X, C2), C
Proof for the future optimisations are here:
- eq/neq: https://rise4fun.com/Alive/9PBA
- sgt/ugt: https://rise4fun.com/Alive/58yr
- slt/ult: https://rise4fun.com/Alive/VCQ

Differential Revision: https://reviews.llvm.org/D51625

llvm-svn: 343190
2018-09-27 10:08:38 +00:00
Oliver Stannard 31af178f4a [AArch64][v8.5A] Add PSTATE manipulation instructions XAFlag and AXFlag
These new instructions manipluate the NZCV bits, to convert between the
regular Arm floating-point comare format and an alternative format.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52473

llvm-svn: 343187
2018-09-27 09:11:27 +00:00
Sanjay Patel 150afce75a [InstCombine] add tests that show undef propagation failures from D52548; NFC
Differential Revision: https://reviews.llvm.org/D52556

llvm-svn: 343140
2018-09-26 20:30:47 +00:00
Florian Hahn 6feb637124 [LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.

An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52154

llvm-svn: 343132
2018-09-26 19:34:25 +00:00
Sanjay Patel d938d0d4f5 [InstCombine] add tests for vector insert/extract; NFC
Preliminary step for D52439.

llvm-svn: 343128
2018-09-26 17:57:38 +00:00
Craig Topper e4c96f4a48 [X86] Update tzcnt fast-isel tests to match clang r343126.
We now generate cttz with the zero_undef flag set to false. This allows -O0 to avoid the zero check.

llvm-svn: 343127
2018-09-26 17:19:28 +00:00
Lang Hames f0a3fd885d Reapply r343058 with a fix for -DLLVM_ENABLE_THREADS=OFF.
Modifies lit to add a 'thread_support' feature that can be used in lit test
REQUIRES clauses. The thread_support flag is set if -DLLVM_ENABLE_THREADS=ON
and unset if -DLLVM_ENABLE_THREADS=OFF. The lit flag is used to disable the
multiple-compile-threads-basic.ll testcase when threading is disabled.

llvm-svn: 343122
2018-09-26 16:26:59 +00:00
Luke Cheeseman 77aaa22081 Revert r343112 as CallFrameString API change has broken lldb builds
llvm-svn: 343114
2018-09-26 14:48:03 +00:00
Luke Cheeseman 03ad8812f5 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll

llvm-svn: 343112
2018-09-26 14:30:29 +00:00
Clement Courbet a5720c4e62 [llvm-exgesis][NFC] Do not pollute buildbots with messages when
the exegesis lit tests cannot run.

llvm-svn: 343110
2018-09-26 13:58:26 +00:00
Clement Courbet 28d4f85824 [llvm-exegesis] Get rid of debug_string.
Summary:
THis is a backwards-compatible change (existing files will work as
expected).

See PR39082.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52546

llvm-svn: 343108
2018-09-26 13:35:10 +00:00
Francis Visoiu Mistrih 6acaa18afc [CodeGen] Always print register ties in MI::dump()
It was the case when calling MO::dump(), but MI::dump() was still
depending on hasComplexRegisterTies().

The MIR output is not affected.

llvm-svn: 343107
2018-09-26 13:33:09 +00:00
Hans Wennborg 00b88bbcaf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343103
2018-09-26 12:57:45 +00:00
Hiroshi Inoue 20982f0995 [PowerPC] optimize conditional branch on CRSET/CRUNSET
This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET.
Other optimizers, such as block placement, may generate such code and hence
I do this at the very end of the optimization in pre-emit peephole pass.

A conditional branch based on a constant is eliminated or converted into unconditional branch. 
Also CRSET/CRUNSET is eliminated if the condition code register is not used
by instruction other than the branch to be optimized.

Differential Revision: https://reviews.llvm.org/D52345

llvm-svn: 343100
2018-09-26 12:32:45 +00:00
Hans Wennborg 20b5abe23b Revert r343058 "[ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT."
This doesn't work well in builds configured with LLVM_ENABLE_THREADS=OFF,
causing the following assert when running
ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll:

  lib/ExecutionEngine/Orc/Core.cpp:1748: Expected<llvm::JITEvaluatedSymbol>
  llvm::orc::lookup(const llvm::orc::JITDylibList &, llvm::orc::SymbolStringPtr):
  Assertion `ResultMap->size() == 1 && "Unexpected number of results"' failed.

> LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
> arguments. If this is non-zero then a thread-pool will be created with the
> given number of threads, and compile tasks will be dispatched to the thread
> pool.
>
> To enable testing of this feature, two new flags are added to lli:
>
> (1) -compile-threads=N (N = 0 by default) controls the number of compile threads
> to use.
>
> (2) -thread-entry can be used to execute code on additional threads. For each
> -thread-entry argument supplied (multiple are allowed) a new thread will be
> created and the given symbol called. These additional thread entry points are
> called after static constructors are run, but before main.

llvm-svn: 343099
2018-09-26 12:15:23 +00:00
Simon Pilgrim 26223bccde [X86][SSE] Refresh PR34947 test code to handle D52504
The previously reduced version used urem <9 x i32> zeroinitializer, %tmp which D52504 will simplify.

llvm-svn: 343097
2018-09-26 11:53:51 +00:00
Simon Pilgrim 5beaac433d [X86][SSE] Use ISD::MULHS for constant vXi16 ISD::SRA lowering (PR38151)
Similar to the existing ISD::SRL constant vector shifts from D49562, this patch adds ISD::SRA support with ISD::MULHS.

As we're dealing with signed values, we have to handle shift by zero and shift by one special cases, so XOP+AVX2/AVX512 splitting/extension is still a better solution - really we should still use ISD::MULHS if one of the special cases are used but for now I've just left a TODO and filtered by isKnownNeverZero.

Differential Revision: https://reviews.llvm.org/D52171

llvm-svn: 343093
2018-09-26 10:57:05 +00:00
Sam Parker 75aca94093 [ARM] Fix for PR39060
When calculating whether a value can safely overflow for use by an
icmp, we weren't checking that the value couldn't wrap around. To do
this we need the icmp to be using a constant, as well as the incoming
add or sub.

bugzilla report: https://bugs.llvm.org/show_bug.cgi?id=39060

Differential Revision: https://reviews.llvm.org/D52463

llvm-svn: 343092
2018-09-26 10:56:00 +00:00
David Green 353cb3d4e5 [CodeGen] Enable tail calls for functions with NonNull attributes.
Adding NonNull as attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no affect on the call sequence.

Differential Revision: https://reviews.llvm.org/D52238

llvm-svn: 343091
2018-09-26 10:46:18 +00:00
Yury Gribov 67572004df Fixes removal of dead elements from PressureDiff (PR37252).
Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51495

llvm-svn: 343090
2018-09-26 10:42:41 +00:00
Luke Cheeseman f755e687fc [AArch64] - Return address signing dwarf support
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
  state the is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
  i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
  (0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
  https://patchwork.ozlabs.org/patch/800271/

Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343089
2018-09-26 10:14:15 +00:00
Hans Wennborg 4b2e7daa7e Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP"
This broke Chromium's Android build (https://crbug.com/889390) and the
polly-aosp buildbot
(http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable).

> Originally committed in rL342210 but was reverted in rL342260 because
> it was causing issues in vectorized code, because I had forgotten to
> ensure that we're operating on scalar values.
>
> Original commit message:
>
> On failing to find sequences that can be converted into dual macs,
> try to find sequential 16-bit loads that are used by muls which we
> can then use smultb, smulbt, smultt with a wide load.
>
> Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 343082
2018-09-26 08:41:50 +00:00
Lang Hames d8048675f4 [ORC] Update CompileOnDemandLayer2 to use the new lazyReexports mechanism
for lazy compilation, rather than a callback manager.

The new mechanism does not block compile threads, and does not require
function bodies to be renamed.

Future modifications should allow laziness on a per-module basis to work
without any modification of the input module.

llvm-svn: 343065
2018-09-26 05:08:29 +00:00
Hsiangkai Wang 55321d82bd [DebugInfo] Do not generate address info for removed debug labels.
In some senario, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In the case, these debug labels will have
address zero as their address. It is not legal address for debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Fix build failed in BuildBot, clang-stage1-cmake-RA-incremental, on macOS.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 343062
2018-09-26 04:19:23 +00:00
Lang Hames 225a32af72 [ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT.
LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
arguments. If this is non-zero then a thread-pool will be created with the
given number of threads, and compile tasks will be dispatched to the thread
pool.

To enable testing of this feature, two new flags are added to lli:

(1) -compile-threads=N (N = 0 by default) controls the number of compile threads
to use.

(2) -thread-entry can be used to execute code on additional threads. For each
-thread-entry argument supplied (multiple are allowed) a new thread will be
created and the given symbol called. These additional thread entry points are
called after static constructors are run, but before main.

llvm-svn: 343058
2018-09-26 02:39:42 +00:00
Vyacheslav Zakharin e06831a3b2 Remove LoopID metadata from the branch instruction
that follows the peeled iterations.

Differential Revision: https://reviews.llvm.org/D52176

llvm-svn: 343054
2018-09-26 01:03:21 +00:00
Zhaoshi Zheng 95710337b4 Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant""
This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346.

Reland r342994: disabled the optimization and explicitly enable it in test.

-mllvm -consthoist-min-num-to-rebase<unsigned>=0

[ConstHoist] Do not rebase single (or few) dependent constant

If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 343053
2018-09-26 00:59:09 +00:00
Thomas Lively c949857a7f [WebAssembly] SIMD conversions
Summary:
Lowers (s|u)itofp and fpto(s|u)i instructions for vectors. The fp to
int conversions produce poison values if their arguments are out of
the convertible range, so a future CL will have to add an LLVM
intrinsic to make the saturating behavior of this conversion usable.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52372

llvm-svn: 343052
2018-09-26 00:34:36 +00:00
Stanislav Mekhanoshin 8dfcd83371 [AMDGPU] Fix ds combine with subregs
Differential Revision: https://reviews.llvm.org/D52522

llvm-svn: 343047
2018-09-25 23:33:18 +00:00
Craig Topper 12c18840fa [X86] Allow movmskpd/ps ISD nodes to be created and selected with integer input types.
This removes an int->fp bitcast between the surrounding code and the movmsk. I had already added a hack to combineMOVMSK to try to look through this bitcast to improve the SimplifyDemandedBits there.

But I found an additional issue where the bitcast was preventing combineMOVMSK from being called again after earlier nodes in the DAG are optimized. The bitcast gets revisted, but not the user of the bitcast. By using integer types throughout, the bitcast doesn't get in the way.

llvm-svn: 343046
2018-09-25 23:28:27 +00:00
Craig Topper d8c68840c8 [X86] Add some more movmsk test cases. NFC
These IR patterns represent the exact behavior of a movmsk instruction using (zext (bitcast (icmp slt X, 0))).

For the v4i32/v8i32/v2i64/v4i64 we currently emit a PCMPGT for the icmp slt which is unnecessary since we only care about the sign bit of the result. This is because of the int->fp bitcast we put on the input to the movmsk nodes for these cases. I'll be fixing this in a future patch.

llvm-svn: 343045
2018-09-25 23:28:24 +00:00
Sanjay Patel f23727d972 [InstCombine] add fneg variation of shuffle-binop fold; NFC
If the fsub in this pattern was replaced by an actual fneg
instruction, we would need to add a fold to recognize that
because fneg would not be a binop.

llvm-svn: 343041
2018-09-25 22:48:58 +00:00
Changpeng Fang 6f4922ccc9 AMDGPU: Add Selection patterns to support add of one bit.
Summary:
  We generate s_xor to lower add of i1s in general cases, and s_not to
lower add with a one-bit imm of -1 (true).

Reviewers:
  rampitec

Differential Revision:
  https://reviews.llvm.org/D52518

llvm-svn: 343030
2018-09-25 21:21:18 +00:00
Anna Thomas b1e3d45318 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Teresa Johnson 7fb39dfa7c [ThinLTO] Efficiency fix for writing type id records in per-module indexes
Summary:
In D49565/r337503, the type id record writing was fixed so that only
referenced type ids were emitted into each per-module index for ThinLTO
distributed builds. However, this still left an efficiency issue: each
per-module index checked all type ids for membership in the referenced
set, yielding O(M*N) performance (M indexes and N type ids).

Change the TypeIdMap in the summary to be indexed by GUID, to facilitate
correlating with type identifier GUIDs referenced in the function
summary TypeIdInfo structures. This allowed simplifying other
places where a map from type id GUID to type id map entry was previously
being used to aid this correlation.

Also fix AsmWriter code to handle the rare case of type id GUID
collision.

For a large internal application, this reduced the thin link time by
almost 15%.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51330

llvm-svn: 343021
2018-09-25 20:14:40 +00:00
Sanjay Patel 10c11b867a [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449)
This is the final (I hope!) problem pattern mentioned in PR37749:
https://bugs.llvm.org/show_bug.cgi?id=37749

We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. 
We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like 
extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches 
that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op.

The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, 
we have this vector-type-legalized sequence:

        t29: v8i32 = concat_vectors t27, t28
      t30: v4i64 = bitcast t29
        t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ...
      t31: v4i64 = bitcast t18
    t32: v4i64 = xor t30, t31
      t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ...
    t34: v4i64 = bitcast t9
  t35: v4i64 = and t32, t34
t36: v8i32 = bitcast t35
      t37: v4i32 = extract_subvector t36, Constant:i64<0>
      t38: v4i32 = extract_subvector t36, Constant:i64<4>

Differential Revision: https://reviews.llvm.org/D52318

llvm-svn: 343008
2018-09-25 19:09:34 +00:00
Yury Delendik 7c18d6083a [WebAssembly] Move/clone DBG_VALUE during WebAssemblyRegStackify pass
Summary:
The MoveForSingleUse or MoveAndTeeForMultiUse functions move wasm instructions,
however DBG_VALUE stay unchanged -- moving or cloning these.

Reviewers: dschuff

Reviewed By: dschuff

Subscribers: mattd, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits, aardappel

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49034

llvm-svn: 343007
2018-09-25 18:59:34 +00:00
Jessica Paquette e02de05b32 Revert "[ConstHoist] Do not rebase single (or few) dependent constant"
This caused a couple test failures on a bot:

CodeGen/X86/constant-hoisting-bfi.ll
Transforms/ConstantHoisting/X86/ehpad.ll

Example:

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/

llvm-svn: 343005
2018-09-25 18:41:40 +00:00
Daniil Fukalov 349b5943b4 [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled
For the AMDGPU target if a MBB contains exec mask restore preamble, SplitEditor may get state when it cannot insert a spill instruction.

E.g. for a MIR

bb.100:
    %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec
and if the regalloc will try to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert spill instruction before the S_OR_SAVEEXEC_B64 instruction.
But it is not possible since can generate incorrect code in terms of exec mask.

The change makes regalloc to ignore such physreg candidates.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D52052

llvm-svn: 343004
2018-09-25 18:37:38 +00:00
Daniel Sanders 06f4ff1952 [globalisel][tblgen] Table optimization should consider the C++ code in C++ predicates
This fixes PR39045

llvm-svn: 342997
2018-09-25 17:59:02 +00:00
Zhaoshi Zheng 2c1a09188f [ConstHoist] Do not rebase single (or few) dependent constant
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.

Differential Revision: https://reviews.llvm.org/D52243

llvm-svn: 342994
2018-09-25 17:45:37 +00:00
Justin Bogner ef2ae740c6 Revert "[DebugInfo] Do not generate address info for removed debug labels."
The added test is failing on macOS:

  http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/

This reverts r342943.

llvm-svn: 342993
2018-09-25 17:29:30 +00:00
Craig Topper 6fb1358a98 [X86] Add AVX512 support to combineVectorSizedSetCCEquality.
Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52424

llvm-svn: 342989
2018-09-25 16:27:12 +00:00
Sanjay Patel 69ed4710b8 [InstCombine] narrow binops on concatenated vectors (PR33026)
The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=33026
...has no shuffles now. This kind of pattern may occur during
vectorization when targets have lumpy ISAs like SSE/AVX.

llvm-svn: 342988
2018-09-25 15:57:37 +00:00
Guillaume Chatelet 345fae5d56 [llvm-exegesis] Serializes registers initial values.
Summary: Adds the registers initial values to the YAML output of llvm-exegesis.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52460

llvm-svn: 342982
2018-09-25 15:15:54 +00:00
Guillaume Chatelet 6078f82241 [llvm-exegesis] Fix missing document separator in YAML output.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52496

llvm-svn: 342981
2018-09-25 14:48:24 +00:00
Clement Courbet 86baebc5fd [llvm-exegesis] Add lit tests (v2).
Summary: This revisits rL342953 by adding detection of host support.

Reviewers: gchatelet, lebedev.ri, alexshap

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52464

llvm-svn: 342975
2018-09-25 13:59:35 +00:00
Simon Pilgrim b56be79e0c Revert rL342916: [X86] Remove shift/rotate by CL memory (RMW) overrides
As suggested by Craig Topper - I'm going to look at cleaning up the RMW sequences instead.

The uops are slightly different to the register variant, so requires a +1uop tweak

llvm-svn: 342969
2018-09-25 13:01:26 +00:00
David Green 9108c2b921 [LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder
In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder,
similar to how it is already done in the llvm::UnrollLoop.

The compiler would crash if this function is called with a malformed loop.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D51486

llvm-svn: 342958
2018-09-25 10:08:47 +00:00
Sameer Sahasrabuddhe b4f2d1cb68 [AMDGPU] restore r342722 which was reverted with r342743
[AMDGPU] lower-switch in preISel as a workaround for legacy DA

Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

llvm-svn: 342956
2018-09-25 09:39:21 +00:00
Clement Courbet 6d92c198ac Revert rL342953 "[llvm-exegesis] Add lit tests."
We also need to make sure that we're on the right subtarget.

llvm-svn: 342955
2018-09-25 09:36:44 +00:00
Clement Courbet 7f1322dc4d [llvm-exegesis] Add lit tests.
Summary:
Right now we only have unit tests. This will allow testing the whole
tool. Even though We can't really check actual values, this will avoid
regressions such as PR39055.

Reviewers: gchatelet, alexshap

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52407

llvm-svn: 342953
2018-09-25 09:27:43 +00:00
Hsiangkai Wang 9c2463622d [DebugInfo] Do not generate address info for removed debug labels.
In some senario, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In the case, these debug labels will have
address zero as their address. It is not legal address for debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 342943
2018-09-25 06:09:50 +00:00
Thomas Lively 12da0f9c3d [WebAssembly] SIMD sqrt
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52387

llvm-svn: 342937
2018-09-25 03:39:28 +00:00
Stanislav Mekhanoshin 14fefe7f8e [AMDGPU] Remove useless check from test. NFC.
The check for assignment of zero is practically useless
while the assignment moves around with different scheduling.

llvm-svn: 342935
2018-09-25 01:24:54 +00:00
Craig Topper 9ce5da7b62 [X86] Don't create FILD ISD nodes when X87 is disabled.
The included test case previously asserted because the type legalizer tried to soften the FILD ISD node.

Fixes PR38819.

llvm-svn: 342934
2018-09-25 00:16:57 +00:00
Thomas Lively 586153652c [WebAssembly][NFC] Fix hardcoded stack indices in tests
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52388

llvm-svn: 342928
2018-09-24 23:42:07 +00:00
Evgeniy Stepanov 090f0f9504 [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342923
2018-09-24 23:03:34 +00:00
Evgeniy Stepanov 20c4999e8b Revert "[hwasan] Record and display stack history in stack-based reports."
This reverts commit r342921: test failures on clang-cmake-arm* bots.

llvm-svn: 342922
2018-09-24 22:50:32 +00:00
Evgeniy Stepanov 9043e17edd [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342921
2018-09-24 21:38:42 +00:00
Christy Lee e94374809e Re-submitting changes in D51550 because it failed to patch.
Reviewers: javed.absar, trentxintong, courbet

Reviewed By: trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52433

llvm-svn: 342919
2018-09-24 20:47:12 +00:00
Simon Pilgrim 0b4ad7596f [X86] Remove shift/rotate by CL memory (RMW) overrides
The uops are slightly different to the register variant, so requires a +1uop tweak

llvm-svn: 342916
2018-09-24 20:11:50 +00:00
Stefan Pintilie b5305771fb [Power9] [LLVM] Add __float128 exponent GET and SET builtins
Added

__builtin_vsx_scalar_extract_expq
__builtin_vsx_scalar_insert_exp_qp

Builtins should behave the same way as in GCC.

Differential Revision: https://reviews.llvm.org/D48185

llvm-svn: 342910
2018-09-24 18:14:13 +00:00
Simon Pilgrim 51cbd838d0 [X86][AVX] Add truncation as shuffle test for PR31451
llvm-svn: 342908
2018-09-24 17:26:31 +00:00
Christy Lee bf112ea25b Reland r342494 after fixing LIT checks.
llvm-svn: 342907
2018-09-24 17:26:30 +00:00
Sanjay Patel 7b86bc22de [InstCombine] add/move tests for extractelement; NFC
llvm-svn: 342905
2018-09-24 17:17:16 +00:00
Zhaoshi Zheng 05b46dc300 [Thumb1] Any imm8 should have cost of 1
A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.

Differential Revision: https://reviews.llvm.org/D52257

llvm-svn: 342898
2018-09-24 16:15:23 +00:00
Fedor Sergeev 662e5686fe [New PM][PassInstrumentation] IR printing support for New Pass Manager
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.

- PrintIR routines implement printing callbacks.

- StandardInstrumentations class provides a central place to manage all
  the "standard" in-tree pass instrumentations. Currently it registers
  PrintIR callbacks.

Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923

llvm-svn: 342896
2018-09-24 16:08:15 +00:00
Simon Pilgrim 00865a48d1 [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)
Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.

This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.

llvm-svn: 342892
2018-09-24 15:21:57 +00:00
Luke Cheeseman ab7f9b170d [Arm][AsmParser] Restrict register list size for VSTM/VLDM
- The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified
- The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
- This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52082

llvm-svn: 342891
2018-09-24 15:13:48 +00:00
Sanjay Patel 2c901742ca [DAGCombiner] use UADDO to optimize saturated unsigned add
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for 
use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively 
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in 
CGP after D8889 without checking target lowering to see if the op is supported. 
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with 
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned 
saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only 
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title 
(unlike x86 which sees improvements for all sizes because all sizes are 'custom'). 
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. 
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, 
so this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom 
check altogether because we're assuming that a UADDO sequence is canonical/optimal 
before we ever reach here. But that seems like a bug to me. If the target doesn't 
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining 
using a UADDO node. This is similar justification for why we don't canonicalize IR to 
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first 
place.

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886
2018-09-24 14:47:15 +00:00
Petar Jovanovic f9808c5f09 [Mips][FastISel] Fix selectBranch on icmp i1
The r337288 tried to fix result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.

Patch by Dragan Mladjenovic.

Differential Revision: https://reviews.llvm.org/D52045

llvm-svn: 342884
2018-09-24 14:14:19 +00:00
Zaara Syeda edefda48d2 [PowerPC] Support operand modifier 'x' in inline asm
gcc uses operand modifier 'x' in inline asm for VSX registers.
Without this modifier, instructions which use VSX numbering for their
operands are printed as VMX registers. This patch adds support for the
operand modifier 'x'.

Differential Revision: https://reviews.llvm.org/D52244

llvm-svn: 342882
2018-09-24 14:01:16 +00:00
Jonas Devlieghere 8a7cfc6c86 [dsymutil] Set LSan blacklist whenever sanitizers are enabled.
LSan can be enabled by itself or as part of the address sanitizer.
Rather than checking the enabled sanitizers for both, just set the LSan
env options whenever a sanitizer is enabled.

llvm-svn: 342881
2018-09-24 13:56:36 +00:00
Roman Lebedev fb697d0f1b [NFC][CodeGen][X86][AArch64] More tests for 'bit field extract' w/ constants
It would be best to introduce ISD::BitFieldExtract,
because clearly more than one backend faces the same problem.
But for now let's solve this in the x86-specific DAG combine.

https://bugs.llvm.org/show_bug.cgi?id=38938

llvm-svn: 342880
2018-09-24 13:24:20 +00:00
Matt Arsenault f432011d33 AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses
If the alignment is at least 4, this should report true.

Something still seems off with how < 4-byte types are
handled here though.

Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.

llvm-svn: 342879
2018-09-24 13:18:15 +00:00
Matt Arsenault b53feca372 Fix some missing opcodes in bcanalyzer
llvm-svn: 342878
2018-09-24 12:47:17 +00:00
Sjoerd Meijer d986ede313 [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33
A sequence of VMUL and VADD instructions always give the same or better
performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33.
Executing the VMUL and VADD back-to-back requires the same cycles, but
having separate instructions allows scheduling to avoid the hazard between
these 2 instructions.

Differential Revision: https://reviews.llvm.org/D52289

llvm-svn: 342874
2018-09-24 12:02:50 +00:00
Luke Cheeseman bda54bca39 [ARM][ARMLoadStoreOptimizer]
- The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
- This is an UNPREDICTABLE instruction and shouldn't be done
- It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch
- This fixes https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52085

llvm-svn: 342872
2018-09-24 10:42:22 +00:00
Petar Jovanovic c451c9ef50 [deadargelim] Update dbg.value of 'unused' parameters
DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D51968

llvm-svn: 342871
2018-09-24 10:01:24 +00:00
Sam Parker a7b2405b06 [ARM] bottom-top mul support ARMParallelDSP
Originally committed in rL342210 but was reverted in rL342260 because
it was causing issues in vectorized code, because I had forgotten to
ensure that we're operating on scalar values.

Original commit message:

On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 342870
2018-09-24 09:34:06 +00:00
Craig Topper 2b8107614c [X86] Add 512-bit test cases to setcc-wide-types.ll. NFC
llvm-svn: 342860
2018-09-24 05:46:01 +00:00
Matt Arsenault 9a71e80645 Fix asserts when linking wrong address space declarations
llvm-svn: 342858
2018-09-24 04:42:14 +00:00
Matt Arsenault ce5f203415 llvm-diff: Fix crash on anonymous functions
Not sure what the correct behavior is for this.
Skip them and report how many there were.

llvm-svn: 342857
2018-09-24 04:42:13 +00:00
Simon Pilgrim 9202c9fb47 [X86] ROR*mCL instruction models should match ROL*mCL etc.
Confirmed with Craig Topper - fix a typo that was missing a Port4 uop for ROR*mCL instructions on some Intel models.

Yet another step on the scheduler model cleanup marathon......

llvm-svn: 342846
2018-09-23 19:16:01 +00:00
Sanjay Patel 0027946915 [DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation
This is an alternative to https://reviews.llvm.org/D37896. We can't decompose 
multiplies generically without a target hook to tell us when it's profitable.

ARM and AArch64 may be able to remove some existing code that overlaps with
this transform.

This extends D52195 and may resolve PR34474: 
https://bugs.llvm.org/show_bug.cgi?id=34474
(still an open question about transforming legal vector multiplies, but we
could open another bug report for those)

llvm-svn: 342844
2018-09-23 18:41:38 +00:00
Simon Pilgrim 19952add7c [X86] Added missing RCL/RCR schedule overrides to the generic SNB model
The SandyBridge model was missing schedule values for the RCL/RCR values - instead using the (incredibly optimistic) WriteShift (now WriteRotate) defaults.

I've added overrides with more realistic (slow) values, based on a mixture of Agner/instlatx64 numbers and what later Intel models do as well.

This is necessary to allow WriteRotate to be updated to remove other rotate overrides.

It'd probably be a good idea to investigate a WriteRotateCarry class at some point but its not high priority given the unusualness of these instructions.

llvm-svn: 342842
2018-09-23 17:40:24 +00:00
Sanjay Patel 151efca3fe [x86] add tests for mul decomposition with negative constant; NFC
llvm-svn: 342838
2018-09-23 16:07:46 +00:00
Eugene Leviant 2b70d616f0 [WholeProgramDevirt] Don't process declarations when building type id map
Differential revision: https://reviews.llvm.org/D52175

llvm-svn: 342836
2018-09-23 13:27:47 +00:00
Craig Topper c296436a30 [X86] Add isel pattern for (v8i16 (sext (v8i1))) with DQI and no BWI.
Our lowering that tries to avoid this sign extend can be defeated by the DAG combine folding it with a truncate.

The pattern needs to extend to an v8i32 then truncate back down to v8i16.

llvm-svn: 342830
2018-09-23 06:49:48 +00:00
Tri Vo 6c47c62588 [AArch64] Support adding X[8-15,18] registers as CSRs.
Summary:
Specifying X[8-15,18] registers as callee-saved is used to support
CONFIG_ARM64_LSE_ATOMICS in Linux kernel. As part of this patch we:
- use custom CSR list/mask when user specifies custom CSRs
- update Machine Register Info's list of CSRs with additional custom CSRs in
LowerCall and LowerFormalArguments.

Reviewers: srhines, nickdesaulniers, efriedma, javed.absar

Reviewed By: nickdesaulniers

Subscribers: kristof.beyls, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52216

llvm-svn: 342824
2018-09-22 22:17:50 +00:00
Yonghong Song 55b9156730 [bpf] Test case for symbol information in object file
This patch tests the change introduced in r342556.

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
llvm-svn: 342807
2018-09-22 17:31:01 +00:00
Sanjay Patel 09e02fbf51 [InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814)
Follow-up to rL342324 (D52059):

Missing optimizations with blendv are shown in:
https://bugs.llvm.org/show_bug.cgi?id=38814

This is an easier and more powerful solution than adding pattern matching for a few 
special cases in the backend. The potential danger with this transform in IR is that 
the condition value can get separated from the select, and the backend might not be 
able to make a blendv out of it again.

llvm-svn: 342806
2018-09-22 14:43:55 +00:00
George Rimar ce95ac6490 [lib/MC] - Set SHF_EXCLUDE flag for .dwo sections.
DWARF5 spec says about single file split case:

"The sections that do not require relocation, however, can be written
to the relocatable object (.o) file but ignored by the
the linker or they can be written to a separate DWARF object (.dwo) file
that need not be accessed by the linker."

Nice way to make linker to ignore them is to set SHF_EXCLUDE flag.
It seems to be not harmful to always set it for .dwo sections.
That is what this patch does.

Differential revision: https://reviews.llvm.org/D52303

llvm-svn: 342800
2018-09-22 07:36:20 +00:00
Craig Topper 2b3f5df73a [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163.

Reviewers: spatel, dmgreen

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52070

llvm-svn: 342797
2018-09-22 05:53:27 +00:00
Craig Topper 082e04c61d [X86] Fix inline expansion for memset in x32
Summary: Similar to D51893 which was for memcpy

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52063

llvm-svn: 342796
2018-09-22 05:16:35 +00:00
Craig Topper 9995760df4 [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) for vXi8 vectors.
We don't have a vXi8 shift left so we need to bitcast to a vXi16 vector to perform the shift. If we let lowering legalize the vXi8 shift we get an extra and that we don't need and fail to remove.

llvm-svn: 342795
2018-09-22 05:08:38 +00:00
Jordan Rupprecht 8d60f9b6d2 [llvm-size] Berkeley formatting: use tabs instead of spaces as field delimeters.
This matches GNU behavior for size and allows use of cut to parse the output of llvm-size.

llvm-svn: 342791
2018-09-21 23:48:12 +00:00
Craig Topper ecdab03d10 [X86] Teach fast isel to use MOV32ri64 for loading an unsigned 32 immediate into a 64-bit register.
Previously we used SUBREG_TO_REG+MOV32ri. But regular isel was changed recently to use the MOV32ri64 pseudo. Fast isel now does the same.

llvm-svn: 342788
2018-09-21 23:14:05 +00:00
Warren Ristow 4f27730eaf [Loop Vectorizer] Abandon vectorization when no integer IV found
Support for vectorizing loops with secondary floating-point induction
variables was added in r276554.  A primary integer IV is still required
for vectorization to be done.  If an FP IV was found, but no integer IV
was found at all (primary or secondary), the attempt to vectorize still
went forward, causing a compiler-crash.  This change abandons that
attempt when no integer IV is found.  (Vectorizing FP-only cases like
this, rather than bailing out, is discussed as possible future work
in D52327.)

See PR38800 for more information.

Differential Revision: https://reviews.llvm.org/D52327

llvm-svn: 342786
2018-09-21 23:03:50 +00:00
Zachary Turner 6345e84dde [NativePDB] Add support for reading function signatures.
This adds support for parsing function signature records and returning
them through the native DIA interface.

llvm-svn: 342780
2018-09-21 22:36:28 +00:00
Zachary Turner 355ffb0032 [PDB] Add native reading support for UDT / class types.
This allows the native reader to find records of class/struct/
union type and dump them.  This behavior is tested by using the
diadump subcommand against golden output produced by actual DIA
SDK on the same PDB file, and again using pretty -native to
confirm that we actually dump the classes.  We don't find class
members or anything like that yet, for now it's just the class
itself.

llvm-svn: 342779
2018-09-21 22:36:04 +00:00
Fedor Sergeev 10febb0779 [New PM][PassInstrumentation] Adding PassInstrumentation to the AnalysisManager runs
As a prerequisite to time-passes implementation which needs to time both passes
and analyses, adding instrumentation points to the Analysis Manager.
The are two functional differences between Pass and Analysis instrumentation:
  - the latter does not increment pass execution counter
  - it does not provide ability to skip execution of the corresponding analysis

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D51275

llvm-svn: 342778
2018-09-21 22:10:17 +00:00
Adrian Prantl 2e102480ac llvm-dwarfdump --statistics: Unique abstract origins across multiple CUs.
Instead of indexing local variables by DIE offset, use the variable
name + the path through the lexical block tree. This makes the lookup
key consistent across duplicate abstract origins in different CUs.

llvm-svn: 342776
2018-09-21 21:59:34 +00:00
Sanjay Patel db1fb8cd20 [x86] add more tests for poetntial andnp splitting with AVX1; NFC
llvm-svn: 342775
2018-09-21 21:25:16 +00:00
Simon Pilgrim eabc582b62 [X86] Add AVX512 target to load scalar to vector tests
To investigate broadcast instruction codegen for D51553

llvm-svn: 342773
2018-09-21 21:08:26 +00:00
Thomas Lively 720fbf553d [WebAssembly][NFC] Rename simd-conversions test to simd-bitcasts
Summary:
This name is more accurate and I want to reuse the simd-conversions
name for testing the actual conversion ops.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52333

llvm-svn: 342761
2018-09-21 18:46:39 +00:00
Caroline Tice 3dea3f9e0a Pass code-model through Module IR to LTO which will use it.
Currently the code-model does not get saved in the module IR,
so if a code model is specified when compiling with LTO,
it gets lost and is not propagated properly to LTO. This patch,
along with one for the front end, fixes that.

Differential Revision: https://reviews.llvm.org/D52322

llvm-svn: 342760
2018-09-21 18:41:31 +00:00
Sanjay Patel 19b5eb580b [x86] add (negative) andnp test for D52318; NFC
llvm-svn: 342756
2018-09-21 18:24:53 +00:00
Sanjay Patel 83a1b66c43 [x86] add test with optsize attribute for scalar->vector transform; NFC
llvm-svn: 342755
2018-09-21 18:03:49 +00:00
Wouter van Oortmerssen 7beaa30e4e [WebAssembly] Made assembler only use stack instruction tablegen defs
Summary:
This ensures we have the non-register version of the instruction.

The stack version of call_indirect now wants a type index argument,
so that has been added in the existing tests.

Tested:
llvm-lit -v `find test -name WebAssembly`

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51662

llvm-svn: 342753
2018-09-21 17:47:58 +00:00
Sameer Sahasrabuddhe 0807e94951 revert changes from r342722
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"

This broke regression tests. The first breakage was noticed here:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549

llvm-svn: 342743
2018-09-21 16:31:51 +00:00
Sanjay Patel 72d627e5ec [InstCombine] add tests for extractelement; NFC
There are folds under visitExtractElementInst() that don't appear
to have any test coverage, so adding a few basic cases here.

llvm-svn: 342740
2018-09-21 14:43:49 +00:00
Clement Courbet 8171bd8e0f [X86][Sched] Add zero idiom sched data to the SNB model.
Summary:
On SNB, renamer-based zeroing does not work for:
 - 16 and 8-bit GPRs[1].
 - MMX [2].
 - ANDN variants [3]

[1] echo 'sub %ax, %ax' | /tmp/llvm-exegesis -mode=uops -snippets-file=-
[2] echo 'pxor %mm0, %mm0' | /tmp/llvm-exegesis -mode=uops -snippets-file=-
[3] echo 'andnps %xmm0, %xmm0' | /tmp/llvm-exegesis -mode=uops -snippets-file=-

Reviewers: RKSimon, andreadb

Subscribers: gbedwell, craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D52358

llvm-svn: 342736
2018-09-21 14:07:20 +00:00
Andrea Di Biagio 4cd5cf9fc8 [X86][BtVer2] Fix latency and resource cycles of AVX 256-bit zero-idioms.
This patch introduces a SchedWriteVariant to describe zero-idiom VXORP(S|D)Yrr
and VANDNP(S|D)Yrr.

This is a follow-up of r342555.

On Jaguar, a VXORPSYrr is 2 macro opcodes. Only one opcode is eliminated at
register-renaming stage. The other opcode has to be executed to set the upper
half of the destination YMM.
Same for VANDNP(S|D)Yrr.

Differential Revision: https://reviews.llvm.org/D52347

llvm-svn: 342728
2018-09-21 12:43:07 +00:00
Jonas Devlieghere 7cf529c11f [test] Fix Assembler/debug-info.ll
Update Assembler/debug-info.ll to contain discriminator.

llvm-svn: 342727
2018-09-21 12:28:44 +00:00
Andrea Di Biagio eebfecee4f [X86] Add scheduling tests for AVX1 256-bit zero-idioms. NFC
llvm-svn: 342726
2018-09-21 12:22:14 +00:00
Jonas Devlieghere a0c9cb1913 Ensure that variant part discriminator is read by MetadataLoader
https://reviews.llvm.org/D42082 introduced variant parts to debug info
in LLVM. Subsequent work on the Rust compiler has found a bug in that
patch; namely, there is a path in MetadataLoader that fails to restore
the discriminator.

This patch fixes the bug.

Patch by: Tom Tromey

Differential revision: https://reviews.llvm.org/D52340

llvm-svn: 342725
2018-09-21 12:03:14 +00:00
Jonas Devlieghere 907ed15f99 [dsymutil] Suppress CoreFoundation leaks in tests.
This suppresses CoreFoundation originated leaks in the dsymutil tests.
I'm not sure if this is a false positive or not, but either way we don't
have control over it and shouldn't keep the bot red.

http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan/

llvm-svn: 342724
2018-09-21 11:55:17 +00:00
Sameer Sahasrabuddhe 2de7653fd5 [AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll

Differential Revision: https://reviews.llvm.org/D52221

llvm-svn: 342722
2018-09-21 11:26:55 +00:00
Alexander Timofeev 36617f0160 [AMDGPU] Divergence driven instruction selection. Part 1.
Summary: This change is the first part of the AMDGPU target description
    change. The aim of it is the effective splitting the vector and scalar
    flows at the selection stage. Selection uses predicate functions based
    on the framework implemented earlier - https://reviews.llvm.org/D35267

    Differential revision: https://reviews.llvm.org/D52019

    Reviewers: rampitec

llvm-svn: 342719
2018-09-21 10:31:22 +00:00
Jonas Devlieghere b32274242d [dwarfdump] Verify DW_AT_type is set and points to a compatible DIE.
This extends the verifier to catch three new errors:

  * Missing DW_AT_type attributes for DW_TAG_formal_parameter,
    DW_TAG_variable and DW_TAG_array_type.

  * Valid references for DW_AT_type pointing to a non-type tag.

Differential revision: https://reviews.llvm.org/D52223

llvm-svn: 342713
2018-09-21 07:50:21 +00:00
Jonas Devlieghere 7ef2c2021e [dwarfdump] Verify compatibility of attribute TAGs.
Verify that DW_AT_specification and DW_AT_abstract_origin reference a
DIE with a compatible tag.

Differential revision: https://reviews.llvm.org/D38719

llvm-svn: 342712
2018-09-21 07:49:29 +00:00
JF Bastien 73d8e4e531 Merge clang's isRepeatedBytePattern with LLVM's isBytewiseValue
Summary:
his code was in CGDecl.cpp and really belongs in LLVM's isBytewiseValue. Teach isBytewiseValue the tricks clang's isRepeatedBytePattern had, including merging undef properly, and recursing on more types.

clang part of this patch: D51752

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51751

llvm-svn: 342709
2018-09-21 05:17:42 +00:00
Jordan Rupprecht 7b1c8168c7 [llvm-objcopy/llvm-strip]: handle --version
Summary:
Implement --version for objcopy and strip.

I think there are LLVM utilities that automatically handle this, but that doesn't seem to work with custom parsing since this binary handles both objcopy and strip, so it uses custom parsing.

This fixes PR38298

Reviewers: jhenderson, alexshap, jakehehrlich

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52328

llvm-svn: 342702
2018-09-21 00:47:31 +00:00
Yonghong Song 150ca5143b bpf: check illegal usage of XADD insn return value
Currently, BPF has XADD (locked add) insn support and the
asm looks like:
  lock *(u32 *)(r1 + 0) += r2
  lock *(u64 *)(r1 + 0) += r2
The instruction itself does not have a return value.

At the source code level, users often use
  __sync_fetch_and_add()
which eventually translates to XADD. The return value of
__sync_fetch_and_add() is supposed to be the old value
in the xadd memory location. Since BPF::XADD insn does not
support such a return value, this patch added a PreEmit
phase to check such a usage. If such an illegal usage
pattern is detected, a fatal error will be reported like
  line 4: Invalid usage of the XADD return value
if compiled with -g, or
  Invalid usage of the XADD return value
if compiled without -g.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 342692
2018-09-20 22:24:27 +00:00
Thomas Lively bb993db080 [WebAssembly][NFC] Add missing tests for indirect calls
Summary: Depends on D52105.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52254

llvm-svn: 342691
2018-09-20 22:08:27 +00:00
Thomas Lively 6f21a13675 [WebAssembly] Add V128 value type to binary format
Summary: Adds the necessary support to lib/ObjectYAML and fixes SIMD
calls to allow the tests to work. Also removes some dead code that
would otherwise have to have been updated.

Reviewers: aheejin, dschuff, sbc100

Subscribers: jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52105

llvm-svn: 342689
2018-09-20 22:04:44 +00:00
Sanjay Patel 18c29b7d74 [InstCombine] rename test file, simplify tests, regenerate full checks; NFC
Fast-math is irrelevant for these transforms.

llvm-svn: 342683
2018-09-20 21:10:14 +00:00
Walter Lee f75e803679 [RegAllocGreedy] Fix crash in tryLocalSplit
tryLocalSplit only handles a single use block, but an interval may
have multiple use blocks.  So don't crash in that case.  This fixes
PR38795.

Differential revision: https://reviews.llvm.org/D52277

llvm-svn: 342682
2018-09-20 20:05:57 +00:00
Vedant Kumar 386ad01c7b [Bitcode] Address backwards compat bug in r342631
r342631 expanded bitc::METADATA_LOCATION by one element. The bitcode
metadata loader was changed in a backwards-incompatible way, leading to
crashes when disassembling old bitcode:

  assertion: empty() && "PlaceholderQueue hasn't been flushed before being destroyed"
  Assertion failed: (empty() && "PlaceholderQueue hasn't been flushed before being destroyed")

This commit teaches the metadata loader to assume that the newly-added
IsImplicitCode bit is 'false' when not present in old bitcode. I've added a
bitcode compat regression test.

rdar://44645820

llvm-svn: 342678
2018-09-20 18:59:33 +00:00
Sameer AbuAsal 77beee4136 [inline Cost] Don't mark functions accessing varargs as non-inlinable
Summary:
rL323619 marks functions that are calling va_end as not viable for
inlining. This patch reverses that since this va_end doesn't need
access to the vriadic arguments list that are saved on the stack, only
va_start does.

Reviewers: efriedma, fhahn

Reviewed By: fhahn

Subscribers: eraman, haicheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D52067

llvm-svn: 342675
2018-09-20 18:39:34 +00:00
Sanjay Patel dfe4380440 [InstCombine] add tests for vector concat with binop (PR33026); NFC
llvm-svn: 342665
2018-09-20 17:10:38 +00:00
Fedor Sergeev ee8d31c49e [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

  Made getName helper to return std::string (instead of StringRef initially) to fix
  asan builtbot failures on CGSCC tests.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342664
2018-09-20 17:08:45 +00:00
Zachary Turner 4bb55c6a0d [PDB] Fix failing test.
This test was missed on the last run since I only ran a subset
of them before commiting.

llvm-svn: 342659
2018-09-20 16:12:27 +00:00
Calixte Denizet 0b1fe47e22 [gcov] Fix wrong line hit counts when multiple blocks are on the same line
Summary:
The goal of this patch is to have the same behaviour than gcc-gcov.
Currently the hit counts for a line is the sum of the counts for each block on that line.
The idea is to detect the cycles in the graph of blocks in using the algorithm by Hawick & James.
The count for a cycle is the min of the counts for each edge in the cycle.
Once we've the count for each cycle, we can sum them and add the transition counts of those cycles.

Fix both https://bugs.llvm.org/show_bug.cgi?id=38065 and https://bugs.llvm.org/show_bug.cgi?id=38066

Reviewers: marco-c, davidxl

Reviewed By: marco-c

Subscribers: vsk, lebedev.ri, sylvestre.ledru, dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D49659

llvm-svn: 342657
2018-09-20 16:09:30 +00:00
Zachary Turner cfa1d499f9 [PDB] Add the ability to map forward references to full decls.
Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM
which is a forward reference and doesn't contain complete debug
information. In these cases, we'd like to be able to quickly locate the
full record. The TPI stream stores an array of pre-computed record hash
values, one for each type record. If we pre-process this on startup, we
can build a mapping from hash value -> {list of possible matching type
indices}. Since hashes of full records are only based on the name and or
unique name and not the full record contents, we can then use forward
ref record to compute the hash of what *would* be the full record by
just hashing the name, use this to get the list of possible matches, and
iterate those looking for a match on name or unique name.

llvm-pdbutil is updated to resolve forward references for the purposes
of testing (plus it's just useful).

Differential Revision: https://reviews.llvm.org/D52283

llvm-svn: 342656
2018-09-20 15:50:13 +00:00
Andrea Di Biagio 0aea310391 [llvm-mca][BtVer2] Modify ANDN tests in zero-idioms-avx-256.s. NFC
Two test cases should have tested 256-bit variants of VANDN zero-idioms instead
of the 128-bit variants.

llvm-svn: 342655
2018-09-20 15:48:23 +00:00
Jesper Antonsson 719fa055d0 [InstCombine] Handle vector compares in foldGEPIcmp()
Summary:
This is to fix PR38984 "InstCombine assertion at vector gep/icmp folding":
https://bugs.llvm.org/show_bug.cgi?id=38984

Reviewers: majnemer, spatel, lattner, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D52263

llvm-svn: 342647
2018-09-20 13:37:28 +00:00
Simon Pilgrim 6f630e4618 Fix line-endings. NFCI.
llvm-svn: 342639
2018-09-20 10:59:08 +00:00
George Rimar 425f75172f [DWARF] - Emit the correct value for DW_AT_addr_base.
Currently, we emit DW_AT_addr_base that points to the beginning of
the .debug_addr section. That is not correct for the DWARF5 case because address
table contains the header and the attribute should point to the first entry
following the header.

This is currently the reason why LLDB does not work with such executables correctly.
Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D52168

llvm-svn: 342635
2018-09-20 09:17:36 +00:00
Bjorn Pettersson cd53b7f54e [IPSCCP] Fix a problem with removing labels in a switch with undef condition
Summary:
Before removing basic blocks that ipsccp has considered as dead
all uses of the basic block label must be removed. That is done
by calling ConstantFoldTerminator on the users. An exception
is when the branch condition is an undef value. In such
scenarios ipsccp is using some internal assumptions regarding
which edge in the control flow that should remain, while
ConstantFoldTerminator don't know how to fold the terminator.

The problem addressed here is related to ConstantFoldTerminator's
ability to rewrite a 'switch' into a conditional 'br'. In such
situations ConstantFoldTerminator returns true indicating that
the terminator has been rewritten. However, ipsccp treated the
true value as if the edge to the dead basic block had been
removed. So the code for resolving an undef branch condition
did not trigger, and we ended up with assertion that there were
uses remaining when deleting the basic block.

The solution is to resolve indeterminate branches before the
call to ConstantFoldTerminator.

Reviewers: efriedma, fhahn, davide

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52232

llvm-svn: 342632
2018-09-20 09:00:17 +00:00
Calixte Denizet eb7f60201c [IR] Add a boolean field in DILocation to know if a line must covered or not
Summary:
Some lines have a hit counter where they should not have one.
For example, in C++, some cleanup is adding at the end of a scope represented by a '}'.
So such a line has a hit counter where a user expects to not have one.
The goal of the patch is to add this information in DILocation which is used to get the covered lines in GCOVProfiling.cpp.
A following patch in clang will add this information when generating IR (https://reviews.llvm.org/D49916).

Reviewers: marco-c, davidxl, vsk, javed.absar, rnk

Reviewed By: rnk

Subscribers: eraman, xur, danielcdh, aprantl, rnk, dblaikie, #debug-info, vsk, llvm-commits, sylvestre.ledru

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49915

llvm-svn: 342631
2018-09-20 08:53:06 +00:00
Alex Bradbury 226f3ef5a5 [RISCV][MC] Improve parsing of jal/j operands
Examples such as `jal a3`, `j a3` and `jal a3, a3` are accepted by gas 
but rejected by LLVM MC. This patch rectifies this. I introduce 
RISCVAsmParser::parseJALOffset to ensure that symbol names that coincide with 
register names can safely be parsed. This is made a somewhat fiddly due to the 
single-operand alias form (see the comment in parseJALOffset for more info).

Differential Revision: https://reviews.llvm.org/D52029

llvm-svn: 342629
2018-09-20 08:10:35 +00:00
Roman Lebedev 38c25ace53 [NFC][x86][AArch64] Add BEXTR-like test patterns.
Summary: Also, adjust the check prefixes so that we actually get to check the BMI1-only-case.

Reviewers: craig.topper, RKSimon, spatel, javed.absar

Reviewed By: RKSimon

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D48490

llvm-svn: 342623
2018-09-20 07:54:49 +00:00
Bjorn Pettersson b2154af25f [MachineVerifier] Relax checkLivenessAtDef regarding dead subreg defs
Summary:
Consider an instruction that has multiple defs of the same
vreg, but defining different subregs:
  %7.sub1:rc, dead %7.sub2:rc = inst

Calling checkLivenessAtDef for the live interval associated
with %7 incorrectly reported "live range continues after a
dead def". The live range for %7 has a dead def at the slot
index for "inst" even if the live range continues (given that
there are later uses of %7.sub1).

This patch adjusts MachineVerifier::checkLivenessAtDef
to allow dead subregister definitions, unless we are checking
a subrange (when tracking subregister liveness).

A limitation is that we do not detect the situation when the
live range continues past an instruction that defines the
full virtual register by multiple dead subreg defines.

I also removed some dead code related to physical register
in checkLivenessAtDef. Wwe only call that method for virtual
registers, so I added an assertion instead.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52237

llvm-svn: 342618
2018-09-20 06:59:18 +00:00
Eric Christopher 019889374b Temporarily Revert "[New PM] Introducing PassInstrumentation framework"
as it was causing failures in the asan buildbot.

This reverts commit r342597.

llvm-svn: 342616
2018-09-20 05:16:29 +00:00
Maya Madhavan ec1efe4ee3 Fix for bug 34002 - label generated before it block is finalized. Differential Revision: https://reviews.llvm.org/D52258
llvm-svn: 342615
2018-09-20 05:11:42 +00:00
QingShan Zhang cae9425a3c [PowerPC] Fix the assert of combineBVOfConsecutiveLoads when element num is 1
Building a vector out of multiple loads can be converted to a load of the vector type if the loads are consecutive.
But the special condition is that the element number is 1, such as <1 x i128>. So just early exit to fix the assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52072

llvm-svn: 342611
2018-09-20 03:09:15 +00:00
Thomas Lively f45de47c59 [WebAssembly] Renumber SIMD ops
Summary:
This change leaves holes in the opcode space where missing
instructions could logically be added later if they were found to be
useful.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52282

llvm-svn: 342610
2018-09-20 02:55:28 +00:00
Michael Berg 120ca61d7d [WEB] add new flags to a DebugInfo lit test
llvm-svn: 342598
2018-09-19 22:56:14 +00:00
Fedor Sergeev a5f279ea89 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342597
2018-09-19 22:42:57 +00:00
Sanjay Patel 0bda919870 [x86] add test for 256-bit andn (PR37749); NFC
llvm-svn: 342595
2018-09-19 22:00:56 +00:00