Commit Graph

55834 Commits

Author SHA1 Message Date
Carl Ritson f898edd117 [AMDGPU] Prevent sequences of non-instructions disrupting GCNHazardRecognizer wait state counting
Summary:
This fixes a bug where a large number of implicit def instructions can fill the GCNHazardRecognizer lookahead buffer causing required NOPs to not be inserted.

Reviewers: nhaehnle, arsenm

Reviewed By: arsenm

Subscribers: sheredom, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51726

Change-Id: Ie75338f94de704ee5816b05afd0c922c6748a95b
llvm-svn: 341798
2018-09-10 10:14:48 +00:00
Max Kazantsev 4d10ba37b9 [IndVars] Set Changed if sinkUnusedInvariants changes IR. PR38863
Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.

Differential Revision: https://reviews.llvm.org/D51777
Reviewed By: skatkov

llvm-svn: 341777
2018-09-10 06:32:00 +00:00
David Carlier 0efae196dd [XRay] Fix buildbot failure
llvm-svn: 341774
2018-09-10 05:29:49 +00:00
David Carlier 07cc5a8df9 [Xray] tooling allow MachO format support
Getting writable xray __DATA sections from MachO as well.

Reviewers: dberris

Reviewed By: dberris

Differential Revision: https://reviews.llvm.org/D51758

llvm-svn: 341772
2018-09-10 05:00:43 +00:00
Matt Arsenault 72d27f5525 AMDGPU: Fix tests using old number for constant address space
llvm-svn: 341770
2018-09-10 02:54:25 +00:00
Matt Arsenault d77fcc2a92 AMDGPU: Use GOT PSV since it has an address space now
llvm-svn: 341768
2018-09-10 02:23:39 +00:00
Matt Arsenault b998674610 AMDGPU: Don't abort on unknown addrspace argument
llvm-svn: 341767
2018-09-10 02:23:30 +00:00
Craig Topper 3823516103 [X86] Custom type legalize (v2i32 (fp_to_uint v2f64))) without avx512vl by widening to v4i32 and v4f64 instead of v8i32 and v8f64. Make it aware of x86-experimental-vector-widening-legalization
We have isel patterns for v4i32/v4f64 that artificially widen to v8i32/v8f64 so just use that.

If x86-experimental-vector-widening-legalization is enabled, we don't need any custom legalization and can just return. I've modified the test RUN lines to cover this case.

llvm-svn: 341765
2018-09-09 20:36:36 +00:00
Sanjay Patel 6ebf218e4c [SelectionDAG] enhance vector demanded elements to look at a vector select condition operand
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.

The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit 
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed 
to be running there?

Differential Revision: https://reviews.llvm.org/D51696

llvm-svn: 341762
2018-09-09 14:13:22 +00:00
Craig Topper 7af5e333e7 [X86] Create paddus/psubus from narrower vectors with i8/i16 element types.
Summary:
This patch allows vectors with a power of 2 number of elements and i8/i16 element type to select paddus/psubus instructions. ReplaceNodeResults has been updated to custom widen these operations up to 128 bits like we already do for PAVG.

Another step towards fixing PR38691

Reviewers: RKSimon, spatel

Reviewed By: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51818

llvm-svn: 341753
2018-09-08 19:32:58 +00:00
Craig Topper a2c9694bc8 [X86] Mark the ADCX and ADOX instruction as commutable.
llvm-svn: 341752
2018-09-08 18:47:56 +00:00
Craig Topper 4677110348 [X86] Add test cases for commuting ADCX/ADOX instruction to avoid copies.
This is a MIR test so we can test ADOX which we have no isel patterns for. I also plan to remove ADCX isel patterns in the near future so this will help maintain coverage.

llvm-svn: 341751
2018-09-08 18:47:54 +00:00
Craig Topper c96305970d [X86] Add commuted isel pattern for the load form of ADCX instructions.
This prevents the legacy ADC instruction from being favored over ADCX when the load is in the operand 0.

llvm-svn: 341745
2018-09-08 06:31:43 +00:00
Craig Topper 22a6f51646 [X86] Add load folding test cases for the addcarryx intrinsic.
We are currently only able to fold a load in operand 1 to ADCX. A load in operand 0 will use the legacy ADC instruction.

Ultimately I want to remove isel support for ADCX, but first I'm going to fix the shortcomings I know of so I can write proper MIR tests to maintain coverage later.

llvm-svn: 341744
2018-09-08 06:31:41 +00:00
Craig Topper 761e88d1d4 [X86] Add stack folding MIR test for ADCX/ADOX.
We currently have no way to isel ADOX and I plan to remove isel patterns for ADCX. This test will ensure we still have stack folding support for these instructions if we need them in the future.

llvm-svn: 341743
2018-09-08 05:08:18 +00:00
Adrian Prantl 609bf36952 Remove addBlockByrefAddress(), it is dead code as far as clang is concerned.
This patch removes addBlockByrefAddress(), it is dead code as far as
clang is concerned: Every byref block capture is emitted with a
complex expression that is equivalent to what this function does.

rdar://problem/31629055

Differential Revision: https://reviews.llvm.org/D51763

llvm-svn: 341737
2018-09-08 00:21:55 +00:00
Zachary Turner 0119e38491 Fix some of the PDB tests.
They were unintentionally calling DIA directly, which requires
Windows.  We need to pass the -native flag, and this then required
fixing up one or two tests.

llvm-svn: 341731
2018-09-07 23:36:08 +00:00
Zachary Turner da4b63ab9a [PDB] Support pointer types in the native reader.
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.

llvm-svn: 341729
2018-09-07 23:21:33 +00:00
Reid Kleckner f803b23879 [COFF] Implement llvm.global_ctors priorities for MSVC COFF targets
Summary:
MSVC and LLD sort sections ASCII-betically, so we need to use section
names that sort between .CRT$XCA (the start) and .CRT$XCU (the default
priority).

In the general case, use .CRT$XCT12345 as the section name, and let the
linker sort the zero-padded digits.

Users with low priorities typically want to initialize as early as
possible, so use .CRT$XCA00199 for prioties less than 200. This number
is arbitrary.

Implements PR38552.

Reviewers: majnemer, mstorsjo

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51820

llvm-svn: 341727
2018-09-07 23:07:55 +00:00
Abderrazek Zaafrani c30dfb2dfc [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Laod operand.
Differential Revision: https://reviews.llvm.org/D49151

llvm-svn: 341726
2018-09-07 22:41:57 +00:00
Piotr Padlewski 9a925ba616 Set cost of invariant group intrinsics to 0
Summary:
Like with other similar intrinsics, presense of strip or
launder.invariant.group should not change the result of inlining cost.
This is because they are just markers and do not perform any computation.

Reviewers: amharc, rsmith, reames, kuhar

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D51814

llvm-svn: 341725
2018-09-07 22:29:48 +00:00
Thomas Lively a0d25815a0 [WebAssembly] v8x16.shuffle
Summary:
Since the shuffle mask is not exposed as an operand in the native ISel
DAG, create a new WebAssembly ISD node exposing the mask. The mask is
lowered as sixteen immediate byte indices no matter what type the
original vector shuffle was operating on.

This CL depends on D51656

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51659

llvm-svn: 341718
2018-09-07 21:54:46 +00:00
Sanjay Patel caa4de72a2 [InstCombine][x86] add tests for possible blendv transform (PR38814); NFC
llvm-svn: 341715
2018-09-07 21:40:41 +00:00
Philip Reames cb8b3278e5 [AST] Generalize argument specific aliasing
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch.  It was originally part of this one, but separate for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713
2018-09-07 21:36:11 +00:00
Reid Kleckner 06d02d0306 [codeview] Add .cv_string directive for testing purposes
The main use case for this directive is to allow assembly writers to
write their own FPO data strings without going through the .cv_fpo*
directive family.

I'm experimenting with different RPN programs to fix PR38857, and I
figured I should go ahead and make this directive permanent.

llvm-svn: 341712
2018-09-07 21:30:52 +00:00
Craig Topper fa535c027e [X86] Add codegen tests for narrow PADDUS/PSUBUS patterns for PR38691.
llvm-svn: 341711
2018-09-07 21:28:46 +00:00
Sanjay Patel c1416b60f2 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708
2018-09-07 21:03:34 +00:00
Nick Desaulniers 287a3be379 [AArch64] Support reserving x1-7 registers.
Summary:
Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7.

Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma

Reviewed By: nickdesaulniers, efriedma

Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D48580

llvm-svn: 341706
2018-09-07 20:58:57 +00:00
Craig Topper 5cbce81c91 [X86] Don't create ZERO_EXTEND_INREG/SIGN_EXTEND_INREG for v1iX vectors.
The generic type legalizer will scalarize vXi1 instructions getting rid of the vector entirely. Creating wider vector instructions is just going to prevent that.

llvm-svn: 341705
2018-09-07 20:56:03 +00:00
Craig Topper 39f48fdcbc [X86] Don't create X86ISD::AVG nodes from v1iX vectors.
The type legalizer will try to scalarize this and fail.

It looks like there's some other v1iX oddities out there too since we still generated some vector instructions.

llvm-svn: 341704
2018-09-07 20:56:01 +00:00
Craig Topper 4863313b35 [X86] Modify the the rdtscp intrinsic to return values instead of taking a pointer argument
Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end. Or the store to be removed if its never loaded.

Differential Revision: https://reviews.llvm.org/D51803

llvm-svn: 341698
2018-09-07 19:14:15 +00:00
Reid Kleckner ee0e8bab2a [codeview] Improve readobj FPO dumper and pdbutil register names
The improved dumping helps me investigate PR38857.

llvm-svn: 341695
2018-09-07 18:48:27 +00:00
Ana Pazos b2ed11a086 [RISCV] Fix crash in decoding instruction with unknown floating point rounding mode
Summary:
Instead of crashing in printFRMArg, decode and warn about invalid instruction.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51705

llvm-svn: 341691
2018-09-07 18:43:43 +00:00
Fangrui Song 91c95a35c1 [llvm-dwp] Clean up tests X86/*.test
llvm-svn: 341688
2018-09-07 18:29:20 +00:00
Ana Pazos b97d18945b [RISCV] Fix AddressSanitizer heap-buffer-overflow in disassembling
Summary:
RISCVDisassembler should check number of bytes available before reading them.
Crash noticed when enabling -DLLVM_USE_SANITIZER=Address.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51708

llvm-svn: 341686
2018-09-07 18:23:19 +00:00
Craig Topper 72964ae99e [X86] Change the addcarry and subborrow intrinsics to return 2 results and remove the pointer argument.
We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address.

Differential Revision: https://reviews.llvm.org/D51769

llvm-svn: 341677
2018-09-07 16:58:39 +00:00
Craig Topper 51e11788a4 [X86] Use regular expressions to make test immune to register allocation changes.
llvm-svn: 341676
2018-09-07 16:58:36 +00:00
Craig Topper 313d09af51 [X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in operand 1 of commutable operations.
Previously we only handled loads in operand 0, but nothing guarantees the load will be operand 0 for commutable operations.

Differential Revision: https://reviews.llvm.org/D51768

llvm-svn: 341675
2018-09-07 16:27:55 +00:00
Craig Topper 040c2b0acf [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674
2018-09-07 16:19:50 +00:00
Anna Thomas 110df11a1a [LV] Fix code gen for conditionally executed loads and stores
Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed in review,
this bug definitely affects uniform loads and may affect conditional
stores that should have turned into scatters as well).

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

llvm-svn: 341673
2018-09-07 15:53:48 +00:00
Aditya Kumar 801394a3d7 Hot cold splitting pass
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669
2018-09-07 15:03:49 +00:00
Florian Hahn e32ff4b28a [InstCombine] Do not fold scalar ops over select with vector condition.
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666
2018-09-07 14:40:06 +00:00
David Stenberg 45acc9610b [DebugInfo] Handle stack slot offsets for spilled sub-registers in LDV
Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().

With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:

 * {DW_OP_constu #fp-offset, DW_OP_minus,
    DW_OP_plus_uconst #subreg-offset}

 * {DW_OP_plus_const #fp-offset,
    DW_OP_minus, DW_OP_plus_uconst #subreg-offset}

The two offset operations should ideally be merged.

Reviewers: rnk, aprantl, stoklund

Reviewed By: aprantl

Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51612

llvm-svn: 341659
2018-09-07 13:54:07 +00:00
Sid Manning 9ad0f02749 Add support for getRegisterByName.
Support required to build the Hexagon Linux kernel.

Differential Revision: https://reviews.llvm.org/D51363

llvm-svn: 341658
2018-09-07 13:36:21 +00:00
Simon Pilgrim 04d0748417 [X86][SSE] Add additional fadd/fsub(x, bitcast_fneg(y)) tests with different integer bitwidths
llvm-svn: 341657
2018-09-07 13:27:07 +00:00
Simon Pilgrim 96d6b9c2e2 [DAGCombiner] foldBitcastedFPLogic - Add basic vector support
Add support for bitcasts from float type to an integer type of the same element bitwidth.

There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.

llvm-svn: 341652
2018-09-07 12:13:45 +00:00
Florian Hahn b30f7aeeeb [NewGVN] Mark function as changed if we erase instructions.
Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions not do a replacement and the
function is not marked as changed, which is why the inliner crashes
while traversing the call graph.

Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.

Fixes PR37517.

Reviewers: davide, mcrosier, efriedma, bjope

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D51169

llvm-svn: 341651
2018-09-07 11:41:34 +00:00
Simon Pilgrim a2aef22a72 [X86][SSE] Add fadd/fsub(x, bitcast_fneg(y)) tests
Show missing vector support

llvm-svn: 341650
2018-09-07 11:24:43 +00:00
Tim Northover bb7d7b3d33 ARM: fix Thumb2 CodeGen for ldrex with folded frame-index.
Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.

Should fix PR38828.

llvm-svn: 341642
2018-09-07 09:21:25 +00:00
Alexander Potapenko 8fe99a0ef2 [MSan] Add KMSAN instrumentation to MSan pass
Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
  Shadow and origin values for a particular X-byte memory location are
  read and written via pointers returned by
  __msan_metadata_ptr_for_load_X(u8 *addr) and
  __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
  to a function returning that struct is inserted into every instrumented
  function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
  entry and unpoisoned with __msan_unpoison_alloca() before leaving the
  function;
- the pass doesn't declare any global variables or add global constructors
  to the translation unit.

llvm-svn: 341637
2018-09-07 09:10:30 +00:00