Commit Graph

80814 Commits

Author SHA1 Message Date
Tom Stellard 833ae4fadd AMDGPU/SI: Remove unused variable
This should fix some bots that were broken by r240831.

llvm-svn: 240838
2015-06-26 21:58:26 +00:00
Philip Reames a3c6f0048c [Verifier] Verify invokes of intrinsics
We support invoking a subset of llvm's intrinsics, but the verifier didn't account for this.  We had previously added a special case to verify invokes of statepoints.  By generalizing the code in terms of CallSite, we can verify invokes of other intrinsics as well.  Interestingly, this found one test case which was invalid.

Note: I'm deliberately leaving the naming change from CI to CS to a follow up change.  That will happen shortly, I just wanted to reduce the diff to make it clear what was happening with this one.

Differential Revision: http://reviews.llvm.org/D10118

llvm-svn: 240836
2015-06-26 21:39:44 +00:00
Adrian Prantl 06b298e4b6 Debug Info: Clarify the documentation for bitfields emission.
llvm-svn: 240835
2015-06-26 21:27:30 +00:00
Tom Stellard 91efe9cebe AMDGPU/SI: Set ELF OS/ABI to ELFOSABI_AMDGPU_HSA
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10708

llvm-svn: 240832
2015-06-26 21:15:11 +00:00
Tom Stellard 347ac79b15 AMDGPU/SI: Add hsa code object directives
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10757

llvm-svn: 240831
2015-06-26 21:15:07 +00:00
Tom Stellard b5798b09d3 AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10706

llvm-svn: 240830
2015-06-26 21:15:03 +00:00
Tom Stellard f151a45ccd AMDGPU/SI: Emit amd_kernel_code_t in EmitFunctionBodyStart()
Summary:
This way the function symbol points to the start of amd_kernel_code_t
rather than the start of the function.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10705

llvm-svn: 240829
2015-06-26 21:14:58 +00:00
Philip Reames 9b5c9580e3 Teach InlineCost to account for a null check which can be folded away
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.

Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code.  The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it.  This is intentional and important because we want the inline cost analysis results to be easily cachable themselves.  We're not currently doing so, but initial results on LTO indicate this will quickly become important.

Differential Revision: http://reviews.llvm.org/D9129

llvm-svn: 240828
2015-06-26 20:51:17 +00:00
Marek Olsak cfbdba2d0b AMDGPU: really don't commute REV opcodes if the target variant doesn't exist
If pseudoToMCOpcode failed, we would return the original opcode, so operands
would be swapped, but the instruction would remain the same.
It resulted in LSHLREV a, b ---> LSHLREV b, a.

This fixes Glamor text rendering and
piglit/arb_sample_shading-builtin-gl-sample-mask on VI.

This is a candidate for stable branches.

v2: the test was simplified by Tom Stellard
llvm-svn: 240824
2015-06-26 20:29:10 +00:00
Pete Cooper 485d1146db Convert a bunch of loops to foreach. NFC.
This uses the new SDNode::op_values() iterator range committed in r240805.

llvm-svn: 240822
2015-06-26 19:37:02 +00:00
Nemanja Ivanovic f502a428e6 Add missing builtins to the PPC back end for ABI compliance (vol. 1)
This patch corresponds to review:
http://reviews.llvm.org/D10638

This is the back end portion of patch
http://reviews.llvm.org/D10637
It just adds the code gen and intrinsic functions necessary to support that patch to the back end.

llvm-svn: 240820
2015-06-26 19:26:53 +00:00
Pete Cooper af61ac71e2 Wrap assert loops in #ifndef NDEBUG
The body of the loops here only contained asserts.  This triggered an unused variable
warning on release builds and -Werror on the bots.

llvm-svn: 240819
2015-06-26 19:23:20 +00:00
Pete Cooper 9271ccc345 Convert a bunch of loops to foreach. NFC.
This uses the new SDNode::op_values() iterator range committed in r240805.

llvm-svn: 240817
2015-06-26 19:18:49 +00:00
Pete Cooper 8fc121dfc4 Convert a bunch of loops to foreach. NFC.
This uses the new SDNode::op_values() iterator range committed in r240805.

llvm-svn: 240815
2015-06-26 19:08:33 +00:00
Matt Arsenault 572c29afc9 Show invariant loads in MMO dumping
llvm-svn: 240813
2015-06-26 19:00:11 +00:00
David Majnemer 65ff7ccf21 Revert "Revert r240762 "[X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant""
This reverts commit r240793 while fixing how we handle array constant
pool entries.

This fixes PR23966.

llvm-svn: 240811
2015-06-26 18:55:48 +00:00
Pete Cooper 8c0a710995 Convert a bunch of loops to foreach. NFC.
This uses the new SDNode::op_values() iterator range committed in r240805.

llvm-svn: 240809
2015-06-26 18:41:54 +00:00
Pete Cooper 3af9a25b65 Add op_values() to iterate over the SDValue operands of an SDNode.
SDNode already had ops() which would iterate over the operands and return
SDUse*.  This version instead gets the SDValue's out of the SDUse's so that
we can use foreach in more places.

Reviewed by David Blaikie.

llvm-svn: 240805
2015-06-26 18:17:36 +00:00
David Blaikie b447ac6435 Move VectorUtils from Transforms to Analysis to correct layering violation
llvm-svn: 240804
2015-06-26 18:02:52 +00:00
Javed Absar bced3032e0 [ARM] Cortex-R5 is not VFPOnlySP
This patch fixes the error in ARM.td which stated that Cortex-R5
floating point unit can do only single precision, when it can do double as well.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10769

llvm-svn: 240799
2015-06-26 17:42:37 +00:00
Adam Nemet c4866d29dd [LAA] Try to prove non-wrapping of pointers if SCEV cannot
Summary:
Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.

This change is a first attempt to establish the non-wrapping property in
some simple cases.  The main idea is to look through the operations
defining the pointer.  As long as we arrive to a non-wrapping AddRec via
a small chain of non-wrapping instruction, the pointer should not wrap
either.

I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.

Reviewers: aschwaighofer, nadav, sanjoy, atrick

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10472

llvm-svn: 240798
2015-06-26 17:25:43 +00:00
Alex Lorenz ec6b26b955 Fix unused variable from r240792.
The variable 'I' wasn't used when assertions were disabled.
This commit ensures that 'I' is used outside of an assert.

llvm-svn: 240797
2015-06-26 17:07:27 +00:00
Benjamin Kramer 1dcd8b09b4 [DAGCombine] Fix demanded bits computation for exact shifts.
Fixes a miscompilation of MultiSource/Benchmarks/MallocBench/gs

llvm-svn: 240796
2015-06-26 16:59:31 +00:00
Douglas Katzman 289ec857d2 [X86]: Correctly sign-extend 16-bit immediate in CALL instruction.
Patch by Matthew Barney. Thanks!

Differential Revision: http://reviews.llvm.org/D9514

llvm-svn: 240795
2015-06-26 16:58:59 +00:00
David Blaikie 1213dbf1fd Fix ODR violation waiting to happen by making static function definitions in VectorUtils.h non-static and defined out of line
Patch by Ashutosh Nema

Differential Revision: http://reviews.llvm.org/D10682

llvm-svn: 240794
2015-06-26 16:57:30 +00:00
Hans Wennborg e38fc05d3b Revert r240762 "[X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant"
It seems to have caused PR23966: "UNREACHABLE executed at ..\lib\Target\X86\X86TargetObjectFile.cpp:148"

llvm-svn: 240793
2015-06-26 16:48:02 +00:00
Alex Lorenz 33f0aef32f MIR Serialization: Serialize machine basic block operands.
This commit serializes machine basic block operands. The
machine basic block operands use the following syntax:

  %bb.<id>[.<name>]

This commit also modifies the YAML representation for the
machine basic blocks - a new, required field 'id' is added
to the MBB YAML mapping.

The id is used to resolve the MBB references to the
actual MBBs. And while the name of the MBB can be
included in a MBB reference, this name isn't used to
resolve MBB references - as it's possible that multiple
MBBs will reference the same BB and thus they will have the
same name. If the name is specified, the parser will verify
that it is equal to the name of the MBB with the specified id.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10608

llvm-svn: 240792
2015-06-26 16:46:11 +00:00
Benjamin Kramer c2ae767377 [DAGCombiner] Preserve the exact bit when simplifying SRA to SRL.
Allows more aggressive folding of ashr/shl pairs.

llvm-svn: 240788
2015-06-26 14:51:49 +00:00
Benjamin Kramer 07e70b4fa4 [DAGCombine] fold (X >>?,exact C1) << C2 --> X << (C2-C1)
Instcombine also does this but many opportunities only become visible
after GEPs are lowered.

llvm-svn: 240787
2015-06-26 14:51:36 +00:00
Rafael Espindola 854038ed1a Rename getObjectFile to getObject for consistency.
llvm-svn: 240785
2015-06-26 14:51:16 +00:00
Toma Tabacu 0a6fa59a2c [mips] [IAS] Add partial support for the ULW pseudo-instruction.
Summary:
This only adds support for ULW of an immediate address with/without a source register.
It does not include support for ULW of the address of a symbol.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9663

llvm-svn: 240782
2015-06-26 13:20:17 +00:00
Rafael Espindola edd5f84419 Expose getFlags via ELFSectionRef.
llvm-svn: 240779
2015-06-26 12:44:10 +00:00
Rafael Espindola 41401e9c80 Add a ELFSectionRef class and use it to expose getSectionType.
llvm-svn: 240778
2015-06-26 12:33:37 +00:00
Rafael Espindola 2fa80cc5fd Simplify getSymbolType.
This is still a really odd function. Most calls are in object format specific
contexts and should probably be replaced with a more direct query, but at least
now this is not too obnoxious to use.

llvm-svn: 240777
2015-06-26 12:18:49 +00:00
Javed Absar 99a9343ae6 [ARM] Cortex-R4F is not VFPOnlySP
Cortex-R4F TRM states that fpu supports both single and double precision.
This patch corrects the information in ARM.td file and corresponding test.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10763

llvm-svn: 240776
2015-06-26 12:14:56 +00:00
Rafael Espindola eef7ffe2e9 Make getOther ELF only.
No other format has this field.

llvm-svn: 240774
2015-06-26 11:39:57 +00:00
Rafael Espindola c5fb508c9d Optimize the creation of mapping symbols.
No need to create two symbols just to assign one to the other.

llvm-svn: 240773
2015-06-26 11:31:13 +00:00
David Majnemer 4eb32e7d21 [X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant
No functionality changed, just keeping things clean.

llvm-svn: 240762
2015-06-26 07:03:12 +00:00
Hao Liu b41c0b44af [InterleavedAccess] Fix failures "undefined type 'llvm::raw_ostream'" on windows.
llvm-svn: 240760
2015-06-26 04:38:21 +00:00
Hao Liu 2cd34bb585 [ARM] Lower interleaved memory accesses to vldN/vstN intrinsics.
This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240755
2015-06-26 02:45:36 +00:00
Hao Liu 7ec8ee3119 [AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses.
E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240754
2015-06-26 02:32:07 +00:00
Hao Liu 1c1e0c9e71 [InterleavedAccess] Add a pass InterleavedAccess to identify interleaved memory accesses and transform into target specific intrinsics.
E.g. An interleaved load (Factor = 2):
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %v0 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <0, 2, 4, 6>
        %v1 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <1, 3, 5, 7>
It can be transformed into a ld2 intrinsic in AArch64 backend or a vld2 intrinsic in ARM backend.

E.g. An interleaved store (Factor = 3):
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr
It can be transformed into a st3 intrinsic in AArch64 backend or a vst3 intrinsic in ARM backend.

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240751
2015-06-26 02:10:27 +00:00
Matthias Braun 7c6d6491dd Revert "X86: Reject register operands with obvious type mismatches."
Revert until http://llvm.org/PR23955 is investigated.

This reverts commit r239309.

llvm-svn: 240746
2015-06-26 00:26:49 +00:00
Alexey Samsonov 773e8c3966 [ASan] Use llvm::getDISubprogram() to get function entry debug location.
It can be more robust than copying debug info from first non-alloca
instruction in the entry basic block. We use the same strategy in
coverage instrumentation.

llvm-svn: 240738
2015-06-26 00:00:47 +00:00
Duncan P. N. Exon Smith 827200c822 AsmPrinter: Use an intrusively linked list for DIE::Children
Replace the `std::vector<>` for `DIE::Children` with an intrusively
linked list.  This is a strict memory improvement: it requires no
auxiliary storage, and reduces `sizeof(DIE)` by one pointer.  It also
factors out the DIE-related malloc traffic.

This drops llc memory usage from 735 MB down to 718 MB, or ~2.3%.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 240736
2015-06-25 23:52:10 +00:00
Duncan P. N. Exon Smith 4fb1f9cda6 AsmPrinter: Convert DIE::Values to a linked list
Change `DIE::Values` to a singly linked list, where each node is
allocated on a `BumpPtrAllocator`.  In order to support `push_back()`,
the list is circular, and points at the tail element instead of the
head.  I abstracted the core list logic out to `IntrusiveBackList` so
that it can be reused for `DIE::Children`, which also cares about
`push_back()`.

This drops llc memory usage from 799 MB down to 735 MB, about 8%.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 240733
2015-06-25 23:46:41 +00:00
NAKAMURA Takumi 520b45df84 PPCISelLowering.cpp: Appease PR23956. [-Wdocumentation]
llvm-svn: 240727
2015-06-25 23:38:44 +00:00
Anna Zaks 785c075786 [asan] Do not instrument special purpose LLVM sections.
Do not instrument globals that are placed in sections containing "__llvm"
in their name.

This fixes a bug in ASan / PGO interoperability. ASan interferes with LLVM's
PGO, which places its globals into a special section, which is memcpy-ed by
the linker as a whole. When those goals are instrumented, ASan's memcpy wrapper
reports an issue.

http://reviews.llvm.org/D10541

llvm-svn: 240723
2015-06-25 23:35:48 +00:00
Anna Zaks 4f652b69b1 [asan] Don't run stack malloc on functions containing inline assembly.
It makes LLVM run out of registers even on 64-bit platforms. For example, the
following test case fails on darwin.

clang -cc1 -O0 -triple x86_64-apple-macosx10.10.0 -emit-obj -fsanitize=address -mstackrealign -o ~/tmp/ex.o -x c ex.c
error: inline assembly requires more registers than available

void TestInlineAssembly(const unsigned char *S, unsigned int pS, unsigned char *D, unsigned int pD, unsigned int h) {

unsigned int sr = 4, pDiffD = pD - 5;
unsigned int pDiffS = (pS << 1) - 5;
char flagSA = ((pS & 15) == 0),
flagDA = ((pD & 15) == 0);
asm volatile (
  "mov %0,  %%"PTR_REG("si")"\n"
  "mov %2,  %%"PTR_REG("cx")"\n"
  "mov %1,  %%"PTR_REG("di")"\n"
  "mov %8,  %%"PTR_REG("ax")"\n"
  :
  : "m" (S), "m" (D), "m" (pS), "m" (pDiffS), "m" (pDiffD), "m" (sr), "m" (flagSA), "m" (flagDA), "m" (h)
  : "%"PTR_REG("si"), "%"PTR_REG("di"), "%"PTR_REG("ax"), "%"PTR_REG("cx"), "%"PTR_REG("dx"), "memory"
);
}

http://reviews.llvm.org/D10719

llvm-svn: 240722
2015-06-25 23:35:45 +00:00
Matt Arsenault f735cab986 DAGCombiner: Use pop_back_val()
llvm-svn: 240709
2015-06-25 22:15:05 +00:00