Commit Graph

80855 Commits

Author SHA1 Message Date
Pawel Bylica 143ceb6d46 [DAGCombiner] Fix & simplify constant folding of sext/zext.
Summary: This patch fixes the cases of sext/zext constant folding in DAG combiner where constans do not fit 64 bits. The fix simply removes un$

Test Plan: New regression test included.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: http://reviews.llvm.org/D10607

llvm-svn: 240991
2015-06-29 20:28:47 +00:00
Benjamin Kramer 6fe4e79370 [MMI] Use TinyPtrVector instead of PointerUnion with vector.
Also simplify duplicated code a bit. No functionality change intended.

llvm-svn: 240990
2015-06-29 20:21:55 +00:00
Diego Novillo b0257c8419 Tidy comment.
llvm-svn: 240987
2015-06-29 20:03:46 +00:00
Ben Langmuir 63aa8c5d28 Clean up unique lock files on signal and always release the lock
Make sure to remove the unique lock file, which is what the .lock
symlink points to, if there is a signal while the lock is held. This
will release the lock, since the symlink will point to nothing (already
tested in unit tests). For good measure, also clean up the unique lock
file if there is an error or signal before the lock is acquired.

I will add a clang test.

rdar://problem/21512307

llvm-svn: 240967
2015-06-29 17:08:41 +00:00
Alex Lorenz 8f6f4285f3 MIR Serialization: Serialize the register mask machine operands.
This commit implements serialization of the register mask machine
operands. This commit serializes only the call preserved register
masks that are defined by a target, it doesn't serialize arbitrary
register masks.

This commit also extends the TargetRegisterInfo class and TableGen so that
the users of TRI can get the list of all the call preserved register masks and
their names.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10673

llvm-svn: 240966
2015-06-29 16:57:06 +00:00
Benjamin Kramer 025f46f367 [SymbolSize] Skip sorting by index, just assign by index.
No functional change intended.

llvm-svn: 240961
2015-06-29 16:05:00 +00:00
Benjamin Kramer aa694a65a4 Upgrade JIT listeners for changes in the libObject API.
llvm-svn: 240956
2015-06-29 15:18:48 +00:00
Tobias Grosser 3cdc37c5bc Move delinearization from SCEVAddRecExpr to ScalarEvolution
The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr
at the outermost level. At this moment, the additional flexibility  is not
exploited in LLVM itself, but in Polly we will soon soonish use this
functionality. For LLVM, this change should not affect existing functionality
(which is covered by test/Analysis/Delinearization/)

llvm-svn: 240952
2015-06-29 14:42:48 +00:00
Rafael Espindola 6a1bfb2f9b Factor out the checking of string tables.
This moves the error checking for string tables to getStringTable which returns
an ErrorOr<StringRef>.

This improves error checking, makes it uniform across all string tables and
makes it possible to check them once instead of once per name.

llvm-svn: 240950
2015-06-29 14:39:25 +00:00
Elena Demikhovsky 30bc4ca313 AVX-512: all forms of SCATTER instruction on SKX,
encoding, intrinsics and tests.

llvm-svn: 240936
2015-06-29 12:14:24 +00:00
Javed Absar 3f7c8934e4 [ARM]: Extend -mfpu options for half-precision and vfpv3xd
removing default label in switch as it results.
This is part of earlier commit http://reviews.llvm.org/D1064

Subscribers: llvm-commits
llvm-svn: 240932
2015-06-29 09:53:33 +00:00
Javed Absar d5526303b7 [ARM]: Extend -mfpu options for half-precision and vfpv3xd
Some of the the permissible ARM -mfpu options, which are supported in GCC,
are currently not present in llvm/clang.This patch adds the options:
'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16.
These are related to half-precision floating-point and single precision.

Reviewers: rengolin, ranjeet.singh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10645

llvm-svn: 240930
2015-06-29 09:32:29 +00:00
Igor Breger a7a8e9a018 AVX-512: Implemented missing encoding and intrinsics for FMA instructions
Added tests for DAG lowering ,encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D10796

llvm-svn: 240926
2015-06-29 09:10:00 +00:00
NAKAMURA Takumi 7bffb6954d Whitespace.
llvm-svn: 240924
2015-06-29 04:50:09 +00:00
Matt Arsenault 8ebce8f12b AMDGPU/SI: Fix extra space when printing v_div_fmas_*
llvm-svn: 240911
2015-06-28 18:16:14 +00:00
Jingyue Wu 3abde7bea5 [SLSR] S's basis must have the same type as S
llvm-svn: 240910
2015-06-28 17:45:05 +00:00
Asaf Badouh 7ec4b7a8bb [x86][AVX512]
Add vscalef support
include encoding and intrinsics


review:
http://reviews.llvm.org/D10730

llvm-svn: 240906
2015-06-28 14:30:39 +00:00
Elena Demikhovsky 6a1a357f1f AVX-512: Added all SKX forms of GATHER instructions.
Added intrinsics.
Added encoding and tests.

llvm-svn: 240905
2015-06-28 10:53:29 +00:00
Adrian Prantl cb53eedc79 Revert "Debug Info: One more bitfield bugfix. While yesterday's r240853 fixed"
This reverts commit 240890. Breaking the gdb buildbot.

llvm-svn: 240893
2015-06-27 21:55:00 +00:00
Benjamin Kramer 5b455f0b62 [SDAG] Now that we have a way to communicate the exact bit on sdiv use it to simplify sdiv by a constant.
We had a hack in SDAGBuilder in place to work around this but now we
can avoid that. Call BuildExactSDIV from BuildSDIV so DAGCombiner can
perform this trick automatically.

The added check in DAGCombiner is necessary to prevent exact sdiv by pow2
from regressing as the target-specific pow2 lowering is not aware of
exact bits yet.

This is mostly covered by existing tests. One side effect is that we
get the better lowering for exact vector sdivs now too :)

llvm-svn: 240891
2015-06-27 20:33:26 +00:00
Adrian Prantl 57c7a62b97 Debug Info: One more bitfield bugfix. While yesterday's r240853 fixed
the DW_AT_bit_offset computation, the byte offset is in fact also
endian-dependent as it needs to point to the storage unit containing the
most-significant bit of the the bitfield.
I'm so looking forward to emitting the endian-agnostic DWARF 3 version
instead.

llvm-svn: 240890
2015-06-27 20:12:43 +00:00
Daniel Sanders a3134fae17 [mips] Add COP0 register class and use it in M[FT]C0/DM[FT]C0.
Summary:
Previously it (incorrectly) used GPR's.

Patch by Simon Dardis. A couple small corrections by myself.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10567

llvm-svn: 240883
2015-06-27 15:39:19 +00:00
David Majnemer 9f3979fd78 [LoopVectorize] Pointer indicies may be wider than the pointer
If we are dealing with a pointer induction variable, isInductionPHI
gives back a step value of Stride / size of pointer.  However, we might
be indexing with a legal type wider than the pointer width.
Handle this by inserting casts where appropriate instead of crashing.

This fixes PR23954.

llvm-svn: 240877
2015-06-27 08:38:17 +00:00
David Majnemer 5185c3c271 [PruneEH] A naked, noinline function can return via InlineAsm
The PruneEH pass tries to annotate functions as 'noreturn' if it doesn't
see a ReturnInst.  However, a naked function containing inline assembly
can contain control flow leaving the function.

This fixes PR23971.

llvm-svn: 240876
2015-06-27 07:52:53 +00:00
Petr Hosek 3294670f6c [MC] Ensure that pending labels are flushed when -mc-relax-all flag is used
Summary:
The current implementation doesn't always flush all pending labels
beforeemitting data which can result in an incorrectly placed labels in
case when when instruction bundling is enabled and -mc-relax-all flag is
being used. To address this issue, we always flush pending labels before
emitting data.

The change was tested by running PNaCl toolchain trybots with
-mc-relax-all flag set.

Fixes https://code.google.com/p/nativeclient/issues/detail?id=4063

Test Plan: Regression test attached

Reviewers: mseaborn

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D10325

llvm-svn: 240870
2015-06-27 01:54:17 +00:00
Petr Hosek 4bbf563f6e [MC] Align fragments when -mc-relax-all flag is used
Summary:
Ensure that fragments are bundle aligned when instruction bundling
is enabled and the -mc-relax-all flag is set. This is implicitly
assumed by the bundle padding implementation but this assumption
does not hold when custom alignment is being used.

The change was tested by running PNaCl toolchain trybots with
-mc-relax-all flag set.

Fixes https://code.google.com/p/nativeclient/issues/detail?id=4063

Test Plan: Regression test attached

Reviewers: mseaborn

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D10044

llvm-svn: 240869
2015-06-27 01:49:53 +00:00
Duncan P. N. Exon Smith 1f8a99a9ae IR: Expose ModuleSlotTracker in Value::print()
Allow callers of `Value::print()` and `Metadata::print()` to pass in a
`ModuleSlotTracker`.  This allows them to pay only once for calculating
module-level slots (such as Metadata).

This is related to PR23865, where there was a huge cost for
`MachineFunction::print()`.  Although I don't have a *particular* user
in mind for this new code, I have hit big slowdowns before when running
`opt -debug`, and I think this will be useful.  Going forward, if
someone hits a big slowdown with `print()` statements, they can create a
`ModuleSlotTracker` and send it through.  Similarly, adding support to
`Value::dump()` and `Metadata::dump()` should be trivial.

I added unit tests to be sure the `print()` functions actually behave
the same way with and without the slot tracker.

llvm-svn: 240867
2015-06-27 00:38:26 +00:00
Peter Collingbourne ba4c8b5004 LowerBitSets: Ignore bitset entries that do not directly refer to a global.
It is possible for a global to be substituted with another global of a
different type or a different kind (i.e. an alias) at IR link time. One
example of this scenario is when a Microsoft ABI vtable is substituted with
an alias referring to a larger vtable containing an RTTI reference.

This will cause the global to be RAUW'd with a possibly bitcasted reference
to the other global. This will of course also affect any references to the
global in bitset metadata.

The right way to handle such metadata is simply to ignore it. This is sound
because the linked module should contain another copy of the bitset entries as
applied to the new global.

llvm-svn: 240866
2015-06-27 00:17:51 +00:00
Duncan P. N. Exon Smith f8b3ad611d Plug a leak introduced by r240848
Apparently this obvious leak was never exercised before, but r240848
exposed it.  Plug it.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/5075

llvm-svn: 240865
2015-06-27 00:15:32 +00:00
Adrian Prantl d3da8caf67 Debug Info: Fix a bug in the DW_AT_bit_offset calculation that would
result in negative offsets and attempt a better job at documenting
the algorithm.

rdar://21082998

llvm-svn: 240853
2015-06-26 23:31:27 +00:00
Duncan P. N. Exon Smith c03745260e CodeGen: Create a proper ModuleSlotTracker for MachineInstr
Another follow-up related to r240848: try a little harder to share slot
tracking calculations within a single `MachineInstr` dump.  This is
unrelated to `MachineFunction::print()`, since that should be passing
through the function's `ModuleSlotTracker` by now, but could affect the
speed of dumping from a debugger if there is more than one IR-level
operand.

llvm-svn: 240852
2015-06-26 23:18:44 +00:00
Alex Lorenz 5d6108e4ed MIR Serialization: Serialize global address machine operands.
This commit serializes the global address machine operands.
This commit doesn't serialize the operand's offset and target
flags, it serializes only the global value reference.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10671

llvm-svn: 240851
2015-06-26 22:56:48 +00:00
Philip Reames 8fe7f13af8 [RewriteStatepointsForGC] Generalized vector phi/select handling for base pointers
This change extends the detection of base pointers for vector constructs to handle arbitrary phi and select nodes. The existing non-vector code already handles those, so this is basically just extending the vector special case to be less special cased. It still isn't generalized vector handling since we can't handle arbitrary vector instructions (e.g. shufflevectors), but it's a lot closer.

The general structure of the change is as follows:
 * Extend the base defining value relation over a subset of vector instructions and vector typed phi & select instructions.
 * Move scalarization from before base pointer rewriting to after base pointer rewriting. The extension of the BDV relation is sufficient to find vector base phis for vector inputs.
 * Preserve the existing special case logic for when the base of a vector element is locally obvious. This general idea could be extended to the scalar case as well.

Differential Revision: http://reviews.llvm.org/D10461#inline-84275

llvm-svn: 240850
2015-06-26 22:47:37 +00:00
Jingyue Wu 3203818bf7 [NVPTX] noop when kernel pointers are already global
Summary:
Some front ends make kernel pointers global already. In that case,
handlePointerParams does nothing.

Test Plan: more tests in lower-kernel-ptr-arg.ll

Reviewers: grosser

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10779

llvm-svn: 240849
2015-06-26 22:35:43 +00:00
Duncan P. N. Exon Smith 6529ed40bc CodeGen: Push the ModuleSlotTracker through Metadata
For another 1% speedup on the testcase in PR23865, push the
`ModuleSlotTracker` through to metadata-related printing in
`MachineBasicBlock::print()`.

llvm-svn: 240848
2015-06-26 22:28:47 +00:00
Philip Reames 007561acdc Minor style cleanup after 240843 [NFC]
Use a for-each loop in one case and rename the function to reflect it's new usage.

llvm-svn: 240847
2015-06-26 22:21:52 +00:00
Duncan P. N. Exon Smith f48e982706 CodeGen: Push the ModuleSlotTracker through MachineOperands
Push `ModuleSlotTracker` through `MachineOperand`s, dropping the time
for `llc -print-machineinstrs` on the testcase in PR23865 from ~13
seconds to ~9 seconds.  Now `SlotTracker::processFunctionMetadata()`
accounts for only 8% of the runtime, which seems reasonable.

llvm-svn: 240845
2015-06-26 22:06:47 +00:00
Philip Reames 9818dd77b4 [Verifier] Follow on to 240836
Address one missed review comment and do the rename I left out of that patch to make it reviewable.

llvm-svn: 240843
2015-06-26 22:04:34 +00:00
Duncan P. N. Exon Smith 3269215401 CodeGen: Use a single SlotTracker in MachineFunction::print()
Expose enough of the IR-level `SlotTracker` so that
`MachineFunction::print()` can use a single one for printing
`BasicBlock`s.  Next step would be to lift this through a few more APIs
so that we can make other print methods faster.

Fixes PR23865, changing the runtime of `llc -print-machineinstrs` from
many minutes (killed after 3 minutes, but it wasn't very close) to
13 seconds for a 502185 line dump.

llvm-svn: 240842
2015-06-26 22:04:20 +00:00
Tom Stellard 4694ed0a14 AMDPGU/SI: Use correct resource descriptors for VI on HSA
Summary: We need to set MTYPE = 2 for VI shaders when targeting the HSA runtime.

Reviewers: arsenm

Differential Revision: http://reviews.llvm.org/D10777

llvm-svn: 240841
2015-06-26 21:58:42 +00:00
Tom Stellard ff7416ba06 AMDGPU/SI: Update amd_kernel_code_t definition and add assembler support
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10772

llvm-svn: 240839
2015-06-26 21:58:31 +00:00
Tom Stellard 833ae4fadd AMDGPU/SI: Remove unused variable
This should fix some bots that were broken by r240831.

llvm-svn: 240838
2015-06-26 21:58:26 +00:00
Philip Reames a3c6f0048c [Verifier] Verify invokes of intrinsics
We support invoking a subset of llvm's intrinsics, but the verifier didn't account for this.  We had previously added a special case to verify invokes of statepoints.  By generalizing the code in terms of CallSite, we can verify invokes of other intrinsics as well.  Interestingly, this found one test case which was invalid.

Note: I'm deliberately leaving the naming change from CI to CS to a follow up change.  That will happen shortly, I just wanted to reduce the diff to make it clear what was happening with this one.

Differential Revision: http://reviews.llvm.org/D10118

llvm-svn: 240836
2015-06-26 21:39:44 +00:00
Adrian Prantl 06b298e4b6 Debug Info: Clarify the documentation for bitfields emission.
llvm-svn: 240835
2015-06-26 21:27:30 +00:00
Tom Stellard 91efe9cebe AMDGPU/SI: Set ELF OS/ABI to ELFOSABI_AMDGPU_HSA
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10708

llvm-svn: 240832
2015-06-26 21:15:11 +00:00
Tom Stellard 347ac79b15 AMDGPU/SI: Add hsa code object directives
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10757

llvm-svn: 240831
2015-06-26 21:15:07 +00:00
Tom Stellard b5798b09d3 AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10706

llvm-svn: 240830
2015-06-26 21:15:03 +00:00
Tom Stellard f151a45ccd AMDGPU/SI: Emit amd_kernel_code_t in EmitFunctionBodyStart()
Summary:
This way the function symbol points to the start of amd_kernel_code_t
rather than the start of the function.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10705

llvm-svn: 240829
2015-06-26 21:14:58 +00:00
Philip Reames 9b5c9580e3 Teach InlineCost to account for a null check which can be folded away
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.

Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code.  The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it.  This is intentional and important because we want the inline cost analysis results to be easily cachable themselves.  We're not currently doing so, but initial results on LTO indicate this will quickly become important.

Differential Revision: http://reviews.llvm.org/D9129

llvm-svn: 240828
2015-06-26 20:51:17 +00:00
Marek Olsak cfbdba2d0b AMDGPU: really don't commute REV opcodes if the target variant doesn't exist
If pseudoToMCOpcode failed, we would return the original opcode, so operands
would be swapped, but the instruction would remain the same.
It resulted in LSHLREV a, b ---> LSHLREV b, a.

This fixes Glamor text rendering and
piglit/arb_sample_shading-builtin-gl-sample-mask on VI.

This is a candidate for stable branches.

v2: the test was simplified by Tom Stellard
llvm-svn: 240824
2015-06-26 20:29:10 +00:00