Commit Graph

77457 Commits

Author SHA1 Message Date
Eric Christopher 9cda4b7ed9 Remove a use of the Subtarget in the darwin ppc asm printer.
EmitFunctionStubs is called from doFinalization and so can't
depend on the Subtarget existing. It's also irrelevant as
we know we're darwin since we're in the darwin asm printer.

llvm-svn: 230039
2015-02-20 18:53:42 +00:00
Eric Christopher f734a8bae7 Get the function specific subtarget.
llvm-svn: 230038
2015-02-20 18:44:17 +00:00
Eric Christopher 1df0c519fc Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 230037
2015-02-20 18:44:15 +00:00
Sanjay Patel 1b53f74c4c canonicalize a v2f64 blendi of 2 registers
This canonicalization step saves us 3 pattern matching possibilities * 4 math ops
for scalar FP math that uses xmm regs. The backend can re-commute the operands
post-instruction-selection if that makes register allocation better.

The tests in llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll cover this scenario already,
so there are no new tests with this patch.

Differential Revision: http://reviews.llvm.org/D7777

llvm-svn: 230024
2015-02-20 16:55:27 +00:00
Kit Barton 263edb99ab I incorrectly marked the VORC instruction as isCommutable when I added it.
This fix removes the VORC instruction definition from the isCommutable block.

Phabricator review: http://reviews.llvm.org/D7772

llvm-svn: 230020
2015-02-20 15:54:58 +00:00
Igor Laevsky 7fc58a4ad8 Generalize statepoint lowering to use ImmutableStatepoint. Move statepoint lowering into a separate function 'LowerStatepoint' which uses ImmutableStatepoint instead of a CallInst. Also related utility functions are changed to receive ImmutableCallSite.
Differential Revision: http://reviews.llvm.org/D7756 

llvm-svn: 230017
2015-02-20 15:28:35 +00:00
Benjamin Kramer 7af984b710 Constants.cpp: Only read 32 bits for float.
Otherwise we'll discard the wrong half of a uint64_t on big-endian systems.

llvm-svn: 230016
2015-02-20 15:11:55 +00:00
NAKAMURA Takumi e86b9b76c5 Constants.cpp: getElementAsAPFloat(): Don't handle constant value via host's float/double, just handle with APInt/APFloat.
x87 FPU didn't keep SNAN, but demoted to QNAN.

llvm-svn: 230013
2015-02-20 14:24:49 +00:00
Benjamin Kramer 6f66545ae6 RewriteStatepointsForGC: Move details into anonymous namespaces. NFC.
While there reduce the number of duplicated std::map lookups.

llvm-svn: 230012
2015-02-20 14:00:58 +00:00
Benjamin Kramer d4a3a55564 Wrap recursive function only used in assert in #ifndef NDEBUG.
Avoids unused function warnings in Release builds.

llvm-svn: 230009
2015-02-20 13:15:49 +00:00
Chandler Carruth 0c5f059865 [x86] Switching the shuffle equivalence test to a variadic template was
the wrong answer. We also got initializer lists which are *way* cleaner
for this kind of thing. Let's use those and make this a normal, boring
functionn accepting ArrayRef.

llvm-svn: 230004
2015-02-20 10:47:28 +00:00
Eric Christopher 0218f8cfed Fix wording and grammar in Mips subtarget options.
llvm-svn: 230001
2015-02-20 08:42:34 +00:00
Eric Christopher 3ee30d0607 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 230000
2015-02-20 08:39:06 +00:00
Eric Christopher 22b2ad265f Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 229999
2015-02-20 08:24:37 +00:00
Eric Christopher 155290edf9 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 229998
2015-02-20 08:24:34 +00:00
Eric Christopher ad1ef04ab1 Save the MachineFunction in startFunction so that we can use it for
lookups of the subtarget later.

llvm-svn: 229996
2015-02-20 08:01:55 +00:00
Eric Christopher 4369c9b42c Use the cached subtarget from the MachineFunction rather than
doing a lookup on the TargetMachine.

llvm-svn: 229995
2015-02-20 08:01:52 +00:00
Eric Christopher 1947a9e2e2 Make the TargetMachine::getSubtarget that takes a Function argument
take a reference to match the getSubtargetImpl that takes a Function
argument.

llvm-svn: 229994
2015-02-20 07:32:59 +00:00
Justin Bogner c4f5a5e863 Disallow implicit conversions from None to integer types
This fixes an error introduced in r228934 where None was converted to
an int instead of the int being converted to an Optional as intended.
We make that sort of mistake a compile error by changing NoneType into
a scoped enum.

Finally, provide a static NoneType called None to avoid forcing all
users to spell it NoneType::None.

llvm-svn: 229980
2015-02-20 07:28:28 +00:00
Nick Lewycky b73c041005 Fix build with gcc. This has a -Wsequence-point error on 'MII', which is a good point.
llvm-svn: 229979
2015-02-20 07:17:40 +00:00
Eric Christopher a7249ec1a7 Remove more uses of TargetMachine::getSubtargetImpl from the
AsmPrinter.

getSubtargetInfo now asserts that the MachineFunction exists.
Debug printing of register naming now uses the register info
from MCAsmInfo as that's unchanging.

llvm-svn: 229978
2015-02-20 07:16:19 +00:00
Nick Lewycky a2bda08806 Fix build in release mode, -Wunused-variable on this lambda function used only in an assert.
llvm-svn: 229977
2015-02-20 07:16:17 +00:00
Nick Lewycky eb3231eefa Fix build in release mode, four cases of -Wunused-variable.
llvm-svn: 229976
2015-02-20 07:14:02 +00:00
Eric Christopher 78a3f6cc4d AsmPrinter::doFinalization is at the module level and so doesn't
have access to a target specific subtarget info. Grab the module
level MCSubtargetInfo for the JumpInstrTable output stubs.

llvm-svn: 229974
2015-02-20 06:59:48 +00:00
Eric Christopher 97ea7622b5 Remove the MCInstrInfo cached variable as it was only used in a
single place and replace calls to getSubtargetImpl with calls
to get the subtarget from the MachineFunction where valid.

llvm-svn: 229971
2015-02-20 06:35:21 +00:00
David Blaikie 9a3644c472 Fix -Wunused-variable warning in non-asserts build, and optimize a little bit while I'm here.
llvm-svn: 229970
2015-02-20 06:28:38 +00:00
Hal Finkel e5aaf3f2cd [PowerPC] Loop Data Prefetching for the BG/Q
The IBM BG/Q supercomputer's A2 cores have a hardware prefetching unit, the
L1P, but it does not prefetch directly into the A2's L1 cache. Instead, it
prefetches into its own L1P buffer, and the latency to access that buffer is
significantly higher than that to the L1 cache (although smaller than the
latency to the L2 cache). As a result, especially when multiple hardware
threads are not actively busy, explicitly prefetching data into the L1 cache is
advantageous.

I've been using this pass out-of-tree for data prefetching on the BG/Q for well
over a year, and it has worked quite well. It is enabled by default only for
the BG/Q, but can be enabled for other cores as well via a command-line option.

Eventually, we might want to add some TTI interfaces and move this into
Transforms/Scalar (there is nothing particularly target dependent about it,
although only machines like the BG/Q will benefit from its simplistic
strategy).

llvm-svn: 229966
2015-02-20 05:08:21 +00:00
Chandler Carruth 4041f2217b [x86] Remove the old vector shuffle lowering code and its flag.
The new shuffle lowering has been the default for some time. I've
enabled the new legality testing by default with no really blocking
regressions. I've fuzz tested this very heavily (many millions of fuzz
test cases have passed at this point). And this cleans up a ton of code.
=]

Thanks again to the many folks that helped with this transition. There
was a lot of work by others that went into the new shuffle lowering to
make it really excellent.

In case you aren't using a diff algorithm that can handle this:
  X86ISelLowering.cpp: 22 insertions(+), 2940 deletions(-)

llvm-svn: 229964
2015-02-20 04:25:04 +00:00
Chandler Carruth eb206aa1ea [x86] Now that the new vector shuffle legality is enabled and everything
is going well, remove the flag and the code for the old legality tests.

This is the first step toward removing the entire old vector shuffle
lowering. *Much* more code to delete coming up next.

llvm-svn: 229963
2015-02-20 03:59:35 +00:00
Duncan P. N. Exon Smith ad6eb127c9 Bitcode: Stop assuming non-null fields
When writing the bitcode serialization for the new debug info hierarchy,
I assumed two fields would never be null.

Drop that assumption, since it's brittle (and crashes the
`BitcodeWriter` if wrong), and is a check better left for the verifier
anyway.  (No need for a bitcode upgrade here, since the new hierarchy is
still not in place.)

The fields in question are `MDCompileUnit::getFile()` and
`MDDerivedType::getBaseType()`, the latter of which isn't null in
test/Transforms/Mem2Reg/ConvertDebugInfo2.ll (see !14, a pointer to
nothing).  While the testcase might have bitrotted, there's no reason
for the bitcode format to rely on non-null for metadata operands.

This also fixes a bug in `AsmWriter` where if the `file:` is null it
isn't emitted (caught by the double-round trip in the testcase I'm
adding) -- this is a required field in `LLParser`.

I'll circle back to ConvertDebugInfo2.  Once the specialized nodes are
in place, I'll be trying to turn the debug info verifier back on by
default (in the newer module pass form committed r206300) and throwing
more logic in there.  If the testcase has bitrotted (as opposed to me
not understanding the schema correctly) I'll fix it then.

llvm-svn: 229960
2015-02-20 03:17:58 +00:00
Hal Finkel 847e05f569 [InstCombine] Remove unnecessary variable indexing into single-element arrays
This change addresses a deficiency pointed out in PR22629. To copy from the bug
report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as
the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has
some variable as its first non-zero index, and determine if that index must be
zero (or else an out-of-bounds access would occur). We can do this for allocas
and globals with constant initializers where we know the maximum size of the
underlying object. When we find such a GEP, we create a new one for the memory
access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is
still useful because it removes unnecessary indexing calculations.

llvm-svn: 229959
2015-02-20 03:05:53 +00:00
Chandler Carruth d2b14b296c [x86] Make the new vector shuffle legality test on by default, which
reflects the fact that the x86 backend can in fact lower any shuffle you
want it to with reasonably high code quality.

My recent work on the new vector shuffle has made this regress *very*
little. The diff in the test cases makes me very, very happy.

llvm-svn: 229958
2015-02-20 03:05:47 +00:00
Kostya Serebryany 2e3622bddd [fuzzer] one more experimental search mode: -use_coverage_pairs=1
llvm-svn: 229957
2015-02-20 03:02:37 +00:00
Philip Reames 6faacf4772 Adjust enablement of RewriteStatepointsForGC
When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC.  We want the rewriting to be off by default (even when manually added to the pass order), not on-by default.  To keep the current test working, mark them as using the statepoint-example GC and whitelist that GC.  

Longer term, we need a better selection mechanism here for both actual usage and testing.  As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. 

llvm-svn: 229954
2015-02-20 02:34:49 +00:00
Chandler Carruth 301ed0c3b4 Revert r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
This doesn't pass 'ninja check-llvm' for me. Lots of tests, including
the ones updated, fail with crashes and other explosions.

llvm-svn: 229952
2015-02-20 02:15:36 +00:00
Philip Reames d16a9b1fdc Add a pass for constructing gc.statepoint sequences w/explicit relocations
This patch consists of a single pass whose only purpose is to visit previous inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'.

This patch is setting the stage for work to continue in tree.  In particular, there are known naming and style violations in the current patch.  I'll try to get those resolved over the next week or so.  As I touch each area to make style changes, I need to make sure we have adequate testing in place.  As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage.

The pass has several main subproblems it needs to address:
- First, it has identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future.  Note that the current change doesn't actually contain a useful liveness analysis.  It was seperated into a followup change as the code wasn't ready to be shared.  Instead, the current implementation just considers any dominating def of appropriate pointer type to be live.
- Second, it has to identify base pointers for each live pointer. This is a fairly straight forward data flow algorithm. 
- Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us.

llvm-svn: 229945
2015-02-20 01:06:44 +00:00
Reid Kleckner 0b647e6cca EH: Prune unreachable resume instructions during Dwarf EH preparation
Today a simple function that only catches exceptions and doesn't run
destructor cleanups ends up containing a dead call to _Unwind_Resume
(PR20300). We can't remove these dead resume instructions during normal
optimization because inlining might introduce additional landingpads
that do have cleanups to run. Instead we can do this during EH
preparation, which is guaranteed to run after inlining.

Fixes PR20300.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D7744

llvm-svn: 229944
2015-02-20 01:00:19 +00:00
Eric Christopher 0d94fa98e5 Revert "AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics."
The instructions were being generated on architectures that don't support avx512.

This reverts commit r229837.

llvm-svn: 229942
2015-02-20 00:45:28 +00:00
Eric Christopher 06b32cdfed Add a license header to the AVX512 file.
llvm-svn: 229941
2015-02-20 00:36:53 +00:00
Kostya Serebryany 885994618c [sanitizer] when dumping the basic block trace, also dump the module names. Patch by Laszlo Szekeres
llvm-svn: 229940
2015-02-20 00:30:44 +00:00
Eric Christopher cd37bf5483 This needs to be a const variable so the two sides of the ternary
operator agree on type.

llvm-svn: 229938
2015-02-20 00:03:45 +00:00
Michael Gottesman 0fc2accb58 [objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity.
I also cleaned up the code to make it more understandable for mere mortals.

<rdar://problem/19853758>

llvm-svn: 229937
2015-02-20 00:02:49 +00:00
Michael Gottesman 5ab64de62b [objc-arc] Add the predicate CanDecrementRefCount.
This is different from CanAlterRefCount since CanDecrementRefCount is
attempting to prove specifically whether or not an instruction can
decrement instead of the more general question of whether it can
decrement or increment.

llvm-svn: 229936
2015-02-20 00:02:45 +00:00
Duncan P. N. Exon Smith d34db1716e IR: Fix MDType fields from unsigned to uint64_t
When trying to match the current schema with the new debug info
hierarchy, I downgraded `SizeInBits`, `AlignInBits` and `OffsetInBits`
to 32-bits (oops!).  Caught this while testing my upgrade script to move
the hierarchy into place.  Bump it back up to 64-bits and update tests.

llvm-svn: 229933
2015-02-19 23:56:07 +00:00
Ahmed Bougacha db141ac37d [ARM] Re-re-apply VLD1/VST1 base-update combine.
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.

The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes.  When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.

However, for ARMISD nodes, we just used the MMO alignment, no matter
what.  In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.

And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).

To fix this, we can just mirror the hack done for ISD nodes:  only
take into account the MMO alignment when the access is overaligned.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

rdar://19717869, rdar://14062261.

llvm-svn: 229932
2015-02-19 23:52:41 +00:00
Eric Christopher 2105ae98f6 Only use the initialized MCInstrInfo if it's been initialized already
during SetupMachineFunction. This is also the single use of MII
and it'll be changing to TargetInstrInfo (which is MachineFunction
based) in the next commit here.

llvm-svn: 229931
2015-02-19 23:52:35 +00:00
Ahmed Bougacha dfdf54bed0 [ARM] Minor cleanup to CombineBaseUpdate. NFC.
In preparation for a future patch:
- rename isLoad to isLoadOp: the former is confusing, and can be taken
  to refer to the fact that the node is an ISD::LOAD.  (it isn't, yet.)
- change formatting here and there.
- add some comments.
- const-ify bools.

llvm-svn: 229929
2015-02-19 23:30:37 +00:00
Eric Christopher 7330264146 Migrate away a use of the subtarget (and TargetMachine) from
AsmPrinterDwarf since the information is on the MCRegisterInfo
via the MCContext and MMI that we already have on the AsmPrinter.

llvm-svn: 229928
2015-02-19 23:29:42 +00:00
Duncan P. N. Exon Smith a9f0a8d325 IR: Add missing null operand to MDSubroutineType
Add missing `nullptr` from `MDSubroutineType`'s operands for
`MDCompositeTypeBase::getIdentifier()` (and add tests for all the other
unused fields).  This highlights just how crazy it is that
`MDSubroutineType` inherits from `MDCompositeTypeBase`.

llvm-svn: 229926
2015-02-19 23:25:21 +00:00
Ahmed Bougacha 4c2b0781a5 [CodeGen] Use ArrayRef instead of std::vector&. NFC.
The former lets us use SmallVectors.  Do so in ARM and AArch64.

llvm-svn: 229925
2015-02-19 23:13:10 +00:00
Eric Christopher cbdbf39881 MCTargetOptions reside on the TargetMachine that we always have via
TargetOptions.

llvm-svn: 229917
2015-02-19 21:29:51 +00:00
Eric Christopher 457864178f Remove a call to TargetMachine::getSubtarget from the inline
asm support in the asm printer. If we can get a subtarget from
the machine function then we should do so, otherwise we can
go ahead and create a default one since we're at the module
level.

llvm-svn: 229916
2015-02-19 21:24:23 +00:00
Colin LeMahieu 1174fea31c [Hexagon] Moving remaining methods off of HexagonMCInst in to HexagonMCInstrInfo and eliminating HexagonMCInst class.
llvm-svn: 229914
2015-02-19 21:10:50 +00:00
Benjamin Kramer 68ca67b212 MC: Allow multiple comma-separated expressions on the .uleb128 directive.
For compatiblity with GNU as. Binutils documents this as
'.uleb128 expressions'. Subtle, isn't it?

llvm-svn: 229911
2015-02-19 20:24:04 +00:00
Benjamin Kramer dfedfeb298 SSAUpdater: Use range-based for. NFC.
llvm-svn: 229908
2015-02-19 20:04:02 +00:00
Eric Christopher 64d35be6d6 Remove unused argument from emitInlineAsmStart.
llvm-svn: 229907
2015-02-19 19:52:25 +00:00
Michael Gottesman 2e0e4e07b4 [objc-arc] Convert the bodies of ARCInstKind predicates into covered switches.
This is much better than the previous manner of just using
short-curcuiting booleans from:

1. A "naive" efficiency perspective: we do not have to rely on the
compiler to change the short circuiting boolean operations into a
switch.
2. An understanding perspective by making the implicit behavior of
negative predicates explicit.
3. A maintainability perspective through the covered switch flag making
it easy to know where to update code when adding new ARCInstKinds.

llvm-svn: 229906
2015-02-19 19:51:36 +00:00
Michael Gottesman 6f729fa675 [objc-arc] Change the InstructionClass to be an enum class called ARCInstKind.
I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained
items related to ARCInstKind anyways.

llvm-svn: 229905
2015-02-19 19:51:32 +00:00
Chris Bieneman a747e5935d Checking if TARGET_OS_IPHONE is defined isn't good enough for 10.7 and earlier.
Older versions of the TargetConditionals header always defined TARGET_OS_IPHONE to something (0 or 1), so we need to test not only for the existence but also if it is 1.

This resolves PR22631.

llvm-svn: 229904
2015-02-19 19:50:52 +00:00
Colin LeMahieu 745c4710db [Hexagon] Moving more functions off of HexagonMCInst and in to HexagonMCInstrInfo.
llvm-svn: 229903
2015-02-19 19:49:27 +00:00
Adam Nemet 57ac766ee9 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229899
2015-02-19 19:15:21 +00:00
Adam Nemet e91cc6ef93 [LoopAccesses] Add -analyze support
The LoopInfo in combination with depth_first is used to enumerate the
loops.

Right now -analyze is not yet complete.  It only prints the result of
the analysis, the report and the run-time checks.  Printing the unsafe
depedences will require a bit more reshuffling which I'd like to do in a
follow-on to this patchset.  Unsafe dependences are currently checked
via -debug-only=loop-accesses in the new test.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229898
2015-02-19 19:15:19 +00:00
Adam Nemet 2bd6e984ef [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229897
2015-02-19 19:15:15 +00:00
Adam Nemet 3e87634fd8 [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229896
2015-02-19 19:15:13 +00:00
Adam Nemet 929c38e8ff [LoopAccesses] Add canAnalyzeLoop
This allows the analysis to be attempted with any loop.  This feature
will be used with -analysis.  (LV only requests the analysis on loops
that have already satisfied these tests.)

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229895
2015-02-19 19:15:10 +00:00
Adam Nemet 339f42b396 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229894
2015-02-19 19:15:07 +00:00
Adam Nemet 3bfd93d789 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229893
2015-02-19 19:15:04 +00:00
Adam Nemet 436018c3ff [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229892
2015-02-19 19:15:00 +00:00
Adam Nemet c922853b93 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229891
2015-02-19 19:14:56 +00:00
Adam Nemet f219c64723 [LoopAccesses] Make VectorizerParams global + fix for cyclic dep
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This commits also has the fix (D7731) to the break dependence cycle
between the analysis and vector libraries.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229890
2015-02-19 19:14:52 +00:00
Adam Nemet 04d4163e95 Revert "Reformat."
This reverts commit r229651.

I'd like to ultimately revert r229650 but this reformat stands in the
way.  I'll reformat the affected files once the the loop-access pass is
fully committed.

llvm-svn: 229889
2015-02-19 19:14:34 +00:00
Colin LeMahieu af304e5192 [Hexagon] Creating HexagonMCInstrInfo namespace as landing zone for static functions detached from HexagonMCInst.
llvm-svn: 229885
2015-02-19 19:00:00 +00:00
Eric Christopher 504f388a84 Update and remove a few calls to TargetMachine::getSubtargetImpl
out of the asm printer.

llvm-svn: 229883
2015-02-19 18:46:23 +00:00
Kostya Serebryany 016852c396 [fuzzer] split main() into FuzzerDriver() that takes a callback as a parameter and a tiny main() in a separate file
llvm-svn: 229882
2015-02-19 18:45:37 +00:00
Ben Langmuir 0897091730 Assume the original file is created before release in LockFileManager
This is true in clang, and let's us remove the problematic code that
waits around for the original file and then times out if it doesn't get
created in short order.  This caused any 'dead' lock file or legitimate
time out to cause a cascade of timeouts in any processes waiting on the
same lock (even if they only just showed up).

llvm-svn: 229881
2015-02-19 18:22:35 +00:00
Kostya Serebryany 2117269dd1 [fuzzer] properly annotate fallthrough, add one more entry to FAQ
llvm-svn: 229880
2015-02-19 18:21:12 +00:00
Colin LeMahieu f08a3ccf50 [Hexagon] Removing static variable holding MCInstrInfo.
llvm-svn: 229872
2015-02-19 17:38:39 +00:00
Benjamin Kramer 1c2beed7fd LSR: Move set instead of copying. NFC.
llvm-svn: 229871
2015-02-19 17:19:43 +00:00
Rafael Espindola 8c97e19124 Avoid conversion to float when creating ConstantDataArray/ConstantDataVector.
Patch by Raoux, Thomas F!

llvm-svn: 229864
2015-02-19 16:08:20 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Chandler Carruth 5d1a84b7b8 [x86] Delete still more piles of complex code now that we have a good
systematic lowering of v8i16.

This required a slight strategy shift to prefer unpack lowerings in more
places. While this isn't a cut-and-dry win in every case, it is in the
overwhelming majority. There are only a few places where the old
lowering would probably be a touch faster, and then only by a small
margin.

In some cases, this is yet another significant improvement.

llvm-svn: 229859
2015-02-19 15:21:57 +00:00
Chandler Carruth 0b39536390 [x86] Teach the unpack lowering how to lower with an initial unpack in
addition to lowering to trees rooted in an unpack.

This saves shuffles and or registers in many various ways, lets us
handle another class of v4i32 shuffles pre SSE4.1 without domain
crosses, etc.

llvm-svn: 229856
2015-02-19 15:06:13 +00:00
Chandler Carruth 352eba1c29 [x86] Dramatically improve v8i16 shuffle lowering by not using its
terribly complex partial blend logic.

This code path was one of the more complex and bug prone when it first
went in and it hasn't faired much better. Ultimately, with the simpler
basis for unpack lowering and support bit-math blending, this is
completely obsolete. In the worst case without this we generate
different but equivalent instructions. However, in many cases we
generate much better code. This is especially true when blends or pshufb
is available.

This does expose one (minor) weakness of the unpack lowering that I'll
try to address.

In case you were wondering, this is actually a big part of what I've
been trying to pull off in the recent string of commits.

llvm-svn: 229853
2015-02-19 14:08:24 +00:00
Chandler Carruth 2c0390ca4b [x86] Remove the final fallback in the v8i16 lowering that isn't really
needed, and significantly improve the SSSE3 path.

This makes the new strategy much more clear. If we can blend, we just go
with that. If we can't blend, we try to permute into an unpack so
that we handle cases where the unpack doing the blend also simplifies
the shuffle. If that fails and we've got SSSE3, we now call into
factored-out pshufb lowering code so that we leverage the fact that
pshufb can set up a blend for us while shuffling. This generates great
code, especially because we *know* we don't have a fast blend at this
point. Finally, we fall back on decomposing into permutes and blends
because we do at least have a bit-math-based blend if we need to use
that.

This pretty significantly improves some of the v8i16 code paths. We
never need to form pshufb for the single-input shuffles because we have
effective target-specific combines to form it there, but we were missing
its effectiveness in the blends.

llvm-svn: 229851
2015-02-19 13:56:49 +00:00
Chandler Carruth f0f0d27391 [x86] Simplify the pre-SSSE3 v16i8 lowering significantly by decomposing
them into permutes and a blend with the generic decomposition logic.

This works really well in almost every case and lets the code only
manage the expansion of a single input into two v8i16 vectors to perform
the actual shuffle. The blend-based merging is often much nicer than the
pack based merging that this replaces. The only place where it isn't we
end up blending between two packs when we could do a single pack. To
handle that case, just teach the v2i64 lowering to handle these blends
by digging out the operands.

With this we're down to only really random permutations that cause an
explosion of instructions.

llvm-svn: 229849
2015-02-19 13:15:12 +00:00
Chandler Carruth 8817e5e01b [x86] Remove the insanely over-aggressive unpack lowering strategy for
v16i8 shuffles, and replace it with new facilities.

This uses precise patterns to match exact unpacks, and the new
generalized unpack lowering only when we detect a case where we will
have to shuffle both inputs anyways and they terminate in exactly
a blend.

This fixes all of the blend horrors that I uncovered by always lowering
blends through the vector shuffle lowering. It also removes *sooooo*
much of the crazy instruction sequences required for v16i8 lowering
previously. Much cleaner now.

The only "meh" aspect is that we sometimes use pshufb+pshufb+unpck when
it would be marginally nicer to use pshufb+pshufb+por. However, the
difference there is *tiny*. In many cases its a win because we re-use
the pshufb mask. In others, we get to avoid the pshufb entirely. I've
left a FIXME, but I'm dubious we can really do better than this. I'm
actually pretty happy with this lowering now.

For SSE2 this exposes some horrors that were really already there. Those
will have to fixed by changing a different path through the v16i8
lowering.

llvm-svn: 229846
2015-02-19 12:10:37 +00:00
Jozef Kolek 5d171fc291 [mips][microMIPS] Make usage of AND16, OR16 and XOR16 by code generator
Differential Revision: http://reviews.llvm.org/D7611

llvm-svn: 229845
2015-02-19 11:51:32 +00:00
Chandler Carruth 38dea42ddf [x86] The SELECT x86 DAG combine also does legalization. It used to rely
on things not being marked as either custom or legal, but we now do
custom lowering of more VSELECT nodes. To cope with this, manually
replicate the legality tests here. These have to stay in sync with the
set of tests used in the custom lowering of VSELECT.

Ideally, we wouldn't do any of this combine-based-legalization when we
have an actual custom legalization step for VSELECT, but I'm not going
to be able to rewrite all of that today.

I don't have a test case for this currently, but it was found when
compiling a number of the test-suite benchmarks. I'll try to reduce
a test case and add it.

This should at least fix the test-suite fallout on build bots.

llvm-svn: 229844
2015-02-19 11:43:37 +00:00
Michael Kuperstein efd7a96d2e Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.
llvm-svn: 229841
2015-02-19 11:38:11 +00:00
Igor Laevsky 9570ff94f7 Implement invoke statepoint verification.
Differential Revision: http://reviews.llvm.org/D7366

llvm-svn: 229840
2015-02-19 11:28:47 +00:00
Igor Laevsky 77f118f878 Add invoke related functionality into StatepointSite classes.
Differential Revision: http://reviews.llvm.org/D7364

llvm-svn: 229838
2015-02-19 11:02:11 +00:00
Elena Demikhovsky 69e8b45b13 AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics.
llvm-svn: 229837
2015-02-19 10:48:04 +00:00
Chandler Carruth bcb6c5f62d [x86] Add support for bit-wise blending and use it in the v8 and v16
lowering paths. I'm going to be leveraging this to simplify a lot of the
overly complex lowering of v8 and v16 shuffles in pre-SSSE3 modes.

Sadly, this isn't profitable on v4i32 and v2i64. There, the float and
double blending instructions for pre-SSE4.1 are actually pretty good,
and we can't beat them with bit math. And once SSE4.1 comes around we
have direct blending support and this ceases to be relevant.

Also, some of the test cases look odd because the domain fixer
canonicalizes these to floating point domain. That's OK, it'll use the
integer domain when it matters and some day I may be able to update
enough of LLVM to canonicalize the other way.

This restores almost all of the regressions from teaching x86's vselect
lowering to always use vector shuffle lowering for blends. The remaining
problems are because the v16 lowering path is still doing crazy things.
I'll be re-arranging that strategy in more detail in subsequent commits
to finish recovering the performance here.

llvm-svn: 229836
2015-02-19 10:46:52 +00:00
Chandler Carruth b89464a9b6 [x86,sdag] Two interrelated changes to the x86 and sdag code.
First, don't combine bit masking into vector shuffles (even ones the
target can handle) once operation legalization has taken place. Custom
legalization of vector shuffles may exist for these patterns (making the
predicate return true) but that custom legalization may in some cases
produce the exact bit math this matches. We only really want to handle
this prior to operation legalization.

However, the x86 backend, in a fit of awesome, relied on this. What it
would do is mark VSELECTs as expand, which would turn them into
arithmetic, which this would then match back into vector shuffles, which
we would then lower properly. Amazing.

Instead, the second change is to teach the x86 backend to directly form
vector shuffles from VSELECT nodes with constant conditions, and to mark
all of the vector types we support lowering blends as shuffles as custom
VSELECT lowering. We still mark the forms which actually support
variable blends as *legal* so that the custom lowering is bypassed, and
the legal lowering can even be used by the vector shuffle legalization
(yes, i know, this is confusing. but that's how the patterns are
written).

This makes the VSELECT lowering much more sensible, and in fact should
fix a bunch of bugs with it. However, as you'll see in the test cases,
right now what it does is point out the *hilarious* deficiency of the
new vector shuffle lowering when it comes to blends. Fortunately, my
very next patch fixes that. I can't submit it yet, because that patch,
somewhat obviously, forms the exact and/or pattern that the DAG combine
is matching here! Without this patch, teaching the vector shuffle
lowering to produce the right code infloops in the DAG combiner. With
this patch alone, we produce terrible code but at least lower through
the right paths. With both patches, all the regressions here should be
fixed, and a bunch of the improvements (like using 2 shufps with no
memory loads instead of 2 andps with memory loads and an orps) will
stay. Win!

There is one other change worth noting here. We had hilariously wrong
vectorization cost estimates for vselect because we fell through to the
code path that assumed all "expand" vector operations are scalarized.
However, the "expand" lowering of VSELECT is vector bit math, most
definitely not scalarized. So now we go back to the correct if horribly
naive cost of "1" for "not scalarized". If anyone wants to add actual
modeling of shuffle costs, that would be cool, but this seems an
improvement on its own. Note the removal of 16 and 32 "costs" for doing
a blend. Even in SSE2 we can blend in fewer than 16 instructions. ;] Of
course, we don't right now because of OMG bad code, but I'm going to fix
that. Next patch. I promise.

llvm-svn: 229835
2015-02-19 10:36:19 +00:00
Michael Kuperstein ba5b04c798 Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.

No functional change.

Differential Revision: http://reviews.llvm.org/D7065

llvm-svn: 229831
2015-02-19 09:01:04 +00:00
Davide Italiano faafae33fa [Support/Timer] Make GetMallocUsage() aware of jemalloc.
Differential Revision:	D7657
Reviewed by:	shankarke, majnemer

llvm-svn: 229824
2015-02-19 07:27:14 +00:00
Dmitri Gribenko 3e1551c96f Provide the same ABI regardless of NDEBUG
For projects depending on LLVM, I find it very useful to combine a
release-no-asserts build of LLVM with a debug+asserts build of the dependent
project.  The motivation is that when developing a dependent project, you are
debugging that project itself, not LLVM.  In my usecase, a significant part of
the runtime is spent in LLVM optimization passes, so I would like to build LLVM
without assertions to get the best performance from this combination.

Currently, `lib/Support/Debug.cpp` changes the set of symbols it provides
depending on NDEBUG, while `include/llvm/Support/Debug.h` requires extra
symbols when NDEBUG is not defined.  Thus, it is not possible to enable
assertions in an external project that uses facilities of `Debug.h`.

This patch changes `Debug.cpp` and `Valgrind.cpp` to always define the symbols
that other code may depend on when #including LLVM headers without NDEBUG.

http://reviews.llvm.org/D7662

llvm-svn: 229819
2015-02-19 05:30:16 +00:00
Eric Christopher d84f5d30e2 Remove the local subtarget variable from the SystemZ asm printer
and update the two calls accordingly.

llvm-svn: 229805
2015-02-19 01:26:28 +00:00
Eric Christopher 0795a2ef0c Remove a few more calls to TargetMachine::getSubtarget from the
R600 port.

llvm-svn: 229804
2015-02-19 01:10:55 +00:00
Eric Christopher 7edca437f5 Grab the subtarget off of the machine function for the R600
asm printer and clean up a bunch of uses.

llvm-svn: 229803
2015-02-19 01:10:53 +00:00
Eric Christopher 96caeda730 Remove the DisasmEnabled AsmPrinter variable and just look it
up on the subtarget where it's set anyhow than looking it up
2-3 times in the same place.

llvm-svn: 229802
2015-02-19 01:10:49 +00:00
Peter Collingbourne fb8002cbe0 MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer.
llvm-svn: 229799
2015-02-19 00:45:07 +00:00
Peter Collingbourne 20c7259ce9 Introduce Target::createNullTargetStreamer and use it from IRObjectFile.
A null MCTargetStreamer allows IRObjectFile to ignore target-specific
directives. Previously we were crashing.

Differential Revision: http://reviews.llvm.org/D7711

llvm-svn: 229797
2015-02-19 00:45:02 +00:00
Michael Gottesman e5ad66f8a9 [objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC.
The RCIdentity root ("Reference Count Identity Root") of a value V is a
dominating value U for which retaining or releasing U is equivalent to
retaining or releasing V. In other words, ARC operations on V are
equivalent to ARC operations on U.

This is a useful property to ascertain since we can use this in the ARC
optimizer to make it easier to match up ARC operations by always mapping
ARC operations to RCIdentityRoots instead of pointers themselves. Then
we perform pairing of retains, releases which are applied to the same
RCIdentityRoot.

In general, the two ways that we see RCIdentical values in ObjC are via:

  1. PointerCasts
  2. Forwarding Calls that return their argument verbatim.

As such in ObjC, two RCIdentical pointers must always point to the same
memory location.

Previously this concept was implicit in the code and various methods
that dealt with this concept were given functional names that did not
conform to any name in the "ARC" model. This often times resulted in
code that was hard for the non-ARC acquanted to understand resulting in
unhappiness and confusion.

llvm-svn: 229796
2015-02-19 00:42:38 +00:00
Michael Gottesman dfa3e4b08a [objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong.
NFC. Makes it clearer what this method is actually supposed to do.

llvm-svn: 229795
2015-02-19 00:42:34 +00:00
Michael Gottesman 1827973f80 [objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC.
The main method of ObjCARCContract is really large and busy. By refactoring this
out, it becomes easier to reason about.

llvm-svn: 229794
2015-02-19 00:42:30 +00:00
Michael Gottesman 56bd6a077a [objc-arc-contract] Reorganize the code a bit and make the debug output easier to read.
llvm-svn: 229793
2015-02-19 00:42:27 +00:00
Duncan P. N. Exon Smith 3d62bbacb1 IR: Drop scope from MDTemplateParameter
Follow-up to r229740, which removed `DITemplate*::getContext()` after my
upgrade script revealed that scopes are always `nullptr` for template
parameters.  This is the other shoe: drop `scope:` from
`MDTemplateParameter` and its two subclasses.  (Note: a bitcode upgrade
would be pointless, since the hierarchy hasn't been moved into place.)

llvm-svn: 229791
2015-02-19 00:37:21 +00:00
Eric Christopher ca929f2469 Avoid using a self-referential initializer and fix up uses.
llvm-svn: 229790
2015-02-19 00:22:47 +00:00
Eric Christopher 111de895a0 80-column fixups.
llvm-svn: 229789
2015-02-19 00:15:33 +00:00
Eric Christopher 02389e3886 Remove all use of is64bit off of NVPTXSubtarget and clean up code
accordingly. This changes the constructors of a number of classes
that don't need to know the subtarget's 64-bitness.

llvm-svn: 229787
2015-02-19 00:08:27 +00:00
Eric Christopher beffc4e84f Remove all use of getDrvInterface off of NVPTXSubtarget and clean
up code accordingly. Delete code that was checking for all cases
of an enum.

llvm-svn: 229786
2015-02-19 00:08:23 +00:00
Eric Christopher 6aad8b1801 Migrate the NVPTX backend asm printer to a per function subtarget.
This involved moving two non-subtarget dependent features (64-bitness
and the driver interface) to the NVPTX target machine and updating
the uses (or migrating around the subtarget use for ease of review).
Otherwise use the cached subtarget or create a default subtarget
based on the TargetMachine cpu and feature string for the module
level assembler emission.

llvm-svn: 229785
2015-02-19 00:08:14 +00:00
Duncan P. N. Exon Smith 5c9a17732b IR: Allow MDSubrange to have 'count: -1'
It turns out that `count: -1` is a special value indicating an empty
array, such as `Values` in:

    struct T {
      unsigned Count;
      int Values[];
    };

Handle it.

llvm-svn: 229769
2015-02-18 23:17:51 +00:00
Reid Kleckner 7bb0738d82 Add an IR-to-IR test for dwarf EH preparation using opt
This tests the simple resume instruction elimination logic that we have
before making some changes to it.

llvm-svn: 229768
2015-02-18 23:17:41 +00:00
Andrew Kaylor 179543bb9b Style and formatting fixes for r229715
llvm-svn: 229758
2015-02-18 22:52:18 +00:00
Marek Olsak 9b8f32eed1 R600/SI: Fix READLANE and WRITELANE lane select for VI
VOP2 declares vsrc1, but VOP3 declares src1.
We can't use the same "ins" if the operands have different names in VOP2
and VOP3 encodings.

This fixes a hang in geometry shaders which spill M0 on VI.
(BTW it doesn't look like M0 needs spilling and the spilling seems
duplicated 3 times)

llvm-svn: 229752
2015-02-18 22:12:45 +00:00
Marek Olsak 8eeebcccb5 R600/SI: Simplify verification of AMDGPU::OPERAND_REG_INLINE_C
llvm-svn: 229751
2015-02-18 22:12:41 +00:00
Marek Olsak b8c818337d R600/SI: Remove explicit VOP operand checking
This should be handled by the OperandType checking.

llvm-svn: 229750
2015-02-18 22:12:37 +00:00
Duncan P. N. Exon Smith cd8fb60fce IR: Swap order of name and value in MDEnum
Put the name before the value in assembly for `MDEnum`.  While working
on the testcase upgrade script for the new hierarchy, I noticed that it
"looks nicer" to have the name first, since it lines the names up in the
(somewhat typical) case that they have a common prefix.

llvm-svn: 229747
2015-02-18 21:16:33 +00:00
Duncan P. N. Exon Smith df52349bb0 IR: Add MDSubprogram::replaceFunction()
llvm-svn: 229742
2015-02-18 20:32:57 +00:00
Duncan P. N. Exon Smith 89b075e53a IR: Drop the scope in DI template parameters
The scope/context is always the compile unit, which we replace with
`nullptr` anyway (via `getNonCompileUnitScope()`).  Drop it explicitly.

I noticed this field was always null while writing testcase upgrade
scripts to transition to the new hierarchy.  Seems wasteful to
transition it over if it's already out-of-use.

llvm-svn: 229740
2015-02-18 20:30:45 +00:00
Duncan P. N. Exon Smith e4450146fa Fix -DNDEBUG -Werror build after r229733
llvm-svn: 229736
2015-02-18 19:56:50 +00:00
Reid Kleckner 4dd0304e34 dos2unix the WinEH file and tests
llvm-svn: 229735
2015-02-18 19:52:46 +00:00
Duncan P. N. Exon Smith 8551d25fa9 IR: isScopeRef() should check isScope()
r229733 removed an invalid use of `DIScopeRef`, so now we can enforce
that a `DIScopeRef` is actually a scope.

llvm-svn: 229734
2015-02-18 19:46:02 +00:00
Duncan P. N. Exon Smith 2a78e9bcb5 IR: Avoid DIScopeRef in DIImportedEntity::getEntity()
`DIImportedEntity::getEntity()` currently returns a `DIScopeRef`, but
the nodes it references aren't always `DIScope`s.  In particular, it can
reference global variables.

Introduce `DIDescriptorRef` to avoid the lie.

llvm-svn: 229733
2015-02-18 19:39:36 +00:00
Sanjoy Das 11b279a832 Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows.  This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

llvm-svn: 229731
2015-02-18 19:32:25 +00:00
Justin Bogner 11ae7789ba InstrProf: Don't combine expansion regions with code regions
This was leading to duplicate counts when a code region happened to
overlap exactly with an expansion. The combining behaviour only makes
sense for code regions.

llvm-svn: 229723
2015-02-18 19:01:06 +00:00
David Blaikie 30f2f3fc98 Remove unused member variables (-Wunused-private-field)
llvm-svn: 229722
2015-02-18 18:52:49 +00:00
Justin Bogner 428c605dff InstrProf: Handle unknown functions if they consist only of zero-regions
This comes up when we generate coverage for a function but don't end
up emitting the function at all - dead static functions or inline
functions that aren't referenced in a particular TU, for example. In
these cases we'd like to show that the function was never called,
which is trivially true.

llvm-svn: 229717
2015-02-18 18:40:46 +00:00
Andrew Kaylor 527c5dc68d Adding implementation to outline C++ catch handlers for native Windows 64 exception handling.
Differential Revision: http://reviews.llvm.org/D7363

llvm-svn: 229715
2015-02-18 18:31:51 +00:00
Justin Bogner 1d29c08095 InstrProf: Make CoverageMapping testable and add a basic unit test
Make CoverageMapping easier to create, so that we can write targeted
unit tests for its internals, and add a some infrastructure to write
these tests. Finally, add a simple unit test for basic functionality.

llvm-svn: 229709
2015-02-18 18:01:14 +00:00
Jozef Kolek 3c6724f442 [mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator
Differential Revision: http://reviews.llvm.org/D7609

llvm-svn: 229706
2015-02-18 17:33:56 +00:00
Jozef Kolek 1fd6548297 [mips][microMIPS] Implement JALX instruction
Differential Revision: http://reviews.llvm.org/D5047

llvm-svn: 229702
2015-02-18 17:15:48 +00:00
Daniel Sanders 1779314e3c [mips] Add backend support for Mips32r[35] and Mips64r[35].
Summary:
These ISA's didn't add any instructions so they are almost identical to
Mips32r2 and Mips64r2. Even the ELF e_flags are the same, However the ISA
revision in .MIPS.abiflags is 3 or 5 respectively instead of 2.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: tomatabacu, llvm-commits, atanasyan

Differential Revision: http://reviews.llvm.org/D7381

llvm-svn: 229695
2015-02-18 16:24:50 +00:00
Kit Barton 298beb5e86 This patch adds the VSX logical instructions introduced in the Power ISA 2.07. It also removes the added complexity that favors VMX versions of the three instructions.
Phabricator review: http://reviews.llvm.org/D7616

Commiting on Nemanja's behalf.

llvm-svn: 229694
2015-02-18 16:21:46 +00:00
Tom Stellard 1ca873bbc5 R600/SI: Don't set isCodeGenOnly = 1 on all instructions
We only need to set this on pseudo instructions which won't
be used by the assembler.

llvm-svn: 229689
2015-02-18 16:08:17 +00:00
Tom Stellard c34c37ae66 R600/SI: Add missing VOP1 instructions
llvm-svn: 229688
2015-02-18 16:08:15 +00:00
Tom Stellard 894b9883f4 R600/SI: Add missing VOP2 instructions
llvm-svn: 229687
2015-02-18 16:08:14 +00:00
Tom Stellard 0c0008cb6e R600/SI: Add definition for S_CBRANCH_G_FORK
llvm-svn: 229686
2015-02-18 16:08:13 +00:00
Tom Stellard ce449ade7e R600/SI: Add missing SOP1 instructions
llvm-svn: 229685
2015-02-18 16:08:11 +00:00
Tom Stellard ee21faa029 R600/SI: Refactor SOP2 definitions
llvm-svn: 229684
2015-02-18 16:08:09 +00:00
Vasileios Kalintiris 611cb70b83 [mips] Avoid redundant sign extension of the result of binary bitwise instructions.
Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7581

llvm-svn: 229675
2015-02-18 14:57:05 +00:00
Benjamin Kramer 6ca8992018 X86: Use bitset to manage a bag of bits. NFC.
Doesn't matter in terms of memory usage or perf here, but it's a neat
simplification.

llvm-svn: 229672
2015-02-18 14:10:44 +00:00
Toma Tabacu 8874eac5e6 [mips] [IAS] Fix using .cpsetup with local labels (PR22518).
Summary:
Parse for an MCExpr instead of an Identifier and use the symbol for relocations, not just the symbol's name.

This fixes errors when using local labels in .cpsetup (PR22518).

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D7697

llvm-svn: 229671
2015-02-18 13:46:53 +00:00
Chandler Carruth bbb377c3a1 [x86] Tighten the assertions to document that canonicalization has
actually removed all but a *very* small number of choices for v2i64.
Also remove dead code handling cases that simply cannot arise.

llvm-svn: 229670
2015-02-18 11:46:29 +00:00
Chandler Carruth 811f0ee8c1 [x86] Switch an if which is trivially true to an assert. NFC
llvm-svn: 229669
2015-02-18 11:46:27 +00:00
Chandler Carruth 8f3e585b17 [x86] Remove some more 'bit' nomenclature from the generic shift
lowering.

llvm-svn: 229668
2015-02-18 11:46:23 +00:00
Mohit K. Bhakkad 518946e440 [MSan][MIPS] VarArgHelper for MIPS64
Reviewers: Reviewers: eugenis, kcc, samsonov, petarj

Subscribers: dsanders, sagar, llvm-commits

Differential Revision: http://reviews.llvm.org/D7182

llvm-svn: 229667
2015-02-18 11:41:24 +00:00
Chandler Carruth 672a98ea28 [x86] Fold together the two shift lowering strategies. They were doing
quite literally the same work, we just need to special case the >64-bit
element shift code emission to emit the byte shift instructions and
offsets. This also makes reasoning about each of the vector lowering
strategies easier as we don't have to remember to use both forms.

llvm-svn: 229662
2015-02-18 10:40:38 +00:00
Bradley Smith 26c9922a59 [ARM] Add missing M/R class CPUs
Add some of the missing M and R class Cortex CPUs, namely:

Cortex-M0+ (called Cortex-M0plus for GCC compatibility)
Cortex-M1
SC000
SC300
Cortex-R5

llvm-svn: 229660
2015-02-18 10:33:30 +00:00
Michael Kuperstein af9befa6b7 Fixes two issue in SimplifyDemandedBits of sext_in_reg:
1) We should not try to simplify if the sext has multiple uses
2) There is no need to simplify is the source value is already sign-extended.

Patch by Gil Rapaport <gil.rapaport@intel.com>

Differential Revision: http://reviews.llvm.org/D6949

llvm-svn: 229659
2015-02-18 09:43:40 +00:00
Ulrich Weigand b7e5909a42 [SystemZ] Clean up warning
Removed (unreachable) default case in switch to clean up warning:

lib/Target/SystemZ/SystemZISelLowering.cpp:1974:5:
error: default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]

llvm-svn: 229658
2015-02-18 09:42:23 +00:00
Chandler Carruth 48cc6c623a [x86] Refactor the bit shift code the same as I just did the byte shift
code.

While this didn't have the miscompile (it used MatchLeft consistently)
it missed some cases where it could use right shifts. I've added a test
case Craig Topper came up with to exercise the right shift matching.

This code is really identical between the two. I'm going to merge them
next so that we don't keep two copies of all of this logic.

llvm-svn: 229655
2015-02-18 09:19:58 +00:00
Ulrich Weigand 7db6918e2b [SystemZ] Support all TLS access models - CodeGen part
The current SystemZ back-end only supports the local-exec TLS access model.
This patch adds all required CodeGen support for the other TLS models, which
means in particular:

- Expand initial-exec TLS accesses by loading TLS offsets from the GOT
  using @indntpoff relocations.

- Expand general-dynamic and local-dynamic accesses by generating the
  appropriate calls to __tls_get_offset.  Note that this routine has
  a non-standard ABI and requires loading the GOT pointer into %r12,
  so the patch also adds support for the GLOBAL_OFFSET_TABLE ISD node.

- Add a new platform-specific optimization pass to remove redundant
  __tls_get_offset calls in the local-dynamic model (modeled after
  the corresponding X86 pass).

- Add test cases verifying all access models and optimizations.

llvm-svn: 229654
2015-02-18 09:13:27 +00:00
Ulrich Weigand 7bdd7c2346 [SystemZ] Support all TLS access models - MC part
The current SystemZ back-end only supports the local-exec TLS access model.
This patch adds all required MC support for the other TLS models, which
means in particular:

- Support additional relocation types for
  Initial-exec model: R_390_TLS_IEENT
  Local-dynamic-model: R_390_TLS_LDO32, R_390_TLS_LDO64,
                       R_390_TLS_LDM32, R_390_TLS_LDM64, R_390_TLS_LDCALL
  General-dynamic model: R_390_TLS_GD32, R_390_TLS_GD64, R_390_TLS_GDCALL

- Support assembler syntax to generate additional relocations
  for use with __tls_get_offset calls:
    :tls_gdcall:
    :tls_ldcall:

The patch also adds a new test to verify fixups and relocations,
and removes the (already unused) FK_390_PLT16DBL/FK_390_PLT32DBL
fixup kinds.

llvm-svn: 229652
2015-02-18 09:11:36 +00:00
NAKAMURA Takumi a250484c4c Reformat.
llvm-svn: 229651
2015-02-18 08:36:14 +00:00
NAKAMURA Takumi fa520c5f49 Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector.
r229622: "[LoopAccesses] Make VectorizerParams global"
  r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
  r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
  r229626: "[LoopAccesses] Create the analysis pass"
  r229628: "[LoopAccesses] Change debug messages from LV to LAA"
  r229630: "[LoopAccesses] Add canAnalyzeLoop"
  r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
  r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
  r229633: "[LoopAccesses] Add -analyze support"
  r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
  r229638: "Analysis: fix buildbots"

llvm-svn: 229650
2015-02-18 08:34:47 +00:00
Daniel Jasper ed9eb7209e NFC: Use range-based for loops and more consistent naming.
No functional changes intended.

(I plan on doing some modifications to this function and would like to
have as few unrelated changes as possible in the patch)

llvm-svn: 229649
2015-02-18 08:19:16 +00:00
Daniel Jasper 4d7b04384e Remove experimental options to control machine block placement.
This reverts r226034. Benchmarking with those flags has not revealed
anything interesting.

llvm-svn: 229648
2015-02-18 08:18:07 +00:00
Sanjoy Das c1065b9a4f Address post commit review on r229600.
llvm-svn: 229646
2015-02-18 08:03:22 +00:00
Elena Demikhovsky 714f23bcdb AVX-512: Added support for FP instructions with embedded rounding mode.
By Asaf Badouh <asaf.badouh@intel.com>

llvm-svn: 229645
2015-02-18 07:59:20 +00:00
Chandler Carruth 55553f5299 [x86] Rewrite the byte shift detection to not use boolean variables to
track state.

I didn't like this in the code review because the pattern tends to be
error prone, but I didn't see a clear way to rewrite it. Turns out that
there were bugs here, I found them when fuzz testing our shuffle
lowering for correctness on x86.

The core of the problem is that we need to consistently test all our
preconditions for the same directionality of shift and the same input
vector. Instead, formulate this as two predicates (one doesn't depend on
the input in any way), pass things like the directionality and input
vector as inputs, and loop over the alternatives.

This fixes a pattern of very rare miscompiles coming out of this code.
Turned up roughly 4 out of every 1 million v8 shuffles in my fuzz
testing. The new code is over half a million test runs with no failures
yet. I've also fuzzed every other function in the lowering code with
over 3.5 million test cases and not discovered any other miscompiles.

llvm-svn: 229642
2015-02-18 07:13:48 +00:00
Craig Topper 1348f17205 [X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2.
llvm-svn: 229641
2015-02-18 06:24:49 +00:00
Craig Topper b324e43aed [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
llvm-svn: 229640
2015-02-18 06:24:44 +00:00
Saleem Abdulrasool 90b1d152b5 Analysis: fix buildbots
This should fix the compilation failure on the MSVC buildbots which find a
std::make_unique and llvm::make_unique via ADL, resulting in ambiguity.

llvm-svn: 229638
2015-02-18 05:09:50 +00:00
Adam Nemet 85fd9f8d09 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229634
2015-02-18 03:44:33 +00:00
Adam Nemet 75bc2d111f [LoopAccesses] Add -analyze support
The LoopInfo in combination with depth_first is used to enumerate the
loops.

Right now -analyze is not yet complete.  It only prints the result of
the analysis, the report and the run-time checks.  Printing the unsafe
depedences will require a bit more reshuffling which I'd like to do in a
follow-on to this patchset.  Unsafe dependences are currently checked
via -debug-only=loop-accesses in the new test.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229633
2015-02-18 03:44:30 +00:00
Adam Nemet d7350dbb85 [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229632
2015-02-18 03:44:25 +00:00
Adam Nemet 8b12afbeee [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229631
2015-02-18 03:44:20 +00:00
Adam Nemet 450d417ecf [LoopAccesses] Add canAnalyzeLoop
This allows the analysis to be attempted with any loop.  This feature
will be used with -analysis.  (LV only requests the analysis on loops
that have already satisfied these tests.)

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229630
2015-02-18 03:44:08 +00:00
Adam Nemet a8945b7790 [LoopAccesses] Factor out RuntimePointerCheck::needsChecking
Will be used by the new RuntimePointerCheck::print.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229629
2015-02-18 03:43:58 +00:00
Adam Nemet d0db4c1395 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229628
2015-02-18 03:43:37 +00:00
Adam Nemet d6b7e29815 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229626
2015-02-18 03:43:24 +00:00
Adam Nemet 01abb2c355 [LoopAccesses] Make blockNeedsPredication static
blockNeedsPredication is in LoopAccess in order to share it with the
vectorizer.  It's a utility needed by LoopAccess not strictly provided
by it but it's a good place to share it.  This makes the function static
so that it no longer required to create an LoopAccessInfo instance in
order to access it from LV.

This was actually causing problems because it would have required
creating LAI much earlier that LV::canVectorizeMemory().

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229625
2015-02-18 03:43:19 +00:00
Adam Nemet 3cf32ad6db [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229624
2015-02-18 03:42:57 +00:00
Adam Nemet 5474be2c80 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229623
2015-02-18 03:42:50 +00:00
Adam Nemet 4f3ede5a01 [LoopAccesses] Make VectorizerParams global
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229622
2015-02-18 03:42:43 +00:00
Adam Nemet 30f16e1696 [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo
LoopAccessAnalysis will be used as the name of the pass.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229621
2015-02-18 03:42:35 +00:00
Akira Hatanaka 1defd5afbd [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964

llvm-svn: 229619
2015-02-18 03:30:11 +00:00
Hal Finkel 4393559621 [BDCE] Don't forget uses of root instructions seen before the instruction itself
When visiting the initial list of "root" instructions (those which must always
be alive), for those that are integer-valued (such as invokes returning an
integer), we mark their bits as (initially) all dead (we might, obviously, find
uses of those bits later, but all bits are assumed dead until proven
otherwise). Don't do so, however, if we're already seen a use of those bits by
another root instruction (such as a store).

Fixes a miscompile of the sanitizer unit tests on x86_64.

Also, add a debug line for visiting the root instructions, and remove a debug
line which tried to print instructions being removed (printing dead
instructions is dangerous, and can sometimes crash).

llvm-svn: 229618
2015-02-18 03:12:28 +00:00
Matt Arsenault 0ba644b66b R600/SI: Rename dst encoding field to be consistent with docs
The docs call this vdst instead of just dst.

llvm-svn: 229614
2015-02-18 02:15:37 +00:00
Matt Arsenault e3dbcf6656 R600/SI: Consistently capitalize encoding field names
Some formats capitalized these, but most didn't. Change
them all to be consistently lowercase.

Now, non-encoding fields and convenience bits are capitalized.
Also remove weird looking empty line in some of the formats.

llvm-svn: 229613
2015-02-18 02:15:35 +00:00
Matt Arsenault 1ecac06a6f R600/SI: Set noNamedPositionallyEncodedOperands
llvm-svn: 229612
2015-02-18 02:15:32 +00:00
Matt Arsenault 096ec1e10c R600/SI: Fix src1_modifiers for class instructions
src1 doesn't have modifiers, but the operand was missing
resulting in an encoding build error when all fields
are required.'

llvm-svn: 229611
2015-02-18 02:15:30 +00:00
Matt Arsenault 65fa1c425d R600/SI: Fix not setting clamp / omod for v_cndmask_b32_e64
Rename the multiclass since it now applies to the output
modifiers as well.

llvm-svn: 229610
2015-02-18 02:15:27 +00:00
Matt Arsenault 284d7dfb53 R600: Fix operand encoding error
llvm-svn: 229609
2015-02-18 02:10:42 +00:00
Matt Arsenault 1991f5e40b R600/SI: Fix encoding error from glc bit on VI SMRD instructions
llvm-svn: 229608
2015-02-18 02:10:40 +00:00
Matt Arsenault e6c5241814 R600/SI: Fix operand encoding for flat instructions
llvm-svn: 229607
2015-02-18 02:10:37 +00:00
Matt Arsenault 07e3bb153f R600/SI: Fix error from vdst on no return atomics
Set the ignored field to 0 so we can enable
noNamedPositionallyEncodedOperands.

llvm-svn: 229606
2015-02-18 02:10:35 +00:00
Matt Arsenault caa1288fff R600/SI: Add missing offset operand to buffer bothen
llvm-svn: 229605
2015-02-18 02:04:38 +00:00
Matt Arsenault 2ad8bab7ee R600/SI: Add missing soffset operand to global atomics
llvm-svn: 229604
2015-02-18 02:04:35 +00:00
Matt Arsenault 3c34ae293c R600/SI: Fix brace identation
llvm-svn: 229603
2015-02-18 02:04:31 +00:00
Justin Bogner 2b6c537bdc Re-apply "InstrProf: Add unit tests for the profile reader and writer"
Have the InstrProfWriter return a MemoryBuffer instead of a
std::string. This fixes the alignment issues the reader would hit, and
it's a more appropriate type for this anyway.

I've also removed an ugly helper function that's not needed since
we're allowing initializer lists now, and updated some error code
checks based on MSVC's issues with r229473.

This reverts r229483, reapplying r229478.

llvm-svn: 229602
2015-02-18 01:58:17 +00:00
Matthias Braun 11042c8523 LiveRangeCalc: Rename some parameters from kill to use, NFC.
Those parameters did not necessarily describe kill points but just uses.

llvm-svn: 229601
2015-02-18 01:50:52 +00:00
Sanjoy Das 4153f47026 Generalize getExtendAddRecStart to work with both sign and zero
extensions.

This change also removes `DEBUG(dbgs() << "SCEV: untested prestart
overflow check\n");` because that case has a unit test now.

Differential Revision: http://reviews.llvm.org/D7645

llvm-svn: 229600
2015-02-18 01:47:07 +00:00
Eric Christopher 8af49b3214 Make the Mips AsmPrinter independent of global subtarget
initialization. Initialize the subtarget once per function and
migrate EmitStartOfAsmFile to either use calls on the
TargetMachine or get information from the subtarget we'd use
for assembling.

The top-level-ness of the MIPS attribute output for assembly is,
by nature, contrary to how we'd want to do this for an LTO
situation where we have multiple cpu architectures so this
solution is good enough for now.

llvm-svn: 229596
2015-02-18 01:01:57 +00:00
Eric Christopher bbe6ff50f3 Unify selectMipsCPU implementations.
llvm-svn: 229595
2015-02-18 00:55:06 +00:00
Sanjoy Das 102061a494 Bugfix: SCEV incorrectly marks certain expressions as nsw
I could not come up with a test case for this one; but I don't think
`getPreStartForSignExtend` can assume `AR` is `nsw` -- there is one
place in scalar evolution that calls `getSignExtendAddRecStart(AR,
...)` without proving that `AR` is `nsw`

(line 1564)

   OperandExtendedAdd =
     getAddExpr(WideStart,
                getMulExpr(WideMaxBECount,
                           getZeroExtendExpr(Step, WideTy)));
   if (SAdd == OperandExtendedAdd) {
     // If AR wraps around then
     //
     //    abs(Step) * MaxBECount > unsigned-max(AR->getType())
     // => SAdd != OperandExtendedAdd
     //
     // Thus (AR is not NW => SAdd != OperandExtendedAdd) <=>
     // (SAdd == OperandExtendedAdd => AR is NW)

     const_cast<SCEVAddRecExpr *>(AR)->setNoWrapFlags(SCEV::FlagNW);

     // Return the expression with the addrec on the outside.
     return getAddRecExpr(getSignExtendAddRecStart(AR, Ty, this),
                          getZeroExtendExpr(Step, Ty),
                          L, AR->getNoWrapFlags());
   }

Differential Revision: http://reviews.llvm.org/D7640

llvm-svn: 229594
2015-02-18 00:43:19 +00:00
Rafael Espindola 2d7ec9a860 Twines should be passed by const ref.
llvm-svn: 229590
2015-02-17 23:44:22 +00:00
Andrea Di Biagio e7b58ee555 [X86][FastIsel] Teach how to select scalar integer to float/double conversions.
This patch teaches fast-isel how to select a (V)CVTSI2SSrr for an integer to 
float conversion, and how to select a (V)CVTSI2SDrr for an integer to double
conversion.

Added test 'fast-isel-int-float-conversion.ll'.

Differential Revision: http://reviews.llvm.org/D7698

llvm-svn: 229589
2015-02-17 23:40:58 +00:00
Rafael Espindola df19519800 Add r228939 back with a fix.
The problem in the original patch was not switching back to .text after printing
an eh table.

Original message:

On ELF, put PIC jump tables in a non executable section.

Fixes PR22558.

llvm-svn: 229586
2015-02-17 23:34:51 +00:00
Sanjay Patel e951a3839a rename variables again because these tables also deal with stores; NFC
Suggestion by Simon Pilgrim

llvm-svn: 229574
2015-02-17 22:38:06 +00:00
Duncan P. N. Exon Smith a55dcaf427 IR: fieldIsMDNode() should be false for MDString
Simplify the code.  It has been a while since the schema has been so
"flexible".

llvm-svn: 229573
2015-02-17 22:34:15 +00:00
Duncan P. N. Exon Smith 57bab0bc97 AsmPrinter: Take range in DwarfExpression::AddExpression(), NFC
Previously `DwarfExpression::AddExpression()` relied on
default-constructing the end iterators for `DIExpression` -- once the
operands are represented explicitly via `MDExpression` (instead of via
the strange `StringRef` navigator in `DIHeaderIterator`) this won't
work.  Explicitly take an iterator for the end of the range.

llvm-svn: 229572
2015-02-17 22:30:56 +00:00
Simon Pilgrim 1d89a02abb [X86][SSE] Generalised unpckl/unpckh shuffle matching
Added commuted unpckl/unpckh shuffle matching patterns as many cases containing undefined lanes fail to commute by themselves.

Differential Revision: http://reviews.llvm.org/D7564

llvm-svn: 229571
2015-02-17 22:24:32 +00:00
Sanjay Patel 1a20fdf36f Add comment to explain a non-obvious setting; NFC.
This is paraphrased from Simon Pilgrim's comment in:
http://reviews.llvm.org/D7492

llvm-svn: 229566
2015-02-17 22:09:54 +00:00
Sanjay Patel 203ee500e9 remove function names from comments; NFC
llvm-svn: 229558
2015-02-17 21:55:20 +00:00
Sanjay Patel 52f9f7c0f3 replace meaningless variable names; NFCI
llvm-svn: 229549
2015-02-17 21:37:28 +00:00
Rafael Espindola 68fa249cb5 Add r228980 back.
Add support for having multiple sections with the same name and comdat.

Using this in combination with -ffunction-sections allows LLVM to output a .o
file with mulitple sections named .text. This saves space by avoiding long
unique names of the form .text.<C++ mangled name>.

llvm-svn: 229541
2015-02-17 20:48:01 +00:00
Rafael Espindola 7fe7e05379 Add r228889 back.
Original message:
Invert the section relocation map.

It now points from rel section to section. Use it to set sh_info, avoiding
a brittle name lookup.

llvm-svn: 229539
2015-02-17 20:40:59 +00:00
Rafael Espindola 3a7c0eb32e Add r228888 back.
Original message:

Use the existing SymbolTableIndex instead of doing a lookup. NFC.

llvm-svn: 229538
2015-02-17 20:37:50 +00:00
Rafael Espindola ead8549cab Add r228886 back now that r229530 fixed the issue lldb was hitting.
Original message:

Create the Seciton -> Rel Section map when it is first needed. NFC.

Saves a walk over every section.

llvm-svn: 229536
2015-02-17 20:31:13 +00:00
Tom Stellard 7b3aa88ac1 R600/SI: Fix asam errors in SIFoldOperands
We were trying to fold into implicit uses, which led to out of bounds
access of the MCInstrDesc::OpInfo arrray.

llvm-svn: 229533
2015-02-17 20:11:54 +00:00
Sanjay Patel b811c1d6a5 prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. 
That's what the hardware packed logical FP instructions define: 128-bit memory operands.
There are no scalar versions of these instructions...because this is x86.

Generating the wrong code (folding a scalar load into a 128-bit load) is still possible
using the peephole optimization pass and the load folding tables. We won't completely
solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other
places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl()
to make sure it isn't changing the size of the load.

Differential Revision: http://reviews.llvm.org/D7474

llvm-svn: 229531
2015-02-17 20:08:21 +00:00
Rafael Espindola 127b6c3ba7 Don't deference the section_end() iterator.
Hard to test given the undefined behavior nature.

llvm-svn: 229530
2015-02-17 20:07:28 +00:00
Eric Christopher a49d68e078 Make the ARM AsmPrinter independent of global subtarget
initialization. Initialize the subtarget once per function and
migrate Emit{Start|End}OfAsmFile to either use attributes on the
TargetMachine or get information from the subtarget we'd use
for assembling. One bit (getISAEncoding) touched the general
AsmPrinter and the debug output. Handle this one by passing
the function for the subprogram down and updating all callers
and users.

The top-level-ness of the ARM attribute output for assembly is,
by nature, contrary to how we'd want to do this for an LTO
situation where we have multiple cpu architectures so this
solution is good enough for now.

llvm-svn: 229528
2015-02-17 20:02:32 +00:00
Eric Christopher ffc5ff32d1 80-column fixups.
llvm-svn: 229527
2015-02-17 20:02:28 +00:00
Adrian Prantl ea7f1c2d19 DIBuilder: add trackIfUnresolved() to all nodes that may be cyclic.
Tested in clang/test/CodeGenObjCCXX/debug-info-cyclic.mm

rdar://problem/19839612

llvm-svn: 229521
2015-02-17 19:17:39 +00:00
Simon Atanasyan 1d902b7cc7 [Object] Support reading 64-bit MIPS ELF archives
The 64-bit MIPS ELF archive file format is used by MIPS64 targets.
The main difference from a regular archive file is the symbol table format:
1. ar_name is equal to "/SYM64/"
2. number of symbols and offsets are 64-bit integers

http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf
Page 96

The patch allows reading of such archive files by llvm-nm, llvm-objdump
and other tools. But it does not support archive files with number of symbols
and/or offsets exceed 2^32. I think it is a rather rare case requires more
significant modification of `Archive` class code.

http://reviews.llvm.org/D7546

llvm-svn: 229520
2015-02-17 18:54:22 +00:00
Sanjay Patel ab7e86e5be Canonicalize splats as build_vectors (PR22283)
This is a follow-on patch to:
http://reviews.llvm.org/D7093

That patch canonicalized constant splats as build_vectors, 
and this patch removes the constant check so we can canonicalize
all splats as build_vectors.

This fixes the 2nd test case in PR22283:
http://llvm.org/bugs/show_bug.cgi?id=22283

The unfortunate code duplication between SelectionDAG and DAGCombiner
is discussed in the earlier patch review. At least this patch is just
removing code...

This improves an existing x86 AVX test and changes codegen in an ARM test.

Differential Revision: http://reviews.llvm.org/D7389

llvm-svn: 229511
2015-02-17 16:54:32 +00:00
Tom Stellard bc3776803b R600/SI: Extend private extload pattern to include zext loads
llvm-svn: 229507
2015-02-17 16:36:00 +00:00
Benjamin Kramer 6cd780ff21 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229500
2015-02-17 15:29:18 +00:00
Elena Demikhovsky ef035bb974 Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

llvm-svn: 229495
2015-02-17 13:10:05 +00:00
NAKAMURA Takumi cd5a3673c3 OrcJIT: Appease msc18 not to be confused on executeCompileCallback<OrcX86_64>.
llvm-svn: 229494
2015-02-17 12:53:16 +00:00
NAKAMURA Takumi 3e087357ce Reformat.
llvm-svn: 229493
2015-02-17 12:53:05 +00:00
Andrea Di Biagio eb97f92489 [X86] Silence -Wsign-compare warnings.
GCC 4.8 reported two new warnings due to comparisons
between signed and unsigned integer expressions. The new warnings were
accidentally introduced by revision 229480.
Added explicit casts to silence the warnings. No functional change intended.

llvm-svn: 229488
2015-02-17 11:20:11 +00:00
Justin Bogner 5b3ad88646 Revert "InstrProf: Add unit tests for the profile reader and writer"
This added API to the InstrProfWriter to write to a string so I could
write unittests without using temp files. This doesn't really work,
since the format has tighter alignment requirements than a char.

This reverts r229478 and its follow-up, r229481.

llvm-svn: 229483
2015-02-17 09:21:43 +00:00
Elena Demikhovsky ba84672519 AVX-512: changes in intel_ocl_bi calling conventions
- added mask types v8i1 and v16i1 to possible function parameters
- enabled passing 512-bit vectors in standard CC
- added a test for KNL intel_ocl_bi conventions

llvm-svn: 229482
2015-02-17 09:20:12 +00:00
Michael Kuperstein ff5acaf50c [X86] Combine vector anyext + and into a vector zext
Vector zext tends to get legalized into a vector anyext, represented as a vector shuffle with an undef vector + a bitcast, that gets ANDed with a mask that zeroes the undef elements.
Combine this into an explicit shuffle with a zero vector instead. This allows shuffle lowering to match it as a zext, instead of matching it as an anyext and emitting an explicit AND.
This combine only covers a subset of the cases, but it's a start.

Differential Revision: http://reviews.llvm.org/D7666

llvm-svn: 229480
2015-02-17 08:22:51 +00:00
Justin Bogner 218d0689a9 Re-apply "InstrProf: Add unit tests for the profile reader and writer"
Add these tests again, but use va_list instead of initializer lists.

This reverts r229456, reapplying r229455.

llvm-svn: 229478
2015-02-17 07:50:59 +00:00
Eric Christopher 5c0e009d3a Make the PowerPC AsmPrinter independent of global subtarget
initialization. Initialize the subtarget once per function and
migrate EmitStartOfAsmFile to either use attributes on the
TargetMachine or get information from all of the various
subtargets.

llvm-svn: 229475
2015-02-17 07:21:21 +00:00
Eric Christopher 75dc3904a5 Add a FIXME to move IsLittleEndian to the target machine.
llvm-svn: 229472
2015-02-17 06:45:17 +00:00
Eric Christopher fee6aaf683 Move ABI handling and 64-bitness to the PowerPC target machine.
This required changing how the computation of the ABI is handled
and how some of the checks for ABI/target are done.

llvm-svn: 229471
2015-02-17 06:45:15 +00:00
Duncan P. N. Exon Smith 752d6df22d AsmPrinter: Use DIExpression default constructor, NFC
llvm-svn: 229464
2015-02-17 02:42:45 +00:00
Chandler Carruth 55db07016e [x86] Teach the unpack lowering to try wider element unpacks.
This allows it to match still more places where previously we would have
to fall back on floating point shuffles or other more complex lowering
strategies.

I'm hoping to replace some of the hand-rolled unpack matching with this
routine is it gets more and more clever.

llvm-svn: 229463
2015-02-17 02:12:24 +00:00
Hal Finkel 2bb61ba2fe [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and removes
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

llvm-svn: 229462
2015-02-17 01:36:59 +00:00
Lang Hames 2754714fb9 [Orc] Update the Orc indirection utils and refactor the CompileOnDemand layer.
This patch replaces most of the Orc indirection utils API with a new class:
JITCompileCallbackManager, which creates and manages JIT callbacks.
Exposing this functionality directly allows the user to create callbacks that
are associated with user supplied compilation actions. For example, you can
create a callback to lazyily IR-gen something from an AST. (A kaleidoscope
example demonstrating this will be committed shortly).

This patch also refactors the CompileOnDemand layer to use the
JITCompileCallbackManager API.

llvm-svn: 229461
2015-02-17 01:18:38 +00:00
Duncan P. N. Exon Smith b474937929 AsmPrinter: Stop creating DebugLocs
While looking at a heap profile of a clang LTO bootstrap with -g, I
noticed that 2.2% of memory in an `llvm-lto` of clang is from calling
`DebugLoc::get()` in `collectVariableInfo()` (accounting for ~40% of
memory used for `MDLocation`s).

I suspect this was introduced by r226736, whose goal was to prevent
uniquing of `DebugLoc`s (goal achieved, if so).

There's no reason we need a `DebugLoc` here at all -- it was just being
used for (in)convenient API -- so the fix is to pass the scope and
inlined-at directly to `LexicalScopes::findInlinedScope()`.

llvm-svn: 229459
2015-02-17 00:02:27 +00:00
Hal Finkel 5cedafb8cd [PowerPC] Support non-direct-sub/superclass VSX copies
Our register allocation has become better recently, it seems, and is now
starting to generate cross-block copies into inflated register classes. These
copies are not transformed into subregister insertions/extractions by the
PPCVSXCopy class, and so need to be handled directly by
PPCInstrInfo::copyPhysReg. The code to do this was *almost* there, but not
quite (it was unnecessarily restricting itself to only the direct
sub/super-register-class case (not copying between, for example, something in
VRRC and the lower-half of VSRC which are super-registers of F8RC).

Triggering this behavior manually is difficult; I'm including two
bugpoint-reduced test cases from the test suite.

llvm-svn: 229457
2015-02-16 23:46:30 +00:00
Justin Bogner fcb2de694a Revert "InstrProf: Add unit tests for the profile reader and writer"
Looks like the bots don't like my initializer lists.

This reverts r229455

llvm-svn: 229456
2015-02-16 23:31:07 +00:00
Justin Bogner f83e895fa7 InstrProf: Add unit tests for the profile reader and writer
This required some minor API to be added to these types to avoid
needing temp files.

Also, I've used initializer lists in the tests, as MSVC 2013 claims to
support them. I'll redo this without them if the bots complain.

llvm-svn: 229455
2015-02-16 23:27:48 +00:00
Simon Atanasyan 79ba8407d2 [Mips] Add .MIPS.options section descriptor kinds enumeration
No functional changes.

llvm-svn: 229452
2015-02-16 22:59:29 +00:00
Ahmed Bougacha bf2b90e92d [ARM] Remove unused declaration. NFC.
GlobalMerge was moved to lib/CodeGen a while ago, and is no longer
called "ARMGlobalMerge".

llvm-svn: 229448
2015-02-16 22:30:08 +00:00
Cameron McInally c5764cbe4e [AVX512] Make 512b vector floating point rounds legal on AVX512.
llvm-svn: 229445
2015-02-16 22:15:42 +00:00
Matthias Braun 15635c5f85 RegisterCoalescer: Don't rematerialize subregister definitions.
We cannot simply rematerialize instructions which only defining a
subregister, as the final value also depends on the previous
instructions.

This fixes test/CodeGen/R600/subreg-coalescer-bug.ll with subreg
liveness enabled.

llvm-svn: 229444
2015-02-16 22:05:17 +00:00
Matthias Braun 1b901a4435 RegisterCoalescer: Do not look for regclass of IMPLICIT_DEF.
IMPLICIT_DEF is a generic instruction and has no (fixed) output register
class defined. The rematerialization code of the register coalescer
should not scan the instruction description for a register class.

This fixes a problem showing up in
test/CodeGen/R600/subreg-coalescer-crash.ll with subregister liveness
enabled.

llvm-svn: 229443
2015-02-16 22:05:12 +00:00
Simon Pilgrim b2c00f3286 [X86][SSE] Add SSE MOVQ instructions to SSEPackedInt domain
Patch to explicitly add the SSE MOVQ (rr,mr,rm) instructions to SSEPackedInt domain - prevents a number of costly domain switches.

Differential Revision: http://reviews.llvm.org/D7600

llvm-svn: 229439
2015-02-16 21:50:56 +00:00
Mehdi Amini 3e0023b8f6 SelectionDAG: fold (fp_to_u/sint (s/uint_to_fp)) here too
Update SPARC tests to match.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229438
2015-02-16 21:47:58 +00:00
Mehdi Amini b9a0fa4822 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229437
2015-02-16 21:47:54 +00:00
Justin Bogner ab89ed7dd5 InstrProf: Use ErrorOr for IndexedInstrProfReader::create (NFC)
The other InstrProfReader::create factories were updated to return
ErrorOr in r221120, and it's odd for these APIs not to match.

llvm-svn: 229433
2015-02-16 21:28:58 +00:00
Craig Topper 49df44e2e2 [X86] Remove the multiply by 8 that goes into the shift constant for X86ISD::VSHLDQ and X86ISD::VSRLDQ. This simplifies the pattern matching in isel and allows these nodes to become the patterns embedded in the instruction.
llvm-svn: 229431
2015-02-16 20:52:07 +00:00
Craig Topper 44026efa88 [X86] Remove x86.avx2.psll.dq.bs and x86.avx2.psrl.dq.bs intrinsics.
llvm-svn: 229430
2015-02-16 20:51:59 +00:00
Matthias Braun d6b108e445 ARM: Transfer kill flag when lowering VSTMQIA to VSTMDIA.
llvm-svn: 229425
2015-02-16 19:34:30 +00:00
Matthias Braun 2eab0e30e1 RegisterCoalescer: Improve previous fix for wrong def after.
The previous fix in r225503 was needlessly complicated. The problem goes
away as well if the arguments to MergeValueNumberInto are supplied in the
correct order.
This was previously missed because the existing code already had the
wrong order but an additional later Merge was hiding the bug for the
main liverange VNI.

llvm-svn: 229424
2015-02-16 19:34:27 +00:00
Aaron Ballman 97a59fb464 MSVC 2013 does not ICE on this code in the same fashion that MSVC 2012 did; NFC.
llvm-svn: 229422
2015-02-16 19:33:36 +00:00
Duncan P. N. Exon Smith 060ee625b8 Bitcode: Fix major regression: large files w/ debug info
The metadata/value split introduced a major regression reading large
bitcode files that contain debug info (or other cyclic (non-self
reference) metadata graphs).  For the first time in a while, I dropped
from libLTO.dylib down to `llvm-lto` with a non-trivial bitcode file
(~350MB), and I hit this when reading the result of ld64's `-save-temps`
in `llvm-lto`.

Here's pseudo-code for what was going on:

    read-main-metadata-block:
      for each md:
        if has-fwd-ref: // Only true for cyclic graphs.
          any-fwd-refs <- true
      if any-fwd-refs:
        foreach md:
          resolve-cycles(md) // Handle cycles.

    foreach function:
      read-function-metadata-block: // Such as !alias, !loop
        if any-fwd-refs:
          foreach md: // (all metadata, not just this block)
            resolve-cycles(md) // A no-op, but the loop is expensive!!

This commit resets the `AnyFwdRefs` flag to `false`.  This on its own
was enough to change my Release+Asserts `llvm-lto` time for reading this
bitcode from over 20 minutes (I gave up on it) to 20 seconds.  I've gone
further by tracking the min/max metadata forward-references in a
metadata block.  This protects against a schema that has lots of
functions that each reference their own metadata cycle.

Unfortunately, this regression is in the 3.6 branch as well.

llvm-svn: 229421
2015-02-16 19:18:01 +00:00
David Majnemer 8b77454dff ConstantFold: Properly fold GEP indices wider than i64
llvm-svn: 229420
2015-02-16 19:10:02 +00:00
James Molloy 83570247f1 Run LICM as part of the cleanup phase from the scalar optimizer.
Things like LoopUnrolling can produce loop invariant values - make sure
we pick them up.

llvm-svn: 229419
2015-02-16 18:59:54 +00:00
Aaron Ballman da9501b25c We require MSVC 1800 as our minimum, so these checks can safely go away; NFC. (It seems this code has been copy/pasted around, unfortunately.)
llvm-svn: 229417
2015-02-16 18:34:57 +00:00
Aaron Ballman b664e2a24b We require MSVC 1800 as our minimum, so these checks can safely go away; NFC.
llvm-svn: 229415
2015-02-16 18:23:00 +00:00
Aaron Ballman f17583ee9c MSVC 2013 supports std::forward_as_tuple, while MSVC 2012 did not; so we can move to using the improved API.
llvm-svn: 229414
2015-02-16 18:21:19 +00:00
Andrew Trick 05938a5481 AArch64: Safely handle the incoming sret call argument.
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.

Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.

Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering

llvm-svn: 229413
2015-02-16 18:10:47 +00:00
Hal Finkel c64150b8f3 [ADCE] Don't indent inside an anonymous namespace
To be consistent with what clang-format does, don't add extra indentation
inside an anonymous namespace. NFC.

llvm-svn: 229412
2015-02-16 18:08:00 +00:00
James Molloy e32d806b5f [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

llvm-svn: 229406
2015-02-16 17:02:00 +00:00
James Molloy 4c7deb2259 [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

llvm-svn: 229405
2015-02-16 17:01:52 +00:00
Evgeniy Stepanov 292acab847 [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.

llvm-svn: 229397
2015-02-16 14:49:37 +00:00
Chandler Carruth 1e57e2deb8 [x86] Add a generic unpack-targeted lowering technique. This can be used
to generically lower blends and is particularly nice because it is
available frome SSE2 onward. This removes a lot of the remaining domain
crossing blends in SSE2 code.

I'm hoping to replace some of the "interleaved" lowering hacks with
something closer to this which should be more principled. First, this
needs to learn how to detect and use other interleavings besides that of
the natural type provided. That will be a follow-up patch though.

llvm-svn: 229378
2015-02-16 12:28:18 +00:00
Michael Kuperstein fc3e626752 Fix quoting of #pragma comment for MS compat, LLVM part.
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.

Differential Revision: http://reviews.llvm.org/D7652

llvm-svn: 229375
2015-02-16 11:57:17 +00:00
Chandler Carruth c802085b3a [x86] Add initial basic support for forming blends of v16i8 vectors.
This blend instruction is ... really lame. The register usage is insane.
As a consequence this is probably only *barely* better than 2 pshufbs
followed by a por, and that mostly because it only has to read from
a single memory location.

However, this doesn't fix as much as I kind of expected, so more to go.
Pretty sure that the ordering and delegation of v16i8 is just really,
really bad.

llvm-svn: 229373
2015-02-16 10:58:23 +00:00
Chandler Carruth e63bbd97a7 [x86] Switch my usage of VariadicFunction to a "normal" variadic
template now that we can use them.

This is, of course, horribly ugly because of the required recursive
formulation. Suggestions for making it less ugly welcome.

llvm-svn: 229367
2015-02-16 09:59:48 +00:00
David Majnemer 8b576a579a IR: SrcTy == DstTy doesn't imply that a cast is valid
Cast validity depends on the cast's kind, not just its types.

llvm-svn: 229366
2015-02-16 09:37:35 +00:00
David Majnemer 7ccc34dbc1 AsmParser: extractvalue requires at least one index operand
llvm-svn: 229365
2015-02-16 09:18:13 +00:00
David Majnemer 49b3d9bc84 AsmParser: Make sure GlobalVariables have sane types
llvm-svn: 229364
2015-02-16 08:41:08 +00:00
David Majnemer a3b0eb2f7f AsmParser: Reject alloca with function type
llvm-svn: 229363
2015-02-16 08:38:03 +00:00
David Majnemer 04b4ed329e Verifier: Diagnose module flags which have null ID operands
llvm-svn: 229361
2015-02-16 08:14:22 +00:00
Craig Topper 7e8dcef094 [X86] Add support for lowering shuffles to 256-bit PALIGNR instruction.
llvm-svn: 229359
2015-02-16 06:29:06 +00:00
David Majnemer e7a9cdbd20 DebugInfo: Don't crash if 'Debug Info Version' has a strange value
llvm-svn: 229356
2015-02-16 06:04:53 +00:00
David Majnemer 4b04292643 DataLayout: Validate that the pref alignment is at least the ABI align
llvm-svn: 229355
2015-02-16 05:41:55 +00:00
David Majnemer 1b9fc3a186 DataLayout: Report when the datalayout type alignment/width is too large
llvm-svn: 229354
2015-02-16 05:41:53 +00:00
David Majnemer 9b529a76e9 IR: Properly return nullptr when getAggregateElement is out-of-bounds
We didn't properly handle the out-of-bounds case for
ConstantAggregateZero and UndefValue.  This would manifest as a crash
when the constant folder was asked to fold a load of a constant global
whose struct type has no operands.

This fixes PR22595.

llvm-svn: 229352
2015-02-16 04:02:09 +00:00
Chandler Carruth 87e580a659 [x86] Teach the 128-bit vector shuffle lowering routines to take
advantage of the existence of a reasonable blend instruction.

The 256-bit vector shuffle lowering has leveraged the general technique
of decomposed shuffles and blends for quite some time, but this never
made it back into the 128-bit code, and there are a large number of
patterns where this is substantially better. For example, this removes
almost all domain crossing in vector shuffles that involve some blend
and some permutation with SSE4.1 and later. See the massive reduction
in 'shufps' for integer test cases in this commit.

This isn't perfect yet for a few reasons:

1) The v8i16 shuffle lowering continues to plague me. We don't always
   form an unpack-based blend when that would be better. But the wins
   pretty drastically outstrip the losses here.
2) The v16i8 shuffle lowering is just a disaster here. I never went and
   implemented blend support here for some terrible reason. I'll do
   that next probably. I've not updated it for now.

More variations on this technique are coming as well -- we don't
shuffle-into-unpack or shuffle-into-palignr, both of which would also be
profitable.

Note that some test cases grow significantly in the number of
instructions, but I expect to actually be faster. We use
pshufd+pshufd+blendw instead of a single shufps, but the pshufd's are
very likely to pipeline well (two ports on most modern intel chips) and
the blend is a *very* fast instruction. The domain switch penalty will
essentially always be more than a blend instruction, which is the only
increase in tree height.

llvm-svn: 229350
2015-02-16 01:52:02 +00:00
Filipe Cabecinhas ecf8f7f49b [Bitcode reader] Fix a few assertions when reading invalid files
Summary:
When creating {insert,extract}value instructions from a BitcodeReader, we
weren't verifying the fields were valid.

Bugs found with afl-fuzz

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7325

llvm-svn: 229345
2015-02-16 00:03:11 +00:00
Lang Hames da9d387929 [ExecutionEngine] Fix dependence issue by moving RTDyldMemoryManager into
RuntimeDyld.

This should fix http://llvm.org/PR22593.

llvm-svn: 229343
2015-02-15 23:22:43 +00:00
Aaron Ballman f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
Benjamin Kramer af09f22c4b Format: Modernize using variadic templates.
Introduces a subset of C++14 integer sequences in STLExtras. This is
just enough to support unpacking a std::tuple into the arguments of
snprintf, we can add more of it when it's actually needed.

Also removes an ancient macro hack that leaks a macro into the global
namespace. Clean up users that made use of the convenient hack.

llvm-svn: 229337
2015-02-15 22:15:41 +00:00
Aaron Ballman b46962fe5d Removing LLVM_EXPLICIT, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229335
2015-02-15 22:00:20 +00:00
Zachary Turner c0acf6837b llvm-pdbdump: Add flags controlling the type of values to dump.
llvm-svn: 229330
2015-02-15 20:27:53 +00:00
Philip Reames 090a8242c3 Revert 229175
This change is a logical suspect in 22587 and 22590.  Given it's of minimal importanance and I can't get clang to build on my home machine, I'm reverting so that I can deal with this next week.

llvm-svn: 229322
2015-02-15 19:07:31 +00:00
Hal Finkel 8626ed2eae [ADCE] Convert another loop for a range-based for
We can use a range-based for for the operands loop too; NFC.

llvm-svn: 229319
2015-02-15 15:51:25 +00:00
Hal Finkel 92fb2d3803 [ADCE] Use inst_range and range-based fors
Convert a few loops to range-based fors; NFC.

llvm-svn: 229318
2015-02-15 15:51:23 +00:00
Hal Finkel c6035cff55 [ADCE] Fix formatting of pointer types
We prefer to put the * with the variable, not with the type; NFC.

llvm-svn: 229317
2015-02-15 15:47:52 +00:00
Hal Finkel 234d8fea7b [ADCE] Fix capitalization of another local variable
Bring another local variable in compliance with our naming conventions, NFC.

llvm-svn: 229316
2015-02-15 15:45:30 +00:00
Hal Finkel 75901293a1 [ADCE] Fix capitalization of some local variables
Bring some local variables in compliance with our naming conventions, NFC.

llvm-svn: 229315
2015-02-15 15:45:28 +00:00
Simon Pilgrim 2a7bedb73e Coding style fixes to recent patches. NFC.
llvm-svn: 229312
2015-02-15 14:19:29 +00:00
Simon Pilgrim 00bd79d794 [X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2
This patch refactors the existing lowerVectorShuffleAsByteShift function to add support for 256-bit vectors on AVX2 targets.

It also fixes a tablegen issue that prevented the lowering of vpslldq/vpsrldq vec256 instructions.

Differential Revision: http://reviews.llvm.org/D7596

llvm-svn: 229311
2015-02-15 13:19:52 +00:00
Chandler Carruth bf0fb06e0d [x86] Teach the decomposed shuffle/blend lowering to use an early blend
when that will allow it to lower with a single permute instead of
multiple permutes.

It tries to detect when it will only have to do a single permute in
either case to maximize folding of loads and such.

This cuts a *lot* of the avx2 shuffle permute counts in half. =]

llvm-svn: 229309
2015-02-15 12:42:15 +00:00
Chandler Carruth 1b5285dd57 [SDAG] Teach the SelectionDAG to canonicalize vector shuffles of splats
directly into blends of the splats.

These patterns show up even very late in the vector shuffle lowering
where we don't have any chance for DAG combining to kick in, and
blending is a tremendously simpler operation to model. By coercing the
shuffle into a blend we can much more easily match and lower shuffles of
splats.

Immediately with this change there are significantly more blends being
matched in the x86 vector shuffle lowering.

llvm-svn: 229308
2015-02-15 12:18:12 +00:00
Chandler Carruth 75d9a97569 [x86] Teach the shuffle mask equivalence test to look through build
vectors and detect equivalent inputs.

This lets the code match unpck-style instructions when only one of the
inputs are lined up but the other input is a splat and so which lanes we
pull from doesn't matter. Today, this doesn't really happen, but just by
accident. I have a patch that normalizes how we shuffle splats, and with
that patch this will be necessary for a lot of the mask equivalence
tests to work.

I don't really know how to write a test case for this specific change
until the other change lands though.

llvm-svn: 229307
2015-02-15 12:07:55 +00:00
Chandler Carruth 4fe214b1f2 [x86] Tweak the ordering of unpack matching vs. element insertion, and
don't try to do element insertion for non-zero-index floating point
vectors.

We don't have any useful patterns or lowering for element insertion into
high elements of a floating point vector, and the generic shuffle
lowering will end up being better -- namely it will fall back to unpck.
But we should try to handle other forms of element insertion before
matching unpck patterns.

While this doesn't matter much right now, I'm working on a patch that
makes unpck matching much more powerful, and that patch will break
without this re-ordering.

llvm-svn: 229306
2015-02-15 12:01:14 +00:00
Chandler Carruth 56e0ceda0d [x86] Stop shuffling zero vectors. =]
I was somewhat surprised this pattern really came up, but it does. It
seems better to just directly handle it than try to special case every
place where we end up forming a shuffle that devolves to a shuffle of
a zero vector.

llvm-svn: 229301
2015-02-15 10:34:52 +00:00
Chandler Carruth 3d272daaed [x86] Use a more helpful parenthesizing of these comparisons. Silences
a -Wparentheses complaint from GCC.

llvm-svn: 229300
2015-02-15 10:15:20 +00:00
Chandler Carruth 62558c1d4d [x86] When splitting 256-bit vectors into 128-bit vectors, don't extract
subvectors from buildvectors. That doesn't really make any sense and it
breaks all of the down-stream matching of buildvectors to cleverly lower
shuffles.

With this, we now get the shift-based lowering of 256-bit vector
shuffles with AVX1 when we split them into 128-bit vectors. We also do
much better on the zero-extension patterns, although there remains quite
a bit of room for improvement here.

llvm-svn: 229299
2015-02-15 10:12:02 +00:00
Chandler Carruth a6f8a3661c [x86] Make computing the zeroable elements slightly more powerful, at
least in theory.

I don't actually have a test case that benefits from this, but
theoretically, it could come up, and I don't want to try to think about
whether this is the culprit or something else is, so I'd rather just
make this code powerful. =/ Makes me sad that I can't really test it
though.

llvm-svn: 229298
2015-02-15 09:33:36 +00:00
Chandler Carruth 0ddfe0c7c5 [x86] Add a slight variation on some of the other generic shuffle
lowerings -- one which decomposes into an initial blend followed by
a permute.

Particularly on newer chips, blends are handled independently of
shuffles and so this is much less bottlenecked on the single port that
floating point shuffles are executed with on Intel.

I'll be adding this lowering to a bunch of other code paths in
subsequent commits to handle still more places where we can effectively
leverage blends when they're available in the ISA.

llvm-svn: 229292
2015-02-15 08:26:30 +00:00
Elena Demikhovsky 6f5a859633 Enabled cost calculation for masked memory operations.
We already have implementation for cost calculation for
masked memory operations. I just call it from the loop vectorizer.

llvm-svn: 229290
2015-02-15 08:08:48 +00:00
Craig Topper 78c424dfca [X86] Add assembly parser support for mnemonic aliases for AVX-512 vpcmp instructions.
llvm-svn: 229287
2015-02-15 07:13:48 +00:00
Chandler Carruth 499d7332c5 [x86] Fix PR22377, a regression with the new vector shuffle legality
test.

This was just a matter of the DAG combine for vector shuffles being too
aggressive. This is a bit of a grey area, but I think generally if we
can re-use intermediate shuffles, we should. Certainly, given the test
cases I have available, this seems like the right call.

llvm-svn: 229285
2015-02-15 07:01:10 +00:00
Craig Topper f02ad93270 [X86] Add assembler predicates for the rest of the AVX512 feature flags. This makes the assembly matching consistent across all AVX512 instructions. Without this we were allowing some AVX512 instructions to be parsed always, but not the foundation instructions.
llvm-svn: 229280
2015-02-15 04:54:55 +00:00
Craig Topper a3776de242 [X86] Add the remaining 11 possible exact ModRM formats. This makes their encodings linear which can then be used to simplify some other code.
llvm-svn: 229279
2015-02-15 04:16:44 +00:00
Simon Pilgrim 31457d54f7 [X86][XOP] Enable commutation for XOP instructions
Patch to allow XOP instructions (integer comparison and integer multiply-add) to be commuted. The comparison instructions sometimes require the compare mode to be flipped but the remaining instructions can use default commutation modes.

This patch also sets the SSE domains of all the XOP instructions.

Differential Revision: http://reviews.llvm.org/D7646

llvm-svn: 229267
2015-02-14 22:40:46 +00:00
Craig Topper 43860838dc [X86] Improve parsing support AVX/SSE floating point compare instruction mnemonic aliases. They'll now print with the alias the parser received instead of converting to the explicit immediate form.
llvm-svn: 229266
2015-02-14 21:54:03 +00:00
Ramkumar Ramachandra 8fcb498a9a InstCombine: propagate deref via new addDereferenceableAttr
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.

Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.

Differential Revision: http://reviews.llvm.org/D7510

llvm-svn: 229265
2015-02-14 19:37:54 +00:00
Duncan P. N. Exon Smith 025c0ad74c Target: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229261
2015-02-14 15:36:52 +00:00
Duncan P. N. Exon Smith b5054333ec NVPTX: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229260
2015-02-14 15:35:43 +00:00
Andrea Di Biagio f54432388f [optnone] Skip pass Constant Hoisting on optnone functions.
Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that
pass Constant Hoisting is not run on optnone functions.

llvm-svn: 229258
2015-02-14 15:11:48 +00:00
Simon Pilgrim b0bac23fcd Line ending fix. NFC.
llvm-svn: 229256
2015-02-14 13:27:53 +00:00
Chandler Carruth 003ed332bf Remove a variable only used in an assert and sink its initializer into
the assert. Fixes -Wunused-variable on non-asserts builds.

llvm-svn: 229250
2015-02-14 09:14:44 +00:00
Matt Arsenault 0bbcd8ba2f R600/SI: Implement correct f64 fdiv
This version passes the OpenCL conformance test.

llvm-svn: 229239
2015-02-14 04:30:08 +00:00
Matt Arsenault 044f1d19cf R600/SI: Use complex operand folding for div_scale
llvm-svn: 229238
2015-02-14 04:24:28 +00:00
Matt Arsenault 1bc9d95047 R600/SI: Fix implicit vcc operand to v_div_fmas_*
This should allow finally fixing the f64 fdiv implementation.

Test is disabled for VI since there seems to be a problem with one
of the buffer load instructions on it.

llvm-svn: 229236
2015-02-14 04:22:00 +00:00
Matt Arsenault 6e26b8d854 R600/SI: Fix schedule model for v_div_scale_{f32|f64}
llvm-svn: 229235
2015-02-14 04:03:18 +00:00
Matt Arsenault 35733e2dec R600/SI: Really fix size of VReg_1
llvm-svn: 229234
2015-02-14 03:54:32 +00:00
Matt Arsenault 1bcc8cba5a R600/SI: Rename encoding field to match docs for VOP3b
llvm-svn: 229233
2015-02-14 03:54:29 +00:00
Zachary Turner 26ebe3fbcd llvm-pdbdump: Only dump whitelisted global symbols.
Dumping the global scope contains a lot of very uninteresting
things and is generally polluted with a lot of random junk.
Furthermore, it dumps values unsorted, making it hard to read.
This patch dumps known interesting types only, and as a side
effect sorts the list by symbol type.

llvm-svn: 229232
2015-02-14 03:54:28 +00:00
Zachary Turner 52c9f881de llvm-pdbdump: Re-order header files according to LLVM style guide.
llvm-svn: 229231
2015-02-14 03:53:56 +00:00
Matt Arsenault 31ec598a2a R600/SI: Fix not encoding src2 for v_div_scale_{f32|f64}
This apparently got lost in the VI changes.

llvm-svn: 229230
2015-02-14 03:40:35 +00:00
Matt Arsenault 692acf1438 R600/SI: Fix VOP3b encoding on VI
llvm-svn: 229228
2015-02-14 03:02:23 +00:00
Matt Arsenault 95546b46ab R600/SI: Fix phys reg copies in SIFoldOperands
llvm-svn: 229227
2015-02-14 02:55:57 +00:00
Matt Arsenault 9998168982 R600/SI: Fix copies from SGPR to VCC
This shows up without optimizations when vcc is required
to be used.

llvm-svn: 229226
2015-02-14 02:55:56 +00:00
Matt Arsenault 834b1aa806 R600/SI: Add hack to copy from a VGPR to VCC
This hopefully should be fixed when VReg_1 is removed.

llvm-svn: 229225
2015-02-14 02:55:54 +00:00
Duncan P. N. Exon Smith 5bedaf934f PowerPC: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229224
2015-02-14 02:54:07 +00:00
Matt Arsenault f417ff8f2a R600/SI: Fix size of VReg_1
This is really a 32-bit register, if we try to check the size of it,
we want 32-bits.

llvm-svn: 229223
2015-02-14 02:51:44 +00:00
Duncan P. N. Exon Smith 8480c87ce6 R600: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229222
2015-02-14 02:45:45 +00:00
Duncan P. N. Exon Smith 2e75314352 Mips: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229221
2015-02-14 02:37:48 +00:00
Duncan P. N. Exon Smith 2cff9e19a2 ARM: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229220
2015-02-14 02:24:44 +00:00
Duncan P. N. Exon Smith 003bb7d96e AArch64: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229218
2015-02-14 02:09:06 +00:00
Duncan P. N. Exon Smith 5975a703e6 X86: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229214
2015-02-14 01:59:52 +00:00
Duncan P. N. Exon Smith 70eb9c5ae5 CodeGen: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

Also, add `Function::getFnStackAlignment()`, and canonicalize:

getAttributes().getStackAlignment(AttributeSet::FunctionIndex)
  => getFnStackAlignment()

llvm-svn: 229208
2015-02-14 01:44:41 +00:00
Ahmed Bougacha 8f2b4f0be8 [X86] Factor out the CMOV pseudo definitions. NFCI.
llvm-svn: 229206
2015-02-14 01:36:53 +00:00
Matthias Braun 33cc10724d Revert "On ELF, put PIC jump tables in a non executable section."
This reverts commit r228939.

The commit broke something in the output of exception handling tables on
darwin x86-64.

llvm-svn: 229203
2015-02-14 01:16:54 +00:00
Duncan P. N. Exon Smith 2c79ad974c Transforms: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229202
2015-02-14 01:11:29 +00:00
Reid Kleckner 2d5fb68ee0 Unify the two EH personality classification routines I wrote
We only need one.

llvm-svn: 229193
2015-02-14 00:21:02 +00:00
Duncan P. N. Exon Smith b3fc83c403 Analysis: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229192
2015-02-14 00:12:15 +00:00
Eric Christopher b2a5fa98e4 Use the template method to grab the target specific subtarget.
llvm-svn: 229191
2015-02-14 00:09:46 +00:00
Philip Reames 9ae15209ad [InstCombine] When canonicalizing gep indices, prefer zext when possible
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead.  This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?).  We already apply a similar transform in DAGCombine, this just extends that to the IR level.

This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. 

Differential Revision: http://reviews.llvm.org/D7255

llvm-svn: 229189
2015-02-14 00:05:36 +00:00
Chris Bieneman f942c0c89d Fixing broken bots.
llvm-svn: 229176
2015-02-13 23:10:31 +00:00
Philip Reames 66facd6c14 Minor tweak to MDA
Two minor tweaks I noticed when reading through the code:
- No need to recompute begin() on every iteration.  We're not modifying the instructions in this loop.
- We can ignore PHINodes and Dbg intrinsics.  The current code does this anyways, but it will spend slightly more time doing so and will count towards the limit of instructions in the block.  It seems really silly to give up due the presence of PHIs...

Differential Revision: http://reviews.llvm.org/D7624

llvm-svn: 229175
2015-02-13 23:08:37 +00:00
Chris Bieneman 67e426a022 NFC. Moving the RegisteredOptionCategories global into the CommandLineParser class.
llvm-svn: 229172
2015-02-13 22:54:32 +00:00
Chris Bieneman ceaf5f660d NFC. clang-format wants to change this from two lines to one.
llvm-svn: 229171
2015-02-13 22:54:29 +00:00