Commit Graph

106470 Commits

Author SHA1 Message Date
Juergen Ributzka 53533e885a [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

llvm-svn: 214788
2014-08-04 21:49:51 +00:00
Justin Bogner ac021bac4e IR: Fix up doxygen comment for LLVMContext::diagnose
This comment was referring to the DiagnosticSeverity with RS_
prefixes, but they're actually DS_. I've also modernized the comment
style since I was changing it anyway.

llvm-svn: 214787
2014-08-04 21:49:15 +00:00
Chandler Carruth 40dbd382ad [SDAG] Fix a really, really terrible bug in the DAG combiner.
This code is completely wrong. It is also dead, as if it were to *ever*
run, it would crash. Fortunately, after my work to the combiner, it is
at least *possible* to reach the code, and llvm-stress has found a test
case. Thanks to Patrick for reporting.

It would be really good if anyone who remembers how this code works and
what it was intended to do could add some more obvious test coverage
instead of my completely contrived and reduced test case. My test case
was so brittle I left a bread crumb comment in it to help the next
person to stumble on it and not know what it was actually testing for.

llvm-svn: 214785
2014-08-04 21:29:59 +00:00
Joerg Sonnenberger 6c3e38522a tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.
llvm-svn: 214784
2014-08-04 21:28:22 +00:00
Eric Christopher 5f11f5f979 Reorder to keep data and routines separate and to keep a couple of
similar routines close to each other.

llvm-svn: 214782
2014-08-04 21:25:44 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Eric Christopher acc8ef273b Reimplement the temporary non-const getSubtargetImpl routine so
that we can avoid implementing it on every target. Thanks to Richard
Smith for the suggestions!

llvm-svn: 214780
2014-08-04 21:24:07 +00:00
Chad Rosier 5908ab4dd6 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

llvm-svn: 214779
2014-08-04 21:20:25 +00:00
Joerg Sonnenberger 6d05a2b461 MC uses .lcomm now, so adjust.
llvm-svn: 214776
2014-08-04 21:06:00 +00:00
Reid Kleckner e704010450 Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

llvm-svn: 214775
2014-08-04 21:05:27 +00:00
David Blaikie 16409f23f8 Correct the emission kind constants committed in r214771
llvm-svn: 214772
2014-08-04 20:36:00 +00:00
David Blaikie f851712509 Document the "emission kind" field of the DICompileUnit in LLVM's Source Level Debugging metadata.
llvm-svn: 214771
2014-08-04 20:32:48 +00:00
Joerg Sonnenberger 6e842b34a0 Recognize mftbl as alias for mftb, for symmetry with mttb.
llvm-svn: 214769
2014-08-04 20:28:34 +00:00
Joerg Sonnenberger 906ae46d57 Add a sentence that all entries should include an email address.
Add one for Greg Clayton, Peter Collingbourne, Tobias Grosser and
Jakob Olesen based on recent commits.

llvm-svn: 214762
2014-08-04 19:33:25 +00:00
David Blaikie 448c066eea Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
Originally reverted in r213432 with flakey failures on an ASan self-host
build. After reduction it seems to be the same issue fixed in r213805
(ArgPromo + DebugInfo: Handle updating debug info over multiple
applications of argument promotion) and r213952 (by having
LiveDebugVariables strip dbg_value intrinsics in functions that are not
described by debug info). Though I cannot explain why this failure was
flakey...

llvm-svn: 214761
2014-08-04 19:30:08 +00:00
Matt Arsenault fa097f8f3d R600/SI: Fix definitions for ds_read2 / ds_write2 instructions.
These were just wrong, using the wrong register classes
and store2 was missing an operand.

llvm-svn: 214756
2014-08-04 18:49:22 +00:00
Joerg Sonnenberger 5f233fc723 Rename PPCLinuxMCAsmInfo to PPCELFMCAsmInfo to better reflect the
systems it represents.

llvm-svn: 214755
2014-08-04 18:46:13 +00:00
Joerg Sonnenberger b604f82cb8 Allow .lcomm with alignment on ELF targets.
llvm-svn: 214754
2014-08-04 18:45:10 +00:00
Alex Lorenz 1193b5e272 Coverage: add HasCodeBefore flag to a mapping region.
This flag will be used by the coverage tool to help 
compute the execution counts for each line in a source file.

Differential Revision: http://reviews.llvm.org/D4746

llvm-svn: 214740
2014-08-04 18:00:51 +00:00
Eric Christopher 34aaf970e2 Move the R600 intrinsic support back to the target machine - there's
nothing subtarget dependent about the intrinsic support in any
backend as far as I can tell.

llvm-svn: 214738
2014-08-04 17:37:43 +00:00
Justin Bogner 487e764b58 Path: Stop claiming path::const_iterator is bidirectional
path::const_iterator claims that it's a bidirectional iterator, but it
doesn't satisfy all of the contracts for a bidirectional iterator.
For example, n3376 24.2.5 p6 says "If a and b are both dereferenceable,
then a == b if and only if *a and *b are bound to the same object",
but this doesn't work with how we stash and recreate Components.

This means that our use of reverse_iterator on this type is invalid
and leads to many of the valgrind errors we're hitting, as explained
by Tilmann Scheller here:

    http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140728/228654.html

Instead, we admit that path::const_iterator is only an input_iterator,
and implement a second input_iterator for path::reverse_iterator (by
changing const_iterator::operator-- to reverse_iterator::operator++).
All of the uses of this just traverse once over the path in one
direction or the other anyway.

llvm-svn: 214737
2014-08-04 17:36:41 +00:00
Joerg Sonnenberger 5002fb5337 Refactor SPRG instructions.
llvm-svn: 214733
2014-08-04 17:26:15 +00:00
Akira Hatanaka e457f3e17a [X86] Place parentheses around "isMask_32(STReturns) && N <= 2".
This corrects r214672, which was committed to silence a gcc warning.

llvm-svn: 214732
2014-08-04 17:23:38 +00:00
Joerg Sonnenberger 7405210418 Add support for m[ft][di]bat[ul] instructions.
llvm-svn: 214731
2014-08-04 17:07:41 +00:00
Matt Arsenault 329eda3b82 Use the known address space constant rather than checking it
llvm-svn: 214729
2014-08-04 16:55:35 +00:00
Matt Arsenault efb3d5347d R600: Remove unused include
llvm-svn: 214728
2014-08-04 16:55:33 +00:00
Eric Christopher eba9167e7b Add a dummy subtarget to the CPP backend target machine. This will
allow us to forward all of the standard TargetMachine calls to the
subtarget and still return null as we were before.

llvm-svn: 214727
2014-08-04 16:40:55 +00:00
Joerg Sonnenberger 0b2ebcb49d Add features for PPC 4xx and e500/e500mc instructions.
Move the test cases for them into separate files.

llvm-svn: 214724
2014-08-04 15:47:38 +00:00
Ulrich Weigand 983341d3f3 [PowerPC] Add target triple to vec_urem_const.ll test case
This should hopefully fix build bots on other architectures.

llvm-svn: 214721
2014-08-04 14:55:26 +00:00
Robert Khasanov 7ca7df0bf9 [SKX] Enabling load/store instructions: encoding
Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS,

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214719
2014-08-04 14:35:15 +00:00
Ulrich Weigand cc9909b881 [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments.  As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.

This fixes another regression in llvmpipe when vector support is
enabled.

Reviewed by Bill Schmidt.

llvm-svn: 214718
2014-08-04 13:53:40 +00:00
Aaron Ballman c36c6abc45 Improving the name of the function parameter, which happens to solve two likely-less-than-useful MSVC warnings: warning C4258: 'I' : definition from the for loop is ignored; the definition from the enclosing scope is used.
llvm-svn: 214717
2014-08-04 13:51:27 +00:00
Ulrich Weigand 51eccec5d9 [PowerPC] MULHU/MULHS are not legal for vector types
I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU).  But these are not
implemented by the back-end, so we failed to recognize the insn.

Fixed by marking MULHU/MULHS as Expand for vector types.

llvm-svn: 214716
2014-08-04 13:27:12 +00:00
Daniel Sanders a5cb453cd3 Fixed accidental use of reserved identifier in r214709.
llvm-svn: 214715
2014-08-04 13:27:03 +00:00
Ulrich Weigand c4cc7febb0 [PowerPC] Fix and improve vector comparisons
This patch refactors code generation of vector comparisons.

This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.

Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison.  Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).

Reviewed by Bill Schmidt.

llvm-svn: 214714
2014-08-04 13:13:57 +00:00
Daniel Sanders f0df221d76 [mips] Add assembler support for '.set mipsX'.
Summary:
This patch also fixes an issue with the way the Mips assembler enables/disables architecture
features. Before this patch, the assembler never disabled feature bits. For example,
.set mips64
.set mips32r2

would result in the 'OR' of mips64 with mips32r2 feature bits which isn't right.
Unfortunately this isn't trivial to fix because there's not an easy way to clear
feature bits as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits
that imply the feature being cleared and not the implied bits by the feature (there's a
better explanation to the code I added).

Patch by Matheus Almeida and updated by Toma Tabacu

Reviewers: vmedic, matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu, llvm-commits

Differential Revision: http://reviews.llvm.org/D4123

llvm-svn: 214709
2014-08-04 12:20:00 +00:00
NAKAMURA Takumi d587e20cec TargetInstrInfo::genAlternativeCodeSequence(): Fix a couple of \param(s). [-Wdocumentation]
llvm-svn: 214708
2014-08-04 10:23:22 +00:00
Chandler Carruth 0e2ddb2790 [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over clever
use of PACKUS. It's cleaner that way.

I looked at implementing clever combine-based folding of PACKUS chains
into PSHUFB but it is quite hard and doesn't seem likely to be worth it.
The most annoying part would be detecting that the correct masking had
been done to use PACKUS-style instructions as a blend operation rather
than there being any saturating as is indicated by its name. We generate
really nice code for what few test cases I've come up with that aren't
completely contrived for this by just directly prefering PSHUFB and so
let's go with that strategy for now. =]

llvm-svn: 214707
2014-08-04 10:17:35 +00:00
Chandler Carruth 06e6f1cae2 [x86] Implement more aggressive use of PACKUS chains for lowering common
patterns of v16i8 shuffles.

This implements one of the more important FIXMEs for the SSE2 support in
the new shuffle lowering. We now generate the optimal shuffle sequence
for truncate-derived shuffles which show up essentially everywhere.

Unfortunately, this exposes a weakness in other parts of the shuffle
logic -- we can no longer form PSHUFB here. I'll add the necessary
support for that and other things in a subsequent commit.

llvm-svn: 214702
2014-08-04 09:40:02 +00:00
Benjamin Kramer 2abde4f9d7 Update links to the gcc and java documentation that 404'd.
llvm-svn: 214700
2014-08-04 09:26:40 +00:00
Kevin Qin f31ecf3fea Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

llvm-svn: 214697
2014-08-04 05:10:33 +00:00
NAKAMURA Takumi 56bc3419a3 MemoryBuffer: Don't use mmap when FileSize is multiple of 4k on Cygwin.
On Cygwin, getpagesize() returns 64k(AllocationGranularity).

In r214580, the size of X86GenInstrInfo.inc became 1499136.

FIXME: We should reorganize again getPageSize() on Win32.
MapFile allocates address along AllocationGranularity but view is mapped by physical page.

llvm-svn: 214681
2014-08-04 01:43:37 +00:00
Chandler Carruth 37a18821cd [x86] Handle single input shuffles in the SSSE3 case more intelligently.
I spent some time looking into a better or more principled way to handle
this. For example, by detecting arbitrary "unneeded" ORs... But really,
there wasn't any point. We just shouldn't build blatantly wrong code so
late in the pipeline rather than adding more stages and logic later on
to fix it. Avoiding this is just too simple.

llvm-svn: 214680
2014-08-04 01:14:24 +00:00
Chandler Carruth 7bbfd245b0 [x86] Fix the test case added in r214670 and tweaked in r214674 further.
Fundamentally, there isn't a really portable way to test the constant
pool contents. Instead, pin this test to the bare-metal triple. This
also makes it a 64-bit triple which allows us to only match a single
constant pool rather than two. It can also just hard code the '.' prefix
as the format should be stable now that it has a fixed triple. Finally,
I've switched it to use CHECK-NEXT to be more precise in the instruction
sequence expected and to use variables rather than hard coding decisions
by the register allocator.

llvm-svn: 214679
2014-08-04 00:54:28 +00:00
Peter Zotov 454b85606e [OCaml] Add Llvm.{string_of_const,const_element}.
llvm-svn: 214677
2014-08-03 23:54:22 +00:00
Peter Zotov f9aa882ca1 [LLVM-C] Add LLVM{IsConstantString,GetAsString,GetElementAsConstant}.
llvm-svn: 214676
2014-08-03 23:54:16 +00:00
Sanjay Patel 065cabf43e Account for possible leading '.' in label string.
llvm-svn: 214674
2014-08-03 23:20:16 +00:00
Chandler Carruth cde4eb56fe [x86] Don't add nodes to the combined set (and prune subsequent
combines) until they are legal.

Doing it the old way could, when the stars align *just* right, cause
a node to get into the combine set prior to being legalized. Then, when
the same node showed up as an operand to another node later on (but not
so much later on that it had been deleted as dead) we would fail to add
it back to the worklist thinking it had already been combined. This
would in turn cause it to not be legalized. Fortunately, we can also
walk the operands looking for uncombined (and thus potentially
un-legalized) nodes late. It will still ensure that we walk all operands
of all nodes and send all of them through both the legalizer without
changes and the combiner at least once. (Which was the original goal of
this).

I have a test case for this bug, but it is terribly brittle. For
example, it will stop finding the bug the moment I enable the new
shuffle lowering. I don't yet have any test case that reliably exercises
this bug, and it isn't clear that it will be possible to craft one. It
is entirely possible that with the new shuffle lowering the two forms of
doing this are precisely equivalent. That doesn't mean we shouldn't take
the more conservative approach of insisting on things in the combined
set having survived the legalizer.

llvm-svn: 214673
2014-08-03 23:10:59 +00:00
Saleem Abdulrasool 557023e349 X86: silence warning (-Wparentheses)
GCC 4.8.2 points out the ambiguity in evaluation of the assertion condition:

lib/Target/X86/X86FloatingPoint.cpp:949:49: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
   assert(STReturns == 0 || isMask_32(STReturns) && N <= 2);

llvm-svn: 214672
2014-08-03 23:00:39 +00:00
Saleem Abdulrasool befa21532c CodeGen: silence a warning
GCC 4.8.2 objects to the tautological condition in the assert as the unsigned
value is guaranteed to be >= 0.  Simplify the assertion by dropping the
tautological condition.

llvm-svn: 214671
2014-08-03 23:00:38 +00:00
Sanjay Patel 2ef67440fc fix for PR20354 - Miscompile of fabs due to vectorization
This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation.

This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon.

There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too.

llvm-svn: 214670
2014-08-03 22:48:23 +00:00
Gerolf Hoflehner 35ba467122 MachineCombiner Pass for selecting faster instruction
sequence -  AArch64 target support

 This patch turns off madd/msub generation in the DAGCombiner and generates
 them in the MachineCombiner instead. It replaces the original code sequence
 with the combined sequence when it is beneficial to do so.

 When there is no machine model support it always generates the madd/msub
 instruction. This is true also when the objective is to optimize for code
 size: when the combined sequence is shorter is always chosen and does not
 get evaluated.

 When there is a machine model the combined instruction sequence
 is evaluated for critical path and resource length using machine
 trace metrics and the original code sequence is replaced when it is
 determined to be faster.

 rdar://16319955

llvm-svn: 214669
2014-08-03 22:03:40 +00:00
Gerolf Hoflehner 5e1207e54c MachineCombiner Pass for selecting faster instruction
sequence -  target independent framework

 When the DAGcombiner selects instruction sequences
 it could increase the critical path or resource len.

 For example, on arm64 there are multiply-accumulate instructions (madd,
 msub). If e.g. the equivalent  multiply-add sequence is not on the
 crictial path it makes sense to select it instead of  the combined,
 single accumulate instruction (madd/msub). The reason is that the
 conversion from add+mul to the madd could lengthen the critical path
 by the latency of the multiply.

 But the DAGCombiner would always combine and select the madd/msub
 instruction.

 This patch uses machine trace metrics to estimate critical path length
 and resource length of an original instruction sequence vs a combined
 instruction sequence and picks the faster code based on its estimates.

 This patch only commits the target independent framework that evaluates
 and selects code sequences. The machine instruction combiner is turned
 off for all targets and expected to evolve over time by gradually
 handling DAGCombiner pattern in the target specific code.

 This framework lays the groundwork for fixing
 rdar://16319955

llvm-svn: 214666
2014-08-03 21:35:39 +00:00
Saleem Abdulrasool 4544c16eab MC: virtualise EmitWindowsUnwindTables
This makes EmitWindowsUnwindTables a virtual function and lowers the
implementation of the function to the X86WinCOFFStreamer.  This method is a
target specific operation.  This enables making the behaviour target dependent
by isolating it entirely to the target specific streamer.

llvm-svn: 214664
2014-08-03 18:51:26 +00:00
Saleem Abdulrasool b3be7371d5 MC: rename Win64EHFrameInfo to WinEH::FrameInfo
The frame information stored in this structure is driven by the requirements for
Windows NT unwinding rather than Windows 64 specifically.  As a result, this
type can be shared across multiple architectures (ARM, AXP, MIPS, PPC, SH).
Rename this class in preparation for adding support for supporting unwinding
information for Windows on ARM.

Take the opportunity to constify the members as everything except the
ChainedParent is read-only.  This required some adjustment to the label
handling.

llvm-svn: 214663
2014-08-03 18:51:17 +00:00
Matt Arsenault 9215b17eb7 R600/SI: Fix extra whitespace in asm str
This slipped in in r214467, so something like

V_MOV_B32_e32  v0, ... is now printed with 2 spaces
between the instruction name and first operand.

llvm-svn: 214660
2014-08-03 05:27:14 +00:00
Manman Ren 062f58d550 [SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion.
When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.

rdar://17887153

llvm-svn: 214642
2014-08-02 23:41:54 +00:00
Joerg Sonnenberger c03105ba8e tlbia support
llvm-svn: 214640
2014-08-02 20:16:29 +00:00
Joerg Sonnenberger e8a167ce8f mfdcr / mtdcr support
llvm-svn: 214639
2014-08-02 20:00:26 +00:00
Erik Eckstein 26a1bf7d84 fix bug 20513 - Crash in SLP Vectorizer
llvm-svn: 214638
2014-08-02 19:39:42 +00:00
James Molloy 6b999ae682 Update test to use a more modern AArch64 triple, as requested by Renato.
llvm-svn: 214637
2014-08-02 17:15:11 +00:00
Joerg Sonnenberger 99ab590ac9 Don't use additional arguments for dss and friends to satisfy DSS_Form,
when let can do the same thing. Keep the 64bit variants as codegen-only.
While they have a different register class, the encoding is the same for
32bit and 64bit mode. Having both present would otherwise confuse the
disassembler.

llvm-svn: 214636
2014-08-02 15:09:41 +00:00
James Molloy ce45be0465 [AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available.
The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers!

llvm-svn: 214634
2014-08-02 14:51:24 +00:00
Chandler Carruth 16c13cad35 [x86] Remove the FIXME that was implemented in r214628. Managed to
forget to update the comment here... =/

llvm-svn: 214630
2014-08-02 11:34:23 +00:00
Chandler Carruth bec57b406d [x86] Give this test a bare metal triple so it doesn't use the weird
Darwin x86 asm comment prefix designed to work around GAS on that
platform. That makes the comment-matching of the test much more stable.

llvm-svn: 214629
2014-08-02 11:17:41 +00:00
Chandler Carruth 4c57955fe3 [x86] Largely complete the use of PSHUFB in the new vector shuffle
lowering with a small addition to it and adding PSHUFB combining.

There is one obvious place in the new vector shuffle lowering where we
should form PSHUFBs directly: when without them we will unpack a vector
of i8s across two different registers and do a potentially 4-way blend
as i16s only to re-pack them into i8s afterward. This is the crazy
expensive fallback path for i8 shuffles and we can just directly use
pshufb here as it will always be cheaper (the unpack and pack are
two instructions so even a single shuffle between them hits our
three instruction limit for forming PSHUFB).

However, this doesn't generate very good code in many cases, and it
leaves a bunch of common patterns not using PSHUFB. So this patch also
adds support for extracting a shuffle mask from PSHUFB in the X86
lowering code, and uses it to handle PSHUFBs in the recursive shuffle
combining. This allows us to combine through them, combine multiple ones
together, and generally produce sufficiently high quality code.

Extracting the PSHUFB mask is annoyingly complex because it could be
either pre-legalization or post-legalization. At least this doesn't have
to deal with re-materialized constants. =] I've added decode routines to
handle the different patterns that show up at this level and we dispatch
through them as appropriate.

The two primary test cases are updated. For the v16 test case there is
still a lot of room for improvement. Since I was going through it
systematically I left behind a bunch of FIXME lines that I'm hoping to
turn into ALL lines by the end of this.

llvm-svn: 214628
2014-08-02 10:39:15 +00:00
Chandler Carruth d10b29240c [x86] Switch to using the variable we extracted this operand into.
Spotted this missed refactoring by inspection when reading code, and it
doesn't changethe functionality at all.

llvm-svn: 214627
2014-08-02 10:29:36 +00:00
Chandler Carruth 5219d4eff6 [x86] Fix a few typos in my comments spotted in passing.
llvm-svn: 214626
2014-08-02 10:29:34 +00:00
Chandler Carruth 34f9a987e9 [x86] Teach the target shuffle mask extraction to recognize unary forms
of normally binary shuffle instructions like PUNPCKL and MOVLHPS.

This detects cases where a single register is used for both operands
making the shuffle behave in a unary way. We detect this and adjust the
mask to use the unary form which allows the existing DAG combine for
shuffle instructions to actually work at all.

As a consequence, this uncovered a number of obvious bugs in the
existing DAG combine which are fixed. It also now canonicalizes several
shuffles even with the existing lowering. These typically are trying to
match the shuffle to the domain of the input where before we only really
modeled them with the floating point variants. All of the cases which
change to an integer shuffle here have something in the integer domain, so
there are no more or fewer domain crosses here AFAICT. Technically, it
might be better to go from a GPR directly to the floating point domain,
but detecting floating point *outputs* despite integer inputs is a lot
more code and seems unlikely to be worthwhile in practice. If folks are
seeing domain-crossing regressions here though, let me know and I can
hack something up to fix it.

Also as a consequence, a bunch of missed opportunities to form pshufb
now can be formed. Notably, splats of i8s now form pshufb.
Interestingly, this improves the existing splat lowering too. We go from
3 instructions to 1. Yes, we may tie up a register, but it seems very
likely to be worth it, especially if splatting the 0th byte (the
common case) as then we can use a zeroed register as the mask.

llvm-svn: 214625
2014-08-02 10:27:38 +00:00
Chandler Carruth 2ad69eea8d [x86] Teach my pshufb comment printer to handle VPSHUFB forms as well as
PSHUFB forms. This will be important to update some AVX tests when I add
PSHUFB combining.

llvm-svn: 214624
2014-08-02 10:08:17 +00:00
Chandler Carruth 18066974d4 [SDAG] Refactor the code which deletes nodes in the DAG combiner to do
so using a single helper which adds operands back onto the worklist.

Several places didn't rigorously do this but a couple already did.
Factoring them together and doing it rigorously is important to delete
things recursively early on in the combiner and get a chance to see
accurate hasOneUse values. While no existing test cases change, an
upcoming patch to add DAG combining logic for PSHUFB requires this to
work correctly.

llvm-svn: 214623
2014-08-02 10:02:07 +00:00
Owen Anderson 9d5a8c2813 Fix issues with ISD::FNEG and ISD::FMA SDNodes where they would not be constant-folded
during DAGCombine in certain circumstances.  Unfortunately, the circumstances required
to trigger the issue seem to require a pretty specific interaction of DAGCombines,
and I haven't been able to find a testcase that reproduces on X86, ARM, or AArch64.
The functionality added here is replicated in essentially every other DAG combine,
so it seems pretty obviously correct.

llvm-svn: 214622
2014-08-02 08:45:33 +00:00
Justin Bogner 0950d79f60 CodeGen: Remove commented out code
These two lines have been commented out for over 4 years. They aren't
helping anyone.

llvm-svn: 214615
2014-08-02 06:47:07 +00:00
Akira Hatanaka dc08c30df9 [ARM] In dynamic-no-pic mode, ARM's post-RA pseudo expansion was incorrectly
expanding pseudo LOAD_STATCK_GUARD using instructions that are normally used
in pic mode. This patch fixes the bug.

<rdar://problem/17886592>

llvm-svn: 214614
2014-08-02 05:40:40 +00:00
Lang Hames 70735351ca [MCJIT] Fix an overly-aggressive check in RuntimeDyldMachOARM.
This should fix the MachO_ARM_PIC_relocations.s test failures on some 32-bit
testers.

llvm-svn: 214613
2014-08-02 03:00:49 +00:00
Matt Arsenault 4de324442b R600: Cleanup fneg tests
llvm-svn: 214612
2014-08-02 02:26:51 +00:00
Michael Gottesman 55fcf34705 Add a small utility called bisect that enables commandline bisecting on a counter.
This is something that I have found to be very useful in my work and I
wanted to contribute it back to the community since several people in
the past have asked me for something along these lines. (Jakob, I know
this has been a while coming ; )]

The way you use this is you create a script that takes in as its first
argument a count. The script passes into LLVM the count via a command
line flag that disables a pass after LLVM has run after the pass has
run for count number of times. Then the script invokes a test of some
sort and indicates whether LLVM successfully compiled the test via the
scripts exit status. Then you invoke bisect as follows:

bisect --start=<start_num> --end=<end_num> ./script.sh "%(count)s"

And bisect will continually call ./script.sh with various counts using
the exit status to determine success and failure.

llvm-svn: 214610
2014-08-02 01:39:08 +00:00
Eric Fiselier c85f00a062 [lit] Add --show-xfail flag to LIT.
Summary:
This patch add a --show-xfail flag. If this flag is specified then each xfail test will be printed to output.
When it is not given xfail tests are ignored. Ignoring xfail tests is the current behavior.

This flag is meant to mirror the --show-unsupported flag that was recently added.

Reviewers: ddunbar, EricWF

Reviewed By: EricWF

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4750

llvm-svn: 214609
2014-08-02 01:29:52 +00:00
Matt Arsenault a80c8770f9 R600/SI: Fix formatting.
Avoid weird line wrapping of BuildMI dest register.

llvm-svn: 214608
2014-08-02 01:10:28 +00:00
Chandler Carruth 063f425ea7 [x86] Make some questionable tests not spew assembly to stdout, which
makes a mess of the lit output when they ultimately fail.

The 2012-10-02-DAGCycle test is really frustrating because the *only*
explanation for what it is testing is a rdar link. I would really rather
that rdar links (which are not public or part of the open source
project) were not committed to the source code. Regardless, the actual
problem *must* be described as the rdar link is completely opaque. The
fact that this test didn't check for any particular output further
exacerbates the inability of any other developer to debug failures.

The mem-promote-integers test has nice comments and *seems* to be
a great test for our lowering... except that we don't actually check
that any of the generated code is correct or matches some pattern. We
just avoid crashing. It would be great to go back and populate this test
with the actual expectations.

llvm-svn: 214605
2014-08-02 00:50:10 +00:00
Alexey Samsonov d9ad5cec0c [ASan] Use metadata to pass source-level information from Clang to ASan.
Instead of creating global variables for source locations and global names,
just create metadata nodes and strings. They will be transformed into actual
globals in the instrumentation pass (if necessary). This approach is more
flexible:
1) we don't have to ensure that our custom globals survive all the optimizations
2) if globals are discarded for some reason, we will simply ignore metadata for them
   and won't have to erase corresponding globals
3) metadata for source locations can be reused for other purposes: e.g. we may
   attach source location metadata to alloca instructions and provide better descriptions
   for stack variables in ASan error reports.

No functionality change.

llvm-svn: 214604
2014-08-02 00:35:50 +00:00
Chandler Carruth ee1a1fc900 [SDAG] Allow the legalizer to delete an illegally typed intermediate
introduced during legalization. This pattern is based on other patterns
in the legalizer that I changed in the same way. Now, the legalizer
eagerly collects its garbage when necessary so that we can survive
leaving such nodes around for it.

Instead, we add an assert to make sure the node will be correctly
handled by that layer.

llvm-svn: 214602
2014-08-02 00:24:54 +00:00
Chandler Carruth 3707dda904 [SDAG] Let the DAG combiner take care of dead nodes rather than manually
deleting them. This already seems to work, as no tests fail without
this.

llvm-svn: 214601
2014-08-02 00:19:10 +00:00
Tyler Nowicki 064896bbc5 Add diagnostics to the vectorizer cost model.
When the cost model determines vectorization is not possible/profitable these remarks print an analysis of that decision.

Note that in selectVectorizationFactor() we can assume that OptForSize and ForceVectorization are mutually exclusive.

Reviewed by Arnold Schwaighofer

llvm-svn: 214599
2014-08-02 00:14:03 +00:00
NAKAMURA Takumi 78c32d75e0 BitcodeTests: Fix LINK_COMPONENTS.
llvm-svn: 214598
2014-08-02 00:12:54 +00:00
Duncan P. N. Exon Smith a5a8d3f6f2 verify-uselistorder: Reverse use-lists at every verification
Updated `verify-uselistorder` to more than double the number of use-list
orders it checks.

  - Every time it verifies an order, it then reverses the order and
    verifies again.

  - It now verifies the initial order, before running any shuffles.

Changed the default to `-num-shuffles=1`, since this is already four
checks, and after r214584 shuffling is guaranteed to make a new order.

This is part of PR5680.

llvm-svn: 214596
2014-08-01 23:49:41 +00:00
Duncan P. N. Exon Smith 9a2017bfc3 verify-uselistorder: Add missing `static`
llvm-svn: 214595
2014-08-01 23:31:13 +00:00
Duncan P. N. Exon Smith 3441ffe98d IR: Add Value::reverseUseList()
I'm going to use this to improve `verify-uselistorder`.  Part of PR5680.

llvm-svn: 214594
2014-08-01 23:28:49 +00:00
Peter Collingbourne e52646cd80 PartiallyInlineLibCalls: Check sqrt result type before transforming it.
Some configure scripts declare this with the wrong prototype, which can lead
to an assertion failure.

llvm-svn: 214593
2014-08-01 23:21:21 +00:00
Duncan P. N. Exon Smith 3da117d272 verify-uselistorder: Move shuffleUseLists() out of lib/IR
`shuffleUseLists()` is only used in `verify-uselistorder`, so move it
there to avoid bloating other executables.  As a drive-by, update some
of the header docs.

This is part of PR5680.

llvm-svn: 214592
2014-08-01 23:03:36 +00:00
Adrian Prantl d13dba42f6 Cleanup this test some more.
llvm-svn: 214591
2014-08-01 23:01:32 +00:00
Adrian Prantl a717a6da3d Add the missing target triple to this testcase.
llvm-svn: 214590
2014-08-01 23:01:30 +00:00
Adrian Prantl a6cf448226 Attempt to increase the overall happiness of the MSCV-based buildbots.
llvm-svn: 214588
2014-08-01 22:56:10 +00:00
Duncan P. N. Exon Smith 36d57a2303 verify-uselistorder: Make the verification logic easier to reuse
llvm-svn: 214587
2014-08-01 22:52:06 +00:00
Justin Bogner 821d7471f9 InstrProf: Allow multiple functions with the same name
This updates the instrumentation based profiling format so that when
we have multiple functions with the same name (but different function
hashes) we keep all of them instead of rejecting the later ones.

There are a number of scenarios where this can come up where it's more
useful to keep multiple function profiles:

* Name collisions in unrelated libraries that are profiled together.
* Multiple "main" functions from multiple tools built against a common
  library.
* Combining profiles from different build configurations (ie, asserts
  and no-asserts)

The profile format now stores the number of counters between the hash
and the counts themselves, so that multiple sets of counts can be
stored. Since this is backwards incompatible, I've bumped the format
version and added some trivial logic to skip this when reading the old
format.

llvm-svn: 214585
2014-08-01 22:50:07 +00:00
Duncan P. N. Exon Smith 6d3adac217 UseListOrder: Guarantee that shuffles change use-list order
Change shuffleUseLists() always to change use-list order by rejecting
orders that have no changes.

This is part of PR5680.

llvm-svn: 214584
2014-08-01 22:50:04 +00:00
Duncan P. N. Exon Smith 6e1009b65e UseListOrder: Fix blockaddress use-list order
`parseBitcodeFile()` uses the generic `getLazyBitcodeFile()` function as
a helper.  Since `parseBitcodeFile()` isn't actually lazy -- it calls
`MaterializeAllPermanently()` -- bypass the unnecessary call to
`materializeForwardReferencedFunctions()` by extracting out a common
helper function.  This removes the last of the use-list churn caused by
blockaddresses.

This highlights that we can't reproduce use-list order of globals and
constants when parsing lazily -- but that's necessarily out of scope.
When we're parsing lazily, we never have all the functions in memory, so
the use-lists of globals (and constants that reference globals) are
always incomplete.

This is part of PR5680.

llvm-svn: 214581
2014-08-01 22:27:19 +00:00
Akira Hatanaka 3516669a50 [X86] Simplify X87 stackifier pass.
Stop using ST registers for function returns and inline-asm instructions and use
FP registers instead. This allows removing a large amount of code in the
stackifier pass that was needed to track register liveness and handle copies
between ST and FP registers and function calls returning floating point values.

It also fixes a bug which manifests when an ST register defined by an
inline-asm instruction was live across another inline-asm instruction, as shown
in the following sequence of machine instructions:

1. INLINEASM <es:frndint> $0:[regdef], %ST0<imp-def,tied5>
2. INLINEASM <es:fldcw $0>
3. %FP0<def> = COPY %ST0

<rdar://problem/16952634>

llvm-svn: 214580
2014-08-01 22:19:41 +00:00
NAKAMURA Takumi 49a53507d0 llvm/test/CodeGen/Mips/cconv/arguments-varargs.ll: Add explicit -mtriple=(mips|mipsel)-linux on 4 lines.
llvm-svn: 214578
2014-08-01 22:15:38 +00:00
Adrian Prantl b1416837f9 Debug info: Infrastructure to support debug locations for fragmented
variables (for example, by-value struct arguments passed in registers, or
large integer values split across several smaller registers).
On the IR level, this adds a new type of complex address operation OpPiece
to DIVariable that describes size and offset of a variable fragment.
On the DWARF emitter level, all pieces describing the same variable are
collected, sorted and emitted as DWARF expressions using the DW_OP_piece
and DW_OP_bit_piece operators.

http://reviews.llvm.org/D3373
rdar://problem/15928306

What this patch doesn't do / Future work:
- This patch only adds the backend machinery to make this work, patches
  that change SROA and SelectionDAG's type legalizer to actually create
  such debug info will follow. (http://reviews.llvm.org/D2680)
- Making the DIVariable complex expressions into an argument of dbg.value
  will reduce the memory footprint of the debug metadata.
- The sorting/uniquing of pieces should be moved into DebugLocEntry,
  to facilitate the merging of multi-piece entries.

llvm-svn: 214576
2014-08-01 22:11:58 +00:00
Chandler Carruth 356665a36c [SDAG] MorphNodeTo recursively deletes dead operands of the old
fromulation of the node, which isn't really the desired behavior from
within the combiner or legalizer, but is necessary within ISel. I've
added a hopefully helpful comment and fixed the only two places where
this took place.

Yet another step toward the combiner and legalizer not needing to use
update listeners with virtual calls to manage the worklists behind
legalization and combining.

llvm-svn: 214574
2014-08-01 22:09:43 +00:00
Tom Stellard 4973a13680 Revert "R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cpp"
This reverts commit r214566.

I did not mean to commit this yet.

llvm-svn: 214572
2014-08-01 21:55:50 +00:00
Reid Kleckner 6a2de90039 MS inline asm: Hide symbol to attempt to fix test failure on darwin
If the symbol comes from an external DSO, it apparently requires
indirection through a register.

llvm-svn: 214571
2014-08-01 21:54:37 +00:00
Duncan P. N. Exon Smith 00f20ace9a BitcodeReader: Change mechanics of BlockAddress forward references, NFC
Now that we can reliably handle forward references to `BlockAddress`
(r214563), change the mechanics to simplify predicting use-list order.

Previously, we created dummy `GlobalVariable`s to represent block
addresses.  After every function was materialized, we'd go through any
forward references to its blocks and RAUW them with a proper
`BlockAddress` constant.  This causes some (potentially a lot of)
unnecessary use-list churn, since any constant expression that it's a
part of will need to be rematerialized as well.

Instead, pre-construct a `BasicBlock` immediately -- without attaching
it to its (empty) `Function` -- and use that to construct a
`BlockAddress`.  This constant will not have to be regenerated.  When
the function body is parsed, hook this pre-constructed basic block up
in the right place using `BasicBlock::insertInto()`.

Both before and after this change, the IR is temporarily in an invalid
state that gets resolved when `materializeForwardReferencedFunctions()`
gets called.

This is a prep commit that's part of PR5680, but the only functionality
change is the reduction of churn in the constant pool.

llvm-svn: 214570
2014-08-01 21:51:52 +00:00
Tom Stellard d44c023b21 R600/SI: Remove leftover debugging code
llvm-svn: 214569
2014-08-01 21:51:05 +00:00
Tom Stellard c16f73d7c5 R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cpp
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

llvm-svn: 214566
2014-08-01 21:50:47 +00:00
Reid Kleckner 2a069a8291 docs: Strongly recommend setting rpath when using a local GCC toolchain
Users keep emailing us about the difficulties of getting LD_LIBRARY_PATH
into their environment, which should be completely unecessary. Try to
strengthen the rpath recommentation by putting in an example cmake
invocation.

Speaking of which, we might want to make CMake the recommended build
system in GettingStarted.html.

llvm-svn: 214565
2014-08-01 21:40:53 +00:00
Duncan P. N. Exon Smith 17cbb97882 IR: Add BasicBlock::insertInto()
Although unlinked `BasicBlock`s can be created, there's currently no way
to insert them into `Function`s after the fact.  In particular,
`moveAfter()` and `moveBefore()` require that the basic block is already
linked.

Extract the logic for initially linking a `BasicBlock` out of the
constructor and into a member function that can be used for lazy
insertion.

  - Asserts that the basic block is currently unlinked.
  - Matches the logic of the constructor.
  - Changed the constructor to use it since the logic matches.

This is needed in a follow-up commit for PR5680.

llvm-svn: 214563
2014-08-01 21:22:04 +00:00
Peter Collingbourne 142fdff0d5 [dfsan] Correctly handle loads and stores of zero size.
llvm-svn: 214561
2014-08-01 21:18:18 +00:00
Eric Christopher 6c05d9135f Add a non-const subtarget returning function to the target machine
so that we can use it to get the old-style JIT out of the subtarget.

This code should be removed when the old-style JIT is removed
(imminently).

llvm-svn: 214560
2014-08-01 21:18:01 +00:00
Duncan P. N. Exon Smith 908d809b81 BitcodeReader: Fix some BlockAddress forward reference corner cases
`BlockAddress`es are interesting in that they can reference basic blocks
from *outside* the block's function.  Since basic blocks are not global
values, this presents particular challenges for lazy parsing.

One corner case was found in PR11677 and fixed in r147425.  In that
case, a global variable references a block address.  It's necessary to
load the relevant function to resolve the forward reference before doing
anything with the module.

By inspection, I found (and have fixed here) two other cases:

  - An instruction from one function references a block address from
    another function, and only the first function is lazily loaded.

    I fixed this the same way as PR11677: by eagerly loading the
    referenced function.

  - A function whose block address is taken is dematerialized, leaving
    invalid references to it.

    I fixed this by refusing to dematerialize functions whose block
    addresses are taken (if you have to load it, you can't unload it).

llvm-svn: 214559
2014-08-01 21:11:34 +00:00
Duncan P. N. Exon Smith 2e7e989d71 Try to fix configure+make after r214556
llvm-svn: 214558
2014-08-01 21:06:59 +00:00
Duncan P. N. Exon Smith 7a2990cfdb Rewrite BitReaderTest, NFC
Rewrite the single unit test in `BitReaderTest` so that it's easier to
add more tests.

  - Parse from an assembly string rather than using API.
  - Use more helper functions.
  - Use a separate context for the module on the other side.

Aside from relying on the assembly parser, there's no functionality
change intended.

llvm-svn: 214556
2014-08-01 21:01:04 +00:00
Reid Kleckner 5b37c18129 MS inline asm: Use memory constraints for functions instead of registers
This is consistent with how we parse them in a standalone .s file, and
inline assembly shouldn't differ.

This fixes errors about requiring more registers than available in
cases like this:
  void f();
  void __declspec(naked) g() {
    __asm pusha
    __asm call f
    __asm popa
    __asm ret
  }

There are no registers available to pass the address of 'f' into the asm
blob.  The asm should now directly call 'f'.

Tests will land in Clang shortly.

llvm-svn: 214550
2014-08-01 20:21:24 +00:00
Justin Bogner 45afa389d6 llvm-profdata: Replace redundant tests with more targeted ones
llvm-svn: 214548
2014-08-01 19:59:48 +00:00
Chandler Carruth 1f52b3da0a [SDAG] Begin simplifying the way in which the legalizer deletes nodes.
This lifts the (very few) places the legalizer would delete dead nodes
into the outer loop around the legalizer. This is significantly simpler
because it doesn't require the legalizer itself to manage the iterator
validity, and it doesn't require the legalizer to be a DAG update
listener in order to remove things from the legalized set. It also makes
the interface much less contrived for the case of the legalizer running
inside the last phase of DAG combining.

I'm working on centralizing the deletion of nodes during both legalizing
and combining as much as possible. My hope is to remove the need for DAG
update listeners from the combiner next, which would remove a costly
virtual dispatch chain on every deletion. This in turn should allow us
to more aggressively delete DAG nodes during combining which will in
turn allow us to combine more aggressively by exposing the actual nodes
which have single users to the combine phases.

llvm-svn: 214546
2014-08-01 19:49:59 +00:00
Juergen Ributzka 5dcb33bdbb [FastISel][AArch64] Fold offset into the memory operation.
Fold simple offsets into the memory operation:
  add x0, x0, #8
  ldr x0, [x0]
-->
  ldr x0, [x0, #8]

Fixes <rdar://problem/17887945>.

llvm-svn: 214545
2014-08-01 19:40:16 +00:00
Rafael Espindola dd39657a3f Include Archive.h
MSVC was complaining about Archive being an incomplete type.

llvm-svn: 214542
2014-08-01 19:28:15 +00:00
Viktor Kutuzov 5e0db8b247 Fix building with in-tree libc++abi on FreeBSD
Differential Revision: http://reviews.llvm.org/D4743

llvm-svn: 214541
2014-08-01 19:23:15 +00:00
Rafael Espindola acfd62899f Move virtual method out of line.
Should fix the MSVC build.

llvm-svn: 214539
2014-08-01 18:49:24 +00:00
Philip Reames 7684618401 Add support for StackMap section for ELF/Linux systems
This patch adds code to emits the StackMap section on ELF systems. This section is required to support llvm.experimental.stackmap and llvm.experimental.patchpoint intrinsics.

Reviewers: ributzka, echristo

Differential Revision: http://reviews.llvm.org/D4574

llvm-svn: 214538
2014-08-01 18:47:09 +00:00
Juergen Ributzka 50a4005e35 [FastISel][AArch64] Add branch weights.
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

Fixes <rdar://problem/17887137>.

llvm-svn: 214537
2014-08-01 18:39:24 +00:00
Rafael Espindola e192341de7 Use object::Archive::create instead of new object::Archive.
Also fix the error handling. No testcaes, issue found by inspection.

Thanks to David Blaikie for the suggestion.

llvm-svn: 214535
2014-08-01 18:31:17 +00:00
Philip Reames 87c2b605f5 Explicitly report runtime stack realignment in StackMap section
This change adds code to explicitly mark a function which requires runtime stack realignment as not having a fixed frame size in the StackMap section. As it happens, this is not actually a functional change. The size that would be reported without the check is also "-1", but as far as I can tell, that's an accident. The code change makes this explicit.

Note: There's a separate bug in handling of stackmaps and patchpoints in functions which need dynamic frame realignment. The current code assumes that offsets can be calculated from RBP, but realigned frames must use RSP. (There's a variable gap between RBP and the spill slots.) This change set does not address that issue.

Reviewers: atrick, ributzka

Differential Revision: http://reviews.llvm.org/D4572

llvm-svn: 214534
2014-08-01 18:26:27 +00:00
Rafael Espindola ce47a05c7c Replace comment about ownership with std::unique_ptr.
llvm-svn: 214533
2014-08-01 18:09:32 +00:00
Juergen Ributzka 4c018a12a3 [FastISel][ARM] Do not emit stores for undef arguments.
This is a followup patch for r214366, which added the same behavior to the
AArch64 and X86 FastISel code. This fix reproduces the already existing
behavior of SelectionDAG in FastISel.

llvm-svn: 214531
2014-08-01 18:04:14 +00:00
Rafael Espindola b4599d3531 Use range loop.
llvm-svn: 214530
2014-08-01 18:04:14 +00:00
Renato Golin 541d7e747a Add missing breaks to AArch64InstrInfo::isGPRCopy
llvm-svn: 214528
2014-08-01 17:27:31 +00:00
Matt Arsenault 06bd3933ba R600: Cleanup test
Remove -CHECKs, use multiple prefixes, name values,
also test the @llvm.fabs version

llvm-svn: 214525
2014-08-01 17:00:29 +00:00
Matt Arsenault 41e148169d Make getNamedOperandIdx readonly
llvm-svn: 214524
2014-08-01 17:00:27 +00:00
Matt Arsenault cdcdb87a62 R600/SI: Don't display GDS bit for read2
This isn't displayed for any other instructions anymore,
and isn't ever used.

llvm-svn: 214523
2014-08-01 17:00:26 +00:00
Chad Rosier 4d71a4e2c6 [AArch64] Fix test from r214518 in an attempt to appease buildbots.
llvm-svn: 214521
2014-08-01 15:30:41 +00:00
Rafael Espindola 77f1f8f170 Remove lto_codegen_set_attr.
It was never exported, so no functionality change.

llvm-svn: 214519
2014-08-01 14:57:05 +00:00
Chad Rosier 579c02c9a5 [AArch64] Generate tbz/tbnz when comparing against zero.
The tbz/tbnz checks the sign bit to convert

op w1, w1, w10
cmp w1, #0
b.lt .LBB0_0

to

op w1, w1, w10
tbnz w1, #31, .LBB0_0

Differential Revision: http://reviews.llvm.org/D4440

llvm-svn: 214518
2014-08-01 14:48:56 +00:00
Ulrich Weigand 087606898b [PowerPC] PR20280 - Slots for byval parameters are not immutable
Found by inspection while looking at PR20280: code would mark slots
in the parameter save area where a byval parameter is passed as
"immutable".  This is not correct since code is allowed to modify
byval parameters in place in the parameter save area.

llvm-svn: 214517
2014-08-01 14:35:58 +00:00
Rafael Espindola 3f6481d0d3 Remove some calls to std::move.
Instead of moving out the data in a ErrorOr<std::unique_ptr<Foo>>, get
a reference to it.

Thanks to David Blaikie for the suggestion.

llvm-svn: 214516
2014-08-01 14:31:55 +00:00
Rafael Espindola 5d457dede9 [pr20127] Check for leading \1 in the Twine version of getNameWithPrefix.
No functionality change, but will simplify an upcoming patch that uses the
Twine version.

llvm-svn: 214515
2014-08-01 14:16:40 +00:00
Rafael Espindola ce5dd1acc2 Simplify the code a bit with std::unique_ptr.
llvm-svn: 214514
2014-08-01 14:11:14 +00:00
Tim Northover 4bd286ab53 llvm-objdump: implement printing for MachO __compact_unwind info.
llvm-svn: 214509
2014-08-01 13:07:19 +00:00
James Molloy 137ce60ecf Allow only disassembling of M-class MSR masks that the assembler knows how to assemble back.
Note: The current code in DecodeMSRMask() rejects the unpredictable A/R MSR mask '0000' with Fail. The code in the patch follows this style and rejects unpredictable M-class MSR masks also with Fail (instead of SoftFail). If SoftFail is preferred in this case then additional changes to ARMInstPrinter (to print non-symbolic masks) and ARMAsmParser (to parse non-symbolic masks) will be needed.

Patch by Petr Pavlu!

llvm-svn: 214505
2014-08-01 12:42:11 +00:00
Aaron Ballman 08c0b5aa31 Improve some const-correctness to remove a -Wcast-qual warning. No functional changes intended.
llvm-svn: 214503
2014-08-01 12:34:58 +00:00
Tilmann Scheller 7cc0ed48f0 [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDRB/LDRSB instructions.
The ARM ARM prohibits LDRB/LDRSB instructions with writeback into the destination register. With this commit this constraint is now enforced and we stop assembling LDRH/LDRSH instructions with unpredictable behavior.

llvm-svn: 214500
2014-08-01 12:08:04 +00:00
Tilmann Scheller 8ff079c16b [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDRH/LDRSH instructions.
The ARM ARM prohibits LDRH/LDRSH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling LDRH/LDRSH instructions with unpredictable behavior.

llvm-svn: 214499
2014-08-01 11:33:47 +00:00
Tilmann Scheller 8ba74305da [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDR instructions.
The ARM ARM prohibits LDR instructions with writeback into the destination register. With this commit this constraint is now enforced and we stop assembling LDR instructions with unpredictable behavior.

llvm-svn: 214498
2014-08-01 11:08:51 +00:00
Erik Eckstein 690dd037d9 SLPVectorizer: fix build problem in Release configuration
llvm-svn: 214496
2014-08-01 09:47:38 +00:00
Erik Eckstein c80e1dc081 SLPVectorizer: improved scheduling algorithm.
llvm-svn: 214494
2014-08-01 09:20:42 +00:00
Daniel Sanders 2b553d488f [mips][PR19612] Fix va_arg for big-endian mode.
Summary:
Big-endian mode was not correctly adjusting the offset for types smaller
than an ABI slot.

Fixes PR19612

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: sstankovic, llvm-commits

Differential Revision: http://reviews.llvm.org/D4556

llvm-svn: 214493
2014-08-01 09:17:39 +00:00
Erik Eckstein f16a808292 SLP Vectorizer: added statistics counter
llvm-svn: 214487
2014-08-01 08:14:28 +00:00
Erik Eckstein 4944b2ff94 SLP Vectorizer: improve canonicalize tree operands of commutitive binary operands.
This reverts r214338 (except the test file) and replaces it with a more general algorithm.

llvm-svn: 214485
2014-08-01 08:05:55 +00:00
Sylvestre Ledru 2eadfd6da8 Revert of 214418:
"Create a default symver on Linux like ELF OSes."

Fails the build under Debian with ld.gold:
/usr/bin/ld.gold: --default-symver: unknown option

llvm-svn: 214482
2014-08-01 06:16:03 +00:00
Hal Finkel b6d0d6b263 [PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads
Altivec vector loads on PowerPC have an interesting property: They always load
from an aligned address (by rounding down the address actually provided if
necessary). In order to generate an actual unaligned load, you can generate two
load instructions, one with the original address, one offset by one vector
length, and use a special permutation to extract the bytes desired.

When this was originally implemented, I generated these two loads using regular
ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with
this:

The alignment of a load does not contribute to its identity, and SDNodes
are uniqued. So, imagine that we have some unaligned load, L1, that is not
aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned).
Further imagine that there had already existed a load (L1+16)(unaligned) with
the same chain operand as the load L1. When (L1+16)(aligned) is created as part
of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just
now marked as aligned (because the new alignment overwrites the old). But the
original users of (L1+16)(unaligned) now get the data intended for the
permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists
to get its own permutation-based expansion. This was PR19991.

A second potential problem has to do with the MMOs on these loads, which can be
used by AA during instruction scheduling to break chain-based dependencies. If
the new "aligned" loads get the MMO from the original unaligned load, this does
not represent the fact that it will load data from below the original address.
Normally, this would not matter, but this load might be combined with another
load pair for a previous vector, and then the dependency on the otherwise-
ignored lower bytes can matter.

To fix both problems, instead of generating the necessary loads using regular
ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are
provided with MMOs with a conservative address range.

Unfortunately, I no longer have a failing test case (since PR19991 was
reported, other changes in CodeGen have forced this bug back into hiding it
again). Nevertheless, this should fix the underlying problem.

llvm-svn: 214481
2014-08-01 05:20:41 +00:00
Suyog Sarda 56c9a87035 This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)".
Differential Revision: http://reviews.llvm.org/D4653

llvm-svn: 214479
2014-08-01 05:07:20 +00:00
Suyog Sarda 1c6c2f69f7 This patch implements transform for pattern "(A | B) & ((~A) ^ B) -> (A & B)".
Differential Revision: http://reviews.llvm.org/D4628

llvm-svn: 214478
2014-08-01 04:59:26 +00:00
Suyog Sarda 52324c82cc This patch implements transform for pattern "( A & (~B)) | (A ^ B) -> (A ^ B)"
Differential Revision: http://reviews.llvm.org/D4652

llvm-svn: 214477
2014-08-01 04:50:31 +00:00
Suyog Sarda 16d646594e This patch implements transform for pattern "(A & B) | ((~A) ^ B) -> (~A ^ B)".
Patch Credit to Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4655

llvm-svn: 214476
2014-08-01 04:41:43 +00:00
Tom Stellard aa9a1a813e R600/SI: Fix build warning
llvm-svn: 214475
2014-08-01 02:05:57 +00:00
Juergen Ributzka 82ecc7ff2a [FastISel][AArch64] Fix the immediate versions of the {s|u}{add|sub}.with.overflow intrinsics.
ADDS and SUBS cannot encode negative immediates or immediates larger than 12bit.
This fix checks if the immediate version can be used under this constraints and
if we can convert ADDS to SUBS or vice versa to support negative immediates.

Also update the test cases to test the immediate versions.

llvm-svn: 214470
2014-08-01 01:25:55 +00:00
Hal Finkel 3604bf7fe7 [PowerPC] Recognize consecutive memory accesses from intrinsics
When generating unaligned vector loads, we need to search for other loads or
stores nearby offset by one vector width. If we find one, then we know that we
can safely generate another aligned load at that address. Otherwise, we must
generate the next load using an offset of the vector width minus one byte (so
we don't read off the end of the allocation if the base unaligned address
happened to be aligned at runtime). We had previously done this using only
other vector loads and stores, but did not consider the PowerPC-specific vector
load/store intrinsics. Now we'll also consider vector intrinsics. By itself,
this change is a feature enhancement, but is a necessary step toward fixing the
underlying problem behind PR19991.

llvm-svn: 214469
2014-08-01 01:02:01 +00:00
Reid Kleckner 71ff3f223f MS inline asm: Fix null SMLoc when 'ptr' is missing after dword & co
This improves the diagnostics from the regular assembler, but more
importantly it fixes an assertion when parsing inline assembly.  Test
landing in Clang.

llvm-svn: 214468
2014-08-01 00:59:22 +00:00
Tom Stellard b4a313a76f R600/SI: Do abs/neg folding with ComplexPatterns
Abs/neg folding has moved out of foldOperands and into the instruction
selection phase using complex patterns.  As a consequence of this
change, we now prefer to select the 64-bit encoding for most
instructions and the modifier operands have been dropped from
integer VOP3 instructions.

llvm-svn: 214467
2014-08-01 00:32:39 +00:00
Tom Stellard 6655dd699f TableGen: Allow AddedComplexity values to be negative
This is useful for cases when stand-alone patterns are preferred to the
patterns included in the instruction definitions.  Instead of requiring
that stand-alone patterns set a larger AddedComplexity value, which
can be confusing to new developers, the allows us to reduce the
complexity of the included patterns to achieve the same result.

There will be test cases for this added to the R600 backend in a
future commit.

llvm-svn: 214466
2014-08-01 00:32:36 +00:00
Tom Stellard 0e975cf6e5 R600/SI: Simplify and fix handling of VOP2 in SIInstrInfo::legalizeOperands
We were incorrectly assuming that all VOP2 instructions can read SGPRs
in Src0, but this is not true for instructions that read carry-in from
VCC.

The old logic has been replaced with new logic which checks the defined
register classes of the VOP2 instruction to determine whether or not to
legalize the operands.

llvm-svn: 214465
2014-08-01 00:32:35 +00:00
Tom Stellard 6407e1e632 R600/SI: Fold immediates when shrinking instructions
This will prevent us from using extra MOV instructions once we prefer
selecting 64-bit instructions.

llvm-svn: 214464
2014-08-01 00:32:33 +00:00
Tom Stellard 86d12ebdbd R600/SI: Fix incorrect commute operation in shrink instructions pass
We were commuting the instruction by still shrinking it using the
original opcode.

NOTE: This is a candidate for the 3.5 branch.
llvm-svn: 214463
2014-08-01 00:32:28 +00:00
Kevin Enderby 0d928a142b Add support for the X86 secure guard extensions instructions in assembler (SGX).
This allows assembling the two new instructions, encls and enclu for the
SKX processor model.

Note the diffs are a bigger than what might think, but to fit the new
MRM_CF and MRM_D7 in things in the right places things had to be
renumbered and shuffled down causing a bit more diffs.

rdar://16228228

llvm-svn: 214460
2014-07-31 23:57:38 +00:00
Reid Kleckner b7e2f6015a X86 MC: Don't crash on empty memory operand parens
Instead, create an absolute memory operand.

Fixes PR20504.

llvm-svn: 214457
2014-07-31 23:26:35 +00:00
Reid Kleckner 0c5da97dd0 X86 MC: Reject invalid segment registers before a memory operand colon
Previously we would execute unreachable during object emission.

llvm-svn: 214456
2014-07-31 23:03:22 +00:00
Louis Gerbarg 09b8cdee12 White space fix.
llvm-svn: 214455
2014-07-31 22:57:46 +00:00
Hal Finkel 9e5298549e Make classof in MemSDNode consistent with MemIntrinsicSDNode
If INTRINSIC_W_CHAIN and INTRINSIC_VOID are MemIntrinsicSDNodes, and a
MemIntrinsicSDNode is a MemSDNode, then INTRINSIC_W_CHAIN and INTRINSIC_VOID
must be MemSDNodes too.

Noticed by inspection.

llvm-svn: 214452
2014-07-31 22:31:33 +00:00
Jan Vesely 3047950964 R600: Modernize work item intrinsics test
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 214451
2014-07-31 22:11:03 +00:00
Louis Gerbarg 67474e3755 Make sure no loads resulting from load->switch DAGCombine are marked invariant
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reoarder normal loads.

This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.

llvm-svn: 214449
2014-07-31 21:45:05 +00:00
Tyler Nowicki b5a65395cc Improve the remark generated for -Rpass-missed.
The current remark is ambiguous and makes it sounds like explicitly specifying vectorization will allow the loop to be vectorized. This is not the case. The improved remark directs the user to -Rpass-analysis=loop-vectorize to determine the cause of the pass-miss.

Reviewed by Arnold Schwaighofer`

llvm-svn: 214445
2014-07-31 21:22:22 +00:00
Eric Christopher 59265af9eb Revert "Remove MCObjectDisassembler.cpp as it is untested and unused." as it is apparently used, but the build didn't return errors weirdly.
This reverts commits 214437 and 214438.

llvm-svn: 214444
2014-07-31 21:18:38 +00:00
Tyler Nowicki 9fe497fcac Improve the remark generated when a variable that is used outside the loop is not a reduction or induction variable.
Reviewed by Arnold Schwaighofer

llvm-svn: 214440
2014-07-31 21:02:40 +00:00
Rafael Espindola ceb23381ec Replaces a few pointers with references in llvm-nm.cpp.
This opens the way for a few std::uinque_ptr cleanups.

llvm-svn: 214439
2014-07-31 21:00:10 +00:00
Aaron Ballman 3866a8f2ca Fixing CMake problems with MCObjectDisassembler.cpp not existing.
llvm-svn: 214438
2014-07-31 20:48:54 +00:00
Eric Christopher 90a06fa97a Remove MCObjectDisassembler.cpp as it is untested and unused.
llvm-svn: 214437
2014-07-31 20:44:46 +00:00
Hans Wennborg 914efc7239 msbuild integration: remove duplicated lines and BOM from 2014 integration (PR20341)
llvm-svn: 214435
2014-07-31 20:33:22 +00:00
Rafael Espindola cf8dd265c5 DWOHolder takes ownership of the argument constructor, use std::unique_ptr.
Thanks to David Blaikie for noticing it.

llvm-svn: 214434
2014-07-31 20:26:42 +00:00
Rafael Espindola a04bb5b1e1 Use a reference instead of a pointer.
This makes using a std::unique_ptr in the caller more convenient.

llvm-svn: 214433
2014-07-31 20:19:36 +00:00
Eric Fiselier 18fab4684d Add documentation for lit's --show-unsupported flag
llvm-svn: 214431
2014-07-31 20:11:13 +00:00
Bill Schmidt 08a66a39ba Clarify in PowerPC release notes that 32-bit PIC support is incomplete.
As requested, changing this wording slightly.

Thanks,
Bill

llvm-svn: 214430
2014-07-31 20:04:51 +00:00
Will Schmidt 44ff8f06ec Disable IsSub subregister assert. pr18663.
This is a follow-up to the activity in the bug at
http://llvm.org/bugs/show_bug.cgi?id=18663 .  The underlying issue has
to do with how the KILL pseudo-instruction is handled.  I defer to
Hal/Jakob/Uli for additional details and background.

This will disable the (bad?) assert, add an associated fixme comment,
and add a pair of tests.

The code change and the pr18663-2.ll test are copied from the referenced
bug.  That test does not immediately fail in my environment, but I have
added the pr18663.ll test which does.

(Comment from Hal)
to provide everyone else with some context, this assert was not bad when
it was written. At that time, we only generated KILL pseudo instructions
around subregister copies. This logic, unfortunately, had its own problems.
In r199797, the relevant logic in MachineCopyPropagation was replaced to
generate KILLs for other kinds of copies too. This change in semantics broke
this now-problematic assumption in AggressiveAntiDepBreaker. The
AggressiveAntiDepBreaker really needs a proper cleanup to deal with the
change, but removing the assert (which just allows the function to return
false) is a safe conservative behavior, and should do for the time being.

llvm-svn: 214429
2014-07-31 19:50:53 +00:00
Rafael Espindola 0c54419d5d Remove unused argument.
Thanks to Justin Bogner for noticing it.

llvm-svn: 214426
2014-07-31 19:32:04 +00:00
Rafael Espindola 3f0549f66b Move MCObjectSymbolizer.h to MC/MCAnalysis.
The cpp file is already in lib/MC/MCAnalysis.

llvm-svn: 214424
2014-07-31 19:29:23 +00:00
Hal Finkel 36eff0f854 Fix ScalarEvolutionExpander when creating a PHI in a block with duplicate predecessors
It seems that when I fixed this, almost exactly a year ago, I did not quite do
it correctly. When we have duplicate block predecessors, we can indeed not have
different incoming values for the same block, but we *must* have duplicate
entries. So, instead of skipping the duplicates, we explicitly add the
duplicate incoming values.

Fixes PR20442.

llvm-svn: 214423
2014-07-31 19:13:38 +00:00
Duncan P. N. Exon Smith 852e00e3d1 verify-uselistorder: Change the default -num-shuffles=5
Change the default for `-num-shuffles` to 5 and better document the
algorithm in the header docs of `verify-uselistorder`.

llvm-svn: 214419
2014-07-31 18:46:24 +00:00
Eric Christopher 42040ed753 Create a default symver on Linux like ELF OSes.
Patch by Adam Jackson.

llvm-svn: 214418
2014-07-31 18:43:43 +00:00
Duncan P. N. Exon Smith ab6adeb8a1 UseListOrder: Handle self-users
Correctly sort self-users (such as PHI nodes).  I added a targeted test
in `test/Bitcode/use-list-order.ll` and the final missing RUN line to
tests in `test/Assembly`.

This is part of PR5680.

llvm-svn: 214417
2014-07-31 18:33:12 +00:00
Eric Christopher 5a298679d5 Fix loop end condition.
Note: This code appears to be untested.
llvm-svn: 214416
2014-07-31 18:28:08 +00:00
Bill Schmidt 08616c7004 Wrong heading level for PowerPC changes in release notes
Oops.  Used the wrong heading level by mistake.

Thanks,
Bill

llvm-svn: 214405
2014-07-31 15:20:30 +00:00
Bill Schmidt bfffb69e84 Release Notes: Overriding PPC64 and PPC64LE ABI defaults is not yet supported.
I wrongly included a description of a patch that came in after 3.5 branched
and has not been backported.

Thanks,
Bill

llvm-svn: 214404
2014-07-31 15:17:33 +00:00
Bill Schmidt 4004989b37 Add PowerPC release notes for 3.5.
Here's my take on 3.5 changes for PowerPC.  Others please feel free to add,
edit, delete as desired.

Thanks,
Bill

llvm-svn: 214403
2014-07-31 14:38:17 +00:00
Aaron Ballman ed9fabd255 Fixing an -Woverloaded-virtual warnings by exposing the hidden virtual function as well. No functional changes intended.
llvm-svn: 214400
2014-07-31 12:58:50 +00:00
Aaron Ballman 53201af4d5 Fixing a -Wcast-qual warning in GCC. No functional changes.
llvm-svn: 214399
2014-07-31 12:55:49 +00:00
Evgeniy Stepanov 5997feb7dc [msan] Fix handling of array types.
Switch array type shadow from a single integer to
an array of integers (i.e. make it per-element).
This simplifies instrumentation of extractvalue and fixes PR20493.

llvm-svn: 214398
2014-07-31 11:02:27 +00:00
Evgeniy Stepanov 77ad86681f [asan] Support x86 REP MOVS asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 214395
2014-07-31 09:11:04 +00:00
Stepan Dyatkovskiy 87c046189d MergeFunctions, tiny refactoring:
cmpOperation has been renamed to cmpOperations (multiple form).

llvm-svn: 214392
2014-07-31 07:16:59 +00:00
Juergen Ributzka c537bd2da4 [FastISel][AArch64] Add basic bitcast support for conversion between float and int.
Fixes <rdar://problem/17867078>.

llvm-svn: 214389
2014-07-31 06:25:37 +00:00
Juergen Ributzka 130e77e431 [FastISel][AArch64] Add sqrt intrinsic support.
Fixes <rdar://problem/17867067>.

llvm-svn: 214388
2014-07-31 06:25:33 +00:00