Commit Graph

107183 Commits

Author SHA1 Message Date
Stepan Dyatkovskiy 7f895c1184 MergeFunctions, tiny refactoring:
cmpAPInt has been renamed to cmpAPInts (multiple form).

llvm-svn: 216375
2014-08-25 08:19:50 +00:00
Stepan Dyatkovskiy 0b765dee6e MergeFunctions, tiny refactoring:
cmpType has been renamed to cmpTypes (multiple form).

llvm-svn: 216374
2014-08-25 08:16:39 +00:00
Stepan Dyatkovskiy 016daddc52 MergeFunctions, tiny refactoring:
cmpGEP has been renamed to cmpGEPs (multiple form).

llvm-svn: 216373
2014-08-25 08:12:45 +00:00
Karthik Bhat 7f33ff7dea Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)

llvm-svn: 216371
2014-08-25 04:56:54 +00:00
Dylan Noblesmith 6e69927d03 CodeGen/LiveVariables: hoist out code in nested loops
This makes runOnMachineFunction vastly more readable.

llvm-svn: 216368
2014-08-25 01:59:49 +00:00
Dylan Noblesmith 46a922c191 CodeGen/LiveVariables: switch to std::vector
No functionality change.

llvm-svn: 216367
2014-08-25 01:59:42 +00:00
Dylan Noblesmith b899464f5b AArch64: unique_ptr-ify map structures
llvm-svn: 216366
2014-08-25 01:59:38 +00:00
Dylan Noblesmith 6076debd98 AArch64: use std::vector for temp array
llvm-svn: 216365
2014-08-25 01:59:36 +00:00
Dylan Noblesmith 130589f804 NVPTX: remove another raw delete call
llvm-svn: 216364
2014-08-25 01:59:32 +00:00
Dylan Noblesmith 802b6ce8de NVPTX: remove raw delete call
Also make members that are never accessed outside the class
private.

llvm-svn: 216363
2014-08-25 01:59:29 +00:00
Dylan Noblesmith c4a9942a68 ExecutionEngine: unique_ptr-ify
NFC.

llvm-svn: 216362
2014-08-25 00:58:18 +00:00
Dylan Noblesmith 2b9b93e6f1 EE/JIT: unique_ptr-ify
llvm-svn: 216361
2014-08-25 00:58:15 +00:00
Dylan Noblesmith 0b59924d60 Support/Path: remove raw delete
llvm-svn: 216360
2014-08-25 00:58:13 +00:00
Dylan Noblesmith 49c758b769 Support/APFloat: unique_ptr-ify temp arrays
llvm-svn: 216359
2014-08-25 00:58:10 +00:00
Dylan Noblesmith 3ecd22fcf5 Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo
llvm-svn: 216358
2014-08-25 00:28:43 +00:00
Dylan Noblesmith 2cae60e730 Analysis: unique_ptr-ify DependenceAnalysis::depends
llvm-svn: 216357
2014-08-25 00:28:39 +00:00
Dylan Noblesmith d96ce66cb1 Analysis: take a reference instead of pointer
This parameter is never null.

llvm-svn: 216356
2014-08-25 00:28:35 +00:00
Dylan Noblesmith 688fa5e15b CodeGen: switch raw array to std::vector
llvm-svn: 216355
2014-08-25 00:28:31 +00:00
Dylan Noblesmith 06adf32814 IR: remove dead code
This was added in r134994, to fix a memory leak;
three days later, r135248 switched
ContainedTys from being new-allocated to being allocated
via BumpPtrAllocator, and the earlier fix was never
reverted.

The destructor doesn't seem to ever actually be called
on Types anyway, so it's harmless, but if it were,
this'd be an invalid pointer.

This reverts r134994.

llvm-svn: 216354
2014-08-25 00:28:27 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00
Dylan Noblesmith 085fc4d6c6 TableGen: unique_ptr-ify RecordKeeper
llvm-svn: 216350
2014-08-24 19:10:57 +00:00
Dylan Noblesmith aa9b74c544 TableGen: delete no-op code
This does nothing but remove the Record from the map, and
then re-add it, without actually changing it in between.

The Record's Name used to be changed before re-adding it
when the code was first committed in r137232, but the
name-changing lines were removed in r142510, and since
then this code seems to do nothing.

This was also the only caller of removeClass or removeDef,
so now RecordKeeper owns its Records unconditionally,
and could be unique_ptr-ified.

llvm-svn: 216349
2014-08-24 19:10:53 +00:00
Dylan Noblesmith 80f0e432ee TableGen: use auto and for-range
llvm-svn: 216348
2014-08-24 19:10:49 +00:00
Aaron Ballman 9d515ff695 This code is from r216285, which did not go out to the mailing list for some reason.
The switch statement would never fire due to the preceding break statement. Also, the switch statement has a default label with no case labels. Simplified the code, and allow it to execute.

llvm-svn: 216346
2014-08-24 13:25:16 +00:00
Elena Demikhovsky 22e735d725 X86 intrinsics table - simplifies intrinsics lowering.
The tables are initialized when X86TargetLowering object is created.

llvm-svn: 216345
2014-08-24 09:19:56 +00:00
Patrik Hagglund 3e2a9d6125 Silence gcc -Wpedantic.
llvm-svn: 216344
2014-08-24 09:12:33 +00:00
David Majnemer 0ffccf7fb5 InstCombine: Properly optimize or'ing bittests together
CFE, with -03, would turn:
bool f(unsigned x) {
  bool a = x & 1;
  bool b = x & 2;
  return a | b;
}

into:
  %1 = lshr i32 %x, 1
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

This sort of thing exposes a nasty pathology in GCC, ICC and LLVM.

Instead, we would rather want:
  %1 = and i32 %x, 3
  %2 = icmp ne i32 %1, 0

Things get a bit more interesting in the following case:
  %1 = lshr i32 %x, %y
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

Replacing it with the following sequence is better:
  %1 = shl nuw i32 1, %y
  %2 = or i32 %1, 1
  %3 = and i32 %2, %x
  %4 = icmp ne i32 %3, 0

This sequence is preferable because %1 doesn't involve %x and could
potentially be hoisted out of loops if it is invariant; only perform
this transform in the non-constant case if we know we won't increase
register pressure.

llvm-svn: 216343
2014-08-24 09:10:57 +00:00
Hal Finkel 584a70c820 [PowerPC] Add support for dcbtst and icbt (prefetch)
Adds code generation support for dcbtst (data cache prefetch for write) and
icbt (instruction cache prefetch for read - Book E cores only).

We still end up with a 'cannot select' error for the non-supported prefetch
intrinsic forms. This will be fixed in a later commit.

Fixes PR20692.

llvm-svn: 216339
2014-08-23 23:21:04 +00:00
Dylan Noblesmith c4c5180fb4 Support: add llvm::unique_lock
Based on the STL class of the same name, it guards a mutex
while also allowing it to be unlocked conditionally before
destruction.

This eliminates the last naked usages of mutexes in LLVM and
clang.

It also uncovered and fixed a bug in callExternalFunction()
when compiled without USE_LIBFFI, where the mutex would never
be unlocked if the end of the function was reached.

llvm-svn: 216338
2014-08-23 23:07:14 +00:00
Dylan Noblesmith 13044d1cc5 Support: make LLVM Mutexes STL-compatible
Use lock/unlock() convention instead of acquire/release().

llvm-svn: 216336
2014-08-23 22:49:22 +00:00
Dylan Noblesmith 4704ffe164 Support/Unix: use ScopedLock wherever possible
Only one function remains a bit too complicated
for a simple mutex guard. No functionality change.

llvm-svn: 216335
2014-08-23 22:49:17 +00:00
Dylan Noblesmith 63f9b57147 cmake: actually test -Wcomment
This test was testing nothing, as only -Werror was ever
being added to the compiler flags.

You can see the final nitty-gritty compiler invocation in
CMakeFiles/CMakeOutput.log (for successful tests) and
CMakeFiles/CMakeError.log (for failed tests).

Before:
Building C object CMakeFiles/cmTryCompileExec3385359576.dir/src.c.o
/usr/bin/clang   -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default  -DC_WCOMMENT_ALLOWS_LINE_WRAP  -Werror   -o CMakeFiles/cmTryCompileExec3385359576.dir/src.c.o   -c /home/nobled/code/llvm-b9/CMakeFiles/CMakeTmp/src.c

After:
Building C object CMakeFiles/cmTryCompileExec3385359576.dir/src.c.o
/usr/bin/clang   -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default  -DC_WCOMMENT_ALLOWS_LINE_WRAP  -Werror -Wcomment   -o CMakeFiles/cmTryCompileExec3385359576.dir/src.c.o   -c /home/nobled/code/llvm-b9/CMakeFiles/CMakeTmp/src.c

llvm-svn: 216328
2014-08-23 21:10:58 +00:00
Dylan Noblesmith 367cef6362 cmake: disable -Wnon-virtual-dtor when it gives false positives
clang has only been smart enough not to trigger -Wnon-virtual-dtor
warnings on final classes since r208449 (in clang 3.5). Building
with older versions is extremely noisy, so disable the warning
on those compilers.

llvm-svn: 216327
2014-08-23 21:10:56 +00:00
Chad Rosier ad7c910ecf Revert "ARM: improve RTABI 4.2 conformance on Linux"
This reverts commit r215862 due to nightly failures.  Will work on getting a
reduced test case, but I wanted to get our bots green in the meantime.

llvm-svn: 216325
2014-08-23 18:29:43 +00:00
Chad Rosier d2959362fb Revert "ARM: mark missing functions from RTABI"
This reverts commit r215863.

llvm-svn: 216324
2014-08-23 18:29:40 +00:00
Chandler Carruth a15258b4e6 [x86] Start fixing a really subtle and terrible form of miscompile in
these DAG combines.

The DAG auto-CSE thing is truly terrible. Due to it, when RAUW-ing
a node with its operand, you can cause its uses to CSE to itself, which
then causes their uses to become your uses which causes them to be
picked up by the RAUW. For nodes that are determined to be "no-ops",
this is "fine". But if the RAUW is one of several steps to enact
a transformation, this causes the DAG to really silently eat an discard
nodes that you would never expect. It took days for me to actually
pinpoint a test case triggering this and a really frustrating amount of
time to even comprehend the bug because I never even thought about the
ability of RAUW to iteratively consume nodes due to CSE-ing them into
itself.

To fix this, we have to build up a brand-new chain of operations any
time we are combining across (potentially) intervening nodes. But once
the logic is added to do this, another issue surfaces: CombineTo eagerly
deletes the one node combined, *but no others*. This is... really
frustrating. If deleting it makes its operands become dead, those
operand nodes often won't go onto the worklist in the
order you would want -- they're already on it and not near the top. That
means things higher on the worklist will get combined prior to these
dead nodes being GCed out of the worklist, and if the chain is long, the
immediate users won't be enough to re-detect where the root of the chain
is that became single-use again after deleting the dead nodes. The
better way to do this is to never immediately delete nodes, and instead
to just enqueue them so we can recursively delete them. The
combined-from node is typically not on the worklist anyways by virtue of
having been popped off.... But that in turn breaks other tests that
*require* CombineTo to delete unused nodes. :: sigh ::

Fortunately, there is a better way. This whole routine should have been
returning the replacement rather than using CombineTo which is quite
hacky. Switch to that, and all the pieces fall together.

I suspect the same kind of miscompile is possible in the half-shuffle
folding code, and potentially the recursive folding code. I'll be
switching those over to a pattern more like this one for safety's sake
even though I don't immediately have any test cases for them. Note that
the only way I got a test case for this instance was with *heavily* DAG
combined 256-bit shuffle sequences generated by my fuzzer. ;]

llvm-svn: 216319
2014-08-23 10:25:15 +00:00
Hans Wennborg 9a01309b3b ProgrammersManual: the flag is called -debug-only
llvm-svn: 216316
2014-08-23 04:34:58 +00:00
Alex Lorenz 7949a8b8ea llvm-cov: test: add xfail for the big-endian buildbots
llvm-svn: 216310
2014-08-23 00:47:24 +00:00
Nick Lewycky a4967c2740 Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug.
llvm-svn: 216307
2014-08-23 00:45:03 +00:00
Yunzhong Gao 300bdb35d4 Add a test case for SROA where the store size is bigger than slice size. The
test case was fixed in r216248.

llvm-svn: 216303
2014-08-22 23:27:04 +00:00
Rafael Espindola f7ecb11572 Add support for comdats to the gold plugin.
There are two parts to this. First, the plugin needs to tell gold the comdat by
setting comdat_key.

What gets things a bit more complicated is that gold only seems
symbols. In particular, if A is an alias to B, it only sees the symbols
A and B. It can then ask us to keep symbol A but drop symbol B. What
we have to do instead is to create an internal version of B and make A
an alias to that.

At some point some of this logic should be moved to lib/Linker so that
we don't map a Constant to an internal version just to have lib/Linker
map that again to the destination module.

The reason for implementing this in tools/gold for now is simplicity.
With it in place it should be possible to update clang to use comdats
for constructors and destructors on ELF without breaking the LTO
bootstrap. Once that is done I intend to come back and improve the
interface lib/Linker exposes.

llvm-svn: 216302
2014-08-22 23:26:10 +00:00
Alex Lorenz e82d89cc37 llvm-cov: add code coverage tool that's based on coverage mapping format and clang's pgo.
This commit expands llvm-cov's functionality by adding support for a new code coverage
tool that uses LLVM's coverage mapping format and clang's instrumentation based profiling.
The gcov compatible tool can be invoked by supplying the 'gcov' command as the first argument,
or by modifying the tool's name to end with 'gcov'.

Differential Revision: http://reviews.llvm.org/D4445

llvm-svn: 216300
2014-08-22 22:56:03 +00:00
Jingyue Wu ec33fa9aca [SROA] Fold a PHI node if all its incoming values are the same
Summary:
Fixes PR20425.

During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.

Test Plan: Added three more tests in phi-and-select.ll

Reviewers: nlewycky, eliben, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D4659

llvm-svn: 216299
2014-08-22 22:45:57 +00:00
Reid Kleckner 2d9bb65b3d ARM / x86_64 varargs: Don't save regparms in prologue without va_start
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.

Most of the test suite changes are adding va_start calls to existing
tests to keep things working.

llvm-svn: 216294
2014-08-22 21:59:26 +00:00
Rafael Espindola bd334e2f32 Clear the llvm release notes to make room for 3.6.
llvm-svn: 216292
2014-08-22 21:57:38 +00:00
Kevin Enderby b76d386d7c Add the start of the support for llvm-objdump’s -private-headers for Mach-O files.
This adds the printing of the mach header. Load command printing will be next.

llvm-svn: 216285
2014-08-22 20:35:18 +00:00
Kevin Enderby 7d7eeab365 Add a few missing mach header flags.
llvm-svn: 216284
2014-08-22 20:34:31 +00:00
Reid Kleckner e3f146d941 Fix PR17239 by changing the semantics of the RemainingArgsClass Option kind
This patch contains the LLVM side of the fix of PR17239.

This bug that happens because the /link (clang-cl.exe argument) is
marked as "consume all remaining arguments". However, when inside a
response file, /link should only consume all remaining arguments inside
the response file where it is located, not the entire command line after
expansion.

My patch will change the semantics of the RemainingArgsClass kind to
always consume only until the end of the response file when the option
originally came from a response file. There are only two options in this
class: dash dash (--) and /link.

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4899

Patch by Rafael Auler!

llvm-svn: 216280
2014-08-22 19:29:17 +00:00
Tom Stellard f3fc555e3b R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment
llvm-svn: 216279
2014-08-22 18:49:35 +00:00
Tom Stellard 85e8b6d5f9 R600/SI: Use a ComplexPattern for DS loads and stores
llvm-svn: 216278
2014-08-22 18:49:33 +00:00
Tom Stellard ca7ecf3dfa R600/SI: Wrap local memory pointer in AssertZExt on SI
These pointers are really just offsets and they will always be
less than 16-bits.  Using AssertZExt allows us to use computeKnownBits
to prove that these values are positive.  We will use this information
in a later commit.

llvm-svn: 216277
2014-08-22 18:49:31 +00:00
Tom Stellard 0510514e36 R600/SI: Use correct helper class for DS_WRITE2 instructions
DS_1A uses a single offset encoding, so offset1 wasn't being
encoded.

llvm-svn: 216276
2014-08-22 18:49:28 +00:00
Quentin Colombet d358e84d9c [ARM] Move the implementation of the target hooks related to copy-related
instruction from ARMInstrInfo to ARMBaseInstrInfo.
That way, thumb mode can also benefit from the advanced copy optimization.

<rdar://problem/12702965>

llvm-svn: 216274
2014-08-22 18:05:22 +00:00
David Majnemer 49775e0173 InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants
Consider:
  %add = add nuw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nuw' from the
instruction.

llvm-svn: 216273
2014-08-22 17:11:04 +00:00
David Majnemer 0e6c986696 InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN
We can preserve nsw during this transform if -C won't overflow.

llvm-svn: 216269
2014-08-22 16:41:23 +00:00
Alex Lorenz 5117674f55 [Support] Fix the overflow bug in ULEB128 decoding.
Differential Revision: http://reviews.llvm.org/D5029

llvm-svn: 216268
2014-08-22 16:29:45 +00:00
Sasa Stankovic 86ebfe24e5 [mips] Don't use odd-numbered float registers for double arguments for fastcc
calling convention if FP is 64-bit and +nooddspreg is used.

Differential Revision: http://reviews.llvm.org/D4981.diff

llvm-svn: 216262
2014-08-22 09:23:22 +00:00
David Majnemer 42b83a5e36 InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants
Consider:
  %add = add nsw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nsw' from the
instruction.

This fixes PR20377.

llvm-svn: 216261
2014-08-22 07:56:32 +00:00
Erik Eckstein b49d7abb7b fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions.
In unreachable blocks it's legal to have instructions like "%x = op %x".
Such instuctions are not schedulable. Therefore the SLPVectorizer has to check for
unreachable blocks and ignore them.

Fixes bug 20646.

llvm-svn: 216256
2014-08-22 01:18:39 +00:00
Peter Collingbourne fab565a56b [dfsan] Fix non-determinism bug in non-zero label check annotator.
We now use a std::vector instead of a DenseSet to store the list of
label checks so that we can iterate over it deterministically.

llvm-svn: 216255
2014-08-22 01:18:18 +00:00
David Majnemer 97ddca3224 ValueTracking: Figure out more bits when looking at add/sub
Given something like X01XX + X01XX, we know that the result must look
like X1XXX.

Adapted from a patch by Richard Smith, test-case written by me.

llvm-svn: 216250
2014-08-22 00:40:43 +00:00
Reid Kleckner c36f48f08a SROA: Handle a case of store size being smaller than allocation size
In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.

Fixes PR18726.

Reviewed By: chandlerc

Differential Revision: http://reviews.llvm.org/D5012

llvm-svn: 216248
2014-08-22 00:09:56 +00:00
Duncan P. N. Exon Smith c667974b65 Revert "X86: Align the stack on word boundaries in LowerFormalArguments()"
This (mostly) reverts commit r216119.

Somewhere during the review Reid committed r214980 which fixed this
another way, and I neglected to check that the testcase still failed
before committing.

I've left test/CodeGen/X86/aligned-variadic.ll around in case it adds
extra coverage.

llvm-svn: 216246
2014-08-21 23:36:08 +00:00
Reid Kleckner e42e4655ee Add an explicit move constructor to SrcBuffer
MSVC can't synthesize the explicit one.  Instead it tries to emit a copy
ctor which would call the deleted copy ctor of unique_ptr.

llvm-svn: 216244
2014-08-21 23:24:08 +00:00
Juergen Ributzka 0e0b4c1cda [FastISel][AArch64] Add support for variable shift.
This adds the missing variable shift support for value type i8, i16, and i32.

This fixes <rdar://problem/18095685>.

llvm-svn: 216242
2014-08-21 23:06:07 +00:00
Philip Reames 2c52c66816 Minor refactor to make applying patches from 'Add a "probe-stack" attribute' review thread out of order easier.
llvm-svn: 216241
2014-08-21 22:53:49 +00:00
David Blaikie 2f3f76fdb1 Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.

Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.

llvm-svn: 216239
2014-08-21 22:45:21 +00:00
Sanjay Patel 2cdea4c41e name change: isPow2DivCheap -> isPow2SDivCheap
isPow2DivCheap

That name doesn't specify signed or unsigned.

Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:

   srl/add/sra

is not the general sequence for signed integer division by power-of-2. We need one more 'sra':

   sra/srl/add/sra

That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.

This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D5010

llvm-svn: 216237
2014-08-21 22:31:48 +00:00
Quentin Colombet 6674b095b8 [PeepholeOptimizer] Enable the advanced copy optimization by default.
The advanced copy optimization does not yield any difference on the whole llvm
test-suite + SPECs, either in compile time or runtime (binaries are identical),
but has a big potential when data go back and forth between register files as
demonstrated with test/CodeGen/ARM/adv-copy-opt.ll.

Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64.

<rdar://problem/12702965>

llvm-svn: 216236
2014-08-21 22:23:52 +00:00
Philip Reames 4e8cb79425 Whitespace change to reduce diff in future patch.
Patch 2 of 11 in 'Add a "probe-stack" attribute' review thread

Patch by: john.kare.alsaker@gmail.com

llvm-svn: 216235
2014-08-21 22:19:16 +00:00
Philip Reames 34fcca723b [X86] Split out the logic to select the stack probe function (NFC)
Patch 1 of 11 in 'Add a "probe-stack" attribute' review thread.

Patch by: <john.kare.alsaker@gmail.com>

llvm-svn: 216233
2014-08-21 22:15:20 +00:00
Robin Morisset 26b808922b Add hooks for emitLeading/TrailingFence
llvm-svn: 216232
2014-08-21 22:09:25 +00:00
Robin Morisset 59c23cd946 Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

llvm-svn: 216231
2014-08-21 21:50:01 +00:00
Quentin Colombet 6b36337c09 [PeepholeOptimizer] Update the kill flags when extending the live-range of the
source of a copy.

<rdar://problem/12702965>

llvm-svn: 216229
2014-08-21 21:34:06 +00:00
Justin Bogner c8dc50fa51 Fix a URL (NFC)
llvm-svn: 216228
2014-08-21 21:09:24 +00:00
Juergen Ributzka addb75a4f3 [FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.

Also cleanup the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.

llvm-svn: 216225
2014-08-21 20:57:57 +00:00
David Blaikie 1961f14cf9 Explicitly pass ownership of the MemoryBuffer to AddNewSourceBuffer using std::unique_ptr
llvm-svn: 216223
2014-08-21 20:44:56 +00:00
Tom Stellard 745f2eddef R600/SI: Teach moveToVALU how to handle more S_LOAD_* instructions
llvm-svn: 216220
2014-08-21 20:41:00 +00:00
Tom Stellard 162a947160 R600/SI: Make sure SCRATCH_WAVE_OFFSET is added as Live-In to the function
This fixes a crash in an ocl conformance test.

llvm-svn: 216219
2014-08-21 20:40:58 +00:00
Tom Stellard 8e52375bb5 R600/SI: Remove unused SGPR spilling code
llvm-svn: 216218
2014-08-21 20:40:56 +00:00
Tom Stellard c5cf2f04d9 R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos
This will simplify the SGPR spilling and also allow us to use
MachineFrameInfo for calculating offsets, which should be more
reliable than our custom code.

This fixes a crash in some cases where a register would be spilled
in a branch such that the VGPR defined for spilling did not dominate
all the uses when restoring.

This fixes a crash in an ocl conformance test.  The test requries
register spilling and is too big to include.

llvm-svn: 216217
2014-08-21 20:40:54 +00:00
Tom Stellard 11aa80cc4a R600/SI: Handle VCC in SIRegisterInfo::getPhysRegSubReg()
This fixes a crash in an ocl conformance test.  The test requries
register spilling and is too big to include.

llvm-svn: 216216
2014-08-21 20:40:50 +00:00
Rafael Espindola 33466a745e Rewrite the gold plugin to fix pr19901.
There is a fundamental difference between how the gold API and lib/LTO view
the LTO process.

The gold API talks about a particular symbol in a particular file. The lib/LTO
API talks about a symbol in the merged module.

The merged module is then defined in terms of the IR semantics. In particular,
a linkonce_odr GV is only copied if it is used, since it is valid to drop
unused linkonce_odr GVs.

In the testcase in pr19901 both properties collide. What happens is that gold
asks us to keep a particular linkonce_odr symbol, but the IR linker doesn't
copy it to the merged module and we never have a chance to ask lib/LTO to keep
it.

This patch fixes it by having a more direct implementation of the gold API. If
it asks us to keep a symbol, we change the linkage so it is not linkonce. If it
says we can drop a symbol, we do so. All of this before we even send the module
to lib/Linker.

Since now we don't have to produce LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN,
during symbol resolution we can use a temporary LLVMContext and do lazy
module loading. This allows us to keep the minimum possible amount of
allocated memory around. This should also allow as much parallelism as
we want, since there is no shared context.

llvm-svn: 216215
2014-08-21 20:28:55 +00:00
Jonathan Roelofs f00e7e143e Satiate the sanitizer build bot
This fixes a missing initializer from r216182

llvm-svn: 216212
2014-08-21 20:09:15 +00:00
Rafael Espindola 7cebf36a95 Move some logic to populateLTOPassManager.
This will avoid code duplication in the next commit which calls it directly
from the gold plugin.

llvm-svn: 216211
2014-08-21 20:03:44 +00:00
Adam Nemet 5ed17dad95 [AVX512] Add class to group common template arguments related to vector type
We discussed the issue of generality vs. readability of the AVX512 classes
recently.  I proposed this approach to try to hide and centralize the mappings
we commonly perform based on the vector type.  A new class X86VectorVTInfo
captures these.

The idea is to pass an instance of this class to classes/multiclasses instead
of the corresponding ValueType.  Then the class/multiclass can use its field
for things that derive from the type rather than passing all those as separate
arguments.

I modified avx512_valign to demonstrate this new approach.  As you can see
instead of 7 related template parameters we now have one.  The downside is
that we have to refer to fields for the derived values.  I named the argument
'_' in order to make this as invisible as possible.  Please let me know if you
absolutely hate this.  (Also once we allow local initializations in
multiclasses we can recover the original version by assigning the fields to
local variables.)

Another possible use-case for this class is to directly map things, e.g.:

  RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC

llvm-svn: 216209
2014-08-21 19:50:07 +00:00
Alex Lorenz 936b99c942 Coverage Mapping: add function's hash to coverage function records.
The profile data format was recently updated and the new indexing api
requires the code coverage tool to know the function's hash as well
as the function's name to get the execution counts for a function.

Differential Revision: http://reviews.llvm.org/D4994

llvm-svn: 216207
2014-08-21 19:23:25 +00:00
Rafael Espindola 40bfd6db57 llvm-gcc is dead.
llvm-svn: 216206
2014-08-21 19:22:24 +00:00
Eric Fiselier 5b0e0e9436 [LIT] Remove documentation for method since it does not exist
llvm-svn: 216204
2014-08-21 18:52:58 +00:00
Rafael Espindola 216e0c0617 Respect LibraryInfo in populateLTOPassManager and use it. NFC.
llvm-svn: 216203
2014-08-21 18:49:52 +00:00
Rafael Espindola df1836f750 Remove dead code. NFC.
llvm-svn: 216201
2014-08-21 18:11:21 +00:00
Quentin Colombet 0c740d4b9a [AArch64] Run a peephole pass right after AdvSIMD pass.
The AdvSIMD pass may produce copies that are not coalescer-friendly. The
peephole optimizer knows how to fix that as demonstrated in the test case.

<rdar://problem/12702965>

llvm-svn: 216200
2014-08-21 18:10:07 +00:00
Juergen Ributzka c83265a6c5 [FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI.
llvm-svn: 216199
2014-08-21 18:02:25 +00:00
Moritz Roth dfdda0d41c Thumb1 load/store optimizer: Improve code to materialize new base register.
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.

Differential Revision: http://reviews.llvm.org/D5006

llvm-svn: 216193
2014-08-21 17:11:03 +00:00
Hans Wennborg f4cb573268 Use returns_nonnull in BumpPtrAllocator and MallocAllocator to avoid null-check in placement new
In both Clang and LLVM, this is a common pattern:

  Size = sizeof(DeclRefExpr) + SomeExtraStuff;
  void *Mem = Context.Allocate(Size, llvm::alignOf<DeclRefExpr>());
  return new (Mem) DeclRefExpr(...);

The annoying thing is that because the default placement-new operator has a
nothrow specification, the compiler will insert a null check of Mem before
calling the DeclRefExpr constructor. This null check is redundant for us,
because we expect the allocation functions to never return null.

By annotating the allocator functions with returns_nonnull, we can optimize
away these checks. Compiling clang with a recent version of Clang and measuring
with:

  $ perf stat -r20 bin/clang.patch -fsyntax-only -w gcc.c && perf stat -r20 bin/clang.orig -fsyntax-only -w gcc.c

Shows a 2.4% speed-up (+- 0.8%).

The pattern occurs in LLVM too. Measuring with -O3 (and now using bzip2.c
instead, because it's smaller):

  $ perf stat -r20 bin/clang.patch -O3 -w bzip2.c  &&  perf stat -r20 bin/clang.orig -O3 -w bzip2.c

Shows 4.4 % speed-up (+- 1%).

If anyone knows of a similar attribute we can use for MSVC, or some other
technique to get rid off the null check there, please let me know.

Differential Revision: http://reviews.llvm.org/D4989

llvm-svn: 216192
2014-08-21 17:10:00 +00:00
Juergen Ributzka 95c0f153e4 [FastISel][AArch64] Remove redundant test.
These tests and many more are already covered by fast-isel-addressing-modes.ll.

llvm-svn: 216186
2014-08-21 16:40:05 +00:00
Jonathan Roelofs 5e98ff967b Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984

llvm-svn: 216182
2014-08-21 14:35:47 +00:00
Rafael Espindola e07caad9e7 Handle inlining in populateLTOPassManager like in populateModulePassManager.
No functionality change.

llvm-svn: 216178
2014-08-21 13:35:30 +00:00
Zinovy Nis 33406da5f4 [CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing.
llvm-svn: 216176
2014-08-21 13:30:05 +00:00
Benjamin Kramer ff8b883772 DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments.
PR20677

llvm-svn: 216175
2014-08-21 13:28:02 +00:00
Rafael Espindola 208bc533cd Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.
llvm-svn: 216174
2014-08-21 13:13:17 +00:00
Josh Klontz fbe17d6a32 X86AsmPrinter MCJIT MSVC bug fix.
Summary:
This bug was introduced in r213006 which makes an assumption that MCSection is COFF for Windows MSVC. This assumption is broken for MCJIT users where ELF is used instead [1]. The fix is to change the MCSection cast to a dyn_cast.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068407.html.

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4872

llvm-svn: 216173
2014-08-21 12:55:27 +00:00
Oliver Stannard 51b1d460cb [ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.

This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.

llvm-svn: 216172
2014-08-21 12:50:31 +00:00
Rafael Espindola 18b2a258c3 Sort declarations.
llvm-svn: 216171
2014-08-21 12:39:07 +00:00
Benjamin Kramer 002a1ced06 Make format_object_base's destructor protected and non-virtual.
It's not meant to be used with operator delete and this avoids emitting virtual
dtors for every derived format object.

llvm-svn: 216170
2014-08-21 11:22:05 +00:00
Erik Verbruggen 2b98bd2a80 Reassociate x + -0.1234 * y into x - 0.1234 * y
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:

  return ((x + 0.1234 * y) * (x - 0.1234 * y));

Differential Revision: http://reviews.llvm.org/D4904

llvm-svn: 216169
2014-08-21 10:45:30 +00:00
Benjamin Kramer b791ef21d2 X86: Turn redundant if into an assertion.
While there remove noop casts.

llvm-svn: 216168
2014-08-21 10:31:37 +00:00
Robert Khasanov 46409eae8e [x86] Added _addcarry_ and _subborrow_ intrinsics
llvm-svn: 216164
2014-08-21 09:43:43 +00:00
Robert Khasanov 86ca6aaf40 [x86] SMAP: added HasSMAP attribute for CLAC/STAC, corrected attributes
llvm-svn: 216163
2014-08-21 09:34:12 +00:00
Robert Khasanov 7c5a843646 [x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.
llvm-svn: 216162
2014-08-21 09:27:00 +00:00
Robert Khasanov 98441b6e7f [x86] Enable Broadwell target.
Added FeatureSMAP.

Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP

llvm-svn: 216161
2014-08-21 09:16:12 +00:00
Zinovy Nis 0a36cba29d [INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions.
Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:

int N = 100;
float **A;

void foo(int x0, int x1)
{
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x).
        {
          // Currently only [x+N] case is widened. Others 2 cases lead to sext.
          // This patch fixes it, so all 3 cases do not need sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast  -fno-unroll-loops -fno-tree-vectorize -S -o -

Differential Revision: http://reviews.llvm.org/D4695

llvm-svn: 216160
2014-08-21 08:25:45 +00:00
Elena Demikhovsky 08f8596cc0 IntelJITEventListener updates to fix breaks by recent changes to EngineBuilder and DIContext.
By Arch Robison.

llvm-svn: 216159
2014-08-21 07:01:55 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
David Majnemer 5d1aeba2ea InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.

llvm-svn: 216157
2014-08-21 05:14:48 +00:00
Craig Topper 3ced27c835 Remove custom implementations of max/min in StringRef that was originally added to work an old gcc bug. I believe its been fixed by now.
llvm-svn: 216156
2014-08-21 04:31:10 +00:00
Eric Fiselier a4e211edad add self to credits
llvm-svn: 216155
2014-08-21 04:27:11 +00:00
Jiangning Liu 950844fadb Fix a bug around truncating vector in const prop.
In constant folding stage, "TRUNC" can't handle vector data type.

llvm-svn: 216149
2014-08-21 02:12:35 +00:00
Jiangning Liu deb4b5fc37 Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
llvm-svn: 216147
2014-08-21 01:59:30 +00:00
Quentin Colombet 689623009b [PeepholeOptimizer] Take advantage of the isInsertSubreg property in the
advanced copy optimization.

This is the final step patch toward transforming:
udiv    r0, r0, r2
udiv    r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16
bx      lr

into:
udiv    r0, r0, r2
udiv    r1, r1, r3
bx      lr

Indeed, thanks to this patch, this optimization is able to look through
vmov.32 d16[0], r0
vmov.32 d16[1], r1

and is able to rewrite the following sequence:
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16

into simple generic GPR copies that the coalescer managed to remove.

<rdar://problem/12702965>

llvm-svn: 216144
2014-08-21 00:19:16 +00:00
Quentin Colombet 84f15bd1b0 [ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related
target hook.

This patch teaches the compiler that:
dX = VSETLNi32 dY, rZ, imm
is the same as:
dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm)

<rdar://problem/12702965>

llvm-svn: 216143
2014-08-21 00:10:52 +00:00
James Molloy a88896b5c0 [LoopVectorize] Up the maximum unroll factor to 4 for AArch64
Only for Cortex-A57 and Cyclone for now, where it has shown wins.

llvm-svn: 216141
2014-08-21 00:02:51 +00:00
James Molloy 82c995d450 [LoopVectorizer] Limit unroll factor in the presence of nested reductions.
If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit, by default to 2, so the critical path only gets increased by one reduction operation.

llvm-svn: 216140
2014-08-20 23:53:52 +00:00
Quentin Colombet 7e3da6677a Add isInsertSubreg property.
This patch adds a new property: isInsertSubreg and the related target hooks:
TargetIntrInfo::getInsertSubregInputs and
TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific
instruction is a (kind of) INSERT_SUBREG.

The approach is similar to r215394.

<rdar://problem/12702965>

llvm-svn: 216139
2014-08-20 23:49:36 +00:00
Jonathan Roelofs 44937d98a3 Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch trades simplicity for implementation time at the expense
of performance... As they say: correctness first, then performance.

See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.

llvm-svn: 216138
2014-08-20 23:38:50 +00:00
Quentin Colombet a56749064a Mention the right target hook in the comment on isExtractSubreg property.
llvm-svn: 216137
2014-08-20 23:25:28 +00:00
Quentin Colombet 67639df146 [PeepholeOptimizer] Take advantage of the isExtractSubreg property in the
advanced copy optimization.

This patch is a step toward transforming:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
bx	lr

Indeed, thanks to this patch, this optimization is able to look through
vmov r0, r1, d16
but it does not understand yet
vmov.32 d16[0], r0
vmov.32 d16[1], r1

Comming patches will fix that and update the related test case.

<rdar://problem/12702965>

llvm-svn: 216136
2014-08-20 23:13:02 +00:00
Yi Jiang 1a4e73d7bf New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition
llvm-svn: 216135
2014-08-20 22:55:40 +00:00
Alexey Samsonov e5864c69a8 Don't allow MCStreamer::EmitIntValue to output 0-byte integers.
It makes no sense and can hide bugs. In particular, it lead
to left shift by 64 bits, which is an undefined behavior,
properly reported by UBSan.

llvm-svn: 216134
2014-08-20 22:46:38 +00:00
Quentin Colombet deb82eab3e [ARM] Mark VMOVRRD with the ExtractSubreg property and implement the related
target hook.

This patch teaches the compiler that:
rX, rY = VMOVRRD dZ
is the same as:
rX = EXTRACT_SUBREG dZ, ssub_0
rY = EXTRACT_SUBREG dZ, ssub_1

<rdar://problem/12702965>

llvm-svn: 216132
2014-08-20 22:16:19 +00:00
Alexey Samsonov fffd56ecdf Fix undefined behavior (left shift of negative value) in SystemZ backend.
This bug is reported by UBSan.

llvm-svn: 216131
2014-08-20 21:56:43 +00:00
Quentin Colombet 7e75cbaf47 Add isExtractSubreg property.
This patch adds a new property: isExtractSubreg and the related target hooks:
TargetIntrInfo::getExtractSubregInputs and
TargetInstrInfo::getExtractSubregLikeInputs to specify that a target specific
instruction is a (kind of) EXTRACT_SUBREG.

The approach is similar to r215394.

<rdar://problem/12702965>

llvm-svn: 216130
2014-08-20 21:51:26 +00:00
Alexey Samsonov e229ec5bfc Fix null reference creation in SelectionDAG constructor.
Store TargetSelectionDAGInfo as a pointer instead of a reference:
getSelectionDAGInfo() may not be implemented for certain backends
(e.g. it's not currently implemented for R600).

This bug is reported by UBSan.

llvm-svn: 216129
2014-08-20 21:40:15 +00:00
Alexey Samsonov 2651ae6513 Fix undefined behavior (left shift of negative value) in Hexagon backend.
This bug is reported by UBSan.

llvm-svn: 216125
2014-08-20 21:22:03 +00:00
Alexey Samsonov ea0aee622e Cleanup: Delete seemingly unused reference to MachineDominatorTree from ScheduleDAGInstrs.
llvm-svn: 216124
2014-08-20 20:57:26 +00:00
Sanjay Patel bba72c7c1e Don't prevent a vselect of constants from becoming a single load (PR20648).
Fix for PR20648 - http://llvm.org/bugs/show_bug.cgi?id=20648

This patch checks the operands of a vselect to see if all values are constants.
If yes, bail out of any further attempts to create a blend or shuffle because
SelectionDAGLegalize knows how to turn this kind of vselect into a single load.

This already happens for machines without SSE4.1, so the added checks just send
more targets down that path.

Differential Revision: http://reviews.llvm.org/D4934

llvm-svn: 216121
2014-08-20 20:34:56 +00:00
Duncan P. N. Exon Smith 7bb10f8a85 X86: Add missing triples from r216119
llvm-svn: 216120
2014-08-20 19:58:59 +00:00
Duncan P. N. Exon Smith b18263531d X86: Align the stack on word boundaries in LowerFormalArguments()
The goal of the patch is to implement section 3.2.3 of the AMD64 ABI
correctly.  The controlling sentence is, "The size of each argument gets
rounded up to eightbytes.  Therefore the stack will always be eightbyte
aligned." The equivalent sentence in the i386 ABI page 37 says, "At all
times, the stack pointer should point to a word-aligned area."  For both
architectures, the stack pointer is not being rounded up to the nearest
eightbyte or word between the last normal argument and the first
variadic argument.

Patch by Thomas Jablin!

llvm-svn: 216119
2014-08-20 19:40:59 +00:00
Alexey Samsonov 8968e6d1b0 Fix null reference creation in ScheduleDAGInstrs constructor call.
Both MachineLoopInfo and MachineDominatorTree may be null in ScheduleDAGMI
constructor call. It is undefined behavior to take references to these values.

This bug is reported by UBSan.

llvm-svn: 216118
2014-08-20 19:36:05 +00:00
Keno Fischer d750723d29 Do not insert a tail call when returning multiple values on X86
Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=19530.
The problem is that X86ISelLowering erroneously thought the third call
was eligible for tail call elimination.
It would have been if it's return value was actually the one returned
by the calling function, but here that is not the case and
additional values are being returned.

Test Plan: Test case from the original bug report is included.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D4968

llvm-svn: 216117
2014-08-20 19:00:37 +00:00
Alexey Samsonov 314c643b4a Fix undefined behavior (left shift by 64 bits) in ScaledNumber::toString().
This bug is reported by UBSan.

llvm-svn: 216116
2014-08-20 18:30:07 +00:00
Sanjay Patel f3cfeef2e9 critical-anti-dependency breaker: don't use reg def info from kill insts (PR20308)
In PR20308 ( http://llvm.org/bugs/show_bug.cgi?id=20308 ), the critical-anti-dependency breaker
caused a miscompile because it broke a WAR hazard using a register that it thinks is available
based on info from a kill inst. Until PR18663 is solved, we shouldn't use any def/use info from
a kill because they are really just nops.

This patch adds guard checks for kills around calls to ScanInstruction() where the DefIndices
array is set. For good measure, add an assert in ScanInstruction() so we don't hit this bug again.

The test case is a reduced version of the code from the bug report.

Differential Revision: http://reviews.llvm.org/D4977

llvm-svn: 216114
2014-08-20 18:03:00 +00:00
Quentin Colombet 03e43f8e68 [PeepholeOptimizer] Refactor the advanced copy optimization to take advantage of
the isRegSequence property.

This is a follow-up of r215394 and r215404, which respectively introduces the
isRegSequence property and uses it for ARM.

Thanks to the property introduced by the previous commits, this patch is able
to optimize the following sequence:
vmov	d0, r2, r3
vmov	d1, r0, r1
vmov	r0, s0
vmov	r1, s2
udiv	r0, r1, r0
vmov	r1, s1
vmov	r2, s3
udiv	r1, r2, r1
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

This patch refactors how the copy optimizations are done in the peephole
optimizer. Prior to this patch, we had one copy-related optimization that
replaced a copy or bitcast by a generic, more suitable (in terms of register
file), copy.

With this patch, the peephole optimizer features two copy-related optimizations:
1. One for rewriting generic copies to generic copies:
PeepholeOptimizer::optimizeCoalescableCopy.
2. One for replacing non-generic copies with generic copies:
PeepholeOptimizer::optimizeUncoalescableCopy.

The goals of these two optimizations are slightly different: one rewrite the
operand of the instruction (#1), the other kills off the non-generic instruction
and replace it by a (sequence of) generic instruction(s).

Both optimizations rely on the ValueTracker introduced in r212100.

The ValueTracker has been refactored to use the information from the
TargetInstrInfo for non-generic instruction. As part of the refactoring, we
switched the tracking from the index of the definition to the actual register
(virtual or physical). This one change is to provide better consistency with
register related APIs and to ease the use of the TargetInstrInfo.

Moreover, this patch introduces a new helper class CopyRewriter used to ease the
rewriting of generic copies (i.e., #1).

Finally, this patch adds a dead code elimination pass right after the peephole
optimizer to get rid of dead code that may appear after rewriting.

This is related to <rdar://problem/12702965>.

Review: http://reviews.llvm.org/D4874
llvm-svn: 216088
2014-08-20 17:41:48 +00:00
Andrew Trick 2223f8edbd Tweak CFGPrinter to wrap very long names.
I added wrapping to the CFGPrinter a while back so the -view-cfg
output is actually viewable. I've since enountered very long mangled
names with the same problem, so I'm slightly tweaking this code to
work in that case.

llvm-svn: 216087
2014-08-20 17:38:12 +00:00
Rafael Espindola 1c509715e6 Remove unused field.
llvm-svn: 216086
2014-08-20 17:33:44 +00:00
Juergen Ributzka e1bb055ed3 [FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare.
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.

Related to <rdar://problem/17913111>.

llvm-svn: 216073
2014-08-20 16:34:15 +00:00
Rafael Espindola 061beab673 Quick fix for an use after free.
llvm-svn: 216071
2014-08-20 15:19:37 +00:00
Dan Liew 2661dfc7b9 Add note to LangRef about how function arguments can be unnamed and
how this affects the numbering of unnamed temporaries.

llvm-svn: 216070
2014-08-20 15:06:30 +00:00
Aaron Ballman 474972585a Silencing a -Wcast-qual warning. NFC.
llvm-svn: 216068
2014-08-20 12:54:13 +00:00
Aaron Ballman bf6ee22113 Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
llvm-svn: 216067
2014-08-20 12:14:35 +00:00
Jiangning Liu f841b3b79e Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type
legalization stage. With those two optimizations, fewer signed/zero extension
instructions can be inserted, and then we can expose more opportunities to
Machine CSE pass in back-end.

llvm-svn: 216066
2014-08-20 12:05:15 +00:00
Pavel Chupin 01a4e0a1ef [x32] Fix FrameIndex check in SelectLEA64_32Addr
Summary:
Fixes http://llvm.org/bugs/show_bug.cgi?id=20016 reproducible on new
lea-5.ll case.
Also use RSP/RBP for x32 lea to save 1 byte used for 0x67 prefix in
ESP/EBP case.

Test Plan: lea tests modified to include x32/nacl and new test added

Reviewers: nadav, dschuff, t.p.northover

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4929

llvm-svn: 216065
2014-08-20 11:59:22 +00:00
Yi Kong c655f0c898 ARM: Fix codegen for rbit intrinsic
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.

The bug was originally introduced in r211057.

Differential Revision: http://reviews.llvm.org/D4980

llvm-svn: 216064
2014-08-20 10:40:20 +00:00
Bill Wendling 5d53607392 Update projects lists.
llvm-svn: 216048
2014-08-20 07:32:09 +00:00
Bill Wendling e971d6fd26 Add libcxxabi to the projects.
llvm-svn: 216047
2014-08-20 07:30:08 +00:00
David Majnemer 42158f3eea InstCombine: Annotate sub with nuw when we prove it's safe
We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is
negative and the right-hand side is non-negative.

llvm-svn: 216045
2014-08-20 07:17:31 +00:00
Craig Topper 298f63803b Fix an off by 1 bug that prevented SmallPtrSet from using all of its 'small' capacity. Then fix the early return in the move constructor that prevented 'small' moves from clearing the NumElements in the moved from object. The directed test missed this because it was always testing large moves due to the off by 1 bug.
llvm-svn: 216044
2014-08-20 04:41:36 +00:00
NAKAMURA Takumi b8083476f1 Constants.h: Fix possible typo in r216015. [-Wdocumentation]
llvm-svn: 216043
2014-08-20 04:22:47 +00:00
Peter Collingbourne f39430bd4a [dfsan] Treat vararg custom functions like unimplemented functions.
Because declarations of these functions can appear in places like autoconf
checks, they have to be handled somehow, even though we do not support
vararg custom functions. We do so by printing a warning and calling the
uninstrumented function, as we do for unimplemented functions.

llvm-svn: 216042
2014-08-20 01:40:23 +00:00
Juergen Ributzka 0781b860e4 [FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0.
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.

This is related to <rdar://problem/18027157>.

llvm-svn: 216040
2014-08-20 01:10:36 +00:00
David Majnemer 57d5bc8849 InstCombine: Annotate sub with nsw when we prove it's safe
We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
- The sign bits of the operands is the same.
- Both operands have more than 1 sign bit.

The subtraction cannot be a signed overflow in either case.

llvm-svn: 216037
2014-08-19 23:36:30 +00:00
Hans Wennborg fd1f0f17c5 BumpPtrAllocator: don't accept 0 for the alignment parameter
It seems unnecessary to have to use an extra branch to check for this special case.

http://reviews.llvm.org/D4945

llvm-svn: 216036
2014-08-19 23:35:33 +00:00
Juergen Ributzka c0886dd5b0 [FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding.
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture support. This includes better immedediate support,
shift folding, and sign-/zero-extend folding.

This fixes <rdar://problem/17913111>.

llvm-svn: 216033
2014-08-19 22:29:55 +00:00
Rafael Espindola 3f3d7acbcf Split parseAssembly into parseAssembly and parseAssemblyInto.
This should restore the functionality of parsing new code into an existing
module without the confusing interface.

llvm-svn: 216031
2014-08-19 22:05:47 +00:00
Alexey Samsonov 96b02d1573 Delete unused argument in AArch64MCInstLower constructor: it doesn't
use Mangler, and Mangler is in fact not even created when AArch64MCInstLower
is constructed.

This bug is reported by UBSan.

llvm-svn: 216030
2014-08-19 21:51:08 +00:00
Duncan P. N. Exon Smith 23046653be LangRef: Move example of function-scope uselistorder to a function
Should make the example added in r216025 a little more clear.

llvm-svn: 216027
2014-08-19 21:48:04 +00:00
Duncan P. N. Exon Smith 0a448fbca3 IR: Implement uselistorder assembly directives
Implement `uselistorder` and `uselistorder_bb` assembly directives,
which allow the use-list order to be recovered when round-tripping to
assembly.

This is the bulk of PR20515.

llvm-svn: 216025
2014-08-19 21:30:15 +00:00
Lang Hames df7be5105a [MCJIT] Add an i386 RuntimeDyldMachO test case.
llvm-svn: 216024
2014-08-19 21:26:36 +00:00
Duncan P. N. Exon Smith 35de5b8ca1 IR: Fix a missed case when threading OnlyIfReduced through ConstantExpr
In r216015 I missed propagating `OnlyIfReduced` through the inline
versions of `getGetElementPtr()` (I was relying on compile failures on
mismatches between the header and source signatures to get them all).

llvm-svn: 216023
2014-08-19 21:18:21 +00:00
Duncan P. N. Exon Smith c8eccd1147 verify-uselistorder: Force -preserve-bc-use-list-order
llvm-svn: 216022
2014-08-19 21:08:27 +00:00
Juergen Ributzka 6afeffb078 [FastISel][AArch64] Extend floating-point materialization test.
This adds the missing test that I promised for r215753 to test the
materialization of the floating-point value +0.0.

Related to <rdar://problem/18027157>.

llvm-svn: 216019
2014-08-19 20:35:07 +00:00
Rafael Espindola 066b46abcb fix the gcc build
llvm-svn: 216018
2014-08-19 20:06:25 +00:00
Lang Hames eb99df8ee2 [MCJIT] Allow '$' characters in symbol names in RuntimeDyldChecker.
llvm-svn: 216017
2014-08-19 20:04:45 +00:00
Duncan P. N. Exon Smith 33de00cf59 IR: Fix ConstantExpr::replaceUsesOfWithOnConstant()
Change `ConstantExpr` to follow the model the other constants are using:
only malloc a replacement if it's going to be used.  This fixes a subtle
bug where if an API user had used `ConstantExpr::get()` already to
create the replacement but hadn't given it any users, we'd delete the
replacement.

This relies on r216015 to thread `OnlyIfReduced` through
`ConstantExpr::getWithOperands()`.

llvm-svn: 216016
2014-08-19 20:03:35 +00:00
Duncan P. N. Exon Smith 170eb985da IR: Thread OnlyIfReduced through ConstantExpr::getWithOperands()
In order to change `ConstantExpr::replaceUsesOfWithOnConstant()` to work
like other constants (e.g., using `ConstantArray::getImpl()`), thread
`OnlyIfReduced` through as necessary.  When `OnlyIfReduced` is false,
there's no functionality change.  When it's true, if there's no constant
folding or type changes `nullptr` is returned instead of the new
constant.

`ConstantExpr::replaceUsesOfWithOnConstant()` will be updated to use the
"true" version in a follow-up commit.

llvm-svn: 216015
2014-08-19 19:45:37 +00:00
Rafael Espindola a5dbed4936 Fix the MSVC build.
llvm-svn: 216014
2014-08-19 19:45:15 +00:00
Juergen Ributzka b46ea081ad Reapply [FastISel][AArch64] Add support for more addressing modes (r215597).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

llvm-svn: 216013
2014-08-19 19:44:17 +00:00
Juergen Ributzka e3698ab6e3 Reapply [FastISel][X86] Add large code model support for materializing floating-point constants (r215595).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.

Fixes <rdar://problem/17674628>.

llvm-svn: 216012
2014-08-19 19:44:13 +00:00
Juergen Ributzka 89d187b387 Reapply [FastISel][X86] Use XOR to materialize the "0" value (r215594).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

llvm-svn: 216011
2014-08-19 19:44:10 +00:00
Juergen Ributzka 4952c35afd Reapply [FastISel][X86] Emit more efficient instructions for integer constant materialization (r215593).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.

This fixes <rdar://problem/17420988>.

llvm-svn: 216010
2014-08-19 19:44:06 +00:00
Juergen Ributzka 7e23f77d82 Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

llvm-svn: 216009
2014-08-19 19:44:02 +00:00
Duncan P. N. Exon Smith 349f6982ac ADT: Unit test for ArrayRef::equals change in r215986
llvm-svn: 216008
2014-08-19 19:18:46 +00:00
Duncan P. N. Exon Smith 909620a33d IR: De-duplicate code for replacing operands in place
This is non-trivial and sits in three places.  Move it to
ConstantUniqueMap.

llvm-svn: 216007
2014-08-19 19:13:30 +00:00
Juergen Ributzka 4bf6c01cdb Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588).
Note: This was originally reverted to track down a buildbot error. This commit
exposed a latent bug that was fixed in r215753. Therefore it is reapplied
without any modifications.

I run it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new
regeressions.

Original commit message:
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.

On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed too.

On AArch64 it would materialize the constant 0 in a register even the
architecture has an actual "zero" register.

On ARM it would generate unnecessary mov instructions or not use mvn.

This change simply changes the order and always asks the target first if it
likes to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.

Related to <rdar://problem/17420988>.

llvm-svn: 216006
2014-08-19 19:05:24 +00:00
Rafael Espindola 113ba3e6c0 Fix a pair of use after free. Should bring the bots back.
llvm-svn: 216005
2014-08-19 18:59:14 +00:00
Rafael Espindola 48af1c2a1a Don't own the buffer in object::Binary.
Owning the buffer is somewhat inflexible. Some Binaries have sub Binaries
(like Archive) and we had to create dummy buffers just to handle that. It is
also a bad fit for IRObjectFile where the Module wants to own the buffer too.

Keeping this ownership would make supporting IR inside native objects
particularly painful.

This patch focuses in lib/Object. If something elsewhere used to own an Binary,
now it also owns a MemoryBuffer.

This patch introduces a few new types.

* MemoryBufferRef. This is just a pair of StringRefs for the data and name.
  This is to MemoryBuffer as StringRef is to std::string.
* OwningBinary. A combination of Binary and a MemoryBuffer. This is needed
  for convenience functions that take a filename and return both the
  buffer and the Binary using that buffer.

The C api now uses OwningBinary to avoid any change in semantics. I will start
a new thread to see if we want to change it and how.

llvm-svn: 216002
2014-08-19 18:44:46 +00:00
Alexey Samsonov f17f03e00e Hide two different AlignMode enums in anonymous namespaces. This bug is reported by UBSan.
llvm-svn: 216001
2014-08-19 18:40:39 +00:00
Renato Golin 06d601fb3e Revert "Small refactor on VectorizerHint for deduplication"
This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness.

llvm-svn: 215999
2014-08-19 18:08:50 +00:00
Juergen Ributzka 5460cbfda4 [FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register.
This fixes a few BuildMI callsites where the result register was added by
using addReg, which is per default a use and therefore an operand register.

Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).

llvm-svn: 215997
2014-08-19 17:41:53 +00:00
Renato Golin dd6394d833 Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

llvm-svn: 215994
2014-08-19 17:30:43 +00:00
Alex Lorenz 6a128331a6 Docs: add documentation for the coverage mapping format.
Differential Revision: http://reviews.llvm.org/D4729

llvm-svn: 215990
2014-08-19 17:05:58 +00:00
Rafael Espindola 11c07d7eec Modernize the .ll parsing interface.
* Use StringRef instead of std::string&
* Return a std::unique_ptr<Module> instead of taking an optional module to write
  to (was not really used).
* Use current comment style.
* Use current naming convention.

llvm-svn: 215989
2014-08-19 16:58:54 +00:00
Duncan P. N. Exon Smith 38f556d96d CodingStandards: Document std::equal misbehaviour
I should have included this as part of r215986, which worked around this
corner by changing ArrayRef::equals() not to use std::equal.  Alas.

llvm-svn: 215988
2014-08-19 16:49:40 +00:00
Duncan P. N. Exon Smith 317c139f23 Reapply r215966, r215965, r215964, r215963, r215960, r215959, r215958, and r215957
This reverts commit r215981, which reverted the above commits because
MSVC std::equal asserts on nullptr iterators, and thes commits
introduced an `ArrayRef::equals()` on empty ArrayRefs.

ArrayRef was changed not to use std::equal in r215986.

llvm-svn: 215987
2014-08-19 16:39:58 +00:00
Duncan P. N. Exon Smith 16bec8e667 ADT: Avoid using std::equal in ArrayRef::equals
MSVC's STL has a bug in `std::equal()`: it asserts on nullptr iterators,
causing a block revert in r215981.  This works around that by re-writing
`ArrayRef::equals()` to do the work itself.

llvm-svn: 215986
2014-08-19 16:36:21 +00:00
Aaron Ballman e4b91dca91 Reverting r215966, r215965, r215964, r215963, r215960, r215959, r215958, and r215957 (these commits all rely on previous commits) due to build breakage. These commits cause failed assertions when testing Clang using MSVC 2013. The asserts are triggered from the std::equal call within ArrayRef::equals due to being passed invalid input (ArrayRef.begin() is returning a nullptr which is problematic).
llvm-svn: 215981
2014-08-19 14:59:02 +00:00
Toma Tabacu 85618b3194 [mips] Add assembler support for .set arch=x directive.
Summary:
This directive is similar to ".set mipsX".
It is used to change the CPU target of the assembler, enabling it to accept instructions for a specific CPU.

This patch only implements the r4000 CPU (which is treated internally as generic mips3) and the generic ISAs.

Contains work done by Matheus Almeida.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4884

llvm-svn: 215978
2014-08-19 14:22:52 +00:00
Mayur Pandey 960507beb4 InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Differential Revision: http://reviews.llvm.org/D4898

llvm-svn: 215974
2014-08-19 08:19:19 +00:00
Craig Topper 97ebe53032 Const-correct and prevent a copy of a SmallPtrSet.
llvm-svn: 215973
2014-08-19 07:44:27 +00:00
Craig Topper 1674d9988d Prevent use of the implicit copy constructor on SmallPtrSetImpl. An accidental copy caused my SmallPtrSet->SmallPtrSetImpl conversion commit to fail the other day.
llvm-svn: 215971
2014-08-19 06:57:14 +00:00