Commit Graph

108554 Commits

Author SHA1 Message Date
Hans Wennborg 1b1a399489 MachObjectWriter: optimize the string table for common suffices
This is a follow-up to r207670 (ELF) and r218636 (COFF).

Differential Revision: http://reviews.llvm.org/D5622

llvm-svn: 219126
2014-10-06 17:05:19 +00:00
Timur Iskhodzhanov 116033303b Fix dumping codeview line tables when there are multiple debug sections
Codeview line tables for functions in different sections refer to a common
STRING_TABLE_SUBSECTION for filenames.
This happens when building with -Gy or with inline functions with MSVC.

Original patch by Jeff Muizelaar!

llvm-svn: 219125
2014-10-06 16:59:44 +00:00
Benjamin Kramer 6bf8af5de9 DbgValueHistoryCalculator: Store modified registers in a BitVector instead of std::set.
And iterate over the smaller map instead of the larger set first.  Reduces the time spent in
calculateDbgValueHistory by 30-40%.

llvm-svn: 219123
2014-10-06 15:31:04 +00:00
Hal Finkel 8eae3ad2ff [CFL-AA] Update for handling of globals and more tests
We used to return PartialAlias if *either* variable being queried interacted
with arguments or globals. AFAICT, we can change this to only returning
MayAlias iff *both* variables being queried interacted with arguments or
globals.

Also, adding some basic functionality tests: some basic IPA tests, checking
that we give conservative responses with arguments/globals thrown in the mix,
and ensuring that we trace values through stores and loads.

Note that saying that 'x' interacted with arguments or globals means that the
Attributes of the StratifiedSet that 'x' belongs to has any bits set.

Patch by George Burgess IV, thanks!

llvm-svn: 219122
2014-10-06 14:42:56 +00:00
Yaron Keren c8514a3421 Make the MD5 result name consistent between functions, header and source.
llvm-svn: 219121
2014-10-06 13:48:07 +00:00
Rafael Espindola 11527a1d71 Note that a gold bug has been fixed.
We should be able to stop working around it at some point in the future.

llvm-svn: 219115
2014-10-06 12:33:27 +00:00
Benjamin Kramer 4ba642a2f7 X86: Drop the isConvertibleTo3Addr bit from shufps/shufpd now that we don't convert them anymore.
llvm-svn: 219112
2014-10-06 09:56:40 +00:00
Eric Christopher 2dab835979 For biendian targets like ARM and AArch64, it is useful to have the
output of the llvm-dwarfdump and llvm-objdump report the endianness
used when the object files were generated.

Patch by Charlie Turner.

llvm-svn: 219110
2014-10-06 07:06:03 +00:00
Eric Christopher 9864519c8b Add support for ARM and AArch64 big endian objects to
RelocVisitor.

Patch by Charlie Turner.

llvm-svn: 219109
2014-10-06 07:02:58 +00:00
Eric Christopher 47e079d45e Refactor RelocVisitor to take an object. This removes some
string comparisons and makes it a bit easier to check individual
targets.

Patch by Charlie Turner.

llvm-svn: 219108
2014-10-06 06:55:55 +00:00
Eric Christopher 1452c8e640 Add some tests for RelocVisitor.
Patch by Charlie Turner.

llvm-svn: 219107
2014-10-06 06:52:52 +00:00
Eric Christopher 3faf2f1e02 Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

llvm-svn: 219106
2014-10-06 06:45:36 +00:00
Yaron Keren 28a3fc6c3e Resolve ambiguity between llvm::make_unique and std::make_unique.
Intorduced in r219098.
 

llvm-svn: 219105
2014-10-06 06:39:57 +00:00
David Blaikie febfafd13a DebugInfo: Sink constructImportedEntityDIE down into DwarfUnit from DwarfDebug.
It was just calling a bunch of DwarfUnit functions anyway, as can be
seen by the simplification of removing "TheCU" from all the function
calls in the implementation.

llvm-svn: 219103
2014-10-06 05:37:24 +00:00
Frederic Riss d1cfc3c791 [dwarfdump] Print the name for referenced specification of abstract_origin DIEs.
Reviewers: dblaikie, samsonov, echristo, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5466

llvm-svn: 219099
2014-10-06 03:36:31 +00:00
Frederic Riss 6005dbd62e Factor the Unit section parsing into the DWARFUnitSection class.
Summary: No functional change.

Reviewers: dblaikie, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5522

llvm-svn: 219098
2014-10-06 03:36:18 +00:00
Chandler Carruth 6d2472daca [PM] Remove an unused and rather expensive mapping from an analysis
group's interface to all of the implementations of that analysis group.
The groups themselves can and do manage this anyways, the pass registry
needn't involve itself.

llvm-svn: 219097
2014-10-06 00:30:59 +00:00
Chandler Carruth 9cf0b8f0eb [PM] Remove the (deeply misguided) 'unregister' functionality from the
pass registry.

This style of registry is somewhat questionable, but it being
non-monotonic is crazy. No one is (or should be) unloading DSOs with
passes and unregistering them here. I've checked with a few folks and
I don't know of anyone using this functionality or any important use
case where it is necessary.

llvm-svn: 219096
2014-10-06 00:13:25 +00:00
Chandler Carruth 484bc69aec [cleanup] Switch to using range-based for loops in two very obvious
places.

llvm-svn: 219095
2014-10-06 00:06:48 +00:00
Chandler Carruth c34cfb9c0a [cleanup] Fix up trailing whitespace and formatting in the pass regitsry
code prior to hacking on it more significantly.

llvm-svn: 219094
2014-10-05 23:59:03 +00:00
Owen Anderson 8373d338f6 Give the Reassociate pass a bit more flexibility and autonomy when optimizing expressions.
Particularly, it addresses cases where Reassociate breaks Subtracts but then fails to optimize combinations like I1 + -I2 where I1 and I2 have the same rank and are identical.

Patch by Dmitri Shtilman.

llvm-svn: 219092
2014-10-05 23:41:26 +00:00
Chandler Carruth 0927da4583 [x86] Remove the 2-addr-to-3-addr "optimization" from shufps to pshufd.
This trades a (register-renamer-friendly) movaps for a floating point
/ integer domain cross. That is a very bad trade, even on architectures
where domain crossing is relatively fast. On any chip where there is
even a cycle stall, this is a Very Bad Idea. It doesn't even seem likely
to cause a spill to be introduced because the reason for the copy is to
destructively shuffle in place.

Thanks to Ben Kramer for fixing a bug in this code that my new shuffle
lowering exposed and highlighting that perhaps it should just go away.
=]

llvm-svn: 219090
2014-10-05 22:57:31 +00:00
Chandler Carruth daa1ff985c [x86, dag] Teach the DAG combiner to prune inputs toa vector_shuffle
that are unused.

This allows the combiner to delete math feeding shuffles where the math
isn't actually necessary. This improves some of the vperm2x128 tests
that regressed when the vector shuffle lowering started actually
generating vperm instructions rather than forcibly decomposing them.

Sadly, this isn't enough to get this *really* right because we still
form a completely unnecessary permutation. To fix that, we also need to
fold shuffles which just rearrange concatenated or inserted subvectors.

llvm-svn: 219086
2014-10-05 19:14:34 +00:00
David Blaikie 60b8662ea7 Remove unused map
This became unnecessary/unused in r208636

llvm-svn: 219085
2014-10-05 16:31:13 +00:00
Benjamin Kramer 77b0e13aba X86: Don't drop half of the mask when converting 2-address shufps into 3-address pshufd.
It's debatable whether this transform is useful at all, but for now make sure
we don't generate invalid asm.

llvm-svn: 219084
2014-10-05 16:14:29 +00:00
Elena Demikhovsky 44bf0637d5 AVX-512-SKX: Added instruction VPMOVM2B/W/D/Q.
This instruction allows to broadacst mask vector to data vector.

llvm-svn: 219083
2014-10-05 14:11:08 +00:00
Benjamin Kramer 12a2d10769 Simplify code. No functionality change.
llvm-svn: 219082
2014-10-05 12:21:57 +00:00
Chandler Carruth acecdc0211 [x86] Fix PR21139, one of the last remaining regressions found in the
new vector shuffle lowering.

This is loosely based on a patch by Marius Wachtler to the PR (thanks!).
I refactored it a bi to use std::count_if and a mutable array ref but
the core idea was exactly right. I also added some direct testing of
this case.

I believe PR21137 is now the only remaining regression.

llvm-svn: 219081
2014-10-05 12:07:34 +00:00
Chandler Carruth 9f4d9fa54e [x86] Teach the new vector shuffle lowering how to lower 128-bit
shuffles using AVX and AVX2 instructions. This fixes PR21138, one of the
few remaining regressions impacting benchmarks from the new vector
shuffle lowering.

You may note that it "regresses" many of the vperm2x128 test cases --
these were actually "improved" by the naive lowering that the new
shuffle lowering previously did. This regression gave me fits. I had
this patch ready-to-go about an hour after flipping the switch but
wasn't sure how to have the best of both worlds here and thought the
correct solution might be a completely different approach to lowering
these vector shuffles.

I'm now convinced this is the correct lowering and the missed
optimizations shown in vperm2x128 are actually due to missing
target-independent DAG combines. I've even written most of the needed
DAG combine and will submit it shortly, but this part is ready and
should help some real-world benchmarks out.

llvm-svn: 219079
2014-10-05 11:41:36 +00:00
NAKAMURA Takumi 2a295fd337 HexagonMCCodeEmitter.cpp: Prune 2nd redundant \brief. [-Wdocumentation]
llvm-svn: 219073
2014-10-05 04:54:54 +00:00
NAKAMURA Takumi 0b6b5654bb [CMake] HexagonTests: Update LINK_COMPONENTS.
llvm-svn: 219072
2014-10-05 04:54:41 +00:00
NAKAMURA Takumi 431c9d3f1f HexagonDesc: Update LLVMBuild.txt.
llvm-svn: 219071
2014-10-05 04:54:29 +00:00
Hal Finkel 4564688806 [InstCombine] Simplify the logic from r219067 using ValueTracking
Joerg suggested on IRC that I look at generalizing the logic from r219067 to
handle more general redundancies (like removing an assume(x > 3) dominated by
an assume(x > 5)). The way to do this would be to ask ValueTracking to
determine the value of the i1 argument. It turns out that ValueTracking is not
very good at this right now (although it does get the trivial redundancy case)
because it does not understand ICmps. Nevertheless, the resulting code in
InstCombine is simpler than r219067, so we might as well do it now.

llvm-svn: 219070
2014-10-05 00:53:02 +00:00
Benjamin Kramer 4b92c6b8e5 [SystemZ] Make operator bool explicit. NFC.
llvm-svn: 219069
2014-10-04 22:44:35 +00:00
Benjamin Kramer 2e52f02864 Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
llvm-svn: 219068
2014-10-04 22:44:29 +00:00
Hal Finkel 04a156139e [InstCombine] Remove redundant @llvm.assume intrinsics
For any @llvm.assume intrinsic, if there is another which dominates it and uses
the same condition, then it is redundant and can be removed. While this does
not alter the semantics of the @llvm.assume intrinsics, it makes subsequent
handling more efficient (and the resulting IR easier to read).

llvm-svn: 219067
2014-10-04 21:27:06 +00:00
Yaron Keren 66304c2262 Solve Visual C++ warning C4805 on getAsInteger<bool>.
Fix http://llvm.org/PR21158 by adding a cast to unsigned long long,
so the comparison would be between two unsigned long longs instead 
of bool and unsigned long long.

      if (getAsUnsignedInteger(*this, Radix, ULLVal) ||
          static_cast<unsigned long long>(static_cast<T>(ULLVal)) != ULLVal)

llvm-svn: 219065
2014-10-04 19:58:30 +00:00
Benjamin Kramer c6cc58e703 Remove unnecessary copying or replace it with moves in a bunch of places.
NFC.

llvm-svn: 219061
2014-10-04 16:55:56 +00:00
David Blaikie cda2aa823e Sink DwarfDebug::updateSubprogramScopeDIE into DwarfCompileUnit
This requires exposing some of the current function state from
DwarfDebug. I hope there's not too much of that to expose as I go
through all the functions, but it still seems nicer to expose singular
data down to multiple consumers, than have consumers expose raw mapping
data structures up to DwarfDebug for building subprograms.

Part of a series of refactoring to allow subprograms in both the
skeleton and dwo CUs under Fission.

llvm-svn: 219060
2014-10-04 16:24:00 +00:00
David Blaikie 8945219dc9 Reformatting accidentally left out of r219057
llvm-svn: 219059
2014-10-04 16:00:26 +00:00
David Blaikie 14499a7d68 Sink DwarfDebug::attachLowHighPC into DwarfCompileUnit
One of many things to sink down into DwarfCompileUnit to allow handling
of subprograms in both the skeleton and dwo CU under Fission.

llvm-svn: 219058
2014-10-04 15:58:47 +00:00
David Blaikie 37c5231051 Move DwarfCompileUnit from DwarfUnit.h to its own header (DwarfCompileUnit.h)
In preparation for sinking all the subprogram emission code down from
DwarfDebug into DwarfCompileUnit, this will avoid bloating
DwarfUnit.h/cpp greatly and make concerns a bit more clear/isolated.

(sinking this handling down is part of the work to handle emitting
minimal subprograms for -gmlt-like data into the skeleton CU under
fission)

llvm-svn: 219057
2014-10-04 15:49:50 +00:00
Duncan P. N. Exon Smith 985e1b933d DI: Fixup global syntax in example
llvm-svn: 219056
2014-10-04 15:44:01 +00:00
Duncan P. N. Exon Smith 51d7e88583 DI: Line up comments in examples
llvm-svn: 219055
2014-10-04 15:35:25 +00:00
Duncan P. N. Exon Smith 7db88d4d34 DI: Fixup example IR from r219051
llvm-svn: 219054
2014-10-04 15:31:08 +00:00
Duncan P. N. Exon Smith 8e9f2813cf DI: Prune another example
llvm-svn: 219053
2014-10-04 15:30:52 +00:00
Duncan P. N. Exon Smith 936675e281 DI: Update and prune metadata examples
Update a couple of the examples of debug info metadata, and prune the
rest.  Point to the true reference implementation in the source.

llvm-svn: 219051
2014-10-04 14:56:56 +00:00
Chandler Carruth 808ec85ad0 [x86] Slap a triple on this test since it is poking around at the stack
and calling conventions. Otherwise its too hard to craft a usefully
generic set of assertions.

llvm-svn: 219047
2014-10-04 04:22:55 +00:00
Chandler Carruth 99627bfbff [x86] Enable the new vector shuffle lowering by default.
Update the entire regression test suite for the new shuffles. Remove
most of the old testing which was devoted to the old shuffle lowering
path and is no longer relevant really. Also remove a few other random
tests that only really exercised shuffles and only incidently or without
any interesting aspects to them.

Benchmarking that I have done shows a few small regressions with this on
LNT, zero measurable regressions on real, large applications, and for
several benchmarks where the loop vectorizer fires in the hot path it
shows 5% to 40% improvements for SSE2 and SSE3 code running on Sandy
Bridge machines. Running on AMD machines shows even more dramatic
improvements.

When using newer ISA vector extensions the gains are much more modest,
but the code is still better on the whole. There are a few regressions
being tracked (PR21137, PR21138, PR21139) but by and large this is
expected to be a win for x86 generated code performance.

It is also more correct than the code it replaces. I have fuzz tested
this extensively with ISA extensions up through AVX2 and found no
crashes or miscompiles (yet...). The old lowering had a few miscompiles
and crashers after a somewhat smaller amount of fuzz testing.

There is one significant area where the new code path lags behind and
that is in AVX-512 support. However, there was *extremely little*
support for that already and so this isn't a significant step backwards
and the new framework will probably make it easier to implement lowering
that uses the full power of AVX-512's table-based shuffle+blend (IMO).

Many thanks to Quentin, Andrea, Robert, and others for benchmarking
assistance. Thanks to Adam and others for help with AVX-512. Thanks to
Hal, Eric, and *many* others for answering my incessant questions about
how the backend actually works. =]

I will leave the old code path in the tree until the 3 PRs above are at
least resolved to folks' satisfaction. Then I will rip it (and 1000s of
lines of code) out. =] I don't expect this flag to stay around for very
long. It may not survive next week.

llvm-svn: 219046
2014-10-04 03:52:55 +00:00
Jingyue Wu 4938e271c6 Add fake use to suppress defined-but-unused warnings
llvm-svn: 219045
2014-10-04 03:50:10 +00:00
Chandler Carruth 200e87c0c5 [x86] Fix a bug in the VZEXT DAG combine that I just made more powerful.
It turns out this combine was always somewhat flawed -- there are cases
where nested VZEXT nodes *can't* be combined: if their types have
a mismatch that can be observed in the result. While none of these show
up in currently, once I switch to the new vector shuffle lowering a few
test cases actually form such nested VZEXT nodes. I've not come up with
any IR pattern that I can sensible write to exercise this, but it will
be covered by tests once I flip the switch.

llvm-svn: 219044
2014-10-04 02:51:03 +00:00
Chandler Carruth 7e26a67ffa [x86] Sink a generic combine of VZEXT nodes from the lowering to VZEXT
nodes to the DAG combining of them.

This will allow the combine to fire on both old vector shuffle lowering
and the new vector shuffle lowering and generally seems like a cleaner
design. I've trimmed down the code a bit and tried to make it and the
surrounding combine fairly clean while moving it around.

llvm-svn: 219042
2014-10-04 01:05:48 +00:00
Matt Arsenault c996175b57 R600/SI: Custom lower f64 -> i64 conversions
llvm-svn: 219038
2014-10-03 23:54:56 +00:00
Matt Arsenault f7c95e3eda R600: Custom lower [s|u]int_to_fp for i64 -> f64
llvm-svn: 219037
2014-10-03 23:54:41 +00:00
Matt Arsenault 6cda887776 R600/SI: Fix ftrunc f64 conformance failures.
Re-add the tests since they were deleted at some point

llvm-svn: 219036
2014-10-03 23:54:27 +00:00
Peter Collingbourne 11d08b8be1 Remove unused ALL_BINDINGS configuration variable.
llvm-svn: 219035
2014-10-03 23:03:01 +00:00
Chandler Carruth f3e880697a [x86] Add a really preposterous number of patterns for matching all of
the various ways in which blends can be used to do vector element
insertion for lowering with the scalar math instruction forms that
effectively re-blend with the high elements after performing the
operation.

This then allows me to bail on the element insertion lowering path when
we have SSE4.1 and are going to be doing a normal blend, which in turn
restores the last of the blends lost from the new vector shuffle
lowering when I got it to prioritize insertion in other cases (for
example when we don't *have* a blend instruction).

Without the patterns, using blends here would have regressed
sse-scalar-fp-arith.ll *completely* with the new vector shuffle
lowering. For completeness, I've added RUN-lines with the new lowering
here. This is somewhat superfluous as I'm about to flip the default, but
hey, it shows that this actually significantly changed behavior.

The patterns I've added are just ridiculously repetative. Suggestions on
making them better very much welcome. In particular, handling the
commuted form of the v2f64 patterns is somewhat obnoxious.

llvm-svn: 219033
2014-10-03 22:43:17 +00:00
Chris Bieneman 489d1dce3f Converting the ErrorHandlerMutex to a ManagedStatic to avoid the static constructor and destructor.
llvm-svn: 219028
2014-10-03 22:03:12 +00:00
Chandler Carruth 0adda1e4d4 [x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

llvm-svn: 219022
2014-10-03 21:38:49 +00:00
Jonathan Roelofs b24884de70 Fix typo in TableGen documentation
llvm-svn: 219018
2014-10-03 20:46:05 +00:00
Sid Manning 50600f39ab Add unit tests to verify Hexagon emission.
Add the test cases I overlooked, part of the original commit,
http://reviews.llvm.org/D5523

llvm-svn: 219016
2014-10-03 20:33:03 +00:00
Adrian Prantl adc41ca4ef Add a reference to Phabricator.rst to docs/index.rst.
llvm-svn: 219015
2014-10-03 20:17:32 +00:00
Richard Smith 1ed4229f6f PR21145: Teach LLVM about C++14 sized deallocation functions.
C++14 adds new builtin signatures for 'operator delete'. This change allows
new/delete pairs to be removed in C++14 onwards, as they were in C++11 and
before.

llvm-svn: 219014
2014-10-03 20:17:06 +00:00
Duncan P. N. Exon Smith 176b691d32 Revert "Revert "DI: Fold constant arguments into a single MDString""
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash.  The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).

Original commit message follows.

--

This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 219010
2014-10-03 20:01:09 +00:00
Adam Nemet ff63a2dc51 [ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends.  During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.

However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node.  In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE.  Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.

Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node).  We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed.  On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.

I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user.  Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).

So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE.  This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly.  (Previous patches used HandleSDNode to detect the
CSE but that's not practical here).  The listener is only installed on X86.

I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc.  The only thing we pay for is the
creation of the listener.  The callback never ever triggers in spec2k since
this is a corner case.

Fixes rdar://problem/18206171

llvm-svn: 219009
2014-10-03 20:00:34 +00:00
Tom Stellard fae1dc8a12 R600: Align functions to 256 bytes
llvm-svn: 219002
2014-10-03 19:02:02 +00:00
Benjamin Kramer e12a6bac32 Eliminate some deep std::vector copies. NFC.
llvm-svn: 218999
2014-10-03 18:33:16 +00:00
Benjamin Kramer cb3e06ba00 MCParser: Modernize memory handling.
NFC.

llvm-svn: 218998
2014-10-03 18:32:55 +00:00
Robin Morisset 028eabcdaa [Power] Delete redundant test Atomics-32.ll
The test Atomics-32.ll was both redundant (all operations are also checked by
atomics.ll at least) and not actually checking correctness (it was not using
FileCheck, just verifying that the compiler does not crash).

llvm-svn: 218997
2014-10-03 18:10:07 +00:00
Rui Ueyama 1af0865871 llvm-readobj: print out the fields of the COFF delay-import table
llvm-svn: 218996
2014-10-03 18:07:18 +00:00
Robin Morisset 9098fee690 [Power] Use lwsync for non-seq_cst fences
Summary:
hwsync is only required for seq_cst fences, acquire and release one can use
the cheaper lwsync.

Test Plan: Added some cases to atomics.ll + make check-all

Reviewers: jfb, wschmidt

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5317

llvm-svn: 218995
2014-10-03 18:04:36 +00:00
Hans Wennborg 6a654333c5 MipsAsmParser.cpp: fix VS2012 build
llvm-svn: 218991
2014-10-03 17:16:24 +00:00
Hans Wennborg da47cf46de HexagonMCCodeEmitter.h: deleted member functions are not supported in VS2012
llvm-svn: 218990
2014-10-03 17:02:28 +00:00
Daniel Sanders ef638fea2d [mips] Print warning when using register names not available in N32/64
Summary:
The register names t4-t7 are not available in the N32 and N64 ABIs.
This patch prints a warning, when those names are used in N32/64,
along with a fix-it with the correct register names.

Patch by Vasileios Kalintiris

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5272

llvm-svn: 218989
2014-10-03 15:37:37 +00:00
Sid Manning 40d809399f Fix build break on Hexagon
Differential Revision: http://reviews.llvm.org/D5600

llvm-svn: 218987
2014-10-03 13:59:01 +00:00
Sid Manning 7da3f9acba Adding skeleton for unit testing Hexagon Code Emission
Adding and modifying CMakeLists.txt files to run unit tests under
unittests/Target/* if the directory exists.  Adding basic unit test to check
that code emitter object can be retrieved.

Differential Revision: http://reviews.llvm.org/D5523
Change by: Colin LeMahieu

llvm-svn: 218986
2014-10-03 13:18:11 +00:00
Chandler Carruth 1964078936 [x86] Teach the new vector shuffle lowering to aggressively form MOVSS
and MOVSD nodes for single element vector inserts.

This is particularly important because a number of patterns in the
backend detect these patterns and leverage them to simplify things. It
also fixes quite a few of the insertion bad code examples. However, it
regresses a specific area: when available, blendps and blendpd are
*dramatically* faster than movss and movsd respectively. But it doesn't
really work to form the blend logic first because the blends *aren't* as
crazy efficient when the data is coming from memory anyways, and thus
will have a movss or movsd regardless. Also, doing that would block
a bunch of the patterns that this is designed to hit.

So my plan is to go into the patterns for lowering MOVSS and MOVSD and
lower them via blends when available. However that's a pretty invasive
restructuring so it will need to be a follow-up patch.

I have already gone into the patterns to lower MOVSS and MOVSD from
memory using MOVLPD, etc. Without that, several of the test cases
I already have regress.

llvm-svn: 218985
2014-10-03 13:11:13 +00:00
Dan Liew 460e0f4dfe [sphinx cleanup] Fix unexpected indentation warning introduced by r218937
llvm-svn: 218982
2014-10-03 12:28:48 +00:00
Renato Golin 4e31ae1051 Revert 202433 - Provide a target override for the latest regalloc heuristic
That commit was introduced in order to help investigate a problem in ARM
codegen breaking from commit 202304 (Add a limit to the heuristic that register
allocates instructions in local order). Recent analisys indicated that the
problem no longer exists, so I'm reverting this change.

See PR18996.

llvm-svn: 218981
2014-10-03 12:20:53 +00:00
Chandler Carruth 4bf341de3c [x86] Refactor the element insertion logic in the new vector shuffle
lowering to handle the potential mirroring of 2-element vectors (because
we can't reliably sort them one way) in the caller rather than in the
insertion logic.

This will simplify things considerably as more ways to fail to match the
insertion are added because now we have a nice try and retry point.

llvm-svn: 218980
2014-10-03 12:01:55 +00:00
Alexander Musman 39ddb3e547 Fix typo in comment
llvm-svn: 218979
2014-10-03 11:55:31 +00:00
Chandler Carruth 7a4459fec1 [x86] Fix the RUN-lines of this test to make sense.
I got them quite wrong when updating it and had the SSE4.1 run checked
for SSE2 and the SSE2 run checked for SSE4.1. I think everything was
actually generic SSE, but this still seems good to fix. While here,
hoist the triple into the IR and make the flag set a bit more direct in
what it is trying to test.

llvm-svn: 218978
2014-10-03 11:30:02 +00:00
Chandler Carruth 971a560cb8 [x86] Significantly improve the ability of the new vector shuffle
lowering to match VZEXT_MOVL patterns.

I hadn't realized that these had sufficient pattern smarts in the
backend to lower zext-ing from the low element of a vector without it
being a scalar_to_vector node. They do, and this is how to match a bunch
of patterns for movq, movss, etc.

There is a weird propensity to end up using pshufd to place the element
afterward even though it means domain crossing (or rather, to use
xorps+movss to zext the element rather than movq) but that's an
orthogonal problem with VZEXT_MOVL that someone should probably look at.

llvm-svn: 218977
2014-10-03 11:25:58 +00:00
Chandler Carruth 080cab91e1 [x86] Add some important, missing test coverage for blending from one
vector to a zero vector for the v2 cases and fix the v4 integer cases to
actually blend from a vector.

There are already seprate tests for the case of inserting from a scalar.

These cases cover a lot of the regressions I've seen in the regression
test suite for the new vector shuffle lowering and specifically cover
the reported lack of using various zext-ing instruction patterns. My
next patch should fix a big chunk of this, but wanted to get a nice
baseline for these patterns in the test cases first.

llvm-svn: 218976
2014-10-03 11:16:45 +00:00
Chandler Carruth e91b316266 [x86] Unbreak SSE1 with the new vector shuffle lowering. We can't widen
element types to form illegal vector types.

I've added a special SSE1 test case here that makes sure we don't break
this going forward.

llvm-svn: 218974
2014-10-03 10:11:39 +00:00
Chandler Carruth ea026fbb7b [x86] Add two more triples to stabilize the precise assembly syntax
across platforms.

llvm-svn: 218973
2014-10-03 09:43:23 +00:00
Chandler Carruth b193d96bce [x86] Remove a couple of fairly pointless tests. These were merely
testing that we generated divps and divss but not in a very systematic
way. There are other tests for widening binary operations already that
make these unnecessary.

The second one seems mostly about testing Atom as well as normal X86,
but despite the comment claiming it is testing a different instruction
sequence, it then tests for exactly the same div instruction sequence!
(The sequence of instructions is actually quite different on Atom, but
not the sequence of div instructions....)

And then it has an "execution" test that simply isn't run? Very strange.
Anyways, none of this is really needed so clean this up.

llvm-svn: 218972
2014-10-03 09:43:19 +00:00
James Molloy cb7449d058 Revert r215343.
This was contentious and needs invesigation.

llvm-svn: 218971
2014-10-03 09:29:24 +00:00
Daniel Sanders 36a3e69bd4 [mips] Remove XFAIL from two XPASS'ing tests on the llvm-mips-linux builder
llvm-svn: 218967
2014-10-03 08:49:44 +00:00
Chandler Carruth 5bacb0aef0 [x86] Add another triple to a test to make the comment syntax stable.
Should fix darwin builders.

llvm-svn: 218956
2014-10-03 02:06:28 +00:00
Chandler Carruth bfde7c2da9 [x86] Add triples to these tests so that we see fewer calling convention
differences and they're a bit easier to maintain. This should fix the
tests on cygwin bots, etc.

llvm-svn: 218955
2014-10-03 02:00:09 +00:00
Chandler Carruth 3a0883f32e [x86] Regenerate precise FileCheck lines for the lats batch of test
cases.

llvm-svn: 218954
2014-10-03 01:57:38 +00:00
Chandler Carruth 9a24a33b14 [x86] Remove another low-value test still written using grep. We have
many tests for movss and friends.

llvm-svn: 218953
2014-10-03 01:57:35 +00:00
Chandler Carruth b56b18d8b3 [x86] Regenerate precise checks for a couple of test cases and remove
a test case that was just grepping the debug stats output rather than
actually checking the generated code for anything useful.

llvm-svn: 218951
2014-10-03 01:50:08 +00:00
Chandler Carruth be3a669440 [x86] Remove an over-reduced test case. This would need to be
intergrated much more fully into some logical part of the backend to
really understand what it is trying to accomplish and how to update it.
I suspect it no longer holds enough value to be worth having.

llvm-svn: 218950
2014-10-03 01:50:06 +00:00
Chandler Carruth fb924fdf38 [x86] Regenerate and clean up more tests is preparation for vector
shufle switch.

I nuked a win64 config from one test as it doesn't really make sense to
cover that ABI specially for generic v2f32 tests...

llvm-svn: 218948
2014-10-03 01:44:04 +00:00
Chandler Carruth c75abc162c [x86] Cleanup and generate precise FileCheck assertions for a bunch of
SSE tests.

llvm-svn: 218947
2014-10-03 01:37:58 +00:00
Chandler Carruth c1e2fc726c [x86] This is a terrible SSE1 test, but we should keep it. I've deleted
two functions that really didn't have any interesting assertions, and
generated more precise tests for one of the others.

llvm-svn: 218946
2014-10-03 01:37:56 +00:00
Chandler Carruth 23578e9055 [x86] Merge two very similar tests and regenerate FileCheck lines for
them.

llvm-svn: 218945
2014-10-03 01:37:53 +00:00
Lang Hames 89e9c17235 [BasicAA] Revert r218714 - Make better use of zext and sign information.
This patch broke 447.dealII on Darwin. I'm currently working on a reduced
test-case, but reverting for now to keep the bots happy.

<rdar://problem/18530107>

llvm-svn: 218944
2014-10-03 01:33:47 +00:00
Chandler Carruth b264e4a325 [x86] Regenerate a number of FileCheck assertions with my script for
test cases that will change with the new vector shuffle lowering. This
gives us a nice baseline for deltas against. I've checked and removed
the cases where there were weird register usage being pinned down, and
all of these are extremely pin-pointed tests so fully checking them
seems very appropriate.

llvm-svn: 218941
2014-10-03 01:06:32 +00:00
Chandler Carruth 9df29f3a6f [x86] Remove a couple of other overly isolated tests that are low-value
at this point. We have lots of tests of peephole optimizations with
insert and extract on vectors.

llvm-svn: 218940
2014-10-03 01:06:30 +00:00
Chandler Carruth d20c80ae79 [x86] Remove a test that provides little value. There are plenty of
tests for zext of a vector.

llvm-svn: 218939
2014-10-03 01:06:27 +00:00
Robin Morisset e83f59e658 Update Atomics.rst
Summary:
I changed various bits of the compilation of atomics recently, and forgot
updating the documentation. This patch just brings it up to date.

Test Plan: no change to the code

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5590

llvm-svn: 218937
2014-10-03 01:04:20 +00:00
Chandler Carruth 19c9f8cd50 [x86] Regenerate a bunch more avx512 test cases using my script to have
tighter, more strict FileCheck assertions. Some of these I really like
as they show case exactly what instruction sequences come out of these
microscopic functionality tests.

llvm-svn: 218936
2014-10-03 00:50:03 +00:00
Chandler Carruth 666ee3f7ab [x86] Regenerate an avx512 test with my script to provide a nice
baseline for updates from the new vector shuffle lowering.

I've inspected the results here, and I couldn't find any register
allocation decisions where there should be any realistic way to register
allocate things differently. The closest was the imul test case. If you
see something here you'd like register number variables on, just shout
and I'll add them.

llvm-svn: 218935
2014-10-03 00:44:46 +00:00
Eric Christopher f12e1ab313 constify TargetMachine parameter.
llvm-svn: 218934
2014-10-03 00:42:41 +00:00
Rui Ueyama 15d993591c llvm-readobj: print COFF delay-load import table
This patch adds another iterator to access the delay-load import table
and use it from llvm-readobj.

http://reviews.llvm.org/D5594

llvm-svn: 218933
2014-10-03 00:41:58 +00:00
Chandler Carruth 3eda855c69 [x86] Remove some of the --show-mc-encoding flags from avx512 tests that
need to be updated for the new vector shuffle lowering.

After talking to Adam Nemet, Tim Northover, etc., it seems that testing
MC encodings in the same suite as the basic codegen isn't the right
approach. Instead, we're going to want dedicated MC tests for the
encodings. These encodings are starting to get in my way so I wanted to
cut them out early. The total set of instructions that should have
encoding tests added is:

  vpaddd
  vsqrtss
  vsqrtsd
  vmovlhps
  vmovhlps
  valignq
  vbroadcastss

Not too many parts of these tests were even using this. =]

llvm-svn: 218932
2014-10-03 00:36:29 +00:00
Eric Christopher 5312afe7e1 constify TargetMachine argument.
llvm-svn: 218930
2014-10-03 00:17:59 +00:00
Eric Christopher a94e592e49 We can grab the options struct from the TargetMachine, no need to
pass it down in the constructor.

llvm-svn: 218929
2014-10-03 00:10:03 +00:00
Adam Nemet 4dca3ce4b0 [AVX512] Pull pattern for subvector insert into the instruction definition
No functional change intended.

Very similar to the change I made for subvector extract in r218480.

test/CodeGen/X86/avx512-insert-extract.ll covers this.

llvm-svn: 218928
2014-10-02 23:18:30 +00:00
Adam Nemet 4e2ef472d2 [AVX512] Refactor subvector inserts
No functional change.

Very similar to the extract refactoring I did in r218478.

Compared X86.td.expanded before and after.

llvm-svn: 218927
2014-10-02 23:18:28 +00:00
Adam Nemet dc87aea176 [AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rm
Just like in the case of extracts, the refactoring is uncovering some typos in
the code.

llvm-svn: 218926
2014-10-02 23:18:26 +00:00
Rui Ueyama a3f58694b5 llvm-readobj: add a test for COFF import-by-ordinal symbols
llvm-svn: 218924
2014-10-02 22:40:55 +00:00
Hal Finkel fe3368cb57 [PowerPC] Modern Book-E cores support sync
Older Book-E cores, such as the PPC 440, support only msync (which has the same
encoding as sync 0), but not any of the other sync forms. Newer Book-E cores,
however, do support sync, and for performance reasons we should allow the use
of the more-general form.

This refactors msync use into its own feature group so that it applies by
default only to older Book-E cores (of the relevant cores, we only have
definitions for the PPC440/450 currently).

llvm-svn: 218923
2014-10-02 22:34:22 +00:00
Robin Morisset e1ca44bd4c [Power] Improve the expansion of atomic loads/stores
Summary:
Atomic loads and store of up to the native size (32 bits, or 64 for PPC64)
can be lowered to a simple load or store instruction (as the synchronization
is already handled by AtomicExpand, and the atomicity is guaranteed thanks to
the alignment requirements of atomic accesses). This is exactly what this patch
does. Previously, these were implemented by complex
load-linked/store-conditional loops.. an obvious performance problem.

For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
  store atomic i8 42, i8* %mem unordered, align 1
  ret void
}
```
from
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    rlwinm r2, r3, 3, 27, 28
    li r4, 42
    xori r5, r2, 24
    rlwinm r2, r3, 0, 0, 29
    li r3, 255
    slw r4, r4, r5
    slw r3, r3, r5
    and r4, r4, r3
LBB4_1:                                 ; =>This Inner Loop Header: Depth=1
    lwarx r5, 0, r2
    andc r5, r5, r3
    or r5, r4, r5
    stwcx. r5, 0, r2
    bne cr0, LBB4_1
; BB#2:
    blr
```
into
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    li r2, 42
    stb r2, 0(r3)
    blr

```
which looks like a pretty clear win to me.

Test Plan:
fixed the tests + new test for indexed accesses + make check-all

Reviewers: jfb, wschmidt, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5587

llvm-svn: 218922
2014-10-02 22:27:07 +00:00
Chandler Carruth 7425c8c279 Fix the threshold added in r186434 (a re-apply of r185393) and updaated
to be a ManagedStatic in r218163 to not be a global variable written and
read to from within the innards of SpillPlacement.

This will fix a really scary race condition for anyone that has two
copies of LLVM running spill placement concurrently. Yikes!

This will also fix a really significant compile time hit that r218163
caused because the spill placement threshold read is actually in the
*very* hot path of this code. The memory fence on each read was showing
up as huge compile time regressions when spilling is responsible for
most of the compile time. For example, optimizing sanitized code showed
over 50% compile time regressions here. =/

llvm-svn: 218921
2014-10-02 22:23:14 +00:00
Juergen Ributzka 99bd3cba8b [Stackmaps] Make ithe frame-pointer required for stackmaps.
Do not eliminate the frame pointer if there is a stackmap or patchpoint in the
function. All stackmap references should be FP relative.

This fixes PR21107.

llvm-svn: 218920
2014-10-02 22:21:49 +00:00
Duncan P. N. Exon Smith 786cd049fc Revert "DI: Fold constant arguments into a single MDString"
This reverts commit r218914 while I investigate some bots.

llvm-svn: 218918
2014-10-02 22:15:31 +00:00
Rui Ueyama 350fe8e871 Rename data -> Data
llvm-svn: 218916
2014-10-02 22:13:44 +00:00
Rui Ueyama 861021f986 llvm-readobj: print COFF imported symbols
This patch defines a new iterator for the imported symbols.
Make a change to COFFDumper to use that iterator to print
out imported symbols and its ordinals.

llvm-svn: 218915
2014-10-02 22:05:29 +00:00
Duncan P. N. Exon Smith 571f97bd90 DI: Fold constant arguments into a single MDString
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 218914
2014-10-02 21:56:57 +00:00
Chandler Carruth 75e182b414 [x86] Teach the new vector shuffle lowering to widen floating point
elements as well as integer elements in order to form simpler shuffle
patterns.

This is the primary reason why we were failing to match some of the
2-and-2 floating point shuffles such as PR21140. Even after fixing this
we need to support some extra patterns in the backend in order to match
the resulting X86ISD::UNPCKL nodes into the correct instructions. This
commit should fix PR21140 and includes more comprehensive testing of
insertion patterns in v4 shuffles.

Not all of the added tests are beautiful. For example, we don't have
clever instructions to insert-via-load in the integer domain. There are
also some places where we aren't sufficiently cunning with our use of
movq and movd, but that's future work.

llvm-svn: 218911
2014-10-02 21:37:14 +00:00
Sanjay Patel 13a657819b Remove unused function attribute params.
llvm-svn: 218909
2014-10-02 21:12:04 +00:00
Duncan P. N. Exon Smith f02fe70805 LTO: Document the Boolean argument from r218784
llvm-svn: 218907
2014-10-02 21:11:04 +00:00
Sanjay Patel 12d1ce5408 Optimize square root squared (PR21126).
When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X.

This can happen in the real world when calculating x ** 3/2. This occurs
in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c.

Differential Revision: http://reviews.llvm.org/D5584

llvm-svn: 218906
2014-10-02 21:10:54 +00:00
Chandler Carruth 41fdd61f64 [x86] Move the vperm2f128 test to be vperm2x128 and test both the
floating point and integer domains.

Merge the AVX2 test into it and add an extra RUN line. Generate clean
FileCheck statements with my script. Remove the now merged AVX2 tests.

llvm-svn: 218903
2014-10-02 20:11:11 +00:00
Justin Bogner ad69e64761 InstrProf: Avoid linear search in a hot loop
Every time we were adding or removing an expression when generating a
coverage mapping we were doing a linear search to try and deduplicate
the list. The indices in the list are important, so we can't just
replace it by a DenseMap entirely, but an auxilliary DenseMap for fast
lookup massively improves the performance issues I was seeing here.

llvm-svn: 218892
2014-10-02 17:14:18 +00:00
Rui Ueyama 1e152d5eec This patch adds a new flag "-coff-imports" to llvm-readobj.
When the flag is given, the command prints out the COFF import table.

Currently only the import table directory will be printed.
I'm going to make another patch to print out the imported symbols.

The implementation of import directory entry iterator in
COFFObjectFile.cpp was buggy. This patch fixes that too.

http://reviews.llvm.org/D5569

llvm-svn: 218891
2014-10-02 17:02:18 +00:00
Justin Bogner f9535c418f Reapply "InstrProf: Don't keep a large sparse list around just to zero it"
When I was preparing r218879 for commit, I removed an early return
that I decided was just noise. It wasn't. This is r218879 no-crash
edition.

This reverts commit r218881, reapplying r218879.

llvm-svn: 218887
2014-10-02 16:43:31 +00:00
Adrian Prantl 38666f1d13 Remove an extra whitespace.
llvm-svn: 218886
2014-10-02 16:42:15 +00:00
Adrian Prantl 75a0dac4b3 Pretty-printer: Paper over an ambiguity between line table entries
and tagged mdnodes.

fixes http://llvm.org/bugs/show_bug.cgi?id=21131

llvm-svn: 218885
2014-10-02 16:42:13 +00:00
Justin Bogner 70b5c562ce Revert "InstrProf: Don't keep a large sparse list around just to zero it"
This seems to be crashing on some buildbots. Reverting to investigate.

This reverts commit r218879.

llvm-svn: 218881
2014-10-02 16:15:27 +00:00
Justin Bogner d6a9e4b3be InstrProf: Don't keep a large sparse list around just to zero it
The Terms vector here represented a polynomial of of all possible
counters, and is used to simplify expressions when generating coverage
mapping. There are a few problems with this:

1. Keeping the vector as a member is wasteful, since we clear it every
   time we use it.
2. Most expressions refer to a subset of the counters, so we end up
   iterating over a large number of zeros doing nothing a lot of the
   time.

This updates the user of the vector to store the terms locally, and
uses a sort and combine approach so that we only operate on counters
that are actually used in a given expression. For small cases this
makes very little difference, but in cases with a very large number of
counted regions this is a significant performance fix.

llvm-svn: 218879
2014-10-02 16:04:03 +00:00
Sanjay Patel b41d46118a Use the local variable that other clauses around here are already using.
llvm-svn: 218876
2014-10-02 15:20:45 +00:00
Sanjay Patel 0d7dee654d Remove duplicate function names from comments. NFC.
llvm-svn: 218875
2014-10-02 15:13:22 +00:00
Tilmann Scheller 383b4fff4c [NVPTX] Remove dead code.
Found by the Clang static analyzer.

llvm-svn: 218874
2014-10-02 15:12:48 +00:00
Joerg Sonnenberger f148a6d498 Support padding unaligned data in .text.
llvm-svn: 218870
2014-10-02 13:41:42 +00:00
Aaron Ballman 254dd7e439 Silence a -Wsign-compare warning. NFC.
llvm-svn: 218868
2014-10-02 13:17:11 +00:00
Zinovy Nis ccc3e3733b [BUG][INDVAR] Fix for PR21014: wrong SCEV operands commuting for non-commutative instructions
My commit rL216160 introduced a bug PR21014: IndVars widens code 'for (i = ; i < ...; i++) arr[ CONST - i]' into 'for (i = ; i < ...; i++) arr[ i - CONST]'
thus inverting index expression. This patch fixes it. 
Thanks to Jörg Sonnenberger for pointing.

Differential Revision: http://reviews.llvm.org/D5576

llvm-svn: 218867
2014-10-02 13:01:15 +00:00
Chandler Carruth 2570e48a30 [x86] Just delete the last combine test file.
This file isn't really doing anything useful. Many of the tests that
seem to be combined are also repeats from other test files. Many of the
other tests, despite the comment that they should be combined into
a single shuffle... well... aren't combined into a single shuffle.
=/

llvm-svn: 218862
2014-10-02 08:05:57 +00:00
Chandler Carruth 71f4187dbb [x86] Merge still more combine tests into the common file. These at
least seem *slightly* more interesting test wise, although given how
spotily we actually combine anything, I remain somewhat suspicious.

llvm-svn: 218861
2014-10-02 08:02:34 +00:00
Chandler Carruth 782b0a72ac [x86] Merge the third combining test into the generic one and add proper
checks for all the ISA variants.

If the SSE2 checks here terrify you, good. This is (in large part) the
kind of amazingly bad code that is holding LLVM back when vectorizing on
older ISAs.

At the same time, these tests seem increasingly dubious to me. There are
a very large number of tests and it isn't clear that they are
systematically covering a specific set of functionality. Anyways,
I don't want to reduce testing during the transition, I just want to
consolidate it to where it is easier to manage.

llvm-svn: 218860
2014-10-02 07:56:47 +00:00
Chandler Carruth b2941e2693 [x86] Merge the second set of vector combining tests into a common test
file.

Some of these really don't make sense to test -- we're testing for the
*lack* of combining two shuffles into one, presumably because the two
would generate better shuffles in the end. But if you look at the
generated code shown here, in many cases the generated code is, frankly,
terrible. Or we combine any two generated shuffles back into a single
instruction! I've left a FIXME to revisit these decisions.

llvm-svn: 218859
2014-10-02 07:42:58 +00:00
Chandler Carruth 2110501822 [x86] Merge the bitwise operation shuffle combining into the common test
file, adding assertions across the ISA variants for it.

llvm-svn: 218858
2014-10-02 07:30:24 +00:00
Chandler Carruth 3c7bf04c87 [x86] Update this test to run a full complement of the ISA extensions,
and use the new grouped FileCheck patterns to match them.

No interesting changes yet, but this test is now in proper form to have
the other shuffle combining tests merged into it.

llvm-svn: 218857
2014-10-02 07:22:26 +00:00
Chandler Carruth 11a5ae0880 [x86] Minimize the parameters to this test for clarity.
The test has to do with DAG combines, and so it doesn't need the new
vector shuffle lowering to be effective. Also, it has a nice in-IR
triple string which we should really be using rather than command line
flags (unless it varies form RUN-line to RUN-line). Finally, I much
prefer letting LLVM synthesize the correct datalayout string from the
triple rather than baking one in here that will just become stale.

llvm-svn: 218856
2014-10-02 07:17:15 +00:00
Chandler Carruth 7b2706726e [x86] Add a comment clarifying that this test should span all manners of
generic DAG combining of shuffles relevant to x86.

My plan is to fold a bunch of the other DAG combining test cases into
this one, while converting them to use the nice new FileCheck assertion
syntax.

llvm-svn: 218855
2014-10-02 07:13:25 +00:00
Chandler Carruth c1bb0e84bc [x86] Switch some of the new consolidated vector tests to use
a bare-metal triple and have nice BB labels, etc.

No significant change here, just tidying up to have a consistent set of
OS-agnostic vector functionality here.

llvm-svn: 218854
2014-10-02 06:52:19 +00:00
Lang Hames cd025c5512 [PBQP] Update doxygen comment style to match the rest of the file. NFC.
llvm-svn: 218849
2014-10-02 04:21:27 +00:00
Lang Hames 57a28ce3ad [PBQP] Add support for graph-level metadata to the PBQP graph. This will be used
in the future to attach useful information about the PBQP graph (e.g. the
associated MachineFunction, pointers to regalloc passes) to the graph itself,
making that information accessible to the solver. This should also allow the
PBQPBuilder interface to be simplified.

llvm-svn: 218848
2014-10-02 04:17:36 +00:00
Eric Christopher 94ec17fac9 Remove test directories with no tests.
llvm-svn: 218843
2014-10-02 00:42:30 +00:00
Justin Bogner 5eec02a399 InstrProf: Simplify counting a file's regions when writing coverage (NFC)
When writing a coverage mapping we iterate through the mapping regions
in order of FileID, but we were then repeatedly searching from the
beginning of the list to count the number of regions with a given
FileID.

It is simpler and more efficient to search forward from the current
iterator to find the number of regions.

llvm-svn: 218842
2014-10-02 00:31:00 +00:00
Chandler Carruth 8a16802d46 [x86] Improve and correct how the new vector shuffle lowering was
matching and lowering 64-bit insertions.

The first problem was that we weren't looking through bitcasts to
discover that we *could* lower as insertions. Once fixed, we in turn
weren't looking through bitcasts to discover that we could fold a load
into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR
node around the inserted element and instead were passing a scalar to
a DAG node that expected a vector. It turns out there are some patterns
that will "lower" this into the correct asm, but the rest of the X86
backend is very unhappy with such antics.

This should fix a few more edge case regressions I've spotted going
through the regression test suite to enable the new vector shuffle
lowering.

llvm-svn: 218839
2014-10-01 23:14:28 +00:00
Bob Wilson 650cd8a380 PR21101: tablegen's FastISel emitter should filter out unused functions.
FastISel has a fixed set of virtual functions that are overridden by the
tablegen-generated code for each target. These functions are distinguished by
the kinds of operands, e.g., register + immediate = "ri". The FastISel emitter
has been blindly emitting functions with different combinations of operand
kinds, even for combinations that are completely unused by FastISel, e.g.,
"fastEmit_rrr". Change to filter out functions that will be irrelevant for
FastISel and do not bother generating the code for them. Also add explicit
"override" keywords for the virtual functions that are overridden.

llvm-svn: 218838
2014-10-01 22:44:01 +00:00
Lang Hames 24f0c24de9 [MCJIT] Don't crash in debugging output for sections that aren't emitted.
llvm-svn: 218836
2014-10-01 21:57:47 +00:00
Eric Christopher f6ed33e7fa constify the TargetMachine argument used in the subtarget and
lowering constructors.

llvm-svn: 218832
2014-10-01 21:36:28 +00:00
Duncan P. N. Exon Smith 379e375761 DIBuilder: Remove duplicated comments, NFC
These comments already appear in the header, and some of them are
out-of-date anyway.

llvm-svn: 218829
2014-10-01 21:32:15 +00:00
Duncan P. N. Exon Smith 9affbbaac0 Revert "DIBuilder: Remove dead code"
This reverts commit r218820.  It turns out that Adrian has an
outstanding SROA patch that uses this.

I've updated it to forward to `createExpression()`.

llvm-svn: 218828
2014-10-01 21:32:12 +00:00
Sanjay Patel 9ebfbb969d Lower FNEG ( FABS (x) ) -> FNABS (x) [X86 codegen] PR20578
Negative FABS of either a scalar or vector should be handled the same way
on x86 with SSE/AVX: a single OR instruction of the FP operand with a
constant to light up the sign bit(s).

http://llvm.org/bugs/show_bug.cgi?id=20578

Differential Revision: http://reviews.llvm.org/D5201

llvm-svn: 218822
2014-10-01 21:20:06 +00:00
David Blaikie 7fad1b4374 Update test name to match changes made in r218783
Addressing post commit review feedback from Justin Bogner.

llvm-svn: 218821
2014-10-01 21:19:39 +00:00
Duncan P. N. Exon Smith 1ce4fd36bf DIBuilder: Remove dead code
I neglected to update `DIBuilder::createPieceExpression()` in r218797,
which I noticed while rebasing a patch for PR17891.  On closer
inspection, it looks like dead code.

If there are any downstream users of this, you should transition to the
more general `createExpression()`.  Or, we can add this back, but then
it should just forward to `createExpression()`.

llvm-svn: 218820
2014-10-01 21:14:20 +00:00
Chandler Carruth f795e4805e [x86] Merge the remaining test cases into vector-blend.ll and remove all
the ISA-specific test files.

llvm-svn: 218818
2014-10-01 21:07:07 +00:00
Eric Christopher eb6e3bbf47 Now that the optimization level is adjusting the feature string
before we hit the subtarget, remove the constructor parameter.

llvm-svn: 218817
2014-10-01 21:05:35 +00:00
Chandler Carruth b21e033b2a [x86] Expand the ISA coverage of our blend test in preparation for
merging ISA-specific testing into this file.

llvm-svn: 218816
2014-10-01 21:03:21 +00:00
Argyrios Kyrtzidis 0b9f5507c8 Adds 'override' to overriding methods. NFC.
llvm-svn: 218815
2014-10-01 21:00:44 +00:00
Chandler Carruth a406d6c386 [x86] Merge the interesting test cases from blend-msb.ll into
vector-blend.ll and remove the former.

llvm-svn: 218814
2014-10-01 20:56:57 +00:00
Chandler Carruth 19a5366481 [x86] Move the AVX blend test to a generic name. I'm going to fold other
blend tests into this one.

llvm-svn: 218813
2014-10-01 20:52:55 +00:00
Chandler Carruth b4b09c7a6c [x86] Remove a test that wasn't doing anything really. We have plenty of
better tests for zext of vectors at this point.

llvm-svn: 218811
2014-10-01 20:50:58 +00:00
Chandler Carruth ab5ddea2cb [x86] Add a 32-bit run to the sext test, and remove a sad vec_sext.ll
test file.

This old test had a bunch of functions that were never even checked. =/
The only thing it really did was to make sure that we did something
reasonable in 32-bit mode with SSE4.1. Adding another run line to the
main vector-sext.ll test seems a better way to do that.

llvm-svn: 218810
2014-10-01 20:49:54 +00:00
Chandler Carruth bbbdb9f0ee [x86] Teach both sext and zext vector tests to cover a nice wide range
of architectures: SSE2, SSSE3, SSE4.1, AVX, and AVX2.

Unfortunately, this exposses the absolute horror of the code we generate
for many of these patterns. Anyone wanting to familiarize themselves
with the x86 backend and improve performance could do a lot of good
sitting down and making these test cases not look so terrible. While the
new vector shuffle code I'm working on well help some, it won't fix all
of the crimes here.

llvm-svn: 218807
2014-10-01 20:41:36 +00:00
Eric Christopher 36448af7f5 Rework the PPC TargetMachine so that the non-function specific
overrides happen at TargetMachine creation and not on every
subtarget creation.

llvm-svn: 218805
2014-10-01 20:38:26 +00:00
Eric Christopher 12f4a78581 constify TargetMachine parameter for X86TargetLowering.
llvm-svn: 218804
2014-10-01 20:38:22 +00:00
Sanjay Patel 7b2cd9ad86 Make the sqrt intrinsic return undef for a negative input.
As discussed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140609/220598.html

And again here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/077168.html

The sqrt of a negative number when using the llvm intrinsic is undefined. 
We should return undef rather than 0.0 to match the definition in the LLVM IR lang ref.

This change should not affect any code that isn't using "no-nans-fp-math"; 
ie, no-nans is a requirement for generating the llvm intrinsic in place of a sqrt function call.

Unfortunately, the behavior introduced by this patch will not match current gcc, xlc, icc, and 
possibly other compilers. The current clang/llvm behavior of returning 0.0 doesn't either. 
We knowingly approve of this difference with the other compilers in an attempt to flag code 
that is invoking undefined behavior.

A front-end warning should also try to convince the user that the program will fail:
http://llvm.org/bugs/show_bug.cgi?id=21093

Differential Revision: http://reviews.llvm.org/D5527

llvm-svn: 218803
2014-10-01 20:36:33 +00:00
Chandler Carruth b5c9e04b51 [x86] Sort the ISA-specific RUN lines for vector-sext.ll to go from
oldest to newest. This makes more sense to me and is more consistent
with other tests.

llvm-svn: 218802
2014-10-01 20:32:44 +00:00
Tim Northover 6a1ef73140 ARM: yes it can (as of r218789)
llvm-svn: 218801
2014-10-01 20:31:58 +00:00
Chandler Carruth c66ea0fc12 [x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll.
These tests are far and away the best sext and zext tests we have for
vectors. I'm going to merge the other similar tests into them and expand
the ISA coverage.

llvm-svn: 218800
2014-10-01 20:30:30 +00:00
Chandler Carruth 011088fc46 [x86] Cleanup and re-generate the checks for avx-zext.ll using the new
script.

llvm-svn: 218799
2014-10-01 20:27:16 +00:00
Duncan P. N. Exon Smith 611afb229c DIBuilder: Encapsulate DIExpression's element type
`DIExpression`'s elements are 64-bit integers that are stored as
`ConstantInt`.  The accessors already encapsulate the storage.  This
commit updates the `DIBuilder` API to also encapsulate that.

llvm-svn: 218797
2014-10-01 20:26:08 +00:00
Chandler Carruth fbba2fa8d9 [x86] Generate the FileCheck assertions for avx-blend.ll with my new
script to make them nice and predictable. This will ease updating them
for the new vector shuffle lowering and seeing the delta if any.

llvm-svn: 218795
2014-10-01 20:19:45 +00:00
Chandler Carruth 1f569b05b6 [x86] Clean up and generate detailed FileCheck assertions for
avx-sext.ll using my new script.

Also add an AVX2 mode to this test.

Part of cleaning up the test suite before enabling the new vector
shuffle lowering. This also highlights some of the abysmal failures of
the old shuffle lowering. Check out those 'pinsrw' and 'pextrw'
sequences!

llvm-svn: 218794
2014-10-01 20:19:32 +00:00
Bruno Cardoso Lopes e3c513a965 [MemoryDepAnalysis] Fix compile time slowdown
- Problem
One program takes ~3min to compile under -O2. This happens after a certain
function A is inlined ~700 times in a function B, inserting thousands of new
BBs. This leads to 80% of the compilation time spent in
GVN::processNonLocalLoad and
MemoryDependenceAnalysis::getNonLocalPointerDependency, while searching for
nonlocal information for basic blocks.

Usually, to avoid spending a long time to process nonlocal loads, GVN bails out
if it gets more than 100 deps as a result from
MD->getNonLocalPointerDependency.  However this only happens *after* all
nonlocal information for BBs have been computed, which is the bottleneck in
this scenario. For instance, there are 8280 times where
getNonLocalPointerDependency returns deps with more than 100 bbs and from
those, 600 times it returns more than 1000 blocks.

- Solution
Bail out early during the nonlocal info computation whenever we reach a
specified threshold.  This patch proposes a 100 BBs threshold, it also
reduces the compile time from 3min to 23s.

- Testing
The test-suite presented no compile nor execution time regressions.

Some numbers from my machine (x86_64 darwin):
 - 17s under -Oz (which avoids inlining).
 - 1.3s under -O1.
 - 2m51s under -O2 ToT
 *** 23s under -O2 w/ Result.size() > 100
 - 1m54s under -O2 w/ Result.size() > 500

With NumResultsLimit = 100, GVN yields the same outcome as in the
unlimited 3min version.

http://reviews.llvm.org/D5532
rdar://problem/18188041

llvm-svn: 218792
2014-10-01 20:07:13 +00:00
Sanjay Patel 0e4a83e89c Don't repeat function/variable name in comment. NFC.
llvm-svn: 218791
2014-10-01 19:39:32 +00:00
Adam Nemet fd6a73d70a [X86 disasm tblegen backend] Clean up numPhysicalOperands asserts
No functionality change intended.

This implements Elena's idea to put the new additionalOperand outside the
switch to cover all cases
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140929/237763.html).

Note only nontrivial change is in MRMSrcMemFrm.  This requires an inclusive
interval of [2, 4] because we have prefix-dependent *optional* immediate
operand.

llvm-svn: 218790
2014-10-01 19:28:11 +00:00
Tim Northover 5d72c5de02 ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.

rdar://problem/18011155

llvm-svn: 218789
2014-10-01 19:21:03 +00:00
Adrian Prantl 87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Duncan P. N. Exon Smith 08a83be3ea LTO: Add missing target triple from r218784
llvm-svn: 218786
2014-10-01 18:49:58 +00:00
Reed Kotler b9dc248e9e Add fptrunc to mips fast-sel
Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel

Test Plan:
fptrunc.ll
checked also with 4 internal mips build bot flavors mip32r1/miprs32r2 and at -O0 and -O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: rfuhler

Differential Revision: http://reviews.llvm.org/D5553

llvm-svn: 218785
2014-10-01 18:47:02 +00:00
Duncan P. N. Exon Smith 30c9242caa LTO: Ignore disabled diagnostic remarks
r206400 and r209442 added remarks that are disabled by default.
However, if a diagnostic handler is registered, the remarks are sent
unfiltered to the handler.  This is the right behaviour for clang, since
it has its own filters.

However, the diagnostic handler exposed in the LTO API receives only the
severity and message.  It doesn't have the information to filter by pass
name.  For LTO, disabled remarks should be filtered by the producer.

I've changed `LLVMContext::setDiagnosticHandler()` to take a `bool`
argument indicating whether to respect the built-in filters.  This
defaults to `false`, so other consumers don't have a behaviour change,
but `LTOCodeGenerator::setDiagnosticHandler()` sets it to `true`.

To make this behaviour testable, I added a `-use-diagnostic-handler`
command-line option to `llvm-lto`.

This fixes PR21108.

llvm-svn: 218784
2014-10-01 18:36:03 +00:00
David Blaikie 847b37ec8a Add an immovable type to test Optional<T>::emplace more rigorously after r218732.
llvm-svn: 218783
2014-10-01 18:29:44 +00:00
Adrian Prantl b458dc2eee Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl 25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Tom Stellard 79243d9664 R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol table
llvm-svn: 218776
2014-10-01 17:15:17 +00:00
Tom Stellard 0a4e9a3b25 C API: Add LLVMCloneModule()
llvm-svn: 218775
2014-10-01 17:14:57 +00:00
Jingyue Wu fd47fb9976 Revert r216862 due to a performance regression
Reported by Alexey Volkov in PR21115

llvm-svn: 218771
2014-10-01 15:22:13 +00:00
Toma Tabacu c4c202a9a7 [mips] Rename emit and parse functions for the .cpload assembler directive. NFC.
Summary: It's better if we have a consistent name for .cpload-related functions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5437

llvm-svn: 218768
2014-10-01 14:53:19 +00:00
Tom Stellard 3a35d8f4c2 R600/SI: Add a generic pseudo EXP instruction
llvm-svn: 218767
2014-10-01 14:44:45 +00:00
Tom Stellard 0c238c2fbe R600/SI: Add generic pseudo MTBUF instructions
llvm-svn: 218766
2014-10-01 14:44:43 +00:00
Tom Stellard c470c96e6b R600/SI: Add generic pseudo SMRD instructions
llvm-svn: 218765
2014-10-01 14:44:42 +00:00