Commit Graph

78486 Commits

Author SHA1 Message Date
Evan Cheng a4b6404cf0 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Evan Cheng aa93ceb164 Add missing avx pattern.
llvm-svn: 145272
2011-11-28 20:27:23 +00:00
Peter Collingbourne 952adc32d9 Add OpenCL blurb to release notes.
llvm-svn: 145270
2011-11-28 20:04:12 +00:00
Chad Rosier 61e8d1026f 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling 5ebc95ff4c Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Andrew Trick a8bdb7cbf1 Remove the temporary flag -disable-unroll-scev and dead code.
SCEV should now be used for trip count analysis, not LoopInfo.

llvm-svn: 145262
2011-11-28 19:22:09 +00:00
Eli Friedman 31f0116173 Add back a line I deleted by accident in r145141. Fixes uninitialized variable warnings and runtime failures.
llvm-svn: 145256
2011-11-28 18:50:37 +00:00
Michael J. Spencer f7a4ed7d7f Add object file related release notes.
llvm-svn: 145254
2011-11-28 18:20:09 +00:00
Jakob Stoklund Olesen 979dad7bd2 Explain what ExeDepsFix does.
llvm-svn: 145253
2011-11-28 18:03:11 +00:00
Rafael Espindola c87cebf2bf Fix spelling/grammar errors found by Duncan.
llvm-svn: 145250
2011-11-28 17:06:58 +00:00
Benjamin Kramer 4c16431a67 Handle more cases in APInt::getLowBitsSet's fast path.
llvm-svn: 145249
2011-11-28 16:56:38 +00:00
Bill Wendling 957cc212bb Support a 'final' release candidate tag.
llvm-svn: 145243
2011-11-28 11:45:10 +00:00
Duncan Sands 12330650f8 Silence wrong warnings from GCC about variables possibly being used
uninitialized: GCC doesn't understand that the variables are only used
if !UseImm, in which case they have been initialized.

llvm-svn: 145239
2011-11-28 10:31:27 +00:00
Craig Topper 818a983e93 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
Bob Wilson 3f35470fc7 Add an optional separate install prefix for internal components. rdar://10217046
Some files installed by clang are not relevant for general users and we'd like
to be able to install them to a different location.  This adds a new
--with-internal-prefix configure option and a corresponding PROJ_internal_prefix
makefile variable, which defaults to the standard prefix.  A tool makefile
can specify that it should be installed to this internal prefix by defining
INTERNAL_TOOL.

llvm-svn: 145234
2011-11-28 07:59:52 +00:00
NAKAMURA Takumi 8284ec46b6 test/lit.cfg: Enable the feature 'asserts' to check output of llc -version.
llc knows whether he is compiled with -DNDEBUG.
|  Optimized build with assertions.

llvm-svn: 145230
2011-11-28 05:09:15 +00:00
NAKAMURA Takumi a0d652e71b lit/TestRunner.py: Use RemoveForce().
llvm-svn: 145223
2011-11-28 01:55:08 +00:00
NAKAMURA Takumi 57fc5adca0 lit/TestRunner.py: [Win32] Introduce WinWaitReleased(f), to wait for file handles to be released by children.
When wait() has finished, opened handles (especially writing stdout to file) might not be released immediately.
To wait for released, poll to attempt renaming.

llvm-svn: 145222
2011-11-28 01:55:01 +00:00
Jakob Stoklund Olesen 6d110aa84d Add a blurb about the new ExecutionDepsFix pass.
llvm-svn: 145220
2011-11-28 01:46:19 +00:00
Craig Topper b0456936da Make isCommutedVSHUFP more like the way isCommutedSHUFP is handled.
llvm-svn: 145218
2011-11-28 01:14:24 +00:00
NAKAMURA Takumi 4ad52a54b9 configure, config.h.in: Regenerate.
config.h.cmake: Synchronize to config.h.in.

llvm-svn: 145217
2011-11-28 01:07:19 +00:00
Dylan Noblesmith 3e79ef1d45 use llvm-config.h in public header
The config.h file's macros collide with other projects that include
LLVM and shouldn't get exported.

llvm-svn: 145215
2011-11-28 00:49:01 +00:00
Dylan Noblesmith efddf20126 rename ENABLE_THREADS to LLVM_ENABLE_THREADS
Now that it needs to be exported in a public header (Valgrind.h)
it should be prefixed to avoid collision with other projects.
Add it to llvm-config.h as well.

This'll require regenerating the configure script after this
commit, but I don't have the required autoconf version.

llvm-svn: 145214
2011-11-28 00:48:58 +00:00
Dylan Noblesmith daef41b1d1 update description of LLVM_DEFAULT_TARGET_TRIPLE
It was out of sync with the description in configure.ac/config.h.in.
Also re-alphabetize it from its position when it was LLVM_HOST_TRIPLE.

llvm-svn: 145213
2011-11-28 00:48:53 +00:00
Nick Lewycky 6404d97a99 Place the "cfg checksum" around a test. This was recently added in April 2011 to
gcc, though I thought it was older (my gcc 4.4 has it as a local patch. Whoops!)
This fixes PR10589.

Also add some debugging statements.

Remove GcnoFiles, the mapping from CompilationUnit to raw_ostream. Now that we
start by iterating over each CU and descending into them, there's no need to
maintain a mapping.

llvm-svn: 145208
2011-11-27 23:22:20 +00:00
Chris Lattner ef714c0b05 dwarf parsing stuff.
llvm-svn: 145207
2011-11-27 22:39:23 +00:00
Chris Lattner b035c31215 first pass of writing complete!
llvm-svn: 145206
2011-11-27 22:36:22 +00:00
Chris Lattner 7b32d97e02 arm and carve out a place ot mention segmented stacks.
llvm-svn: 145204
2011-11-27 22:12:32 +00:00
Rafael Espindola 799ca897e7 Add a description of the status of segmented stacks.
llvm-svn: 145201
2011-11-27 22:05:46 +00:00
Chris Lattner 7257f76728 optimize, mc, x86
llvm-svn: 145200
2011-11-27 22:03:34 +00:00
Craig Topper 79ee88a511 Merge detecting and handling for VSHUFPSY and VSHUFPDY since a lot of the code was similar for both.
llvm-svn: 145199
2011-11-27 21:41:12 +00:00
Chris Lattner 644976405f some writing.
llvm-svn: 145198
2011-11-27 21:30:28 +00:00
Chris Lattner 9661de7d30 fix some out-of-date attribution.
llvm-svn: 145197
2011-11-27 21:02:12 +00:00
Chris Lattner 6442197819 distribute various bullets to different sections.
llvm-svn: 145196
2011-11-27 20:51:47 +00:00
Chandler Carruth 4f56720754 Prevent rotating the blocks of a loop (and thus getting a backedge to be
fallthrough) in cases where we might fail to rotate an exit to an outer
loop onto the end of the loop chain.

Having *some* rotation, but not performing this rotation, is the primary
fix of thep performance regression with -enable-block-placement for
Olden/em3d (a whopping 30% regression). Still working on reducing the
test case that actually exercises this and the new rotation strategy out
of this code, but I want to check if this regresses other test cases
first as that may indicate it isn't the correct fix.

llvm-svn: 145195
2011-11-27 20:18:00 +00:00
Chris Lattner 080dd7ce30 rewrite the known problems section. Including a short list of individual bugs per target isn't particularly useful. Link to the target features matrix.
llvm-svn: 145193
2011-11-27 19:38:20 +00:00
Chris Lattner 4857190a50 move the detailed information about the EH rewrite to a comment, Bill is
blog'izing it.

llvm-svn: 145192
2011-11-27 19:26:30 +00:00
Chris Lattner e9a31c40b6 tweak subprojects' section
llvm-svn: 145191
2011-11-27 18:53:41 +00:00
Chris Lattner 25a7790603 some random notes.
llvm-svn: 145190
2011-11-27 18:47:37 +00:00
Chris Lattner 251d827d2c remove a test that is using old-style llvm.dbg intrinsics, apparently only
fails on ppc and arm hosts.

llvm-svn: 145188
2011-11-27 18:13:47 +00:00
Chandler Carruth 03adbd46ca Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

llvm-svn: 145183
2011-11-27 13:34:33 +00:00
Chandler Carruth 37ab257b88 Revert r145180 as it is causing test failures on all the bots.
Original commit message:
Fixed ObjectFile functions:
- getSymbolOffset() renamed as getSymbolFileOffset()
- getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile.
- added getRelocationOffset()
- fixed MachOObjectFile::getSymbolSize()
- fixed MachOObjectFile::getSymbolSection()
- fixed MachOObjectFile::getSymbolOffset() for symbols without section data.

llvm-svn: 145182
2011-11-27 10:37:47 +00:00
Chandler Carruth 9e46684154 Fix an impressive type-o / spell-o Duncan noticed.
llvm-svn: 145181
2011-11-27 10:32:16 +00:00
Danil Malyshev 2631f93f7d Fixed ObjectFile functions:
- getSymbolOffset() renamed as getSymbolFileOffset()
- getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile.
- added getRelocationOffset()
- fixed MachOObjectFile::getSymbolSize()
- fixed MachOObjectFile::getSymbolSection()
- fixed MachOObjectFile::getSymbolOffset() for symbols without section data.

llvm-svn: 145180
2011-11-27 10:12:52 +00:00
Chandler Carruth a054580993 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

llvm-svn: 145179
2011-11-27 09:22:53 +00:00
Chris Lattner 0bcbde46e2 Eli managed to kill off llvm.membarrier in llvm 3.0 also, this means
that mainline needs no autoupgrade logic for intrinsics yet, woohoo!

llvm-svn: 145178
2011-11-27 08:42:07 +00:00
Chris Lattner 3dcdc29d11 add some final random notes, I've completed my pass over all the commits.
I'll work on turning this into something intelligible tomorrow.

llvm-svn: 145177
2011-11-27 08:32:32 +00:00
Chris Lattner 410f3d7f5d The llvm.atomic intrinsics *were* removed in LLVM 3.0 (in r141333), remove the
autoupgrade logic for 2.9 and before.

llvm-svn: 145176
2011-11-27 08:18:55 +00:00
Chris Lattner ee471c484a remove autoupgrade support for old forms of llvm.prefetch and the old
trampoline forms.  Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.

llvm-svn: 145174
2011-11-27 07:42:04 +00:00
Chris Lattner d5bb9e6c4c add some notes.
llvm-svn: 145173
2011-11-27 07:37:53 +00:00
Chris Lattner bc639298e5 remove asmparsing and documentation support for "volatile load", which was only produced by LLVM 2.9 and earlier. LLVM 3.0 and later prefers "load volatile".
llvm-svn: 145172
2011-11-27 06:56:53 +00:00
Chris Lattner 6a144a2227 Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
llvm-svn: 145171
2011-11-27 06:54:59 +00:00
Chris Lattner ebed15e973 some notes.
llvm-svn: 145170
2011-11-27 06:24:49 +00:00
Chris Lattner 90ef78c07f remove autoupgrade support for really old-style debug info intrinsics.
I think this is the last of autoupgrade that can be removed in 3.1.
Can the atomic upgrade stuff also go?

llvm-svn: 145169
2011-11-27 06:18:33 +00:00
Chris Lattner 6aa6c0c3b7 remove some old autoupgrade logic
llvm-svn: 145167
2011-11-27 06:10:54 +00:00
Chris Lattner db89153969 remove autoupgrade support for LLVM 2.9 exception stuff. Mainline supports
LLVM 3.0 and later.

llvm-svn: 145165
2011-11-27 05:56:16 +00:00
Chris Lattner 1c9e5678b8 remove support for reading llvm 2.9 .bc files. LLVM 3.1 is only compatible back to 3.0
llvm-svn: 145164
2011-11-27 05:48:27 +00:00
Chris Lattner 74a3e00ebf add some notes
llvm-svn: 145163
2011-11-27 05:47:57 +00:00
Wesley Peck 97b3da5433 Add several new instructions supported by the latest MicroBlaze.
These instructions are not generated by the backend yet, this will come in a later commit.

llvm-svn: 145161
2011-11-27 05:16:58 +00:00
Bob Wilson 8e6d9da04c Partially revert r145157 to quiet an unhappy buildbot.
Removing that buildbot would be a better solution, but this is at least
a temporary workaround.

llvm-svn: 145160
2011-11-27 01:48:54 +00:00
Wesley Peck d2e2e1782f Optimize comparison against 0 in conditional instructions.
Fix a couple of 80-column violations.

llvm-svn: 145159
2011-11-27 01:36:20 +00:00
Chandler Carruth 9ffb97e631 Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

llvm-svn: 145158
2011-11-27 00:38:03 +00:00
Bob Wilson 4eefd2d52f Merge the install-clang-c target into install-clang. <rdar://problem/10217046>
llvm-svn: 145157
2011-11-27 00:26:22 +00:00
Benjamin Kramer 7ba71be392 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Craig Topper 51280d565b Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonalizing that occurs when shuffle nodes are created.
llvm-svn: 145153
2011-11-26 22:55:48 +00:00
Wesley Peck 69d5040485 Rename a couple of options and fix some simple typos.
llvm-svn: 145152
2011-11-26 21:50:38 +00:00
Craig Topper 7704bd7ac3 Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
llvm-svn: 145148
2011-11-26 20:47:44 +00:00
Benjamin Kramer 8c8486dbb2 Move the branch probability blurb into the optimizer section. Add a minimal bullet for AVX.
llvm-svn: 145145
2011-11-26 11:14:54 +00:00
David Chisnall 07618783f3 Added Objective-C and libc++ details to the 3.0 release notes.
llvm-svn: 145144
2011-11-26 10:56:17 +00:00
Chandler Carruth f156f0cf57 FileCheck-ize this test and make it more precise. This is in preparation
for adding other tests.

llvm-svn: 145143
2011-11-26 08:24:25 +00:00
Eli Friedman a84ad7d0d0 Fix APFloat::convert so that it handles narrowing conversions correctly; it
was returning incorrect values in rare cases, and incorrectly marking
exact conversions as inexact in some more common cases. Fixes PR11406, and a
missed optimization in test/CodeGen/X86/fp-stack-O0.ll.

llvm-svn: 145141
2011-11-26 03:38:02 +00:00
Benjamin Kramer a02af616b1 shpelling
llvm-svn: 145138
2011-11-25 21:26:00 +00:00
Benjamin Kramer 889b243fd6 Remove ZooLib from the projects list.
I don't see how the project is using LLVM and we really can't list every user of
the clang analyzer. Sorry.

llvm-svn: 145137
2011-11-25 21:03:06 +00:00
Chris Lattner c3e4fdcc10 add a user
llvm-svn: 145136
2011-11-25 20:36:17 +00:00
Chris Lattner 614d0391e9 add some notes
llvm-svn: 145135
2011-11-25 20:33:27 +00:00
Chris Lattner e5b37be30a add faust
llvm-svn: 145134
2011-11-25 20:28:16 +00:00
Bruno Cardoso Lopes 0f9a1f5e6c This patch contains support for encoding FMA4 instructions and
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.

Patch by Jan Sjodin

llvm-svn: 145133
2011-11-25 19:33:42 +00:00
NAKAMURA Takumi 989eaf6e3f ARMLoadStoreOptimizer.cpp: Fix MSVC(Debug) build.
llvm-svn: 145129
2011-11-25 09:19:57 +00:00
Craig Topper d65a444478 Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Craig Topper d26466748b Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
llvm-svn: 145125
2011-11-24 22:20:08 +00:00
Benjamin Kramer 8a2d143672 Devirtualize Pass::getPassID, overriding it isn't useful and it gets called a lot.
While at it pull the trivial ctor in line.

llvm-svn: 145124
2011-11-24 21:14:11 +00:00
Benjamin Kramer 6709e05012 Make ConstantRange::truncate a bit more efficient.
llvm-svn: 145122
2011-11-24 17:24:33 +00:00
Benjamin Kramer 651db37352 X86: alias cqo to cqto.
llvm-svn: 145121
2011-11-24 12:02:46 +00:00
Chandler Carruth 7adee1a01a Fix a silly use-after-free issue. A much earlier version of this code
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

llvm-svn: 145120
2011-11-24 11:23:15 +00:00
Chandler Carruth d394bafd2d When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

llvm-svn: 145119
2011-11-24 08:46:04 +00:00
Jim Grosbach 651e2ee792 Add a few notes for ARM and a blurb about the MCJIT.
llvm-svn: 145118
2011-11-24 00:49:21 +00:00
Akira Hatanaka 049e9e4d22 This patch makes the following changes necessary for MIPS' direct code emission.
- lower unaligned loads/stores.
- encode the size operand of instructions INS and EXT.
- emit relocation information needed for JAL (jump-and-link).  

llvm-svn: 145113
2011-11-23 22:19:28 +00:00
Akira Hatanaka f5ddf13f79 This patch addresses gp relative fixups/relocations for jump tables.
llvm-svn: 145112
2011-11-23 22:18:04 +00:00
Richard Smith 4f9a8081c3 Correctly byte-swap APInts with bit-widths greater than 64.
llvm-svn: 145111
2011-11-23 21:33:37 +00:00
Benjamin Kramer 6e013bf96c Validate the return type when checking if a function is malloc.
Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful.

llvm-svn: 145106
2011-11-23 17:58:47 +00:00
Duncan Sands 81a2af12d6 Fix a crash in which a multiplication was being reported as being both negative
and positive: positive, because it could be directly computed to be positive;
negative, because the nsw flags means it is either negative or undefined (the
multiplication always overflowed).

llvm-svn: 145104
2011-11-23 16:26:47 +00:00
Benjamin Kramer ebcb451874 X86: Use btq for bit tests if the immediate can't be encoded in 32 bits.
Before:
	movabsq	$4294967296, %rax       ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00]
	testq	%rax, %rdi              ## encoding: [0x48,0x85,0xf8]
	jne	LBB0_2                  ## encoding: [0x75,A]

After:
	btq	$32, %rdi               ## encoding: [0x48,0x0f,0xba,0xe7,0x20]
	jb	LBB0_2                  ## encoding: [0x72,A]

btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off
saving one register and a giant movabsq.

llvm-svn: 145103
2011-11-23 13:54:17 +00:00
NAKAMURA Takumi 0b3e996485 test/CodeGen/X86/block-placement.ll: Add explicit -mtriple=i686-linux. X86 Win32 CodeGen does not support EH yet.
llvm-svn: 145101
2011-11-23 12:18:22 +00:00
Chandler Carruth 99fe42fbd9 Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resillient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

llvm-svn: 145100
2011-11-23 10:35:36 +00:00
Elena Demikhovsky 779ba6d7b7 I added several lines in X86 code generator that allow to choose
VSHUFPS/VSHUFPD instructions while lowering VECTOR_SHUFFLE node. I check a commuted VSHUFP mask.

The patch was reviewed by Bruno.

llvm-svn: 145099
2011-11-23 10:23:16 +00:00
Chandler Carruth 8c68f1f3c8 Handle the case of a no-return invoke correctly. It actually still has
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.

llvm-svn: 145098
2011-11-23 08:23:54 +00:00
Bob Wilson ebb44646c4 Enable stack protectors for all arrays, not just char arrays. rdar://5875909
Patch by Bill Wendling.

llvm-svn: 145097
2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen 02845410f9 Fix PR11422.
This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

llvm-svn: 145096
2011-11-23 04:03:08 +00:00
Rafael Espindola 5d03d46127 Point to libLTO with -L/PATH/ -lLTO so that it is found in the install
directory.
Patch by Markus Trippelsdorf.

llvm-svn: 145095
2011-11-23 03:07:25 +00:00
Chandler Carruth 4a87aa0c31 Fix a crash in block placement due to an inner loop that happened to be
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094
2011-11-23 03:03:21 +00:00
Kostya Serebryany 8b5c7a56a3 [asan] do not instrument threadlocal globals, this is buggy
llvm-svn: 145092
2011-11-23 02:10:54 +00:00
Anshuman Dasgupta bcf6a37a58 Undo test commit
llvm-svn: 145079
2011-11-22 20:05:48 +00:00
Anshuman Dasgupta 9ff0894703 Test commit
llvm-svn: 145078
2011-11-22 20:03:30 +00:00
Hal Finkel 6f0ae783fe add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern
llvm-svn: 145065
2011-11-22 16:21:04 +00:00
Craig Topper 83c4592619 More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries.
llvm-svn: 145063
2011-11-22 14:27:57 +00:00
Chandler Carruth ee54feb6f6 Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad accessor, and to assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062
2011-11-22 13:13:16 +00:00
Benjamin Kramer e1effb0da2 Add configure checking for pread(2) and use it to save a syscall when reading files.
llvm-svn: 145061
2011-11-22 12:31:53 +00:00
Chandler Carruth e2530dc889 Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Benjamin Kramer f22623b78b Turn error recovery into an assert.
This was put in because in a certain version of DragonFlyBSD stat(2) lied about the
size of some files. This was fixed a long time ago so we can remove the workaround.

llvm-svn: 145059
2011-11-22 11:37:11 +00:00
Rafael Espindola c55e1af137 Add triple to the test.
llvm-svn: 145057
2011-11-22 06:36:25 +00:00
Rafael Espindola 2021f38281 If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
2011-11-22 06:27:18 +00:00
Craig Topper ccb7097509 Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms.
llvm-svn: 145055
2011-11-22 01:57:35 +00:00
Craig Topper f563977795 Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX.
llvm-svn: 145053
2011-11-22 00:44:41 +00:00
Sebastian Pop 74e1bc7933 fix typo in comment
llvm-svn: 145048
2011-11-21 20:46:55 +00:00
Nick Lewycky 063ae5897c Fix crasher in GVN due to my recent capture tracking changes.
llvm-svn: 145047
2011-11-21 19:42:56 +00:00
Nick Lewycky aa2a00db35 Add virtual destructor. Whoops!
llvm-svn: 145044
2011-11-21 18:32:21 +00:00
Craig Topper 6270d072c5 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
llvm-svn: 145028
2011-11-21 08:26:50 +00:00
Craig Topper d12d6f4b1c Test case for r145026
llvm-svn: 145027
2011-11-21 06:58:09 +00:00
Craig Topper 669199ca94 Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
llvm-svn: 145026
2011-11-21 06:57:39 +00:00
Joe Abbey 96e89f6412 Fixing a comment
llvm-svn: 145025
2011-11-21 04:42:21 +00:00
Craig Topper a065238c6e Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled.
llvm-svn: 145022
2011-11-21 01:12:36 +00:00
Nick Lewycky 6ae03c3378 Less template, more virtual! Refactoring suggested by Chris in code review.
llvm-svn: 145014
2011-11-20 19:37:06 +00:00
Nick Lewycky 612d70b19d Refactor code to use new attribute getters on CallSite for NoCapture and ByVal.
Suggested in code review by Eli.

That code in InstCombine looks kinda suspicious.

llvm-svn: 145013
2011-11-20 19:09:04 +00:00
NAKAMURA Takumi 76dfa03874 test/CodeGen/X86/block-placement.ll: Relax expressions for Win32.
llvm-svn: 145011
2011-11-20 12:49:45 +00:00
Chandler Carruth 18dfac385b The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
Chandler Carruth bcb5f39526 Make an obviously const interface actually be marked as const.
llvm-svn: 145009
2011-11-20 11:22:03 +00:00
Benjamin Kramer 650c09aa4d XFAIL this test until I figure out what indvars is doing here (or find someone who does)
llvm-svn: 145008
2011-11-20 11:10:03 +00:00
Benjamin Kramer b5ba2eef2d SCEV: Actually set overflow flags on add expressions.
setFlags doesn't modify its arguments.

llvm-svn: 145007
2011-11-20 10:24:36 +00:00
Chandler Carruth 20df3953d3 Add some comments to the latest test case I added here to document what
is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a trick section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

llvm-svn: 145006
2011-11-20 09:30:40 +00:00
Craig Topper e79761df73 Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper a3a6583694 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Craig Topper bac86038ac Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns.
llvm-svn: 145003
2011-11-19 21:01:54 +00:00
Craig Topper 3af6ae089f Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns.
llvm-svn: 144999
2011-11-19 17:46:46 +00:00
Chandler Carruth f3dc9eff16 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00
Craig Topper 6d77f4ae14 Test cases for SSSE3/AVX integer horizontal add/sub.
llvm-svn: 144990
2011-11-19 09:03:33 +00:00
Craig Topper f984efbfce Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
llvm-svn: 144989
2011-11-19 09:02:40 +00:00
Craig Topper 81390be00f Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
llvm-svn: 144988
2011-11-19 07:33:10 +00:00
Craig Topper de6b73bb4d Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Craig Topper 75ffc5fbb5 Remove some unnecessary filtering checks from X86 disassembler table build.
llvm-svn: 144986
2011-11-19 05:48:20 +00:00
Craig Topper 66e2b5a61e Remove unused parameters from the AVX maskmov classes.
llvm-svn: 144985
2011-11-19 04:49:22 +00:00
Andrew Trick 6b4d578f54 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Nadav Rotem 1ec141d0f9 Add AVX2 vpbroadcast support
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Kostya Serebryany 1cdc6e9567 [asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler
llvm-svn: 144962
2011-11-18 01:41:06 +00:00
Chad Rosier ee93ff736a Guard call to getRegForValue with isTypeLegal check to avoid unnecessary work/dead code.
llvm-svn: 144959
2011-11-18 01:17:34 +00:00
Devang Patel 107e8ec30d DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or ad DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Kostya Serebryany a6edf4c21f quick fix: remove GlobalVariable::GlobalVariable mistakenly commited at r144933. For some reason this compiles on linux
llvm-svn: 144936
2011-11-17 23:37:53 +00:00
Andrew Trick 949045864d Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Kostya Serebryany 65e2211b95 fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
2011-11-17 23:14:59 +00:00
Ted Kremenek b42cfa0015 Fix bug in RefCountedBase/RefCountedBaseVPTR where the reference count was accidentally copied as part of the copy constructor. This could result in objects getting leaked because there reference count was too high.
llvm-svn: 144931
2011-11-17 23:02:14 +00:00
Chad Rosier 0eff3e5c21 Add TODO comment.
llvm-svn: 144920
2011-11-17 21:46:13 +00:00
Craig Topper f41e1d0246 Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
llvm-svn: 144896
2011-11-17 07:49:38 +00:00
Chad Rosier 15b2498e88 Dead code.
llvm-svn: 144888
2011-11-17 07:24:49 +00:00
Chad Rosier f83ab704e4 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
Craig Topper f17b600577 Remove seemingly unnecessary duplicate VROUND definitions.
llvm-svn: 144885
2011-11-17 07:04:00 +00:00
Chris Lattner 0e439c9b7c x86/windows issues fixed.
llvm-svn: 144878
2011-11-17 01:42:23 +00:00
Eli Friedman 489c0ff4a4 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Daniel Dunbar 52f71220d5 llvm-build: Attempt to work around a CMake Makefile generator bug that doesn't
properly quote strings when writing the CMakeFiles/Makefile.cmake output file
(which lists the dependencies). This shows up when using CMake + MSYS Makefile
generator.

llvm-svn: 144873
2011-11-17 01:19:53 +00:00
Chad Rosier ce619ddfc5 Don't unconditionally set the kill flag.
rdar://10456186

llvm-svn: 144872
2011-11-17 01:16:53 +00:00
Eli Friedman 20439a42b0 Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing.
llvm-svn: 144867
2011-11-17 00:21:52 +00:00
Daniel Dunbar 586aabc44a build/make/test: Get rid of unused BUGPOINT_TOPTS variable.
llvm-svn: 144864
2011-11-16 23:56:03 +00:00
Eli Friedman ff1eaa7578 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Michael J. Spencer d27d51fbaf Object/COFF: Support common symbols.
llvm-svn: 144861
2011-11-16 23:36:12 +00:00
Jim Grosbach f4d2e0d458 Remove obsolete test.
The PLD encoding is checked via the .s file now.

llvm-svn: 144853
2011-11-16 22:50:38 +00:00
Jim Grosbach d3f02cbce9 Generalize the fixup info for ARM mode.
We don't (yet) have the granularity in the fixups to be specific about which
bitranges are affected. That's a future cleanup, but we're not there yet.

llvm-svn: 144852
2011-11-16 22:48:37 +00:00
Jim Grosbach d66cb5ab33 Update test for r144842.
llvm-svn: 144851
2011-11-16 22:46:27 +00:00
Akira Hatanaka b31abde0f3 Lower 64-bit constant pool node.
llvm-svn: 144849
2011-11-16 22:44:38 +00:00
Akira Hatanaka eb42071721 Lower 64-bit block address.
llvm-svn: 144847
2011-11-16 22:42:10 +00:00
Jim Grosbach 7ccdb7c0ae Fix encoding of NOP used for padding in ARM mode .align.
llvm-svn: 144842
2011-11-16 22:40:25 +00:00
Akira Hatanaka 7b8547c4d0 Add patterns for 64-bit tglobaladdr, tblockaddress, tjumptable and tconstpool
nodes.

llvm-svn: 144841
2011-11-16 22:39:56 +00:00
Akira Hatanaka 6d617ceca2 64-bit jump register instruction.
llvm-svn: 144840
2011-11-16 22:36:01 +00:00
Evan Cheng 011538dc79 Another missing X86ISD::MOVLPD pattern. rdar://10450317
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Jim Grosbach bfe5c5c968 ARM assembly parsing for shifted register operands for MOV instruction.
llvm-svn: 144837
2011-11-16 21:50:05 +00:00
Jim Grosbach 01e0439240 Clean up debug printing of ARM shifted operands.
llvm-svn: 144836
2011-11-16 21:46:50 +00:00
Chad Rosier ff40b1e164 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier cfd0d10e72 Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., over
states misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Chandler Carruth d1eafe118d There are already problems with building LLVM under VS2005, and it's
quite old now. Update the documentation to reflect this, and direct
people to use VS2008 or newer.

llvm-svn: 144818
2011-11-16 19:52:13 +00:00
Jim Grosbach 3127ab6d8f ARM assmebly two operand forms for LSR, ASR, LSL, ROR register.
llvm-svn: 144814
2011-11-16 19:12:24 +00:00
Jim Grosbach 1a2f9ee3c8 ARM assembly parsing for RRX mnemonic.
rdar://9704684

llvm-svn: 144812
2011-11-16 19:05:59 +00:00
Pete Cooper 48784ed5b7 Added missing comment about new custom lowering of DEC64
llvm-svn: 144811
2011-11-16 19:03:23 +00:00
Evan Cheng 822ddde50d Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Chad Rosier 80979b6ea6 Check to make sure we can select the instruction before trying to put the
operands into a register.  Otherwise, we may materialize dead code.

llvm-svn: 144805
2011-11-16 18:39:44 +00:00
Evan Cheng 624eb2af6f Disable the assertion again. Looks like fastisel is still generating bad kill markers.
llvm-svn: 144804
2011-11-16 18:32:14 +00:00
Jim Grosbach abcac56869 ARM mode aliases for bitwise instructions w/ register operands.
rdar://9704684

llvm-svn: 144803
2011-11-16 18:31:45 +00:00
Bob Wilson 0ca7ce389c Fix tablegen warning: hasSideEffects is inferred for eh_sjlj_dispatchsetup.
llvm-svn: 144798
2011-11-16 17:09:59 +00:00
NAKAMURA Takumi b345060a85 lib/Target/ARM/CMakeLists.txt: Disable optimization in ARMISelLowering.cpp also on MSC15(aka VS9). Seems miscompiled.
llvm-svn: 144794
2011-11-16 09:18:28 +00:00
Evan Cheng ecb2908bf9 Sink codegen optimization level into MCCodeGenInfo along side relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Bob Wilson cca9aa58ca Record landing pads with a SmallSetVector to avoid multiple entries.
There may be many invokes that share one landing pad, and the previous code
would record the landing pad once for each invoke.  Besides the wasted
effort, a pair of volatile loads gets inserted every time the landing pad is
processed.  The rest of the code can get optimized away when a landing pad
is processed repeatedly, but the volatile loads remain, resulting in code like:

LBB35_18:
Ltmp483:
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r4, [r7, #-72]
        ldr     r2, [r7, #-68]

llvm-svn: 144787
2011-11-16 07:57:21 +00:00
Craig Topper 3ed7d9ee5a Fix the execution domain on a bunch of SSE/AVX instructions.
llvm-svn: 144784
2011-11-16 07:30:46 +00:00
Bob Wilson 643e63c40c Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602>
This same basic code was in the older version of the SjLj exception handling,
but it was removed in the recent revisions to that code.  It needs to be there.

llvm-svn: 144782
2011-11-16 07:12:00 +00:00
Bob Wilson f6d1728d8f Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602>
The EmitBasePointerRecalculation function has 2 problems, one minor and one
fatal.  The minor problem is that it inserts the code at the setjmp
instead of in the dispatch block.  The fatal problem is that at the point
where this code runs, we don't know whether there will be a base pointer,
so the entire function is a no-op.  The base pointer recalculation needs to
be handled as it was before, by inserting a pseudo instruction that gets
expanded late.

Most of the support for the old approach is still here, but it no longer
has any connection to the eh_sjlj_dispatchsetup intrinsic.  Clean up the
parts related to the intrinsic and just generate the pseudo instruction
directly.

llvm-svn: 144781
2011-11-16 07:11:57 +00:00
Craig Topper 07d8b5e2c9 Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636.
llvm-svn: 144777
2011-11-16 05:02:04 +00:00
Evan Cheng 4ac36c8e26 Revert r144568 now that r144730 has fixed the fast-isel kill marker bug.
llvm-svn: 144776
2011-11-16 04:55:01 +00:00
Nick Lewycky db408f1857 Fix typo in test.
llvm-svn: 144774
2011-11-16 03:56:38 +00:00
Nick Lewycky c7f1e7993c Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Evan Cheng b8c55a5339 If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges.
llvm-svn: 144772
2011-11-16 03:47:42 +00:00
Evan Cheng 59f8156ea0 RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185
llvm-svn: 144771
2011-11-16 03:33:08 +00:00
Evan Cheng 9ddd69a8bc Process all uses first before defs to accurately capture register liveness. rdar://10449480
llvm-svn: 144770
2011-11-16 03:05:12 +00:00
Eli Friedman e6270395e3 Fix testcase.
llvm-svn: 144769
2011-11-16 03:03:52 +00:00
Eli Friedman 87f92512c3 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman d257a464d1 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Michael J. Spencer 415e8aa84a Remove extra ,.
llvm-svn: 144759
2011-11-16 01:36:50 +00:00
Kostya Serebryany 6e6b03ec46 AddressSanitizer, first commit (compiler module only)
llvm-svn: 144758
2011-11-16 01:35:23 +00:00
Michael J. Spencer ca48ae0194 Object/Archive: Give Child a operator < for map.
llvm-svn: 144757
2011-11-16 01:25:13 +00:00
Michael J. Spencer ecdd31181d Support/COFF: Add structs and enums from the standard for image files.
llvm-svn: 144756
2011-11-16 01:24:57 +00:00
Michael J. Spencer 53723de5b8 llvm-objdump: Ignore non-objects in archives.
llvm-svn: 144755
2011-11-16 01:24:41 +00:00
Kostya Serebryany db999c01f2 test commit to verify that commit access works (added blank line)
llvm-svn: 144748
2011-11-16 01:14:38 +00:00
Owen Anderson ca2f78a95b Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Andrew Trick 90c7a108ca Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Chad Rosier af13d767a2 Add FIXME comment.
llvm-svn: 144743
2011-11-16 00:32:20 +00:00
Jakob Stoklund Olesen 653183fd5c Enable -widen-vmovs by default.
This will widen 32-bit register vmov instructions to 64-bit when
possible.  The 64-bit vmovd instructions can then be translated to NEON
vorr instructions by the execution dependency fix pass.

The copies are only widened if they are marked as clobbering the whole
D-register.

llvm-svn: 144734
2011-11-15 23:53:18 +00:00
Eric Christopher 0abbd0ef5a Stabilize the output of the dwarf accelerator tables. Fixes a comparison
failure during bootstrap with it turned on.

llvm-svn: 144731
2011-11-15 23:37:17 +00:00
Chad Rosier 291ce47db7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Jim Grosbach e891fe8d6c ARM assembly parsing for register range syntax for VLD/VST register lists.
For example,
vld1.f64 {d2-d5}, [r2,:128]!

Should be equivalent to:
vld1.f64 {d2,d3,d4,d5}, [r2,:128]!

It's not documented syntax in the ARM ARM, but it is consistent with what's
accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to
support.

rdar://10451128

llvm-svn: 144727
2011-11-15 23:19:15 +00:00
Devang Patel 8cd896404a Merge ObjCPropertyDebugInfo.html into SourceLevelDebugging.html
llvm-svn: 144724
2011-11-15 22:59:54 +00:00
Jim Grosbach 003cea6011 ARM assembly parsing for data type suffices on NEON VMOV aliases.
llvm-svn: 144722
2011-11-15 22:54:42 +00:00
Nadav Rotem 51f71054b6 Fix MSVC warnings by adding a cast.
llvm-svn: 144721
2011-11-15 22:54:21 +00:00
Nadav Rotem 37010002f2 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
llvm-svn: 144720
2011-11-15 22:50:37 +00:00
Chris Lattner dfcdd0c5be jakob fixed X87 inline asm!
llvm-svn: 144719
2011-11-15 22:48:24 +00:00
Chris Lattner 7cdcbe3788 add ImmutableSet/Map dox, patch by Caitlin Sadowski!
llvm-svn: 144716
2011-11-15 22:40:14 +00:00
NAKAMURA Takumi f6b315c081 test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64.
llvm-svn: 144714
2011-11-15 22:30:37 +00:00
Jim Grosbach 75fb4abcdc ARM assembly parsing two operand forms for shift instructions.
llvm-svn: 144713
2011-11-15 22:27:54 +00:00
Chris Lattner 212a0867e9 add PTX backend info
llvm-svn: 144711
2011-11-15 22:23:46 +00:00
Jim Grosbach a01033709f ARM VFP assembly parsing for VADD and VSUB two-operand forms.
llvm-svn: 144710
2011-11-15 22:15:10 +00:00
Jim Grosbach 8279c1828f ARM accept an immediate offset in memory operands w/o the '#'.
llvm-svn: 144709
2011-11-15 22:14:41 +00:00
Chris Lattner 92f2183360 some notes.
llvm-svn: 144708
2011-11-15 22:13:27 +00:00
Pete Cooper 7c7ba1baa1 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Jim Grosbach 8d579230c6 ARM enclosing curly braces optional on one-register VLD/VST instruction lists.
'vld1.f32 d4, [r7]' should be parsed as equivalent to 'vld1.f32 {d4}, [r7]'

rdar://10450488.

llvm-svn: 144701
2011-11-15 21:45:55 +00:00
Akira Hatanaka b89a58df96 Update section "MIPS Target Improvements" in the llvm 3.0 release notes.
llvm-svn: 144699
2011-11-15 21:33:05 +00:00
Jim Grosbach 84f0ba5747 ARM size suffix on VFP single-precision 'vmov' is optional.
rdar://10435114

llvm-svn: 144698
2011-11-15 21:18:35 +00:00
Devang Patel 43bde96a4c Insert modified DBG_VALUE into LiveDbgValueMap.
llvm-svn: 144696
2011-11-15 21:03:58 +00:00
Jim Grosbach a92a5d8548 Fix typo.
llvm-svn: 144695
2011-11-15 21:01:30 +00:00
Jim Grosbach 131b45e632 ARM alternate size suffices for VTRN instructions.
rdar://10435076

llvm-svn: 144694
2011-11-15 20:49:46 +00:00
Owen Anderson 05060f0748 Fix a misplaced paren bug.
llvm-svn: 144692
2011-11-15 20:30:41 +00:00
Jim Grosbach 5803f6d5a2 ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns.
Yet more of rdar://10435076.

llvm-svn: 144691
2011-11-15 20:29:42 +00:00
Jim Grosbach c5b1bc561e ARM assembly parsing for two-operand form of 'mul' instruction.
rdar://10449856.

llvm-svn: 144689
2011-11-15 20:14:51 +00:00
Jim Grosbach 72dfd20aba ARM assembly parsing for two-operand form of 'mul' instruction.
Ongoing rdar://10435114.

llvm-svn: 144688
2011-11-15 20:02:06 +00:00
Jim Grosbach 68c899c211 Testcase for r144684.
llvm-svn: 144685
2011-11-15 19:56:17 +00:00
Jim Grosbach efa7e95d06 Thumb2 two-operand 'mul' instruction wide encoding parsing.
rdar://10449724

llvm-svn: 144684
2011-11-15 19:55:16 +00:00
Owen Anderson 0ac9058f89 Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32.
llvm-svn: 144683
2011-11-15 19:55:00 +00:00
Jim Grosbach 6efa7b9852 Thumb2 assembly parsing for mul.w in IT block fix.
When the 3rd operand is not a low-register, and the first two operands are
the same low register, the parser was incorrectly trying to use the 16-bit
instruction encoding.

rdar://10449281

llvm-svn: 144679
2011-11-15 19:29:45 +00:00
Benjamin Kramer b106bcc536 StringRefize and simplify.
llvm-svn: 144675
2011-11-15 19:12:09 +00:00
Rafael Espindola f11e7f1305 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674
2011-11-15 19:08:46 +00:00
Akira Hatanaka 6ee8fc88c7 Fix functions in MipsFrameLowering.cpp and MipsRegisterInfo.cpp. Use 64-bit
registers and instructions when ABI is N64.

llvm-svn: 144666
2011-11-15 18:53:55 +00:00
Akira Hatanaka 494913270e Set nomacro before emitting the sequence of instructions that set global pointer
register.

llvm-svn: 144665
2011-11-15 18:44:44 +00:00
Akira Hatanaka 66a14c0650 Simplify function PassByValArg64.
llvm-svn: 144664
2011-11-15 18:42:25 +00:00
Akira Hatanaka d519d8ca83 Remove function printMipsSymbolRef.
llvm-svn: 144663
2011-11-15 18:38:35 +00:00
Benjamin Kramer be6535b3dc Remove Value::getNameStr. It has been deprecated for a while and provides no additional value over getName().
llvm-svn: 144657
2011-11-15 18:30:12 +00:00
Benjamin Kramer 184e3ceea0 Missed some users of Value::getNameStr.
llvm-svn: 144656
2011-11-15 18:30:06 +00:00
Akira Hatanaka b7796ae938 Delete files.
llvm-svn: 144655
2011-11-15 18:22:48 +00:00
Akira Hatanaka 1c0590c5da Remove MipsMCSymbolRefExpr.
llvm-svn: 144654
2011-11-15 18:20:08 +00:00
Jim Grosbach 2aabaa704a ARM parsing datatype suffix variants for register-writeback VLD1/VST1 instructions.
rdar://10435076

llvm-svn: 144650
2011-11-15 17:49:59 +00:00
Jim Grosbach 0dde349df1 Tidy up. 80 columns.
llvm-svn: 144649
2011-11-15 16:46:22 +00:00
Benjamin Kramer 1f97a5a671 Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer 4c93d15f09 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen e14ef7e6f8 Check all overlaps when looking for used registers.
A function using any RC alias is enough to enable the ExeDepsFix pass.

llvm-svn: 144636
2011-11-15 08:20:43 +00:00
Jay Foad ab9ebd3521 Make use of MachinePointerInfo::getFixedStack.
llvm-svn: 144635
2011-11-15 07:51:13 +00:00
Jay Foad 70679df664 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Jay Foad e5cbd3c3fb Fix typo in comment.
llvm-svn: 144633
2011-11-15 07:50:05 +00:00
Jay Foad 465101bb0e Make use of MachinePointerInfo::getFixedStack. This removes all mention
of PseudoSourceValue from lib/Target/.

llvm-svn: 144632
2011-11-15 07:34:52 +00:00
Jay Foad 0745e645e0 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144631
2011-11-15 07:24:32 +00:00
Jakob Stoklund Olesen 4949b9a283 Revert r144611 and r144613.
These tests are actually correct, clang was miscompiling ExeDepsFix::processUses.

Evan fixed the miscompilation in r144628.

llvm-svn: 144630
2011-11-15 07:13:03 +00:00
Craig Topper 649d1c5eec Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled.
llvm-svn: 144629
2011-11-15 06:39:01 +00:00
Evan Cheng 7098c4e5f4 Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior.
llvm-svn: 144628
2011-11-15 06:26:51 +00:00
Chandler Carruth 9b548a7fcf Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

llvm-svn: 144627
2011-11-15 06:26:43 +00:00
Craig Topper 05baa85f58 Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
llvm-svn: 144622
2011-11-15 05:55:35 +00:00
NAKAMURA Takumi f01faac473 include/llvm/Support/Compiler.h: Invalidate LLVM_ATTRIBUTE_WEAK on cygming for now.
It triggers generating insane executables with both binutils-2.19.1(msysgit) and 2.22.51.20111013(cygwin).

llvm-svn: 144621
2011-11-15 05:24:26 +00:00
Jakob Stoklund Olesen 9c0de9bb6b Really fix test.
llvm-svn: 144613
2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen 14b66375a9 Allow for depencendy-breaking instructions before cvt*.
This should unbreak clang-x86_64-darwin10-RA, but I can't actually
reproduce the failure.

llvm-svn: 144611
2011-11-15 02:29:48 +00:00
Evan Cheng 7ca4b6eb5c Add vmov.f32 to materialize f32 immediate splats which cannot be handled by
integer variants. rdar://10437054

llvm-svn: 144608
2011-11-15 02:12:34 +00:00
Jim Grosbach 29cdcda80d ARM parsing datatype suffix variants for fixed-writeback VLD1/VST1 instructions.
rdar://10435076

llvm-svn: 144606
2011-11-15 01:46:57 +00:00
Nick Lewycky 6804d27048 Move WEAK marking to the declaration.
llvm-svn: 144603
2011-11-15 01:23:22 +00:00
Jakob Stoklund Olesen f8ad336bc4 Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen 543bef6ead Track register ages more accurately.
Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.

Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.

The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.

llvm-svn: 144601
2011-11-15 01:15:25 +00:00
Devang Patel aa284c564f Add ObjCPropertyDebugInfo.html
llvm-svn: 144600
2011-11-15 01:14:37 +00:00
Devang Patel 41eac807ee Document debug info support for objective-c properties.
llvm-svn: 144599
2011-11-15 01:11:58 +00:00
Jim Grosbach 7b03fbd25c Tidy up. Formatting.
llvm-svn: 144598
2011-11-15 01:05:12 +00:00
Nick Lewycky b2489b7484 Fix linking for some users who already have tsan enabled code and are trying to
link it against llvm code, by making our definitions weak. "Some users."

llvm-svn: 144596
2011-11-15 00:14:04 +00:00
Jim Grosbach a498af2b1d ARM parsing datatype suffix variants for non-writeback VST1 instructions.
rdar://10435076

llvm-svn: 144593
2011-11-14 23:43:46 +00:00
Jim Grosbach 72838a0345 ARM parsing datatype suffix variants for non-writeback VLD1 instructions.
rdar://10435076

llvm-svn: 144592
2011-11-14 23:32:59 +00:00
Jim Grosbach 750de7a399 Add explanatory comment.
llvm-svn: 144589
2011-11-14 23:21:09 +00:00
Jim Grosbach 9c2d9d597b Split out the plain '.{8|16|32|64}' suffix handling.
Make it easier to deal with aliases for instructions that do require a suffix
but accept more specific variants of the same size.

llvm-svn: 144588
2011-11-14 23:20:14 +00:00
Jim Grosbach 3d6c0e0bb2 ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions.
rdar://10435076

llvm-svn: 144587
2011-11-14 23:11:19 +00:00
Chad Rosier 057b6d3476 Supporting inline memmove isn't going to be worthwhile. The only way to avoid
violating a dependency is to emit all loads prior to stores.  This would likely
cause a great deal of spillage offsetting any potential gains.

llvm-svn: 144585
2011-11-14 23:04:09 +00:00
Jim Grosbach 3e2c6f380c ARM VLDR/VSTR instructions don't need a size suffix.
Canonicallize on the non-suffixed form, but continue to accept assembly that
has any correctly sized type suffix.

llvm-svn: 144583
2011-11-14 23:03:21 +00:00
Nick Lewycky 7013a19e8a Refactor capture tracking (which already had a couple flags for whether returns
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.

Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!

llvm-svn: 144580
2011-11-14 22:49:42 +00:00
Chad Rosier 4e88fbebde Add newline to end of file. Thanks, Eli.
llvm-svn: 144579
2011-11-14 22:48:33 +00:00
Chad Rosier ab7223e99a Add support for inlining small memcpys.
rdar://10412592

llvm-svn: 144578
2011-11-14 22:46:17 +00:00
Chad Rosier 45110fdf8d Fix a performance regression from r144565. Positive offsets were being lowered
into registers, rather then encoded directly in the load/store.

llvm-svn: 144576
2011-11-14 22:34:48 +00:00
Jim Grosbach 7996b15724 ARM assembly parsing type suffix options for VLDR/VSTR.
rdar://10435076

llvm-svn: 144575
2011-11-14 22:28:39 +00:00
Nick Lewycky e5dc7550a5 Fix Windows build, don't try to #include <pthread.h> when we know it's not
available.

llvm-svn: 144574
2011-11-14 22:10:23 +00:00
Evan Cheng f2fc508d4d Avoid dereferencing off the beginning of lists.
llvm-svn: 144569
2011-11-14 21:11:15 +00:00
Evan Cheng 28ffb7e444 At -O0, multiple uses of a virtual registers in the same BB are being marked
"kill". This looks like a bug upstream. Since that's going to take some time
to understand, loosen the assertion and disable the optimization when
multiple kills are seen.

llvm-svn: 144568
2011-11-14 21:02:09 +00:00
Nick Lewycky fe856110aa Add support for tsan annotations (thread sanitizer, a valgrind-based tool).
These annotations are disabled entirely when either ENABLE_THREADS is off, or
building a release build. When enabled, they add calls to functions with no
statements to ManagedStatic's getters.

Use these annotations to inform tsan that the race used inside ManagedStatic
initialization is actually benign. Thanks to Kostya Serebryany for helping
write this patch!

llvm-svn: 144567
2011-11-14 20:50:16 +00:00
Evan Cheng fb13d32b3f Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Chad Rosier adfd200bcb Add support for Thumb load/stores with negative offsets.
rdar://10412592

llvm-svn: 144565
2011-11-14 20:22:27 +00:00
Benjamin Kramer 319904cc7e Unbreak Release builds.
llvm-svn: 144560
2011-11-14 19:51:48 +00:00
Evan Cheng 30f44ad785 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688

llvm-svn: 144559
2011-11-14 19:48:55 +00:00
Pete Cooper 890e02e854 Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
Constant idx case is still done in tablegen but other cases are then expanded

Fixes <rdar://problem/10435460>

llvm-svn: 144557
2011-11-14 19:38:42 +00:00
Benjamin Kramer 42d098e1b4 Fold ConstantVector::isAllOnesValue into Constant::isAllOnesValue and simplify it.
llvm-svn: 144555
2011-11-14 19:12:20 +00:00
Akira Hatanaka f93b3f46f8 32-to-64-bit extended load.
llvm-svn: 144554
2011-11-14 19:06:14 +00:00
Akira Hatanaka 0b8bc00424 AnalyzeCallOperands function for N32/64.
N32/64 places all variable arguments in integer registers (or on stack),
regardless of their types, but follows calling convention of non-vaarg function
when it handles fixed arguments.

llvm-svn: 144553
2011-11-14 19:02:54 +00:00
Akira Hatanaka 52359363f2 Modify LowerFormalArguments to correctly handle vaarg arguments for Mips64.
llvm-svn: 144552
2011-11-14 19:01:09 +00:00
Justin Holewinski 33a519021c PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying on custom implementations.
llvm-svn: 144551
2011-11-14 18:58:20 +00:00
Wesley Peck 1c29a83acc Add release notes for the MicroBlaze backend.
llvm-svn: 144550
2011-11-14 18:56:41 +00:00
Akira Hatanaka d673cfe027 Remove variable that keeps the size of area used to save byval or variable
argument registers on the callee's stack frame, along with functions that set
and get it.
    
It is not necessary to add the size of this area when computing stack size in
emitPrologue, since it has already been accounted for in
PEI::calculateFrameObjectOffsets.

llvm-svn: 144549
2011-11-14 18:56:20 +00:00
Jakob Stoklund Olesen 7e6004a3c1 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

llvm-svn: 144547
2011-11-14 18:45:38 +00:00
Bob Wilson 8d1c7dbdff Disable generation of compact unwind encodings. <rdar://problem/10441578>
This still seems to be causing some failures.  It needs more testing before
it gets enabled again.

llvm-svn: 144543
2011-11-14 18:21:07 +00:00
Jakob Stoklund Olesen 7e07b388ac Delete stale comment.
llvm-svn: 144542
2011-11-14 18:03:05 +00:00
Jim Grosbach ee201faeac Tidy up. 80 column.
llvm-svn: 144538
2011-11-14 17:52:47 +00:00
Benjamin Kramer 0ffbcc959d Make headers standalone.
llvm-svn: 144537
2011-11-14 17:45:03 +00:00
Benjamin Kramer d00e94e882 Make headers standalone, move a virtual method out of line.
llvm-svn: 144536
2011-11-14 17:22:45 +00:00
Daniel Dunbar a5772b9278 build/Make: Switch over to using llvm-config-2 for dependencies one more (hopefully last) time, now that it also builds as a build tool.
llvm-svn: 144535
2011-11-14 17:17:45 +00:00
Chandler Carruth fd9b4d9813 It helps to deallocate memory as well as allocate it. =] This actually
cleans up all the chains allocated during the processing of each
function so that for very large inputs we don't just grow memory usage
without bound.

llvm-svn: 144533
2011-11-14 10:57:23 +00:00
Chandler Carruth 0a31d149ea Remove an over-eager assert that was firing on one of the ARM regression
tests when I forcibly enabled block placement.

It is apparantly possible for an unanalyzable block to fallthrough to
a non-loop block. I don't actually beleive this is correct, I believe
that 'canFallThrough' is returning true needlessly for the code
construct, and I've left a bit of a FIXME on the verification code to
try to track down why this is coming up.

Anyways, removing the assert doesn't degrade the correctness of the algorithm.

llvm-svn: 144532
2011-11-14 10:55:53 +00:00
Chandler Carruth 0af6a0bb69 Begin chipping away at one of the biggest quadratic-ish behaviors in
this pass. We're leaving already merged blocks on the worklist, and
scanning them again and again only to determine each time through that
indeed they aren't viable. We can instead remove them once we're going
to have to scan the worklist. This is the easy way to implement removing
them. If this remains on the profile (as I somewhat suspect it will), we
can get a lot more clever here, as the worklist's order is essentially
irrelevant. We can use swapping and fold the two loops to reduce
overhead even when there are many blocks on the worklist but only a few
of them are removed.

llvm-svn: 144531
2011-11-14 09:46:33 +00:00
Chandler Carruth 84cd44c750 Under the hood, MBPI is doing a linear scan of every successor every
time it is queried to compute the probability of a single successor.
This makes computing the probability of every successor of a block in
sequence... really really slow. ;] This switches to a linear walk of the
successors rather than a quadratic one. One of several quadratic
behaviors slowing this pass down.

I'm not really thrilled with moving the sum code into the public
interface of MBPI, but I don't (at the moment) have ideas for a better
interface. My direction I'm thinking in for a better interface is to
have MBPI actually retain much more state and make *all* of these
queries cheap. That's a lot of work, and would require invasive changes.
Until then, this seems like the least bad (ie, least quadratic)
solution. Suggestions welcome.

llvm-svn: 144530
2011-11-14 09:12:57 +00:00
Tobias Grosser 8bee91ffc6 Add clang_complete to release notes
llvm-svn: 144529
2011-11-14 09:09:26 +00:00
Tobias Grosser cfa35956c3 Add Polly to release notes
llvm-svn: 144528
2011-11-14 09:09:23 +00:00
Chandler Carruth a9e71faa0f Reuse the logic in getEdgeProbability within getHotSucc in order to
correctly handle blocks whose successor weights sum to more than
UINT32_MAX. This is slightly less efficient, but the entire thing is
already linear on the number of successors. Calling it within any hot
routine is a mistake, and indeed no one is calling it. It also
simplifies the code.

llvm-svn: 144527
2011-11-14 08:55:59 +00:00
Chandler Carruth ed5aa547bc Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortuately, the single-source GCC build is
good at this. The solution isn't very pretty, but its no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll colapse them into
a single implementation in a subsequent commit.

llvm-svn: 144526
2011-11-14 08:50:16 +00:00
Craig Topper 182b00a2e0 Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions.
llvm-svn: 144525
2011-11-14 08:07:55 +00:00
Chandler Carruth 2432d81ee4 Add a cautionary note to this API. It was not at all obvious to me how
expensive the most useful interface to this analysis is.

Fun story -- it's also not correct. That's getting fixed in another
patch.

llvm-svn: 144523
2011-11-14 06:51:49 +00:00
Craig Topper a331515c82 Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway.
llvm-svn: 144522
2011-11-14 06:46:21 +00:00
Chad Rosier 2a1df883d0 Add support for ARM halfword load/stores and signed byte loads with negative
offsets.
rdar://10412592

llvm-svn: 144518
2011-11-14 04:09:28 +00:00
Jakob Stoklund Olesen d7bcf43dc2 Use getVNInfoBefore() when it makes sense.
llvm-svn: 144517
2011-11-14 01:39:36 +00:00
Chandler Carruth 1071cfa4ae Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

llvm-svn: 144516
2011-11-14 00:00:35 +00:00
Jakob Stoklund Olesen 697979028f Use kill slots instead of the previous slot in shrinkToUses.
It's more natural to use the actual end points.

llvm-svn: 144515
2011-11-13 23:53:25 +00:00
Chandler Carruth c4a2cb34bb Cleanup some 80-columns violations and poor formatting. These snuck by
when I was reading through the code for style.

llvm-svn: 144513
2011-11-13 22:50:09 +00:00
Jakob Stoklund Olesen d8f2405e73 Terminate all dead defs at the dead slot instead of the 'next' slot.
This makes no difference for normal defs, but early clobber dead defs
now look like:

  [Slot_EarlyClobber; Slot_Dead)

instead of:

  [Slot_EarlyClobber; Slot_Register).

Live ranges for normal dead defs look like:

  [Slot_Register; Slot_Dead)

as before.

llvm-svn: 144512
2011-11-13 22:42:13 +00:00
Craig Topper 424ca7bbf5 Fix comment for LegalizeTypeAction enum.
llvm-svn: 144511
2011-11-13 22:11:24 +00:00
Jakob Stoklund Olesen ce7cc08f3a Simplify early clobber slots a bit.
llvm-svn: 144507
2011-11-13 22:05:42 +00:00
Chandler Carruth 8e1d906734 Enhance the assertion mechanisms in place to make it easier to catch
when we fail to place all the blocks of a loop. Currently this is
happening for unnatural loops, and this logic helps more immediately
point to the problem.

llvm-svn: 144504
2011-11-13 21:39:51 +00:00
Jakob Stoklund Olesen 90b5e565b6 Rename SlotIndexes to match how they are used.
The old naming scheme (load/use/def/store) can be traced back to an old
linear scan article, but the names don't match how slots are actually
used.

The load and store slots are not needed after the deferred spill code
insertion framework was deleted.

The use and def slots don't make any sense because we are using
half-open intervals as is customary in C code, but the names suggest
closed intervals.  In reality, these slots were used to distinguish
early-clobber defs from normal defs.

The new naming scheme also has 4 slots, but the names match how the
slots are really used.  This is a purely mechanical renaming, but some
of the code makes a lot more sense now.

llvm-svn: 144503
2011-11-13 20:45:27 +00:00
Craig Topper b8bcb473e2 Add BLSI, BLSMSK, and BLSR to getTargetNodeName.
llvm-svn: 144502
2011-11-13 17:31:07 +00:00
Chandler Carruth 0bb42c0f86 Teach MBP to force-merge layout successors for blocks with unanalyzable
branches that also may involve fallthrough. In the case of blocks with
no fallthrough, we can still re-order the blocks profitably. For example
instruction decoding will in some cases continue past an indirect jump,
making laying out its most likely successor there profitable.

Note, no test case. I don't know how to write a test case that exercises
this logic, but it matches the described desired semantics in
discussions with Jakob and others. If anyone has a nice example of IR
that will trigger this, that would be lovely.

Also note, there are still assertion failures in real world code with
this. I'm digging into those next, now that I know this isn't the cause.

llvm-svn: 144499
2011-11-13 12:17:28 +00:00
Chandler Carruth f9213fe721 Hoist another gross nested loop into a helper method.
llvm-svn: 144498
2011-11-13 11:42:26 +00:00
Chandler Carruth eb4ec3aea5 Add a missing doxygen comment for a helper method.
llvm-svn: 144497
2011-11-13 11:34:55 +00:00
Chandler Carruth b336172f90 Hoist a nested loop into its own method.
llvm-svn: 144496
2011-11-13 11:34:53 +00:00
Chandler Carruth 8d15078927 Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

llvm-svn: 144495
2011-11-13 11:20:44 +00:00
Chad Rosier 1198d894d0 The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall.
llvm-svn: 144494
2011-11-13 09:44:21 +00:00
Chad Rosier a476e391f1 Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures.
llvm-svn: 144492
2011-11-13 05:14:43 +00:00
Chad Rosier 5196efdf36 Fix comments.
llvm-svn: 144490
2011-11-13 04:25:02 +00:00
Chad Rosier c8cfd3a8fb Add support for emitting both signed- and zero-extend loads. Fix
SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8
offsets (addressing mode 3).  This enables a load followed by an integer 
extend to be folded into a single load.

For example:
ldrb r1, [r0]       ldrb r1, [r0]
uxtb r2, r1     =>
mov  r3, r2         mov  r3, r1

llvm-svn: 144488
2011-11-13 02:23:59 +00:00
NAKAMURA Takumi 4784df7161 Prune more RALinScan. RALinScan was also here!
llvm-svn: 144487
2011-11-13 01:33:10 +00:00
Jakob Stoklund Olesen c601d8c762 More dead code elimination in VirtRegMap.
This thing is looking a lot like a virtual register map now.

llvm-svn: 144486
2011-11-13 01:23:34 +00:00
Jakob Stoklund Olesen 28df7ef8c9 Stop tracking spill slot uses in VirtRegMap.
Nobody cared, StackSlotColoring scans the instructions to find used stack
slots.

llvm-svn: 144485
2011-11-13 01:23:30 +00:00
Jakob Stoklund Olesen 92255f27f1 Remove dead code and data from VirtRegMap.
Most of this stuff was supporting the old deferred spill code insertion
mechanism.  Modern spillers just edit machine code in place.

llvm-svn: 144484
2011-11-13 01:02:04 +00:00
Jakob Stoklund Olesen 38b3f312ca Stop tracking unused registers in VirtRegMap.
The information was only used by the register allocator in
StackSlotColoring.

llvm-svn: 144482
2011-11-13 00:39:45 +00:00
Jakob Stoklund Olesen 6ddb767fb5 Remove the -color-ss-with-regs option.
It was off by default.

The new register allocators don't have the problems that made it
necessary to reallocate registers during stack slot coloring.

llvm-svn: 144481
2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen 5343da6497 Delete VirtRegRewriter.
And there was much rejoicing.

llvm-svn: 144480
2011-11-13 00:16:01 +00:00