llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	a4b6404cf0	DAG combine should not increase alignment of loads / stores with alignment less than ABI alignment. These are loads / stores from / to "packed" data structures. Their alignments are intentionally under-specified. rdar://10301431 llvm-svn: 145273	2011-11-28 20:42:56 +00:00
Evan Cheng	aa93ceb164	Add missing avx pattern. llvm-svn: 145272	2011-11-28 20:27:23 +00:00
Peter Collingbourne	952adc32d9	Add OpenCL blurb to release notes. llvm-svn: 145270	2011-11-28 20:04:12 +00:00
Chad Rosier	61e8d1026f	80-column. llvm-svn: 145267	2011-11-28 19:59:09 +00:00
Bill Wendling	5ebc95ff4c	Remove dead llvm.eh.sjlj.dispatchsetup intrinsic. llvm-svn: 145263	2011-11-28 19:23:13 +00:00
Andrew Trick	a8bdb7cbf1	Remove the temporary flag -disable-unroll-scev and dead code. SCEV should now be used for trip count analysis, not LoopInfo. llvm-svn: 145262	2011-11-28 19:22:09 +00:00
Eli Friedman	31f0116173	Add back a line I deleted by accident in r145141. Fixes uninitialized variable warnings and runtime failures. llvm-svn: 145256	2011-11-28 18:50:37 +00:00
Michael J. Spencer	f7a4ed7d7f	Add object file related release notes. llvm-svn: 145254	2011-11-28 18:20:09 +00:00
Jakob Stoklund Olesen	979dad7bd2	Explain what ExeDepsFix does. llvm-svn: 145253	2011-11-28 18:03:11 +00:00
Rafael Espindola	c87cebf2bf	Fix spelling/grammar errors found by Duncan. llvm-svn: 145250	2011-11-28 17:06:58 +00:00
Benjamin Kramer	4c16431a67	Handle more cases in APInt::getLowBitsSet's fast path. llvm-svn: 145249	2011-11-28 16:56:38 +00:00
Bill Wendling	957cc212bb	Support a 'final' release candidate tag. llvm-svn: 145243	2011-11-28 11:45:10 +00:00
Duncan Sands	12330650f8	Silence wrong warnings from GCC about variables possibly being used uninitialized: GCC doesn't understand that the variables are only used if !UseImm, in which case they have been initialized. llvm-svn: 145239	2011-11-28 10:31:27 +00:00
Craig Topper	818a983e93	Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar. llvm-svn: 145238	2011-11-28 10:14:51 +00:00
Bob Wilson	3f35470fc7	Add an optional separate install prefix for internal components. rdar://10217046 Some files installed by clang are not relevant for general users and we'd like to be able to install them to a different location. This adds a new --with-internal-prefix configure option and a corresponding PROJ_internal_prefix makefile variable, which defaults to the standard prefix. A tool makefile can specify that it should be installed to this internal prefix by defining INTERNAL_TOOL. llvm-svn: 145234	2011-11-28 07:59:52 +00:00
NAKAMURA Takumi	8284ec46b6	test/lit.cfg: Enable the feature 'asserts' to check output of llc -version. llc knows whether he is compiled with -DNDEBUG. | Optimized build with assertions. llvm-svn: 145230	2011-11-28 05:09:15 +00:00
NAKAMURA Takumi	a0d652e71b	lit/TestRunner.py: Use RemoveForce(). llvm-svn: 145223	2011-11-28 01:55:08 +00:00
NAKAMURA Takumi	57fc5adca0	lit/TestRunner.py: [Win32] Introduce WinWaitReleased(f), to wait for file handles to be released by children. When wait() has finished, opened handles (especially writing stdout to file) might not be released immediately. To wait for released, poll to attempt renaming. llvm-svn: 145222	2011-11-28 01:55:01 +00:00
Jakob Stoklund Olesen	6d110aa84d	Add a blurb about the new ExecutionDepsFix pass. llvm-svn: 145220	2011-11-28 01:46:19 +00:00
Craig Topper	b0456936da	Make isCommutedVSHUFP more like the way isCommutedSHUFP is handled. llvm-svn: 145218	2011-11-28 01:14:24 +00:00
NAKAMURA Takumi	4ad52a54b9	configure, config.h.in: Regenerate. config.h.cmake: Synchronize to config.h.in. llvm-svn: 145217	2011-11-28 01:07:19 +00:00
Dylan Noblesmith	3e79ef1d45	use llvm-config.h in public header The config.h file's macros collide with other projects that include LLVM and shouldn't get exported. llvm-svn: 145215	2011-11-28 00:49:01 +00:00
Dylan Noblesmith	efddf20126	rename ENABLE_THREADS to LLVM_ENABLE_THREADS Now that it needs to be exported in a public header (Valgrind.h) it should be prefixed to avoid collision with other projects. Add it to llvm-config.h as well. This'll require regenerating the configure script after this commit, but I don't have the required autoconf version. llvm-svn: 145214	2011-11-28 00:48:58 +00:00
Dylan Noblesmith	daef41b1d1	update description of LLVM_DEFAULT_TARGET_TRIPLE It was out of sync with the description in configure.ac/config.h.in. Also re-alphabetize it from its position when it was LLVM_HOST_TRIPLE. llvm-svn: 145213	2011-11-28 00:48:53 +00:00
Nick Lewycky	6404d97a99	Place the "cfg checksum" around a test. This was recently added in April 2011 to gcc, though I thought it was older (my gcc 4.4 has it as a local patch. Whoops!) This fixes PR10589. Also add some debugging statements. Remove GcnoFiles, the mapping from CompilationUnit to raw_ostream. Now that we start by iterating over each CU and descending into them, there's no need to maintain a mapping. llvm-svn: 145208	2011-11-27 23:22:20 +00:00
Chris Lattner	ef714c0b05	dwarf parsing stuff. llvm-svn: 145207	2011-11-27 22:39:23 +00:00
Chris Lattner	b035c31215	first pass of writing complete! llvm-svn: 145206	2011-11-27 22:36:22 +00:00
Chris Lattner	7b32d97e02	arm and carve out a place to mention segmented stacks. llvm-svn: 145204	2011-11-27 22:12:32 +00:00
Rafael Espindola	799ca897e7	Add a description of the status of segmented stacks. llvm-svn: 145201	2011-11-27 22:05:46 +00:00
Chris Lattner	7257f76728	optimize, mc, x86 llvm-svn: 145200	2011-11-27 22:03:34 +00:00
Craig Topper	79ee88a511	Merge detecting and handling for VSHUFPSY and VSHUFPDY since a lot of the code was similar for both. llvm-svn: 145199	2011-11-27 21:41:12 +00:00
Chris Lattner	644976405f	some writing. llvm-svn: 145198	2011-11-27 21:30:28 +00:00
Chris Lattner	9661de7d30	fix some out-of-date attribution. llvm-svn: 145197	2011-11-27 21:02:12 +00:00
Chris Lattner	6442197819	distribute various bullets to different sections. llvm-svn: 145196	2011-11-27 20:51:47 +00:00
Chandler Carruth	4f56720754	Prevent rotating the blocks of a loop (and thus getting a backedge to be fallthrough) in cases where we might fail to rotate an exit to an outer loop onto the end of the loop chain. Having some rotation, but not performing this rotation, is the primary fix of the performance regression with -enable-block-placement for Olden/em3d (a whopping 30% regression). Still working on reducing the test case that actually exercises this and the new rotation strategy out of this code, but I want to check if this regresses other test cases first as that may indicate it isn't the correct fix. llvm-svn: 145195	2011-11-27 20:18:00 +00:00
Chris Lattner	080dd7ce30	rewrite the known problems section. Including a short list of individual bugs per target isn't particularly useful. Link to the target features matrix. llvm-svn: 145193	2011-11-27 19:38:20 +00:00
Chris Lattner	4857190a50	move the detailed information about the EH rewrite to a comment, Bill is blog'izing it. llvm-svn: 145192	2011-11-27 19:26:30 +00:00
Chris Lattner	e9a31c40b6	tweak subprojects' section llvm-svn: 145191	2011-11-27 18:53:41 +00:00
Chris Lattner	25a7790603	some random notes. llvm-svn: 145190	2011-11-27 18:47:37 +00:00
Chris Lattner	251d827d2c	remove a test that is using old-style llvm.dbg intrinsics, apparently only fails on ppc and arm hosts. llvm-svn: 145188	2011-11-27 18:13:47 +00:00
Chandler Carruth	03adbd46ca	Take two on rotating the block ordering of loops. My previous attempt was centered around the premise of laying out a loop in a chain, and then rotating that chain. This is good for preserving contiguous layout, but bad for actually making sane rotations. In order to keep it safe, I had to essentially make it impossible to rotate deeply nested loops. The information needed to correctly reason about a deeply nested loop is actually available -- before we layout the loop. We know the inner loops are already fused into chains, etc. We lose information the moment we actually lay out the loop. The solution was the other alternative for this algorithm I discussed with Benjamin and some others: rather than rotating the loop after-the-fact, try to pick a profitable starting block for the loop's layout, and then use our existing layout logic. I was worried about the complexity of this "pick" step, but it turns out such complexity is needed to handle all the important cases I keep teasing out of benchmarks. This is, I'm afraid, a bit of a work-in-progress. It is still misbehaving on some likely important cases I'm investigating in Olden. It also isn't really tested. I'm going to try to craft some interesting nested-loop test cases, but it's likely to be extremely time consuming and I don't want to go there until I'm sure I'm testing the correct behavior. Sadly I can't come up with a way of getting simple, fine grained test cases for this logic. We need complex loop structures to even trigger much of it. llvm-svn: 145183	2011-11-27 13:34:33 +00:00
Chandler Carruth	37ab257b88	Revert r145180 as it is causing test failures on all the bots. Original commit message: Fixed ObjectFile functions: - getSymbolOffset() renamed as getSymbolFileOffset() - getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile. - added getRelocationOffset() - fixed MachOObjectFile::getSymbolSize() - fixed MachOObjectFile::getSymbolSection() - fixed MachOObjectFile::getSymbolOffset() for symbols without section data. llvm-svn: 145182	2011-11-27 10:37:47 +00:00
Chandler Carruth	9e46684154	Fix an impressive type-o / spell-o Duncan noticed. llvm-svn: 145181	2011-11-27 10:32:16 +00:00
Danil Malyshev	2631f93f7d	Fixed ObjectFile functions: - getSymbolOffset() renamed as getSymbolFileOffset() - getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile. - added getRelocationOffset() - fixed MachOObjectFile::getSymbolSize() - fixed MachOObjectFile::getSymbolSection() - fixed MachOObjectFile::getSymbolOffset() for symbols without section data. llvm-svn: 145180	2011-11-27 10:12:52 +00:00
Chandler Carruth	a054580993	Rework a bit of the implementation of loop block rotation to not rely so heavily on AnalyzeBranch. That routine doesn't behave as we want given that rotation occurs mid-way through re-ordering the function. Instead merely check that there are not unanalyzable branching constructs present, and then reason about the CFG via successor lists. This actually simplifies my mental model for all of this as well. The concrete result is that we now will rotate more loop chains. I've added a test case from Olden highlighting the effect. There is still a bit more to do here though in order to regain all of the performance in Olden. llvm-svn: 145179	2011-11-27 09:22:53 +00:00
Chris Lattner	0bcbde46e2	Eli managed to kill off llvm.membarrier in llvm 3.0 also, this means that mainline needs no autoupgrade logic for intrinsics yet, woohoo! llvm-svn: 145178	2011-11-27 08:42:07 +00:00
Chris Lattner	3dcdc29d11	add some final random notes, I've completed my pass over all the commits. I'll work on turning this into something intelligible tomorrow. llvm-svn: 145177	2011-11-27 08:32:32 +00:00
Chris Lattner	410f3d7f5d	The llvm.atomic intrinsics were removed in LLVM 3.0 (in r141333), remove the autoupgrade logic for 2.9 and before. llvm-svn: 145176	2011-11-27 08:18:55 +00:00
Chris Lattner	ee471c484a	remove autoupgrade support for old forms of llvm.prefetch and the old trampoline forms. Both of these were correct in LLVM 3.0, and we don't need to support LLVM 2.9 and earlier in mainline. llvm-svn: 145174	2011-11-27 07:42:04 +00:00
Chris Lattner	d5bb9e6c4c	add some notes. llvm-svn: 145173	2011-11-27 07:37:53 +00:00
Chris Lattner	bc639298e5	remove asmparsing and documentation support for "volatile load", which was only produced by LLVM 2.9 and earlier. LLVM 3.0 and later prefers "load volatile". llvm-svn: 145172	2011-11-27 06:56:53 +00:00
Chris Lattner	6a144a2227	Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic. llvm-svn: 145171	2011-11-27 06:54:59 +00:00
Chris Lattner	ebed15e973	some notes. llvm-svn: 145170	2011-11-27 06:24:49 +00:00
Chris Lattner	90ef78c07f	remove autoupgrade support for really old-style debug info intrinsics. I think this is the last of autoupgrade that can be removed in 3.1. Can the atomic upgrade stuff also go? llvm-svn: 145169	2011-11-27 06:18:33 +00:00
Chris Lattner	6aa6c0c3b7	remove some old autoupgrade logic llvm-svn: 145167	2011-11-27 06:10:54 +00:00
Chris Lattner	db89153969	remove autoupgrade support for LLVM 2.9 exception stuff. Mainline supports LLVM 3.0 and later. llvm-svn: 145165	2011-11-27 05:56:16 +00:00
Chris Lattner	1c9e5678b8	remove support for reading llvm 2.9 .bc files. LLVM 3.1 is only compatible back to 3.0 llvm-svn: 145164	2011-11-27 05:48:27 +00:00
Chris Lattner	74a3e00ebf	add some notes llvm-svn: 145163	2011-11-27 05:47:57 +00:00
Wesley Peck	97b3da5433	Add several new instructions supported by the latest MicroBlaze. These instructions are not generated by the backend yet, this will come in a later commit. llvm-svn: 145161	2011-11-27 05:16:58 +00:00
Bob Wilson	8e6d9da04c	Partially revert r145157 to quiet an unhappy buildbot. Removing that buildbot would be a better solution, but this is at least a temporary workaround. llvm-svn: 145160	2011-11-27 01:48:54 +00:00
Wesley Peck	d2e2e1782f	Optimize comparison against 0 in conditional instructions. Fix a couple of 80-column violations. llvm-svn: 145159	2011-11-27 01:36:20 +00:00
Chandler Carruth	9ffb97e631	Introduce a loop block rotation optimization to the new block placement pass. This is designed to achieve one of the important optimizations that the old code placement pass did, but more simply. This is a somewhat rough and very conservative version of the transform. We could get a lot fancier here if there are profitable cases to do so. In particular, this only looks for a single pattern, it insists that the loop backedge being rotated away is the last backedge in the chain, and it doesn't provide any means of doing better in-loop placement due to the rotation. However, it appears that it will handle the important loops I am finding in the LLVM test suite. llvm-svn: 145158	2011-11-27 00:38:03 +00:00
Bob Wilson	4eefd2d52f	Merge the install-clang-c target into install-clang. <rdar://problem/10217046> llvm-svn: 145157	2011-11-27 00:26:22 +00:00
Benjamin Kramer	7ba71be392	Move code into anonymous namespaces. llvm-svn: 145154	2011-11-26 23:01:57 +00:00
Craig Topper	51280d565b	Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonicalizing that occurs when shuffle nodes are created. llvm-svn: 145153	2011-11-26 22:55:48 +00:00
Wesley Peck	69d5040485	Rename a couple of options and fix some simple typos. llvm-svn: 145152	2011-11-26 21:50:38 +00:00
Craig Topper	7704bd7ac3	Collapse X86ISD node types for PUNPCKH, PUNPCKL, UNPCKLP, and UNPCKHP to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type. llvm-svn: 145148	2011-11-26 20:47:44 +00:00
Benjamin Kramer	8c8486dbb2	Move the branch probability blurb into the optimizer section. Add a minimal bullet for AVX. llvm-svn: 145145	2011-11-26 11:14:54 +00:00
David Chisnall	07618783f3	Added Objective-C and libc++ details to the 3.0 release notes. llvm-svn: 145144	2011-11-26 10:56:17 +00:00
Chandler Carruth	f156f0cf57	FileCheck-ize this test and make it more precise. This is in preparation for adding other tests. llvm-svn: 145143	2011-11-26 08:24:25 +00:00
Eli Friedman	a84ad7d0d0	Fix APFloat::convert so that it handles narrowing conversions correctly; it was returning incorrect values in rare cases, and incorrectly marking exact conversions as inexact in some more common cases. Fixes PR11406, and a missed optimization in test/CodeGen/X86/fp-stack-O0.ll. llvm-svn: 145141	2011-11-26 03:38:02 +00:00
Benjamin Kramer	a02af616b1	shpelling llvm-svn: 145138	2011-11-25 21:26:00 +00:00
Benjamin Kramer	889b243fd6	Remove ZooLib from the projects list. I don't see how the project is using LLVM and we really can't list every user of the clang analyzer. Sorry. llvm-svn: 145137	2011-11-25 21:03:06 +00:00
Chris Lattner	c3e4fdcc10	add a user llvm-svn: 145136	2011-11-25 20:36:17 +00:00
Chris Lattner	614d0391e9	add some notes llvm-svn: 145135	2011-11-25 20:33:27 +00:00
Chris Lattner	e5b37be30a	add faust llvm-svn: 145134	2011-11-25 20:28:16 +00:00
Bruno Cardoso Lopes	0f9a1f5e6c	This patch contains support for encoding FMA4 instructions and tablegen patterns for scalar FMA4 operations and intrinsic. Also add tests for vfmaddsd. Patch by Jan Sjodin llvm-svn: 145133	2011-11-25 19:33:42 +00:00
NAKAMURA Takumi	989eaf6e3f	ARMLoadStoreOptimizer.cpp: Fix MSVC(Debug) build. llvm-svn: 145129	2011-11-25 09:19:57 +00:00
Craig Topper	d65a444478	Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64. llvm-svn: 145126	2011-11-24 22:57:10 +00:00
Craig Topper	d26466748b	Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish. llvm-svn: 145125	2011-11-24 22:20:08 +00:00
Benjamin Kramer	8a2d143672	Devirtualize Pass::getPassID, overriding it isn't useful and it gets called a lot. While at it pull the trivial ctor in line. llvm-svn: 145124	2011-11-24 21:14:11 +00:00
Benjamin Kramer	6709e05012	Make ConstantRange::truncate a bit more efficient. llvm-svn: 145122	2011-11-24 17:24:33 +00:00
Benjamin Kramer	651db37352	X86: alias cqo to cqto. llvm-svn: 145121	2011-11-24 12:02:46 +00:00
Chandler Carruth	7adee1a01a	Fix a silly use-after-free issue. A much earlier version of this code needed lots of fanciness around retaining a reference to a Chain's slot in the BlockToChain map, but that's all gone now. We can just go directly to allocating the new chain (which will update the mapping for us) and using it. Somewhat gross mechanically generated test case replicates the issue Duncan spotted when actually testing this out. llvm-svn: 145120	2011-11-24 11:23:15 +00:00
Chandler Carruth	d394bafd2d	When adding blocks to the list of those which no longer have any CFG conflicts, we should only be adding the first block of the chain to the list, lest we try to merge into the middle of that chain. Most of the places we were doing this we already happened to be looking at the first block, but there is no reason to assume that, and in some cases it was clearly wrong. I've added a couple of tests here. One already worked, but I like having an explicit test for it. The other is reduced from a test case Duncan reduced for me and used to crash. Now it is handled correctly. llvm-svn: 145119	2011-11-24 08:46:04 +00:00
Jim Grosbach	651e2ee792	Add a few notes for ARM and a blurb about the MCJIT. llvm-svn: 145118	2011-11-24 00:49:21 +00:00
Akira Hatanaka	049e9e4d22	This patch makes the following changes necessary for MIPS' direct code emission. - lower unaligned loads/stores. - encode the size operand of instructions INS and EXT. - emit relocation information needed for JAL (jump-and-link). llvm-svn: 145113	2011-11-23 22:19:28 +00:00
Akira Hatanaka	f5ddf13f79	This patch addresses gp relative fixups/relocations for jump tables. llvm-svn: 145112	2011-11-23 22:18:04 +00:00
Richard Smith	4f9a8081c3	Correctly byte-swap APInts with bit-widths greater than 64. llvm-svn: 145111	2011-11-23 21:33:37 +00:00
Benjamin Kramer	6e013bf96c	Validate the return type when checking if a function is malloc. Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful. llvm-svn: 145106	2011-11-23 17:58:47 +00:00
Duncan Sands	81a2af12d6	Fix a crash in which a multiplication was being reported as being both negative and positive: positive, because it could be directly computed to be positive; negative, because the nsw flags means it is either negative or undefined (the multiplication always overflowed). llvm-svn: 145104	2011-11-23 16:26:47 +00:00
Benjamin Kramer	ebcb451874	X86: Use btq for bit tests if the immediate can't be encoded in 32 bits. Before: movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00] testq %rax, %rdi ## encoding: [0x48,0x85,0xf8] jne LBB0_2 ## encoding: [0x75,A] After: btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20] jb LBB0_2 ## encoding: [0x72,A] btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off saving one register and a giant movabsq. llvm-svn: 145103	2011-11-23 13:54:17 +00:00
NAKAMURA Takumi	0b3e996485	test/CodeGen/X86/block-placement.ll: Add explicit -mtriple=i686-linux. X86 Win32 CodeGen does not support EH yet. llvm-svn: 145101	2011-11-23 12:18:22 +00:00
Chandler Carruth	99fe42fbd9	Relax an invariant that block placement was trying to assert a bit further. This invariant just wasn't going to work in the face of unanalyzable branches; we need to be resilient to the phenomenon of chains poking into a loop and poking out of a loop. In fact, we already were, we just needed to not assert on it. This was found during a bootstrap with block placement turned on. llvm-svn: 145100	2011-11-23 10:35:36 +00:00
Elena Demikhovsky	779ba6d7b7	I added several lines to the X86 code generator that allow choosing VSHUFPS/VSHUFPD instructions while lowering a VECTOR_SHUFFLE node. I check a commuted VSHUFP mask. The patch was reviewed by Bruno. llvm-svn: 145099	2011-11-23 10:23:16 +00:00
Chandler Carruth	8c68f1f3c8	Handle the case of a no-return invoke correctly. It actually still has successors, they just are all landing pad successors. We handle this the same way as no successors. Comments attached for the next person to wade through here and another lovely test case courtesy of Benjamin Kramer's bugpoint reduction. llvm-svn: 145098	2011-11-23 08:23:54 +00:00
Bob Wilson	ebb44646c4	Enable stack protectors for all arrays, not just char arrays. rdar://5875909 Patch by Bill Wendling. llvm-svn: 145097	2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen	02845410f9	Fix PR11422. This was a bug in keeping track of the available domains when merging domain values. The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr to the integer domain which is only available in AVX2. Also add an assertion to catch future attempts at emitting AVX2 instructions. llvm-svn: 145096	2011-11-23 04:03:08 +00:00
Rafael Espindola	5d03d46127	Point to libLTO with -L/PATH/ -lLTO so that it is found in the install directory. Patch by Markus Trippelsdorf. llvm-svn: 145095	2011-11-23 03:07:25 +00:00
Chandler Carruth	4a87aa0c31	Fix a crash in block placement due to an inner loop that happened to be reversed in the function's original ordering, and we happened to encounter it while handling an outer unnatural CFG structure. Thanks to the test case reduced from GCC's source by Benjamin Kramer. This may also fix a crasher in gzip that Duncan reduced for me, but I haven't yet gotten to testing that one. llvm-svn: 145094	2011-11-23 03:03:21 +00:00
Kostya Serebryany	8b5c7a56a3	[asan] do not instrument threadlocal globals, this is buggy llvm-svn: 145092	2011-11-23 02:10:54 +00:00
Anshuman Dasgupta	bcf6a37a58	Undo test commit llvm-svn: 145079	2011-11-22 20:05:48 +00:00
Anshuman Dasgupta	9ff0894703	Test commit llvm-svn: 145078	2011-11-22 20:03:30 +00:00
Hal Finkel	6f0ae783fe	add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern llvm-svn: 145065	2011-11-22 16:21:04 +00:00
Craig Topper	83c4592619	More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries. llvm-svn: 145063	2011-11-22 14:27:57 +00:00
Chandler Carruth	ee54feb6f6	Fix a devilish miscompile exposed by block placement. The updateTerminator code didn't correctly handle EH terminators in one very specific case. AnalyzeBranch would find no terminator instruction, and so the fallback in updateTerminator is to assume fallthrough. This is correct, but the destination of the fallthrough was assumed to be the first successor. This is almost always true, but in certain cases the loop transformations will cause the landing pad to be the first successor! Instead of this brittle logic, actually look through the successors for a non-landing-pad successor, and assert if more than one is found. This will hopefully fix some (if not all) of the self host miscompiles with block placement. Thanks to Benjamin Kramer for reporting, Nick Lewycky for an initial stab at a reduction, and Duncan for endless advice on EH (which I know nothing about) as well as reviewing the actual fix. llvm-svn: 145062	2011-11-22 13:13:16 +00:00
Benjamin Kramer	e1effb0da2	Add configure checking for pread(2) and use it to save a syscall when reading files. llvm-svn: 145061	2011-11-22 12:31:53 +00:00
Chandler Carruth	e2530dc889	Fix an obvious omission in the SelectionDAGBuilder where we were dropping weights on the floor for invokes. This was impeding my writing further test cases for invoke when interacting with probabilities and block placement. No test case as there doesn't appear to be a way to test this stuff. =/ Suggestions for a test case of course welcome. I hope to be able to add test cases that indirectly cover this eventually by adding probabilities to the exceptional edge and reordering blocks as a result. llvm-svn: 145060	2011-11-22 11:37:46 +00:00
Benjamin Kramer	f22623b78b	Turn error recovery into an assert. This was put in because in a certain version of DragonFlyBSD stat(2) lied about the size of some files. This was fixed a long time ago so we can remove the workaround. llvm-svn: 145059	2011-11-22 11:37:11 +00:00
Rafael Espindola	c55e1af137	Add triple to the test. llvm-svn: 145057	2011-11-22 06:36:25 +00:00
Rafael Espindola	2021f38281	If a register is both an early clobber and part of a tied use, handle the use before the clobber so that we copy the value if needed. Fixes pr11415. llvm-svn: 145056	2011-11-22 06:27:18 +00:00
Craig Topper	ccb7097509	Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms. llvm-svn: 145055	2011-11-22 01:57:35 +00:00
Craig Topper	f563977795	Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX. llvm-svn: 145053	2011-11-22 00:44:41 +00:00
Sebastian Pop	74e1bc7933	fix typo in comment llvm-svn: 145048	2011-11-21 20:46:55 +00:00
Nick Lewycky	063ae5897c	Fix crasher in GVN due to my recent capture tracking changes. llvm-svn: 145047	2011-11-21 19:42:56 +00:00
Nick Lewycky	aa2a00db35	Add virtual destructor. Whoops! llvm-svn: 145044	2011-11-21 18:32:21 +00:00
Craig Topper	6270d072c5	Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. llvm-svn: 145028	2011-11-21 08:26:50 +00:00
Craig Topper	d12d6f4b1c	Test case for r145026 llvm-svn: 145027	2011-11-21 06:58:09 +00:00
Craig Topper	669199ca94	Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. llvm-svn: 145026	2011-11-21 06:57:39 +00:00
Joe Abbey	96e89f6412	Fixing a comment llvm-svn: 145025	2011-11-21 04:42:21 +00:00
Craig Topper	a065238c6e	Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled. llvm-svn: 145022	2011-11-21 01:12:36 +00:00
Nick Lewycky	6ae03c3378	Less template, more virtual! Refactoring suggested by Chris in code review. llvm-svn: 145014	2011-11-20 19:37:06 +00:00
Nick Lewycky	612d70b19d	Refactor code to use new attribute getters on CallSite for NoCapture and ByVal. Suggested in code review by Eli. That code in InstCombine looks kinda suspicious. llvm-svn: 145013	2011-11-20 19:09:04 +00:00
NAKAMURA Takumi	76dfa03874	test/CodeGen/X86/block-placement.ll: Relax expressions for Win32. llvm-svn: 145011	2011-11-20 12:49:45 +00:00
Chandler Carruth	18dfac385b	The logic for breaking the CFG in the presence of hot successors didn't properly account for the global probability of the edge being taken. This manifested as a very large number of unconditional branches to blocks being merged against the CFG even though they weren't particularly hot within the CFG. The fix is to check whether the edge being merged is both locally hot relative to other successors for the source block, and globally hot compared to other (unmerged) predecessors of the destination block. This introduces a new crasher on GCC single-source, but it's currently behind a flag, and Ben has offered to work on the reduction. =] llvm-svn: 145010	2011-11-20 11:22:06 +00:00
Chandler Carruth	bcb5f39526	Make an obviously const interface actually be marked as const. llvm-svn: 145009	2011-11-20 11:22:03 +00:00
Benjamin Kramer	650c09aa4d	XFAIL this test until I figure out what indvars is doing here (or find someone who does) llvm-svn: 145008	2011-11-20 11:10:03 +00:00
Benjamin Kramer	b5ba2eef2d	SCEV: Actually set overflow flags on add expressions. setFlags doesn't modify its arguments. llvm-svn: 145007	2011-11-20 10:24:36 +00:00
Chandler Carruth	20df3953d3	Add some comments to the latest test case I added here to document what is actually being tested. Also add some FileCheck goodness to much more carefully ensure that the result is the desired result. Before this test would only have failed through an assert failure if the underlying fix were reverted. Also, add some weight metadata and a comment explaining exactly what is going on to a tricky section of the test case. Originally, we were getting very unlucky and trying to form a block chain that isn't actually profitable. I'm working on a fix to avoid forming these unprofitable chains, and that would also have masked any failure from this test case. The easy solution is to add some metadata that makes it really profitable to form the bad chain here. llvm-svn: 145006	2011-11-20 09:30:40 +00:00
Craig Topper	e79761df73	Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine. llvm-svn: 145005	2011-11-20 00:12:05 +00:00
Craig Topper	a3a6583694	Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled. llvm-svn: 145004	2011-11-19 22:34:59 +00:00
Craig Topper	bac86038ac	Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns. llvm-svn: 145003	2011-11-19 21:01:54 +00:00
Craig Topper	3af6ae089f	Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns. llvm-svn: 144999	2011-11-19 17:46:46 +00:00
Chandler Carruth	f3dc9eff16	Move the handling of unanalyzable branches out of the loop-driven chain formation phase and into the initial walk of the basic blocks. We essentially pre-merge all blocks where unanalyzable fallthrough exists, as we won't be able to update the terminators effectively after any reorderings. This is quite a bit more principled as there may be CFGs where the second half of the unanalyzable pair has some analyzable predecessor that gets placed first. Then it may get placed next, implicitly breaking the unanalyzable branch even though we never even looked at the part that isn't analyzable. I've included a test case that triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize some more general ones as I dig into related issues. Also, to make this new scheme work we have to be able to handle branches into the middle of a chain, so add this check. We always fallback on the incoming ordering. Finally, this starts to really underscore a known limitation of the current implementation -- we don't consider broken predecessors when merging successors. This can cause major missed opportunities, and is something I'm planning on looking at next (modulo more bug reports). llvm-svn: 144994	2011-11-19 10:26:02 +00:00
Craig Topper	6d77f4ae14	Test cases for SSSE3/AVX integer horizontal add/sub. llvm-svn: 144990	2011-11-19 09:03:33 +00:00
Craig Topper	f984efbfce	Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors. llvm-svn: 144989	2011-11-19 09:02:40 +00:00
Craig Topper	81390be00f	Collapse X86 PSIGNB/PSIGNW/PSIGND node types. llvm-svn: 144988	2011-11-19 07:33:10 +00:00
Craig Topper	de6b73bb4d	Extend VPBLENDVB and VPSIGN lowering to work for AVX2. llvm-svn: 144987	2011-11-19 07:07:26 +00:00
Craig Topper	75ffc5fbb5	Remove some unnecessary filtering checks from X86 disassembler table build. llvm-svn: 144986	2011-11-19 05:48:20 +00:00
Craig Topper	66e2b5a61e	Remove unused parameters from the AVX maskmov classes. llvm-svn: 144985	2011-11-19 04:49:22 +00:00
Andrew Trick	6b4d578f54	Fix a corner case in updating LoopInfo after fully unrolling an outer loop. The loop tree's inclusive block lists are painful and expensive to update. (I have no idea why they're inclusive). The design was supposed to handle this case but the implementation missed it and my unit tests weren't thorough enough. Fixes PR11335: loop unroll update. llvm-svn: 144970	2011-11-18 03:42:41 +00:00
Nadav Rotem	1ec141d0f9	Add AVX2 vpbroadcast support llvm-svn: 144967	2011-11-18 02:49:55 +00:00
Kostya Serebryany	1cdc6e9567	[asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler llvm-svn: 144962	2011-11-18 01:41:06 +00:00
Chad Rosier	ee93ff736a	Guard call to getRegForValue with isTypeLegal check to avoid unnecessary work/dead code. llvm-svn: 144959	2011-11-18 01:17:34 +00:00
Devang Patel	107e8ec30d	DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange. llvm-svn: 144937	2011-11-17 23:43:15 +00:00
Kostya Serebryany	a6edf4c21f	quick fix: remove GlobalVariable::GlobalVariable mistakenly committed at r144933. For some reason this compiles on linux llvm-svn: 144936	2011-11-17 23:37:53 +00:00
Andrew Trick	949045864d	Fix an overly general check in SimplifyIndvar to handle useless phi cycles. The right way to check for a binary operation is cast<BinaryOperator>. The original check: cast<Instruction> && numOperands() == 2 would match phi "instructions", leading to an infinite loop in extreme corner case: a useless phi with operands [self, constant] that prior optimization passes failed to remove, being used in the loop by another useless phi, in turn being used by an lshr or udiv. Fixes PR11350: runaway iteration assertion. llvm-svn: 144935	2011-11-17 23:36:35 +00:00
Kostya Serebryany	65e2211b95	fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr llvm-svn: 144933	2011-11-17 23:14:59 +00:00
Ted Kremenek	b42cfa0015	Fix bug in RefCountedBase/RefCountedBaseVPTR where the reference count was accidentally copied as part of the copy constructor. This could result in objects getting leaked because their reference count was too high. llvm-svn: 144931	2011-11-17 23:02:14 +00:00
Chad Rosier	0eff3e5c21	Add TODO comment. llvm-svn: 144920	2011-11-17 21:46:13 +00:00
Craig Topper	f41e1d0246	Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments. llvm-svn: 144896	2011-11-17 07:49:38 +00:00
Chad Rosier	15b2498e88	Dead code. llvm-svn: 144888	2011-11-17 07:24:49 +00:00
Chad Rosier	f83ab704e4	When fast iseling a GEP, accumulate the offset rather than emitting a series of ADDs. MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs being: (1) If we can't materialize the large constant then we'll cause fast-isel to bail. (2) Too large of an offset can't be directly encoded in the ADD resulting in a MOV+ADD. Generally not a bad thing because otherwise we would have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix for that. (3) Conversely, too low of a threshold we'll miss opportunities to coalesce ADDs. rdar://10412592 llvm-svn: 144886	2011-11-17 07:15:58 +00:00
Craig Topper	f17b600577	Remove seemingly unnecessary duplicate VROUND definitions. llvm-svn: 144885	2011-11-17 07:04:00 +00:00
Chris Lattner	0e439c9b7c	x86/windows issues fixed. llvm-svn: 144878	2011-11-17 01:42:23 +00:00
Eli Friedman	489c0ff4a4	Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom names for fwrite and fputs. Fixes <rdar://problem/9815881>. llvm-svn: 144876	2011-11-17 01:27:36 +00:00
Daniel Dunbar	52f71220d5	llvm-build: Attempt to work around a CMake Makefile generator bug that doesn't properly quote strings when writing the CMakeFiles/Makefile.cmake output file (which lists the dependencies). This shows up when using CMake + MSYS Makefile generator. llvm-svn: 144873	2011-11-17 01:19:53 +00:00
Chad Rosier	ce619ddfc5	Don't unconditionally set the kill flag. rdar://10456186 llvm-svn: 144872	2011-11-17 01:16:53 +00:00
Eli Friedman	20439a42b0	Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing. llvm-svn: 144867	2011-11-17 00:21:52 +00:00
Daniel Dunbar	586aabc44a	build/make/test: Get rid of unused BUGPOINT_TOPTS variable. llvm-svn: 144864	2011-11-16 23:56:03 +00:00
Eli Friedman	ff1eaa7578	Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393. llvm-svn: 144863	2011-11-16 23:50:22 +00:00
Michael J. Spencer	d27d51fbaf	Object/COFF: Support common symbols. llvm-svn: 144861	2011-11-16 23:36:12 +00:00
Jim Grosbach	f4d2e0d458	Remove obsolete test. The PLD encoding is checked via the .s file now. llvm-svn: 144853	2011-11-16 22:50:38 +00:00
Jim Grosbach	d3f02cbce9	Generalize the fixup info for ARM mode. We don't (yet) have the granularity in the fixups to be specific about which bitranges are affected. That's a future cleanup, but we're not there yet. llvm-svn: 144852	2011-11-16 22:48:37 +00:00
Jim Grosbach	d66cb5ab33	Update test for r144842. llvm-svn: 144851	2011-11-16 22:46:27 +00:00
Akira Hatanaka	b31abde0f3	Lower 64-bit constant pool node. llvm-svn: 144849	2011-11-16 22:44:38 +00:00
Akira Hatanaka	eb42071721	Lower 64-bit block address. llvm-svn: 144847	2011-11-16 22:42:10 +00:00
Jim Grosbach	7ccdb7c0ae	Fix encoding of NOP used for padding in ARM mode .align. llvm-svn: 144842	2011-11-16 22:40:25 +00:00
Akira Hatanaka	7b8547c4d0	Add patterns for 64-bit tglobaladdr, tblockaddress, tjumptable and tconstpool nodes. llvm-svn: 144841	2011-11-16 22:39:56 +00:00
Akira Hatanaka	6d617ceca2	64-bit jump register instruction. llvm-svn: 144840	2011-11-16 22:36:01 +00:00
Evan Cheng	011538dc79	Another missing X86ISD::MOVLPD pattern. rdar://10450317 llvm-svn: 144839	2011-11-16 22:24:44 +00:00
Jim Grosbach	bfe5c5c968	ARM assembly parsing for shifted register operands for MOV instruction. llvm-svn: 144837	2011-11-16 21:50:05 +00:00
Jim Grosbach	01e0439240	Clean up debug printing of ARM shifted operands. llvm-svn: 144836	2011-11-16 21:46:50 +00:00
Chad Rosier	ff40b1e164	Add fast-isel stats to determine who's doing all the work, the target-independent selector or the target-specific selector. llvm-svn: 144833	2011-11-16 21:05:28 +00:00
Chad Rosier	cfd0d10e72	Fix the stats collection for fast-isel. The failed count was only accounting for a single miss and not all predecessor instructions that get selected by the selection DAG instruction selector. This is still not exact (e.g., overstates misses when folded/dead instructions are present), but it is a step in the right direction. llvm-svn: 144832	2011-11-16 21:02:08 +00:00
Chandler Carruth	d1eafe118d	There are already problems with building LLVM under VS2005, and it's quite old now. Update the documentation to reflect this, and direct people to use VS2008 or newer. llvm-svn: 144818	2011-11-16 19:52:13 +00:00
Jim Grosbach	3127ab6d8f	ARM assembly two operand forms for LSR, ASR, LSL, ROR register. llvm-svn: 144814	2011-11-16 19:12:24 +00:00
Jim Grosbach	1a2f9ee3c8	ARM assembly parsing for RRX mnemonic. rdar://9704684 llvm-svn: 144812	2011-11-16 19:05:59 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Evan Cheng	822ddde50d	Disable expensive two-address optimizations at -O0. rdar://10453055 llvm-svn: 144806	2011-11-16 18:44:48 +00:00
Chad Rosier	80979b6ea6	Check to make sure we can select the instruction before trying to put the operands into a register. Otherwise, we may materialize dead code. llvm-svn: 144805	2011-11-16 18:39:44 +00:00
Evan Cheng	624eb2af6f	Disable the assertion again. Looks like fastisel is still generating bad kill markers. llvm-svn: 144804	2011-11-16 18:32:14 +00:00
Jim Grosbach	abcac56869	ARM mode aliases for bitwise instructions w/ register operands. rdar://9704684 llvm-svn: 144803	2011-11-16 18:31:45 +00:00
Bob Wilson	0ca7ce389c	Fix tablegen warning: hasSideEffects is inferred for eh_sjlj_dispatchsetup. llvm-svn: 144798	2011-11-16 17:09:59 +00:00
NAKAMURA Takumi	b345060a85	lib/Target/ARM/CMakeLists.txt: Disable optimization in ARMISelLowering.cpp also on MSC15(aka VS9). Seems miscompiled. llvm-svn: 144794	2011-11-16 09:18:28 +00:00
Evan Cheng	ecb2908bf9	Sink codegen optimization level into MCCodeGenInfo along side relocation model and code model. This eliminates the need to pass OptLevel flag all over the place and makes it possible for any codegen pass to use this information. llvm-svn: 144788	2011-11-16 08:38:26 +00:00
Bob Wilson	cca9aa58ca	Record landing pads with a SmallSetVector to avoid multiple entries. There may be many invokes that share one landing pad, and the previous code would record the landing pad once for each invoke. Besides the wasted effort, a pair of volatile loads gets inserted every time the landing pad is processed. The rest of the code can get optimized away when a landing pad is processed repeatedly, but the volatile loads remain, resulting in code like: LBB35_18: Ltmp483: ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r4, [r7, #-72] ldr r2, [r7, #-68] llvm-svn: 144787	2011-11-16 07:57:21 +00:00
Craig Topper	3ed7d9ee5a	Fix the execution domain on a bunch of SSE/AVX instructions. llvm-svn: 144784	2011-11-16 07:30:46 +00:00
Bob Wilson	643e63c40c	Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602> This same basic code was in the older version of the SjLj exception handling, but it was removed in the recent revisions to that code. It needs to be there. llvm-svn: 144782	2011-11-16 07:12:00 +00:00
Bob Wilson	f6d1728d8f	Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602> The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781	2011-11-16 07:11:57 +00:00
Craig Topper	07d8b5e2c9	Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636. llvm-svn: 144777	2011-11-16 05:02:04 +00:00
Evan Cheng	4ac36c8e26	Revert r144568 now that r144730 has fixed the fast-isel kill marker bug. llvm-svn: 144776	2011-11-16 04:55:01 +00:00
Nick Lewycky	db408f1857	Fix typo in test. llvm-svn: 144774	2011-11-16 03:56:38 +00:00
Nick Lewycky	c7f1e7993c	Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when looking at the size of the pointee. Fixes PR11390! llvm-svn: 144773	2011-11-16 03:49:48 +00:00
Evan Cheng	b8c55a5339	If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges. llvm-svn: 144772	2011-11-16 03:47:42 +00:00
Evan Cheng	59f8156ea0	RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185 llvm-svn: 144771	2011-11-16 03:33:08 +00:00
Evan Cheng	9ddd69a8bc	Process all uses first before defs to accurately capture register liveness. rdar://10449480 llvm-svn: 144770	2011-11-16 03:05:12 +00:00
Eli Friedman	e6270395e3	Fix testcase. llvm-svn: 144769	2011-11-16 03:03:52 +00:00
Eli Friedman	87f92512c3	CONCAT_VECTORS can have more than two operands. PR11389. llvm-svn: 144768	2011-11-16 02:52:39 +00:00
Eli Friedman	d257a464d1	Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer. llvm-svn: 144767	2011-11-16 02:43:15 +00:00
Michael J. Spencer	415e8aa84a	Remove extra ,. llvm-svn: 144759	2011-11-16 01:36:50 +00:00
Kostya Serebryany	6e6b03ec46	AddressSanitizer, first commit (compiler module only) llvm-svn: 144758	2011-11-16 01:35:23 +00:00
Michael J. Spencer	ca48ae0194	Object/Archive: Give Child an operator < for map. llvm-svn: 144757	2011-11-16 01:25:13 +00:00
Michael J. Spencer	ecdd31181d	Support/COFF: Add structs and enums from the standard for image files. llvm-svn: 144756	2011-11-16 01:24:57 +00:00
Michael J. Spencer	53723de5b8	llvm-objdump: Ignore non-objects in archives. llvm-svn: 144755	2011-11-16 01:24:41 +00:00
Kostya Serebryany	db999c01f2	test commit to verify that commit access works (added blank line) llvm-svn: 144748	2011-11-16 01:14:38 +00:00
Owen Anderson	ca2f78a95b	Rename MVT::untyped to MVT::Untyped to match similar nomenclature. llvm-svn: 144747	2011-11-16 01:02:57 +00:00
Andrew Trick	90c7a108ca	Fix SCEV overly optimistic back edge taken count for multi-exit loops. Fixes PR11375: Different results for 'clang++ huh.cpp'... llvm-svn: 144746	2011-11-16 00:52:40 +00:00
Chad Rosier	af13d767a2	Add FIXME comment. llvm-svn: 144743	2011-11-16 00:32:20 +00:00
Jakob Stoklund Olesen	653183fd5c	Enable -widen-vmovs by default. This will widen 32-bit register vmov instructions to 64-bit when possible. The 64-bit vmovd instructions can then be translated to NEON vorr instructions by the execution dependency fix pass. The copies are only widened if they are marked as clobbering the whole D-register. llvm-svn: 144734	2011-11-15 23:53:18 +00:00
Eric Christopher	0abbd0ef5a	Stabilize the output of the dwarf accelerator tables. Fixes a comparison failure during bootstrap with it turned on. llvm-svn: 144731	2011-11-15 23:37:17 +00:00
Chad Rosier	291ce47db7	GEPs with all zero indices are trivially coalesced by fast-isel. For example, %arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0 %arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134 Prior to this commit, the GEP instruction that defines %arrayidx136 thought that %arrayidx135 was a trivial kill. The GEP that defines %arrayidx135 doesn't generate any code and thus %M0 gets folded into the second GEP. Thus, we need to look through GEPs with all zero indices. rdar://10443319 llvm-svn: 144730	2011-11-15 23:34:05 +00:00
Jim Grosbach	e891fe8d6c	ARM assembly parsing for register range syntax for VLD/VST register lists. For example, vld1.f64 {d2-d5}, [r2,:128]! Should be equivalent to: vld1.f64 {d2,d3,d4,d5}, [r2,:128]! It's not documented syntax in the ARM ARM, but it is consistent with what's accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to support. rdar://10451128 llvm-svn: 144727	2011-11-15 23:19:15 +00:00
Devang Patel	8cd896404a	Merge ObjCPropertyDebugInfo.html into SourceLevelDebugging.html llvm-svn: 144724	2011-11-15 22:59:54 +00:00
Jim Grosbach	003cea6011	ARM assembly parsing for data type suffixes on NEON VMOV aliases. llvm-svn: 144722	2011-11-15 22:54:42 +00:00
Nadav Rotem	51f71054b6	Fix MSVC warnings by adding a cast. llvm-svn: 144721	2011-11-15 22:54:21 +00:00
Nadav Rotem	37010002f2	AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code. llvm-svn: 144720	2011-11-15 22:50:37 +00:00
Chris Lattner	dfcdd0c5be	jakob fixed X87 inline asm! llvm-svn: 144719	2011-11-15 22:48:24 +00:00
Chris Lattner	7cdcbe3788	add ImmutableSet/Map dox, patch by Caitlin Sadowski! llvm-svn: 144716	2011-11-15 22:40:14 +00:00
NAKAMURA Takumi	f6b315c081	test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64. llvm-svn: 144714	2011-11-15 22:30:37 +00:00
Jim Grosbach	75fb4abcdc	ARM assembly parsing two operand forms for shift instructions. llvm-svn: 144713	2011-11-15 22:27:54 +00:00
Chris Lattner	212a0867e9	add PTX backend info llvm-svn: 144711	2011-11-15 22:23:46 +00:00
Jim Grosbach	a01033709f	ARM VFP assembly parsing for VADD and VSUB two-operand forms. llvm-svn: 144710	2011-11-15 22:15:10 +00:00
Jim Grosbach	8279c1828f	ARM accept an immediate offset in memory operands w/o the '#'. llvm-svn: 144709	2011-11-15 22:14:41 +00:00
Chris Lattner	92f2183360	some notes. llvm-svn: 144708	2011-11-15 22:13:27 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Jim Grosbach	8d579230c6	ARM enclosing curly braces optional on one-register VLD/VST instruction lists. 'vld1.f32 d4, [r7]' should be parsed as equivalent to 'vld1.f32 {d4}, [r7]' rdar://10450488. llvm-svn: 144701	2011-11-15 21:45:55 +00:00
Akira Hatanaka	b89a58df96	Update section "MIPS Target Improvements" in the llvm 3.0 release notes. llvm-svn: 144699	2011-11-15 21:33:05 +00:00
Jim Grosbach	84f0ba5747	ARM size suffix on VFP single-precision 'vmov' is optional. rdar://10435114 llvm-svn: 144698	2011-11-15 21:18:35 +00:00
Devang Patel	43bde96a4c	Insert modified DBG_VALUE into LiveDbgValueMap. llvm-svn: 144696	2011-11-15 21:03:58 +00:00
Jim Grosbach	a92a5d8548	Fix typo. llvm-svn: 144695	2011-11-15 21:01:30 +00:00
Jim Grosbach	131b45e632	ARM alternate size suffixes for VTRN instructions. rdar://10435076 llvm-svn: 144694	2011-11-15 20:49:46 +00:00
Owen Anderson	05060f0748	Fix a misplaced paren bug. llvm-svn: 144692	2011-11-15 20:30:41 +00:00
Jim Grosbach	5803f6d5a2	ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns. Yet more of rdar://10435076. llvm-svn: 144691	2011-11-15 20:29:42 +00:00
Jim Grosbach	c5b1bc561e	ARM assembly parsing for two-operand form of 'mul' instruction. rdar://10449856. llvm-svn: 144689	2011-11-15 20:14:51 +00:00
Jim Grosbach	72dfd20aba	ARM assembly parsing for two-operand form of 'mul' instruction. Ongoing rdar://10435114. llvm-svn: 144688	2011-11-15 20:02:06 +00:00
Jim Grosbach	68c899c211	Testcase for r144684. llvm-svn: 144685	2011-11-15 19:56:17 +00:00
Jim Grosbach	efa7e95d06	Thumb2 two-operand 'mul' instruction wide encoding parsing. rdar://10449724 llvm-svn: 144684	2011-11-15 19:55:16 +00:00
Owen Anderson	0ac9058f89	Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32. llvm-svn: 144683	2011-11-15 19:55:00 +00:00
Jim Grosbach	6efa7b9852	Thumb2 assembly parsing for mul.w in IT block fix. When the 3rd operand is not a low-register, and the first two operands are the same low register, the parser was incorrectly trying to use the 16-bit instruction encoding. rdar://10449281 llvm-svn: 144679	2011-11-15 19:29:45 +00:00
Benjamin Kramer	b106bcc536	StringRefize and simplify. llvm-svn: 144675	2011-11-15 19:12:09 +00:00
Rafael Espindola	f11e7f1305	We currently use a callback to handle an IL pass deleting a BB that still has a reference to it. Unfortunately, that doesn't work for codegen passes since we don't get notified of MBB's being deleted (the original BB stays). Use that fact to our advantage and after printing a function, check if any of the IL BBs corresponds to a symbol that was not printed. This fixes pr11202. llvm-svn: 144674	2011-11-15 19:08:46 +00:00
Akira Hatanaka	6ee8fc88c7	Fix functions in MipsFrameLowering.cpp and MipsRegisterInfo.cpp. Use 64-bit registers and instructions when ABI is N64. llvm-svn: 144666	2011-11-15 18:53:55 +00:00
Akira Hatanaka	494913270e	Set nomacro before emitting the sequence of instructions that set global pointer register. llvm-svn: 144665	2011-11-15 18:44:44 +00:00
Akira Hatanaka	66a14c0650	Simplify function PassByValArg64. llvm-svn: 144664	2011-11-15 18:42:25 +00:00
Akira Hatanaka	d519d8ca83	Remove function printMipsSymbolRef. llvm-svn: 144663	2011-11-15 18:38:35 +00:00
Benjamin Kramer	be6535b3dc	Remove Value::getNameStr. It has been deprecated for a while and provides no additional value over getName(). llvm-svn: 144657	2011-11-15 18:30:12 +00:00
Benjamin Kramer	184e3ceea0	Missed some users of Value::getNameStr. llvm-svn: 144656	2011-11-15 18:30:06 +00:00
Akira Hatanaka	b7796ae938	Delete files. llvm-svn: 144655	2011-11-15 18:22:48 +00:00
Akira Hatanaka	1c0590c5da	Remove MipsMCSymbolRefExpr. llvm-svn: 144654	2011-11-15 18:20:08 +00:00
Jim Grosbach	2aabaa704a	ARM parsing datatype suffix variants for register-writeback VLD1/VST1 instructions. rdar://10435076 llvm-svn: 144650	2011-11-15 17:49:59 +00:00
Jim Grosbach	0dde349df1	Tidy up. 80 columns. llvm-svn: 144649	2011-11-15 16:46:22 +00:00
Benjamin Kramer	1f97a5a671	Remove all remaining uses of Value::getNameStr(). llvm-svn: 144648	2011-11-15 16:27:03 +00:00
Benjamin Kramer	4c93d15f09	Twinify GraphWriter a little bit. llvm-svn: 144647	2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen	e14ef7e6f8	Check all overlaps when looking for used registers. A function using any RC alias is enough to enable the ExeDepsFix pass. llvm-svn: 144636	2011-11-15 08:20:43 +00:00
Jay Foad	ab9ebd3521	Make use of MachinePointerInfo::getFixedStack. llvm-svn: 144635	2011-11-15 07:51:13 +00:00
Jay Foad	70679df664	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144634	2011-11-15 07:50:46 +00:00
Jay Foad	e5cbd3c3fb	Fix typo in comment. llvm-svn: 144633	2011-11-15 07:50:05 +00:00
Jay Foad	465101bb0e	Make use of MachinePointerInfo::getFixedStack. This removes all mention of PseudoSourceValue from lib/Target/. llvm-svn: 144632	2011-11-15 07:34:52 +00:00
Jay Foad	0745e645e0	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144631	2011-11-15 07:24:32 +00:00
Jakob Stoklund Olesen	4949b9a283	Revert r144611 and r144613. These tests are actually correct, clang was miscompiling ExeDepsFix::processUses. Evan fixed the miscompilation in r144628. llvm-svn: 144630	2011-11-15 07:13:03 +00:00
Craig Topper	649d1c5eec	Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled. llvm-svn: 144629	2011-11-15 06:39:01 +00:00
Evan Cheng	7098c4e5f4	Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior. llvm-svn: 144628	2011-11-15 06:26:51 +00:00
Chandler Carruth	9b548a7fcf	Rather than trying to use the loop block sequence or the function block sequence when recovering from unanalyzable control flow constructs, always use the function sequence. I'm not sure why I ever went down the path of trying to use the loop sequence, it is fundamentally not the correct sequence to use. We're trying to preserve the incoming layout in the cases of unreasonable control flow, and that is only encoded at the function level. We already have a filter to select exactly the sub-set of blocks within the function that we're trying to form into a chain. The resulting code layout is also significantly better because of this. In several places we were ending up with completely unreasonable control flow constructs due to the ordering chosen by the loop structure for its internal storage. This change removes a completely wasteful vector of basic blocks, saving memory allocation in the common case even though it costs us CPU in the fairly rare case of unnatural loops. Finally, it fixes the latest crasher reduced out of GCC's single source. Thanks again to Benjamin Kramer for the reduction, my bugpoint skills failed at it. llvm-svn: 144627	2011-11-15 06:26:43 +00:00
Craig Topper	05baa85f58	Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622	2011-11-15 05:55:35 +00:00
NAKAMURA Takumi	f01faac473	include/llvm/Support/Compiler.h: Invalidate LLVM_ATTRIBUTE_WEAK on cygming for now. It triggers generating insane executables with both binutils-2.19.1(msysgit) and 2.22.51.20111013(cygwin). llvm-svn: 144621	2011-11-15 05:24:26 +00:00
Jakob Stoklund Olesen	9c0de9bb6b	Really fix test. llvm-svn: 144613	2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen	14b66375a9	Allow for dependency-breaking instructions before cvt*. This should unbreak clang-x86_64-darwin10-RA, but I can't actually reproduce the failure. llvm-svn: 144611	2011-11-15 02:29:48 +00:00
Evan Cheng	7ca4b6eb5c	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Jim Grosbach	29cdcda80d	ARM parsing datatype suffix variants for fixed-writeback VLD1/VST1 instructions. rdar://10435076 llvm-svn: 144606	2011-11-15 01:46:57 +00:00
Nick Lewycky	6804d27048	Move WEAK marking to the declaration. llvm-svn: 144603	2011-11-15 01:23:22 +00:00
Jakob Stoklund Olesen	f8ad336bc4	Break false dependencies before partial register updates. Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602	2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen	543bef6ead	Track register ages more accurately. Keep track of the last instruction to define each register individually instead of per DomainValue. This lets us track more accurately when a register was last written. Also track register ages across basic blocks. When entering a new basic block, use the least stale predecessor def as a worst case estimate for register age. The register age is used to arbitrate between conflicting domains. The most recently defined register wins. llvm-svn: 144601	2011-11-15 01:15:25 +00:00
Devang Patel	aa284c564f	Add ObjCPropertyDebugInfo.html llvm-svn: 144600	2011-11-15 01:14:37 +00:00
Devang Patel	41eac807ee	Document debug info support for objective-c properties. llvm-svn: 144599	2011-11-15 01:11:58 +00:00
Jim Grosbach	7b03fbd25c	Tidy up. Formatting. llvm-svn: 144598	2011-11-15 01:05:12 +00:00
Nick Lewycky	b2489b7484	Fix linking for some users who already have tsan enabled code and are trying to link it against llvm code, by making our definitions weak. "Some users." llvm-svn: 144596	2011-11-15 00:14:04 +00:00
Jim Grosbach	a498af2b1d	ARM parsing datatype suffix variants for non-writeback VST1 instructions. rdar://10435076 llvm-svn: 144593	2011-11-14 23:43:46 +00:00
Jim Grosbach	72838a0345	ARM parsing datatype suffix variants for non-writeback VLD1 instructions. rdar://10435076 llvm-svn: 144592	2011-11-14 23:32:59 +00:00
Jim Grosbach	750de7a399	Add explanatory comment. llvm-svn: 144589	2011-11-14 23:21:09 +00:00
Jim Grosbach	9c2d9d597b	Split out the plain '.{8\|16\|32\|64}' suffix handling. Make it easier to deal with aliases for instructions that do require a suffix but accept more specific variants of the same size. llvm-svn: 144588	2011-11-14 23:20:14 +00:00
Jim Grosbach	3d6c0e0bb2	ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions. rdar://10435076 llvm-svn: 144587	2011-11-14 23:11:19 +00:00
Chad Rosier	057b6d3476	Supporting inline memmove isn't going to be worthwhile. The only way to avoid violating a dependency is to emit all loads prior to stores. This would likely cause a great deal of spillage offsetting any potential gains. llvm-svn: 144585	2011-11-14 23:04:09 +00:00
Jim Grosbach	3e2c6f380c	ARM VLDR/VSTR instructions don't need a size suffix. Canonicalize on the non-suffixed form, but continue to accept assembly that has any correctly sized type suffix. llvm-svn: 144583	2011-11-14 23:03:21 +00:00
Nick Lewycky	7013a19e8a	Refactor capture tracking (which already had a couple flags for whether returns and stores capture) to permit the caller to see each capture point and decide whether to continue looking. Use this inside memdep to do an analysis that basicaa won't do. This lets us solve another devirtualization case, fixing PR8908! llvm-svn: 144580	2011-11-14 22:49:42 +00:00
Chad Rosier	4e88fbebde	Add newline to end of file. Thanks, Eli. llvm-svn: 144579	2011-11-14 22:48:33 +00:00
Chad Rosier	ab7223e99a	Add support for inlining small memcpys. rdar://10412592 llvm-svn: 144578	2011-11-14 22:46:17 +00:00
Chad Rosier	45110fdf8d	Fix a performance regression from r144565. Positive offsets were being lowered into registers, rather than encoded directly in the load/store. llvm-svn: 144576	2011-11-14 22:34:48 +00:00
Jim Grosbach	7996b15724	ARM assembly parsing type suffix options for VLDR/VSTR. rdar://10435076 llvm-svn: 144575	2011-11-14 22:28:39 +00:00
Nick Lewycky	e5dc7550a5	Fix Windows build, don't try to #include <pthread.h> when we know it's not available. llvm-svn: 144574	2011-11-14 22:10:23 +00:00
Evan Cheng	f2fc508d4d	Avoid dereferencing off the beginning of lists. llvm-svn: 144569	2011-11-14 21:11:15 +00:00
Evan Cheng	28ffb7e444	At -O0, multiple uses of a virtual register in the same BB are being marked "kill". This looks like a bug upstream. Since that's going to take some time to understand, loosen the assertion and disable the optimization when multiple kills are seen. llvm-svn: 144568	2011-11-14 21:02:09 +00:00
Nick Lewycky	fe856110aa	Add support for tsan annotations (thread sanitizer, a valgrind-based tool). These annotations are disabled entirely when either ENABLE_THREADS is off, or building a release build. When enabled, they add calls to functions with no statements to ManagedStatic's getters. Use these annotations to inform tsan that the race used inside ManagedStatic initialization is actually benign. Thanks to Kostya Serebryany for helping write this patch! llvm-svn: 144567	2011-11-14 20:50:16 +00:00
Evan Cheng	fb13d32b3f	Add a missing pattern for X86ISD::MOVLPD. rdar://10436044 llvm-svn: 144566	2011-11-14 20:35:52 +00:00
Chad Rosier	adfd200bcb	Add support for Thumb load/stores with negative offsets. rdar://10412592 llvm-svn: 144565	2011-11-14 20:22:27 +00:00
Benjamin Kramer	319904cc7e	Unbreak Release builds. llvm-svn: 144560	2011-11-14 19:51:48 +00:00
Evan Cheng	30f44ad785	Teach two-address pass to re-schedule two-address instructions (or the kill instructions of the two-address operands) in order to avoid inserting copies. This fixes the few regressions introduced when the two-address hack was disabled (without regressing the improvements). rdar://10422688 llvm-svn: 144559	2011-11-14 19:48:55 +00:00
Pete Cooper	890e02e854	Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered. Constant idx case is still done in tablegen but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557	2011-11-14 19:38:42 +00:00
Benjamin Kramer	42d098e1b4	Fold ConstantVector::isAllOnesValue into Constant::isAllOnesValue and simplify it. llvm-svn: 144555	2011-11-14 19:12:20 +00:00
Akira Hatanaka	f93b3f46f8	32-to-64-bit extended load. llvm-svn: 144554	2011-11-14 19:06:14 +00:00
Akira Hatanaka	0b8bc00424	AnalyzeCallOperands function for N32/64. N32/64 places all variable arguments in integer registers (or on stack), regardless of their types, but follows calling convention of non-vaarg function when it handles fixed arguments. llvm-svn: 144553	2011-11-14 19:02:54 +00:00
Akira Hatanaka	52359363f2	Modify LowerFormalArguments to correctly handle vaarg arguments for Mips64. llvm-svn: 144552	2011-11-14 19:01:09 +00:00
Justin Holewinski	33a519021c	PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying on custom implementations. llvm-svn: 144551	2011-11-14 18:58:20 +00:00
Wesley Peck	1c29a83acc	Add release notes for the MicroBlaze backend. llvm-svn: 144550	2011-11-14 18:56:41 +00:00
Akira Hatanaka	d673cfe027	Remove variable that keeps the size of area used to save byval or variable argument registers on the callee's stack frame, along with functions that set and get it. It is not necessary to add the size of this area when computing stack size in emitPrologue, since it has already been accounted for in PEI::calculateFrameObjectOffsets. llvm-svn: 144549	2011-11-14 18:56:20 +00:00
Jakob Stoklund Olesen	7e6004a3c1	Fix early-clobber handling in shrinkToUses. I broke this in r144515, it affected most ARM testers. <rdar://problem/10441389> llvm-svn: 144547	2011-11-14 18:45:38 +00:00
Bob Wilson	8d1c7dbdff	Disable generation of compact unwind encodings. <rdar://problem/10441578> This still seems to be causing some failures. It needs more testing before it gets enabled again. llvm-svn: 144543	2011-11-14 18:21:07 +00:00
Jakob Stoklund Olesen	7e07b388ac	Delete stale comment. llvm-svn: 144542	2011-11-14 18:03:05 +00:00
Jim Grosbach	ee201faeac	Tidy up. 80 column. llvm-svn: 144538	2011-11-14 17:52:47 +00:00
Benjamin Kramer	0ffbcc959d	Make headers standalone. llvm-svn: 144537	2011-11-14 17:45:03 +00:00
Benjamin Kramer	d00e94e882	Make headers standalone, move a virtual method out of line. llvm-svn: 144536	2011-11-14 17:22:45 +00:00
Daniel Dunbar	a5772b9278	build/Make: Switch over to using llvm-config-2 for dependencies one more (hopefully last) time, now that it also builds as a build tool. llvm-svn: 144535	2011-11-14 17:17:45 +00:00
Chandler Carruth	fd9b4d9813	It helps to deallocate memory as well as allocate it. =] This actually cleans up all the chains allocated during the processing of each function so that for very large inputs we don't just grow memory usage without bound. llvm-svn: 144533	2011-11-14 10:57:23 +00:00
Chandler Carruth	0a31d149ea	Remove an over-eager assert that was firing on one of the ARM regression tests when I forcibly enabled block placement. It is apparently possible for an unanalyzable block to fallthrough to a non-loop block. I don't actually believe this is correct, I believe that 'canFallThrough' is returning true needlessly for the code construct, and I've left a bit of a FIXME on the verification code to try to track down why this is coming up. Anyways, removing the assert doesn't degrade the correctness of the algorithm. llvm-svn: 144532	2011-11-14 10:55:53 +00:00
Chandler Carruth	0af6a0bb69	Begin chipping away at one of the biggest quadratic-ish behaviors in this pass. We're leaving already merged blocks on the worklist, and scanning them again and again only to determine each time through that indeed they aren't viable. We can instead remove them once we're going to have to scan the worklist. This is the easy way to implement removing them. If this remains on the profile (as I somewhat suspect it will), we can get a lot more clever here, as the worklist's order is essentially irrelevant. We can use swapping and fold the two loops to reduce overhead even when there are many blocks on the worklist but only a few of them are removed. llvm-svn: 144531	2011-11-14 09:46:33 +00:00
Chandler Carruth	84cd44c750	Under the hood, MBPI is doing a linear scan of every successor every time it is queried to compute the probability of a single successor. This makes computing the probability of every successor of a block in sequence... really really slow. ;] This switches to a linear walk of the successors rather than a quadratic one. One of several quadratic behaviors slowing this pass down. I'm not really thrilled with moving the sum code into the public interface of MBPI, but I don't (at the moment) have ideas for a better interface. The direction I'm thinking in for a better interface is to have MBPI actually retain much more state and make all of these queries cheap. That's a lot of work, and would require invasive changes. Until then, this seems like the least bad (ie, least quadratic) solution. Suggestions welcome. llvm-svn: 144530	2011-11-14 09:12:57 +00:00
Tobias Grosser	8bee91ffc6	Add clang_complete to release notes llvm-svn: 144529	2011-11-14 09:09:26 +00:00
Tobias Grosser	cfa35956c3	Add Polly to release notes llvm-svn: 144528	2011-11-14 09:09:23 +00:00
Chandler Carruth	a9e71faa0f	Reuse the logic in getEdgeProbability within getHotSucc in order to correctly handle blocks whose successor weights sum to more than UINT32_MAX. This is slightly less efficient, but the entire thing is already linear on the number of successors. Calling it within any hot routine is a mistake, and indeed no one is calling it. It also simplifies the code. llvm-svn: 144527	2011-11-14 08:55:59 +00:00
Chandler Carruth	ed5aa547bc	Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on the sum of the edge weights not overflowing uint32, and crashed when they did. This is generally safe as BranchProbabilityInfo tries to provide this guarantee. However, the CFG can get modified during codegen in a way that grows the sum of the edge weights. This doesn't seem unreasonable (imagine just adding more blocks all with the default weight of 16), but it is hard to come up with a case that actually triggers 32-bit overflow. Fortunately, the single-source GCC build is good at this. The solution isn't very pretty, but it's no worse than the previous code. We're already summing all of the edge weights on each query, we can sum them, check for an overflow, compute a scale, and sum them again. I've included a greatly reduced test case out of the GCC source that triggers it. It's a pretty lame test, as it clearly is just barely triggering the overflow. I'd like to have something that is much more definitive, but I don't understand the fundamental pattern that triggers an explosion in the edge weight sums. The buggy code is duplicated within this file. I'll collapse them into a single implementation in a subsequent commit. llvm-svn: 144526	2011-11-14 08:50:16 +00:00
Craig Topper	182b00a2e0	Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions. llvm-svn: 144525	2011-11-14 08:07:55 +00:00
Chandler Carruth	2432d81ee4	Add a cautionary note to this API. It was not at all obvious to me how expensive the most useful interface to this analysis is. Fun story -- it's also not correct. That's getting fixed in another patch. llvm-svn: 144523	2011-11-14 06:51:49 +00:00
Craig Topper	a331515c82	Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway. llvm-svn: 144522	2011-11-14 06:46:21 +00:00
Chad Rosier	2a1df883d0	Add support for ARM halfword load/stores and signed byte loads with negative offsets. rdar://10412592 llvm-svn: 144518	2011-11-14 04:09:28 +00:00
Jakob Stoklund Olesen	d7bcf43dc2	Use getVNInfoBefore() when it makes sense. llvm-svn: 144517	2011-11-14 01:39:36 +00:00
Chandler Carruth	1071cfa4ae	Teach machine block placement to cope with unnatural loops. These don't get loop info structures associated with them, and so we need some way to make forward progress selecting and placing basic blocks. The technique used here is pretty brutal -- it just scans the list of blocks looking for the first unplaced candidate. It keeps placing blocks like this until the CFG becomes tractable. The cost is somewhat unfortunate, it requires allocating a vector of all basic block pointers eagerly. I have some ideas about how to simplify and optimize this, but I'm trying to get the logic correct first. Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly there are other bugs that GCC is tickling that I'm reducing and working on now. llvm-svn: 144516	2011-11-14 00:00:35 +00:00
Jakob Stoklund Olesen	697979028f	Use kill slots instead of the previous slot in shrinkToUses. It's more natural to use the actual end points. llvm-svn: 144515	2011-11-13 23:53:25 +00:00
Chandler Carruth	c4a2cb34bb	Cleanup some 80-columns violations and poor formatting. These snuck by when I was reading through the code for style. llvm-svn: 144513	2011-11-13 22:50:09 +00:00
Jakob Stoklund Olesen	d8f2405e73	Terminate all dead defs at the dead slot instead of the 'next' slot. This makes no difference for normal defs, but early clobber dead defs now look like: [Slot_EarlyClobber; Slot_Dead) instead of: [Slot_EarlyClobber; Slot_Register). Live ranges for normal dead defs look like: [Slot_Register; Slot_Dead) as before. llvm-svn: 144512	2011-11-13 22:42:13 +00:00
Craig Topper	424ca7bbf5	Fix comment for LegalizeTypeAction enum. llvm-svn: 144511	2011-11-13 22:11:24 +00:00
Jakob Stoklund Olesen	ce7cc08f3a	Simplify early clobber slots a bit. llvm-svn: 144507	2011-11-13 22:05:42 +00:00
Chandler Carruth	8e1d906734	Enhance the assertion mechanisms in place to make it easier to catch when we fail to place all the blocks of a loop. Currently this is happening for unnatural loops, and this logic helps more immediately point to the problem. llvm-svn: 144504	2011-11-13 21:39:51 +00:00
Jakob Stoklund Olesen	90b5e565b6	Rename SlotIndexes to match how they are used. The old naming scheme (load/use/def/store) can be traced back to an old linear scan article, but the names don't match how slots are actually used. The load and store slots are not needed after the deferred spill code insertion framework was deleted. The use and def slots don't make any sense because we are using half-open intervals as is customary in C code, but the names suggest closed intervals. In reality, these slots were used to distinguish early-clobber defs from normal defs. The new naming scheme also has 4 slots, but the names match how the slots are really used. This is a purely mechanical renaming, but some of the code makes a lot more sense now. llvm-svn: 144503	2011-11-13 20:45:27 +00:00
Craig Topper	b8bcb473e2	Add BLSI, BLSMSK, and BLSR to getTargetNodeName. llvm-svn: 144502	2011-11-13 17:31:07 +00:00
Chandler Carruth	0bb42c0f86	Teach MBP to force-merge layout successors for blocks with unanalyzable branches that also may involve fallthrough. In the case of blocks with no fallthrough, we can still re-order the blocks profitably. For example instruction decoding will in some cases continue past an indirect jump, making laying out its most likely successor there profitable. Note, no test case. I don't know how to write a test case that exercises this logic, but it matches the described desired semantics in discussions with Jakob and others. If anyone has a nice example of IR that will trigger this, that would be lovely. Also note, there are still assertion failures in real world code with this. I'm digging into those next, now that I know this isn't the cause. llvm-svn: 144499	2011-11-13 12:17:28 +00:00
Chandler Carruth	f9213fe721	Hoist another gross nested loop into a helper method. llvm-svn: 144498	2011-11-13 11:42:26 +00:00
Chandler Carruth	eb4ec3aea5	Add a missing doxygen comment for a helper method. llvm-svn: 144497	2011-11-13 11:34:55 +00:00
Chandler Carruth	b336172f90	Hoist a nested loop into its own method. llvm-svn: 144496	2011-11-13 11:34:53 +00:00
Chandler Carruth	8d15078927	Rewrite #3 of machine block placement. This is based somewhat on the second algorithm, but only loosely. It is more heavily based on the last discussion I had with Andy. It continues to walk from the inner-most loop outward, but there is a key difference. With this algorithm we ensure that as we visit each loop, the entire loop is merged into a single chain. At the end, the entire function is treated as a "loop", and merged into a single chain. This chain forms the desired sequence of blocks within the function. Switching to a single algorithm removes my biggest problem with the previous approaches -- they had different behavior depending on which system triggered the layout. Now there is exactly one algorithm and one basis for the decision making. The other key difference is how the chain is formed. This is based heavily on the idea Andy mentioned of keeping a worklist of blocks that are viable layout successors based on the CFG. Having this set allows us to consistently select the best layout successor for each block. It is expensive though. The code here remains very rough. There is a lot that needs to be done to clean up the code, and to make the runtime cost of this pass much lower. Very much WIP, but this was a giant chunk of code and I'd rather folks see it sooner than later. Everything remains behind a flag of course. I've added a couple of tests to exercise the issues that this iteration was motivated by: loop structure preservation. I've also fixed one test that was exhibiting the broken behavior of the previous version. llvm-svn: 144495	2011-11-13 11:20:44 +00:00
Chad Rosier	1198d894d0	The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall. llvm-svn: 144494	2011-11-13 09:44:21 +00:00
Chad Rosier	a476e391f1	Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures. llvm-svn: 144492	2011-11-13 05:14:43 +00:00
Chad Rosier	5196efdf36	Fix comments. llvm-svn: 144490	2011-11-13 04:25:02 +00:00
Chad Rosier	c8cfd3a8fb	Add support for emitting both signed- and zero-extend loads. Fix SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8 offsets (addressing mode 3). This enables a load followed by an integer extend to be folded into a single load. For example: ldrb r1, [r0]; uxtb r2, r1; mov r3, r2 => ldrb r1, [r0]; mov r3, r1 llvm-svn: 144488	2011-11-13 02:23:59 +00:00
NAKAMURA Takumi	4784df7161	Prune more RALinScan. RALinScan was also here! llvm-svn: 144487	2011-11-13 01:33:10 +00:00
Jakob Stoklund Olesen	c601d8c762	More dead code elimination in VirtRegMap. This thing is looking a lot like a virtual register map now. llvm-svn: 144486	2011-11-13 01:23:34 +00:00
Jakob Stoklund Olesen	28df7ef8c9	Stop tracking spill slot uses in VirtRegMap. Nobody cared, StackSlotColoring scans the instructions to find used stack slots. llvm-svn: 144485	2011-11-13 01:23:30 +00:00
Jakob Stoklund Olesen	92255f27f1	Remove dead code and data from VirtRegMap. Most of this stuff was supporting the old deferred spill code insertion mechanism. Modern spillers just edit machine code in place. llvm-svn: 144484	2011-11-13 01:02:04 +00:00
Jakob Stoklund Olesen	38b3f312ca	Stop tracking unused registers in VirtRegMap. The information was only used by the register allocator in StackSlotColoring. llvm-svn: 144482	2011-11-13 00:39:45 +00:00
Jakob Stoklund Olesen	6ddb767fb5	Remove the -color-ss-with-regs option. It was off by default. The new register allocators don't have the problems that made it necessary to reallocate registers during stack slot coloring. llvm-svn: 144481	2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen	5343da6497	Delete VirtRegRewriter. And there was much rejoicing. llvm-svn: 144480	2011-11-13 00:16:01 +00:00


78486 Commits