llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	51280d565b	Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created. llvm-svn: 145153	2011-11-26 22:55:48 +00:00
Wesley Peck	69d5040485	Rename a couple of options and fix some simple typos. llvm-svn: 145152	2011-11-26 21:50:38 +00:00
Craig Topper	7704bd7ac3	Collapse X86ISD node types for PUNPCKH, PUNPCKL, UNPCKLP, and UNPCKHP to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type. llvm-svn: 145148	2011-11-26 20:47:44 +00:00
Eli Friedman	a84ad7d0d0	Fix APFloat::convert so that it handles narrowing conversions correctly; it was returning incorrect values in rare cases, and incorrectly marking exact conversions as inexact in some more common cases. Fixes PR11406, and a missed optimization in test/CodeGen/X86/fp-stack-O0.ll. llvm-svn: 145141	2011-11-26 03:38:02 +00:00
Bruno Cardoso Lopes	0f9a1f5e6c	This patch contains support for encoding FMA4 instructions and tablegen patterns for scalar FMA4 operations and intrinsic. Also add tests for vfmaddsd. Patch by Jan Sjodin llvm-svn: 145133	2011-11-25 19:33:42 +00:00
NAKAMURA Takumi	989eaf6e3f	ARMLoadStoreOptimizer.cpp: Fix MSVC(Debug) build. llvm-svn: 145129	2011-11-25 09:19:57 +00:00
Craig Topper	d65a444478	Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64. llvm-svn: 145126	2011-11-24 22:57:10 +00:00
Craig Topper	d26466748b	Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish. llvm-svn: 145125	2011-11-24 22:20:08 +00:00
Benjamin Kramer	8a2d143672	Devirtualize Pass::getPassID, overriding it isn't useful and it gets called a lot. While at it pull the trivial ctor in line. llvm-svn: 145124	2011-11-24 21:14:11 +00:00
Benjamin Kramer	6709e05012	Make ConstantRange::truncate a bit more efficient. llvm-svn: 145122	2011-11-24 17:24:33 +00:00
Benjamin Kramer	651db37352	X86: alias cqo to cqto. llvm-svn: 145121	2011-11-24 12:02:46 +00:00
Chandler Carruth	7adee1a01a	Fix a silly use-after-free issue. A much earlier version of this code needed lots of fanciness around retaining a reference to a Chain's slot in the BlockToChain map, but that's all gone now. We can just go directly to allocating the new chain (which will update the mapping for us) and using it. Somewhat gross mechanically generated test case replicates the issue Duncan spotted when actually testing this out. llvm-svn: 145120	2011-11-24 11:23:15 +00:00
Chandler Carruth	d394bafd2d	When adding blocks to the list of those which no longer have any CFG conflicts, we should only be adding the first block of the chain to the list, lest we try to merge into the middle of that chain. Most of the places we were doing this we already happened to be looking at the first block, but there is no reason to assume that, and in some cases it was clearly wrong. I've added a couple of tests here. One already worked, but I like having an explicit test for it. The other is reduced from a test case Duncan reduced for me and used to crash. Now it is handled correctly. llvm-svn: 145119	2011-11-24 08:46:04 +00:00
Akira Hatanaka	049e9e4d22	This patch makes the following changes necessary for MIPS' direct code emission. - lower unaligned loads/stores. - encode the size operand of instructions INS and EXT. - emit relocation information needed for JAL (jump-and-link). llvm-svn: 145113	2011-11-23 22:19:28 +00:00
Akira Hatanaka	f5ddf13f79	This patch addresses gp relative fixups/relocations for jump tables. llvm-svn: 145112	2011-11-23 22:18:04 +00:00
Richard Smith	4f9a8081c3	Correctly byte-swap APInts with bit-widths greater than 64. llvm-svn: 145111	2011-11-23 21:33:37 +00:00
Benjamin Kramer	6e013bf96c	Validate the return type when checking if a function is malloc. Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful. llvm-svn: 145106	2011-11-23 17:58:47 +00:00
Duncan Sands	81a2af12d6	Fix a crash in which a multiplication was being reported as being both negative and positive: positive, because it could be directly computed to be positive; negative, because the nsw flags means it is either negative or undefined (the multiplication always overflowed). llvm-svn: 145104	2011-11-23 16:26:47 +00:00
Benjamin Kramer	ebcb451874	X86: Use btq for bit tests if the immediate can't be encoded in 32 bits. Before: movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00] testq %rax, %rdi ## encoding: [0x48,0x85,0xf8] jne LBB0_2 ## encoding: [0x75,A] After: btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20] jb LBB0_2 ## encoding: [0x72,A] btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off saving one register and a giant movabsq. llvm-svn: 145103	2011-11-23 13:54:17 +00:00
Chandler Carruth	99fe42fbd9	Relax an invariant that block placement was trying to assert a bit further. This invariant just wasn't going to work in the face of unanalyzable branches; we need to be resilient to the phenomenon of chains poking into a loop and poking out of a loop. In fact, we already were, we just needed to not assert on it. This was found during a bootstrap with block placement turned on. llvm-svn: 145100	2011-11-23 10:35:36 +00:00
Elena Demikhovsky	779ba6d7b7	I added several lines to the X86 code generator that allow choosing VSHUFPS/VSHUFPD instructions while lowering a VECTOR_SHUFFLE node. I also check for a commuted VSHUFP mask. The patch was reviewed by Bruno. llvm-svn: 145099	2011-11-23 10:23:16 +00:00
Chandler Carruth	8c68f1f3c8	Handle the case of a no-return invoke correctly. It actually still has successors, they just are all landing pad successors. We handle this the same way as no successors. Comments attached for the next person to wade through here and another lovely test case courtesy of Benjamin Kramer's bugpoint reduction. llvm-svn: 145098	2011-11-23 08:23:54 +00:00
Bob Wilson	ebb44646c4	Enable stack protectors for all arrays, not just char arrays. rdar://5875909 Patch by Bill Wendling. llvm-svn: 145097	2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen	02845410f9	Fix PR11422. This was a bug in keeping track of the available domains when merging domain values. The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr to the integer domain which is only available in AVX2. Also add an assertion to catch future attempts at emitting AVX2 instructions. llvm-svn: 145096	2011-11-23 04:03:08 +00:00
Chandler Carruth	4a87aa0c31	Fix a crash in block placement due to an inner loop that happened to be reversed in the function's original ordering, and we happened to encounter it while handling an outer unnatural CFG structure. Thanks to the test case reduced from GCC's source by Benjamin Kramer. This may also fix a crasher in gzip that Duncan reduced for me, but I haven't yet gotten to testing that one. llvm-svn: 145094	2011-11-23 03:03:21 +00:00
Kostya Serebryany	8b5c7a56a3	[asan] do not instrument threadlocal globals, this is buggy llvm-svn: 145092	2011-11-23 02:10:54 +00:00
Hal Finkel	6f0ae783fe	add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern llvm-svn: 145065	2011-11-22 16:21:04 +00:00
Craig Topper	83c4592619	More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries. llvm-svn: 145063	2011-11-22 14:27:57 +00:00
Chandler Carruth	ee54feb6f6	Fix a devilish miscompile exposed by block placement. The updateTerminator code didn't correctly handle EH terminators in one very specific case. AnalyzeBranch would find no terminator instruction, and so the fallback in updateTerminator is to assume fallthrough. This is correct, but the destination of the fallthrough was assumed to be the first successor. This is almost always true, but in certain cases the loop transformations will cause the landing pad to be the first successor! Instead of this brittle logic, actually look through the successors for a non-landing-pad successor, and assert if more than one is found. This will hopefully fix some (if not all) of the self host miscompiles with block placement. Thanks to Benjamin Kramer for reporting, Nick Lewycky for an initial stab at a reduction, and Duncan for endless advice on EH (which I know nothing about) as well as reviewing the actual fix. llvm-svn: 145062	2011-11-22 13:13:16 +00:00
Benjamin Kramer	e1effb0da2	Add configure checking for pread(2) and use it to save a syscall when reading files. llvm-svn: 145061	2011-11-22 12:31:53 +00:00
Chandler Carruth	e2530dc889	Fix an obvious omission in the SelectionDAGBuilder where we were dropping weights on the floor for invokes. This was impeding my writing further test cases for invoke when interacting with probabilities and block placement. No test case as there doesn't appear to be a way to test this stuff. =/ Suggestions for a test case of course welcome. I hope to be able to add test cases that indirectly cover this eventually by adding probabilities to the exceptional edge and reordering blocks as a result. llvm-svn: 145060	2011-11-22 11:37:46 +00:00
Benjamin Kramer	f22623b78b	Turn error recovery into an assert. This was put in because in a certain version of DragonFlyBSD stat(2) lied about the size of some files. This was fixed a long time ago so we can remove the workaround. llvm-svn: 145059	2011-11-22 11:37:11 +00:00
Rafael Espindola	2021f38281	If a register is both an early clobber and part of a tied use, handle the use before the clobber so that we copy the value if needed. Fixes pr11415. llvm-svn: 145056	2011-11-22 06:27:18 +00:00
Craig Topper	ccb7097509	Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms. llvm-svn: 145055	2011-11-22 01:57:35 +00:00
Craig Topper	f563977795	Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX. llvm-svn: 145053	2011-11-22 00:44:41 +00:00
Nick Lewycky	063ae5897c	Fix crasher in GVN due to my recent capture tracking changes. llvm-svn: 145047	2011-11-21 19:42:56 +00:00
Nick Lewycky	aa2a00db35	Add virtual destructor. Whoops! llvm-svn: 145044	2011-11-21 18:32:21 +00:00
Craig Topper	6270d072c5	Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. llvm-svn: 145028	2011-11-21 08:26:50 +00:00
Craig Topper	669199ca94	Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. llvm-svn: 145026	2011-11-21 06:57:39 +00:00
Joe Abbey	96e89f6412	Fixing a comment llvm-svn: 145025	2011-11-21 04:42:21 +00:00
Craig Topper	a065238c6e	Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled. llvm-svn: 145022	2011-11-21 01:12:36 +00:00
Nick Lewycky	6ae03c3378	Less template, more virtual! Refactoring suggested by Chris in code review. llvm-svn: 145014	2011-11-20 19:37:06 +00:00
Nick Lewycky	612d70b19d	Refactor code to use new attribute getters on CallSite for NoCapture and ByVal. Suggested in code review by Eli. That code in InstCombine looks kinda suspicious. llvm-svn: 145013	2011-11-20 19:09:04 +00:00
Chandler Carruth	18dfac385b	The logic for breaking the CFG in the presence of hot successors didn't properly account for the global probability of the edge being taken. This manifested as a very large number of unconditional branches to blocks being merged against the CFG even though they weren't particularly hot within the CFG. The fix is to check whether the edge being merged is both locally hot relative to other successors for the source block, and globally hot compared to other (unmerged) predecessors of the destination block. This introduces a new crasher on GCC single-source, but it's currently behind a flag, and Ben has offered to work on the reduction. =] llvm-svn: 145010	2011-11-20 11:22:06 +00:00
Benjamin Kramer	b5ba2eef2d	SCEV: Actually set overflow flags on add expressions. setFlags doesn't modify its arguments. llvm-svn: 145007	2011-11-20 10:24:36 +00:00
Craig Topper	e79761df73	Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine. llvm-svn: 145005	2011-11-20 00:12:05 +00:00
Craig Topper	a3a6583694	Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled. llvm-svn: 145004	2011-11-19 22:34:59 +00:00
Craig Topper	bac86038ac	Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns. llvm-svn: 145003	2011-11-19 21:01:54 +00:00
Craig Topper	3af6ae089f	Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns. llvm-svn: 144999	2011-11-19 17:46:46 +00:00
Chandler Carruth	f3dc9eff16	Move the handling of unanalyzable branches out of the loop-driven chain formation phase and into the initial walk of the basic blocks. We essentially pre-merge all blocks where unanalyzable fallthrough exists, as we won't be able to update the terminators effectively after any reorderings. This is quite a bit more principled as there may be CFGs where the second half of the unanalyzable pair has some analyzable predecessor that gets placed first. Then it may get placed next, implicitly breaking the unanalyzable branch even though we never even looked at the part that isn't analyzable. I've included a test case that triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize some more general ones as I dig into related issues. Also, to make this new scheme work we have to be able to handle branches into the middle of a chain, so add this check. We always fall back on the incoming ordering. Finally, this starts to really underscore a known limitation of the current implementation -- we don't consider broken predecessors when merging successors. This can cause major missed opportunities, and is something I'm planning on looking at next (modulo more bug reports). llvm-svn: 144994	2011-11-19 10:26:02 +00:00
Craig Topper	f984efbfce	Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors. llvm-svn: 144989	2011-11-19 09:02:40 +00:00
Craig Topper	81390be00f	Collapse X86 PSIGNB/PSIGNW/PSIGND node types. llvm-svn: 144988	2011-11-19 07:33:10 +00:00
Craig Topper	de6b73bb4d	Extend VPBLENDVB and VPSIGN lowering to work for AVX2. llvm-svn: 144987	2011-11-19 07:07:26 +00:00
Craig Topper	66e2b5a61e	Remove unused parameters from the AVX maskmov classes. llvm-svn: 144985	2011-11-19 04:49:22 +00:00
Andrew Trick	6b4d578f54	Fix a corner case in updating LoopInfo after fully unrolling an outer loop. The loop tree's inclusive block lists are painful and expensive to update. (I have no idea why they're inclusive). The design was supposed to handle this case but the implementation missed it and my unit tests weren't thorough enough. Fixes PR11335: loop unroll update. llvm-svn: 144970	2011-11-18 03:42:41 +00:00
Nadav Rotem	1ec141d0f9	Add AVX2 vpbroadcast support llvm-svn: 144967	2011-11-18 02:49:55 +00:00
Kostya Serebryany	1cdc6e9567	[asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler llvm-svn: 144962	2011-11-18 01:41:06 +00:00
Chad Rosier	ee93ff736a	Guard call to getRegForValue with isTypeLegal check to avoid unnecessary work/dead code. llvm-svn: 144959	2011-11-18 01:17:34 +00:00
Devang Patel	107e8ec30d	DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange. llvm-svn: 144937	2011-11-17 23:43:15 +00:00
Kostya Serebryany	a6edf4c21f	quick fix: remove GlobalVariable::GlobalVariable mistakenly committed at r144933. For some reason this compiles on linux llvm-svn: 144936	2011-11-17 23:37:53 +00:00
Andrew Trick	949045864d	Fix an overly general check in SimplifyIndvar to handle useless phi cycles. The right way to check for a binary operation is cast<BinaryOperator>. The original check: cast<Instruction> && numOperands() == 2 would match phi "instructions", leading to an infinite loop in extreme corner case: a useless phi with operands [self, constant] that prior optimization passes failed to remove, being used in the loop by another useless phi, in turn being used by an lshr or udiv. Fixes PR11350: runaway iteration assertion. llvm-svn: 144935	2011-11-17 23:36:35 +00:00
Kostya Serebryany	65e2211b95	fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr llvm-svn: 144933	2011-11-17 23:14:59 +00:00
Chad Rosier	0eff3e5c21	Add TODO comment. llvm-svn: 144920	2011-11-17 21:46:13 +00:00
Craig Topper	f41e1d0246	Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments. llvm-svn: 144896	2011-11-17 07:49:38 +00:00
Chad Rosier	15b2498e88	Dead code. llvm-svn: 144888	2011-11-17 07:24:49 +00:00
Chad Rosier	f83ab704e4	When fast iseling a GEP, accumulate the offset rather than emitting a series of ADDs. MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs being: (1) If we can't materialize the large constant then we'll cause fast-isel to bail. (2) Too large of an offset can't be directly encoded in the ADD resulting in a MOV+ADD. Generally not a bad thing because otherwise we would have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix for that. (3) Conversely, too low of a threshold we'll miss opportunities to coalesce ADDs. rdar://10412592 llvm-svn: 144886	2011-11-17 07:15:58 +00:00
Craig Topper	f17b600577	Remove seemingly unnecessary duplicate VROUND definitions. llvm-svn: 144885	2011-11-17 07:04:00 +00:00
Eli Friedman	489c0ff4a4	Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom names for fwrite and fputs. Fixes <rdar://problem/9815881>. llvm-svn: 144876	2011-11-17 01:27:36 +00:00
Chad Rosier	ce619ddfc5	Don't unconditionally set the kill flag. rdar://10456186 llvm-svn: 144872	2011-11-17 01:16:53 +00:00
Eli Friedman	20439a42b0	Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing. llvm-svn: 144867	2011-11-17 00:21:52 +00:00
Eli Friedman	ff1eaa7578	Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393. llvm-svn: 144863	2011-11-16 23:50:22 +00:00
Michael J. Spencer	d27d51fbaf	Object/COFF: Support common symbols. llvm-svn: 144861	2011-11-16 23:36:12 +00:00
Jim Grosbach	d3f02cbce9	Generalize the fixup info for ARM mode. We don't (yet) have the granularity in the fixups to be specific about which bitranges are affected. That's a future cleanup, but we're not there yet. llvm-svn: 144852	2011-11-16 22:48:37 +00:00
Akira Hatanaka	b31abde0f3	Lower 64-bit constant pool node. llvm-svn: 144849	2011-11-16 22:44:38 +00:00
Akira Hatanaka	eb42071721	Lower 64-bit block address. llvm-svn: 144847	2011-11-16 22:42:10 +00:00
Jim Grosbach	7ccdb7c0ae	Fix encoding of NOP used for padding in ARM mode .align. llvm-svn: 144842	2011-11-16 22:40:25 +00:00
Akira Hatanaka	7b8547c4d0	Add patterns for 64-bit tglobaladdr, tblockaddress, tjumptable and tconstpool nodes. llvm-svn: 144841	2011-11-16 22:39:56 +00:00
Akira Hatanaka	6d617ceca2	64-bit jump register instruction. llvm-svn: 144840	2011-11-16 22:36:01 +00:00
Evan Cheng	011538dc79	Another missing X86ISD::MOVLPD pattern. rdar://10450317 llvm-svn: 144839	2011-11-16 22:24:44 +00:00
Jim Grosbach	bfe5c5c968	ARM assembly parsing for shifted register operands for MOV instruction. llvm-svn: 144837	2011-11-16 21:50:05 +00:00
Jim Grosbach	01e0439240	Clean up debug printing of ARM shifted operands. llvm-svn: 144836	2011-11-16 21:46:50 +00:00
Chad Rosier	ff40b1e164	Add fast-isel stats to determine who's doing all the work, the target-independent selector or the target-specific selector. llvm-svn: 144833	2011-11-16 21:05:28 +00:00
Chad Rosier	cfd0d10e72	Fix the stats collection for fast-isel. The failed count was only accounting for a single miss and not all predecessor instructions that get selected by the selection DAG instruction selector. This is still not exact (e.g., it overstates misses when folded/dead instructions are present), but it is a step in the right direction. llvm-svn: 144832	2011-11-16 21:02:08 +00:00
Jim Grosbach	3127ab6d8f	ARM assembly two operand forms for LSR, ASR, LSL, ROR register. llvm-svn: 144814	2011-11-16 19:12:24 +00:00
Jim Grosbach	1a2f9ee3c8	ARM assembly parsing for RRX mnemonic. rdar://9704684 llvm-svn: 144812	2011-11-16 19:05:59 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Evan Cheng	822ddde50d	Disable expensive two-address optimizations at -O0. rdar://10453055 llvm-svn: 144806	2011-11-16 18:44:48 +00:00
Chad Rosier	80979b6ea6	Check to make sure we can select the instruction before trying to put the operands into a register. Otherwise, we may materialize dead code. llvm-svn: 144805	2011-11-16 18:39:44 +00:00
Evan Cheng	624eb2af6f	Disable the assertion again. Looks like fastisel is still generating bad kill markers. llvm-svn: 144804	2011-11-16 18:32:14 +00:00
Jim Grosbach	abcac56869	ARM mode aliases for bitwise instructions w/ register operands. rdar://9704684 llvm-svn: 144803	2011-11-16 18:31:45 +00:00
Bob Wilson	0ca7ce389c	Fix tablegen warning: hasSideEffects is inferred for eh_sjlj_dispatchsetup. llvm-svn: 144798	2011-11-16 17:09:59 +00:00
NAKAMURA Takumi	b345060a85	lib/Target/ARM/CMakeLists.txt: Disable optimization in ARMISelLowering.cpp also on MSC15(aka VS9). Seems miscompiled. llvm-svn: 144794	2011-11-16 09:18:28 +00:00
Evan Cheng	ecb2908bf9	Sink codegen optimization level into MCCodeGenInfo along side relocation model and code model. This eliminates the need to pass OptLevel flag all over the place and makes it possible for any codegen pass to use this information. llvm-svn: 144788	2011-11-16 08:38:26 +00:00
Bob Wilson	cca9aa58ca	Record landing pads with a SmallSetVector to avoid multiple entries. There may be many invokes that share one landing pad, and the previous code would record the landing pad once for each invoke. Besides the wasted effort, a pair of volatile loads gets inserted every time the landing pad is processed. The rest of the code can get optimized away when a landing pad is processed repeatedly, but the volatile loads remain, resulting in code like: LBB35_18: Ltmp483: ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r2, [r7, #-72] ldr r2, [r7, #-68] ldr r4, [r7, #-72] ldr r2, [r7, #-68] llvm-svn: 144787	2011-11-16 07:57:21 +00:00
Craig Topper	3ed7d9ee5a	Fix the execution domain on a bunch of SSE/AVX instructions. llvm-svn: 144784	2011-11-16 07:30:46 +00:00
Bob Wilson	643e63c40c	Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602> This same basic code was in the older version of the SjLj exception handling, but it was removed in the recent revisions to that code. It needs to be there. llvm-svn: 144782	2011-11-16 07:12:00 +00:00
Bob Wilson	f6d1728d8f	Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602> The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781	2011-11-16 07:11:57 +00:00
Craig Topper	07d8b5e2c9	Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636. llvm-svn: 144777	2011-11-16 05:02:04 +00:00
Evan Cheng	4ac36c8e26	Revert r144568 now that r144730 has fixed the fast-isel kill marker bug. llvm-svn: 144776	2011-11-16 04:55:01 +00:00
Nick Lewycky	c7f1e7993c	Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when looking at the size of the pointee. Fixes PR11390! llvm-svn: 144773	2011-11-16 03:49:48 +00:00
Evan Cheng	b8c55a5339	If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges. llvm-svn: 144772	2011-11-16 03:47:42 +00:00
Evan Cheng	59f8156ea0	RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185 llvm-svn: 144771	2011-11-16 03:33:08 +00:00
Evan Cheng	9ddd69a8bc	Process all uses first before defs to accurately capture register liveness. rdar://10449480 llvm-svn: 144770	2011-11-16 03:05:12 +00:00
Eli Friedman	87f92512c3	CONCAT_VECTORS can have more than two operands. PR11389. llvm-svn: 144768	2011-11-16 02:52:39 +00:00
Eli Friedman	d257a464d1	Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer. llvm-svn: 144767	2011-11-16 02:43:15 +00:00
Kostya Serebryany	6e6b03ec46	AddressSanitizer, first commit (compiler module only) llvm-svn: 144758	2011-11-16 01:35:23 +00:00
Kostya Serebryany	db999c01f2	test commit to verify that commit access works (added blank line) llvm-svn: 144748	2011-11-16 01:14:38 +00:00
Owen Anderson	ca2f78a95b	Rename MVT::untyped to MVT::Untyped to match similar nomenclature. llvm-svn: 144747	2011-11-16 01:02:57 +00:00
Andrew Trick	90c7a108ca	Fix SCEV overly optimistic back edge taken count for multi-exit loops. Fixes PR11375: Different results for 'clang++ huh.cpp'... llvm-svn: 144746	2011-11-16 00:52:40 +00:00
Chad Rosier	af13d767a2	Add FIXME comment. llvm-svn: 144743	2011-11-16 00:32:20 +00:00
Jakob Stoklund Olesen	653183fd5c	Enable -widen-vmovs by default. This will widen 32-bit register vmov instructions to 64-bit when possible. The 64-bit vmovd instructions can then be translated to NEON vorr instructions by the execution dependency fix pass. The copies are only widened if they are marked as clobbering the whole D-register. llvm-svn: 144734	2011-11-15 23:53:18 +00:00
Eric Christopher	0abbd0ef5a	Stabilize the output of the dwarf accelerator tables. Fixes a comparison failure during bootstrap with it turned on. llvm-svn: 144731	2011-11-15 23:37:17 +00:00
Chad Rosier	291ce47db7	GEPs with all zero indices are trivially coalesced by fast-isel. For example, %arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0 %arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134 Prior to this commit, the GEP instruction that defines %arrayidx136 thought that %arrayidx135 was a trivial kill. The GEP that defines %arrayidx135 doesn't generate any code and thus %M0 gets folded into the second GEP. Thus, we need to look through GEPs with all zero indices. rdar://10443319 llvm-svn: 144730	2011-11-15 23:34:05 +00:00
Jim Grosbach	e891fe8d6c	ARM assembly parsing for register range syntax for VLD/VST register lists. For example, vld1.f64 {d2-d5}, [r2,:128]! Should be equivalent to: vld1.f64 {d2,d3,d4,d5}, [r2,:128]! It's not documented syntax in the ARM ARM, but it is consistent with what's accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to support. rdar://10451128 llvm-svn: 144727	2011-11-15 23:19:15 +00:00
Jim Grosbach	003cea6011	ARM assembly parsing for data type suffixes on NEON VMOV aliases. llvm-svn: 144722	2011-11-15 22:54:42 +00:00
Nadav Rotem	51f71054b6	Fix MSVC warnings by adding a cast. llvm-svn: 144721	2011-11-15 22:54:21 +00:00
Nadav Rotem	37010002f2	AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code. llvm-svn: 144720	2011-11-15 22:50:37 +00:00
Jim Grosbach	75fb4abcdc	ARM assembly parsing two operand forms for shift instructions. llvm-svn: 144713	2011-11-15 22:27:54 +00:00
Jim Grosbach	a01033709f	ARM VFP assembly parsing for VADD and VSUB two-operand forms. llvm-svn: 144710	2011-11-15 22:15:10 +00:00
Jim Grosbach	8279c1828f	ARM accept an immediate offset in memory operands w/o the '#'. llvm-svn: 144709	2011-11-15 22:14:41 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Jim Grosbach	8d579230c6	ARM enclosing curly braces optional on one-register VLD/VST instruction lists. 'vld1.f32 d4, [r7]' should be parsed as equivalent to 'vld1.f32 {d4}, [r7]' rdar://10450488. llvm-svn: 144701	2011-11-15 21:45:55 +00:00
Jim Grosbach	84f0ba5747	ARM size suffix on VFP single-precision 'vmov' is optional. rdar://10435114 llvm-svn: 144698	2011-11-15 21:18:35 +00:00
Devang Patel	43bde96a4c	Insert modified DBG_VALUE into LiveDbgValueMap. llvm-svn: 144696	2011-11-15 21:03:58 +00:00
Jim Grosbach	a92a5d8548	Fix typo. llvm-svn: 144695	2011-11-15 21:01:30 +00:00
Jim Grosbach	131b45e632	ARM alternate size suffixes for VTRN instructions. rdar://10435076 llvm-svn: 144694	2011-11-15 20:49:46 +00:00
Owen Anderson	05060f0748	Fix a misplaced paren bug. llvm-svn: 144692	2011-11-15 20:30:41 +00:00
Jim Grosbach	5803f6d5a2	ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns. Yet more of rdar://10435076. llvm-svn: 144691	2011-11-15 20:29:42 +00:00
Jim Grosbach	c5b1bc561e	ARM assembly parsing for two-operand form of 'mul' instruction. rdar://10449856. llvm-svn: 144689	2011-11-15 20:14:51 +00:00
Jim Grosbach	72dfd20aba	ARM assembly parsing for two-operand form of 'mul' instruction. Ongoing rdar://10435114. llvm-svn: 144688	2011-11-15 20:02:06 +00:00
Jim Grosbach	efa7e95d06	Thumb2 two-operand 'mul' instruction wide encoding parsing. rdar://10449724 llvm-svn: 144684	2011-11-15 19:55:16 +00:00
Owen Anderson	0ac9058f89	Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32. llvm-svn: 144683	2011-11-15 19:55:00 +00:00
Jim Grosbach	6efa7b9852	Thumb2 assembly parsing for mul.w in IT block fix. When the 3rd operand is not a low-register, and the first two operands are the same low register, the parser was incorrectly trying to use the 16-bit instruction encoding. rdar://10449281 llvm-svn: 144679	2011-11-15 19:29:45 +00:00
Benjamin Kramer	b106bcc536	StringRefize and simplify. llvm-svn: 144675	2011-11-15 19:12:09 +00:00
Rafael Espindola	f11e7f1305	We currently use a callback to handle an IL pass deleting a BB that still has a reference to it. Unfortunately, that doesn't work for codegen passes since we don't get notified of MBB's being deleted (the original BB stays). Use that fact to our advantage and after printing a function, check if any of the IL BBs corresponds to a symbol that was not printed. This fixes pr11202. llvm-svn: 144674	2011-11-15 19:08:46 +00:00
Akira Hatanaka	6ee8fc88c7	Fix functions in MipsFrameLowering.cpp and MipsRegisterInfo.cpp. Use 64-bit registers and instructions when ABI is N64. llvm-svn: 144666	2011-11-15 18:53:55 +00:00
Akira Hatanaka	494913270e	Set nomacro before emitting the sequence of instructions that set global pointer register. llvm-svn: 144665	2011-11-15 18:44:44 +00:00
Akira Hatanaka	66a14c0650	Simplify function PassByValArg64. llvm-svn: 144664	2011-11-15 18:42:25 +00:00
Akira Hatanaka	d519d8ca83	Remove function printMipsSymbolRef. llvm-svn: 144663	2011-11-15 18:38:35 +00:00
Benjamin Kramer	be6535b3dc	Remove Value::getNameStr. It has been deprecated for a while and provides no additional value over getName(). llvm-svn: 144657	2011-11-15 18:30:12 +00:00
Benjamin Kramer	184e3ceea0	Missed some users of Value::getNameStr. llvm-svn: 144656	2011-11-15 18:30:06 +00:00
Akira Hatanaka	b7796ae938	Delete files. llvm-svn: 144655	2011-11-15 18:22:48 +00:00
Akira Hatanaka	1c0590c5da	Remove MipsMCSymbolRefExpr. llvm-svn: 144654	2011-11-15 18:20:08 +00:00
Jim Grosbach	2aabaa704a	ARM parsing datatype suffix variants for register-writeback VLD1/VST1 instructions. rdar://10435076 llvm-svn: 144650	2011-11-15 17:49:59 +00:00
Jim Grosbach	0dde349df1	Tidy up. 80 columns. llvm-svn: 144649	2011-11-15 16:46:22 +00:00
Benjamin Kramer	1f97a5a671	Remove all remaining uses of Value::getNameStr(). llvm-svn: 144648	2011-11-15 16:27:03 +00:00
Benjamin Kramer	4c93d15f09	Twinify GraphWriter a little bit. llvm-svn: 144647	2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen	e14ef7e6f8	Check all overlaps when looking for used registers. A function using any RC alias is enough to enable the ExeDepsFix pass. llvm-svn: 144636	2011-11-15 08:20:43 +00:00
Jay Foad	ab9ebd3521	Make use of MachinePointerInfo::getFixedStack. llvm-svn: 144635	2011-11-15 07:51:13 +00:00
Jay Foad	70679df664	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144634	2011-11-15 07:50:46 +00:00
Jay Foad	e5cbd3c3fb	Fix typo in comment. llvm-svn: 144633	2011-11-15 07:50:05 +00:00
Jay Foad	465101bb0e	Make use of MachinePointerInfo::getFixedStack. This removes all mention of PseudoSourceValue from lib/Target/. llvm-svn: 144632	2011-11-15 07:34:52 +00:00
Jay Foad	0745e645e0	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144631	2011-11-15 07:24:32 +00:00
Craig Topper	649d1c5eec	Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled. llvm-svn: 144629	2011-11-15 06:39:01 +00:00
Evan Cheng	7098c4e5f4	Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior. llvm-svn: 144628	2011-11-15 06:26:51 +00:00
Chandler Carruth	9b548a7fcf	Rather than trying to use the loop block sequence or the function block sequence when recovering from unanalyzable control flow constructs, always use the function sequence. I'm not sure why I ever went down the path of trying to use the loop sequence, it is fundamentally not the correct sequence to use. We're trying to preserve the incoming layout in the cases of unreasonable control flow, and that is only encoded at the function level. We already have a filter to select exactly the sub-set of blocks within the function that we're trying to form into a chain. The resulting code layout is also significantly better because of this. In several places we were ending up with completely unreasonable control flow constructs due to the ordering chosen by the loop structure for its internal storage. This change removes a completely wasteful vector of basic blocks, saving memory allocation in the common case even though it costs us CPU in the fairly rare case of unnatural loops. Finally, it fixes the latest crasher reduced out of GCC's single source. Thanks again to Benjamin Kramer for the reduction, my bugpoint skills failed at it. llvm-svn: 144627	2011-11-15 06:26:43 +00:00
Craig Topper	05baa85f58	Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622	2011-11-15 05:55:35 +00:00
Evan Cheng	7ca4b6eb5c	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Jim Grosbach	29cdcda80d	ARM parsing datatype suffix variants for fixed-writeback VLD1/VST1 instructions. rdar://10435076 llvm-svn: 144606	2011-11-15 01:46:57 +00:00
Nick Lewycky	6804d27048	Move WEAK marking to the declaration. llvm-svn: 144603	2011-11-15 01:23:22 +00:00
Jakob Stoklund Olesen	f8ad336bc4	Break false dependencies before partial register updates. Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602	2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen	543bef6ead	Track register ages more accurately. Keep track of the last instruction to define each register individually instead of per DomainValue. This lets us track more accurately when a register was last written. Also track register ages across basic blocks. When entering a new basic block, use the least stale predecessor def as a worst case estimate for register age. The register age is used to arbitrate between conflicting domains. The most recently defined register wins. llvm-svn: 144601	2011-11-15 01:15:25 +00:00
Nick Lewycky	b2489b7484	Fix linking for some users who already have tsan enabled code and are trying to link it against llvm code, by making our definitions weak. "Some users." llvm-svn: 144596	2011-11-15 00:14:04 +00:00
Jim Grosbach	a498af2b1d	ARM parsing datatype suffix variants for non-writeback VST1 instructions. rdar://10435076 llvm-svn: 144593	2011-11-14 23:43:46 +00:00
Jim Grosbach	72838a0345	ARM parsing datatype suffix variants for non-writeback VLD1 instructions. rdar://10435076 llvm-svn: 144592	2011-11-14 23:32:59 +00:00
Jim Grosbach	750de7a399	Add explanatory comment. llvm-svn: 144589	2011-11-14 23:21:09 +00:00
Jim Grosbach	9c2d9d597b	Split out the plain '.{8|16|32|64}' suffix handling. Make it easier to deal with aliases for instructions that do require a suffix but accept more specific variants of the same size. llvm-svn: 144588	2011-11-14 23:20:14 +00:00
Jim Grosbach	3d6c0e0bb2	ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions. rdar://10435076 llvm-svn: 144587	2011-11-14 23:11:19 +00:00
Chad Rosier	057b6d3476	Supporting inline memmove isn't going to be worthwhile. The only way to avoid violating a dependency is to emit all loads prior to stores. This would likely cause a great deal of spillage offsetting any potential gains. llvm-svn: 144585	2011-11-14 23:04:09 +00:00
Jim Grosbach	3e2c6f380c	ARM VLDR/VSTR instructions don't need a size suffix. Canonicalize on the non-suffixed form, but continue to accept assembly that has any correctly sized type suffix. llvm-svn: 144583	2011-11-14 23:03:21 +00:00
Nick Lewycky	7013a19e8a	Refactor capture tracking (which already had a couple flags for whether returns and stores capture) to permit the caller to see each capture point and decide whether to continue looking. Use this inside memdep to do an analysis that basicaa won't do. This lets us solve another devirtualization case, fixing PR8908! llvm-svn: 144580	2011-11-14 22:49:42 +00:00
Chad Rosier	ab7223e99a	Add support for inlining small memcpys. rdar://10412592 llvm-svn: 144578	2011-11-14 22:46:17 +00:00
Chad Rosier	45110fdf8d	Fix a performance regression from r144565. Positive offsets were being lowered into registers, rather than encoded directly in the load/store. llvm-svn: 144576	2011-11-14 22:34:48 +00:00
Jim Grosbach	7996b15724	ARM assembly parsing type suffix options for VLDR/VSTR. rdar://10435076 llvm-svn: 144575	2011-11-14 22:28:39 +00:00
Evan Cheng	f2fc508d4d	Avoid dereferencing off the beginning of lists. llvm-svn: 144569	2011-11-14 21:11:15 +00:00
Evan Cheng	28ffb7e444	At -O0, multiple uses of a virtual register in the same BB are being marked "kill". This looks like a bug upstream. Since that's going to take some time to understand, loosen the assertion and disable the optimization when multiple kills are seen. llvm-svn: 144568	2011-11-14 21:02:09 +00:00
Nick Lewycky	fe856110aa	Add support for tsan annotations (thread sanitizer, a valgrind-based tool). These annotations are disabled entirely when either ENABLE_THREADS is off, or building a release build. When enabled, they add calls to functions with no statements to ManagedStatic's getters. Use these annotations to inform tsan that the race used inside ManagedStatic initialization is actually benign. Thanks to Kostya Serebryany for helping write this patch! llvm-svn: 144567	2011-11-14 20:50:16 +00:00
Evan Cheng	fb13d32b3f	Add a missing pattern for X86ISD::MOVLPD. rdar://10436044 llvm-svn: 144566	2011-11-14 20:35:52 +00:00
Chad Rosier	adfd200bcb	Add support for Thumb load/stores with negative offsets. rdar://10412592 llvm-svn: 144565	2011-11-14 20:22:27 +00:00
Benjamin Kramer	319904cc7e	Unbreak Release builds. llvm-svn: 144560	2011-11-14 19:51:48 +00:00
Evan Cheng	30f44ad785	Teach two-address pass to re-schedule two-address instructions (or the kill instructions of the two-address operands) in order to avoid inserting copies. This fixes the few regressions introduced when the two-address hack was disabled (without regressing the improvements). rdar://10422688 llvm-svn: 144559	2011-11-14 19:48:55 +00:00
Pete Cooper	890e02e854	Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered. The constant idx case is still done in tablegen, but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557	2011-11-14 19:38:42 +00:00
Benjamin Kramer	42d098e1b4	Fold ConstantVector::isAllOnesValue into Constant::isAllOnesValue and simplify it. llvm-svn: 144555	2011-11-14 19:12:20 +00:00
Akira Hatanaka	f93b3f46f8	32-to-64-bit extended load. llvm-svn: 144554	2011-11-14 19:06:14 +00:00
Akira Hatanaka	0b8bc00424	AnalyzeCallOperands function for N32/64. N32/64 places all variable arguments in integer registers (or on stack), regardless of their types, but follows calling convention of non-vaarg function when it handles fixed arguments. llvm-svn: 144553	2011-11-14 19:02:54 +00:00
Akira Hatanaka	52359363f2	Modify LowerFormalArguments to correctly handle vaarg arguments for Mips64. llvm-svn: 144552	2011-11-14 19:01:09 +00:00
Justin Holewinski	33a519021c	PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying on custom implementations. llvm-svn: 144551	2011-11-14 18:58:20 +00:00
Akira Hatanaka	d673cfe027	Remove variable that keeps the size of area used to save byval or variable argument registers on the callee's stack frame, along with functions that set and get it. It is not necessary to add the size of this area when computing stack size in emitPrologue, since it has already been accounted for in PEI::calculateFrameObjectOffsets. llvm-svn: 144549	2011-11-14 18:56:20 +00:00
Jakob Stoklund Olesen	7e6004a3c1	Fix early-clobber handling in shrinkToUses. I broke this in r144515, it affected most ARM testers. <rdar://problem/10441389> llvm-svn: 144547	2011-11-14 18:45:38 +00:00
Bob Wilson	8d1c7dbdff	Disable generation of compact unwind encodings. <rdar://problem/10441578> This still seems to be causing some failures. It needs more testing before it gets enabled again. llvm-svn: 144543	2011-11-14 18:21:07 +00:00
Jim Grosbach	ee201faeac	Tidy up. 80 column. llvm-svn: 144538	2011-11-14 17:52:47 +00:00
Benjamin Kramer	d00e94e882	Make headers standalone, move a virtual method out of line. llvm-svn: 144536	2011-11-14 17:22:45 +00:00
Chandler Carruth	fd9b4d9813	It helps to deallocate memory as well as allocate it. =] This actually cleans up all the chains allocated during the processing of each function so that for very large inputs we don't just grow memory usage without bound. llvm-svn: 144533	2011-11-14 10:57:23 +00:00
Chandler Carruth	0a31d149ea	Remove an over-eager assert that was firing on one of the ARM regression tests when I forcibly enabled block placement. It is apparently possible for an unanalyzable block to fall through to a non-loop block. I don't actually believe this is correct, I believe that 'canFallThrough' is returning true needlessly for the code construct, and I've left a bit of a FIXME on the verification code to try to track down why this is coming up. Anyways, removing the assert doesn't degrade the correctness of the algorithm. llvm-svn: 144532	2011-11-14 10:55:53 +00:00
Chandler Carruth	0af6a0bb69	Begin chipping away at one of the biggest quadratic-ish behaviors in this pass. We're leaving already merged blocks on the worklist, and scanning them again and again only to determine each time through that indeed they aren't viable. We can instead remove them once we're going to have to scan the worklist. This is the easy way to implement removing them. If this remains on the profile (as I somewhat suspect it will), we can get a lot more clever here, as the worklist's order is essentially irrelevant. We can use swapping and fold the two loops to reduce overhead even when there are many blocks on the worklist but only a few of them are removed. llvm-svn: 144531	2011-11-14 09:46:33 +00:00
Chandler Carruth	84cd44c750	Under the hood, MBPI is doing a linear scan of every successor every time it is queried to compute the probability of a single successor. This makes computing the probability of every successor of a block in sequence... really really slow. ;] This switches to a linear walk of the successors rather than a quadratic one. One of several quadratic behaviors slowing this pass down. I'm not really thrilled with moving the sum code into the public interface of MBPI, but I don't (at the moment) have ideas for a better interface. The direction I'm thinking in for a better interface is to have MBPI actually retain much more state and make all of these queries cheap. That's a lot of work, and would require invasive changes. Until then, this seems like the least bad (i.e., least quadratic) solution. Suggestions welcome. llvm-svn: 144530	2011-11-14 09:12:57 +00:00
Chandler Carruth	a9e71faa0f	Reuse the logic in getEdgeProbability within getHotSucc in order to correctly handle blocks whose successor weights sum to more than UINT32_MAX. This is slightly less efficient, but the entire thing is already linear on the number of successors. Calling it within any hot routine is a mistake, and indeed no one is calling it. It also simplifies the code. llvm-svn: 144527	2011-11-14 08:55:59 +00:00
Chandler Carruth	ed5aa547bc	Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on the sum of the edge weights not overflowing uint32, and crashed when they did. This is generally safe as BranchProbabilityInfo tries to provide this guarantee. However, the CFG can get modified during codegen in a way that grows the sum of the edge weights. This doesn't seem unreasonable (imagine just adding more blocks all with the default weight of 16), but it is hard to come up with a case that actually triggers 32-bit overflow. Fortunately, the single-source GCC build is good at this. The solution isn't very pretty, but it's no worse than the previous code. We're already summing all of the edge weights on each query, we can sum them, check for an overflow, compute a scale, and sum them again. I've included a greatly reduced test case out of the GCC source that triggers it. It's a pretty lame test, as it clearly is just barely triggering the overflow. I'd like to have something that is much more definitive, but I don't understand the fundamental pattern that triggers an explosion in the edge weight sums. The buggy code is duplicated within this file. I'll collapse them into a single implementation in a subsequent commit. llvm-svn: 144526	2011-11-14 08:50:16 +00:00
Craig Topper	182b00a2e0	Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions. llvm-svn: 144525	2011-11-14 08:07:55 +00:00
Craig Topper	a331515c82	Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway. llvm-svn: 144522	2011-11-14 06:46:21 +00:00

51420 Commits