llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Trick	1df762abf4	RegPressure: fix array index iteration style. llvm-svn: 156560	2012-05-10 19:11:49 +00:00
Manman Ren	b555b382bd	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Manman Ren	c860887b2d	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 \| sub r1, imm cmp r3, r1 or cmp r1, r3 \| cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Eric Christopher	8d2a77de63	Fix thinko in conditional. Part of rdar://11352000 and should bring the buildbots back. llvm-svn: 156421	2012-05-08 21:24:39 +00:00
Jim Grosbach	92f6adc8be	DAGCombiner should not change the type of an extract_vector index. When a combine twiddles an extract_vector, care should be taken to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419	2012-05-08 20:56:07 +00:00
Akira Hatanaka	fd82286e62	Formatting fixes. Patch by Jack Carter. llvm-svn: 156409	2012-05-08 19:14:42 +00:00
Eric Christopher	4d25052a9a	Handle OpDeref in case it comes in as a register operand. Part of rdar://11352000 llvm-svn: 156405	2012-05-08 18:56:00 +00:00
Jakob Stoklund Olesen	952b4c11fe	Extract methods for joining physregs. No functional change. llvm-svn: 156345	2012-05-08 00:08:35 +00:00
Jakob Stoklund Olesen	9e8ae6c37f	Naming convention and whitespace. No functional change. llvm-svn: 156342	2012-05-07 23:46:16 +00:00
Jakob Stoklund Olesen	98595b5a61	Coalesce subreg-subreg copies. At least some of them: %vreg1:sub_16bit = COPY %vreg2:sub_16bit; GR64:%vreg1, GR32: %vreg2 Previously, we couldn't figure out that the above copy could be eliminated by coalescing %vreg2 with %vreg1:sub_32bit. The new getCommonSuperRegClass() hook makes it possible. This is not very useful yet since the unmodified part of the destination register usually interferes with the source register. The coalescer needs to understand sub-register interference checking first. llvm-svn: 156334	2012-05-07 22:57:55 +00:00
Jakob Stoklund Olesen	3c52f0281f	Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass(). The getPointerRegClass() hook can return register classes that depend on the calling convention of the current function (ptr_rc_tailcall). So far, we have been able to infer the calling convention from the subtarget alone, but as we add support for multiple calling conventions per target, that no longer works. Patch by Yiannis Tsiouris! llvm-svn: 156328	2012-05-07 22:10:26 +00:00
Owen Anderson	ab63d84252	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Benjamin Kramer	e31f31e5c0	Add a new target hook "predictableSelectIsExpensive". This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233	2012-05-05 12:49:14 +00:00
Jakob Stoklund Olesen	e326ed33a8	Make sure findRepresentativeClass picks the widest super-register. We want the representative register class to contain the largest super-registers available. This makes the function less sensitive to the register class numbering. llvm-svn: 156220	2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen	e89496fe63	Remove extra comma in debug output. llvm-svn: 156219	2012-05-04 22:53:26 +00:00
Jakob Stoklund Olesen	75fbe90839	Use SuperRegClassIterator for findRepresentativeClass(). The masks returned by SuperRegClassIterator are computed automatically by TableGen. This is better than depending on the manually specified SuperRegClasses. llvm-svn: 156147	2012-05-04 02:19:22 +00:00
Evan Cheng	b64e7b778b	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That gave no benefit and instead ensured the last MOV would not be coalesced. rdar://11355268 llvm-svn: 156048	2012-05-03 01:45:13 +00:00
Andrew Trick	32aea358e1	Added TargetRegisterInfo::getAllocatableClass. This ensures that virtual registers always belong to an allocatable class. If your target attempts to create a vreg for an operand that has no allocatable register subclass, you will crash quickly. This ensures that targets define register classes as intended. llvm-svn: 156046	2012-05-03 01:14:37 +00:00
Owen Anderson	41b0665b5b	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	b5f167c660	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Jim Grosbach	edcb868fe3	Tidy up. Naming conventions. llvm-svn: 155960	2012-05-01 23:21:41 +00:00
Jakub Staszak	cd2353402d	Use dyn_cast instead of checking opcode and cast. llvm-svn: 155957	2012-05-01 23:06:00 +00:00
Bill Wendling	b6b50c6638	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 llvm-svn: 155954	2012-05-01 22:50:45 +00:00
Sirish Pande	94212168fc	Target independent Hexagon Packetizer fix. llvm-svn: 155947	2012-05-01 21:28:30 +00:00
Bill Wendling	b12f16e75f	Change the PassManager from a reference to a pointer. The TargetPassManager's default constructor wants to initialize the PassManager to 'null'. But it's illegal to bind a null reference to a null l-value. Make the ivar a pointer instead. PR12468 llvm-svn: 155902	2012-05-01 08:27:43 +00:00
Jakub Staszak	cec09b2594	Add some constantness. No functionality change. llvm-svn: 155859	2012-04-30 23:41:30 +00:00
Benjamin Kramer	db25381a54	RegisterPressure: ArrayRefize some functions for better readability. No functionality change. llvm-svn: 155795	2012-04-29 18:52:56 +00:00
Jakob Stoklund Olesen	6053899aa0	Don't update spill weights when joining intervals. We don't compute spill weights until after coalescing anyway. llvm-svn: 155766	2012-04-28 19:19:11 +00:00
Jakob Stoklund Olesen	4fe0e1908e	Spring cleaning - Delete dead code. llvm-svn: 155765	2012-04-28 19:19:07 +00:00
Andrew Trick	833f04962a	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. Its luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155749	2012-04-28 01:03:23 +00:00
Andrew Trick	7a773ec053	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743	2012-04-27 22:55:59 +00:00
Andrew Trick	03fa574af5	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the better fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668	2012-04-26 21:48:25 +00:00
Jakob Stoklund Olesen	01f201f484	Remove more dead code. llvm-svn: 155566	2012-04-25 18:01:30 +00:00
Jakob Stoklund Olesen	983dd43b15	Remove the -disable-cross-class-join option. Cross-class joins have been normal and fully supported for a while now. With TableGen generating the getMatchingSuperRegClass() hook, they are unlikely to cause problems again. llvm-svn: 155552	2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen	d11cf9677f	Cross-class joining is winning. Remove the heuristic for disabling cross-class joins. The greedy register allocator can handle the narrow register classes, and when it splits a live range, it can pick a larger register class. Benchmarks were unaffected by this change. <rdar://problem/11302212> llvm-svn: 155551	2012-04-25 16:17:47 +00:00
Andrew Trick	4d4b5469ab	Fix a naughty header include that breaks "installed" builds. llvm-svn: 155486	2012-04-24 20:36:19 +00:00
Evan Cheng	2d14d8aca1	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 llvm-svn: 155470	2012-04-24 19:06:55 +00:00
Andrew Trick	26bdff9b82	cmake: new file llvm-svn: 155460	2012-04-24 18:06:49 +00:00
Andrew Trick	9e9a9f1465	misched: DAG builder must special case earlyclobber llvm-svn: 155459	2012-04-24 18:04:41 +00:00
Andrew Trick	c3ea00565f	misched: try (not too hard) to place debug values where they belong llvm-svn: 155458	2012-04-24 18:04:37 +00:00
Andrew Trick	cc45a28320	misched: ignore debug values during scheduling llvm-svn: 155457	2012-04-24 18:04:34 +00:00
Andrew Trick	88639928bd	misched: DAG builder support for tracking register pressure within the current scheduling region. The DAG builder is a convenient place to do it. Hopefully this is more efficient than a separate traversal over the same region. llvm-svn: 155456	2012-04-24 17:56:43 +00:00
Andrew Trick	3cd53a1a52	RegisterPressure: A utility for computing register pressure within a MachineInstr sequence. This uses the new target interface for tracking register pressure using pressure sets to model overlapping register classes and subregisters. RegisterPressure results can be tracked incrementally or stored at region boundaries. Global register pressure can be deduced from local RegisterPressure results if desired. This is an early, somewhat untested implementation. I'm working on testing it within the context of a register pressure reducing MachineScheduler. llvm-svn: 155454	2012-04-24 17:53:35 +00:00
Bill Wendling	f1b14b719f	Look for the 'Is Simulated' module flag. This indicates that the program is compiled to run on a simulator. llvm-svn: 155435	2012-04-24 11:03:50 +00:00
Preston Gurd	9a0914753a	This patch fixes a problem which arose when using the Post-RA scheduler on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. llvm-svn: 155395	2012-04-23 21:39:35 +00:00
Chandler Carruth	af0f8bf595	Temporarily revert r155364 until the upstream review can complete, per the stated developer policy. llvm-svn: 155373	2012-04-23 18:28:57 +00:00
Sirish Pande	995c8dbfd2	Hexagon Packetizer's target independent fix. llvm-svn: 155364	2012-04-23 17:49:09 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Nadav Rotem	31caa27bf5	Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors. llvm-svn: 155296	2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen	d114da6004	Fix PR12599. The X86 target is editing the selection DAG while isel is selecting nodes following a topological ordering. When the DAG hacking triggers CSE, nodes can be deleted and bad things happen. llvm-svn: 155257	2012-04-20 23:36:09 +00:00
Jakob Stoklund Olesen	e3a891cf08	Make ISelPosition a local variable. Now that multiple DAGUpdateListeners can be active at the same time, ISelPosition can become a local variable in DoInstructionSelection. We simply register an ISelUpdater with CurDAG while ISelPosition exists. llvm-svn: 155249	2012-04-20 22:08:50 +00:00
Jakob Stoklund Olesen	beb9469d5c	Register DAGUpdateListeners with SelectionDAG. Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248	2012-04-20 22:08:46 +00:00
Jakob Stoklund Olesen	7111a630d5	Print <def,read-undef> to avoid confusion. The <undef> flag on a def operand only applies to partial register redefinitions. Only print the flag when relevant, and print it as <def,read-undef> to make it clearer what it means. llvm-svn: 155239	2012-04-20 21:45:33 +00:00
Andrew Trick	51ee936101	New and improved comment. llvm-svn: 155229	2012-04-20 20:24:33 +00:00
Andrew Trick	1eb4a0da55	SparseSet: Add support for key-derived indexes and arbitrary key types. This nicely handles the most common case of virtual register sets, but also handles anticipated cases where we will map pointers to IDs. The goal is not to develop a completely generic SparseSet template. Instead we want to handle the expected uses within llvm without any template antics in the client code. I'm adding a bit of template nastiness here, and some assumption about expected usage in order to make the client code very clean. The expected common use cases I'm designing for: - integer keys that need to be reindexed, and may map to additional data - densely numbered objects where we want pointer keys because no number->object map exists. llvm-svn: 155227	2012-04-20 20:05:28 +00:00
Andrew Trick	7405c6d57a	misched: initialize BB llvm-svn: 155226	2012-04-20 20:05:21 +00:00
Andrew Trick	a11810ad60	Allow targets to select the default scheduler by name. llvm-svn: 155090	2012-04-19 01:34:10 +00:00
Chandler Carruth	b415bf98f0	This reverts a long string of commits to the Hexagon backend. These commits have had several major issues pointed out in review, and those issues are not being addressed in a timely fashion. Furthermore, this was all committed leading up to the v3.1 branch, and we don't need piles of code with outstanding issues in the branch. It is possible that not all of these commits were necessary to revert to get us back to a green state, but I'm going to let the Hexagon maintainer sort that out. They can recommit, in order, after addressing the feedback. Reverted commits, with some notes: Primary commit r154616: HexagonPacketizer - There are lots of review comments here. This is the primary reason for reverting. In particular, it introduced a large number of warnings due to a bad construct in tablegen. - Follow-up commits that should be folded back into this when reposting: - r154622: CMake fixes - r154660: Fix numerous build warnings in release builds. - Please don't resubmit this until the three commits above are included, and the issues in review addressed. Primary commit r154695: Pass to replace transfer/copy ... - Reverted to minimize merge conflicts. I'm not aware of specific issues with this patch. Primary commit r154703: New Value Jump. - Primarily reverted due to merge conflicts. - Follow-up commits that should be folded back into this when reposting: - r154703: Remove iostream usage - r154758: Fix CMake builds - r154759: Fix build warnings in release builds - Please incorporate these fixes and review feedback before resubmitting. Primary commit r154829: Hexagon V5 (floating point) support. - Primarily reverted due to merge conflicts. - Follow-up commits that should be folded back into this when reposting: - r154841: Remove unused variable (fixing build warnings) There are also accompanying Clang commits that will be reverted for consistency. llvm-svn: 155047	2012-04-18 21:31:19 +00:00
Pete Cooper	8998657c64	LiveIntervalUpdate validators weren't recorded after the calls to std::for_each. Turns out std::for_each doesn't update the variable passed in for the functor but instead copy constructs a new one. llvm-svn: 155041	2012-04-18 20:29:17 +00:00
Joel Jones	828531f798	Fixes a problem in instruction selection with testing whether or not the transformation: (X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2 should be done. This change has been tested: Using a debug+asserts build: on the specific test case that brought this bug to light make check-all lnt nt using this clang to build a release version of clang Using the release+asserts clang-with-clang build: on the specific test case that brought this bug to light make check-all lnt nt Checking in because Evan wants it checked in. Test case forthcoming after scrubbing. llvm-svn: 154955	2012-04-17 22:23:10 +00:00
Lang Hames	aef9178301	SlotIndexes used to store the index list in a crufty custom linked-list. I can't for the life of me remember why I wrote it this way, but I can't see any good reason for it now. This patch replaces the custom linked list with an ilist. This change should preserve the existing numberings exactly, so no generated code should change (if it does, file a bug!). llvm-svn: 154904	2012-04-17 04:15:51 +00:00
Eric Christopher	a8caa739de	Make comment here more clear. llvm-svn: 154878	2012-04-16 23:54:23 +00:00
Chandler Carruth	1f5580b6f3	Fix updateTerminator to be resilient to degenerate terminators where both fallthrough and a conditional branch target the same successor. Gracefully delete the conditional branch and introduce any unconditional branch needed to reach the actual successor. This fixes memory corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests. Also, while I'm here fix a latent bug I spotted by inspection. I never applied the same fundamental fix to this fallthrough successor finding logic that I did to the logic used when there are no conditional branches. As a consequence it would have selected landing pads had they been aligned in just the right way here. I don't have a test case as I spotted this by inspection, and the previous time I found this required half of TableGen's source code to produce it. =/ I hate backend bugs. ;] Thanks to Jim Grosbach for helping me reason through this and reviewing the fix. llvm-svn: 154867	2012-04-16 22:03:00 +00:00
Chandler Carruth	4190b507c5	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. llvm-svn: 154816	2012-04-16 13:49:17 +00:00
Chandler Carruth	8c0b41d656	Add a somewhat hacky heuristic to do something different from whole-loop rotation. When there is a loop backedge which is an unconditional branch, we will end up with a branch somewhere no matter what. Try placing this backedge in a fallthrough position above the loop header as that will definitely remove at least one branch from the loop iteration, where whole loop rotation may not. I haven't seen any benchmarks where this is important but loop-blocks.ll tests for it, and so this will be covered when I flip the default. llvm-svn: 154812	2012-04-16 13:33:36 +00:00
Chandler Carruth	8c74c7b1c6	Tweak the loop rotation logic to check whether the loop is naturally laid out in a form with a fallthrough into the header and a fallthrough out of the bottom. In that case, leave the loop alone because any rotation will introduce unnecessary branches. If either side looks like it will require an explicit branch, then the rotation won't add any, do it to ensure the branch occurs outside of the loop (if possible) and maximize the benefit of the fallthrough in the bottom. llvm-svn: 154806	2012-04-16 09:31:23 +00:00
Hal Finkel	e0cf6397fd	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Chandler Carruth	ccc7e42b1f	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to reach its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished. No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exiting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. llvm-svn: 154783	2012-04-16 01:12:56 +00:00
Nadav Rotem	02ef0c3524	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. llvm-svn: 154764	2012-04-15 15:08:09 +00:00
Andrew Trick	97d5b9cca6	misched: Added CanHandleTerminators. This is a special flag for targets that really want their block terminators in the DAG. The default scheduler cannot handle this correctly, so it becomes the specialized scheduler's responsibility to schedule terminators. llvm-svn: 154712	2012-04-13 23:29:54 +00:00
Benjamin Kramer	330970d658	Reduce malloc traffic in DwarfAccelTable - Don't copy offsets into HashData, the underlying vector won't change once the table is finalized. - Allocate HashData and HashDataContents in a BumpPtrAllocator. - Allocate string map entries in the same allocator. - Random cleanups. llvm-svn: 154694	2012-04-13 20:06:17 +00:00
Sirish Pande	b486144c12	HexagonPacketizer patch. llvm-svn: 154616	2012-04-12 21:06:38 +00:00
Nadav Rotem	9d376b6578	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Craig Topper	692d584910	Fix an overly indented line. Remove an 'else' after an 'if' that returns. llvm-svn: 154479	2012-04-11 04:55:51 +00:00
Craig Topper	bc680061e8	Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to an nodetype. llvm-svn: 154478	2012-04-11 04:34:11 +00:00
Craig Topper	3ef01cdb2e	Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit. llvm-svn: 154473	2012-04-11 03:06:35 +00:00
Jakob Stoklund Olesen	645bdd4b69	Tweak MachineLICM heuristics for cheap instructions. Allow cheap instructions to be hoisted if they are register pressure neutral or better. This happens if the instruction is the last loop use of another virtual register. Only expensive instructions are allowed to increase loop register pressure. llvm-svn: 154455	2012-04-11 00:00:28 +00:00
Jakob Stoklund Olesen	a3e86a604a	Only check for PHI uses inside the current loop. Hoisting a value that is used by a PHI in the loop will introduce a copy because the live range is extended to cross the PHI. The same applies to PHIs in exit blocks. Also use this opportunity to make HasLoopPHIUse() non-recursive. llvm-svn: 154454	2012-04-11 00:00:26 +00:00
Owen Anderson	6f1ee1634d	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Eric Christopher	e9abba71fe	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Owen Anderson	3efc8f22bd	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Nadav Rotem	065564d85a	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Chandler Carruth	68062617a6	Make a somewhat subtle change in the logic of block placement. Sometimes the loop header has a non-loop predecessor which has been pre-fused into its chain due to unanalyzable branches. In this case, rotating the header into the body of the loop in order to place a loop exit at the bottom of the loop is a Very Bad Idea as it makes the loop non-contiguous. I'm working on a good test case for this, but it's a bit annoying to craft. I should get one shortly, but I'm submitting this now so I can begin the (lengthy) performance analysis process. An initial run of LNT looks really, really good, but there is too much noise there for me to trust it much. llvm-svn: 154395	2012-04-10 13:35:57 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (worked around) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Evan Cheng	136861d994	Make the code slightly more palatable. llvm-svn: 154378	2012-04-10 03:15:18 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted, the legalizer always uses the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Akira Hatanaka	8483a6c47d	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341	2012-04-09 20:32:12 +00:00
Lang Hames	3ad11ff90f	Patch r153892 for PR11861 apparently broke an external project (see PR12493). This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when rescheduling instructions in TryInstructionTransform. Hopefully this will fix PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after the copy that unties the operands is emitted (this seems to be a more appropriate fix for that issue anyway). llvm-svn: 154338	2012-04-09 20:17:30 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Craig Topper	9c3da316ec	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	e5893f64e8	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	e3ad4834ae	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	5894fe430a	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Craig Topper	6148fe65e8	Optimize code a bit. No functional change intended. llvm-svn: 154299	2012-04-08 23:15:04 +00:00
Benjamin Kramer	bb6ff08766	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Craig Topper	c8e2d91a58	Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently. llvm-svn: 154295	2012-04-08 17:53:33 +00:00
Chandler Carruth	16f0ebcbb5	Move the TLSModel information into the TargetMachine rather than hiding in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. llvm-svn: 154292	2012-04-08 17:20:55 +00:00
Chandler Carruth	bed1abf9ca	Remove an overzealous assert. The assert was trying to catch places where a chain outside of the loop block-set ended up in the worklist for scheduling as part of the contiguous loop. However, asserting the first block in the chain is in the loop-set isn't a valid check -- we may be forced to drag a chain into the worklist due to one block in the chain being part of the loop even though the first block is not in the loop. This occurs when we have been forced to form a chain early due to un-analyzable branches. No test case here as I have no idea how to even begin reducing one, and it will be hopelessly fragile. We have to somehow end up with a loop header of an inner loop which is a successor of a basic block with an unanalyzable pair of branch instructions. Ow. Self-host triggers it so it is unlikely it will regress. This at least gets block placement back to passing selfhost and the test suite. There are still a lot of slowdowns that I don't like coming out of block placement, although there are now also a lot of speedups. =[ I'm seeing swings in both directions up to 10%. I'm going to try to find time to dig into this and see if we can turn this on for 3.1 as it does a really good job of cleaning up after some loops that degraded with the inliner changes. llvm-svn: 154287	2012-04-08 14:37:02 +00:00
Chandler Carruth	49158908dc	Add a debug-only 'dump' method to the BlockChain structure to ease debugging. llvm-svn: 154286	2012-04-08 14:37:01 +00:00
Craig Topper	d024cef233	Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1. llvm-svn: 154272	2012-04-07 22:32:29 +00:00
Craig Topper	e09d1c5c48	Remove 'else' after 'if' that ends in return. llvm-svn: 154267	2012-04-07 21:23:41 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Eric Christopher	aec8a82694	Patch to set is_stmt a little better for prologue lines in a function. This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 llvm-svn: 154120	2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen	37492eac8c	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
Owen Anderson	a6eebf6013	Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection. llvm-svn: 154113	2012-04-05 18:50:32 +00:00
Pete Cooper	d7290700e6	REG_SEQUENCE expansion to COPY instructions wasn't taking account of sub register indices on the source registers. No simple test case llvm-svn: 154051	2012-04-04 21:03:25 +00:00
Pete Cooper	8a3dc0ed8c	f16 FREM can now be legalized by promoting to f32 llvm-svn: 154039	2012-04-04 19:36:31 +00:00
Jakob Stoklund Olesen	92fd79a639	Remove spurious debug output. llvm-svn: 154032	2012-04-04 18:23:38 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Craig Topper	4c7d995029	Remove default case from switch that was already covering all cases. llvm-svn: 153996	2012-04-04 04:42:42 +00:00
Pete Cooper	e7bff68a5e	Removed useless switch for default case when switch was covering all the enum values llvm-svn: 153984	2012-04-04 00:53:04 +00:00
Pete Cooper	9511ec86f9	Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1-element VSELECT. The solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> llvm-svn: 153976	2012-04-03 22:57:55 +00:00
Pete Cooper	b98934cf72	Removed one last bad continue statement meant to be removed in r153914. llvm-svn: 153975	2012-04-03 22:18:49 +00:00
Chad Rosier	2a02fe1bb2	Fix an issue in SimplifySetCC() specific to vector comparisons. When folding X == X we need to check getBooleanContents() to determine if the result is a vector of ones or a vector of negative ones. I tried creating a test case, but the problem seems to only be exposed on a much older version of clang (around r144500). rdar://10923049 llvm-svn: 153966	2012-04-03 20:11:24 +00:00
Eric Christopher	b81e2b403c	Fix thinko check for number of operands to be the one that actually might have more than 19 operands. Add a testcase to make sure I never screw that up again. Part of rdar://11026482 llvm-svn: 153961	2012-04-03 17:55:42 +00:00
Eric Christopher	34164196af	Add a line number for the scope of the function (starting at the first brace) so that we get more accurate line number information about the declaration of a given function and the line where the function first starts. Part of rdar://11026482 llvm-svn: 153916	2012-04-03 00:43:49 +00:00
Pete Cooper	4f0dbb27d9	Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created llvm-svn: 153914	2012-04-03 00:28:46 +00:00
Pete Cooper	3ca96f9950	Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen llvm-svn: 153906	2012-04-02 22:44:18 +00:00
Jakob Stoklund Olesen	291007b055	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Pete Cooper	2bde2f42b1	Refactored the LiveRangeEdit interface so that MachineFunction, TargetInstrInfo, MachineRegisterInfo, LiveIntervals, and VirtRegMap are all passed into the constructor and stored as members instead of passed in to each method. llvm-svn: 153903	2012-04-02 22:22:53 +00:00
Owen Anderson	98f2c0c384	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Lang Hames	aaafacd07e	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Eric Christopher	ad9fe8955a	Turn on the accelerator tables for Darwin. llvm-svn: 153880	2012-04-02 17:58:52 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Craig Topper	54bfde79db	Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo. llvm-svn: 153860	2012-04-02 06:09:36 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Lang Hames	652f21274f	Fix typo. llvm-svn: 153846	2012-04-01 19:27:25 +00:00
Andrew Trick	779b32a44e	misched: Add finalizeScheduler to complete the target interface. llvm-svn: 153827	2012-04-01 07:24:23 +00:00
Rafael Espindola	80c540e656	Teach CodeGen's version of computeMaskedBits to understand the range metadata. This is the CodeGen equivalent of r153747. I tested that there is no noticeable performance difference with any combination of -O0/-O2/-g when compiling gcc as a single compilation unit. llvm-svn: 153817	2012-03-31 18:14:00 +00:00
Bill Wendling	9f829f1cc4	If we have a VLA that has a "use" in a metadata node that's then used here but it has no other uses, then we have a problem. E.g., int foo (const int x) { char a[x]; return 0; } If we assign 'a' a vreg and fast isel later on has to use the selection DAG isel, it will want to copy the value to the vreg. However, there are no uses, which goes counter to what selection DAG isel expects. <rdar://problem/11134152> llvm-svn: 153705	2012-03-30 00:02:55 +00:00
Eric Christopher	70e1bd8872	Add support for objc property decls according to the page at: http://llvm.org/docs/SourceLevelDebugging.html#objcproperty including type and DECL. Expand the metadata needed accordingly. rdar://11144023 llvm-svn: 153639	2012-03-29 08:42:56 +00:00
Jakob Stoklund Olesen	c3e80cc885	Enable machine code verification in the entire code generator. Some targets still mess up the liveness information, but that isn't verified after MRI->invalidateLiveness(). The verifier can still check other useful things like register classes and CFG, so it should be enabled after all passes. llvm-svn: 153615	2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen	d1bd8fba13	Enable machine code verification after PreSched2 passes. The late scheduler depends on accurate liveness information if it is breaking anti-dependencies, so we should be able to verify it. Relax the terminator checking in the machine code verifier so it can handle the basic blocks created by if conversion. llvm-svn: 153614	2012-03-28 23:31:15 +00:00
Jakob Stoklund Olesen	e433c68d7c	Also verify after ExpandPostRAPseudos. llvm-svn: 153599	2012-03-28 20:49:30 +00:00
Jakob Stoklund Olesen	341e06f8d5	Enable machine code verification after the late machine optimization passes. Branch folding invalidates liveness and disables liveness verification on some targets. llvm-svn: 153597	2012-03-28 20:47:37 +00:00
Jakob Stoklund Olesen	b21df32cf5	Skip liveness verification when MRI->tracksLiveness() is false. Extract the liveness verification into its own method. This makes it possible to run the machine code verifier after liveness information is no longer required to be valid. llvm-svn: 153596	2012-03-28 20:47:35 +00:00
Jakob Stoklund Olesen	8e58c90f51	Allow removeLiveIn to be called with a register that isn't live-in. This avoids the silly double search: if (isLiveIn(Reg)) removeLiveIn(Reg); llvm-svn: 153592	2012-03-28 20:11:42 +00:00
Pete Cooper	148ebb8802	Fixed a commuteInstructions bug where, if it's called pre-regalloc, the subreg indices weren't commuted. llvm-svn: 153579	2012-03-28 17:02:22 +00:00
Eric Christopher	24a6298512	More debug output. llvm-svn: 153571	2012-03-28 07:34:36 +00:00
Eric Christopher	7285c7d51d	Fix the output of the DW_TAG_friend tag to include DW_AT_friend and not the rest of the member tag. Fixes PR11695 llvm-svn: 153570	2012-03-28 07:34:31 +00:00
Lang Hames	5544bf1b8a	Use a SmallVector and linear lookup instead of a DenseSet - SourceMap values will always be tiny sets, so DenseSet is overkill (SmallSet won't work as we need iteration support). llvm-svn: 153529	2012-03-27 19:10:45 +00:00
Eric Christopher	7ed2efca6a	Use DW_AT_low_pc for a single entry point into a routine. Fixes PR10105 llvm-svn: 153524	2012-03-27 18:35:54 +00:00
Jakob Stoklund Olesen	6c08534aff	Print SSA and liveness tracking flags in MF::print(). llvm-svn: 153518	2012-03-27 17:17:16 +00:00
Jakob Stoklund Olesen	d1664a1571	Branch folding may invalidate liveness. Branch folding can use a register scavenger to update liveness information when required. Don't do that if liveness information is already invalid. llvm-svn: 153517	2012-03-27 17:06:09 +00:00
Chris Lattner	1cc25e8a40	fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) llvm-svn: 153513	2012-03-27 16:27:21 +00:00
Jakob Stoklund Olesen	9c1ad5cb7d	Add an MRI::tracksLiveness() flag. Late optimization passes like branch folding and tail duplication can transform the machine code in a way that makes it expensive to keep the register liveness information up to date. There is a fuzzy line between register allocation and late scheduling where the liveness information degrades. The MRI::tracksLiveness() flag makes the line clear: While true, liveness information is accurate, and can be used for register scavenging. Once the flag is false, liveness information is not accurate, and can only be used as a hint. Late passes generally don't need the liveness information, but they will sometimes use the register scavenger to help update it. The scavenger enforces strict correctness, and we have to spend a lot of code to update register liveness that may never be used. llvm-svn: 153511	2012-03-27 15:13:58 +00:00
Evan Cheng	7fede87349	Post-ra LICM should take care not to hoist an instruction that would clobber a register that's read by the preheader terminator. rdar://11095580 llvm-svn: 153492	2012-03-27 01:50:58 +00:00
Lang Hames	551662bf5d	During MachineCopyPropagation a register may be the source operand of multiple copies being considered for removal. Make sure to track all of the copies, rather than just the most recent encountered, by holding a DenseSet instead of an unsigned in SrcMap. No test case - couldn't reduce something with a sane size. llvm-svn: 153487	2012-03-27 00:44:47 +00:00
Lang Hames	95e021faf5	Add a debug option to dump PBQP graphs during register allocation. llvm-svn: 153483	2012-03-26 23:07:23 +00:00
Eric Christopher	0925c62c74	Use the file in the inlined die rather than the compile unit for backtrace locations. Testcase forthcoming, but I wanted to get some testing here. Should fix: PR12323 PR12314 rdar://11091100 llvm-svn: 153471	2012-03-26 21:38:38 +00:00
Benjamin Kramer	3e6719c133	No need to do an expensive stable sort for a bunch of integers. llvm-svn: 153438	2012-03-26 14:17:26 +00:00
Craig Topper	6e80c28017	Prune some includes and forward declarations. llvm-svn: 153429	2012-03-26 06:58:25 +00:00
Eric Christopher	c1e2dcdb8a	Add a debug statement. llvm-svn: 153428	2012-03-26 06:10:32 +00:00
Hal Finkel	71c2ba3d2e	Add the ability to promote legal integer VAARGs. This is required for the PPC64 SVR4 ABI. llvm-svn: 153372	2012-03-24 03:53:52 +00:00
Jim Grosbach	4a2909ab0f	Pretty-printing comments for literal floating point in .s files. Dump the hex representation to the comment stream as well as the float value. llvm-svn: 153346	2012-03-23 23:06:47 +00:00
Lang Hames	45c6d21ae1	Add support for register masks to PBQP. llvm-svn: 153341	2012-03-23 17:33:42 +00:00
Evan Cheng	8ab58a21a5	Source order scheduler should not preschedule nodes with multiple uses. rdar://11096639 llvm-svn: 153270	2012-03-22 19:31:17 +00:00
Evan Cheng	79f03e915d	Assign node orders to target intrinsics which do not produce results. rdar://11096639 llvm-svn: 153269	2012-03-22 19:29:09 +00:00
Eric Christopher	12da169839	In erroneous inline assembly we could mistakenly try to access the metadata operand as an actual operand, leading to an assert. Error out in this case. rdar://11007633 llvm-svn: 153234	2012-03-22 01:33:51 +00:00
Chad Rosier	6a63a74113	[fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271% execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester. This may also improve compile-time on architectures that would otherwise generate a libcall for urem (e.g., ARM) or fall back to the DAG selector. rdar://10810716 llvm-svn: 153230	2012-03-22 00:21:17 +00:00
Jim Grosbach	e13adc38d0	Checking a build_vector for an all-ones value. Type legalization can zero-extend the elements of the build_vector node, so, for example, we may have an <8 x i8> with i32 elements of value 255. That should return 'true' for the vector being all ones. llvm-svn: 153203	2012-03-21 17:48:04 +00:00
Andrew Trick	25baeca54d	misched: fix LiveInterval update for bottom-up scheduling llvm-svn: 153162	2012-03-21 04:12:16 +00:00
Andrew Trick	adb03b91ee	misched: trace LiveIntervals after scheduling. llvm-svn: 153161	2012-03-21 04:12:12 +00:00
Andrew Trick	54f7def703	misched: obvious iterator update fixes for bottom-up. llvm-svn: 153160	2012-03-21 04:12:10 +00:00
Andrew Trick	de670c0304	misched: cleanup main loop llvm-svn: 153159	2012-03-21 04:12:07 +00:00
Andrew Trick	3bfafcba10	misched: fix LI update for bottom-up. llvm-svn: 153158	2012-03-21 04:12:01 +00:00
Bill Wendling	7315c4b9cd	It's possible to have a constant expression whose size is quite big (e.g., i128). In that case, we may not be able to print out the MCExpr as an expression. For instance, we could have an MCExpr like this: 0xBEEF0000BEEF0000 \| (0xBEEF0000BEEF0000 << 64) The MCExpr printer handles sizes up to 64 bits, but this expression would require 128 bits. In this situation, try to evaluate the constant expression and emit that as the value into 64-bit chunks. <rdar://problem/11070338> llvm-svn: 153081	2012-03-20 08:56:43 +00:00
Craig Topper	aaeae98936	When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend. llvm-svn: 153078	2012-03-20 05:28:39 +00:00
Eric Christopher	60e01c560a	Do everything up to generating code to try to get a register for a variable. The previous code would break the debug info changing code invariant. This will regress debug info for arguments where we elide the alloca created. Fixes rdar://11066468 llvm-svn: 153074	2012-03-20 01:07:58 +00:00
Eric Christopher	997aaa9237	Untabify. llvm-svn: 153073	2012-03-20 01:07:56 +00:00
Eric Christopher	e5e54c87fa	Add another debugging statement here. llvm-svn: 153072	2012-03-20 01:07:53 +00:00
Eric Christopher	1a06cc9ae6	Use lookUpRegForValue here instead of duplicating the code. llvm-svn: 153071	2012-03-20 01:07:47 +00:00
Pete Cooper	e69be6df4f	f16 FDIV can now be legalized by promoting to f32 llvm-svn: 153064	2012-03-19 23:38:12 +00:00
Lang Hames	dd98c497b9	Add an option to the MI scheduler to cut off scheduling after a fixed number of instructions have been scheduled. Handy for tracking down scheduler bugs, or bugs exposed by scheduling. llvm-svn: 153045	2012-03-19 18:38:38 +00:00
Duncan Sands	3fb2fc6edb	Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala. llvm-svn: 153035	2012-03-19 15:35:44 +00:00
Benjamin Kramer	5d1bca8016	CriticalAntiDepBreaker: Replace a SmallSet of regs with a much denser BitVector. llvm-svn: 152999	2012-03-17 20:22:57 +00:00
Benjamin Kramer	97f889f43b	MachineInstr: Inline the fast path (non-bundle instruction) of hasProperty. This is particularly helpful as both arguments tend to be constants. llvm-svn: 152991	2012-03-17 17:03:45 +00:00
Benjamin Kramer	411d5a2026	ScheduleDAGInstrs: When adding uses we add them into a set that's empty at the beginning, no need to maintain another set for the added regs. llvm-svn: 152934	2012-03-16 17:38:19 +00:00
Benjamin Kramer	d03878bdf2	Limit the number of memory operands in MachineInstr to 2^16 and store the number in padding. Saves one machine word on MachineInstr (88->80 bytes on x86_64, 48->44 on i386). llvm-svn: 152930	2012-03-16 16:39:27 +00:00
Benjamin Kramer	8e5af375db	CriticalAntiDepBreaker: BasicBlock::size is an expensive operation, reuse the cached value. No functionality change. llvm-svn: 152927	2012-03-16 15:46:47 +00:00
Andrew Trick	e6913c7245	misched: add DAG edges from vreg defs to ExitSU. These edges are not really necessary, but it is consistent with the way we currently create physreg edges. Scheduler heuristics that expect a DAG edge to the block terminator could benefit from this change. Although in the future I hope we have a better mechanism for modeling latency across scheduling regions. llvm-svn: 152895	2012-03-16 05:04:25 +00:00
Chad Rosier	1a9c17efad	Revert r152705, which reapplied r152486 as this appears to be causing failures on our internal nightly testers. So, basically revert r152486 again. Abbreviated original commit message: Implement a more intelligent way of spilling uses across an invoke boundary. It looks as if Chandler's inlining work, r152737, exposed an issue. llvm-svn: 152887	2012-03-16 01:04:00 +00:00
NAKAMURA Takumi	a7e57ace28	Revert r152613 (and r152614), "Inline the d'tor and add an anchor instead." for workaround of g++-4.4's miscompilation. It caused MSP430DAGToDAGISel::SelectIndexedBinOp() to be miscompiled. When two ReplaceUses()'s are expanded as inline, vtable in base class is stored to latter (ISelUpdater)ISU. llvm-svn: 152877	2012-03-16 00:01:55 +00:00
Eric Christopher	7734ca2891	For types with a parent of the compile unit make sure and emit the DECL information. rdar://10855921 llvm-svn: 152876	2012-03-15 23:55:40 +00:00
Eric Christopher	3390a6e5e3	We actually handle AllocaInst via getRegForValue below just fine. Part of rdar://8905263 llvm-svn: 152845	2012-03-15 21:33:47 +00:00
Eric Christopher	142820ba8d	Add some debugging output into fast isel as well. llvm-svn: 152844	2012-03-15 21:33:44 +00:00
Eric Christopher	be7a1016fc	Add another debug statement. llvm-svn: 152843	2012-03-15 21:33:41 +00:00
Eric Christopher	6a0c679762	Tabs. llvm-svn: 152842	2012-03-15 21:33:39 +00:00
Eric Christopher	be153e6610	Typo. llvm-svn: 152841	2012-03-15 21:33:35 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Eric Christopher	7dd54fb695	Revert the removal of DW_AT_MIPS_linkage_name when we aren't putting out the DW_AT_name. Older gdbs unfortunately still use it to disambiguate member functions in templated classes (gdb.cp/templates.exp). rdar://11043421 (which is now deferred for a bit) llvm-svn: 152782	2012-03-15 08:19:33 +00:00
Bill Wendling	df170db2f6	Add a xform to the DAG combiner. Transform: (fsub x, (fadd x, y)) -> (fneg y) and (fsub x, (fadd y, x)) -> (fneg y) if 'unsafe math' is specified. <rdar://problem/7540295> llvm-svn: 152777	2012-03-15 05:12:00 +00:00
Benjamin Kramer	05e7a843aa	Silence operator precedence warnings. llvm-svn: 152711	2012-03-14 11:26:37 +00:00
Bill Wendling	d7c0aae45b	Reapply r152486 with a fix for the nightly testers. There were cases where a value could be used and it's both crossing an invoke and NOT crossing an invoke. This could happen in the landing pads. In that case, we will demote the value to the stack like we did before. <rdar://problem/10609139> llvm-svn: 152705	2012-03-14 07:28:01 +00:00
Bill Wendling	618d57310a	Insert the debugging instructions in one fell swoop so that it doesn't call the expensive "getFirstTerminator" call. This reduces the time of compilation in PR12258 from >10 minutes to < 10 seconds. llvm-svn: 152704	2012-03-14 07:14:25 +00:00
Andrew Trick	8823decdd4	misched: implemented a framework for top-down or bottom-up scheduling. New flags: -misched-topdown, -misched-bottomup. They can be used with the default scheduler or with -misched=shuffle. Without either topdown/bottomup flag -misched=shuffle now alternates scheduling direction. LiveIntervals update is unimplemented with bottom-up scheduling, so only -misched-topdown currently works. Capped the ScheduleDAG hierarchy with a concrete ScheduleDAGMI class. ScheduleDAGMI is aware of the top and bottom of the unscheduled zone within the current region. Scheduling policy can be plugged into the ScheduleDAGMI driver by implementing MachineSchedStrategy. ConvergingScheduler is now the default scheduling algorithm. It exercises the new driver but still does no reordering. llvm-svn: 152700	2012-03-14 04:00:41 +00:00

13624 Commits