llvm-project

Commit Graph

Author	SHA1	Message	Date
Kevin Enderby	d2980cd041	Fix ARM disassembly of VLD instructions with writebacks. And add a test case for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp. llvm-svn: 154459	2012-04-11 00:25:40 +00:00
Jim Grosbach	ad66de155b	ARM add missing Thumb1 two-operand aliases for shift-by-immediate. rdar://11222742 llvm-svn: 154457	2012-04-11 00:15:16 +00:00
Evan Cheng	aca6c822e6	Fix a number of problems with ARM fused multiply add/subtract instructions. 1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 llvm-svn: 154456	2012-04-11 00:13:00 +00:00
Jakob Stoklund Olesen	0bcf8f4bfb	Fix test to be register assignment invariant. llvm-svn: 154453	2012-04-11 00:00:24 +00:00
Owen Anderson	6f1ee1634d	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Kostya Serebryany	5ba61ac651	[tsan] two more compile-time optimizations: - don't instrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't instrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444	2012-04-10 22:29:17 +00:00
Evan Cheng	d0007f3c83	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Eric Christopher	65ada95b84	Temporarily revert this patch to see if it brings the buildbots back. llvm-svn: 154425	2012-04-10 19:33:16 +00:00
Kostya Serebryany	bf2de80be6	[tsan] compile-time instrumentation: do not instrument a read if a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418	2012-04-10 18:18:56 +00:00
Eric Christopher	e9abba71fe	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Jim Grosbach	df5a244797	ARM fix cc_out operand handling for t2SUBrr instructions. We were incorrectly conflating some add variants which don't have a cc_out operand with the mirroring sub encodings, which do. Part of the awesome non-orthogonality legacy of thumb1. Similarly, handling of add/sub of an immediate was sometimes incorrectly removing the cc_out operand for add/sub register variants. rdar://11216577 llvm-svn: 154411	2012-04-10 17:31:55 +00:00
Nadav Rotem	f934f91709	Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396	2012-04-10 14:33:13 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Duncan Sands	af06b26c8e	Express the number of ULPs in fpaccuracy metadata as a real rather than a rational number, eg as 2.5 rather than 5, 2. OK'd by Peter Collingbourne. llvm-svn: 154387	2012-04-10 08:22:43 +00:00
Andrew Trick	4442bfe559	Fix 12513: Loop unrolling breaks with indirect branches. Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386	2012-04-10 05:14:42 +00:00
Evan Cheng	0752624970	Add proper checks. llvm-svn: 154379	2012-04-10 03:15:42 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted, the legalizer always uses the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Lang Hames	ec96cd0690	Test case for PR12495. llvm-svn: 154359	2012-04-09 23:58:59 +00:00
Akira Hatanaka	8483a6c47d	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341	2012-04-09 20:32:12 +00:00
Chad Rosier	e0e38f61a5	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather than a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Nadav Rotem	fb7e2ae53c	Lower some x86 shuffle sequences to the vblend family of instructions. llvm-svn: 154313	2012-04-09 08:33:21 +00:00
Nadav Rotem	b801ca3976	Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310	2012-04-09 07:45:58 +00:00
Chandler Carruth	3779ac10b4	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Chandler Carruth	84b834267e	Fold 15 tiny test cases into a single file that implements the comprehensive testing of TLS codegen for x86. Convert all of the ones that were still using grep to use FileCheck. Remove some redundancies between them. Perhaps most interestingly expand the test cases so that they actually fully list the instruction snippet being tested. TLS operations are very narrowly defined, and so these seem reasonably stable. More importantly, the existing test cases already were crazy fine grained, expecting specific registers to be allocated. This just clarifies that no other instructions are expected, and fills in some crucial gaps that weren't being tested at all. This will make any subsequent changes to TLS much more clear during review. llvm-svn: 154303	2012-04-09 01:43:17 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Chandler Carruth	ede4a8aa2b	Teach LLVM about a PIE option which, when enabled on top of PIC, makes optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, it's just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. llvm-svn: 154294	2012-04-08 17:51:45 +00:00
Chandler Carruth	f82b0e2d29	Teach InstCombine to nuke a common alloca pattern -- an alloca which has GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285	2012-04-08 14:36:56 +00:00
Nadav Rotem	82609df647	AVX2: Build splat vectors by broadcasting a scalar from the constant pool. Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. llvm-svn: 154284	2012-04-08 12:54:54 +00:00
Bill Wendling	8c783d4122	Remove old 'grep' lines. llvm-svn: 154283	2012-04-08 11:53:54 +00:00
Bill Wendling	57f8e5ebe4	FileCheckize these testcases. llvm-svn: 154281	2012-04-08 11:00:38 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Chandler Carruth	28192c9398	Fix ValueTracking to conclude that debug intrinsics are safe to speculate. Without this, loop rotate (among many other places) would suddenly stop working in the presence of debug info. I found this looking at loop rotate, and have augmented its tests with a reduction out of a very hot loop in yacr2 where failing to do this rotation costs sometimes more than 10% in runtime performance, perturbing numerous downstream optimizations. This should have no impact on performance without debug info, but the change in performance when debug info is enabled can be extreme. As a consequence (and this is how I got to this yak) any profiling of performance problems should be treated with deep suspicion -- they may have been wildly inaccurate if debug info was enabled for profiling. =/ Just a heads up. llvm-svn: 154263	2012-04-07 19:22:18 +00:00
Benjamin Kramer	e1f4ca1b0f	SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW. Found by inspection. llvm-svn: 154262	2012-04-07 17:19:26 +00:00
Alexis Hunt	78fce432b7	Make the test for r154235 more platform-independent with a shorter string. llvm-svn: 154243	2012-04-07 01:33:14 +00:00
Alexis Hunt	0235f684f0	Output UTF-8-encoded characters as identifier characters into assembly by default. This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching to default seems reasonable. I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere. llvm-svn: 154235	2012-04-07 00:37:53 +00:00
Akira Hatanaka	487e56763d	Add lines in global-address.ll to test N32 and N64 code generation. llvm-svn: 154202	2012-04-06 20:23:36 +00:00
Jakob Stoklund Olesen	967b86a0a2	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
Chandler Carruth	49da93396e	Sink the collection of return instructions until after all simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. llvm-svn: 154179	2012-04-06 17:21:31 +00:00
Chandler Carruth	e547fefcb7	Tweak this test to ensure the inliner did indeed fire. Thanks to Richard Smith for pointing this out in review. llvm-svn: 154178	2012-04-06 17:21:28 +00:00
Craig Topper	bdc9f071a4	Test case for PR12413 llvm-svn: 154172	2012-04-06 14:38:25 +00:00
Craig Topper	447417c932	Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413. llvm-svn: 154166	2012-04-06 07:45:23 +00:00
Craig Topper	4eb9616b24	Add the tests that were supposed to go with r153935 that I forgot to svn add. llvm-svn: 154165	2012-04-06 07:09:59 +00:00
Chandler Carruth	17e335888c	Actually finish this sentence in the comment the way I intended. Thanks Matt for pointing this out. llvm-svn: 154158	2012-04-06 01:19:38 +00:00
Chandler Carruth	e41f6f4189	Sink the return instruction collection until after we're done deleting dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. llvm-svn: 154157	2012-04-06 01:11:52 +00:00
Jim Grosbach	930f2f66e7	ARM assembly aliases for add negative immediates using sub. 'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out. Thumb1 aliases for adding a negative immediate to the stack pointer, also. rdar://11192734 llvm-svn: 154123	2012-04-05 20:57:13 +00:00
Akira Hatanaka	43fb2b2cea	Reapply test case in 154038, this time with triple to prevent the backend from emitting gp_rel relocation. llvm-svn: 154122	2012-04-05 20:44:35 +00:00
Eric Christopher	aec8a82694	Patch to set is_stmt a little better for prologue lines in a function. This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 llvm-svn: 154120	2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen	37492eac8c	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
Dan Gohman	cc64bbca81	Fix accidentally inverted logic from r152803, and make the testcase slightly less trivial. This fixes rdar://11171718. llvm-svn: 154118	2012-04-05 20:27:21 +00:00
Silviu Baranga	af3c79f0ac	Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions. llvm-svn: 154101	2012-04-05 16:19:29 +00:00
Silviu Baranga	d365397daa	Added support for handling unpredictable arithmetic instructions on ARM. llvm-svn: 154100	2012-04-05 16:13:15 +00:00
James Molloy	1ea6473688	An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases. Noticed while investigating PR12274! llvm-svn: 154090	2012-04-05 10:01:12 +00:00
Jim Grosbach	15c6884a4b	ARM assembly aliases for two-operand V[R]SHR instructions. rdar://11189467 llvm-svn: 154087	2012-04-05 07:23:53 +00:00
Jim Grosbach	3d00eecc53	ARM assembly parsing for 'msr' plain 'cpsr' operand. Plain 'cpsr' is an alias for 'cpsr_fc'. rdar://11153753 llvm-svn: 154080	2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen	f2390e8303	Pass the right sign to TLI->isLegalICmpImmediate. LSR can fold three addressing modes into its ICmpZero node: ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset ICmpZero -1ScaleReg + Offset => ICmp ScaleReg, Offset ICmpZero BaseReg + -1ScaleReg => ICmp BaseReg, ScaleReg The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric. <rdar://problem/11184260> llvm-svn: 154079	2012-04-05 03:10:56 +00:00
Akira Hatanaka	121342fcc2	Reapply 154038 without the failing test. llvm-svn: 154062	2012-04-04 22:16:36 +00:00
Owen Anderson	4743c6e159	Revert r154038. It was causing make check failures. llvm-svn: 154054	2012-04-04 21:18:58 +00:00
Akira Hatanaka	9705c865d9	Fix LowerGlobalAddress to produce instructions with the correct relocation types for N32 ABI. Add new test case and update existing ones. llvm-svn: 154038	2012-04-04 19:02:38 +00:00
Akira Hatanaka	b3a2b8c199	Fix LowerConstantPool to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154034	2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen	0a5b72f0e4	Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr. A MOVCCr instruction can be commuted by inverting the condition. This can help reduce register pressure and remove unnecessary copies in some cases. <rdar://problem/11182914> llvm-svn: 154033	2012-04-04 18:23:42 +00:00
Akira Hatanaka	aeff24e424	Fix LowerBlockAddress to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154031	2012-04-04 18:22:53 +00:00
Hongbin Zheng	e1fd20172b	Add testcase for r154007, when a function has the optsize attribute, the loop should be unrolled according to the value of OptSizeUnrollThreshold. llvm-svn: 154014	2012-04-04 13:24:40 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Michael J. Spencer	22120c47a7	Add YAML parser to Support. llvm-svn: 153977	2012-04-03 23:09:22 +00:00
Pete Cooper	9511ec86f9	Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> llvm-svn: 153976	2012-04-03 22:57:55 +00:00
Eric Christopher	b81e2b403c	Fix thinko check for number of operands to be the one that actually might have more than 19 operands. Add a testcase to make sure I never screw that up again. Part of rdar://11026482 llvm-svn: 153961	2012-04-03 17:55:42 +00:00
Nadav Rotem	269703f983	Add an additional testcase which checks ops with multiple users. llvm-svn: 153939	2012-04-03 07:39:36 +00:00
Craig Topper	7629d63bc4	Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo. llvm-svn: 153935	2012-04-03 05:20:24 +00:00
Akira Hatanaka	d19f025374	Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler. llvm-svn: 153926	2012-04-03 03:01:13 +00:00
Akira Hatanaka	55059262aa	Revert r153924. There were buildbot failures. llvm-svn: 153925	2012-04-03 02:51:09 +00:00
Akira Hatanaka	e2498d014b	MIPS disassembler support. Patch by Vladimir Medic. llvm-svn: 153924	2012-04-03 02:20:58 +00:00
Jakob Stoklund Olesen	291007b055	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Lang Hames	aaafacd07e	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Rafael Espindola	2e5c58e77b	No need to run llvm-as. llvm-svn: 153890	2012-04-02 19:44:20 +00:00
Akira Hatanaka	b1f68f9696	Initial 64 bit direct object support. This patch allows llvm to recognize that a 64 bit object file is being produced and that the subsequently generated ELF header has the correct information. The test case checks for both big and little endian flavors. Patch by Jack Carter. llvm-svn: 153889	2012-04-02 19:25:22 +00:00
Stepan Dyatkovskiy	f62ffeca88	Fast fix for PR12343: http://llvm.org/bugs/show_bug.cgi?id=12343 We have no trivial way to split edges that leave an indirect branch. We can do it with some tricks, but that should be discussed separately, and it is still dangerous because indirect branches are difficult to control. The fix forbids this case for unswitching. llvm-svn: 153879	2012-04-02 17:16:45 +00:00
Silviu Baranga	ac37acd31b	Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node. llvm-svn: 153874	2012-04-02 15:20:39 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Hal Finkel	322e41a914	Enable prefetch generation on PPC64. llvm-svn: 153851	2012-04-01 20:08:17 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Hal Finkel	9f9f8929ee	Add instruction itinerary for the PPC64 A2 core. This adds a full itinerary for IBM's PPC64 A2 embedded core. These cores form the basis for the CPUs in the new IBM BG/Q supercomputer. llvm-svn: 153842	2012-04-01 19:22:40 +00:00
Chandler Carruth	cdb1f8cff1	Add some more testing to cover the remaining two cases where always-inlining is disabled: recursive functions and indirectbr. llvm-svn: 153833	2012-04-01 10:36:17 +00:00
Chandler Carruth	c5bfb3c0f5	Fix a pretty scary bug I introduced into the always inliner with a single missing character. Somehow, this had gone untested. I've added tests for returns-twice logic specifically with the always-inliner that would have caught this, and fixed the bug. Thanks to Matt for the careful review and spotting this!!! =D llvm-svn: 153832	2012-04-01 10:21:05 +00:00
Chandler Carruth	1989bb9c43	Replace four tiny tests with various uses of grep and not with a single test and FileCheck. llvm-svn: 153831	2012-04-01 10:11:17 +00:00
Rafael Espindola	77242fa79e	Add a triple to the test. llvm-svn: 153818	2012-03-31 18:59:07 +00:00
Rafael Espindola	80c540e656	Teach CodeGen's version of computeMaskedBits to understand the range metadata. This is the CodeGen equivalent of r153747. I tested that there is no noticeable performance difference with any combination of -O0/-O2 /-g when compiling gcc as a single compilation unit. llvm-svn: 153817	2012-03-31 18:14:00 +00:00
Chandler Carruth	0539c071ea	Initial commit for the rewrite of the inline cost analysis to operate on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the front of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back into the simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was extremely flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite.
There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code path, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is not yet mirrored is to allow inlining of functions with a dynamic alloca if the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812	2012-03-31 12:42:41 +00:00
Chandler Carruth	6f202a7ced	Clean up the naming in this test. Someone pointed this out in review at one point, and I forgot to go back and clean it up. Sorry about that. =/ llvm-svn: 153801	2012-03-31 10:38:48 +00:00
Chandler Carruth	564b4ba704	FileCheck-ize this test, and generally tidy it up prior to changing things around. llvm-svn: 153799	2012-03-31 09:22:33 +00:00
Hal Finkel	5cad8742cc	Correctly vectorize powi. The powi intrinsic requires special handling because it always takes a single integer power regardless of the result type. As a result, we can vectorize only if the powers are equal. Fixes PR12364. llvm-svn: 153797	2012-03-31 03:38:40 +00:00
Jim Grosbach	fdaab531b7	ARM assembler should prefer non-aliased encoding of cmp. When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg, we want to use the non-negated form to make sure we prefer the normal encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'. llvm-svn: 153770	2012-03-30 19:59:02 +00:00
Jim Grosbach	daa04130ed	ARM encoding for VSWP got the second operand incorrect. Make the non-tied register operand names line up with what the base class encoding handler expects. rdar://11157236 llvm-svn: 153766	2012-03-30 18:53:01 +00:00
Jim Grosbach	def5e34812	ARM integrated assembler encoding choice for add/sub imm. For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2 can be used for this syntax. Prefer the narrow encoding when possible. rdar://11156277 llvm-svn: 153759	2012-03-30 17:20:40 +00:00
Jim Grosbach	199ab90946	ARM assembly parsing needs to be paranoid about negative immediates. Make sure to treat immediates as unsigned when doing relative comparisons. rdar://11153621 llvm-svn: 153753	2012-03-30 16:31:31 +00:00
James Molloy	fb5cd6085f	Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch. Patch by Tim Northover! llvm-svn: 153737	2012-03-30 09:15:32 +00:00
Evan Cheng	a40d40602c	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Bill Wendling	afe7ec7070	Testcase for r153710. llvm-svn: 153711	2012-03-30 00:26:54 +00:00
Bill Wendling	4f2a951275	Add testcase for r153705 llvm-svn: 153706	2012-03-30 00:05:02 +00:00
Lang Hames	323a5ced21	Change the constant in this testcase so that it results in a constant pool load. llvm-svn: 153704	2012-03-29 23:52:38 +00:00
Bill Wendling	76fdc4b885	Revert r153694. It was causing failures in the buildbots. llvm-svn: 153701	2012-03-29 23:23:59 +00:00
Chandler Carruth	d6735ce57a	Filecheck-ize this test so that it actually tests something reasonable. llvm-svn: 153697	2012-03-29 22:01:41 +00:00
Danil Malyshev	3548eaf896	Re-factored RuntimeDyld. Added ExecutionEngine/MCJIT tests. llvm-svn: 153694	2012-03-29 21:46:18 +00:00
Jim Grosbach	0b0298302c	ARM assembly 'cmp lr, #0' should not encode using 'cmn'. The CMP->CMN alias was matching for an immediate of zero when it should only match for negative values. rdar://11129224 llvm-svn: 153689	2012-03-29 21:19:52 +00:00
Lang Hames	dd1211b4e1	The shuffle scheduler is only available in asserts build - make misched-new.ll testcase require asserts. llvm-svn: 153687	2012-03-29 21:11:47 +00:00
Lang Hames	5569ce7d56	Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode. llvm-svn: 153680	2012-03-29 19:54:28 +00:00
Akira Hatanaka	0603ad8c65	Expand FREM. llvm-svn: 153671	2012-03-29 18:43:11 +00:00
Jakob Stoklund Olesen	4e55044ff5	Don't PRE compares. CodeGenPrepare sinks compare instructions down to their uses to prevent live flags and predicate registers across basic blocks. PRE of a compare instruction prevents that, forcing the i1 compare result into a general purpose register. That is usually more expensive than the redundant compare PRE was trying to eliminate in the first place. llvm-svn: 153657	2012-03-29 17:22:39 +00:00
Joel Jones	68d59e8a90	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	b474099e63	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b88c81fe0f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Jakob Stoklund Olesen	b6a7a89289	Don't kill the base register when expanding strd. When an strd instruction doesn't get the registers it wants, it can be expanded into two str instructions. Make sure the first str doesn't kill the base register in the case where the base and data registers are identical: t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg <rdar://problem/11101911> llvm-svn: 153611	2012-03-28 23:07:03 +00:00
Rafael Espindola	5054ee82cc	Handle intrinsics in GlobalsModRef. Fixes pr12351. llvm-svn: 153604	2012-03-28 21:31:24 +00:00
Jakob Stoklund Olesen	9e512120b7	Spill DPair registers, not just QPR. The arm_neon intrinsics can create virtual registers from the DPair register class which allows both even-odd and odd-even D-register pairs. This fixes PR12389. llvm-svn: 153603	2012-03-28 21:20:32 +00:00
Chad Rosier	e27081d348	Revert r153521 as it's causing large regressions on the nightly testers. Original commit message for r153521 (aka r153423): Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153587	2012-03-28 18:42:50 +00:00
Benjamin Kramer	aa9e4a5e59	GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value. llvm-svn: 153576	2012-03-28 14:50:09 +00:00
Richard Barton	7ce39497b4	Fixup VST1.32 with writeback instruction. Also re-factor non-writeback version. llvm-svn: 153573	2012-03-28 10:18:11 +00:00
Chandler Carruth	772c88b887	Switch to WeakVHs in the value mapper, and aggressively prune dead basic blocks in the function cloner. This removes the last case of trivially dead code that I've been seeing in the wild getting inlined, analyzed, re-inlined, optimized, only to be deleted. Nukes a FIXME from the cleanup tests. llvm-svn: 153572	2012-03-28 08:38:27 +00:00
Eric Christopher	7285c7d51d	Fix the output of the DW_TAG_friend tag to include DW_AT_friend and not the rest of the member tag. Fixes PR11695 llvm-svn: 153570	2012-03-28 07:34:31 +00:00
Akira Hatanaka	e3c00e5b97	Fix test case. llvm-svn: 153555	2012-03-28 00:25:01 +00:00
Eric Christopher	d8abaf3fc4	Add a test for the previous commit. Also, remove two tests that were testing a) the wrong behavior or b) something that I'm already testing in the new test. llvm-svn: 153525	2012-03-27 18:35:57 +00:00
Chad Rosier	8e6dbccd03	Reapply r153423; the original commit was fine. The failing test, distray, had undefined behavior, which Rafael was kind enough to fix. Original commit message for r153423: Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153521	2012-03-27 17:44:52 +00:00
Evan Cheng	7fede87349	Post-ra LICM should take care not to hoist an instruction that would clobber a register that's read by the preheader terminator. rdar://11095580 llvm-svn: 153492	2012-03-27 01:50:58 +00:00
Evan Cheng	a2b48d985b	ARM has a peephole optimization which looks for a def / use pair. The def produces a 32-bit immediate which is consumed by the use. It tries to fold the immediate by breaking it into two parts and folding them into the immediate fields of two uses. e.g. movw r2, #40885 movt r3, #46540 add r0, r0, r3 => add.w r0, r0, #3019898880 add.w r0, r0, #30146560 ; However, this transformation is incorrect if the user produces a flag. e.g. movw r2, #40885 movt r3, #46540 adds r0, r0, r3 => add.w r0, r0, #3019898880 adds.w r0, r0, #30146560 Note the adds.w may not set the carry flag even if the original sequence would. rdar://11116189 llvm-svn: 153484	2012-03-26 23:31:00 +00:00
Andrew Trick	7004e4b95e	SCEV fix: Handle loop invariant loads. Fixes PR11882: NULL dereference in ComputeLoadConstantCompareExitLimit. llvm-svn: 153480	2012-03-26 22:33:59 +00:00
Andrew Trick	f62744bb0d	Unit test for PR11950: LSR crash. llvm-svn: 153472	2012-03-26 21:45:37 +00:00
Chad Rosier	08e57e5ccf	Revert r153423 as this is causing failures on our internal nightly testers. Original commit message: Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153452	2012-03-26 18:07:14 +00:00
Kostya Serebryany	6f8a776041	[tsan] treat vtable pointer updates in a special way (requires tbaa); fix a bug (forgot to return true after instrumenting); make sure the tsan tests are run llvm-svn: 153448	2012-03-26 17:35:03 +00:00
Benjamin Kramer	df2348ecf3	Remove stale CBackend tests. llvm-svn: 153433	2012-03-26 11:16:50 +00:00
Rafael Espindola	df9b4adb82	Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153423	2012-03-26 01:44:11 +00:00
Chandler Carruth	8059c84af1	Teach instsimplify how to simplify comparisons of pointers which are constant-offsets of a common base using the generic GEP-walking logic I added for computing pointer differences in the same situation. llvm-svn: 153419	2012-03-25 21:28:14 +00:00
Chandler Carruth	2741aae80b	Switch the pointer-difference simplification logic to only work with inbounds GEPs. This isn't really necessary for simplifying pointer differences, but I'm planning to re-use the same code to simplify pointer comparisons where it is necessary. Since real code almost exclusively uses inbounds GEPs, it doesn't seem worth it to support the extra complexity of turning it on and off. If anyone would like that back, feel free to shout. Note that instcombine will still catch any of these patterns. llvm-svn: 153418	2012-03-25 20:43:07 +00:00
Eli Bendersky	a77c95f317	This file is no longer needed (DejaGNU-isms removed from code) llvm-svn: 153412	2012-03-25 12:43:54 +00:00
Chandler Carruth	ef82cf5b1e	Teach the function cloner (and thus the inliner) to simplify PHINodes aggressively. There are lots of dire warnings about this being expensive that seem to predate switching to the TrackingVH-based value remapper that is automatically updated on RAUW. This makes it easy to not just prune single-entry PHIs, but to fully simplify PHIs, and to recursively simplify the newly inlined code to propagate PHINode simplifications. This introduces a bit of a thorny problem though. We may end up simplifying a branch condition to a constant when we fold PHINodes, and we would like to nuke any dead blocks resulting from this so that time isn't wasted continually analyzing them, but this isn't easy. Deleting basic blocks after they are fully cloned and mapped into the new function currently requires manually updating the value map. The last piece of the simplification-during-inlining puzzle will require either switching to WeakVH mappings or some other piece of refactoring. I've left a FIXME in the testcase about this. llvm-svn: 153410	2012-03-25 10:34:54 +00:00
Eli Bendersky	f33086052d	Continue cleanup of LIT, getting rid of the remaining artifacts from dejagnu * Removed test/lib/llvm.exp - it is no longer needed * Deleted the dg.exp reading code from test/lit.cfg. There are no dg.exp files left in the test suite so this code is no longer required. test/lit.cfg is now much shorter and clearer * Removed a lot of duplicate code in lit.local.cfg files that need access to the root configuration, by adding a "root" attribute to the TestingConfig object. This attribute is dynamically computed to provide the same information as was previously provided by the custom getRoot functions. * Documented the config.root attribute in docs/CommandGuide/lit.pod llvm-svn: 153408	2012-03-25 09:02:19 +00:00
Chandler Carruth	2121199241	Move the instruction simplification of callsite arguments in the inliner to instead rely on much more generic and powerful instruction simplification in the function cloner (and thus inliner). This teaches the pruning function cloner to use instsimplify rather than just the constant folder to fold values during cloning. This can simplify a large number of things that constant folding alone cannot begin to touch. For example, it will realize that 'or' and 'and' instructions with certain constant operands actually become constants regardless of what their other operand is. It also can thread back through the caller to perform simplifications that are only possible by looking up a few levels. In particular, GEPs and pointer testing tend to fold much more heavily with this change. This should (in some cases) have a positive impact on compile times with optimizations on because the inliner itself will simply avoid cloning a great deal of code. It already attempted to prune proven-dead code, but now it will use the stronger simplifications to prove more code dead. llvm-svn: 153403	2012-03-25 04:03:40 +00:00
Chandler Carruth	bc3bc9df2f	FileCheck-ize this test. Note the FIXME I've introduced here: we've regressed seriously here, we are no longer removing allocas during inline cleanup. This appears to be because of lifetime markers "using" them. =/ I'll look into this shortly. llvm-svn: 153394	2012-03-24 21:24:19 +00:00
Hal Finkel	e44eb28807	Fix small-integer VAARG on SVR4 ABI PPC64. The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that are smaller than 64 bits be zero extended to 64 bits. llvm-svn: 153373	2012-03-24 03:53:55 +00:00
Rafael Espindola	ef9f5504ea	First part of PR12251. Add documentation and verifier support for the range metadata. llvm-svn: 153359	2012-03-24 00:14:51 +00:00
Dan Gohman	e3ed2b0699	Don't convert objc_retainAutoreleasedReturnValue to objc_retain if it is retaining the return value of an invoke that it immediately follows. llvm-svn: 153344	2012-03-23 18:09:00 +00:00
Dan Gohman	5c70fadc17	It's not possible to insert code immediately after an invoke in the same basic block, and it's not safe to insert code in the successor blocks if the edges are critical edges. Splitting those edges is possible, but undesirable, especially on the unwind side. Instead, make the bottom-up code motion consider invokes to be part of their successor blocks, rather than part of their parent blocks, so that it doesn't push code past them and onto the edges. This fixes PR12307. llvm-svn: 153343	2012-03-23 17:47:54 +00:00
Andrew Trick	d97b83e320	Remove -enable-lsr-nested in time for 3.1. Test cases have been removed but attached to open PR12330. llvm-svn: 153286	2012-03-22 22:42:45 +00:00
Andrew Trick	f2c7af53f3	Convert -indvars tests that rely on SCEV expansion to -loop-reduce tests. llvm-svn: 153259	2012-03-22 17:10:07 +00:00
Andrew Trick	b4f08cd6df	Remove tests: indvars trivially preserves GEPs now. llvm-svn: 153258	2012-03-22 17:09:46 +00:00
Andrew Trick	a8242b6a58	Remove test: trivial canonical IV test which is covered by other SCEV tests. llvm-svn: 153257	2012-03-22 17:09:34 +00:00
Andrew Trick	bd11257df7	Test scalar evolution directly instead of testing the result of canonical indvars. llvm-svn: 153256	2012-03-22 17:09:31 +00:00
Andrew Trick	db149f9e73	Remove redundant -enable-iv-rewrite=false flags from test cases. llvm-svn: 153255	2012-03-22 17:09:04 +00:00
Silviu Baranga	4afd7d2316	Added soft fail checks for the disassembler when decoding some corner cases of the STRD, STRH, LDRD, LDRH, LDRSH and LDRSB instructions on ARM. llvm-svn: 153252	2012-03-22 14:14:49 +00:00
Silviu Baranga	d213f2111a	Added soft fail cases for the disassembler when decoding LDRSBT, LDRHT or LDRSHT instruction on ARM llvm-svn: 153251	2012-03-22 13:24:43 +00:00
Silviu Baranga	a6ea32afdd	Added soft fail cases for the disassembler when decoding MUL instructions on ARM. llvm-svn: 153250	2012-03-22 13:14:39 +00:00
Chandler Carruth	e26dafeb79	Revert a series of commits to MCJIT to get the build working in CMake (and hopefully on Windows). The bots have been down most of the day because of this, and it's not clear to me what all will be required to fix it. The commits started with r153205, then r153207, r153208, and r153221. The first commit seems to be the real culprit, but I couldn't revert a smaller number of patches. When resubmitting, r153207 and r153208 should be folded into r153205, they were simple build fixes. llvm-svn: 153241	2012-03-22 05:44:06 +00:00
Chad Rosier	6a63a74113	[fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271% execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester. This may also improve compile-time on architectures that would otherwise generate a libcall for urem (e.g., ARM) or fall back to the DAG selector. rdar://10810716 llvm-svn: 153230	2012-03-22 00:21:17 +00:00
Andrew Trick	267b57de6f	misched: tag a few XFAILs that I plan to fix llvm-svn: 153222	2012-03-21 22:31:31 +00:00
Danil Malyshev	70186bef8b	Re-factored RuntimeDyld. Added ExecutionEngine/MCJIT tests. llvm-svn: 153221	2012-03-21 21:06:29 +00:00
Kevin Enderby	7e7d5eefb2	Fix ARM disassembly of VST1 and VST2 instructions with writeback. And add test case for all opcodes handled by DecodeVSTInstruction() in ARMDisassembler.cpp . llvm-svn: 153218	2012-03-21 20:54:32 +00:00
Joerg Sonnenberger	5463e66768	Fix generation of the address size override prefix. Add assertions for the invalid cases. At least 16bit operand in 64bit mode is currently not rejected in the parser. llvm-svn: 153166	2012-03-21 05:48:07 +00:00
Andrew Trick	e357cfa3db	I meant to disable this test, not XFAIL it llvm-svn: 153165	2012-03-21 05:18:53 +00:00
Andrew Trick	f0a517fec8	misched: beginning to add unit tests llvm-svn: 153163	2012-03-21 04:12:19 +00:00
Akira Hatanaka	0137dfe42a	Incremental big endian patch by Jack Carter. These changes allow us to compile big endian from the command line for 32 bit Mips targets. This patch will result in code and data actually being produced in the correct endianness. llvm-svn: 153153	2012-03-21 00:52:01 +00:00
Chad Rosier	cbf45a6d8a	Fix test case from r153135. llvm-svn: 153140	2012-03-20 21:49:54 +00:00
Chad Rosier	4106917355	[avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmovdqu to vextractf128 with 128-bit mem dest. Combines vextractf128 $0, %ymm0, %xmm0 vmovaps %xmm0, (%rdi) to vextractf128 $0, %ymm0, (%rdi) rdar://11082570 llvm-svn: 153139	2012-03-20 21:43:40 +00:00
Jim Grosbach	1283317db4	Assembler should accept redefinitions of unused variable symbols. rdar://11027851 llvm-svn: 153137	2012-03-20 21:33:21 +00:00
Andrew Trick	f7711010e1	LoopSimplify bug fix. Handle indirect loop back edges. Do not call SplitBlockPredecessors on a loop preheader when one of the predecessors is an indirectbr. Otherwise, you will hit this assert: !isa<IndirectBrInst>(Preds[i]->getTerminator()) && "Cannot split an edge from an IndirectBrInst" llvm-svn: 153134	2012-03-20 21:24:52 +00:00
Andrew Trick	9c45706baf	LSR: teach isSimplifiedLoopNest to handle PHI IVUsers. llvm-svn: 153132	2012-03-20 21:24:44 +00:00
Andrew Trick	3660735e18	LSR: fix IVUsers isSimplifiedLoopNest to perform a full domtree walk instead of skipping the current loop. My prior fix was incomplete because of an overzealous compile-time optimization: Better fix for: <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce llvm-svn: 153131	2012-03-20 21:24:40 +00:00
Chad Rosier	5a6011267a	[avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove whitespace from test case. No functional change intended. llvm-svn: 153103	2012-03-20 18:24:55 +00:00
Kevin Enderby	816ca27ef6	Fix assembling ARM vst2 instructions with double-spaced registers. llvm-svn: 153099	2012-03-20 17:41:51 +00:00
Jim Grosbach	997614f597	ARM non-scattered MachO relocations for movw/movt. Needed when building -mdynamic-no-pic code. rdar://10459256 llvm-svn: 153097	2012-03-20 17:25:45 +00:00
Chad Rosier	58a7c9fd3e	Fix test. llvm-svn: 153095	2012-03-20 17:20:46 +00:00
Chad Rosier	07a4cb9382	[avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads. This results in things such as vmovups 16(%rdi), %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 to be combined to vinsertf128 $1, 16(%rdi), %ymm0, %ymm0 rdar://11076953 llvm-svn: 153092	2012-03-20 17:08:51 +00:00
Silviu Baranga	32a49333ec	The ARM instructions that have an unpredictable behavior when the pc register operand is given now fail with soft fail. Modified the regression tests to reflect this. llvm-svn: 153089	2012-03-20 15:54:56 +00:00
Bill Wendling	7315c4b9cd	It's possible to have a constant expression whose size is quite big (e.g., i128). In that case, we may not be able to print out the MCExpr as an expression. For instance, we could have an MCExpr like this: 0xBEEF0000BEEF0000 | (0xBEEF0000BEEF0000 << 64) The MCExpr printer handles sizes up to 64-bits, but this expression would require 128-bits. In this situation, try to evaluate the constant expression and emit that as the value into 64-bit chunks. <rdar://problem/11070338> llvm-svn: 153081	2012-03-20 08:56:43 +00:00
Anton Korobeynikov	3edd854d64	Perform mul combine when multiplying with negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
NAKAMURA Takumi	bed1cb1e13	llvm/test/DebugInfo: Move two tests to DebugInfo/X86. They are X86-dependent. llvm-svn: 153038	2012-03-19 16:16:03 +00:00
Preston Gurd	48ccc4df0b	This patch adds X86 instruction itineraries for non-pseudo opcodes in X86InstrCompiler.td. It also adds -mcpu=generic to the legalize-shift-64.ll test so the test will pass if run on an Intel Atom CPU, which would otherwise produce an instruction schedule which differs from that which the test expects. llvm-svn: 153033	2012-03-19 14:10:12 +00:00
Nick Lewycky	fa30607eca	Factor out the multiply analysis code in ComputeMaskedBits and apply it to the overflow checking multiply intrinsic as well. Add a test for this, updating the test from grep to FileCheck. llvm-svn: 153028	2012-03-18 23:28:48 +00:00
Jim Grosbach	2c8e0ac85c	MC asm parser macro argument count was wrong when empty. It evaluated to '1' when the argument list was empty (should be '0'). rdar://11057257 llvm-svn: 152967	2012-03-17 00:11:42 +00:00
Jim Grosbach	905686a82a	ARM ldm/stm register lists can be out of order. It's not a good style idea, as the registers will be laid down in memory in numerical order, not the order they're in the list, but it's legal. vldm/vstm are stricter. rdar://11064740 llvm-svn: 152943	2012-03-16 20:48:38 +00:00
Bill Wendling	55b6b2b6a9	Revert r152907. llvm-svn: 152935	2012-03-16 18:20:54 +00:00
Bill Wendling	a2a26b546c	The pointer part of the store instruction may have an alignment. If that's the case, then we want to make sure that we don't increase the alignment of the store instruction. Because if we increase it to be "more aligned" than the pointer, code-gen may use instructions which require a greater alignment than the pointer guarantees. <rdar://problem/11043589> llvm-svn: 152907	2012-03-16 07:40:08 +00:00
Chandler Carruth	b37fc13a36	Rip out support for 'llvm.noinline'. This thing has a strange history... It was added in 2007 as the first cut at supporting no-inline attributes, but we didn't have function attributes of any form at the time. However, it was added without any mention in the LangRef or other documentation. Later on, in 2008, Devang added function notes for 'inline=never' and then turned them into proper function attributes. From that point onward, as far as I can tell, the world moved on, and no one has touched 'llvm.noinline' in any meaningful way since. Its time has now come. We have had better mechanisms for doing this for a long time, all the frontends I'm aware of use them, and this is just holding back progress. Given that it was never a documented feature of the IR, I've provided no auto-upgrade support. If people know of real, in-the-wild bitcode that relies on this, yell at me and I'll add it, but I seriously doubt anyone cares. llvm-svn: 152904	2012-03-16 06:10:15 +00:00
Andrew Trick	070e540a3e	LSR fix: Add isSimplifiedLoopNest to IVUsers analysis. Only record IVUsers that are dominated by simplified loop headers. Otherwise SCEVExpander will crash while looking for a preheader. I previously tried to work around this in LSR itself, but that was insufficient. This way, LSR can continue to run if some uses are not in simple loops, as long as we don't attempt to analyze those users. Fixes <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce llvm-svn: 152892	2012-03-16 03:16:56 +00:00
Eli Friedman	e06535b2f6	In InstCombiner::visitOr, make sure we reverse the operand swap used for checking for or-of-xor operations after those checks; a later check expects that any constant will be in Op1. PR12234. llvm-svn: 152884	2012-03-16 00:52:42 +00:00
Jim Grosbach	7cb9a13b02	ARM optional operand on MRC/MCR assembly instructions. rdar://11058464 llvm-svn: 152883	2012-03-16 00:45:58 +00:00
Jim Grosbach	24d90e2ddc	ARM vmrs system registers mvfr0 and mvfr1 handling. rdar://11058464 llvm-svn: 152881	2012-03-16 00:27:18 +00:00
Eric Christopher	a4a0cf8394	Do the right thing on NULL uint64 fields. Patch by Clemens Hammacher! Fixes PR12243 llvm-svn: 152880	2012-03-16 00:21:54 +00:00
Eric Christopher	7734ca2891	For types with a parent of the compile unit make sure and emit the DECL information. rdar://10855921 llvm-svn: 152876	2012-03-15 23:55:40 +00:00
Chad Rosier	26d05887d9	[fast-isel] Address Eli's comments for r152847. Specifically, add a test case and still allow immediate encoding, just not with cmn. rdar://11038907 llvm-svn: 152869	2012-03-15 22:54:20 +00:00
Jim Grosbach	d28888dd77	ARM case-insensitive checking for APSR_nzcv. rdar://11056591 llvm-svn: 152846	2012-03-15 21:34:14 +00:00
Matt Beaumont-Gay	18abf74edd	line endings llvm-svn: 152832	2012-03-15 20:24:29 +00:00
Lang Hames	c35ee8b54a	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Kristof Beyls	327d2f9da5	Fix VCVT decoding (between floating-point and fixed-point, Floating-point). Patch by Richard Barton. llvm-svn: 152814	2012-03-15 17:50:29 +00:00
Rafael Espindola	f58927855b	Short term fix for pr12270 before we change dominates to handle unreachable code. While here, reduce indentation. llvm-svn: 152803	2012-03-15 15:52:59 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Eric Christopher	7dd54fb695	Revert the removal of DW_AT_MIPS_linkage_name when we aren't putting out the DW_AT_name. Older gdbs unfortunately still use it to disambiguate member functions in templated classes (gdb.cp/templates.exp). rdar://11043421 (which is now deferred for a bit) llvm-svn: 152782	2012-03-15 08:19:33 +00:00
Chad Rosier	b9b73170e3	[avx] Add patterns for VINSERTF128rm. This results in things such as vmovaps -96(%rbx), %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 to be combined to vinsertf128 $1, -96(%rbx), %ymm0, %ymm0 rdar://10643481 llvm-svn: 152762	2012-03-15 00:45:30 +00:00
Aaron Ballman	a733297fa6	Fixed a transform crash when setting a negative size value for memset. Fixes PR12202. llvm-svn: 152756	2012-03-15 00:05:31 +00:00
Chandler Carruth	4d1d34fbfc	Extend the inline cost calculation to account for bonuses due to correlated pairs of pointer arguments at the callsite. This is designed to recognize the common C++ idiom of begin/end pointer pairs when the end pointer is a constant offset from the begin pointer. With the C-based idiom of a pointer and size, the inline cost saw the constant size calculation, and this provides the same level of information for begin/end pairs. In order to propagate this information we have to search for candidate operations on a pair of pointer function arguments (or derived from them) which would be simplified if the pointers had a known constant offset. Then the callsite analysis looks for such pointer pairs in the argument list, and applies the appropriate bonus. This helps LLVM detect that half of bounds-checked STL algorithms (such as hash_combine_range, and some hybrid sort implementations) disappear when inlined with a constant size input. However, it's not a complete fix due to the inaccuracy of our cost metric for constants in general. I'm looking into that next. Benchmarks showed no significant code size change, and very minor performance changes. However, specific code such as hashing is showing significantly cleaner inlining decisions. llvm-svn: 152752	2012-03-14 23:19:53 +00:00
Dan Gohman	532fb8131b	When an invoke is marked with metadata indicating its unwind edge should be ignored by ARC optimization, don't insert new ARC runtime calls in the unwind destination. llvm-svn: 152748	2012-03-14 23:05:06 +00:00
Eric Christopher	a9916d0296	Remove the DW_AT_MIPS_linkage name attribute when we don't need it output (we're emitting a specification already and the information isn't changing). Saves 1% on the debug information for a build of llvm. Fixes rdar://11043421 llvm-svn: 152697	2012-03-14 02:59:17 +00:00
Evan Cheng	7bf83096df	DAG combine incorrectly optimizes (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+c*sizeof(i16)) and replaces uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Kevin Enderby	1ef22f33d0	Change the X86 assembler to not require a segment register on string instruction's destination operand like it does for the source operand. Also fix a typo in the comment for X86AsmParser::isSrcOp(). llvm-svn: 152654	2012-03-13 19:47:55 +00:00
Chris Lattner	87fa77bd8a	enhance jump threading to preserve TBAA information when PRE'ing loads, fixing rdar://11039258, an issue that came up when inspecting clang's bootstrapped codegen. llvm-svn: 152635	2012-03-13 18:07:41 +00:00
Dan Gohman	eab06fa3c9	Teach globalopt how to evaluate an invoke with a non-void return type. llvm-svn: 152634	2012-03-13 18:01:37 +00:00
Duncan Sands	395ac42dd2	Generalize the "trunc(ptrtoint(x)) - trunc(ptrtoint(y)) -> trunc(ptrtoint(x-y))" optimization introduced by Chandler. llvm-svn: 152626	2012-03-13 14:07:05 +00:00
Eli Friedman	c8cbd06947	Fix regression from r151466: we can't replace uses of an instruction reachable from the entry block with uses of an instruction not reachable from the entry block. PR12231. llvm-svn: 152595	2012-03-13 01:06:07 +00:00
Kevin Enderby	987cef1fe2	Change the second line of the test added for r152414 to use CHECK-NEXT. Suggestion by Bill Wendling! llvm-svn: 152582	2012-03-12 21:38:09 +00:00
Kevin Enderby	fb3110b5d2	Added a missing error check for X86 assembly with mismatched base and index registers not both being 64-bit or both being 32-bit registers. llvm-svn: 152580	2012-03-12 21:32:09 +00:00
Kostya Serebryany	afbb65dee7	[asan] move x86-specific test to a separate X86 directory with a custom lit.local.cfg file llvm-svn: 152567	2012-03-12 18:49:11 +00:00
Chandler Carruth	595fda8466	When inlining a function and adding its inner call sites to the candidate set for subsequent inlining, try to simplify the arguments to the inner call site now that inlining has been performed. The goal here is to propagate and fold constants through deeply nested call chains. Without doing this, we lose the inliner bonus that should be applied because the arguments don't match the exact pattern the cost estimator uses. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152556	2012-03-12 11:19:33 +00:00
Chandler Carruth	a0796555e2	Teach instsimplify how to constant fold pointer differences. Typically instcombine has handled this, but pointer differences show up in several contexts where we would like to get constant folding, and cannot afford to run instcombine. Specifically, I'm working on improving the constant folding of arguments used in inline cost analysis with instsimplify. Doing this in instsimplify implies some algorithm changes. We have to handle multiple layers of all-constant GEPs because instsimplify cannot fold them into a single GEP the way instcombine can. Also, we're only interested in all-constant GEPs. The result is that this doesn't really replace the instcombine logic, it's just complementary and focused on constant folding. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152555	2012-03-12 11:19:31 +00:00
Chandler Carruth	6242a0f771	FileCheck-ize this test. llvm-svn: 152554	2012-03-12 11:19:28 +00:00
Andrew Trick	61d277f146	Move llc + target triple tests into X86 llvm-svn: 152502	2012-03-10 19:03:51 +00:00
Benjamin Kramer	fee6372daa	Don't try to filecheck bitcode. llvm-svn: 152498	2012-03-10 18:07:46 +00:00
Bill Wendling	0624d2a1ec	Make this transformation slightly less aggressive and more correct. The 'CmpInst::isFalseWhenEqual' function returns 'false' for values other than simply equality. For instance, it returns 'false' for <= or >=. This isn't the correct behavior for this transformation, which is checking for strict equality and non-equality. It was causing the gcc.c-torture/execute/frame-address.c test to fail because it would completely (and incorrectly) optimize a whole function into a 'ret i32 0'. llvm-svn: 152497	2012-03-10 17:56:03 +00:00
Bill Wendling	ebb10df441	Fix disasm of iret, sysexit, and sysret when displayed with Intel syntax. Patch by Kay Tiong Khoo! llvm-svn: 152487	2012-03-10 07:37:27 +00:00
Kevin Enderby	deed5aaa41	Add the missing call to Error when a bad X86 scale expression is parsed. llvm-svn: 152443	2012-03-09 22:24:10 +00:00
David Meyer	6c614bf717	Support reading GNU symbol versions in ELFObjectFile * Add enums and structures for GNU version information. * Implement extraction of that information on a per-symbol basis (ELFObjectFile::getSymbolVersion). * Implement a generic interface, GetELFSymbolVersion(), for getting the symbol version from the ObjectFile (hides the templating). * Have llvm-readobj print out the version, when available. * Add a test for the new feature: readobj-elf-versioning.test llvm-svn: 152436	2012-03-09 20:59:52 +00:00
Dan Gohman	500b598c5c	When identifying exit nodes for the reverse-CFG reverse-post-order traversal, consider nodes for which the only successors are backedges which the traversal is ignoring to be exit nodes. This fixes a problem where the bottom-up traversal was failing to visit split blocks along split loop backedges. This fixes rdar://10989035. llvm-svn: 152421	2012-03-09 18:50:52 +00:00
Kevin Enderby	014e1cde5f	Fix the x86 disassembler to at least print the lock prefix if it is the first prefix. Added a FIXME to remind us this still does not work when it is not the first prefix. llvm-svn: 152414	2012-03-09 17:52:49 +00:00
NAKAMURA Takumi	aebd3da46d	test/MC/X86/lit.local.cfg: Fix up to detect 'X86' in targets. llvm-svn: 152406	2012-03-09 14:52:38 +00:00
Duncan Sands	cca89124a2	Eliminate switch cases that can never match, for example removes all negative switch cases if the branch condition is known to be positive. Inspired by a recent improvement to GCC's VRP. llvm-svn: 152405	2012-03-09 13:45:18 +00:00
Chandler Carruth	783b7198b7	Undo a previous restriction on the inline cost calculation which Nick introduced. Specifically, there are cost reductions for all constant-operand icmp instructions against an alloca, regardless of whether the alloca will in fact be eligible for SROA. That means we don't want to abort the icmp reduction computation when we abort the SROA reduction computation. That in turn frees us from the need to keep a separate worklist and defer the ICmp calculations. Use this new-found freedom and some judicious function boundaries to factor the innards of computing the cost factor of any given instruction out of the loop over the instructions and into static helper functions. This greatly simplifies the code, and hopefully makes it more clear what is happening here. Reviewed by Eric Christopher. There is some concern that we'd like to ensure this doesn't get out of hand, and I plan to benchmark the effects of this change over the next few days along with some further fixes to the inline cost. llvm-svn: 152368	2012-03-09 02:49:36 +00:00
Chad Rosier	a281afc676	Fix a regression from r147481. Original commit message from r147481: DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load. Fix: Unaligned loads need to generate a vmovups. rdar://10974078 llvm-svn: 152366	2012-03-09 02:00:48 +00:00
Benjamin Kramer	0ef86b0ea3	Remove the no longer existent psp triple from a test. The test fell back to the C backend, making it useless and it started to fail on configurations that don't build the C backend. llvm-svn: 152342	2012-03-08 21:22:27 +00:00
Akira Hatanaka	d60cb3822f	Test case for r152280, r152285 and r152290. llvm-svn: 152292	2012-03-08 03:32:42 +00:00
Rafael Espindola	bdd1258784	Use llvm-mc instead of llc. Patch by Jack Carter. llvm-svn: 152242	2012-03-07 20:58:59 +00:00
Jakob Stoklund Olesen	aa0f752fc8	Fix infinite loop in nested multiclasses. Patch by Michael Liao! llvm-svn: 152232	2012-03-07 16:39:35 +00:00
Eric Christopher	54cf8ff45e	Add the DW_AT_APPLE_runtime_class attribute to forward declarations as well as completely defined classes. This fixes rdar://10956070 llvm-svn: 152171	2012-03-07 00:15:19 +00:00
Evan Cheng	80893ce5f5	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Eli Friedman	de850676e0	Fix the operand ordering on aliases for shld and shrd. PR12173, part 2. llvm-svn: 152136	2012-03-06 19:58:46 +00:00
Kevin Enderby	520eb3ba8a	Fix a bug in the ARM disassembly of the neon VLD2 all lanes instruction. llvm-svn: 152127	2012-03-06 18:33:12 +00:00
Jakob Stoklund Olesen	d9b427ee65	Add <imp-def> operands when reloading into physregs. When an instruction only writes sub-registers, it is still necessary to add an <imp-def> operand for the super-register. When reloading into a virtual register, rewriting will add the operand, but when loading directly into a physical register, the <imp-def> operand is still necessary. llvm-svn: 152095	2012-03-06 02:48:17 +00:00
Lang Hames	718cfbe05a	Split fpscr into two registers: FPSCR and FPSCR_NZCV. The fpscr register contains both flags (set by FP operations/comparisons) and control bits. The control bits (FPSCR) should be reserved, since they're always available and needn't be defined before use. The flag bits (FPSCR_NZCV) should be unreserved so they can be hoisted by MachineCSE. This fixes PR12165. llvm-svn: 152076	2012-03-06 00:19:55 +00:00
Jim Grosbach	8dc347fc27	ARM vpush/vpop assembler mnemonics accept an optional size suffix. rdar://10988114 llvm-svn: 152068	2012-03-05 23:16:31 +00:00
Eli Friedman	a8b75ac798	Make sure we don't return bits outside the mask in ComputeMaskedBits. PR12189. llvm-svn: 152066	2012-03-05 23:09:40 +00:00
Jakob Stoklund Olesen	fcd435ee73	Remove a test case that no longer makes sense. This was testing the handling of sub-register coalescing followed by remat. The original problem was caused by the extra <imp-def> operands added by sub-register coalescing. Those <imp-def> operands are not added any longer, and the test case passes even when the original patch is reverted. llvm-svn: 152040	2012-03-05 19:10:13 +00:00
Sebastian Pop	957a6583f1	updated patch for the ARM fused multiply add/sub In this update: - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2. - I kept setting .fpu=neon-vfpv4 code attribute because that is what the assembler understands. Patch by Ana Pazos <apazos@codeaurora.org> llvm-svn: 152036	2012-03-05 17:39:52 +00:00
Eli Friedman	a5a6d6aa8f	Make aliases for shld and shrd match gas. PR12173. llvm-svn: 152014	2012-03-05 04:31:54 +00:00
Jakob Stoklund Olesen	f729ceae04	Use <def,undef> operands when spilling NEON bundles. MachineOperands that define part of a virtual register must have an <undef> flag if they are not intended as read-modify-write operands. The old trick of adding an <imp-def> operand doesn't work any longer. Fixes PR12177. llvm-svn: 152008	2012-03-04 18:40:30 +00:00
Duncan Sands	4d928e7dff	Nick pointed out on IRC that GVN's propagateEquality wasn't propagating equalities into phi node operands for which the equality is known to hold in the incoming basic block. That's because replaceAllDominatedUsesWith wasn't handling phi nodes correctly in general (that this didn't give wrong results was just luck: the specific way GVN uses replaceAllDominatedUsesWith precluded wrong changes to phi nodes). llvm-svn: 152006	2012-03-04 13:25:19 +00:00
Bill Wendling	97b9359623	Do trivial CSE of dead BBs during codegen preparation. Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on. <rdar://problem/10256573> llvm-svn: 152002	2012-03-04 10:46:01 +00:00
Jakob Stoklund Olesen	a0bd36e3bc	Fix RA-dependent test. llvm-svn: 151958	2012-03-03 00:26:30 +00:00
Benjamin Kramer	d9d80b1dde	LVI: Recognize the form instcombine canonicalizes range checks into when forming constant ranges. This could probably be made a lot smarter, but this is a common case and doesn't require LVI to scan a lot of code. With this change CVP can optimize away the "shift == 0" case in Hashing.h that only gets hit when "shift" is in a range not containing 0. llvm-svn: 151919	2012-03-02 15:34:43 +00:00
Chad Rosier	f5e086f18e	Prevent obscure and incorrect tail-call optimization. In this instance we are generating the tail-call during legalizeDAG. The 2nd floor call can't be a tail call because it clobbers %xmm1, which is defined by the first floor call. The first floor call can't be a tail-call because it's not in the tail position. The only reasonable way I could think to fix this in a target-independent manner was to check for glue logic on the copy reg. rdar://10930395 llvm-svn: 151877	2012-03-02 02:50:46 +00:00
Eric Christopher	7524fe4551	Revert "Reorder the sections being output to reduce the number of assembler" The inline table needs to be constructed ahead of time so that it doesn't try to create new strings while we're emitting everything. This reverts commit a8ff9bccb399183cdd5f1c3cec2bda763664b4b0. llvm-svn: 151864	2012-03-02 00:30:24 +00:00
Evan Cheng	d12af5dc69	Neuter the optimization I implemented with r107852 and r108258 which turn some floating point equality comparisons into integer ones with -ffast-math. The issue is the optimization causes +0.0 != -0.0. Now the optimization is only done when one side is known to be 0.0. The other side's sign bit is masked off for the comparison. rdar://10964603 llvm-svn: 151861	2012-03-01 23:27:13 +00:00
Eric Christopher	66b0721014	Reorder the sections being output to reduce the number of assembler fixups that are being used to determine section offsets. Reduces the total number of fixups by 50% for a non-trivial testcase. Part of rdar://10413936 llvm-svn: 151852	2012-03-01 22:50:31 +00:00
David Meyer	c429b80da1	[Object] Add ObjectFile::getLoadName() for retrieving the soname/installname of a shared object. llvm-svn: 151845	2012-03-01 22:19:54 +00:00
Kevin Enderby	f0269b4270	Change ARMInstPrinter::printPredicateOperand() so it will not abort if it runs into the undefined 15 condition code value. llvm-svn: 151844	2012-03-01 22:13:02 +00:00
Akira Hatanaka	6bbe1f0d10	Fix bugs which were introduced when support for base+index floating point loads and stores was added. - SelectAddr should return false if Parent is an unaligned f32 load or store. - Only aligned load and store nodes should be matched to select reg+imm floating point instructions. - MIPS does not have support for f64 unaligned load or store instructions. llvm-svn: 151843	2012-03-01 22:12:30 +00:00
Preston Gurd	be1c875a1c	Trivial change to make the test use -mcpu=generic, so that the test will not fail when run on an Intel Atom processor, due to the Atom scheduler producing an instruction sequence that is different from that which is normally expected. llvm-svn: 151832	2012-03-01 19:57:20 +00:00
Chad Rosier	2913f500fa	Revert r151816 as Jim has the appropriate fix. llvm-svn: 151818	2012-03-01 17:41:19 +00:00
Chad Rosier	f0208ed76a	Fix testcases from r151807. llvm-svn: 151816	2012-03-01 17:31:30 +00:00
Jim Grosbach	394ad59d90	Add missing triple for tests. Make darwin bots happier. llvm-svn: 151813	2012-03-01 17:30:32 +00:00
James Molloy	f6298e9281	Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have side effects. Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone". llvm-svn: 151807	2012-03-01 14:32:18 +00:00
NAKAMURA Takumi	74e736f0eb	llvm/test/CMakeLists.txt: Update dependencies to add llvm-readobj to "check". llvm-svn: 151795	2012-03-01 03:14:13 +00:00
David Meyer	2fc34c5f84	[Object] * Add begin_dynamic_table() / end_dynamic_table() private interface to ELFObjectFile. * Add begin_libraries_needed() / end_libraries_needed() interface to ObjectFile, for grabbing the list of needed libraries for a shared object or dynamic executable. * Implement this new interface completely for ELF, leave stubs for COFF and MachO. * Add 'llvm-readobj' tool for dumping ObjectFile information. llvm-svn: 151785	2012-03-01 01:36:50 +00:00
Lang Hames	76e66c31a0	Don't redundantly copy implicit operands when rematerializing. While we're at it - don't copy vreg implicit operands while rematerializing. This fixes PR12138. llvm-svn: 151779	2012-03-01 00:41:17 +00:00
Richard Trieu	37ddc0fab6	Fix flags for test in MC/MachO/ARM/empty-function-nop.ll llvm-svn: 151778	2012-03-01 00:29:09 +00:00
Benjamin Kramer	d05a0c6c42	LegalizeIntegerTypes: Reorder operations in the "big shift by small amount" optimization, making the lives of later passes easier. llvm-svn: 151722	2012-02-29 13:27:00 +00:00
Duncan Sands	bb2fe65542	Have GVN also do condition propagation when the right-hand side is not a constant. This fixes PR1768. llvm-svn: 151713	2012-02-29 11:12:03 +00:00
Bill Wendling	7f9f5680ca	Testcase for r151691. llvm-svn: 151694	2012-02-29 01:53:13 +00:00
Jim Grosbach	617f84ddbd	ARM implement TargetInstrInfo::getNoopForMachoTarget() Without this hook, functions w/ a completely empty body (including no epilogue) will cause an MCEmitter assertion failure. For example, define internal fastcc void @empty_function() { unreachable } rdar://10947471 llvm-svn: 151673	2012-02-28 23:53:30 +00:00
David Meyer	1df4b84db4	In the ObjectFile interface, replace isInternal(), isAbsolute(), isGlobal(), and isWeak(), with a bitset of flags. llvm-svn: 151670	2012-02-28 23:47:53 +00:00
Rafael Espindola	c22c85c29c	On ELF, create relocations to the abbreviation and line sections when producing debug info for assembly files. We were already doing the right thing when producing debug info for C/C++. ELF linkers don't know dwarf, so they depend on these relocations to produce valid dwarf output. llvm-svn: 151655	2012-02-28 21:13:05 +00:00
Benjamin Kramer	0c281a7deb	LegalizeIntegerTypes: Reenable the large shift with small amount optimization. To avoid problems with zero shifts when getting the bits that move between words we use a trick: first shift by amount-1, then do another shift by one. When amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32. Also fix a latent bug that emitted the low and high words in the wrong order when shifting right. Fixes PR12113. llvm-svn: 151637	2012-02-28 17:58:00 +00:00
Daniel Dunbar	ee7b899343	Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Nadav Rotem	875e463b19	Fix a bug in the code that builds SDNodes from vector GEPs. When the GEP index is a vector of pointers, the code that calculated the size of the element started from the vector type, and not the contained pointer type. As a result, instead of looking at the data element pointed by the vector, this code used the size of the vector. This works for 32bit members (on 32bit systems), but not for other types. Added code to peel the vector type and added a test. llvm-svn: 151626	2012-02-28 11:54:05 +00:00
Evan Cheng	87c7b09d8d	Some ARM implementations, e.g. A-series, do return stack prediction. That is, the processor keeps a return address stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt the RAS and cause 100% return misprediction, so LLVM should use an unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Pete Cooper	39b5255df4	Reverted r151620 - DSE: Shorten memset when a later store overwrites the start of it. There were all sorts of buildbot issues. llvm-svn: 151621	2012-02-28 05:06:24 +00:00
Pete Cooper	f3862f91de	DSE: Shorten memset when a later store overwrites the start of it llvm-svn: 151620	2012-02-28 04:27:10 +00:00
Akira Hatanaka	330d901ce3	Add support for floating point base register + offset register addressing mode load and store instructions. llvm-svn: 151611	2012-02-28 02:55:02 +00:00
Jakob Stoklund Olesen	4c5ad2b812	Handle regmasks in MachineCSE. Don't attempt to extend physreg live ranges across calls. <rdar://problem/10942095> llvm-svn: 151610	2012-02-28 02:08:50 +00:00
Jakob Stoklund Olesen	92c15b2b2c	Enable ARM base pointer when calling functions with large arguments. When an outgoing call takes more than 2k of arguments on the stack, we don't allocate that call frame in the prolog, but adjust the stack pointer immediately before the call instead. This causes problems with the emergency spill slot because PEI can't track stack pointer adjustments on the second pass, and if the outgoing arguments are too big, SP can't be used to reach the emergency spill slot at all. Work around these problems by ensuring there is a base or frame pointer that can be used to access the emergency spill slot. <rdar://problem/10917166> llvm-svn: 151604	2012-02-28 01:15:01 +00:00
Michael J. Spencer	8c4729fd44	[Object] Add {begin,end}_dynamic_symbols stubs and implementation for ELF. Add -D option to llvm-nm to dump dynamic symbols. Patch by David Meyer. llvm-svn: 151600	2012-02-28 00:40:37 +00:00
Bill Wendling	2b3f61af18	Add back removed code. It still causes LLVM to miscompile. But not having it breaks other things. llvm-svn: 151594	2012-02-27 23:48:30 +00:00
Preston Gurd	43b2506e32	test commit. llvm-svn: 151588	2012-02-27 23:31:51 +00:00
Eli Friedman	0774902a00	Duncan pointed out that if the alignment isn't explicitly specified, it defaults to the ABI alignment. Given that, make this code a bit more aggressive in such cases. llvm-svn: 151584	2012-02-27 23:16:46 +00:00
Bill Wendling	06e4818dd6	XFAIL test until <rdar://problem/10913281> is fixed. llvm-svn: 151578	2012-02-27 22:53:42 +00:00
Jim Grosbach	7b811d30d9	ARM BL/BLX instruction fixups should use relocations. We rely on the linker to resolve calls to the appropriate BL/BLX instruction to make interworking function correctly. It uses the symbol in the relocation to do that, so we need to be careful about being too clever. To enable this for ARM mode, split the BL/BLX fixup kind off from the unconditional-branch fixups. rdar://10927209 llvm-svn: 151571	2012-02-27 21:36:23 +00:00
Eli Friedman	8bc169c3c5	Teach BasicAA about the LLVM IR rules that allow reading past the end of an object given sufficient alignment. Fixes PR12098. llvm-svn: 151553	2012-02-27 20:46:07 +00:00
Roman Divacky	ded7f01062	Test the section specification. llvm-svn: 151552	2012-02-27 20:42:19 +00:00
Roman Divacky	8fe40cd659	Reapply r151278 with fixes. MCize function entry label emission on PowerPC64 properly. llvm-svn: 151547	2012-02-27 20:20:47 +00:00
Duncan Sands	27f459519d	When performing a conditional branch depending on the value of a comparison %cmp (eg: A==B) we already replace %cmp with "true" under the true edge, and with "false" under the false edge. This change enhances this to replace the negated compare (A!=B) with "false" under the true edge and "true" under the false edge. Reported to improve perlbench results by 1%. llvm-svn: 151517	2012-02-27 08:14:30 +00:00
Rafael Espindola	09a4201d3c	Fix this assert. IP can point to an instruction with strange dominance properties (invoke). Just assert that the instruction we return dominates the insertion point. llvm-svn: 151511	2012-02-27 02:13:03 +00:00
Craig Topper	6491c8020e	X86 disassembler support for jcxz, jecxz, and jrcxz. Fixes PR11643. Patch by Kay Tiong Khoo. llvm-svn: 151510	2012-02-27 01:54:29 +00:00
Rafael Espindola	a640db900a	Add testcase for the previous commit. llvm-svn: 151475	2012-02-26 05:49:57 +00:00
Rafael Espindola	94df267db3	Change the implementation of dominates(inst, inst) to one based on what the verifier does. This correctly handles invoke. Thanks to Duncan, Andrew and Chris for the comments. Thanks to Joerg for the early testing. llvm-svn: 151469	2012-02-26 02:19:19 +00:00
Nick Lewycky	3db143ea8c	Reinstate the optimization from r151449 with a fix to not turn 'gep %x' into 'gep null' when the icmp predicate is unsigned (or is signed without inbounds). llvm-svn: 151467	2012-02-26 02:09:49 +00:00
Nick Lewycky	7bbd72da46	Roll these back to r151448 until I figure out how they're breaking MultiSource/Applications/lua. llvm-svn: 151463	2012-02-25 23:01:19 +00:00
Nick Lewycky	eeeffbb497	An argument and a local identified object (eg. a noalias call) could turn out equal if both are null. In the test, scope type %t and global @y by adding a 'gep' prefix to them. llvm-svn: 151452	2012-02-25 20:19:07 +00:00
Nick Lewycky	51f2be8bff	Teach instsimplify to be more aggressive when analyzing comparisons of pointers by using llvm::isIdentifiedObject. Also teach it to handle GEPs that have the same base pointer and constant operands. Fixes PR11238! llvm-svn: 151449	2012-02-25 19:07:42 +00:00
Hal Finkel	6fd2b434bd	Revert r151278, breaks static linking. Reverting this because it breaks static linking on ppc64. Specifically, it may be linkonce_odr functions that are the problem. With this patch, if you link statically, calls to some functions end up calling their descriptor addresses instead of calling to their entry points. This causes the execution to fail with SIGILL (b/c the descriptor address just has some pointers, not code). llvm-svn: 151433	2012-02-25 03:40:11 +00:00
NAKAMURA Takumi	bdf94879df	Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff. [Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems: Clang raised a warning, and X86 LowerOperation would assert out for fptoui f64 to i32 because it improperly lowered to an illegal BUILD_PAIR. Here's a patch that addresses these issues. Let me know if any other changes are necessary. Thanks. llvm-svn: 151432	2012-02-25 03:37:25 +00:00
Akira Hatanaka	60f7a8e710	Add definitions of floating point multiply add/sub and negative multiply add/sub instructions. llvm-svn: 151415	2012-02-25 00:21:52 +00:00
Akira Hatanaka	b049aef2d1	Add an option to use a virtual register as the global base register instead of reserving a physical register ($gp or $28) for that purpose. This will completely eliminate loads that restore the value of $gp after every function call, if the register allocator assigns a callee-saved register, or eliminate unnecessary loads if it assigns a temporary register. example: .cpload $25 // set $gp. ... .cprestore 16 // store $gp to stack slot 16($sp). ... jalr $25 // function call. clobbers $gp. lw $gp, 16($sp) // not emitted if callee-saved reg is chosen. ... lw $2, 4($gp) ... jalr $25 // function call. lw $gp, 16($sp) // not emitted if $gp is not live after this instruction. ... llvm-svn: 151402	2012-02-24 22:34:47 +00:00

16216 Commits