llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	f38dea1cfa	[x86] Add assembly parser bounds checking to the immediate value for cmpss/cmpsd/cmpps/cmppd. llvm-svn: 226642	2015-01-21 06:07:53 +00:00
Adrian Prantl	34bcbeed03	Make DIExpression::Verify() stricter by checking that the number of elements and the ordering is sane and cleanup the accessors. llvm-svn: 226627	2015-01-21 00:59:20 +00:00
Simon Pilgrim	f5dcc1cbe6	[X86][AVX] Simplified diff between AVX1 and SSE42 fp stack folding tests. NFC. Changed the AVX1 tests register spill tail call to return a xmm like the SSE42 version - makes doing diffs between them a lot easier without affecting the spills themselves. llvm-svn: 226623	2015-01-21 00:02:13 +00:00
Simon Pilgrim	0177fa3d8c	[X86][SSE] Added SSE/AVX1 integer stack folding tests. Some folding patterns + tests are missing (marked as TODO) - these will be added in a future patch for review. llvm-svn: 226622	2015-01-20 23:54:17 +00:00
Simon Pilgrim	ffddc01b00	[X86][SSE] Added SSE fp stack folding tests. Some folding patterns + tests are missing (marked as TODO) - these will be added in a future patch for review. llvm-svn: 226621	2015-01-20 23:50:18 +00:00
Simon Pilgrim	0d68d98bd5	[X86][AVX] Renamed AVX1 fp stack folding tests. NFC. The SSE42 version of the AVX1 float stack folding tests will be added shortly, this renames the AVX1 file so that the files will be near each other in a directory listing to help ensure they are kept in sync. llvm-svn: 226620	2015-01-20 23:45:50 +00:00
Adrian Prantl	de200dfad2	DebugLocs without a scope should fail the verification. Follow-up to r226588. llvm-svn: 226616	2015-01-20 22:37:25 +00:00
Kevin Enderby	98da6136d0	For llvm-objdump, hook up existing options to work when using -macho (the Mach-O parser). llvm-svn: 226612	2015-01-20 21:47:46 +00:00
Colin LeMahieu	988c68f2a7	[Hexagon] Adding intrinsics for doubleword ALU operations. llvm-svn: 226606	2015-01-20 20:45:05 +00:00
Colin LeMahieu	734001c0a6	[Hexagon] Removing unnecessary clutter in intrinsic tests. llvm-svn: 226602	2015-01-20 19:46:07 +00:00
Daniel Jasper	6b77455f81	Prevent binary-tree deterioration in sparse switch statements. This addresses part of llvm.org/PR22262. Specifically, it prevents considering the densities of sub-ranges that have fewer than TLI.getMinimumJumpTableEntries() elements. Those densities won't help jump tables. This is not a complete solution but works around the most pressing issue. Review: http://reviews.llvm.org/D7070 llvm-svn: 226600	2015-01-20 19:43:33 +00:00
Ramkumar Ramachandra	be10ece5ed	[GC] Verify-pass void vararg functions in gc.statepoint With the appropriate Verifier changes, extracting the result out of a statepoint wrapping a vararg function crashes. However, a void vararg function works fine: commit this first step. Differential Revision: http://reviews.llvm.org/D7071 llvm-svn: 226599	2015-01-20 19:42:46 +00:00
Adrian Prantl	565cc18d8f	Reapply: Teach SROA how to update debug info for fragmented variables. This reapplies r225379. ChangeLog: - The assertion that this commit previously ran into about the inability to handle indirect variables has since been removed and the backend can handle this now. - Testcases were upgraded to the new MDLocation format. - Instead of keeping a DebugDeclares map, we now use llvm::FindAllocaDbgDeclare(). Original commit message follows. Debug info: Teach SROA how to update debug info for fragmented variables. This allows us to generate debug info for extremely advanced code such as typedef struct { long int a; int b;} S; int foo(S s) { return s.b; } which at -O1 on x86_64 is codegen'd into define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 { ret i32 %s.coerce1, !dbg !24 } with this patch we emit the following debug info for this TAG_formal_parameter [3] AT_location( 0x00000000 0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004 0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 ) AT_name( "s" ) AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" ) Thanks to chandlerc, dblaikie, and echristo for their feedback on all previous iterations of this patch! llvm-svn: 226598	2015-01-20 19:42:22 +00:00
Tom Stellard	021053f500	R600/SI: Fix simple-loop.ll test llvm-svn: 226596	2015-01-20 19:33:02 +00:00
Jozef Kolek	0d49117769	Reverted revision 226577. llvm-svn: 226595	2015-01-20 19:29:28 +00:00
Tom Stellard	8255af45cb	R600/SI: Add kill flag when copying scratch offset to a register This allows us to re-use the same register for the scratch offset when accessing large private arrays. llvm-svn: 226585	2015-01-20 17:49:45 +00:00
Tom Stellard	8058069529	R600/SI: Don't store scratch buffer frame index in MUBUF offset field We don't have a good way of legalizing this if the frame index offset is more than 12 bits, which is the size of MUBUF's offset field, so now we store the frame index in the vaddr field. llvm-svn: 226584	2015-01-20 17:49:43 +00:00
Jozef Kolek	45f7f9c1ab	[mips][microMIPS] MicroMIPS 16-bit unconditional branch instruction B Implement microMIPS 16-bit unconditional branch instruction B. Implemented 16-bit microMIPS unconditional instruction has real name B16, and B is an alias which expands to either B16 or BEQ according to the rules: b 256 --> b16 256 # R_MICROMIPS_PC10_S1 b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1 b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1 Differential Revision: http://reviews.llvm.org/D3514 llvm-svn: 226577	2015-01-20 16:45:27 +00:00
Kai Nacke	e7a647886a	[mips] Add registers and ALL check prefix to octeon test case. No functional change. Reviewed by D. Sanders llvm-svn: 226574	2015-01-20 16:14:02 +00:00
Kai Nacke	63072f81b3	[mips] Add octeon branch instructions bbit0/bbit032/bbit1/bbit132 This commit adds the octeon branch instructions bbit0/bbit032/bbit1/bbit132. It also includes patterns for instruction selection and test cases. Reviewed by D. Sanders llvm-svn: 226573	2015-01-20 16:10:51 +00:00
Evgeniy Stepanov	c5b974e6d2	[msan] Optimize -msan-check-constant-shadow. The new code does not create new basic blocks in the case when shadow is a compile-time constant; it generates either an unconditional __msan_warning call or nothing instead. llvm-svn: 226569	2015-01-20 15:21:35 +00:00
Chandler Carruth	aaf0b4cd57	[PM] Port LoopInfo to the new pass manager, adding both a LoopAnalysis pass and a LoopPrinterPass with the expected associated wiring. I've added a RUN line to the only test case (!!!) we have that actually prints loops. Everything seems to be working. This is somewhat exciting as this is the first analysis using another analysis to go in for the new pass manager. =D I also believe it is the last analysis necessary for porting instcombine, but of course I may yet discover more. llvm-svn: 226560	2015-01-20 10:58:50 +00:00
Karthik Bhat	0b0f4660fa	Fix operand reorder logic in SLPVectorizer to generate longer vectorizable chain. This patch fixes 2 issues in reorderInputsAccordingToOpcode 1) AllSameOpcodeLeft and AllSameOpcodeRight were being calculated incorrectly resulting in code not being vectorized in a few cases. 2) Adds logic to reorder operands if we get longer chain of consecutive loads enabling vectorization. Handled the same for cases where we have AltOpcode. Thanks Michael for inputs and review. Review: http://reviews.llvm.org/D6677 llvm-svn: 226547	2015-01-20 06:11:00 +00:00
David Majnemer	3087b22e1a	Bitcode: Don't create comdats when autoupgrading macho bitcode Don't infer COMDAT groups from older bitcode if the target is macho, it doesn't have COMDATs. llvm-svn: 226546	2015-01-20 05:58:07 +00:00
Frederic Riss	e4a6fef98f	[dsymutil] Add the detected target triple to the debug map. It will be needed to instantiate the Target object that we will use to create all the MC objects for the dwarf emission. llvm-svn: 226525	2015-01-19 23:33:14 +00:00
Duncan P. N. Exon Smith	13890af51c	AsmParser: Fix error location for missing fields llvm-svn: 226524	2015-01-19 23:32:36 +00:00
Simon Pilgrim	20bc37c7db	[X86][AVX] Missing AVX1 memory folding float instructions Now that we can create much more exhaustive X86 memory folding tests, this patch adds the missing AVX1/F16C floating point instruction stack foldings we can easily test for including the scalar intrinsics (add, div, max, min, mul, sub), conversions float/int to double, half precision conversions, rounding, dot product and bit test. The patch also adds a couple of obviously missing SSE instructions (more to follow once we have full SSE testing). Now that scalar folding is working it broke a very old test (2006-10-07-ScalarSSEMiscompile.ll) - this test appears to make no sense as it's trying to ensure that a scalar subtraction isn't folded as it 'would zero the top elts of the loaded vector' - this test just appears to be wrong to me. Differential Revision: http://reviews.llvm.org/D7055 llvm-svn: 226513	2015-01-19 22:40:45 +00:00
Rafael Espindola	2658554aec	Add r224985 back with fixes. The fixes are to note that AArch64 has additional restrictions on when local relocations can be used. In particular, ld64 requires that relocations to cstring/cfstrings use linker visible symbols. Original message: In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 226503	2015-01-19 21:11:14 +00:00
Colin LeMahieu	0ee02fc9fe	[Hexagon] Updating muxir/ri/ii intrinsics. Setting predicate registers as compatible with i32 rather than doing custom type conversion. llvm-svn: 226500	2015-01-19 20:31:18 +00:00
Colin LeMahieu	fcd4569af6	[Hexagon] Converting intrinsics combine imm/imm, simple shifts and extends. llvm-svn: 226483	2015-01-19 18:56:19 +00:00
Colin LeMahieu	9327bdad2f	[Hexagon] Converting remaining ALU32/ALU intrinsics. llvm-svn: 226480	2015-01-19 18:33:58 +00:00
Colin LeMahieu	663419b008	[Hexagon] Converting ALU32/ALU intrinsics to new patterns. llvm-svn: 226478	2015-01-19 18:22:19 +00:00
Adrian Prantl	5883af3faa	Remove support for DIVariable's FlagIndirectVariable and expect frontends to use a DIExpression with a DW_OP_deref instead. This is not only a much more natural place for this information; there is also a technical reason: The FlagIndirectVariable is used to mark a variable that is turned into a reference by virtue of the calling convention; this happens for example to aggregate return values. The inliner, for example, may actually need to undo this indirection to correctly represent the value in its new context. This is impossible to implement because the DIVariable can't be safely modified. We can however safely construct a new DIExpression on the fly. llvm-svn: 226476	2015-01-19 17:57:29 +00:00
Greg Fitzgerald	fa78d08675	[AArch64] Implement GHC calling convention Original patch by Luke Iannini. Minor improvements and test added by Erik de Castro Lopo. Differential Revision: http://reviews.llvm.org/D6877 From: Erik de Castro Lopo <erikd@mega-nerd.com> llvm-svn: 226473	2015-01-19 17:40:05 +00:00
Colin LeMahieu	310bad8b7e	[Hexagon] Converting halfword to double accumulating multiply intrinsics. llvm-svn: 226472	2015-01-19 17:36:32 +00:00
Rafael Espindola	c569ac46eb	Produce errors when an assignment expression would use a common symbol. An assignment will produce a symbol with a given section and offset. There is no way to represent something like "1 byte after a common symbol". This matches the behavior of GNU as. Part of PR22217. llvm-svn: 226470	2015-01-19 17:30:24 +00:00
Bradley Smith	3131e85edd	[ARM] SSAT/USAT with an 'asr #32' shift should result in an undefined encoding rather than unpredictable llvm-svn: 226469	2015-01-19 16:37:17 +00:00
Bradley Smith	30057b245e	[ARM] Fixup sign extend instruction availability w.r.t. DSP extension llvm-svn: 226468	2015-01-19 16:36:02 +00:00
Rafael Espindola	12ca34f53f	Bring r226038 back. No change in this commit, but clang was changed to also produce trivial comdats when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226467	2015-01-19 15:16:06 +00:00
Michael Kuperstein	54c61edee7	[MIScheduler] Slightly better handling of constrainLocalCopy when both source and dest are local This fixes PR21792. Differential Revision: http://reviews.llvm.org/D6823 llvm-svn: 226433	2015-01-19 07:30:47 +00:00
Hal Finkel	af51993ee1	[PowerPC] Add r2 as an operand for all calls under both PPC64 ELF V1 and V2 Our PPC64 ELF V2 call lowering logic added r2 as an operand to all direct call instructions in order to represent the dependency on the TOC base pointer value. Restricting this to ELF V2, however, does not seem to make sense: calls under ELF V1 have the same dependence, and indirect calls have an r2 dependence just as direct ones. Make sure the dependence is noted for all calls under both ELF V1 and ELF V2. llvm-svn: 226432	2015-01-19 07:20:27 +00:00
Matt Arsenault	4843f193ad	R600: Remove redundant test This is already covered in ftrunc.ll llvm-svn: 226412	2015-01-18 19:30:32 +00:00
Daniel Sanders	01dce6c931	[mips] 'CHECK :' is not a valid check directive. Fixed. llvm-svn: 226409	2015-01-18 18:43:10 +00:00
Daniel Sanders	0cb9dc6e68	[mips] Make whitespace in disassembler tests more consistent. NFC. The tests for the ISAs should now be approximately diffable. That is, the output of 'diff valid-mips1.txt valid-mips2.txt' should emit the lines for instructions that were added/removed to/from MIPS-I by MIPS-II. This doesn't work perfectly at the moment due to ordering differences but it should be close. llvm-svn: 226408	2015-01-18 18:38:36 +00:00
Daniel Sanders	46ad7cbfce	[mips] Make whitespace of disassembler tests more consistent by removing blank lines. NFC. llvm-svn: 226407	2015-01-18 18:21:19 +00:00
Simon Pilgrim	4cf275eab2	[X86][SSE] Added scalar min/max folding tests. NFC. llvm-svn: 226406	2015-01-18 18:06:23 +00:00
Simon Pilgrim	1d6dcdcacb	[X86][SSE] Added float extract and xmm extract/insert stack folding tests. NFC. llvm-svn: 226405	2015-01-18 17:04:32 +00:00
Simon Pilgrim	cd26d0b611	[X86][SSE] Added scalar conversion stack folding tests. NFC. llvm-svn: 226404	2015-01-18 16:22:15 +00:00
Simon Pilgrim	fe3bfb80c9	AVX1 stack folding tests. NFC. Begun adding more exhaustive tests - all floating point instructions should now be either tested or have placeholders. We do seem to have a number of missing instructions, I will add a patch for review once the remaining working instructions are added. I'll then move on to SSE tests and then the integer instructions. llvm-svn: 226400	2015-01-18 12:56:39 +00:00
Hal Finkel	f81b6dd7a2	[PowerPC] Initial PPC64 calling-convention changes for fastcc The default calling convention specified by the PPC64 ELF (V1 and V2) ABI is designed to work with both prototyped and non-prototyped/varargs functions. As a result, GPRs and stack space are allocated for every argument, even those that are passed in floating-point or vector registers. GlobalOpt::OptimizeFunctions will transform local non-varargs functions (that do not have their address taken) to use the 'fast' calling convention. When functions are using the 'fast' calling convention, don't allocate GPRs for arguments passed in other types of registers, and don't allocate stack space for arguments passed in registers. Other changes for the fast calling convention may be added in the future. llvm-svn: 226399	2015-01-18 12:08:47 +00:00
Hal Finkel	c19805a75d	[PowerPC] Don't list R11 as a patchpoint scratch register R11's status is the same under both the PPC64 ELF V1 and V2 ABIs: it is reserved for use as an "environment pointer" for compilation models that require such a thing. We don't, we also don't need a second scratch register, and because we support only "local" patchpoint call targets, we might as well let R11 be used for anyregcc patchpoints. llvm-svn: 226369	2015-01-17 03:57:34 +00:00
Mehdi Amini	37f316afaf	Improve DAG combine pass on certain IR vector patterns Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector produced suboptimal code in AVX2 mode with certain IR combinations. In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32 (undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical, but then mysteriously generated rather bad code; the movq/movhpd combination didn't match. The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs would get promoted to 4f32 by the type legalizer, eventually resulting in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing these were both half the output size, concatted them and then produced a shuffle. However, the resulting concat + shuffle was more complex than it should be; in the case where the upper half of the output is undef, we probably want to generate shuffle + concat instead. This enhancement causes the vector_shuffle combine step to recognize this suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR in case the same suboptimal pattern occurs for other reasons. This results in the optimizer correctly producing the optimal movq + movhpd sequence for all three variations on this IR, even with AVX2. I've included a test case. Radar link: rdar://problem/19287012 Fix for PR 21943. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 226360	2015-01-17 01:35:56 +00:00
Kevin Enderby	51f5cb143f	Change the test case for llvm-objdump’s -archive-headers option to not check the size while I once again try to figure out why only the clang-cmake-armv7-a15-full bot is getting that value wrong. llvm-svn: 226345	2015-01-16 23:29:07 +00:00
Matt Arsenault	76723d733b	R600: Clean up floor tests These were using different naming schemes, not using multiple check prefixes and not using -LABEL. llvm-svn: 226333	2015-01-16 22:11:00 +00:00
Kevin Enderby	c1271893af	Fix the Archive::Child::getRawSize() method used by llvm-objdump’s -archive-headers option and tweak its use in llvm-objdump. Add back the test case for the -archive-headers option. llvm-svn: 226332	2015-01-16 22:10:36 +00:00
Colin LeMahieu	823415b881	[Hexagon] Converting halfword to doubleword multiply intrinsics. llvm-svn: 226326	2015-01-16 21:41:57 +00:00
Colin LeMahieu	cd9b276966	[Hexagon] Converting accumulating halfword multiply intrinsics to patterns. llvm-svn: 226324	2015-01-16 21:36:34 +00:00
Colin LeMahieu	3b047e0ee5	[Hexagon] Beginning converting intrinsics to patterns instead of duplicated definitions. Converting halfword multiply intrinsics. llvm-svn: 226318	2015-01-16 20:38:54 +00:00
Adam Nemet	3e8b22bc1b	[AVX512] Add intrinsics for masked aligned FP loads and stores Similar to the unaligned cases. Test was generated with update_llc_test_checks.py. Part of <rdar://problem/17688758> llvm-svn: 226296	2015-01-16 18:50:09 +00:00
Adam Nemet	9b8cfa212c	[AVX512] Remove trailing whitespaces in this test llvm-svn: 226295	2015-01-16 18:50:07 +00:00
Duncan P. N. Exon Smith	2f5bb31302	IR: Allow 16-bits for column info Raise the limit for column information from 8 bits to 16 bits. llvm-svn: 226291	2015-01-16 17:33:08 +00:00
Andrea Di Biagio	ae47bc6ab9	[X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0. This patch disables target specific combine on X86ISD::INSERTPS dag nodes if optlevel is CodeGenOpt::None. The backend currently implements a target specific combine rule that converts a vector load used by an INSERTPS dag node into a scalar load plus a scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of two instructions (i.e. a vector load plus INSERTPSrr). However, the existing target combine rule on INSERTPS nodes only works under the assumption that ISel will always be able to match an INSERTPSrm. This is not true in general at -O0, since the backend only allows folding a load into the memory operand of an instruction if the optimization level is not CodeGenOpt::None. In the example below: // __m128 test(__m128 a, __m128 b) { __m128 c = _mm_insert_ps(a, b, 1 << 6); return c; } // Before this patch, at -O0, the backend would have canonicalized the load to 'b' into a scalar load plus scalar_to_vector. Later on, ISel would have selected an INSERTPSrr leaving the insertps mask in an inconsistent state: movss 4(%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3]. With this patch, the backend avoids folding the vector load into the operand of the INSERTPS. The new codegen at -O0 is: movaps (%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3]. llvm-svn: 226277	2015-01-16 14:55:26 +00:00
Simon Pilgrim	367db8eec6	[X86] Refactored stack memory folding tests to explicitly force register spilling The current 'big vectors' stack folded reload testing pattern is very bulky and makes it difficult to test all instructions as big vectors will tend to use only the ymm instruction implementations. This patch changes the tests to use a nop call that lists explicit xmm registers as sideeffects, with this we can force a partial register spill of the relevant registers and then check that the reload is correctly folded. The asm generated only adds the forced spill, a nop instruction and a couple of extra labels (a fraction of the current approach). More exhaustive tests will follow shortly, I've added some extra tests (the xmm versions of some of the existing folding tests) as a starting point. Differential Revision: http://reviews.llvm.org/D6932 llvm-svn: 226264	2015-01-16 09:32:54 +00:00
Timur Iskhodzhanov	60b721363c	Revert r226242 - Revert Revert Don't create new comdats in CodeGen This breaks AddressSanitizer (ninja check-asan) on Windows llvm-svn: 226251	2015-01-16 08:38:45 +00:00
Filipe Cabecinhas	3ca723c9e5	Use report_fatal_error instead of llvm_unreachable, so we don't crash on user input llvm-svn: 226248	2015-01-16 04:54:12 +00:00
Hal Finkel	52f7c018d3	[PowerPC] Adjust PatchPoints for ppc64le Bill Schmidt pointed out that some adjustments would be needed to properly support powerpc64le (using the ELF V2 ABI). For one thing, R11 is not available as a scratch register, so we need to use R12. R12 is also available under ELF V1, so to maintain consistency, I flipped the order to make R12 the first scratch register in the array under both ABIs. llvm-svn: 226247	2015-01-16 04:40:58 +00:00
Mehdi Amini	590a2700fc	Fix Reassociate handling of constant in presence of undef float http://reviews.llvm.org/D6993 llvm-svn: 226245	2015-01-16 03:00:58 +00:00
Rafael Espindola	67a79e72f5	Revert "Revert Don't create new comdats in CodeGen" This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226242	2015-01-16 02:22:55 +00:00
Kevin Enderby	15e9420f69	Work around to get the build bot clang-cmake-armv7-a15-full green by removing the macho-archive-headers.test added with r226228 that it is failing on for now while I try to figure out what is going on. llvm-svn: 226241	2015-01-16 02:08:11 +00:00
Kevin Enderby	95f1860d4c	Another attempt to fix the build bot clang-cmake-armv7-a15-full failing on the macho-archive-headers.test added with r226228. llvm-svn: 226239	2015-01-16 01:09:54 +00:00
Sanjoy Das	a1837a342d	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. This pass was originally r226201. It was reverted because it used C++ features not supported by MSVC 2012. Differential Revision: http://reviews.llvm.org/D6693 llvm-svn: 226238	2015-01-16 01:03:22 +00:00
Matt Arsenault	eeb2a7e688	R600/SI: Add patterns for v_cvt_{flr\|rpi}_i32_f32 llvm-svn: 226230	2015-01-15 23:58:35 +00:00
Filipe Cabecinhas	c552c9abce	Fix edge case when Start overflowed in 32 bit mode llvm-svn: 226229	2015-01-15 23:50:44 +00:00
Kevin Enderby	13023a1af6	Add the option, -archive-headers, used with -macho to print the Mach-O archive headers to llvm-objdump. llvm-svn: 226228	2015-01-15 23:19:11 +00:00
Matt Arsenault	268757ba60	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226	2015-01-15 23:17:03 +00:00
Colin LeMahieu	cd9c4e3e07	[Hexagon] Adding new-value store and bit reverse instructions. llvm-svn: 226224	2015-01-15 23:10:29 +00:00
Filipe Cabecinhas	4013950034	Report fatal errors instead of segfaulting/asserting on a few invalid accesses while reading MachO files. Summary: Shift an older “invalid file” test to get a consistent naming for these tests. Bugs found by afl-fuzz Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6945 llvm-svn: 226219	2015-01-15 22:52:38 +00:00
Sanjoy Das	7f62ac8e4d	Revert r226201 (Add a new pass "inductive range check elimination") The change used C++11 features not supported by MSVC 2012. I will fix the change to use things supported by MSVC 2012 and recommit shortly. llvm-svn: 226216	2015-01-15 22:18:10 +00:00
Hal Finkel	e2ab0f17cf	[PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops. Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so). Consider this simple test: $ cat call.c typedef void (fp)(); void bar(fp x) { for (int i = 0; i < 1600000000; ++i) x(); } $ cat main.c typedef void (fp)(); void bar(fp x); void foo() {} int main() { bar(foo); } On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with: gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2] clang -O3 -mcpu=native call.c main.c : ~5.3 seconds clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads) The -mno-invariant-function-descriptors will be added to Clang shortly. llvm-svn: 226207	2015-01-15 21:17:34 +00:00
Colin LeMahieu	f87697f05e	[Hexagon] Updating indexed load-extend patterns and changing test to new expected output. llvm-svn: 226206	2015-01-15 21:07:52 +00:00
Sanjoy Das	7059e2959d	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. Differential Revision: http://reviews.llvm.org/D6693 llvm-svn: 226201	2015-01-15 20:45:46 +00:00
Hal Finkel	5ef58eb86d	Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"" Reapply r226071 with fixes. Two fixes: 1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machinstr verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests. 2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads. A test soon to be committed to the PowerPC backend will test this change. Original commit message: [RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226200	2015-01-15 20:32:09 +00:00
Matt Arsenault	59b09ab9ef	R600/SI: Improve fpext / fptrunc test coverage llvm-svn: 226197	2015-01-15 19:39:42 +00:00
Colin LeMahieu	538b85810c	[Hexagon] Removing old versions of vsplice, valign, cl0, ct0 and updating references to new versions. llvm-svn: 226194	2015-01-15 19:28:32 +00:00
Marek Olsak	c536850526	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI llvm-svn: 226190	2015-01-15 18:43:01 +00:00
Colin LeMahieu	504157f1ae	[Hexagon] Adding vmux instruction. Removing old transfer instructions and updating references. llvm-svn: 226184	2015-01-15 18:16:00 +00:00
Ramkumar Ramachandra	bb406c0b9a	statepoint tests: use statepoint-example gc Mechanical conversion of statepoint tests to use the example-statepoint gc. llvm-svn: 226183	2015-01-15 18:10:44 +00:00
Joerg Sonnenberger	b6956e113a	Support @PLT loads on 32bit x86. llvm-svn: 226182	2015-01-15 17:59:02 +00:00
Colin LeMahieu	2d1c14563e	[Hexagon] Deleting old float comparison instruction and updating references to new ones. llvm-svn: 226179	2015-01-15 17:28:14 +00:00
Colin LeMahieu	7959cac725	[Hexagon] Replacing old fadd/fsub instructions and updating references. llvm-svn: 226176	2015-01-15 16:30:07 +00:00
Timur Iskhodzhanov	f5adf13fac	Revert Don't create new comdats in CodeGen It breaks AddressSanitizer on Windows. llvm-svn: 226173	2015-01-15 16:14:34 +00:00
Daniel Sanders	023c806109	[mips] Fix a typo in the compare patterns for MIPS32r6/MIPS64r6. Summary: The patterns intended for the SETLE node were actually matching the SETLT node. Reviewers: atanasyan, sstankovic, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6997 llvm-svn: 226171	2015-01-15 15:41:03 +00:00
Vladimir Medic	df3ed1c9d6	Add disassembler tests for mips64r6 platform. There are no functional changes. llvm-svn: 226166	2015-01-15 14:18:12 +00:00
Vladimir Medic	d6d486ddcc	Add disassembler tests for mips32r6 platform. There are no functional changes. llvm-svn: 226165	2015-01-15 14:11:38 +00:00
Vladimir Medic	5dcf17b881	Add disassembler tests for mips64r2 platform. There are no functional changes. llvm-svn: 226164	2015-01-15 14:06:34 +00:00
Chandler Carruth	8ca43224db	[PM] Port TargetLibraryInfo to the new pass manager, provided by the TargetLibraryAnalysis pass. There are actually no direct tests of this already in the tree. I've added the most basic test that the pass manager bits themselves work, and the TLI object produced will be tested by upcoming patches as they port passes which rely on TLI. This is starting to point out the awkwardness of the invalidate API -- it seems poorly fitting on the result object. I suspect I will change it to live on the analysis instead, but that's not for this change, and I'd rather have a few more passes ported in order to have more experience with how this plays out. I believe there is only one more analysis required in order to start porting instcombine. =] llvm-svn: 226160	2015-01-15 11:39:46 +00:00
Vladimir Medic	e993dac523	Add disassembler tests for mips64 platform. There are no functional changes. llvm-svn: 226151	2015-01-15 08:50:20 +00:00
Hal Finkel	dd669615dd	Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers" Reverting this while I investigate some bad behavior this is causing. As a possibly-related issue, adding -verify-machineinstrs to one of the test cases now fails because of this change: llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs * Bad machine code: No instruction at def index * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS Valno #3 is defined at 624r * Bad machine code: Live segment doesn't end at a valid instruction * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS [624r,624d:3) LLVM ERROR: Found 2 machine code errors. where 624r corresponds exactly to the interval combining change: 624B %RSP<def> = COPY %vreg16; GR64:%vreg16 Considering merging %vreg16 with %RSP RHS = %vreg16 [608r,624r:0) 0@608r updated: 608B %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1] Success: %vreg16 -> %RSP Result = %RSP llvm-svn: 226086	2015-01-15 03:08:59 +00:00
Sanjoy Das	8c252bde36	Fix PR22222 The bug was introduced in r225282. r225282 assumed that sub X, Y is the same as add X, -Y. This is not correct if we are going to upgrade the sub to sub nuw. This change fixes the issue by making the optimization ignore sub instructions. Differential Revision: http://reviews.llvm.org/D6979 llvm-svn: 226075	2015-01-15 01:46:09 +00:00
Hal Finkel	8299646236	[RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226071	2015-01-15 01:25:28 +00:00
Hal Finkel	64202167c5	[PowerPC] Add assembler support for mcrfs and friends Fill out our support for the floating-point status and control register instructions (mcrfs and friends). As it turns out, these are necessary for compiling src/test/harness_fp.h in TBB for PowerPC. Thanks to Raf Schietekat for reporting the issue! llvm-svn: 226070	2015-01-15 01:00:53 +00:00
Richard Smith	e78bb1249e	For PR21145: recognise a builtin call to a known deallocation function even if it's defined in the current module. Clang generates this situation for the C++14 sized deallocation functions, because it generates a weak definition in case one isn't provided by the C++ runtime library. llvm-svn: 226069	2015-01-15 01:00:33 +00:00
Ramkumar Ramachandra	dba7329ebb	[GC] CodeGenPrep transform: simplify offsetable relocate The transform is somewhat involved, but the basic idea is simple: find derived pointers that have been offset from the base pointer using gep and replace the relocate of the derived pointer with a gep to the relocated base pointer (with the same offset). llvm-svn: 226060	2015-01-14 23:27:07 +00:00
Philip Reames	8d5d68f1aa	getMangledTypeStr: clarify how it mangles types, and add tests "Write a set of tests that show how name mangling is done for overloaded intrinsics." These tests happen to use gc.relocates to exercise the codepath in question, but they are not GC-specific. Patch by: artagnon@gmail.com Differential Revision: http://reviews.llvm.org/D6915 llvm-svn: 226056	2015-01-14 23:05:17 +00:00
Duncan P. N. Exon Smith	9885469922	IR: Move MDLocation into place This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. llvm-svn: 226048	2015-01-14 22:27:36 +00:00
Duncan P. N. Exon Smith	503cf3bff9	IR: Always print MDLocation line Print `MDLocation`'s `line` field even when it's 0. llvm-svn: 226046	2015-01-14 22:14:26 +00:00
Rafael Espindola	fad1639a12	Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226038	2015-01-14 20:55:48 +00:00
Rafael Espindola	a371989f66	Add a test that would have found the issue with r225644. llvm-svn: 226035	2015-01-14 20:24:46 +00:00
Chandler Carruth	e3288147f0	[MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement. Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to help measure the difference. A possible explanation: in diamond-shaped CFGs (A followed by either B or C, both followed by D), putting B and C both in between A and D leads to the code being less dense than it could be. Either B or C always has to be skipped, increasing the chance of cache misses etc. Moving either B or C to after D might be beneficial on average. In the long run, we should probably do a better job of analyzing the basic block and branch probabilities to move the correct one of B or C to after D. But even if we don't use this in the long run, it is a good baseline for benchmarking. Original patch authored by Daniel Jasper with test tweaks and a second flag added by me. Differential Revision: http://reviews.llvm.org/D6969 llvm-svn: 226034	2015-01-14 20:19:29 +00:00
Bill Schmidt	082cfc05f1	[PPC64] Add support for the ICBT instruction on POWER8. Patch by Kit Barton. Support for the ICBT instruction is currently present, but limited to embedded processors. This change adds a new FeatureICBT that can be used to identify whether the ICBT instruction is available on a specific processor. Two new tests are added: * Positive test to ensure the icbt instruction is present when using -mcpu=pwr8 * Negative test to ensure the icbt instruction is not generated when using -mcpu=pwr7 Both test cases use the Prefetch opcode in LLVM. They are based on the ppc64-prefetch.ll test case. llvm-svn: 226033	2015-01-14 20:17:10 +00:00
Rafael Espindola	9e3e53f6dd	Fix linking of shared libraries. In shared libraries the plugin can see non-weak declarations that are still undefined. llvm-svn: 226031	2015-01-14 20:08:46 +00:00
Rafael Espindola	0fd9e5f719	Fix handling of extern_weak. This was broken by r225983. llvm-svn: 226026	2015-01-14 19:43:32 +00:00
David Majnemer	a0afb55ff9	InstCombine: Don't turn A-B<0 into A<B if A-B has other uses This fixes PR22226. llvm-svn: 226023	2015-01-14 19:26:56 +00:00
Rafael Espindola	7244bb3c17	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. llvm-svn: 226022	2015-01-14 19:07:23 +00:00
Rafael Espindola	4e74d3be35	Add support for comdats with names larger than 256 characters. llvm-svn: 226012	2015-01-14 18:25:45 +00:00
Olivier Sallenave	d657321aef	Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. llvm-svn: 225987	2015-01-14 15:36:28 +00:00
Rafael Espindola	5ca7fa15fd	Handle a symbol being undefined. This can happen if: * It is present in a comdat in one file. * It is not present in the comdat of the file that is kept. * It is not used. This should fix the LTO bootstrap. Thanks to Takumi NAKAMURA for setting up the bot! llvm-svn: 225983	2015-01-14 13:53:50 +00:00
Vladimir Medic	1080666e80	Add disassembler tests for mips32r2 platform. There are no functional changes. llvm-svn: 225980	2015-01-14 11:35:22 +00:00
Jyoti Allur	5a1391410d	Correct POP handling for v7m llvm-svn: 225972	2015-01-14 10:48:16 +00:00
Chandler Carruth	64764b446b	[PM] Port domtree to the new pass manager (at last). This adds the domtree analysis to the new pass manager. The analysis returns the same DominatorTree result entity used by the old pass manager and essentially all of the code is shared. We just have different boilerplate for running and printing the analysis. I've converted one test to run in both modes just to make sure this is exercised while both are live in the tree. llvm-svn: 225969	2015-01-14 10:19:28 +00:00
Kai Nacke	755b6e8a42	[mips] Refine octeon instructions seq/seqi/sne/snei This commit refines the pattern for the octeon seq/seqi/sne/snei instructions. The target register is set to 0 or 1 according to the result of the comparison. In C, this is something like rd = (unsigned long)(rs == rt) This commit adds a zext to bring the result to i64. With this change the instruction is selected for this type of code. (gcc produces the same code for the above C code.) llvm-svn: 225968	2015-01-14 10:19:09 +00:00
Vladimir Medic	62dfce3240	Add disassembler tests for mips32r2 platform. There are no functional changes. llvm-svn: 225967	2015-01-14 10:18:56 +00:00
Brad Smith	dd6675cef9	Use the integrated assembler by default on SPARC. llvm-svn: 225957	2015-01-14 07:53:39 +00:00
JF Bastien	eeea8970b4	Revert "Insert random noops to increase security against ROP attacks (llvm)" This reverts commit: http://reviews.llvm.org/D3392 llvm-svn: 225948	2015-01-14 05:24:33 +00:00
Saleem Abdulrasool	ca24b1d638	X86: validate 'int' instruction The int instruction takes as an operand an 8-bit immediate value. Validate that the input is valid rather than silently truncating the value. llvm-svn: 225941	2015-01-14 05:10:21 +00:00
NAKAMURA Takumi	c16c427ebc	Disable a couple of tests, CodeGen/X86/noop-insert.ll and CodeGen/X86/noop-insert-percentage.ll, in r225908, to unbreak tests. llvm-svn: 225940	2015-01-14 04:21:33 +00:00
Chandler Carruth	ef7a9fb63b	[dom] Add a basic dominator tree test. Correct, we have zero basic testing of the dominator tree in the regression test suite. There is a single test that even prints it out, and that test only checks a single line of the output. There are a handful of tests that check post dominators, but all of those are looking for bugs rather than just exercising the basic machinery. This test is super boring and unexciting. But hey, it's something. I needed there to be something so I could switch the basic test to run with both the old and new pass manager. llvm-svn: 225936	2015-01-14 03:34:55 +00:00
Tim Northover	a203ca61af	ARM: add test for crc32 instructions in CodeGen. Somehow we seem to have ended up without any actual tests of the CodeGen side. Easy enough to fix. llvm-svn: 225930	2015-01-14 01:43:33 +00:00
Hal Finkel	2307a2f088	[PowerPC] Fix the noop-insert test The form of nops used is CPU-specific (some CPUs, such as the POWER7, have special group-terminating nops). We probably want a different callback for this kind of nop insertion (something more like MCAsmBackend::writeNopData), or for PPC to use a different mechanism for scheduling nops, but this will stop the test from failing for now. llvm-svn: 225928	2015-01-14 01:37:21 +00:00
Matt Arsenault	edb6f03852	R600/SI: Remove some redundant load testcases. This reduces coverage for Evergreen, since the more complete tests have those run lines disabled. llvm-svn: 225927	2015-01-14 01:35:26 +00:00
Matt Arsenault	e698663687	R600/SI: Fix bad code with unaligned byte vector loads Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. llvm-svn: 225926	2015-01-14 01:35:22 +00:00
Matt Arsenault	bd22342322	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925	2015-01-14 01:35:17 +00:00
Duncan P. N. Exon Smith	a5a0f5766a	Utils: Handle remapping distinct MDLocations Part of PR21433. llvm-svn: 225921	2015-01-14 01:29:32 +00:00
Duncan P. N. Exon Smith	47d82981d6	Utils: Add mapping for uniqued MDLocations Still doesn't handle distinct ones. Part of PR21433. llvm-svn: 225914	2015-01-14 01:20:27 +00:00
Hal Finkel	934361a4b8	Revert "r225811 - Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support"" This re-applies r225808, fixed to avoid problems with SDAG dependencies along with the preceding fix to ScheduleDAGSDNodes::RegDefIter::InitNodeNumDefs. These problems caused the original regression tests to assert/segfault on many (but not all) systems. Original commit message: This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! llvm-svn: 225909	2015-01-14 01:07:51 +00:00
JF Bastien	dcdd5ad252	Insert random noops to increase security against ROP attacks (llvm) A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks. Command line options: -noop-insertion // Enable noop insertion. -noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion) -max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions. In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future). -fdiversify This is the llvm part of the patch. clang part: D3393 http://reviews.llvm.org/D3392 Patch by Stephen Crane (@rinon) llvm-svn: 225908	2015-01-14 01:07:26 +00:00
Reid Kleckner	0a57f65514	CodeGen support for x86_64 SEH catch handlers in LLVM This adds handling for ExceptionHandling::MSVC, used by the x86_64-pc-windows-msvc triple. It assumes that filter functions have already been outlined in either the frontend or the backend. Filter functions are used in place of the landingpad catch clause type info operands. In catch clause order, the first filter to return true will catch the exception. The C specific handler table expects the landing pad to be split into one block per handler, but LLVM IR uses a single landing pad for all possible unwind actions. This patch papers over the mismatch by synthesizing single instruction BBs for every catch clause to fill in the EH selector that the landing pad block expects. Missing functionality: - Accessing data in the parent frame from outlined filters - Cleanups (from __finally) are unsupported, as they will require outlining and parent frame access - Filter clauses are unsupported, as there's no clear analogue in SEH In other words, this is the minimal set of changes needed to write IR to catch arbitrary exceptions and resume normal execution. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6300 llvm-svn: 225904	2015-01-14 01:05:27 +00:00
Ahmed Bougacha	71d7b18e3d	[SimplifyLibCalls] Don't try to simplify indirect calls. It turns out, all callsites of the simplifier are guarded by a check for CallInst::getCalledFunction (i.e., to make sure the callee is direct). This check wasn't done when trying to further optimize a simplified fortified libcall, introduced by a refactoring in r225640. Fix that, add a testcase, and document the requirement. llvm-svn: 225895	2015-01-14 00:55:05 +00:00
Adrian Prantl	092d9489ed	Debug Info: Move the complex expression handling (=the remainder) of emitDebugLocValue() into DwarfExpression. Ought to be NFC, but it actually uncovered a bug in the debug-loc-asan.ll testcase. The testcase checks that the address of variable "y" is stored at [RSP+16], which also lines up with the comment. It also check(ed) that the value of "y" is stored in RDI before that, but that is actually incorrect, since RDI is the very value that is stored in [RSP+16]. Here's the assembler output: movb 2147450880(%rcx), %r8b #DEBUG_VALUE: bar:y <- RDI cmpb $0, %r8b movq %rax, 32(%rsp) # 8-byte Spill movq %rsi, 24(%rsp) # 8-byte Spill movq %rdi, 16(%rsp) # 8-byte Spill .Ltmp3: #DEBUG_VALUE: bar:y <- [RSP+16] Fixed the comment to spell out the correct register and the check to expect an address rather than a value. Note that the range that is emitted for the RDI location was and is still wrong, it claims to begin at the function prologue, but really it should start where RDI is first assigned. llvm-svn: 225851	2015-01-13 23:39:11 +00:00
Adam Nemet	d23c88db15	[AVX512] Add 16x32 unpck tests as well Forgot this from r225838. llvm-svn: 225850	2015-01-13 23:27:55 +00:00
Chandler Carruth	703378f156	[PM] Remove the defunct CGSCC-specific debug flag. Even before I sunk the debug flag into the opt tool this had been made obsolete by factoring the pass and analysis managers into a single set of templates that all used the core flag. No functionality changed here. llvm-svn: 225842	2015-01-13 22:45:13 +00:00
Adam Nemet	f07a0ba96c	Fix function names in tests from r225838. llvm-svn: 225840	2015-01-13 22:40:15 +00:00
Adam Nemet	e5dbcb7fd0	[AVX512] Unpack support in new shuffle lowering This now handles both 32 and 64-bit element sizes. In this version, the tests are in vector-shuffle-512-v8.ll, canonicalized by Chandler's update_llc_test_checks.py. Part of <rdar://problem/17688758> llvm-svn: 225838	2015-01-13 22:20:18 +00:00
Duncan P. N. Exon Smith	6a4848324b	AsmParser/Bitcode: Add support for MDLocation This adds assembly and bitcode support for `MDLocation`. The assembly side is rather big, since this is the first `MDNode` subclass (that isn't `MDTuple`). Part of PR21433. (If you're wondering where the mountains of testcase updates are, we don't need them until I update `DILocation` and `DebugLoc` to actually use this class.) llvm-svn: 225830	2015-01-13 21:10:44 +00:00
Matt Arsenault	e93d06a579	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. llvm-svn: 225827	2015-01-13 20:53:18 +00:00
Matt Arsenault	b56d843348	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822	2015-01-13 19:46:48 +00:00
Ulrich Weigand	bd039299c0	Use the integrated assembler as default on SystemZ This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases deliberately using an invalid instruction in inline asm now have to use -no-integrated-as. llvm-svn: 225820	2015-01-13 19:45:16 +00:00
Ulrich Weigand	6b577e26f0	Use the integrated assembler as default on PowerPC This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases using inline asm had to be adapted, either by updating the expected output, or by using -no-integrated-as (for such tests that deliberately use an invalid instruction in inline asm). llvm-svn: 225819	2015-01-13 19:43:45 +00:00
Hal Finkel	63fb928109	Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support" Reverting this while I investigate buildbot failures (segfaulting in GetCostForDef at ScheduleDAGRRList.cpp:314). llvm-svn: 225811	2015-01-13 18:25:05 +00:00
Will Schmidt	4a2d333982	Update multiline.ll testcase to handle (ppc64le) .localentry directive The ppc64le platform will emit a .localentry directive. This is triggering a false-positive against a CHECK-NOT: .loc in multiline.ll. Add a space "{{ }}" to the check-not line to allow for arguments, and prevent .localentry from matching. Differential Revision: http://reviews.llvm.org/D6935 llvm-svn: 225810	2015-01-13 18:17:08 +00:00
Hal Finkel	821befd52b	[PowerPC] Add StackMap/PatchPoint support This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! llvm-svn: 225808	2015-01-13 17:48:12 +00:00
Jozef Kolek	e7cad7a1df	[mips][microMIPS] Fix issue with 16b instructions in jr instruction delay slot 16-bit instructions are not allowed in the jr delay slot. The same holds for PseudoIndirectBranch and PseudoReturn. Differential Revision: http://reviews.llvm.org/D6815 llvm-svn: 225798	2015-01-13 15:59:17 +00:00
Chandler Carruth	816702ffe0	[PM] Refactor the new pass manager to use a single template to implement the generic functionality of the pass managers themselves. In the new infrastructure, the pass "manager" isn't actually interesting at all. It just pipelines a single chunk of IR through N passes. We don't need to know anything about the IR or the passes to do this really and we can replace the 3 implementations of the exact same functionality with a single generic PassManager template, complementing the single generic AnalysisManager template. I've left typedefs in place to give convenient names to the various obvious instantiations of the template. With this, I think I've nuked almost all of the redundant logic in the managers, and I think the overall design is actually simpler for having single templates that clearly indicate there is no special logic here. The logging is made somewhat more annoying by this change, but I don't think the difference is worth having heavy-weight traits to help log things. llvm-svn: 225783	2015-01-13 11:13:56 +00:00
Chandler Carruth	7ad6d620b7	[PM] Fold all three analysis managers into a single AnalysisManager template. This consolidates three copies of nearly the same core logic. It adds "complexity" to the ModuleAnalysisManager in that it makes it possible to share a ModuleAnalysisManager across multiple modules... But it does so by deleting all of the code, so I'm OK with that. This will naturally make fixing bugs in this code much simpler, etc. The only down side here is that we have to use 'typename' and 'this->' in various places, and the implementation is lifted into the header. I'll take that for the code size reduction. The convenient names are still typedef-ed and used throughout so that users can largely ignore this aspect of the implementation. The follow-up change to this will do the exact same refactoring for the PassManagers. =D It turns out that the interesting different code is almost entirely in the adaptors. At the end, that should be essentially all that is left. llvm-svn: 225757	2015-01-13 02:51:47 +00:00
Reid Kleckner	3542ace6ef	Rename llvm.recoverframeallocation to llvm.framerecover This name is less descriptive, but it sort of puts things in the 'llvm.frame...' namespace, relating it to frameallocate and frameaddress. It also avoids using "allocate" and "allocation" together. llvm-svn: 225752	2015-01-13 01:51:34 +00:00
Reid Kleckner	e9b8931873	Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block. Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function. These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation. Reviewers: echristo, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D6493 llvm-svn: 225746	2015-01-13 00:48:10 +00:00
Matt Arsenault	a982e4f82b	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. llvm-svn: 225744	2015-01-13 00:43:00 +00:00
Reid Kleckner	bba20f06de	musttail: Only set the inreg flag for fastcall and vectorcall Otherwise we'll attempt to forward ECX, EDX, and EAX for cdecl and stdcall thunks, leaving us with no scratch registers for indirect call targets. Fixes PR22052. llvm-svn: 225729	2015-01-12 23:28:23 +00:00
Adrian Prantl	b16d9ebb0c	Debug info: Factor out the creation of DWARF expressions from AsmPrinter into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change — Testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. llvm-svn: 225706	2015-01-12 22:19:22 +00:00
Ahmed Bougacha	291833b959	[X86] Also create+widen FMIN/FMAX nodes for v2f32. This happens in the HINT benchmark, where the SLP-vectorizer created v2f32 fcmp/select code. The "correct" solution would have been to teach the vectorizer cost model that v2f32 isn't legal (because really, it isn't), but if we can vectorize we might as well do so. We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on. v3f32 were already widened to v4f32 by the generic unroll-and-build-vector legalization. rdar://15763436 Differential Revision: http://reviews.llvm.org/D6557 llvm-svn: 225691	2015-01-12 20:31:30 +00:00
Ahmed Bougacha	66fde538ee	[X86] Make SSE min/max testcases more explicit. NFC. llvm-svn: 225687	2015-01-12 20:15:47 +00:00
Tom Stellard	b6550529a6	R600/SI: Use RegisterOperands to specify which operands can accept immediates There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. llvm-svn: 225662	2015-01-12 19:33:18 +00:00
Sanjay Patel	5f1d9eaad3	GVN: propagate equalities for floating point compares Allow optimizations based on FP comparison values in the same way as integers. This resolves PR17713: http://llvm.org/bugs/show_bug.cgi?id=17713 Differential Revision: http://reviews.llvm.org/D6911 llvm-svn: 225660	2015-01-12 19:29:48 +00:00
Rafael Espindola	d9c3e308f5	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644	2015-01-12 18:13:07 +00:00
Jozef Kolek	9761e96b01	[mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions Differential Revision: http://reviews.llvm.org/D5271 llvm-svn: 225627	2015-01-12 12:03:34 +00:00
Richard Smith	600ee4ad66	Put this test's input in the Inputs directory where it belongs, rather than reusing a file from a different test directory. llvm-svn: 225621	2015-01-12 08:50:47 +00:00
Hal Finkel	87deb0b8e3	[PowerPC] Fix calls to non-function objects Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. llvm-svn: 225617	2015-01-12 04:34:47 +00:00
David Majnemer	14141f941a	Revert most of r225597 We can't rely on a DataLayout enlightened constant folder. llvm-svn: 225599	2015-01-11 07:29:51 +00:00
David Majnemer	292d0c796b	X86: Properly decode shuffle masks when the constant pool type is weird It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. llvm-svn: 225597	2015-01-11 05:08:57 +00:00
Saleem Abdulrasool	9cf2679d3b	X86: teach X86TargetLowering about L,M,O constraints Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. llvm-svn: 225596	2015-01-11 04:39:24 +00:00
Saleem Abdulrasool	fe781977b9	ARM: add support for segment base relocations (SBREL) This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. llvm-svn: 225595	2015-01-11 04:39:18 +00:00
Chandler Carruth	c491f72e7a	[x86] Remove some windows line endings that snuck into the tests here. Folks on Windows, remember to set up your subversion to strip these when submitting... llvm-svn: 225593	2015-01-11 01:36:20 +00:00
Sanjoy Das	81401d4b19	Fix PR22179. We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 llvm-svn: 225591	2015-01-10 23:41:24 +00:00
Simon Pilgrim	94a4cc027a	[X86][SSE] Improved (v)insertps shuffle matching In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 llvm-svn: 225589	2015-01-10 19:45:33 +00:00
Hal Finkel	5d5d1539cc	[PowerPC] Mark zext of a small scalar load as free This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. llvm-svn: 225584	2015-01-10 08:21:59 +00:00
Saleem Abdulrasool	c552218e28	tests: fix previous commit The previous commit accidentally missed changes to the test output checking, resulting in an errant failure. llvm-svn: 225577	2015-01-10 02:53:25 +00:00
Saleem Abdulrasool	48bbb6c821	test: merge ARM relocations test There is a fair number of relocations that are part of the AAELF specification. Simply merge the tests into a single test file, otherwise, we will end up with far too many test files to test each relocation type. NFC. llvm-svn: 225576	2015-01-10 02:48:29 +00:00
Saleem Abdulrasool	ff2da70fdd	tests: convert a couple of ARM relocation tests to readobj These tests are checking the relocation generation. Use the readobj output as it is much easier to follow when glancing over the tests. llvm-svn: 225575	2015-01-10 02:48:25 +00:00
Justin Hibbits	654346e6f9	Fully fix Bug #22115 . Summary: In the previous commit, the register was saved, but space was not allocated. This resulted in the parameter save area potentially clobbering r30, leading to nasty results. Test Plan: Tests updated Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6906 llvm-svn: 225573	2015-01-10 01:57:21 +00:00
Hal Finkel	611b127ad8	[PowerPC] Readjust the loop unrolling threshold Now that the way that the partial unrolling threshold for small loops is used to compute the unrolling factor has been corrected, a slightly smaller threshold is preferable. This is expected; other targets may need to re-tune as well. llvm-svn: 225566	2015-01-10 00:31:10 +00:00
Hal Finkel	38dd590861	[LoopUnroll] Fix the partial unrolling threshold for small loop sizes When we compute the size of a loop, we include the branch on the backedge and the comparison feeding the conditional branch. Under normal circumstances, these don't get replicated with the rest of the loop body when we unroll. This led to the somewhat surprising behavior that really small loops would not get unrolled enough -- they could be unrolled more and the resulting loop would be below the threshold, because we were assuming they'd take (LoopSize * UnrollingFactor) instructions after unrolling, instead of (((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation. llvm-svn: 225565	2015-01-10 00:30:55 +00:00
Rafael Espindola	d0b23bef6f	Use the DiagnosticHandler to print diagnostics when reading bitcode. The bitcode reading interface used std::error_code to report an error to the callers and it is the caller's job to print diagnostics. This is not ideal for error handling or diagnostic reporting: * For error handling, all that the callers care about is 3 possibilities: * It worked * The bitcode file is corrupted/invalid. * The file is not bitcode at all. * For diagnostic, it is user friendly to include far more information about the invalid case so the user can find out what is wrong with the bitcode file. This comes up, for example, when a developer introduces a bug while extending the format. The compromise we had was to have a lot of error codes. With this patch we use the DiagnosticHandler to communicate with the human and std::error_code to communicate with the caller. This allows us to have far fewer error codes and adds the infrastructure to print better diagnostics. This is so because the diagnostics are printed when the issue is found. The code that detected the problem is alive on the stack and can pass down as much context as needed. As an example the patch updates test/Bitcode/invalid.ll. Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the caller. A simple one like llvm-dis can just use fatal errors. The gold plugin needs a bit more complex treatment because of being passed non-bitcode files. A hypothetical interactive tool would make all bitcode errors non-fatal. llvm-svn: 225562	2015-01-10 00:07:30 +00:00
Alexey Samsonov	55acbc071c	Disable Go bindings test under UBSan. llvm-svn: 225557	2015-01-09 23:17:23 +00:00
Andrew Kaylor	a10379ad49	Fix the JIT event listeners and replace the associated tests. The changes to EventListenerCommon.h were contributed by Arch Robison. This fixes bug 22095. http://reviews.llvm.org/D6905 llvm-svn: 225554	2015-01-09 22:53:24 +00:00
Hans Wennborg	dcc6e5bc03	SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. llvm-svn: 225552	2015-01-09 22:13:31 +00:00
Simon Pilgrim	ec1f2c2cab	[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable. Differential Revision: http://reviews.llvm.org/D6878 llvm-svn: 225551	2015-01-09 22:03:19 +00:00
Rafael Espindola	1ea49d1bdd	Add a testcase of llvm-lto error handling. llvm-svn: 225545	2015-01-09 20:55:09 +00:00
Kevin Enderby	131d1770f6	Add the option, -universal-headers, used with -macho to print the Mach-O universal headers to llvm-objdump. llvm-svn: 225537	2015-01-09 19:22:37 +00:00
Tim Northover	eb16112e97	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. llvm-svn: 225536	2015-01-09 19:19:56 +00:00
Daniel Sanders	1440bb2a26	[mips] Add support for accessing $gp as a named register. Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 llvm-svn: 225529	2015-01-09 17:21:30 +00:00
Hal Finkel	b359b735d6	[PowerPC] Enable late partial unrolling on the POWER7 The P7 benefits from not having really-small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. llvm-svn: 225522	2015-01-09 15:51:16 +00:00
Saleem Abdulrasool	b68fa3b576	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	3e81ecfeb6	test: add additional test for SVN r225507 Add an additional test case to ensure that we generate the relocation even if the thumb target is used. llvm-svn: 225509	2015-01-09 06:57:18 +00:00
Saleem Abdulrasool	3c0f78a2fc	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507	2015-01-09 05:59:12 +00:00
Matthias Braun	7e87384592	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. llvm-svn: 225503	2015-01-09 03:01:31 +00:00
Hal Finkel	6c39269a4c	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. llvm-svn: 225493	2015-01-09 01:34:30 +00:00
Duncan P. N. Exon Smith	953e1a48f0	Utils: Keep distinct MDNodes distinct in MapMetadata() Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. llvm-svn: 225476	2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith	090a19bd3c	IR: Add 'distinct' MDNodes to bitcode and assembly Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474	2015-01-08 22:38:29 +00:00
Hal Finkel	3c0952b072	[PowerPC] Mark all instructions as non-cheap for MachineLICM MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. llvm-svn: 225471	2015-01-08 22:11:49 +00:00
Akira Hatanaka	442b40c2eb	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467	2015-01-08 20:44:50 +00:00
Matt Arsenault	b935d9df4c	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. llvm-svn: 225465	2015-01-08 20:09:34 +00:00
Lang Hames	e89539f711	[MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout strings from the copies that remain. llvm-svn: 225460	2015-01-08 18:52:15 +00:00
Rafael Espindola	dffdf14bb7	Make this test a bit stricter. It now checks for the end of the line or the opening '{'. While at it, remove empty comments. llvm-svn: 225451	2015-01-08 16:11:18 +00:00
Justin Hibbits	98a532dd8e	Add saving and restoring of r30 to the prologue and epilogue, respectively Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 llvm-svn: 225450	2015-01-08 15:47:19 +00:00
Kristof Beyls	933de7aa06	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out at most the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446	2015-01-08 15:09:14 +00:00
Tom Stellard	654d669e56	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445	2015-01-08 15:08:17 +00:00
Elena Demikhovsky	285fbd551a	Masked Load/Store - fixed a bug in type legalization. llvm-svn: 225441	2015-01-08 12:29:19 +00:00
Michael Kuperstein	381dc08bc1	Fix a think-o in the test for r225438. llvm-svn: 225440	2015-01-08 12:05:02 +00:00
Michael Kuperstein	46f7d525c3	[X86] Don't try to generate direct calls to TLS globals The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 llvm-svn: 225438	2015-01-08 11:50:58 +00:00
Craig Topper	0c4d51b779	Fix test case I missed in r225432. llvm-svn: 225434	2015-01-08 07:57:27 +00:00
Craig Topper	7c10252943	[X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized. llvm-svn: 225432	2015-01-08 07:41:30 +00:00
Adrian Prantl	2561bb8831	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." This reverts commit r225379 while investigating an assertion failure reported by Alexey. llvm-svn: 225424	2015-01-08 02:02:00 +00:00
Quentin Colombet	a799e2e014	[RegAllocGreedy] Introduce a late pass to repair broken hints. A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them. Context Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split points may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with a pattern like this: def A ... save A to B def A use A restore A from B ... use B Whereas we could simply have done: def B ... def A use A ... use B Proposed Solution A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evicts another variable. Indeed, if no eviction happens it is pointless to look for recoloring opportunities as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple unlike the last chance recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it, otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) with the target color.
If a) is greater than or equal to b), then it is profitable and the recoloring happens. Example Consider the following example: BB1: a = b = BB2: ... = b = a Let us assume b gets split: BB1: a = b = BB2: c = b ... d = c = d = a Because of how the allocation works, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a. We end up with: BB1: a = st a, SpillSlot b = BB2: c = b ... d = c = d e = ld SpillSlot = e It is likely that we can assign the same register for b, c, and d, getting rid of 2 copies. Performances Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that come from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> llvm-svn: 225422	2015-01-08 01:16:39 +00:00
Matthias Braun	d55e6ddacf	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415	2015-01-07 23:58:38 +00:00
Philip Reames	76ebd15437	[GC] improve testing around gc.relocate and fix a test Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 llvm-svn: 225412	2015-01-07 22:48:01 +00:00
Tom Stellard	0599297cb4	R600/SI: Commute instructions to enable more folding opportunities llvm-svn: 225410	2015-01-07 22:44:19 +00:00
Tom Stellard	26cc18df43	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instruction will increase program size, which can hurt performance. llvm-svn: 225405	2015-01-07 22:18:27 +00:00
Duncan P. N. Exon Smith	df55d8ba83	Linker: Don't use MDNode::replaceOperandWith() `MDNode::replaceOperandWith()` changes all instances of metadata. Stop using it when linking module flags, since (due to uniquing) the flag values could be used by other metadata. Instead, use new API `NamedMDNode::setOperand()` to update the reference directly. llvm-svn: 225397	2015-01-07 21:32:27 +00:00
Alexey Samsonov	82826f7a55	XFAIL several MCJIT EH tests under ASan and MSan bootstrap. llvm-svn: 225393	2015-01-07 21:27:26 +00:00
Rafael Espindola	31875f7490	Add a test that would have found the issue in r224935. llvm-svn: 225385	2015-01-07 21:10:25 +00:00
Kevin Enderby	e2297ddd11	Slightly refactor things for llvm-objdump and the -macho option so it can be used with options other than just -disassemble so that universal files can be used with other options combined with -arch options. No functional change to existing options and use. One test case added for the additional functionality with a universal file and a -arch option. llvm-svn: 225383	2015-01-07 21:02:18 +00:00
Olivier Sallenave	0451532996	More FMA folding opportunities. llvm-svn: 225380	2015-01-07 20:54:17 +00:00
Adrian Prantl	72b8ee708f	Reapply: Teach SROA how to update debug info for fragmented variables. The two buildbot failures were addressed in LLVM r225378 and CFE r225359. This reapplies commit 225272 without modifications. llvm-svn: 225379	2015-01-07 20:52:22 +00:00
Adrian Prantl	3dd48c6fde	Debug info: Allow aggregate types to be described by constants. llvm-svn: 225378	2015-01-07 20:48:58 +00:00
Colin LeMahieu	627df427eb	[Hexagon] Adding floating point classification and creation. llvm-svn: 225374	2015-01-07 20:28:57 +00:00
Tom Stellard	4842c05216	R600/SI: Add a V_MOV_B64 pseudo instruction This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. llvm-svn: 225373	2015-01-07 20:27:25 +00:00
Colin LeMahieu	290ece7d4c	[Hexagon] Adding encodings for v5 floating point instructions. llvm-svn: 225372	2015-01-07 20:24:09 +00:00
Colin LeMahieu	777abcb1d7	[Hexagon] Adding encoding for popcount, fastcorner, dword asr with rounding. llvm-svn: 225371	2015-01-07 20:07:28 +00:00
Tom Stellard	ef3b864a07	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding This allows folding of sequences like: s[0:1] = s_mov_b64 4 v_add_i32 v0, s0, v0 v_addc_u32 v1, s1, v1 into v_add_i32 v0, 4, v0 v_add_i32 v1, 0, v1 llvm-svn: 225369	2015-01-07 19:56:17 +00:00
Philip Reames	4ac17a3026	Introduce an example statepoint GC strategy This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.) Most of the validation logic added here as proof of concept will soon move into the Verifier. That move is dependent on http://reviews.llvm.org/D6811 There was discussion in the review thread about addrspace(1) being reserved for something. I'm going to follow up on a separate llvmdev thread. If needed, I'll update all the code at once. Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others. Differential Revision: http://reviews.llvm.org/D6808 llvm-svn: 225365	2015-01-07 19:07:50 +00:00
David Majnemer	4d77fdf311	X86: Allow the stack probe size to be configurable per function LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt-out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. llvm-svn: 225360	2015-01-07 18:14:07 +00:00
Ahmed Bougacha	aa2d290997	[X86] Teach FCOPYSIGN lowering to recognize constant magnitudes. For code like: float foo(float x) { return copysign(1.0, x); } We used to generate: andps <-0.000000e+00,0,0,0>, %xmm0 movss <1.000000e+00>, %xmm1 andps <nan>, %xmm1 orps %xmm0, %xmm1 Basically doing an abs(1.0f) in the two middle instructions. We now generate: andps <-0.000000e+00,0,0,0>, %xmm0 orps <1.000000e+00,0,0,0>, %xmm0 Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 llvm-svn: 225357	2015-01-07 17:33:03 +00:00
Charlie Turner	06f22f4678	[ARM] Add missing Tag_DIV_use tests. llvm-svn: 225348	2015-01-07 11:37:40 +00:00
Chandler Carruth	e5b0a9cf3d	[PM] Give slightly less horrible names to the utility pass templates for requiring and invalidating specific analyses. Also make their printed names match their class names. Writing these out as prose really doesn't make sense to me any more. llvm-svn: 225346	2015-01-07 11:14:51 +00:00
Karthik Bhat	9ba55334dc	Revert r225165 and r225169 Even though gcc produces similar instructions, as Owen pointed out, the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. llvm-svn: 225341	2015-01-07 06:34:34 +00:00
Chandler Carruth	fdb4180514	[PM] Fix a pretty nasty bug where the new pass manager would invalidate passes too many times. I think this is actually the issue that someone raised with me at the developer's meeting and in an email, but that we never really got to the bottom of. Having all the testing utilities made it much easier to dig down and uncover the core issue. When a pass manager is running many passes over a single function, we need it to invalidate the analyses between each run so that they can be re-computed as needed. We also need to track the intersection of preserved higher-level analyses across all the passes that we run (for example, if there is one module analysis which all the function analyses preserve, we want to track that and propagate it). Unfortunately, this interacted poorly with any enclosing pass adaptor between two IR units. It would see the intersection of preserved analyses, and need to invalidate any other analyses, but some of the un-preserved analyses might have already been invalidated and recomputed! We would fail to propagate the fact that the analysis had already been invalidated. The solution to this struck me as really strange at first, but the more I thought about it, the more natural it seemed. After a nice discussion with Duncan about it on IRC, it seemed even nicer. The idea is that invalidating an analysis causes it to be preserved! Preserving the lack of result is trivial. If it is recomputed, great. Until something else invalidates it again, we're good. The consequence of this is that the invalidate methods on the analysis manager which operate over many passes now consume their PreservedAnalyses object, update it to "preserve" every analysis pass to which it delivers an invalidation (regardless of whether the pass chooses to be removed, or handles the invalidation itself by updating itself).
Then we return this augmented set from the invalidate routine, letting the pass manager take the result and use the intersection of that across each pass run to compute the final preserved set. This accounts for all the places where the early invalidation of an analysis has already "preserved" it for a future run. I've beefed up the testing and adjusted the assertions to show that we no longer repeatedly invalidate or compute the analyses across nested pass managers. llvm-svn: 225333	2015-01-07 01:58:35 +00:00
Matt Arsenault	d0101a2dfd	R600/SI: Add combine for isinfinite pattern llvm-svn: 225310	2015-01-06 23:00:46 +00:00
Matt Arsenault	6f6233dc58	R600/SI: Pattern match isinf to v_cmp_class instructions llvm-svn: 225307	2015-01-06 23:00:41 +00:00
Matt Arsenault	f2290336b7	R600/SI: Add basic DAG combines for fp_class llvm-svn: 225306	2015-01-06 23:00:39 +00:00
Matt Arsenault	4831ce5491	R600/SI: Add class intrinsic llvm-svn: 225305	2015-01-06 23:00:37 +00:00
Matt Arsenault	2458393104	Fix using wrong intrinsic in test. This is a leftover from renaming the intrinsic. It's surprising the unknown llvm. intrinsic wasn't rejected. llvm-svn: 225304	2015-01-06 23:00:33 +00:00
Rafael Espindola	83a362cde8	Change the .ll syntax for comdats and add a syntactic sugar. In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = global i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = global i32 0, comdat($c1) @g2 = global i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat llvm-svn: 225302	2015-01-06 22:55:16 +00:00
Hal Finkel	ed844c4ad1	[PowerPC] Reuse a load operand in int->fp conversions int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. llvm-svn: 225301	2015-01-06 22:31:02 +00:00
Colin LeMahieu	507dd32703	[Hexagon] Adding compound jump encodings. llvm-svn: 225291	2015-01-06 20:03:31 +00:00
Tom Stellard	9d6797ae58	R600/SI: Insert s_waitcnt before s_barrier instructions. This ensures that all memory operations are complete when all threads reach the barrier. llvm-svn: 225290	2015-01-06 19:52:07 +00:00
Adrian Prantl	52f943b536	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." because of a tsan buildbot failure. This reverts commit 225272. Fix should be coming soon. llvm-svn: 225288	2015-01-06 19:47:27 +00:00
Colin LeMahieu	68b2e050f0	[Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, dcfetch. llvm-svn: 225283	2015-01-06 19:03:20 +00:00
Sanjoy Das	7c0ce26614	This patch teaches IndVarSimplify to add nuw and nsw to certain kinds of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 llvm-svn: 225282	2015-01-06 19:02:56 +00:00
Colin LeMahieu	d9c605ddae	[Hexagon] Adding encoding information for absolute address loads. llvm-svn: 225279	2015-01-06 18:38:26 +00:00
Tom Stellard	49f8bfdcb7	R600/SI: Add a stub GCNTargetMachine. This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommended that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. llvm-svn: 225277	2015-01-06 18:00:21 +00:00


28249 Commits