llvm-project

Commit Graph

Author	SHA1	Message	Date
David Majnemer	1a666e0f69	ExecutionEngine: Preliminary support for dynamically loadable coff objects Provide basic support for dynamically loadable coff objects. Only handles a subset of x64 currently. Patch by Andy Ayers! Differential Revision: http://reviews.llvm.org/D7793 llvm-svn: 231574	2015-03-07 20:21:27 +00:00
Andrea Di Biagio	c9d79e8103	[DAGCombiner] Fix wrong folding of AND dag nodes. This patch fixes the logic in the DAGCombiner that folds an AND node according to rule: (and (X (load V)), C) -> (X (load V)) An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant. The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask. Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type. Added X86 test and-load-fold.ll Differential Revision: http://reviews.llvm.org/D8085 llvm-svn: 231563	2015-03-07 12:24:55 +00:00
Simon Pilgrim	bede80a440	[DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554	2015-03-07 05:52:42 +00:00
Eric Christopher	e035e26655	Remove use of misched-bench from this test and replace it with non-temporary enabling options. This is part of removing misched-bench as an option. llvm-svn: 231546	2015-03-07 01:39:06 +00:00
Frederic Riss	23e20e95e9	[dsymutil] Apply relocations to DIE data before cloning. Doing this gets function's low_pc and global variable's locations right in the output debug info. It also could get right other attributes that need to be relocated (in linker terms), but I don't know of any other than the address attributes. This doesn't fixup low_pc attributes in compile_unit, lexical_block or inlined subroutine, nor does it get right high_pc attributes for function. This will come in a subsequent commit. llvm-svn: 231544	2015-03-07 01:25:09 +00:00
Eric Christopher	7e70aba1a8	Recommit r231324 with a fix to the ARM execution domain code to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass. The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem. llvm-svn: 231539	2015-03-07 00:12:22 +00:00
Frederic Riss	9833de65a7	[dsymutil] Support cloning DIE reference attributes. Reference attributes are mainly handled by just creating DIEEntry attributes for them. There is a special case for DW_FORM_ref_addr attributes though, because the DIEEntry code needs a DwarfDebug code to emit them (and we don't have one as we do no CodeGen). In that case, just use DIEInteger attributes with the right form. llvm-svn: 231531	2015-03-06 23:22:53 +00:00
Olivier Sallenave	049d803ce0	Do not restrict interleaved unrolling to small loops, depending on the target. llvm-svn: 231528	2015-03-06 23:12:04 +00:00
Quentin Colombet	66b616351c	[AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. Teach the load store optimizer how to sign extend a result of a load pair when it helps create more pairs. The rationale is that loads are more expensive than sign extensions, so if we gather some in one instruction this is better! <rdar://problem/20072968> llvm-svn: 231527	2015-03-06 22:42:10 +00:00
Sanjay Patel	3fee49b236	fixed to test features, not CPUs llvm-svn: 231524	2015-03-06 21:50:42 +00:00
Sanjay Patel	a800b6c04b	fixed to test features, not CPUs llvm-svn: 231523	2015-03-06 21:50:27 +00:00
Sanjay Patel	4593045f01	loosen checking for buildbots llvm-svn: 231522	2015-03-06 21:30:18 +00:00
Sanjay Patel	3fd51f3c4d	fixed to test only the feature, not the feature and a CPU llvm-svn: 231521	2015-03-06 21:24:56 +00:00
Sanjay Patel	eb60f0728d	fixed to test only the feature, not the feature and a CPU llvm-svn: 231520	2015-03-06 21:19:32 +00:00
Sanjay Patel	9c04ad5ed7	fixed test to use FileCheck llvm-svn: 231519	2015-03-06 21:16:15 +00:00
Sanjay Patel	9881f9531c	fixed to use CHECK-LABELs llvm-svn: 231517	2015-03-06 21:05:02 +00:00
Sanjay Patel	6a53998a48	fixed to test only the feature, not the feature and a CPU llvm-svn: 231516	2015-03-06 20:58:15 +00:00
Sanjay Patel	869cea48cc	fixed to test only the feature, not the feature and a CPU llvm-svn: 231515	2015-03-06 20:57:40 +00:00
Sanjay Patel	dba8012f69	fixed to test feature, not CPU llvm-svn: 231513	2015-03-06 20:51:25 +00:00
Sanjay Patel	7c6eaf03d7	fixed to test features, not CPUs llvm-svn: 231512	2015-03-06 20:46:16 +00:00
Sanjay Patel	829c7347d1	fixed test to use SSE2 attribute llvm-svn: 231510	2015-03-06 20:38:55 +00:00
Sanjay Patel	2b7229c34d	fixed to test only the feature, not the feature and a CPU llvm-svn: 231509	2015-03-06 20:34:20 +00:00
Matthias Braun	898d11e864	DAGCombiner: Canonicalize select(and/or,x,y) depending on target. This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y)) Many targets cannot perform and/or on the CPU flags and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507	2015-03-06 19:49:10 +00:00
Bruno Cardoso Lopes	61b9fd4686	[AsmPrinter][TLOF] Remove AArch64 test to appease buildbots Follow up from r231497. Using XFAIL would still trigger fail on some buildbots. Will re-introduce it as soon as I have a fix. llvm-svn: 231505	2015-03-06 19:42:18 +00:00
Bruno Cardoso Lopes	6e38693507	[AsmPrinter][TLOF] XFAIL AArch64 test to appease buildbots The checking for extgotequiv and localgotequiv rely on the emission order, which is not guaranteed because we use DenseMap to hold the GOT equivalents. XFAIL this now until I get time to use MapVector and test out the solution. In the meantime, appease buildbots. llvm-svn: 231497	2015-03-06 18:38:42 +00:00
Frederic Riss	ef648462d2	[dsymutil] Add debug_str construction support. With this comes the ability to correctly clone string attributes in DIEs. llvm-svn: 231493	2015-03-06 17:56:30 +00:00
Bruno Cardoso Lopes	5b75f4a356	[AsmPrinter][TLOF] Make AArch64 test a bit more flexible llvm-svn: 231481	2015-03-06 15:11:41 +00:00
Bruno Cardoso Lopes	2d54aa496e	[AsmPrinter][TLOF] Split tests and move to appropriate directories Follow up from r231474 and 231475 to appease buildbots llvm-svn: 231480	2015-03-06 14:41:56 +00:00
Bruno Cardoso Lopes	618c67a018	[AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead. -- before _extgotequiv: .long _extfoo _delta: .long _extgotequiv-_delta -- after _delta: .long L_extfoo$non_lazy_ptr-_delta .section __IMPORT,__pointers,non_lazy_symbol_pointers L_extfoo$non_lazy_ptr: .indirect_symbol _extfoo .long 0 llvm-svn: 231475	2015-03-06 13:49:05 +00:00
Bruno Cardoso Lopes	52b1391df6	[AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example: -- before .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta -- after .globl _foo _foo: .long 42 .globl _delta Ltmp3: .long _foo@GOT-Ltmp3 llvm-svn: 231474	2015-03-06 13:48:45 +00:00
Toma Tabacu	4e0cf8e211	[mips] [IAS] Add missing constraints and improve testing for the .module directive. Summary: None of the .set directives can be used before the .module directives. The .set mips0/pop/push were not triggering this constraint. Also added testing for all the other implemented directives which are supposed to trigger this constraint. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7140 llvm-svn: 231465	2015-03-06 12:15:12 +00:00
Karthik Bhat	88db86dd29	Add a new pass "Loop Interchange" This pass interchanges loops to provide a more cache-friendly memory access. E.g. given a loop like - for(int i=0;i<N;i++) for(int j=0;j<N;j++) A[j][i] = A[j][i]+B[j][i]; is interchanged to - for(int j=0;j<N;j++) for(int i=0;i<N;i++) A[j][i] = A[j][i]+B[j][i]; This pass is currently disabled by default. To give a brief introduction it consists of 3 stages- LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix. LoopInterchangeProfitability: A very basic heuristic has been added to check for profitability. This will evolve over time. LoopInterchangeTransform : Which does the actual transform. LNT Performance tests show improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver benchmarks. TODO: 1) Add support for reductions and lcssa phi. 2) Improve profitability model. 3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange. 4) Improve compile time regression found in llvm lnt due to this pass. 5) Fix issues in Dependency Analysis module. A special thanks to Hal for reviewing this code. Review: http://reviews.llvm.org/D7499 llvm-svn: 231458	2015-03-06 10:11:25 +00:00
David Majnemer	b61f4e403d	X86: Form IMGREL relocations for LLVM Functions We supported forming IMGREL relocations from ConstantExprs involving __ImageBase if the minuend was a GlobalVariable. Extend this functionality to all GlobalObjects. llvm-svn: 231456	2015-03-06 08:11:32 +00:00
Michael Zolotukhin	03dd1082ad	LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant. Though such shifts are usually optimized away by combiner, we still can encounter them after a vector shift is legalized. llvm-svn: 231443	2015-03-06 01:13:01 +00:00
Rafael Espindola	a5b9e1cf39	Remember to move a type to the correct set when setting the body. We would set the body of a struct type (therefore making it non-opaque) but were forgetting to move it to the non-opaque set. Fixes pr22807. llvm-svn: 231442	2015-03-06 00:50:21 +00:00
Michael Gottesman	f6bcb81000	[objc-arc] Remove annotations code. It will always be in the history if it is needed again. Now it is just dead code. llvm-svn: 231435	2015-03-06 00:34:29 +00:00
Nadav Rotem	c99a38796c	Teach ComputeNumSignBits about signed remainder. This optimization is a continuation of r231140 that reasoned about signed div. llvm-svn: 231433	2015-03-06 00:23:58 +00:00
Philip Reames	e21ce4540c	[RewriteStatepointsForGC] Yet more test cases for relocation At this point, we should have decent coverage of the involved code. I've got a few more test cases to cleanup and submit, but what's here is already reasonable. I've got a collection of liveness tests which will be posted for review along with a decent liveness algorithm in the next few days. Once those are in, the code in this file should be well tested and I can start renaming things without risk of serious breakage. llvm-svn: 231414	2015-03-05 22:28:06 +00:00
Sanjay Patel	302404b277	[AVX] Lower / fast-isel scalar FP selects into VBLENDV instructions (PR22483) This patch reduces code size for all AVX targets and increases speed for some chips. SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and only in the packed float/double flavors. AVX subsequently made the instruction useful by adding a 4-register operand form. So we just need to paper over the lack of scalar forms of this instruction, complicate the code to choose float or double forms, and use blendv on scalars since all FP is in xmm registers anyway. This gives us an approximately 50% speed up for a blendv microbenchmark sequence on SandyBridge and Haswell: blendv : 29.73 cycles/iter logic : 43.15 cycles/iter No new test cases with this patch because: 1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel 2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX 3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants. http://llvm.org/bugs/show_bug.cgi?id=22483 Differential Revision: http://reviews.llvm.org/D8063 llvm-svn: 231408	2015-03-05 21:46:54 +00:00
Ahmed Bougacha	1b67630cb3	[AArch64] Teach AsmPrinter about GlobalAddress operands. Fixes PR22761, rdar://20024866. Differential Revision: http://reviews.llvm.org/D8042 llvm-svn: 231400	2015-03-05 20:04:21 +00:00
Philip Reames	03ea8642b1	[RewriteStatepointsForGC] Add additional tests around relocation These are focused around the actual relocation rewriting itself, not the rest of the infrastructure. llvm-svn: 231399	2015-03-05 19:52:13 +00:00
Rafael Espindola	092b619e55	Use the correct func begin symbol in all places in ppc. I missed an occurrence of the old symbol in my previous patch. llvm-svn: 231398	2015-03-05 19:47:50 +00:00
Ahmed Bougacha	4200cc95b4	[ARM] Enable vector extload combine for legal types. This commit enables forming vector extloads for ARM. It only does so for legal types, and when we can't fold the extension in a wide/long form of the user instruction. Enabling it for larger types isn't as good an idea on ARM as it is on X86, because: - we pretend that extloads are legal, but end up generating vld+vmov - we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov. For legal types, the combine doesn't fire that often: in the integration tests only in a big endian testcase, where it removes a pointless AND. Related to rdar://19723053 Differential Revision: http://reviews.llvm.org/D7423 llvm-svn: 231396	2015-03-05 19:37:53 +00:00
Rafael Espindola	86bd6a1202	Use the generic Lfunc_begin label on ppc. This removes yet another custom label to mark the start of a function. llvm-svn: 231390	2015-03-05 18:55:50 +00:00
David Majnemer	71b9b6be1b	X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodes We know that the absolute symbol will be less than 2GB and thus will always fit. llvm-svn: 231389	2015-03-05 18:50:12 +00:00
Reid Kleckner	cfb9ce53c1	Replace llvm.frameallocate with llvm.frameescape Turns out it's pretty straightforward and simplifies the implementation. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8051 llvm-svn: 231386	2015-03-05 18:26:34 +00:00
Simon Pilgrim	7189084bef	[DagCombiner] Allow shuffles to merge through bitcasts Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening). This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead. Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow. Differential Revision: http://reviews.llvm.org/D7939 llvm-svn: 231380	2015-03-05 17:14:04 +00:00
Kit Barton	e48b1e1c4f	While reviewing the changes to Clang to add builtin support for the vsld, vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr) not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics. llvm-svn: 231378	2015-03-05 16:24:38 +00:00
Igor Laevsky	8d0851f509	Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test. llvm-svn: 231374	2015-03-05 15:41:14 +00:00
Elena Demikhovsky	de05f10de2	AVX-512, SKX: Enabled masked_load/store operations for this target. Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors, it is needed to pass all masked_memop.ll tests for SKX. llvm-svn: 231371	2015-03-05 15:11:35 +00:00
Igor Laevsky	1725997f14	Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still can not lower gc.relocates for invoke statepoints. Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result. llvm-svn: 231366	2015-03-05 14:11:21 +00:00
Michael Kuperstein	bcb26d6880	[InstCombine] Fix an assertion when fmul has a ConstantExpr operand isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359	2015-03-05 08:38:57 +00:00
Craig Topper	0ee8470a43	[X86] Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros. llvm-svn: 231354	2015-03-05 06:38:42 +00:00
Rafael Espindola	07c03d316d	Use the existing begin and end symbol for debug info. llvm-svn: 231338	2015-03-05 02:05:42 +00:00
Kostya Serebryany	83ce8779d5	[sanitizer] add nosanitize metadata to more coverage instrumentation instructions llvm-svn: 231333	2015-03-05 01:20:05 +00:00
Chandler Carruth	af7e99f2f4	[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332	2015-03-05 01:07:03 +00:00
Paul Robinson	49e38965dc	Turn off .debug_pubnames/pubtypes for PS4. Differential Revision: http://reviews.llvm.org/D8067 llvm-svn: 231322	2015-03-05 00:08:27 +00:00
Matthias Braun	eca5151780	Improve test robustness Improve test robustness in preparation of coming commits: - Avoid undefs which may get propagated too much. - Remove several pointless add 0, instructions llvm-svn: 231307	2015-03-04 22:31:18 +00:00
Sanjoy Das	9e2c5010f6	[SCEV] make SCEV smarter about proving no-wrap. Summary: Teach SCEV to prove no overflow for an add recurrence by proving something about the range of another add recurrence a loop-invariant distance away from it. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7980 llvm-svn: 231305	2015-03-04 22:24:17 +00:00
Frederic Riss	b8b43d5494	[dsymutil] Add minimal code to emit DIE trees. This commit adds code to emit DIE trees that have been pruned from the parts that haven't been marked as kept in the previous pass. It works by 'cloning' the input DIE tree (as read by libDebugInfoDwarf) into a tree of DIE objects. Cloning the DIEs means essentially cloning their attributes. The code in this commit only handles scalar and block attributes (scalar because they are trivial, blocks because they can't be easily replaced by a scalar placeholder), all the other ones are replaced by placeholder zero values and will be handled in further commits. The added tests mostly check that the DIE tree has the correct layout and also verify that a few chosen scalar and block attributes correctly make their way into the output. llvm-svn: 231300	2015-03-04 22:07:44 +00:00
Rafael Espindola	266b8c8043	Expand variables when evaluating absolute expressions. This allows for variables to be used in .size. This matches gnu AS functionality. llvm-svn: 231295	2015-03-04 22:03:21 +00:00
Paul Robinson	78cc0821f0	Support standard DWARF TLS opcode; Darwin and PS4 use it. Differential Revision: http://reviews.llvm.org/D8018 llvm-svn: 231286	2015-03-04 20:55:11 +00:00
Nemanja Ivanovic	e8effe1edb	Add LLVM support for PPC cryptography builtins Review: http://reviews.llvm.org/D7955 llvm-svn: 231285	2015-03-04 20:44:33 +00:00
Rafael Espindola	f3f185486c	Bring r231132 back with a fix. The issue was that we were always printing the remarks. Fix that and add a test showing that it prints nothing if -pass-remarks is not given. Original message: Correctly handle -pass-remarks in the gold plugin. llvm-svn: 231273	2015-03-04 18:51:45 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Adrian Prantl	afdac4b7f0	Update the out-of-date dwarf expressions in these testcases. llvm-svn: 231261	2015-03-04 17:39:59 +00:00
Marek Olsak	d2af89df10	R600/SI: Add an intrinsic for S_FLBIT_I32 / V_FFBH_I32 Required by OpenGL (ARB_gpu_shader5). llvm-svn: 231259	2015-03-04 17:33:45 +00:00
NAKAMURA Takumi	84a9697c17	Revert r231132, "Correctly handle -pass-remarks in the gold plugin.", for now, to suppress log flooding in LTO. llvm-svn: 231253	2015-03-04 16:24:28 +00:00
Jozef Kolek	c925808ee5	[mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator Differential Revision: http://reviews.llvm.org/D7609 llvm-svn: 231249	2015-03-04 15:47:42 +00:00
Andrea Di Biagio	df93ccf49a	[X86][FastISel] Simplify the logic in method X86SelectSIToFP. The target-independent selection algorithm in FastISel already knows how to select a SINT_TO_FP if the target is SSE but not AVX. On targets that have SSE but not AVX, the tablegen'd 'fastEmit' functions for ISD::SINT_TO_FP know how to select instruction X86::CVTSI2SSrr (for an i32 to f32 conversion) and X86::CVTSI2SDrr (for an i32 to f64 conversion). This patch simplifies the logic in method X86SelectSIToFP knowing that the code would not be reachable if the subtarget doesn't have AVX. No functional change intended. llvm-svn: 231243	2015-03-04 14:23:25 +00:00
Dmitry Vyukov	b37b95ed3e	asan: do not instrument direct inbounds accesses to stack variables Do not instrument direct accesses to stack variables that can be proven to be inbounds, e.g. accesses to fields of structs on stack. This eliminates 33% of instrumentation on webrtc/modules_unittests (number of memory accesses goes down from 290152 to 193998), reduces binary size by 15% (from 74M to 64M) and improves compilation time by 6-12%. The optimization is guarded by asan-opt-stack flag that is off by default. http://reviews.llvm.org/D7583 llvm-svn: 231241	2015-03-04 13:27:53 +00:00
Chandler Carruth	9a53fbe243	[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind a flag for now. First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed. The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like: if (a1()) { if (b1()) c1(); else d1(); f1(); } if (a2()) { if (b2()) c2(); else d2(); f2(); } done(); Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like: a1, a2, done; And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to: a1, a2, done; b1, c1; At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1. ... wait for it ... Except that the next best block isn't d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG interleaved!!! a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2; So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be dead simple. 
It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist is the nearest next block. Unfortunately, a change like this is going to cause soooo many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238	2015-03-04 12:18:08 +00:00
Daniel Jasper	471e856f49	Add a flag to experiment with outlining optional branches. In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230	2015-03-04 11:05:34 +00:00
Kristof Beyls	aea8461820	Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet. As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers ld.bfd and ld.gold currently only support a subset of the whole range of AArch64 ELF TLS relocations. Furthermore, they assume that some of the code sequences to access thread-local variables are produced in a very specific sequence. When the sequence is not as the linker expects, it can silently mis-relax/mis-optimize the instructions. Even if that wouldn't be the case, it's good to produce the exact sequence, as that ensures that linkers can perform optimizing relaxations. This patch: * implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang would grow an -mtls-size option to allow support for both, but that's not part of this patch. * by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation) is added to enable generation of local dynamic code sequence, but is off by default. * makes sure that the exact expected code sequence for local dynamic and general dynamic accesses is produced, by making use of a new pseudo instruction. The patch also removes two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ). llvm-svn: 231227	2015-03-04 09:12:08 +00:00
Michael Kuperstein	fb95697c88	[DAGCombine] Fix a bug in a BUILD_VECTOR combine When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219	2015-03-04 07:27:39 +00:00
Davide Italiano	fcae934c03	[MC][Target] Implement support for R_X86_64_SIZE{32,64}. Differential Revision: D7990 Reviewed by: rafael, majnemer llvm-svn: 231216	2015-03-04 06:49:39 +00:00
Zachary Turner	653236596a	[llvm-pdbdump] Display full enum definitions. This will now display enum definitions both at the global scope as well as nested inside of classes. Additionally, it will no longer display enums at the global scope if the enum is nested. Instead, it will omit the definition of the enum globally and instead emit it in the corresponding class definition. llvm-svn: 231215	2015-03-04 06:09:53 +00:00
Filipe Cabecinhas	0524acc727	Fix the test for r231201. We don't crash anymore. llvm-svn: 231207	2015-03-04 02:09:40 +00:00
Rafael Espindola	310e4b592f	Use the vanilla func_end symbol for .size. No need to create yet another temp symbol. llvm-svn: 231198	2015-03-04 01:35:23 +00:00
Eric Christopher	afc703da52	Weaken the check for a specific movl on the twoaddr-coalesce-3 test - we only care that there are two moves in the loop and not which part is relative to which register anyhow. llvm-svn: 231191	2015-03-04 01:19:17 +00:00
Filipe Cabecinhas	6b79728815	Fix the x86-upgrade-avx2-vbroadcast.ll test by commenting the CHECK lines llvm-svn: 231187	2015-03-04 00:49:12 +00:00
Rafael Espindola	0ac5075f31	Drop the "eh_" from eh_func_begin and eh_func_end. They will be used for more than eh tables. llvm-svn: 231185	2015-03-04 00:27:43 +00:00
Philip Reames	6da37857d1	[RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form. However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR. This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there. Patch by: Chen Li Differential Revision: http://reviews.llvm.org/D7923 llvm-svn: 231183	2015-03-04 00:13:52 +00:00
Juergen Ributzka	1f7a17661c	Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic. The intrinsic is no longer generated by the front-end. Remove the intrinsic and auto-upgrade it to a vector shuffle. Reviewed by Nadav This is related to rdar://problem/18742778. llvm-svn: 231182	2015-03-04 00:13:25 +00:00
Eric Christopher	9900a5d037	Update twoaddr-coalesce-3.ll to run on darwin and linux machines: a) Default relocation model differences, b) Different numbers of # in comments llvm-svn: 231178	2015-03-03 23:56:20 +00:00
Kostya Serebryany	be5e0ed919	[sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing). Introduce -mllvm -sanitizer-coverage-8bit-counters=1 which adds imprecise thread-unfriendly 8-bit coverage counters. The run-time library maps these 8-bit counters to 8-bit bitsets in the same way AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does: counter values are divided into 8 ranges and based on the counter value one of the bits in the bitset is set. The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. These counters provide a search heuristic for single-threaded coverage-guided fuzzers, we do not expect them to be useful for other purposes. Depending on the value of -fsanitize-coverage=[123] flag, these counters will be added to the function entry blocks (=1), every basic block (=2), or every edge (=3). Use these counters as an optional search heuristic in the Fuzzer library. Add a test where this heuristic is critical. llvm-svn: 231166	2015-03-03 23:27:02 +00:00
Reid Kleckner	423665311d	WinEH: Remove vestigial EH object Ultimately, we'll need to leave something behind to indicate which alloca will hold the exception, but we can figure that out when it comes time to emit the __CxxFrameHandler3 catch handler table. llvm-svn: 231164	2015-03-03 23:20:30 +00:00
David Majnemer	1bacc0abc9	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select have identical select condition types. This fixes PR22773. llvm-svn: 231156	2015-03-03 22:40:36 +00:00
Andrew Kaylor	5b70b76069	Moving WinEH outlining tests to an architecture neutral location llvm-svn: 231155	2015-03-03 22:33:39 +00:00
Eric Christopher	2891913f1a	Fix a problem where the TwoAddressInstructionPass generates redundant register moves in a loop. From: int M, total; void foo() { int i; for (i = 0; i < M; i++) { total = total + i / 2; } } This is the kernel loop: .LBB0_2: # %for.body =>This Inner Loop Header: Depth=1 movl %edx, %esi movl %ecx, %edx shrl $31, %edx addl %ecx, %edx sarl %edx addl %esi, %edx incl %ecx cmpl %eax, %ecx jl .LBB0_2 -------------------------- The first mov insn "movl %edx, %esi" could be removed if we change "addl %esi, %edx" to "addl %edx, %esi". The IR before TwoAddressInstructionPass is: BB#2: derived from LLVM BB %for.body Predecessors according to CFG: BB#1 BB#2 %vreg3<def> = COPY %vreg12<kill>; GR32:%vreg3,%vreg12 %vreg2<def> = COPY %vreg11<kill>; GR32:%vreg2,%vreg11 %vreg7<def,tied1> = SHR32ri %vreg3<tied0>, 31, %EFLAGS<imp-def,dead>; GR32:%vreg7,%vreg3 %vreg8<def,tied1> = ADD32rr %vreg3<tied0>, %vreg7<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg3,%vreg7 %vreg9<def,tied1> = SAR32r1 %vreg8<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg9,%vreg8 %vreg4<def,tied1> = ADD32rr %vreg9<kill,tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg4,%vreg9,%vreg2 %vreg5<def,tied1> = INC64_32r %vreg3<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg5,%vreg3 CMP32rr %vreg5, %vreg0, %EFLAGS<imp-def>; GR32:%vreg5,%vreg0 %vreg11<def> = COPY %vreg4; GR32:%vreg11,%vreg4 %vreg12<def> = COPY %vreg5<kill>; GR32:%vreg12,%vreg5 JL_4 <BB#2>, %EFLAGS<imp-use,kill> Now TwoAddressInstructionPass will choose vreg9 to be tied with vreg4. However, it doesn't see that there is a copy from vreg4 to vreg11 and another copy from vreg11 to vreg2 inside the loop body. To remove those copies, it is necessary to choose vreg2 to be tied with vreg4 instead of vreg9. This code pattern commonly appears when there is a reduction operation in a loop. So check for a reversed copy chain and if we encounter one then we can commute the add instruction so we can avoid a copy. Patch by Wei Mi. http://reviews.llvm.org/D7806 llvm-svn: 231148	2015-03-03 22:03:03 +00:00
Nadav Rotem	029c5c7fdb	Teach ComputeNumSignBits about signed divisions. http://reviews.llvm.org/D8028 rdar://20023136 llvm-svn: 231140	2015-03-03 21:39:02 +00:00
Rafael Espindola	84483d247f	Correctly handle -pass-remarks in the gold plugin. llvm-svn: 231132	2015-03-03 21:11:13 +00:00
Paul Robinson	06a8eb8343	[X86][ELF] Correct relocation for DWARF TLS references Previously we had only Linux using DTPOFF for these; all X86 ELF targets should. Fixes a side issue mentioned in PR21077. Differential Revision: http://reviews.llvm.org/D8011 llvm-svn: 231130	2015-03-03 21:01:27 +00:00
Adrian Prantl	b283815a30	Fix PR22762. When emitting a DWARF expression check whether this is the frame register before checking if there is a DWARF register number for it. Thanks to H.J. Lu for diagnosing this and providing the testcase! llvm-svn: 231121	2015-03-03 20:12:52 +00:00
Andrew Kaylor	f0f5e46e07	Outline cleanup handlers for native Windows C++ exception handling Differential Revision: http://reviews.llvm.org/D7865 llvm-svn: 231117	2015-03-03 20:00:16 +00:00
Kit Barton	0cfa7b7ad0	Add the following 64-bit vector integer arithmetic instructions added in POWER8: vaddudm vsubudm vmulesw vmulosw vmuleuw vmulouw vmuluwm vmaxsd vmaxud vminsd vminud vcmpequd vcmpequd. vcmpgtsd vcmpgtsd. vcmpgtud vcmpgtud. vrld vsld vsrd vsrad Phabricator review: http://reviews.llvm.org/D7959 llvm-svn: 231115	2015-03-03 19:55:45 +00:00
Reid Kleckner	2f05d4c91f	Make llvm.eh.begincatch use an outparam Ultimately, __CxxFrameHandler3 needs us to put a stack offset in a table, and it will take responsibility for copying the exception object into that slot. Modelling the exception object as an SSA value returned by begincatch isn't going to work in general, so make it use an output parameter. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D7920 llvm-svn: 231086	2015-03-03 17:41:09 +00:00
Chad Rosier	8e38f30e49	[AArch64] When combining constant mul of -3, prefer (sub x, (shl x, N)). This change only affects codegen when the constant is -3. llvm-svn: 231085	2015-03-03 17:31:01 +00:00
Duncan P. N. Exon Smith	e274180f0e	DebugInfo: Move new hierarchy into place Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082	2015-03-03 17:24:31 +00:00
NAKAMURA Takumi	10d576d8dc	Make llvm/test/Object/archive-format.test CRLF-tolerant. llvm-svn: 231074	2015-03-03 15:54:48 +00:00
Daniel Jasper	8f239f83b0	During PHI elimination, split critical edges that move copies out of loops. This prevents the behavior observed in llvm.org/PR22369. I am not sure whether I am reading the code correctly, but the early exit based on isLiveOutPastPHIs() seems to make the wrong assumption that RegisterCoalescer won't be able to coalesce those copies later. This change hides the new behavior behind -no-phi-elim-live-out-early-exit as it currently breaks four tests: * Assertion in: CodeGen/Hexagon/hwloop-cleanup.ll * Worse code in: CodeGen/X86/coalescer-commute4.ll CodeGen/X86/phys_subreg_coalesce-2.ll CodeGen/X86/zlib-longest-match.ll The root cause here seems to be that the heuristic that determines the visitation order in RegisterCoalescer gets less lucky. llvm-svn: 231064	2015-03-03 10:23:11 +00:00
Owen Anderson	7325b91783	Cleanup after r230934 per Dave's suggestions. llvm-svn: 231056	2015-03-03 05:39:27 +00:00
Ahmed Bougacha	afbd6887c4	[X86] Special-case 2x CMOV when custom-inserting. This lets us avoid a few copies that are otherwise hard to get rid of. The way this is done is, the custom-inserter looks at the following instruction for another CMOV, and replaces both at the same time. A previous version used a new CMOV2 opcode, but the custom inserter is expected to be able to return a different basic block anyway, which means it's OK - though far from ideal - to alter that block's contents. Explicitly document that, in case it ever makes a difference. Alternatives welcome! Follow-up to r231045. rdar://19767934 Closes http://reviews.llvm.org/D8019 llvm-svn: 231046	2015-03-03 01:21:16 +00:00
Ahmed Bougacha	066d0b8e64	[X86] Combine (cmov (and/or (setcc) (setcc))) into (cmov (cmov)). Fold and/or of setcc's to double CMOV: (CMOV F, T, ((cc1 | cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2) (CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2) When we can't use the CMOV instruction, it might increase branch mispredicts. When we can, or when there is no mispredict, this improves throughput and reduces register pressure. These can't be caught by generic combines, because the pattern can appear when legalizing some instructions (such as fcmp une). rdar://19767934 http://reviews.llvm.org/D7634 llvm-svn: 231045	2015-03-03 01:09:14 +00:00
Reid Kleckner	1d2c3f91cd	Fix cppeh breakage due to racing commits llvm-svn: 231044	2015-03-03 01:04:39 +00:00
Peter Collingbourne	da2dbf21a9	LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets. By loading from indexed offsets into a byte array and applying a mask, a program can test bits from the bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out: A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit) These bits can be laid out in a 16-byte array like this: Byte Offset 0123456789ABCDEF Bit 7 HHHHHHHHHIIIIIII 6 GGGGGGGGGGJJJJJJ 5 FFFFFFFFFFFKKKKK 4 EEEEEEEEEEEELLLL 3 DDDDDDDDDDDDDMMM 2 CCCCCCCCCCCCCCNN 1 BBBBBBBBBBBBBBBO 0 AAAAAAAAAAAAAAAA For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM. This uses the LPT multiprocessor scheduling algorithm to lay out the bits efficiently. Saves ~450KB of instructions in a recent build of Chromium. Differential Revision: http://reviews.llvm.org/D7954 llvm-svn: 231043	2015-03-03 00:49:28 +00:00
Andrew Kaylor	72029c6f2f	Remap arguments and non-alloca values used by outlined C++ exception handlers. Differential Revision: http://reviews.llvm.org/D7844 llvm-svn: 231042	2015-03-03 00:41:03 +00:00
Benjamin Kramer	838752d3f6	LoopIdiom: Give globals for memset_pattern16 private linkage. There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. llvm-svn: 231041	2015-03-03 00:17:09 +00:00
Reid Kleckner	6f0e4b897e	WinEH: Run opt -instnamer over some cppeh tests and update CHECKs In the future, we should run the output of clang through instnamer to make it easier to manually edit test cases. No functionality change. llvm-svn: 231037	2015-03-03 00:05:35 +00:00
Adrian Prantl	92da14b244	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 without the assertion in DebugLocEntry::finalize() because not all Machine registers can be lowered into DWARF register numbers and floating point constants cannot be expressed. llvm-svn: 231023	2015-03-02 22:02:33 +00:00
Sanjoy Das	2d38031271	Revert some changes that were made to fix PR20680. This re-lands change r230921. r230921 was reverted because it broke a clang test; a checkin fixing the clang test will be committed shortly. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 231018	2015-03-02 21:41:07 +00:00
Reid Kleckner	02ec6a3ec3	lit: Add 'cd' support to the internal shell and port some tests The internal shell was already threading around a 'cwd' parameter. We just have to make it mutable so that we can update it as the test script executes. If the shell ever grows support for environment variable substitution, we could also implement support for export. llvm-svn: 231017	2015-03-02 21:33:18 +00:00
Adrian Prantl	2185aa179d	Revert "Refactor DebugLocDWARFExpression so it doesn't require access to the" This reverts commit 230975 to investigate buildbot breakage. llvm-svn: 231004	2015-03-02 20:01:54 +00:00
David Blaikie	41fe3a495d	Change SystemZ large tests to use the existing long_tests property (this is already used in Clang for a couple of tests) Reviewers: uweigand Differential Revision: http://reviews.llvm.org/D7965 llvm-svn: 230998	2015-03-02 19:34:11 +00:00
Rafael Espindola	503f883b95	Add r230655 back with a fix. The issue is that now we have a diag handler during optimizations and get forwarded every optimization remark, flooding stdout. The same filtering should probably be done with or without a custom handler, but for now just ignore remarks. Original message: gold-plugin: "Upgrade" debug info and handle its warnings. The gold plugin never calls MaterializeModule, so any old debug info was not deleted and could cause crashes. Now that it is being "upgraded", the plugin also has to handle warnings and create Modules with a nice id (it shows in the warning). llvm-svn: 230991	2015-03-02 19:08:03 +00:00
Paul Robinson	9f4cfc574e	Revert r230979, should apply to all X86 ELF. llvm-svn: 230985	2015-03-02 18:50:18 +00:00
Paul Robinson	10ae2e52de	[PS4] Correct relocation for DWARF TLS references. llvm-svn: 230979	2015-03-02 17:44:52 +00:00
Adrian Prantl	d50bca7314	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 with a relaxed assertion in DebugLocEntry::finalize() that allows for empty DWARF expressions for constant FP values. llvm-svn: 230975	2015-03-02 17:21:06 +00:00
Elena Demikhovsky	18fd49602b	AVX-512: Add assembly parser support for Rounding mode By Asaf Badouh <asaf.badouh@intel.com> llvm-svn: 230962	2015-03-02 15:00:34 +00:00
Vasileios Kalintiris	e741eb2c7d	[mips] Optimize conditional moves where RHS is zero. Summary: When the RHS of a conditional move node is zero, we can utilize the $zero register by inverting the conditional move instruction and by swapping the order of its True/False operands. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D7945 llvm-svn: 230956	2015-03-02 12:47:32 +00:00
Owen Anderson	63fbf10c32	Teach the verifier to enforce that the alignment argument of memory intrinsics must be a power of 2. llvm-svn: 230941	2015-03-02 09:35:06 +00:00
Owen Anderson	5af4b21c2e	Teach DataLayout that alignments on basic types must be powers of two. Fixes assertion failures/crashes on bad datalayout specifications. llvm-svn: 230940	2015-03-02 09:35:03 +00:00
Owen Anderson	ab1c7a77d2	Teach DataLayout that ABI alignments for non-aggregate types must be non-zero. This manifested as assertions and/or crashes in later phases of optimization, depending on the build configuration. llvm-svn: 230939	2015-03-02 09:34:59 +00:00
Owen Anderson	040f2f890e	Teach DataLayout that pointer ABI and preferred alignments are required to be powers of two. Previously this resulted in asserts and/or crashes (depending on build configuration) at various phases in the optimizer. llvm-svn: 230938	2015-03-02 06:33:51 +00:00
Owen Anderson	5bc2bbe601	Teach DataLayout that zero-byte pointer sizes don't make sense. Previously this would result in assertion failures or simply crashes at various points in the optimizer when trying to create types of zero bit width. llvm-svn: 230936	2015-03-02 06:00:02 +00:00
Owen Anderson	576a9a2728	Teach the LLParser to fail gracefully when it encounters an invalid label name. Previously it would either assert in +Asserts, or crash in -Asserts. Found by fuzzing LLParser. llvm-svn: 230935	2015-03-02 05:25:09 +00:00
Owen Anderson	91bdf07650	Fix a crash in the LL parser where it failed to validate that the pointer operand of a GEP was valid. This manifested as an assertion failure in +Asserts builds, and a hard crash in -Asserts builds. Found by fuzzing the LL parser. llvm-svn: 230934	2015-03-02 05:25:06 +00:00
Zachary Turner	7797c726b9	[llvm-pdbdump] Many minor fixes and improvements A short list of some of the improvements: 1) Now supports -all command line argument, which implies many other command line arguments to simplify usage. 2) Now supports -no-compiler-generated command line argument to exclude compiler generated types. 3) Prints base class list. 4) -class-definitions implies -types. 5) Proper display of bitfields. 6) Can now distinguish between struct/class/interface/union. And a few other minor tweaks. llvm-svn: 230933	2015-03-02 04:39:56 +00:00
Nico Weber	968ceddca9	Revert r230930, it caused PR22747. llvm-svn: 230932	2015-03-02 04:37:11 +00:00
Adrian Prantl	e2c9e64532	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. llvm-svn: 230930	2015-03-02 02:38:18 +00:00
NAKAMURA Takumi	0cd23c842e	Revert r230921, "Revert some changes that were made to fix PR20680.", for now. It caused a failure on clang/test/Misc/backend-optimization-failure.cpp . llvm-svn: 230929	2015-03-02 01:14:03 +00:00
Craig Topper	09b27e7b24	[X86] Fix disassembler crash on AVX512 cmpps/cmppd with immediate that doesn't fit in 5-bits. Fixes PR22743. llvm-svn: 230924	2015-03-02 00:22:29 +00:00
Sanjoy Das	876bd51486	Revert some changes that were made to fix PR20680. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 230921	2015-03-01 23:36:26 +00:00
Elena Demikhovsky	02ffd26023	AVX-512: Added mask and rounding mode for scalar arithmetics Added more tests for scalar instructions to distinguish between AVX and AVX-512 forms. llvm-svn: 230891	2015-03-01 07:44:04 +00:00
Zachary Turner	f5abda2a2f	[llvm-pdbdump] Add regex-based filtering. llvm-svn: 230888	2015-03-01 06:49:49 +00:00
NAKAMURA Takumi	0f480f5010	Revert r230655, "gold-plugin: "Upgrade" debug info and handle its warnings." It emits millions of warnings during selfhosting LTO build, to choke the buildbot with gigabytes of log. llvm-svn: 230885	2015-03-01 04:16:28 +00:00
Sanjay Patel	b8c907e2a7	avoid infinite looping when folding vector multiplies of constants (PR22698) We were missing a check for the following fold in DAGCombiner: // fold (fmul (fmul x, c1), c2) -> (fmul x, (fmul c1, c2)) If 'x' is also a constant, then we shouldn't do anything. Otherwise, we could end up swapping the operands back and forth forever. This should fix: http://llvm.org/bugs/show_bug.cgi?id=22698 Differential Revision: http://reviews.llvm.org/D7917 llvm-svn: 230884	2015-03-01 00:09:35 +00:00
Sanjay Patel	d076b2a879	fixed to test only the feature, not the feature and a CPU llvm-svn: 230883	2015-03-01 00:02:03 +00:00
Duncan P. N. Exon Smith	d0c2a99f0e	DebugInfo: Convert DW_OP_piece => DW_OP_bit_piece r228631 stopped using `DW_OP_piece` inside `DIExpression`s in the IR, but it apparently missed updating these testcases. Caught by verifier checks for `MDExpression` while working on moving the new hierarchy into place. llvm-svn: 230882	2015-02-28 23:57:16 +00:00
Sanjay Patel	7aa7412a0b	make the tested feature (SSE2) explicit llvm-svn: 230881	2015-02-28 23:55:24 +00:00
Duncan P. N. Exon Smith	02f4bbc588	DebugInfo: Fix invalid file reference in CodeGen/X86/unknown-location.ll There are two types of files in the old (current) debug info schema. !0 = !{!"some/filename", !"/path/to/dir"} !1 = !{!"0x29", !0} ; [ DW_TAG_file_type ] !1 has a wrapper class called `DIFile` which inherits from `DIScope` and is referenced in 'scope' fields. !0 is called a "file node", and debug info nodes with a 'file' field point at one of these directly -- although they're built in `DIBuilder` by sending in a `DIFile` and reaching into it. In the new hierarchy, I unified these nodes as `MDFile` (which `DIFile` is a lightweight wrapper for) in r230057. Moving the new hierarchy into place (and upgrading testcases) caused CodeGen/X86/unknown-location.ll to start failing -- apparently "0x29" was previously showing up in the linetable as a filename, causing: .loc 2 4 3 (where 2 points at filename "0x29") instead of: .loc 1 4 3 (where 1 points at the actual filename). Change the testcase to use the old schema correctly. llvm-svn: 230880	2015-02-28 23:52:24 +00:00
Sanjay Patel	db962e2afb	fixed to test only the feature, not the feature and a CPU llvm-svn: 230878	2015-02-28 23:47:09 +00:00
Duncan P. N. Exon Smith	16d182acb9	Optimize metadata node fields for CHECK-ability While gaining practical experience hand-updating CHECK lines (for moving the new debug info hierarchy into place), I learnt a few things about CHECK-ability of the specialized node assembly output. - The first part of a `CHECK:` is to identify the "right" node (this is especially true if you intend to use the new `CHECK-SAME` feature, since the first CHECK needs to identify the node correctly before you can split the line). - If there's a `tag:`, it should go first. - If there's a `name:`, it should go next (followed by the `linkageName:`, if any). - If there's a `scope:`, it should follow after that. - When a node type supports multiple DW_TAGs, but one is implied by its name and is overwhelmingly more common, the `tag:` field is terribly uninteresting unless it's different. - `MDBasicType` is almost always `DW_TAG_base_type`. - `MDTemplateValueParameter` is almost always `DW_TAG_template_value_parameter`. - Printing `name: ""` doesn't improve CHECK-ability, and there are far more nodes than I realized that are commonly nameless. - There are a few other fields that similarly aren't very interesting when they're empty. This commit updates the `AsmWriter` as suggested above (and makes necessary changes in `LLParser` for round-tripping). llvm-svn: 230877	2015-02-28 23:21:38 +00:00
Duncan P. N. Exon Smith	c296fcc39e	AsmWriter: Escape string fields in metadata Properly escape string fields in metadata. I've added a spot-check with direct coverage for `MDFile::getFilename()`, but we'll get more coverage once the hierarchy is moved into place (since this comes up in various checked-in testcases). I've replicated the `if` logic using the `ShouldSkipEmpty` flag (although a follow-up commit is going to change how often this flag is specified); no NFCI other than escaping the string fields. llvm-svn: 230875	2015-02-28 22:20:16 +00:00
Duncan P. N. Exon Smith	a951165e5a	Fix line endings on Transforms/Inline/inline_dbg_declare.ll llvm-svn: 230870	2015-02-28 21:38:32 +00:00
Craig Topper	782d620657	[X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can be represented by shuffle_vector instructions. llvm-svn: 230860	2015-02-28 19:33:17 +00:00
Benjamin Kramer	cb570f1bc9	TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator. Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting them early also slightly simplifies code. Thanks to Sanjay for the IR test case. llvm-svn: 230856	2015-02-28 16:47:27 +00:00
Eric Christopher	b759340fc8	Remove option.ll as part of the Forward Control Flow Integrity removal. llvm-svn: 230844	2015-02-28 10:04:18 +00:00
Philip Reames	2e5bcbe8d5	[RewriteStatepointsForGC] Fix another order of iteration bug It turns out the naming of inserted phis and selects is sensitive to the order in which two sets are iterated. We need to nail this down to avoid non-deterministic output and possible test failures. The modified test is the one I first noticed something odd in. The change is making it more strict to report the error. With the test change, but without the code change, the test fails roughly 1 in 5. With the code change, I've run ~30 runs without error. Long term, the right fix here is to adjust the naming scheme. I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend. Just because I only noticed one case doesn't mean it's actually the only case. I hope to get to the right change Monday. std->llvm data structure changes bugfix change #3 llvm-svn: 230835	2015-02-28 01:52:09 +00:00
Frederic Riss	c99ea20eda	[dsymutil] Add the DwarfStreamer class. This class is responsible for getting the linked data to the disk in the appropriate form. Today it is an empty shell that just instantiates an MC layer. As we do not put anything in the resulting file yet, we just check it has the right architecture (and check that -o does the right thing). To be able to create all the components, this commit adds a few dependencies to llvm-dsymutil, namely all-targets, MC and AsmPrinter. Also add a -no-output option, so that tests that do not need the binary result can continue to run even if they do not have the required target linked in. llvm-svn: 230824	2015-02-28 00:29:11 +00:00
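
Note on commit be5e0ed919 (AFL-style coverage counters): the message says the run-time library maps each 8-bit counter to one bit of an 8-bit bitset using the AFL ranges 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. The following is a minimal sketch of that bucketing, not the sanitizer run-time's actual code. The names `BUCKET_LOWER_BOUNDS`, `counter_to_bit` and `update_bitset` are hypothetical, and the choice of bit 0 for the lowest range is an assumption; the commit message does not state the bit order.

```python
# Sketch only: hypothetical helpers, not the sanitizer run-time API.
# Lower bounds of the eight AFL ranges quoted in the commit message:
# 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.
BUCKET_LOWER_BOUNDS = (1, 2, 3, 4, 8, 16, 32, 128)


def counter_to_bit(counter: int) -> int:
    """Map an 8-bit counter to a one-hot bucket bit, or 0 if the counter is 0.

    Assumption: the lowest range maps to bit 0 and the highest to bit 7.
    """
    if not 0 <= counter <= 255:
        raise ValueError("counter must fit in 8 bits")
    bit = 0
    for index, lower_bound in enumerate(BUCKET_LOWER_BOUNDS):
        # The last range whose lower bound does not exceed the counter wins.
        if counter >= lower_bound:
            bit = 1 << index
    return bit


def update_bitset(bitset: int, counter: int) -> int:
    """Record the bucket observed for this counter in an 8-bit bitset."""
    return bitset | counter_to_bit(counter)
```

Under these assumptions, a fuzzer would treat an input as interesting when `update_bitset` sets a bit that was previously clear, so hitting a block 5 times is distinguished from hitting it once, while 5 and 6 hits are not.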

29092 Commits