llvm-project

Author	SHA1	Message	Date
Igor Laevsky	1725997f14	Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still can not lower gc.relocates for invoke statepoints. Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result. llvm-svn: 231366	2015-03-05 14:11:21 +00:00
Michael Kuperstein	bcb26d6880	[InstCombine] Fix an assertion when fmul has a ConstantExpr operand isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359	2015-03-05 08:38:57 +00:00
Craig Topper	0ee8470a43	[X86] Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros. llvm-svn: 231354	2015-03-05 06:38:42 +00:00
Rafael Espindola	07c03d316d	Use the existing begin and end symbol for debug info. llvm-svn: 231338	2015-03-05 02:05:42 +00:00
Kostya Serebryany	83ce8779d5	[sanitizer] add nosanitize metadata to more coverage instrumentation instructions llvm-svn: 231333	2015-03-05 01:20:05 +00:00
Chandler Carruth	af7e99f2f4	[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332	2015-03-05 01:07:03 +00:00
Paul Robinson	49e38965dc	Turn off .debug_pubnames/pubtypes for PS4. Differential Revision: http://reviews.llvm.org/D8067 llvm-svn: 231322	2015-03-05 00:08:27 +00:00
Matthias Braun	eca5151780	Improve test robustness Improve test robustness in preparation of coming commits: - Avoid undefs which may get propagated too much. - Remove several pointless add 0, instructions llvm-svn: 231307	2015-03-04 22:31:18 +00:00
Sanjoy Das	9e2c5010f6	[SCEV] make SCEV smarter about proving no-wrap. Summary: Teach SCEV to prove no overflow for an add recurrence by proving something about the range of another add recurrence a loop-invariant distance away from it. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7980 llvm-svn: 231305	2015-03-04 22:24:17 +00:00
Frederic Riss	b8b43d5494	[dsymutil] Add minimal code to emit DIE trees. This commit adds code to emit DIE trees that have been pruned from the parts that haven't been marked as kept in the previous pass. It works by 'cloning' the input DIE tree (as read by libDebugInfoDwarf) into a tree of DIE objects. Cloning the DIEs means essentially cloning their attributes. The code in this commit does only handle scalar and block attributes (scalar because they are trivial, blocks because they can't be easily replaced by a scalar placeholder), all the other ones are replaced by placeholder zero values and will be handled in further commits. The added tests mostly check that the DIE tree has the correct layout and also verify that a few chosen scalar and block attributes correctly make their way into the output. llvm-svn: 231300	2015-03-04 22:07:44 +00:00
Rafael Espindola	266b8c8043	Expand variables when evaluating absolute expressions. This allows for variables to be used in .size. This matches gnu AS functionality. llvm-svn: 231295	2015-03-04 22:03:21 +00:00
Paul Robinson	78cc0821f0	Support standard DWARF TLS opcode; Darwin and PS4 use it. Differential Revision: http://reviews.llvm.org/D8018 llvm-svn: 231286	2015-03-04 20:55:11 +00:00
Nemanja Ivanovic	e8effe1edb	Add LLVM support for PPC cryptography builtins Review: http://reviews.llvm.org/D7955 llvm-svn: 231285	2015-03-04 20:44:33 +00:00
Rafael Espindola	f3f185486c	Bring r231132 back with a fix. The issue was that we were always printing the remarks. Fix that and add a test showing that it prints nothing if -pass-remarks is not given. Original message: Correctly handle -pass-remarks in the gold plugin. llvm-svn: 231273	2015-03-04 18:51:45 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Adrian Prantl	afdac4b7f0	Update the out-of-date dwarf expressions in these testcases. llvm-svn: 231261	2015-03-04 17:39:59 +00:00
Marek Olsak	d2af89df10	R600/SI: Add an intrinsic for S_FLBIT_I32 / V_FFBH_I32 Required by OpenGL (ARB_gpu_shader5). llvm-svn: 231259	2015-03-04 17:33:45 +00:00
NAKAMURA Takumi	84a9697c17	Revert r231132, "Correctly handle -pass-remarks in the gold plugin.", for now, to suppress log flooding in LTO. llvm-svn: 231253	2015-03-04 16:24:28 +00:00
Jozef Kolek	c925808ee5	[mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator Differential Revision: http://reviews.llvm.org/D7609 llvm-svn: 231249	2015-03-04 15:47:42 +00:00
Andrea Di Biagio	df93ccf49a	[X86][FastISel] Simplify the logic in method X86SelectSIToFP. The target-independent selection algorithm in FastISel already knows how to select a SINT_TO_FP if the target is SSE but not AVX. On targets that have SSE but not AVX, the tablegen'd 'fastEmit' functions for ISD::SINT_TO_FP know how to select instruction X86::CVTSI2SSrr (for an i32 to f32 conversion) and X86::CVTSI2SDrr (for an i32 to f64 conversion). This patch simplifies the logic in method X86SelectSIToFP knowing that the code would not be reachable if the subtarget doesn't have AVX. No functional change intended. llvm-svn: 231243	2015-03-04 14:23:25 +00:00
Dmitry Vyukov	b37b95ed3e	asan: do not instrument direct inbounds accesses to stack variables Do not instrument direct accesses to stack variables that can be proven to be inbounds, e.g. accesses to fields of structs on stack. This eliminates 33% of instrumentation on webrtc/modules_unittests (number of memory accesses goes down from 290152 to 193998), reduces binary size by 15% (from 74M to 64M), and improves compilation time by 6-12%. The optimization is guarded by asan-opt-stack flag that is off by default. http://reviews.llvm.org/D7583 llvm-svn: 231241	2015-03-04 13:27:53 +00:00
Chandler Carruth	9a53fbe243	[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind a flag for now. First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed. The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like: if (a1()) { if (b1()) c1(); else d1(); f1(); } if (a2()) { if (b2()) c2(); else d2(); f2(); } done(); Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like: a1, a2, done; And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to: a1, a2, done; b1, c1; At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1. ... wait for it ... Except that the next best block isn't d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG interleaved!!! a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2; So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be dead simple. 
It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist is the nearest next block. Unfortunately, a change like this is going to cause soooo many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238	2015-03-04 12:18:08 +00:00
Daniel Jasper	471e856f49	Add a flag to experiment with outlining optional branches. In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230	2015-03-04 11:05:34 +00:00
Kristof Beyls	aea8461820	Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet. As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers ld.bfd and ld.gold currently only support a subset of the whole range of AArch64 ELF TLS relocations. Furthermore, they assume that some of the code sequences to access thread-local variables are produced in a very specific sequence. When the sequence is not as the linker expects, it can silently mis-relax/mis-optimize the instructions. Even if that wouldn't be the case, it's good to produce the exact sequence, as that ensures that linkers can perform optimizing relaxations. This patch: * implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang would grow an -mtls-size option to allow support for both, but that's not part of this patch. * by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation) is added to enable generation of local dynamic code sequence, but is off by default. * makes sure that the exact expected code sequence for local dynamic and general dynamic accesses is produced, by making use of a new pseudo instruction. The patch also removes two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ). llvm-svn: 231227	2015-03-04 09:12:08 +00:00
Michael Kuperstein	fb95697c88	[DAGCombine] Fix a bug in a BUILD_VECTOR combine When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219	2015-03-04 07:27:39 +00:00
Davide Italiano	fcae934c03	[MC][Target] Implement support for R_X86_64_SIZE{32,64}. Differential Revision: D7990 Reviewed by: rafael, majnemer llvm-svn: 231216	2015-03-04 06:49:39 +00:00
Zachary Turner	653236596a	[llvm-pdbdump] Display full enum definitions. This will now display enum definitions both at the global scope as well as nested inside of classes. Additionally, it will no longer display enums at the global scope if the enum is nested. Instead, it will omit the definition of the enum globally and instead emit it in the corresponding class definition. llvm-svn: 231215	2015-03-04 06:09:53 +00:00
Filipe Cabecinhas	0524acc727	Fix the test for r231201. We don't crash anymore. llvm-svn: 231207	2015-03-04 02:09:40 +00:00
Rafael Espindola	310e4b592f	Use the vanilla func_end symbol for .size. No need to create yet another temp symbol. llvm-svn: 231198	2015-03-04 01:35:23 +00:00
Eric Christopher	afc703da52	Weaken the check for a specific movl on the twoaddr-coalesce-3 test - we only care that there are two moves in the loop and not which part is relative to which register anyhow. llvm-svn: 231191	2015-03-04 01:19:17 +00:00
Filipe Cabecinhas	6b79728815	Fix the x86-upgrade-avx2-vbroadcast.ll test by commenting the CHECK lines llvm-svn: 231187	2015-03-04 00:49:12 +00:00
Rafael Espindola	0ac5075f31	Drop the "eh_" from eh_func_begin and eh_func_end. They will be used for more than eh tables. llvm-svn: 231185	2015-03-04 00:27:43 +00:00
Philip Reames	6da37857d1	[RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form. However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR. This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there. Patch by: Chen Li Differential Revision: http://reviews.llvm.org/D7923 llvm-svn: 231183	2015-03-04 00:13:52 +00:00
Juergen Ributzka	1f7a17661c	Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic. The intrinsic is no longer generated by the front-end. Remove the intrinsic and auto-upgrade it to a vector shuffle. Reviewed by Nadav This is related to rdar://problem/18742778. llvm-svn: 231182	2015-03-04 00:13:25 +00:00
Eric Christopher	9900a5d037	Update twoaddr-coalesce-3.ll to run on darwin and linux machines: a) Default relocation model differences, b) Different numbers of # in comments llvm-svn: 231178	2015-03-03 23:56:20 +00:00
Kostya Serebryany	be5e0ed919	[sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing). Introduce -mllvm -sanitizer-coverage-8bit-counters=1 which adds imprecise thread-unfriendly 8-bit coverage counters. The run-time library maps these 8-bit counters to 8-bit bitsets in the same way AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does: counter values are divided into 8 ranges and based on the counter value one of the bits in the bitset is set. The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. These counters provide a search heuristic for single-threaded coverage-guided fuzzers, we do not expect them to be useful for other purposes. Depending on the value of -fsanitize-coverage=[123] flag, these counters will be added to the function entry blocks (=1), every basic block (=2), or every edge (=3). Use these counters as an optional search heuristic in the Fuzzer library. Add a test where this heuristic is critical. llvm-svn: 231166	2015-03-03 23:27:02 +00:00
Reid Kleckner	423665311d	WinEH: Remove vestigial EH object Ultimately, we'll need to leave something behind to indicate which alloca will hold the exception, but we can figure that out when it comes time to emit the __CxxFrameHandler3 catch handler table. llvm-svn: 231164	2015-03-03 23:20:30 +00:00
David Majnemer	1bacc0abc9	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select have identical select condition types. This fixes PR22773. llvm-svn: 231156	2015-03-03 22:40:36 +00:00
Andrew Kaylor	5b70b76069	Moving WinEH outlining tests to an architecture neutral location llvm-svn: 231155	2015-03-03 22:33:39 +00:00
Eric Christopher	2891913f1a	Fix a problem where the TwoAddressInstructionPass generates redundant register moves in a loop. From: int M, total; void foo() { int i; for (i = 0; i < M; i++) { total = total + i / 2; } } This is the kernel loop: .LBB0_2: # %for.body =>This Inner Loop Header: Depth=1 movl %edx, %esi movl %ecx, %edx shrl $31, %edx addl %ecx, %edx sarl %edx addl %esi, %edx incl %ecx cmpl %eax, %ecx jl .LBB0_2 -------------------------- The first mov insn "movl %edx, %esi" could be removed if we change "addl %esi, %edx" to "addl %edx, %esi". The IR before TwoAddressInstructionPass is: BB#2: derived from LLVM BB %for.body Predecessors according to CFG: BB#1 BB#2 %vreg3<def> = COPY %vreg12<kill>; GR32:%vreg3,%vreg12 %vreg2<def> = COPY %vreg11<kill>; GR32:%vreg2,%vreg11 %vreg7<def,tied1> = SHR32ri %vreg3<tied0>, 31, %EFLAGS<imp-def,dead>; GR32:%vreg7,%vreg3 %vreg8<def,tied1> = ADD32rr %vreg3<tied0>, %vreg7<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg3,%vreg7 %vreg9<def,tied1> = SAR32r1 %vreg8<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg9,%vreg8 %vreg4<def,tied1> = ADD32rr %vreg9<kill,tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg4,%vreg9,%vreg2 %vreg5<def,tied1> = INC64_32r %vreg3<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg5,%vreg3 CMP32rr %vreg5, %vreg0, %EFLAGS<imp-def>; GR32:%vreg5,%vreg0 %vreg11<def> = COPY %vreg4; GR32:%vreg11,%vreg4 %vreg12<def> = COPY %vreg5<kill>; GR32:%vreg12,%vreg5 JL_4 <BB#2>, %EFLAGS<imp-use,kill> Now TwoAddressInstructionPass will choose vreg9 to be tied with vreg4. However, it doesn't see that there is copy from vreg4 to vreg11 and another copy from vreg11 to vreg2 inside the loop body. To remove those copies, it is necessary to choose vreg2 to be tied with vreg4 instead of vreg9. This code pattern commonly appears when there is reduction operation in a loop. So check for a reversed copy chain and if we encounter one then we can commute the add instruction so we can avoid a copy. Patch by Wei Mi. 
http://reviews.llvm.org/D7806 llvm-svn: 231148	2015-03-03 22:03:03 +00:00
Nadav Rotem	029c5c7fdb	Teach ComputeNumSignBits about signed divisions. http://reviews.llvm.org/D8028 rdar://20023136 llvm-svn: 231140	2015-03-03 21:39:02 +00:00
Rafael Espindola	84483d247f	Correctly handle -pass-remarks in the gold plugin. llvm-svn: 231132	2015-03-03 21:11:13 +00:00
Paul Robinson	06a8eb8343	[X86][ELF] Correct relocation for DWARF TLS references Previously we had only Linux using DTPOFF for these; all X86 ELF targets should. Fixes a side issue mentioned in PR21077. Differential Revision: http://reviews.llvm.org/D8011 llvm-svn: 231130	2015-03-03 21:01:27 +00:00
Adrian Prantl	b283815a30	Fix PR22762. When emitting a DWARF expression check whether this is the frame register before checking if there is a DWARF register number for it. Thanks to H.J. Lu for diagnosing this and providing the testcase! llvm-svn: 231121	2015-03-03 20:12:52 +00:00
Andrew Kaylor	f0f5e46e07	Outline cleanup handlers for native Windows C++ exception handling Differential Revision: http://reviews.llvm.org/D7865 llvm-svn: 231117	2015-03-03 20:00:16 +00:00
Kit Barton	0cfa7b7ad0	Add the following 64-bit vector integer arithmetic instructions added in POWER8: vaddudm vsubudm vmulesw vmulosw vmuleuw vmulouw vmuluwm vmaxsd vmaxud vminsd vminud vcmpequd vcmpequd. vcmpgtsd vcmpgtsd. vcmpgtud vcmpgtud. vrld vsld vsrd vsrad Phabricator review: http://reviews.llvm.org/D7959 llvm-svn: 231115	2015-03-03 19:55:45 +00:00
Reid Kleckner	2f05d4c91f	Make llvm.eh.begincatch use an outparam Ultimately, __CxxFrameHandler3 needs us to put a stack offset in a table, and it will take responsibility for copying the exception object into that slot. Modelling the exception object as an SSA value returned by begincatch isn't going to work in general, so make it use an output parameter. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D7920 llvm-svn: 231086	2015-03-03 17:41:09 +00:00
Chad Rosier	8e38f30e49	[AArch64] When combining constant mul of -3, prefer (sub x, (shl x, N)). This change only affects codegen when the constant is -3. llvm-svn: 231085	2015-03-03 17:31:01 +00:00
Duncan P. N. Exon Smith	e274180f0e	DebugInfo: Move new hierarchy into place Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find a way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082	2015-03-03 17:24:31 +00:00
NAKAMURA Takumi	10d576d8dc	Make llvm/test/Object/archive-format.test CRLF-tolerant. llvm-svn: 231074	2015-03-03 15:54:48 +00:00
Daniel Jasper	8f239f83b0	During PHI elimination, split critical edges that move copies out of loops. This prevents the behavior observed in llvm.org/PR22369. I am not sure whether I am reading the code correctly, but the early exit based on isLiveOutPastPHIs() seems to make the wrong assumption that RegisterCoalescer won't be able to coalesce those copies later. This change hides the new behavior behind -no-phi-elim-live-out-early-exit as it currently breaks four tests: * Assertion in: CodeGen/Hexagon/hwloop-cleanup.ll * Worse code in: CodeGen/X86/coalescer-commute4.ll CodeGen/X86/phys_subreg_coalesce-2.ll CodeGen/X86/zlib-longest-match.ll The root cause here seems to be that the heuristic that determines the visitation order in RegisterCoalescer gets less lucky. llvm-svn: 231064	2015-03-03 10:23:11 +00:00
Owen Anderson	7325b91783	Cleanup after r230934 per Dave's suggestions. llvm-svn: 231056	2015-03-03 05:39:27 +00:00
Ahmed Bougacha	afbd6887c4	[X86] Special-case 2x CMOV when custom-inserting. This lets us avoid a few copies that are otherwise hard to get rid of. The way this is done is, the custom-inserter looks at the following instruction for another CMOV, and replaces both at the same time. A previous version used a new CMOV2 opcode, but the custom inserter is expected to be able to return a different basic block anyway, which means it's OK - though far from ideal - to alter that block's contents. Explicitly document that, in case it ever makes a difference. Alternatives welcome! Follow-up to r231045. rdar://19767934 Closes http://reviews.llvm.org/D8019 llvm-svn: 231046	2015-03-03 01:21:16 +00:00
Ahmed Bougacha	066d0b8e64	[X86] Combine (cmov (and/or (setcc) (setcc))) into (cmov (cmov)). Fold and/or of setcc's to double CMOV: (CMOV F, T, ((cc1 | cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2) (CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2) When we can't use the CMOV instruction, it might increase branch mispredicts. When we can, or when there is no mispredict, this improves throughput and reduces register pressure. These can't be caught by generic combines, because the pattern can appear when legalizing some instructions (such as fcmp une). rdar://19767934 http://reviews.llvm.org/D7634 llvm-svn: 231045	2015-03-03 01:09:14 +00:00
Reid Kleckner	1d2c3f91cd	Fix cppeh breakage due to racing commits llvm-svn: 231044	2015-03-03 01:04:39 +00:00
Peter Collingbourne	da2dbf21a9	LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets. By loading from indexed offsets into a byte array and applying a mask, a program can test bits from the bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out: A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit) These bits can be laid out in a 16-byte array like this: Byte Offset 0123456789ABCDEF Bit 7 HHHHHHHHHIIIIIII 6 GGGGGGGGGGJJJJJJ 5 FFFFFFFFFFFKKKKK 4 EEEEEEEEEEEELLLL 3 DDDDDDDDDDDDDMMM 2 CCCCCCCCCCCCCCNN 1 BBBBBBBBBBBBBBBO 0 AAAAAAAAAAAAAAAA For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM. This uses the LPT multiprocessor scheduling algorithm to lay out the bits efficiently. Saves ~450KB of instructions in a recent build of Chromium. Differential Revision: http://reviews.llvm.org/D7954 llvm-svn: 231043	2015-03-03 00:49:28 +00:00
Andrew Kaylor	72029c6f2f	Remap arguments and non-alloca values used by outlined C++ exception handlers. Differential Revision: http://reviews.llvm.org/D7844 llvm-svn: 231042	2015-03-03 00:41:03 +00:00
Benjamin Kramer	838752d3f6	LoopIdiom: Give globals for memset_pattern16 private linkage. There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. llvm-svn: 231041	2015-03-03 00:17:09 +00:00
Reid Kleckner	6f0e4b897e	WinEH: Run opt -instnamer over some cppeh tests and update CHECKs In the future, we should run the output of clang through instnamer to make it easier to manually edit test cases. No functionality change. llvm-svn: 231037	2015-03-03 00:05:35 +00:00
Adrian Prantl	92da14b244	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 without the assertion in DebugLocEntry::finalize() because not all Machine registers can be lowered into DWARF register numbers and floating point constants cannot be expressed. llvm-svn: 231023	2015-03-02 22:02:33 +00:00
Sanjoy Das	2d38031271	Revert some changes that were made to fix PR20680. This re-lands change r230921. r230921 was reverted because it broke a clang test; a checkin fixing the clang test will be committed shortly. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 231018	2015-03-02 21:41:07 +00:00
Reid Kleckner	02ec6a3ec3	lit: Add 'cd' support to the internal shell and port some tests The internal shell was already threading around a 'cwd' parameter. We just have to make it mutable so that we can update it as the test script executes. If the shell ever grows support for environment variable substitution, we could also implement support for export. llvm-svn: 231017	2015-03-02 21:33:18 +00:00
Adrian Prantl	2185aa179d	Revert "Refactor DebugLocDWARFExpression so it doesn't require access to the" This reverts commit 230975 to investigate buildbot breakage. llvm-svn: 231004	2015-03-02 20:01:54 +00:00
David Blaikie	41fe3a495d	Change SystemZ large tests to use the existing long_tests property (this is already used in Clang for a couple of tests) Reviewers: uweigand Differential Revision: http://reviews.llvm.org/D7965 llvm-svn: 230998	2015-03-02 19:34:11 +00:00
Rafael Espindola	503f883b95	Add r230655 back with a fix. The issue is that now we have a diag handler during optimizations and get forwarded every optimization remark, flooding stdout. The same filtering should probably be done with or without a custom handler, but for now just ignore remarks. Original message: gold-plugin: "Upgrade" debug info and handle its warnings. The gold plugin never calls MaterializeModule, so any old debug info was not deleted and could cause crashes. Now that it is being "upgraded", the plugin also has to handle warnings and create Modules with a nice id (it shows in the warning). llvm-svn: 230991	2015-03-02 19:08:03 +00:00
Paul Robinson	9f4cfc574e	Revert r230979, should apply to all X86 ELF. llvm-svn: 230985	2015-03-02 18:50:18 +00:00
Paul Robinson	10ae2e52de	[PS4] Correct relocation for DWARF TLS references. llvm-svn: 230979	2015-03-02 17:44:52 +00:00
Adrian Prantl	d50bca7314	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. This reapplies 230930 with a relaxed assertion in DebugLocEntry::finalize() that allows for empty DWARF expressions for constant FP values. llvm-svn: 230975	2015-03-02 17:21:06 +00:00
Elena Demikhovsky	18fd49602b	AVX-512: Add assembly parser support for Rounding mode By Asaf Badouh <asaf.badouh@intel.com> llvm-svn: 230962	2015-03-02 15:00:34 +00:00
Vasileios Kalintiris	e741eb2c7d	[mips] Optimize conditional moves where RHS is zero. Summary: When the RHS of a conditional move node is zero, we can utilize the $zero register by inverting the conditional move instruction and by swapping the order of its True/False operands. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D7945 llvm-svn: 230956	2015-03-02 12:47:32 +00:00
Owen Anderson	63fbf10c32	Teach the verifier to enforce that the alignment argument of memory intrinsics must be a power of 2. llvm-svn: 230941	2015-03-02 09:35:06 +00:00
Owen Anderson	5af4b21c2e	Teach DataLayout that alignments on basic types must be powers of two. Fixes assertion failures/crashes on bad datalayout specifications. llvm-svn: 230940	2015-03-02 09:35:03 +00:00
Owen Anderson	ab1c7a77d2	Teach DataLayout that ABI alignments for non-aggregate types must be non-zero. This manifested as assertions and/or crashes in later phases of optimization, depending on the build configuration. llvm-svn: 230939	2015-03-02 09:34:59 +00:00
Owen Anderson	040f2f890e	Teach DataLayout that pointer ABI and preferred alignments are required to be powers of two. Previously this resulted in asserts and/or crashes (depending on build configuration) at various phases in the optimizer. llvm-svn: 230938	2015-03-02 06:33:51 +00:00
Owen Anderson	5bc2bbe601	Teach DataLayout that zero-byte pointer sizes don't make sense. Previously this would result in assertion failures or simply crashes at various points in the optimizer when trying to create types of zero bit width. llvm-svn: 230936	2015-03-02 06:00:02 +00:00
Owen Anderson	576a9a2728	Teach the LLParser to fail gracefully when it encounters an invalid label name. Previously it would either assert in +Asserts, or crash in -Asserts. Found by fuzzing LLParser. llvm-svn: 230935	2015-03-02 05:25:09 +00:00
Owen Anderson	91bdf07650	Fix a crash in the LL parser where it failed to validate that the pointer operand of a GEP was valid. This manifested as an assertion failure in +Asserts builds, and a hard crash in -Asserts builds. Found by fuzzing the LL parser. llvm-svn: 230934	2015-03-02 05:25:06 +00:00
Zachary Turner	7797c726b9	[llvm-pdbdump] Many minor fixes and improvements A short list of some of the improvements: 1) Now supports -all command line argument, which implies many other command line arguments to simplify usage. 2) Now supports -no-compiler-generated command line argument to exclude compiler generated types. 3) Prints base class list. 4) -class-definitions implies -types. 5) Proper display of bitfields. 6) Can now distinguish between struct/class/interface/union. And a few other minor tweaks. llvm-svn: 230933	2015-03-02 04:39:56 +00:00
Nico Weber	968ceddca9	Revert r230930, it caused PR22747. llvm-svn: 230932	2015-03-02 04:37:11 +00:00
Adrian Prantl	e2c9e64532	Refactor DebugLocDWARFExpression so it doesn't require access to the TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes of the pre-calculated DWARF expression. Ought to be NFC, but it does slightly alter the output format of the textual assembly. llvm-svn: 230930	2015-03-02 02:38:18 +00:00
NAKAMURA Takumi	0cd23c842e	Revert r230921, "Revert some changes that were made to fix PR20680.", for now. It caused a failure on clang/test/Misc/backend-optimization-failure.cpp . llvm-svn: 230929	2015-03-02 01:14:03 +00:00
Craig Topper	09b27e7b24	[X86] Fix disassembler crash on AVX512 cmpps/cmppd with immediate that doesn't fit in 5 bits. Fixes PR22743. llvm-svn: 230924	2015-03-02 00:22:29 +00:00
Sanjoy Das	876bd51486	Revert some changes that were made to fix PR20680. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 230921	2015-03-01 23:36:26 +00:00
Elena Demikhovsky	02ffd26023	AVX-512: Added mask and rounding mode for scalar arithmetics Added more tests for scalar instructions to distinguish between AVX and AVX-512 forms. llvm-svn: 230891	2015-03-01 07:44:04 +00:00
Zachary Turner	f5abda2a2f	[llvm-pdbdump] Add regex-based filtering. llvm-svn: 230888	2015-03-01 06:49:49 +00:00
NAKAMURA Takumi	0f480f5010	Revert r230655, "gold-plugin: "Upgrade" debug info and handle its warnings." It emits millions of warnings during a selfhosting LTO build, choking the buildbot with gigabytes of log. llvm-svn: 230885	2015-03-01 04:16:28 +00:00
Sanjay Patel	b8c907e2a7	avoid infinite looping when folding vector multiplies of constants (PR22698) We were missing a check for the following fold in DAGCombiner: // fold (fmul (fmul x, c1), c2) -> (fmul x, (fmul c1, c2)) If 'x' is also a constant, then we shouldn't do anything. Otherwise, we could end up swapping the operands back and forth forever. This should fix: http://llvm.org/bugs/show_bug.cgi?id=22698 Differential Revision: http://reviews.llvm.org/D7917 llvm-svn: 230884	2015-03-01 00:09:35 +00:00
Sanjay Patel	d076b2a879	fixed to test only the feature, not the feature and a CPU llvm-svn: 230883	2015-03-01 00:02:03 +00:00
Duncan P. N. Exon Smith	d0c2a99f0e	DebugInfo: Convert DW_OP_piece => DW_OP_bit_piece r228631 stopped using `DW_OP_piece` inside `DIExpression`s in the IR, but it apparently missed updating these testcases. Caught by verifier checks for `MDExpression` while working on moving the new hierarchy into place. llvm-svn: 230882	2015-02-28 23:57:16 +00:00
Sanjay Patel	7aa7412a0b	make the tested feature (SSE2) explicit llvm-svn: 230881	2015-02-28 23:55:24 +00:00
Duncan P. N. Exon Smith	02f4bbc588	DebugInfo: Fix invalid file reference in CodeGen/X86/unknown-location.ll There are two types of files in the old (current) debug info schema. !0 = !{!"some/filename", !"/path/to/dir"} !1 = !{!"0x29", !0} ; [ DW_TAG_file_type ] !1 has a wrapper class called `DIFile` which inherits from `DIScope` and is referenced in 'scope' fields. !0 is called a "file node", and debug info nodes with a 'file' field point at one of these directly -- although they're built in `DIBuilder` by sending in a `DIFile` and reaching into it. In the new hierarchy, I unified these nodes as `MDFile` (which `DIFile` is a lightweight wrapper for) in r230057. Moving the new hierarchy into place (and upgrading testcases) caused CodeGen/X86/unknown-location.ll to start failing -- apparently "0x29" was previously showing up in the linetable as a filename, causing: .loc 2 4 3 (where 2 points at filename "0x29") instead of: .loc 1 4 3 (where 1 points at the actual filename). Change the testcase to use the old schema correctly. llvm-svn: 230880	2015-02-28 23:52:24 +00:00
Sanjay Patel	db962e2afb	fixed to test only the feature, not the feature and a CPU llvm-svn: 230878	2015-02-28 23:47:09 +00:00
Duncan P. N. Exon Smith	16d182acb9	Optimize metadata node fields for CHECK-ability While gaining practical experience hand-updating CHECK lines (for moving the new debug info hierarchy into place), I learnt a few things about CHECK-ability of the specialized node assembly output. - The first part of a `CHECK:` is to identify the "right" node (this is especially true if you intend to use the new `CHECK-SAME` feature, since the first CHECK needs to identify the node correctly before you can split the line). - If there's a `tag:`, it should go first. - If there's a `name:`, it should go next (followed by the `linkageName:`, if any). - If there's a `scope:`, it should follow after that. - When a node type supports multiple DW_TAGs, but one is implied by its name and is overwhelmingly more common, the `tag:` field is terribly uninteresting unless it's different. - `MDBasicType` is almost always `DW_TAG_base_type`. - `MDTemplateValueParameter` is almost always `DW_TAG_template_value_parameter`. - Printing `name: ""` doesn't improve CHECK-ability, and there are far more nodes than I realized that are commonly nameless. - There are a few other fields that similarly aren't very interesting when they're empty. This commit updates the `AsmWriter` as suggested above (and makes necessary changes in `LLParser` for round-tripping). llvm-svn: 230877	2015-02-28 23:21:38 +00:00
Duncan P. N. Exon Smith	c296fcc39e	AsmWriter: Escape string fields in metadata Properly escape string fields in metadata. I've added a spot-check with direct coverage for `MDFile::getFilename()`, but we'll get more coverage once the hierarchy is moved into place (since this comes up in various checked-in testcases). I've replicated the `if` logic using the `ShouldSkipEmpty` flag (although a follow-up commit is going to change how often this flag is specified); no NFCI other than escaping the string fields. llvm-svn: 230875	2015-02-28 22:20:16 +00:00
Duncan P. N. Exon Smith	a951165e5a	Fix line endings on Transforms/Inline/inline_dbg_declare.ll llvm-svn: 230870	2015-02-28 21:38:32 +00:00
Craig Topper	782d620657	[X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can be represented by shuffle_vector instructions. llvm-svn: 230860	2015-02-28 19:33:17 +00:00
Benjamin Kramer	cb570f1bc9	TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator. Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting them early also slightly simplifies code. Thanks to Sanjay for the IR test case. llvm-svn: 230856	2015-02-28 16:47:27 +00:00
Eric Christopher	b759340fc8	Remove option.ll as part of the Forward Control Flow Integrity removal. llvm-svn: 230844	2015-02-28 10:04:18 +00:00
Philip Reames	2e5bcbe8d5	[RewriteStatepointsForGC] Fix another order of iteration bug It turns out the naming of inserted phis and selects is sensitive to the order in which two sets are iterated. We need to nail this down to avoid non-deterministic output and possible test failures. The modified test is the one I first noticed something odd in. The change is making it more strict to report the error. With the test change, but without the code change, the test fails roughly 1 in 5. With the code change, I've run ~30 runs without error. Long term, the right fix here is to adjust the naming scheme. I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend. Just because I only noticed one case doesn't mean it's actually the only case. I hope to get to the right change Monday. std->llvm data structure changes bugfix change #3 llvm-svn: 230835	2015-02-28 01:52:09 +00:00
Frederic Riss	c99ea20eda	[dsymutil] Add the DwarfStreamer class. This class is responsible for getting the linked data to the disk in the appropriate form. Today it is an empty shell that just instantiates an MC layer. As we do not put anything in the resulting file yet, we just check it has the right architecture (and check that -o does the right thing). To be able to create all the components, this commit adds a few dependencies to llvm-dsymutil, namely all-targets, MC and AsmPrinter. Also add a -no-output option, so that tests that do not need the binary result can continue to run even if they do not have the required target linked in. llvm-svn: 230824	2015-02-28 00:29:11 +00:00
Philip Reames	a5aeaf4b4f	[RewriteStatepointsForGC] Add tests for the base pointer identification algorithm These tests cover the 'base object' identification and rewriting portion of RewriteStatepointsForGC. These aren't completely exhaustive, but they've proven to be reasonably effective over time at finding regressions. In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug. We were relying on the order of iteration when testing the base pointers found for a derived pointer. When we switched from std::set to DenseSet, this stopped being a safe assumption. I'm suspecting I'm going to find more of those. In particular, I'm now really wondering about the main iteration loop for this algorithm. I need to go take a closer look at the assumptions there. I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags). Suggestions for how to structure this better are very welcome. llvm-svn: 230818	2015-02-28 00:20:48 +00:00
Bill Schmidt	164350e2ea	Regenerated test case from pr 230801 for change in LLVM IR syntax llvm-svn: 230811	2015-02-27 23:29:57 +00:00
David Blaikie	2c302a8dfa	Update SystemZ/Large test generators to handle new gep IR syntax llvm-svn: 230810	2015-02-27 23:29:39 +00:00
David Blaikie	d7e13b0eb2	Update SystemZ/Large test generators to handle new load IR syntax llvm-svn: 230809	2015-02-27 23:29:33 +00:00
David Majnemer	86ee173712	llvm-vtabledump: Update field with a better name llvm-svn: 230804	2015-02-27 22:35:25 +00:00
Bill Schmidt	bb9460a3bc	Revert test case until it can be fixed llvm-svn: 230803	2015-02-27 22:31:14 +00:00
Bill Schmidt	e3959eb54e	[PowerPC] Fix PR22711 - Misaligned .toc section Straightforward patch to emit an alignment directive when emitting a TOC entry. The test case was generated from the test in PR22711 that demonstrated a misaligned .toc section. The object code is run through llvm-readobj to verify that the correct alignment has been applied to the .toc section. Thanks to Ulrich Weigand for running down where the fix was needed. llvm-svn: 230801	2015-02-27 22:14:10 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Charles Davis	83687fb9e6	Target/X86: Never use the redzone for Win64 ABI functions. Summary: Until now, we did this (among other things) based on whether or not the target was Windows. This is clearly wrong, not just for Win64 ABI functions on non-Windows, but for System V ABI functions on Windows, too. In this change, we make this decision based on the ABI the calling convention specifies instead. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7953 llvm-svn: 230793	2015-02-27 21:11:16 +00:00
Hal Finkel	5c3cacf5c0	[PowerPC] Use vector types for memcpy and friends (sometimes) When using Altivec, we can use vector loads and stores for aligned memcpy and friends. Starting with the P7 and VSX, we have reasonable unaligned vector stores. Starting with the P8, we have fast unaligned loads too. For QPX, we use vector loads and stores, but only for aligned memory accesses. llvm-svn: 230788	2015-02-27 19:58:28 +00:00
David Blaikie	79e6c74981	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by some crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. 
update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786	2015-02-27 19:29:02 +00:00
Eric Christopher	3b94e33277	Remove the Forward Control Flow Integrity pass and its dependencies. This work is currently being rethought along different lines and if this work is needed it can be resurrected out of svn. Remove it for now as no current work is ongoing on it and it's unused. Verified with the authors before removal. llvm-svn: 230780	2015-02-27 19:03:38 +00:00
Justin Bogner	ac631cb03d	Object: Test for reading kext bundles In the review for r230567, it was pointed out we should really test the lib/Object part of that change. This does so using llvm-readobj. llvm-svn: 230779	2015-02-27 18:58:23 +00:00
Mehdi Amini	945a660cbc	Change the fast-isel-abort option from bool to int to enable "levels" Summary: Currently fast-isel-abort will only abort for regular instructions, and just warn for function calls, terminators, function arguments. There is already fast-isel-abort-args but nothing for calls and terminators. This change turns the fast-isel-abort options into an integer option, so that multiple levels of strictness can be defined. This will help avoid surprises when the "abort" option does not actually abort, and makes it possible to write tests that verify that no intrinsics are forgotten by fast-isel. Reviewers: resistor, echristo Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D7941 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 230775	2015-02-27 18:32:11 +00:00
Rafael Espindola	629cdbae94	Centralize handling of the eh_begin and eh_end labels. This removes a bit of duplicated code and more importantly, remembers the labels so that they don't need to be looked up by name. This in turn allows for any name to be used and avoids a crash if the name we wanted was already taken. llvm-svn: 230772	2015-02-27 18:18:39 +00:00
Renato Golin	a78995c0a0	Equally to NetBSD, Bitrig/ARM uses the Itanium-ABI. Patch by Patrick Wildt. llvm-svn: 230762	2015-02-27 16:35:27 +00:00
Zoran Jovanovic	71a33e2ad6	[mips][microMIPS] Change register class for GP register Differential Revision: http://reviews.llvm.org/D7934 llvm-svn: 230760	2015-02-27 15:03:50 +00:00
Petar Jovanovic	1df918083c	Pass correct -mtriple for krait-cpu-div-attribute.ll Not passing mtriple for one of the tests caused a regression failure on MIPS buildbot. The issue was introduced by r230651. Differential Revision: http://reviews.llvm.org/D7938 llvm-svn: 230756	2015-02-27 14:46:41 +00:00
Chandler Carruth	9ad2ffac23	[x86] Run most of the rest of the shuffle combining over non-128-bit vectors. This lets us fix the rest of the v16 lowering problems when pshufb is clearly better. We might still be able to improve some of the lowerings by enabling the other combine-based rewriting to fire for non-128-bit vectors, but this at least should remove any regressions from using the fancy v16i16 lowering strategy. llvm-svn: 230753	2015-02-27 12:13:14 +00:00
Chandler Carruth	66b705bc64	[x86] Teach a bunch of the x86-specific shuffle combining to work with 256-bit vectors as well as 128-bit vectors. Fixes some of the redundant shuffles for v16i16. llvm-svn: 230752	2015-02-27 11:45:13 +00:00
Chandler Carruth	97f3260f57	[x86] Make the v8i16 clever single-input shuffle lowering usable for repeated 128-bit lane shuffles of wider vector types and use it to lower 256-bit v16i16 vector shuffles where applicable. This should let us perfectly lower the pattern of pshuflw and pshufhw even for AVX2 256-bit patterns. I've not added AVX-512 support, but it should be trivial for someone working on that to wire up. Note that currently this generates bad, long shuffle chains because we don't combine 256-bit target shuffles. The subsequent patches will fix that. llvm-svn: 230751	2015-02-27 11:33:46 +00:00
Chandler Carruth	84dfd1a851	[x86] Add a bunch more tests for v16i16 shuffles. All of these are taken by mirroring v8i16 test cases across both 128-bit lanes. This should highlight problems where we aren't correctly using 128-bit shuffles to implement things. llvm-svn: 230750	2015-02-27 11:25:10 +00:00
Zachary Turner	db18f5ca76	[llvm-pdbdump] Add support for dumping global variables. llvm-svn: 230744	2015-02-27 09:15:18 +00:00
Vasileios Kalintiris	18581f16b4	[mips] Account for constant-zero operands in ADDE nodes. Summary: We identify the cases where the operand to an ADDE node is a constant zero. In such cases, we can avoid generating an extra ADDu instruction disguised as an identity move alias (ie. addu $r, $r, 0 --> move $r, $r). Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7906 llvm-svn: 230742	2015-02-27 09:01:39 +00:00
Anna Zaks	8ed1d8196b	[asan] Skip promotable allocas to improve performance at -O0 Currently, the ASan executables built with -O0 are unnecessarily slow. The main reason is that ASan instrumentation pass inserts redundant checks around promotable allocas. These allocas do not get instrumented under -O1 because they get converted to virtual registers by mem2reg. With this patch, ASan instrumentation pass will only instrument non promotable allocas, giving us a speedup of 39% on a collection of benchmarks with -O0. (There is no measurable speedup at -O1.) llvm-svn: 230724	2015-02-27 03:12:36 +00:00
Charles Davis	84d28de627	Target/X86: Save Win64 non-volatile registers in a Win64 ABI function. Summary: This change causes us to actually save non-volatile registers in a Win64 ABI function that calls a System V ABI function, and vice-versa. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7919 llvm-svn: 230714	2015-02-27 00:57:01 +00:00
David Majnemer	f50d0a5ecf	llvm-vtabledump: Dump catch/throw exception structures for MS ABI llvm-svn: 230713	2015-02-27 00:43:58 +00:00
Rafael Espindola	4491d0d337	Put jump tables in distinct sections if -ffunction-sections is used. A small regression in r230411 was that we were basing the decision on -fdata-sections. llvm-svn: 230707	2015-02-26 23:55:11 +00:00
Zachary Turner	d270d22f35	[llvm-pdbdump] Fix dumping of function pointers and basic types. Function pointers were not correctly handled by the dumper, and they would print as "* name". They now print as "int (__cdecl *name)(int arg1, int arg2)" as they should. Also, doubles were being printed as floats. This fixes that bug as well, and adds tests for all builtin types, as well as a test for function pointers. llvm-svn: 230703	2015-02-26 23:49:23 +00:00
Chandler Carruth	653773d004	[x86] Fix PR22706 where we would incorrectly try to lower a v32i8 dynamic blend as legal. We made the same mistake in two different places. Whenever we are custom lowering a v32i8 blend we need to check whether we are custom lowering it only for constant conditions that can be shuffled, or whether we actually have AVX2 and full dynamic blending support on bytes. Both are fixed, with comments added to make it clear what is going on and a new test case. llvm-svn: 230695	2015-02-26 22:15:34 +00:00
Sanjoy Das	54ad996ca2	IRCE: add a test case for r230619. llvm-svn: 230680	2015-02-26 20:14:32 +00:00
Frederic Riss	adbb3f207f	[MC] Use the non-EH register mapping in the debug_frame section. On 32-bit x86 Darwin, the register mappings for the eh_frame and debug_frame sections are different. Thus the same CFI instructions should result in different registers in the object file. The problem isn't target specific though, but it requires that the mappings for EH register numbers be different from the standard Dwarf one. The patch looks a bit clumsy. LLVM uses the EH mapping as canonical for everything frame related. Thus we need to do a double conversion EH -> LLVM -> Non-EH, when emitting the debug_frame section. Fixes PR22363. Differential Revision: http://reviews.llvm.org/D7593 llvm-svn: 230670	2015-02-26 19:48:07 +00:00
Reid Kleckner	e81017248c	Don't sibcall between SysV and Win64 convention functions The shadow stack space expectations won't match. Fixes PR22709. llvm-svn: 230667	2015-02-26 19:43:20 +00:00
Hal Finkel	221f467185	[InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores InstCombine has long had logic to convert aligned Altivec load/store intrinsics into regular loads and stores. This mirrors that functionality for QPX vector load/store intrinsics. llvm-svn: 230660	2015-02-26 18:56:03 +00:00
Paul Robinson	093d6e1a70	When the source has a series of assignments, users reasonably want to have the debugger step through each one individually. Turn off the combine for adjacent stores at -O0 so we get this behavior. Possibly, DAGCombine shouldn't run at all at -O0, but that's for another day; see PR22346. Differential Revision: http://reviews.llvm.org/D7181 llvm-svn: 230659	2015-02-26 18:47:57 +00:00
Petar Jovanovic	90ec1b175e	Fix justify error for small structures in varargs for MIPS64BE There was a problem when passing structures as variable arguments. The structures smaller than 64 bit were not left justified on MIPS64 big endian. This is now fixed by shifting the value to make it left- justified when appropriate. This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608 Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D7881 llvm-svn: 230657	2015-02-26 18:35:15 +00:00
Rafael Espindola	7360fb6206	gold-plugin: "Upgrade" debug info and handle its warnings. The gold plugin never calls MaterializeModule, so any old debug info was not deleted and could cause crashes. Now that it is being "upgraded", the plugin also has to handle warnings and create Modules with a nice id (it shows in the warning). llvm-svn: 230655	2015-02-26 18:24:37 +00:00
Sumanth Gundapaneni	28a3b86b06	Use ".arch_extension" ARM directive to support hwdiv on krait In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the feature bits are not computed. This patch lets the asm printer emit ".cpu cortex-a9" directive for krait and the hwdiv feature is enabled through ".arch_extension". In short, krait is treated as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since it is not supported by GNU GAS yet. llvm-svn: 230651	2015-02-26 18:08:41 +00:00
Adam Nemet	9cc0c3999d	[LV/LoopAccesses] Backward dependences are not safe just because the accesses are via different types Noticed this while generalizing the code for loop distribution. I confirmed with Arnold that this was indeed a bug and managed to create a testcase. llvm-svn: 230647	2015-02-26 17:58:48 +00:00
Tom Stellard	eb05c610b4	R600/SI: Remove M0 from DS assembly strings This matches the assembly syntax for the proprietary compiler. llvm-svn: 230645	2015-02-26 17:08:43 +00:00
Bruno Cardoso Lopes	9801cd9b6a	[X86][MMX] Fix a typo in a couple of tests llvm-svn: 230638	2015-02-26 15:16:09 +00:00
Bruno Cardoso Lopes	7b6c1ec22d	[X86][MMX] Remove widening experimental flag from MMX tests. Turns out that after the past MMX commits, we don't need to rely on this flag to get better codegen for MMX. Also update the tests to become triple neutral. llvm-svn: 230637	2015-02-26 15:10:38 +00:00
Hal Finkel	18ee7c14fd	[InstCombine] Add a test for altivec load/store intrinsic simplification InstCombine has logic to convert aligned Altivec load/store intrinsics into regular loads and stores. Unfortunately, there seems to be no regression test covering this behavior. Adding one... llvm-svn: 230632	2015-02-26 14:22:41 +00:00
Vladimir Medic	187958b27a	Replace obsolete -mattr=n64 command line option with -target-abi=n64. No functional changes. llvm-svn: 230628	2015-02-26 12:29:48 +00:00
Sanjoy Das	e75ed92630	IRCE: generalize to handle loops with decreasing induction variables. IRCE can now split the iteration space for loops like: for (i = n; i >= 0; i--) a[i + k] = 42; // bounds check on access llvm-svn: 230618	2015-02-26 08:19:31 +00:00
Duncan P. N. Exon Smith	01ac1707d6	FileCheck: Add CHECK-SAME Add `CHECK-SAME`, which requires that the pattern matches on the same line as the previous `CHECK`/`CHECK-NEXT` -- in other words, no newline is allowed in the skipped region. This is similar to `CHECK-NEXT`, which requires exactly 1 newline in the skipped region. My motivation is to simplify checking the long lines of LLVM assembly for the new debug info hierarchy. This allows CHECK sequences like the following: CHECK: ![[REF]] = !SomeMDNode( CHECK-SAME: file: ![[FILE:[0-9]+]] CHECK-SAME: otherField: 93{{[,)]}} which is equivalent to: CHECK: ![[REF]] = !SomeMDNode({{.*}}file: ![[FILE:[0-9]+]]{{.*}}otherField: 93{{[,)]}} While this example just has two fields, many nodes in debug info have more than that. `CHECK-SAME` will keep the logic easy to follow. Moreover, it enables interleaving `CHECK-NOT`s without allowing newlines. Consider the following: CHECK: ![[REF]] = !SomeMDNode( CHECK-SAME: file: ![[FILE:[0-9]+]] CHECK-NOT: unexpectedField: CHECK-SAME: otherField: 93{{[,)]}} CHECK-NOT: otherUnexpectedField: CHECK-SAME: ) which doesn't seem to have an equivalent `CHECK` line. llvm-svn: 230612	2015-02-26 04:53:00 +00:00
Ramkumar Ramachandra	3408f3e296	PlaceSafepoints: use IRBuilder helpers Use the IRBuilder helpers for gc.statepoint and gc.result, instead of coding the construction by hand. Note that the gc.statepoint IRBuilder handles only CallInst, not InvokeInst; retain that part of hand-coding. Differential Revision: http://reviews.llvm.org/D7518 llvm-svn: 230591	2015-02-26 00:35:56 +00:00
Justin Bogner	2e427d4dbd	InstrProf: Make the __llvm_profile_runtime_user symbol hidden This symbol exists only to pull in the required pieces of the runtime, so nothing ever needs to refer to it. Making it hidden avoids the potential for issues with duplicate symbols when linking profiled libraries together. llvm-svn: 230566	2015-02-25 22:52:20 +00:00
Sanjay Patel	cc29f4f2cb	only propagate equality comparisons of FP values that we are certain are non-zero This is a follow-on to r227491 which tightens the check for propagating FP values. If a non-constant value happens to be a zero, we would hit the same bug as before. Bug noted and patch suggested by Eli Friedman. llvm-svn: 230564	2015-02-25 22:46:08 +00:00
JF Bastien	d52c990a90	InstCombine: extract instead of shuffle when performing vector/array type punning Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is. Test Plan: make check Reviewers: jvoung, chandlerc Subscribers: llvm-commits llvm-svn: 230560	2015-02-25 22:30:51 +00:00
Hal Finkel	cf59921670	[PowerPC] Make LDtocL and friends invariant loads LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node, represent loads from the TOC, which is invariant. As a result, these loads can be hoisted out of loops, etc. In order to do this, we need to generate GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory intrinsic node type. Once this is done, the MMO transfer is automatically handled for TableGen-driven instruction selection, and for nodes generated directly in PPCISelDAGToDAG, we need to transfer the MMOs manually. Also, we were not transferring MMOs associated with pre-increment loads, so do that too. Lastly, this fixes an exposed bug where R30 was not added as a defined operand of UpdateGBR. This problem was highlighted by an example (used to generate the test case) posted to llvmdev by Francois Pichet. llvm-svn: 230553	2015-02-25 21:36:59 +00:00
Frederic Riss	c0dd7243ee	[dwarfdump] Make debug_frame dump actually useful. This adds support for pretty-printing instruction operands. The new output looks like: 00000000 00000010 ffffffff CIE Version: 1 Augmentation: Code alignment factor: 1 Data alignment factor: -4 Return address column: 8 DW_CFA_def_cfa: reg4 +4 DW_CFA_offset: reg8 -4 DW_CFA_nop: DW_CFA_nop: 00000014 00000010 00000000 FDE cie=00000000 pc=00000000...00000022 DW_CFA_advance_loc: 3 DW_CFA_def_cfa_offset: +12 DW_CFA_nop: llvm-svn: 230551	2015-02-25 21:30:22 +00:00
David Majnemer	e1bbad9eb2	X86, Win64: Allow 'mov' to restore the stack pointer if we have a FP The Win64 epilogue structure is very restrictive: it permits a very small number of opcodes and none of them are 'mov'. This means that given: mov %rbp, %rsp pop %rbp The mov isn't the epilogue, only the pop is. This is problematic unless a frame pointer is present, in which case we are free to do whatever we'd like in the "body" of the function. If a frame pointer is present, unwinding will undo the prologue operations in reverse order regardless of the fact that we are at an instruction which is resetting the stack pointer. llvm-svn: 230543	2015-02-25 21:13:37 +00:00
Peter Collingbourne	eba7f73ff9	LowerBitSets: Align referenced globals. This change aligns globals to the next highest power of 2 bytes, up to a maximum of 128. This makes it more likely that we will be able to compress bit sets with a greater alignment. In many more cases, we can now take advantage of a new optimization also introduced in this patch that removes bit set checks if the bit set is all ones. The 128 byte maximum was found to provide the best tradeoff between instruction overhead and data overhead in a recent build of Chromium. It allows us to remove ~2.4MB of instructions at the cost of ~250KB of data. Differential Revision: http://reviews.llvm.org/D7873 llvm-svn: 230540	2015-02-25 20:42:41 +00:00
Sanjoy Das	dcc84db264	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap (The change was landed in r230280 and caused the regression PR22674. This version contains a fix and a test-case for PR22674). When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. Differential Revision: http://reviews.llvm.org/D7778 llvm-svn: 230533	2015-02-25 20:02:59 +00:00
Sanjay Patel	40eaa8df99	Fix really obscure bug in CannotBeNegativeZero() (PR22688) With a diabolically crafted test case, we could recurse through this code and return true instead of false. The larger engineering crime is the use of magic numbers. Added FIXME comments for those. llvm-svn: 230515	2015-02-25 18:00:15 +00:00
Vladimir Medic	bcb7467540	[MIPS] Multiply and add instructions for Mips are currently available in mips32r2/mips64r2 and later but should also be available in mips4, mips5, and mips64. This patch fixes the requested features and updates the corresponding test files. llvm-svn: 230500	2015-02-25 15:24:37 +00:00
Bruno Cardoso Lopes	ab7afa9144	[X86][MMX] Reapply: Add MMX instructions to foldable tables Reapply r230248. Teach the peephole optimizer to work with MMX instructions by adding entries into the foldable tables. This covers folding opportunities not handled during isel. llvm-svn: 230499	2015-02-25 15:14:02 +00:00
Renato Golin	b9887ef32a	Improve handling of stack accesses in Thumb-1 Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-based LDR, STR, and ADD only allow offsets that are a multiple of 4. Make some changes to better make use of these instructions: * Use word loads for anyext byte and halfword loads from the stack. * Enforce 4-byte alignment on objects accessed in this way, to ensure that the offset is valid. * Do the same for objects whose frame index is used, in order to avoid having to use more than one ADD to generate the frame index. * Correct how many bits of offset we think AddrModeT1_s has. Patch by John Brawn. llvm-svn: 230496	2015-02-25 14:41:06 +00:00
Vladimir Medic	addb2daaac	Replace obsolete -mattr=n64 command line option with -target-abi=n64. No functional changes. llvm-svn: 230482	2015-02-25 11:43:01 +00:00
Charles Davis	33d1dc0008	[IC] Turn non-null MD on pointer loads to range MD on integer loads. Summary: This change fixes the FIXME that you recently added when you committed (a modified version of) my patch. When `InstCombine` combines a load and store of a pointer to those of an equivalently-sized integer, it currently drops any `!nonnull` metadata that might be present. This change replaces `!nonnull` metadata with `!range !{ 1, -1 }` metadata instead. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7621 llvm-svn: 230462	2015-02-25 05:10:25 +00:00
Hal Finkel	6b6e9e2b5c	[PowerPC] Add triples to QPX tests Some of these tests fail on Darwin systems because of a lack of a triple; fix that. llvm-svn: 230421	2015-02-25 01:26:59 +00:00
Duncan P. N. Exon Smith	a6b8895442	llvm-dis: Stop crashing when dropping debug info Since r199356, we've printed a warning when dropping debug info. r225562 started crashing on that, since it registered a diagnostic handler that only expected errors. This fixes the handler to expect other severities. As a side effect, it now prints "error: " at the start of error messages, similar to `llvm-as`. There was a testcase for r199356, but it only really checked the assembler. Move `test/Bitcode/drop-debug-info.ll` to `test/Assembler`, and introduce `test/Bitcode/drop-debug-info.3.5.ll` (and companion `.bc`) to test the bitcode reader. Note: tools/gold/gold-plugin.cpp has an equivalent bug, but I'm not sure what the best fix is there. I'll file a PR. llvm-svn: 230416	2015-02-25 01:10:03 +00:00
David Blaikie	b5b5efd2d1	[opaque pointer type] Bitcode support for explicit type parameter on GEP. Like r230414, add bitcode support including backwards compatibility, for an explicit type parameter to GEP. At the suggestion of Duncan I tried coalescing the two older bitcodes into a single new bitcode, though I did hit a wrinkle: I couldn't figure out how to create an explicit abbreviation for a record with a variable number of arguments (the indices to the gep). This means the discriminator between inbounds and non-inbounds gep is a full variable-length field I believe? Is my understanding correct? Is there a way to create such an abbreviation? Should I just use two bitcodes as before? Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D7736 llvm-svn: 230415	2015-02-25 01:08:52 +00:00
Hal Finkel	c93a9a2cb4	[PowerPC] Add support for the QPX vector instruction set This adds support for the QPX vector instruction set, which is used by the enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bits wide, holding 4 double-precision floating-point values. Boolean values, modeled here as <4 x i1> are actually also represented as floating-point values (essentially { -1, 1 } for { false, true }). QPX shares many features with Altivec and VSX, but is distinct from both of them. One major difference is that, instead of adding completely-separate vector registers, QPX vector registers are extensions of the scalar floating-point registers (lane 0 is the corresponding scalar floating-point value). The operations supported on QPX vectors mirror those supported on the scalar floating-point values (with some additional ones for permutations and logical/comparison operations). I've been maintaining this support out-of-tree, as part of the bgclang project, for several years. This is not the entire bgclang patch set, but is most of the subset that can be cleanly integrated into LLVM proper at this time. Adding this to the LLVM backend is part of my efforts to rebase bgclang to the current LLVM trunk, but is independently useful (especially for codes that use LLVM as a JIT in library form). The assembler/disassembler test coverage is complete. The CodeGen test coverage is not, but I've included some tests, and more will be added as follow-up work. llvm-svn: 230413	2015-02-25 01:06:45 +00:00
Rafael Espindola	8bc9ccc60a	Support SHF_MERGE sections in COMDATs. This patch unifies the comdat and non-comdat code paths. By doing this it adds missing features to the comdat side and removes the fixed section assumptions from the non-comdat side. In ELF there is no one true section for "4 byte mergeable" constants. We are better off computing the required properties of the section and asking the context for it. llvm-svn: 230411	2015-02-25 00:52:15 +00:00
Eric Christopher	0aec6ab354	Make this test even more OS and register allocation neutral. llvm-svn: 230404	2015-02-25 00:12:11 +00:00
Eric Christopher	e4c02c6450	Make this test not dependent upon the triple. All that was needed was some flexibility in the check line for the comment basic block. llvm-svn: 230400	2015-02-24 23:43:26 +00:00
Peter Collingbourne	1baeaa395a	LowerBitSets: Introduce global layout builder. The builder is based on a layout algorithm that tries to keep members of small bit sets together. The new layout compresses Chromium's bit sets to around 15% of their original size. Differential Revision: http://reviews.llvm.org/D7796 llvm-svn: 230394	2015-02-24 23:17:02 +00:00
Simon Pilgrim	d8820ae70c	Reapplied D7816 & rL230177 & rL230278 - with an additional fix to ensure that the smallest build vector input scalar type is always used. Additional (crash) test cases already committed. llvm-svn: 230388	2015-02-24 22:08:56 +00:00
Simon Pilgrim	b1468daf00	Added test case for PR22678 (check CONCAT_VECTORS DAG combiner pass doesn't introduce illegal types) llvm-svn: 230386	2015-02-24 21:46:23 +00:00
Justin Bogner	2ce48056a4	InstrProf: Test for appropriate linkage of the profiling structures This test checks that the symbols instrprof creates have appropriate linkage. The tests already exist in clang in a slightly different form from before we sunk profile generation into an LLVM pass, but that's an awkward place for them now. I'll remove/simplify the clang versions shortly. llvm-svn: 230383	2015-02-24 21:42:42 +00:00
Andrew Kaylor	1476e6d1bb	Fixing eol-style llvm-svn: 230378	2015-02-24 20:49:35 +00:00
Eric Christopher	af48495130	Revert: Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Mon Feb 23 23:04:28 2015 +0000 Fix based on post-commit comment on D7816 & rL230177 - BUILD_VECTOR operand truncation was using the BV's output scalar type instead of the input type. and Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Sun Feb 22 18:17:28 2015 +0000 [DagCombiner] Generalized BuildVector Vector Concatenation The CONCAT_VECTORS combiner pass can transform the concat of two BUILD_VECTOR nodes into a single BUILD_VECTOR node. This patch generalises this to support any number of BUILD_VECTOR nodes, and also permits UNDEF nodes to be included as well. This was noticed as AVX vec128 -> vec256 canonicalization sometimes creates a CONCAT_VECTOR with a real vec128 lower and a vec128 UNDEF upper. Differential Revision: http://reviews.llvm.org/D7816 as the root cause of PR22678 which is causing an assertion inside the DAG combiner. I'll follow up to the main thread as well. llvm-svn: 230358	2015-02-24 19:11:00 +00:00
Matthias Braun	7526035155	AArch64: Relax assert about large shift sizes. These large shift sizes happen because OpaqueConstants currently inhibit a lot of DAG combining, but that has to be addressed in another commit (like the proposal in D6946). Differential Revision: http://reviews.llvm.org/D6940 llvm-svn: 230355	2015-02-24 18:52:04 +00:00
Tom Stellard	ecc419c31d	R600/SI: Remove isel mubuf legalization We legalize mubuf instructions post-instruction selection, so this code is no longer needed. llvm-svn: 230352	2015-02-24 17:59:19 +00:00
Tim Northover	e95c5b3236	ARM: treat [N x i32] and [N x i64] as AAPCS composite types The logic is almost there already, with our special homogeneous aggregate handling. Tweaking it like this allows front-ends to emit AAPCS compliant code without ever having to count registers or add discarded padding arguments. Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to apply the logic to all integer arrays for more consistency. llvm-svn: 230348	2015-02-24 17:22:34 +00:00
Hans Wennborg	953d6fb84e	Revert r230280: "Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap" This caused PR22674, failing this assert: Instructions.h:2281: llvm::Value* llvm::PHINode::getOperand(unsigned int) const: Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed. llvm-svn: 230341	2015-02-24 16:19:29 +00:00
Michael Kuperstein	8ffb409135	[x32] x32 should use ebx as the base pointer. This fixes the original issue in PR22655, but not the secondary one. llvm-svn: 230334	2015-02-24 15:27:13 +00:00
Reed Kotler	5fb7d8b508	Beginning of alloca implementation for Mips fast-isel Summary: Begin to add various address modes, including alloca. Test Plan: Make sure there are no regressions in test-suite at O0/O2 in mips32r1/r2 Reviewers: dsanders Reviewed By: dsanders Subscribers: echristo, rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6426 llvm-svn: 230300	2015-02-24 02:36:45 +00:00
Sanjoy Das	b14010d28b	Fix bug 22641 The bug was a result of getPreStartForExtend interpreting nsw/nuw flags on an add recurrence more strongly than is legal. {S,+,X}<nsw> implies S+X is nsw only if the backedge of the loop is taken at least once. NOTE: I had accidentally committed an unrelated change with the commit message of this change in r230275 (r230275 was reverted in r230279). This is the correct change for this commit message. Differential Revision: http://reviews.llvm.org/D7808 llvm-svn: 230291	2015-02-24 01:02:42 +00:00
Manman Ren	6487ce955a	[LTO API] add lto_codegen_set_module to set the destination module. When debugging LTO issues with ld64, we use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file to speed up debugging code generation passes and ld64 stuff after code generation. llvm linking a single bitcode file via lto_codegen_add_module will generate a different bitcode file from the single input. With the newly-added lto_codegen_set_module, we can make sure the destination module is the same as the input. lto_codegen_set_module will transfer the ownership of the module to code generator. rdar://19024554 llvm-svn: 230290	2015-02-24 00:45:56 +00:00
David Majnemer	3aa0bd81a2	X86: Only use 'lea' in Win64 epilogues if a frame pointer exists We can only use 'add' in epilogues, 'lea' is not permitted unless we've established a frame pointer in the prologue. llvm-svn: 230286	2015-02-24 00:11:32 +00:00
Sanjoy Das	82ea3d45b5	New instcombine rule: max(~a,~b) -> ~min(a, b) This case is interesting because ScalarEvolutionExpander lowers min(a, b) as ~max(~a,~b). I think the profitability heuristics can be made more clever/aggressive, but this is a start. Differential Revision: http://reviews.llvm.org/D7821 llvm-svn: 230285	2015-02-24 00:08:41 +00:00
Sanjoy Das	18c243b933	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. NOTE: this change was landed with an incorrect commit message in rL230275 and was reverted for that reason in rL230279. This commit message is the correct one. Differential Revision: http://reviews.llvm.org/D7778 llvm-svn: 230280	2015-02-23 23:22:58 +00:00
Sanjoy Das	c9cf0151cf	Revert 230275. 230275 got committed with an incorrect commit message due to a mixup on my side. Will re-land in a few moments with the correct commit message. llvm-svn: 230279	2015-02-23 23:13:22 +00:00
Andrea Di Biagio	af3f397b10	[X86] Teach how to custom lower double-to-half conversions under fast-math. This patch teaches the backend how to expand a double-half conversion into a double-float conversion immediately followed by a float-half conversion. We do this only under fast-math, and if float-half conversions are legal for the target. Added test CodeGen/X86/fastmath-float-half-conversion.ll Differential Revision: http://reviews.llvm.org/D7832 llvm-svn: 230276	2015-02-23 22:59:02 +00:00
Sanjoy Das	913dfd8f7f	Fix bug 22641 The bug was a result of getPreStartForExtend interpreting nsw/nuw flags on an add recurrence more strongly than is legal. {S,+,X}<nsw> implies S+X is nsw only if the backedge of the loop is taken at least once. Differential Revision: http://reviews.llvm.org/D7808 llvm-svn: 230275	2015-02-23 22:55:13 +00:00
David Majnemer	006c490ba8	X86: Use a smaller 'mov' instruction for stack probe calls Prologue emission, in some cases, requires calls to a stack probe helper function. The amount of stack to probe is passed as a register argument in the Win64 ABI but the instruction sequence used is pessimistic: it assumes that the number of bytes to probe is greater than 4 GB. Instead, select a more appropriate opcode depending on the number of bytes we are going to probe. llvm-svn: 230270	2015-02-23 21:50:30 +00:00
David Majnemer	31d868b618	X86: Use 'mov' instead of 'lea' in Win64 SEH prologues when possible 'mov' and 'lea' are equivalent when the displacement applied with 'lea' is zero. However, 'mov' should encode smaller. llvm-svn: 230269	2015-02-23 21:50:27 +00:00
Bruno Cardoso Lopes	24492b057e	[AsmPrinter] Access pointers to globals via pcrel GOT entries Front-ends could use global unnamed_addr to hold pointers to other symbols, like @gotequivalent below: @foo = global i32 42 @gotequivalent = private unnamed_addr constant i32* @foo @delta = global i32 trunc (i64 sub (i64 ptrtoint (i32** @gotequivalent to i64), i64 ptrtoint (i32* @delta to i64)) to i32) The global @delta holds a data "PC"-relative offset to @gotequivalent, an unnamed pointer to @foo. The darwin/x86-64 assembly output for this follows: .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta Since unnamed_addr indicates that the address is not significant, only the content, we can optimize the case above by replacing pc-relative accesses to "GOT equivalent" globals, by a PC relative access to the GOT entry of the final symbol instead. Therefore, "delta" can contain a pc relative relocation to foo's GOT entry and we avoid the emission of "gotequivalent", yielding the assembly code below: .globl _foo _foo: .long 42 .globl _delta _delta: .long _foo@GOTPCREL+4 There are a couple of advantages of doing this: (1) Front-ends that need to emit a great deal of data to store pointers to external symbols could save space by not emitting such "got equivalent" globals and (2) IR constructs combined with this opt opens a way to represent GOT pcrel relocations by using the LLVM IR, which is something we previously had no way to express. Differential Revision: http://reviews.llvm.org/D6922 rdar://problem/18534217 llvm-svn: 230264	2015-02-23 21:26:18 +00:00
Justin Bogner	4d7aae932c	InstrProf: Teach llvm-cov to show the max count instead of the last When multiple regions start on the same line, llvm-cov was just showing the count of the last one as the line count. This can be confusing and misleading for things like one-liner loops, where the count at the end isn't very interesting, or even "if" statements with an opening brace at the end of the line. Instead, use the maximum of all of the region start counts. llvm-svn: 230263	2015-02-23 21:21:34 +00:00
Bruno Cardoso Lopes	1eb8376ca7	[X86][MMX] Fix test to reflect current codegen This test failed in several buildbots; it is a bit unclear how that happened, since this was the previous behavior before r230248. llvm-svn: 230258	2015-02-23 20:57:46 +00:00
Andrew Kaylor	1cc6db071b	Adding test for Windows EH frame variable remapping. llvm-svn: 230250	2015-02-23 20:04:51 +00:00
Andrew Kaylor	f22fe4ae18	Remap frame variables for native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7770 llvm-svn: 230249	2015-02-23 20:01:56 +00:00
Bruno Cardoso Lopes	32173cdf06	Revert "[X86][MMX] Add MMX instructions to foldable tables" This reverts commit r230226 since it breaks win buildbots. llvm-svn: 230248	2015-02-23 19:53:37 +00:00
Chad Rosier	543900539f	Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity. This patch adds the isProfitableToHoist API. For AArch64, we want to prevent a fmul from being hoisted in cases where it is more profitable to form a fmsub/fmadd. Phabricator Review: http://reviews.llvm.org/D7299 Patch by Lawrence Hu <lawrence@codeaurora.org> llvm-svn: 230241	2015-02-23 19:15:16 +00:00
Mehdi Amini	cd3ca6f7dd	InstSimplify: simplify 0 / X if nnan and nsz From: Fiona Glaser <fglaser@apple.com> llvm-svn: 230238	2015-02-23 18:30:25 +00:00
Daniel Sanders	afe27c7d27	[mips] Honour -mno-odd-spreg for vector insert/extract when MSA is enabled. Summary: -mno-odd-spreg prohibits the use of odd-numbered single-precision floating point registers. However, vector insert/extract was still using them when manipulating the subregisters of an MSA register. Fixed this by ensuring that insertion/extraction is only performed on even-numbered vector registers when -mno-odd-spreg is given. Reviewers: vmedic, sstankovic Reviewed By: sstankovic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7672 llvm-svn: 230235	2015-02-23 17:22:16 +00:00
Bruno Cardoso Lopes	1cacda086f	[X86] Add specific mtriple in order to appease buildbots llvm-svn: 230229	2015-02-23 15:33:40 +00:00


29092 Commits