llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	ee9eef2fd8	Teach disassembler to handle illegal immediates on (v)cmpps/pd/ss/sd instructions. Instead of rejecting we'll just generate the _alt forms that don't try to alter the mnemonic. While I'm here, merge some common code in the Instruction printers for the condition code replacement and fix the mask on SSE to be 3-bits instead of 4. llvm-svn: 224846	2014-12-26 06:36:28 +00:00
Craig Topper	2e44492b1d	Use MCPhysReg for table of register encodings. llvm-svn: 224845	2014-12-26 06:36:23 +00:00
Hal Finkel	0c505b08a5	[PowerPC] [FastISel] i1 constants must be zero extended When materializing constant i1 values, they must be zero extended. We represent i1 values as [0, 1], not [0, -1], in i32 registers. As it turns out, this code path was dead for i1 values prior to r216006 (which is why this did not manifest in miscompiles until recently). Fixes -O0 self-hosting on PPC64/Linux. llvm-svn: 224842	2014-12-25 23:08:25 +00:00
David Majnemer	25b383ac66	Silence GCC's -Wparentheses warning No functionality change intended. llvm-svn: 224833	2014-12-25 10:03:23 +00:00
Elena Demikhovsky	3d13f1c82c	Documentation for Masked Load and Store intrinsics. llvm-svn: 224832	2014-12-25 09:29:13 +00:00
Elena Demikhovsky	fb81b93e17	Masked Load/Store - Changed the order of parameters in intrinsics. No functional changes. The documentation is coming. llvm-svn: 224829	2014-12-25 07:49:20 +00:00
David Majnemer	2913eca4e2	CodeGen: Error on redefinitions instead of asserting It's possible to have a prior definition of a symbol in module asm. Raise an error instead of crashing. llvm-svn: 224828	2014-12-24 23:06:55 +00:00
David Majnemer	8e92dfee20	CodeGen: Allow aliases to be overridden by variables llvm-svn: 224827	2014-12-24 22:44:29 +00:00
Saleem Abdulrasool	747ec2dda3	MC: address some comments in deprecation checks Bob Wilson pointed out the unnecessary checks that had been committed to the instruction check predicates. The check was meant to ensure that the check was not accidentally applied to non-ARM instructions. This is better served as an assertion rather than a condition check. llvm-svn: 224825	2014-12-24 18:40:42 +00:00
David Majnemer	58cb80c940	MC: Label definitions are permitted after .set directives .set directives may be overridden by other .set directives as well as label definitions. This fixes PR22019. llvm-svn: 224811	2014-12-24 10:27:50 +00:00
Saleem Abdulrasool	4d6ed7c778	IAS: correct debug line info for asm macros Correct the line information generation for preprocessed assembly. Although we tracked the source information for the macro instantiation, we failed to account for the fact that we were instantiating a macro, which is populated into a new buffer and that the line information would be relative to the definition rather than the actual instantiation location. This could cause the line number associated with the statement to be very high due to wrapping of the difference calculated for the preprocessor line information emitted into the stream. Properly calculate the line for the macro instantiation, referencing the line where the macro is actually used as GCC/gas do. The test case uses x86, though the same problem exists on any other target using the LLVM IAS. llvm-svn: 224810	2014-12-24 06:32:43 +00:00
Craig Topper	b86338f7b2	[X86] Remove the single AdSize indicator and replace it with separate AdSize16/32/64 flags. This removes a hardcoded list of instructions in the CodeEmitter. Eventually I intend to remove the predicates on the affected instructions since in any given mode two of them are valid if we supported addr32/addr16 prefixes in the assembler. llvm-svn: 224809	2014-12-24 06:05:22 +00:00
David Majnemer	0fe246e079	MC: Don't emit .no_dead_strip on targets which don't support it llvm-svn: 224808	2014-12-24 04:11:42 +00:00
Matthias Braun	51ca510094	LiveInterval: Remove accidentally committed debug code. llvm-svn: 224807	2014-12-24 02:35:07 +00:00
Matthias Braun	dbcca0dbb4	LiveInterval: Introduce createMainRangeFromSubranges(). This function constructs the main liverange by merging all subranges if subregister liveness tracking is available. This should be slightly faster to compute instead of performing the liveness calculation again for the main range. More importantly it avoids cases where the main liverange would cover positions where no subrange was live. These cases happened for partial definitions where the actual defined part was dead and only the undefined parts used later. The register coalescing requires that every part covered by the main live range has at least one subrange live. I also expect this function to become usefull later for places where the subranges are modified in a way that it is hard to correctly fix the main liverange in the machine scheduler, we can simply reconstruct it from subranges then. llvm-svn: 224806	2014-12-24 02:11:51 +00:00
Matthias Braun	7030dda8d5	RegisterCoalescer: With subrange liveness there may be no RedefVNI for unused lanes. llvm-svn: 224805	2014-12-24 02:11:48 +00:00
Matthias Braun	36768c684f	LiveRangeEdit: Check for completely empy subranges after removing ValNos. Completely empty subranges are not allowed and must be removed when subreg liveness is enabled. llvm-svn: 224804	2014-12-24 02:11:46 +00:00
Matthias Braun	f603c88d13	LiveIntervalAnalysis: Fix performance bug that I introduced in r224663. Without a reference the code did not remember when moving the iterators of the subranges/registerunit ranges forward and instead would scan from the beginning again at the next position. llvm-svn: 224803	2014-12-24 02:11:43 +00:00
Peter Zotov	283e20219e	[OCaml] PR21901: Update tests. This finishes the fix partially applied by r224782. llvm-svn: 224802	2014-12-24 01:58:45 +00:00
Peter Zotov	af6535bf12	[OCaml] Expose Llvm_executionengine.get_{global_value,function}_address. Patch by Ramkumar Ramachandra <artagnon@gmail.com>. Also remove Llvm_executionengine.get_pointer_to_global, as it is actually deprecated and didn't appear in a stable release. llvm-svn: 224801	2014-12-24 01:52:51 +00:00
Chandler Carruth	ffb7ce56a6	[SROA] Update the documentation and names for accessing the slices within a partition of an alloca in SROA. This reflects the fact that the organization of the slices isn't really ideal for analysis, but is the naive way in which the slices are available while we're processing them in the core partitioning algorithm. It is possible we could improve matters, and I've left a FIXME with one of my ideas for how to do this, but it is a lot of work, the benefit is somewhat minor, and it isn't clear that it would be strictly better. =/ Not really satisfying, but I'm out of really good ideas. This also improves one place where the debug logging failed to mark some split partitions. Now we log in one place, slightly later, and with accurate information about whether the slice is split by the partition being rewritten. llvm-svn: 224800	2014-12-24 01:48:09 +00:00
Adrian Prantl	3026a54aa2	Debug Info: In symmetry to DW_TAG_pointer_type, do not emit the byte size of a DW_TAG_ptr_to_member_type. This restores the behavior from before r224780-r224781. llvm-svn: 224799	2014-12-24 01:17:51 +00:00
Chandler Carruth	5031bbe86a	[SROA] Refactor the integer and vector promotion testing logic to operate in terms of the new Partition class, and generally have a more clear set of arguments. No functionality changed. The most notable improvements here are consistently using the terminology of 'partition' for a collection of slices that will be rewritten together and 'slice' for a region of an alloca that is used by a particular instruction. This also makes it more clear that the split things are actually slices as well, just ones that will be split by the proposed partition. This doesn't yet address the confusing aspects of the partition's interface where slices that will be split by the partition and start prior to the partition are accesssed via Partition::splitSlices() while the core range of slices exposed by a Partition includes both unsplit slices and slices which will be split by the end, but started within the offset range of the partition. This is particularly hard to address because the algorithm which computes partitions quite literally doesn't know which slices these will end up being until too late. I'm looking at whether I can fix that or not, but I'm not optimistic. I'll update the comments and/or names to further explain this either way. I've also added one FIXME in this patch relating to this confusion so that I don't forget about it. llvm-svn: 224798	2014-12-24 01:05:14 +00:00
Colin LeMahieu	e193e1c48b	[Hexagon] Removing old classes. llvm-svn: 224795	2014-12-24 00:43:00 +00:00
Kevin Enderby	aefb00337f	Another attempt to fix the LLVM Windows build bot lld-x86_64-win7, one last place to fix I think. llvm-svn: 224794	2014-12-24 00:16:51 +00:00
Kevin Enderby	227df348cc	Attempt to fix the LLVM Windows build bot lld-x86_64-win7. llvm-svn: 224793	2014-12-23 23:43:59 +00:00
Kevin Enderby	48ef534b74	Add printing the LC_THREAD load commands with llvm-objdump’s -private-headers. llvm-svn: 224792	2014-12-23 22:56:39 +00:00
Kostya Serebryany	9fdeb37bd3	[asan] change the coverage collection scheme so that we can easily emit coverage for the entire process as a single bit set, and if coverage_bitset=1 actually emit that bitset llvm-svn: 224789	2014-12-23 22:32:17 +00:00
Hal Finkel	fc096c98f3	[PowerPC] Ensure that the TOC reload directly follows bctrl on PPC64 On non-Darwin PPC64, the TOC reload needs to come directly after the bctrl instruction (for indirect calls) because the 'bctrl/ld 2, 40(1)' instruction sequence is interpreted by the unwinding code in libgcc. To make sure these occur as a pair, as with other pairings interpreted by the linker, fuse the two instructions into one instruction (for code generation only). In the future, we might wish to do this by emitting CFI directives instead, but this solution is simpler, and mirrors what GCC does. Additional discussion on this point is contained in the PR. Fixes PR22015. llvm-svn: 224788	2014-12-23 22:29:40 +00:00
Colin LeMahieu	947cd70413	[Hexagon] Adding doubleword load. llvm-svn: 224787	2014-12-23 20:44:59 +00:00
Colin LeMahieu	026e88d317	[Hexagon] Reapplying 224775 load words. llvm-svn: 224786	2014-12-23 20:02:16 +00:00
Jozef Kolek	ab6d1cce3e	[mips][microMIPS] Implement CACHE, PREF, SSNOP, EHB and PAUSE instructions Differential Revision: http://reviews.llvm.org/D5204 llvm-svn: 224785	2014-12-23 19:55:34 +00:00
Colin LeMahieu	20be15718b	Reverting 224775 until mayLoad flag is addressed. llvm-svn: 224783	2014-12-23 19:22:59 +00:00
Rafael Espindola	c6c58d5e71	Finish removing DestroySource. Fixes pr21901. llvm-svn: 224782	2014-12-23 19:16:45 +00:00
Adrian Prantl	48af2ef40f	DIBuilder: Similar to createPointerType, make createMemberPointerType take a size and alignment. Several assertions in DwarfDebug rely on all variable types to report back a size, or to be derived from a type with a size. Tested in CFE. llvm-svn: 224780	2014-12-23 19:11:47 +00:00
Mehdi Amini	d38920891e	Always assert in DAGCombine and not only when -debug is enabled Right now in DAG Combine check the validity of the returned type only when -debug is given on the command line. However usually the test cases in the validation does not use -debug. An Assert build should always check this. llvm-svn: 224779	2014-12-23 18:59:02 +00:00
Rafael Espindola	eeb4d46c71	Pass LSAN_OPTIONS down so that it is possible to add suppressions. llvm-svn: 224777	2014-12-23 18:39:02 +00:00
Rafael Espindola	538c9a88d9	Fix a leak found by asan. llvm-svn: 224776	2014-12-23 18:18:37 +00:00
Colin LeMahieu	122aeaafea	[Hexagon] Adding word loads. llvm-svn: 224775	2014-12-23 18:06:56 +00:00
Colin LeMahieu	8e39cad934	[Hexagon] Adding signed halfword loads. llvm-svn: 224774	2014-12-23 17:25:57 +00:00
Rafael Espindola	5c014f509b	Fix a leak found by asan. llvm-svn: 224773	2014-12-23 17:20:23 +00:00
Colin LeMahieu	a9386d28a5	[Hexagon] Adding unsigned halfword load. llvm-svn: 224772	2014-12-23 16:42:57 +00:00
Jozef Kolek	12c6982b3b	[mips][microMIPS] Implement LWSP and SWSP instructions Differential Revision: http://reviews.llvm.org/D6416 llvm-svn: 224771	2014-12-23 16:16:33 +00:00
Peter Zotov	4ab66cdfcb	[OCaml] PR22014: OCaml bindings didn't link to libLLVM-*.so with -Wl,--as-needed Patch by Evangelos Foutras <evangelos@foutrelis.com>. llvm-svn: 224766	2014-12-23 13:09:59 +00:00
Michael Kuperstein	be8032c875	[ValueTracking] Move GlobalAlias handling to be after the max depth check in computeKnownBits() GlobalAlias handling used to be after GlobalValue handling, which meant it was, in practice, dead code. r220165 moved GlobalAlias handling to be before GlobalValue handling, but also moved it to be before the max depth check, causing an assert due to a recursion depth limit violation. This moves GlobalAlias handling forward to where it's safe, and changes the GlobalValue handling to only look at GlobalObjects. Differential Revision: http://reviews.llvm.org/D6758 llvm-svn: 224765	2014-12-23 11:33:41 +00:00
Elena Demikhovsky	fcea06acb5	AVX-512: Added FMA instructions, intrinsics an tests for KNL and SKX targets by Asaf Badouh http://reviews.llvm.org/D6456 llvm-svn: 224764	2014-12-23 10:30:39 +00:00
Hal Finkel	6e27c6d450	[PowerPC] Don't mark the return-address slot as immutable It is tempting to mark the fixed stack slot used to store the return address as immutable when lowering @llvm.returnaddress(i32 0). Unfortunately, within the function, it is not completely immutable: it is written during the function prologue. When using post-RA instruction scheduling, the prologue instructions are available for scheduling, and we're not free to interchange the order of a particular store in the prologue with loads from that stack location. Fixes PR21976. llvm-svn: 224761	2014-12-23 09:45:06 +00:00
Elena Demikhovsky	3121449f0b	AVX-512: BLENDM - fixed encoding of the broadcast version Added more intrinsics and encoding tests. llvm-svn: 224760	2014-12-23 09:36:28 +00:00
Michael Kuperstein	f4536ea6e8	[DagCombine] Improve DAGCombiner BUILD_VECTOR when it has two sources of elements This partially fixes PR21943. For AVX, we go from: vmovq (%rsi), %xmm0 vmovq (%rdi), %xmm1 vpermilps $-27, %xmm1, %xmm2 ## xmm2 = xmm1[1,1,2,3] vinsertps $16, %xmm2, %xmm1, %xmm1 ## xmm1 = xmm1[0],xmm2[0],xmm1[2,3] vinsertps $32, %xmm0, %xmm1, %xmm1 ## xmm1 = xmm1[0,1],xmm0[0],xmm1[3] vpermilps $-27, %xmm0, %xmm0 ## xmm0 = xmm0[1,1,2,3] vinsertps $48, %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0,1,2],xmm0[0] To the expected: vmovq (%rdi), %xmm0 vmovhpd (%rsi), %xmm0, %xmm0 retq Fixing this for AVX2 is still open. Differential Revision: http://reviews.llvm.org/D6749 llvm-svn: 224759	2014-12-23 08:59:45 +00:00
Hal Finkel	04b16b51ec	[PowerPC] Don't attempt a 64-bit pow2 division on PPC32 In r224033, in moving the signed power-of-2 division expansion into BuildSDIVPow2, I accidentally made it possible to attempt the lowering for a 64-bit division on PPC32. This later asserts. Fixes PR21928. llvm-svn: 224758	2014-12-23 08:38:50 +00:00
Michael Liao	5313da3263	[SimplifyCFG] Revise common code sinking - Fix the case where more than 1 common instructions derived from the same operand cannot be sunk. When a pair of value has more than 1 derived values in both branches, only 1 derived value could be sunk. - Replace BB1 -> (BB2, PN) map with joint value map, i.e. map of (BB1, BB2) -> PN, which is more accurate to track common ops. llvm-svn: 224757	2014-12-23 08:26:55 +00:00
Michael Kuperstein	0bf33ffde4	Remove a bad cast in CloneModule() A cast that was introduced in r209007 was accidentally left in after the changes made to GlobalAlias rules in r210062. This crashes if the aliasee is a now-leggal ConstantExpr. llvm-svn: 224756	2014-12-23 08:23:45 +00:00
Ahmed Bougacha	4553bff412	[ARM] Don't break alignment when combining base updates into load/stores. r223862/r224203 tried to also combine base-updating load/stores. There was a mistake there: the alignment was added as is as an operand to the ARMISD::VLD/VST node. However, the VLD/VST selection logic doesn't care about less-than-standard alignment attributes. For example, no matter the alignment of a v2i64 load (say 1), SelectVLD picks VLD1q64 (because of the memory type). But VLD1q64 ("vld1.64 {dXX, dYY}") is 8-aligned, per ARMARMv7a 3.2.1. For the 1-aligned load, what we really want is VLD1q8. This commit introduces bitcasts if necessary, and changes the vld/vst type to one whose standard alignment matches the original load/store alignment. Differential Revision: http://reviews.llvm.org/D6759 llvm-svn: 224754	2014-12-23 06:07:31 +00:00
Alexey Samsonov	2c55974da5	Fix UBSan bootstrap: replace shift of negative value with multiplication. llvm-svn: 224752	2014-12-23 04:15:53 +00:00
Alexey Samsonov	a274728398	Fix UBSan bootstrap: don't bind reference to nullptr. llvm-svn: 224751	2014-12-23 04:15:47 +00:00
Chandler Carruth	c7d1e24b34	Revert r224739: Debug info: Teach SROA how to update debug info for fragmented variables. This caused codegen to start crashing when we built somewhat large programs with debug info and optimizations. 'check-msan' hit in, and I suspect a bootstrap would as well. I mailed a test case to the review thread. llvm-svn: 224750	2014-12-23 02:58:14 +00:00
Jim Grosbach	1bd0f3530e	X86: Don't over-align combined loads. When combining consecutive loads+inserts into a single vector load, we should keep the alignment of the base load. Doing otherwise can, and does, lead to using overly aligned instructions. In the included test case, for example, using a 32-byte vmovaps on a 16-byte aligned value. Oops. rdar://19190968 llvm-svn: 224746	2014-12-23 00:35:23 +00:00
Reid Kleckner	ce0093344f	Make musttail more robust for vector types on x86 Previously I tried to plug musttail into the existing vararg lowering code. That turned out to be a mistake, because non-vararg calls use significantly different register lowering, even on x86. For example, AVX vectors are usually passed in registers to normal functions and memory to vararg functions. Now musttail uses a completely separate lowering. Hopefully this can be used as the basis for non-x86 perfect forwarding. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6156 llvm-svn: 224745	2014-12-22 23:58:37 +00:00
David Blaikie	ea37c1173e	Remove dynamic allocation/indirection from GCOVBlocks owned by GCOVFunction Since these are all created in the DenseMap before they are referenced, there's no problem with pointer validity by the time it's required. This removes another use of DeleteContainerSeconds/manual memory management which I'm cleaning up from time to time. llvm-svn: 224744	2014-12-22 23:12:42 +00:00
Adrian Prantl	d9e64b6c08	Thumb1 frame lowering: Mark CFI instructions with the FrameSetup flag. Followup to r224294: ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224743	2014-12-22 23:09:14 +00:00
Chandler Carruth	e2f66ceed9	[SROA] Lift the logic for traversing the alloca slices one partition at a time into a partition iterator and a Partition class. There is a lot of knock-on simplification that this enables, largely stemming from having a Partition object to refer to in lots of helpers. I've only done a minimal amount of that because enoguh stuff is changing as-is in this commit. This shouldn't change any observable behavior. I've worked hard to preserve the exact traversal semantics which were originally present even though some of them make no sense. I'll be changing some of this in subsequent commits now that the logic is carefully factored into a reusable place. The primary motivation for this change is to break the rewriting into phases in order to support more intelligent rewriting. For example, I'm planning to change how split loads and stores are rewritten to remove the significant overuse of integer bit packing in the resulting code and allow more effective secondary splitting of aggregates. For any of this to work, they have to share the exact traversal logic. llvm-svn: 224742	2014-12-22 22:46:00 +00:00
Bruno Cardoso Lopes	bad65c3b70	[LCSSA] Handle PHI insertion in disjoint loops Take two disjoint Loops L1 and L2. LoopSimplify fails to simplify some loops (e.g. when indirect branches are involved). In such situations, it can happen that an exit for L1 is the header of L2. Thus, when we create PHIs in one of such exits we are also inserting PHIs in L2 header. This could break LCSSA form for L2 because these inserted PHIs can also have uses in L2 exits, which are never handled in the current implementation. Provide a fix for this corner case and test that we don't assert/crash on that. Differential Revision: http://reviews.llvm.org/D6624 rdar://problem/19166231 llvm-svn: 224740	2014-12-22 22:35:46 +00:00
Adrian Prantl	a47ace5901	Debug info: Teach SROA how to update debug info for fragmented variables. This allows us to generate debug info for extremely advanced code such as typedef struct { long int a; int b;} S; int foo(S s) { return s.b; } which at -O1 on x86_64 is codegen'd into define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 { ret i32 %s.coerce1, !dbg !24 } with this patch we emit the following debug info for this TAG_formal_parameter [3] AT_location( 0x00000000 0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004 0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 ) AT_name( "s" ) AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" ) Thanks to chandlerc, dblaikie, and echristo for their feedback on all previous iterations of this patch! llvm-svn: 224739	2014-12-22 22:26:00 +00:00
Reid Kleckner	5aabc06c16	Fix Windows unwind info for functions in sections other than .text Previously we assumed the section name had the form .text$foo, which is what we used to do for inline functions. If the dollar wasn't present, we'd put unwind data in the .pdata and .xdata sections for the main .text section, which is incorrect. Fixes PR22001. llvm-svn: 224738	2014-12-22 22:10:08 +00:00
Colin LeMahieu	4b1eac4dda	[Hexagon] Adding memb instruction. Fixing whitespace in test from 224730. llvm-svn: 224735	2014-12-22 21:40:43 +00:00
David Blaikie	9a6f2836db	Use iterators rather than indices to make this forwards-compatible with a change to the underlying container (to std::list) llvm-svn: 224734	2014-12-22 21:26:38 +00:00
David Blaikie	ba4e00f04a	unique_ptrify MatchableInfo(const CodeGenInstAlias *Alias)'s parameter llvm-svn: 224733	2014-12-22 21:26:26 +00:00
Colin LeMahieu	af1e5de141	[Hexagon] Adding classes and load unsigned byte instruction, updating usages. llvm-svn: 224730	2014-12-22 21:20:03 +00:00
Bruno Cardoso Lopes	811c173523	[x86] Add vector @llvm.ctpop intrinsic custom lowering Currently, when ctpop is supported for scalar types, the expansion of @llvm.ctpop.vXiY uses vector element extractions, insertions and individual calls to @llvm.ctpop.iY. When not, expansion with bit-math operations is used for the scalar calls. Local haswell measurements show that we can improve vector @llvm.ctpop.vXiY expansion in some cases by using a using a vector parallel bit twiddling approach, based on: v = v - ((v >> 1) & 0x55555555); v = (v & 0x33333333) + ((v >> 2) & 0x33333333); v = ((v + (v >> 4) & 0xF0F0F0F) v = v + (v >> 8) v = v + (v >> 16) v = v & 0x0000003F (from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel) When scalar ctpop isn't supported, the approach above performs better for v2i64, v4i32, v4i64 and v8i32 (see numbers below). And even when scalar ctpop is supported, this approach performs ~2x better for v8i32. Here, x86_64 implies -march=corei7-avx without ctpop and x86_64h includes ctpop support with -march=core-avx2. == [x86_64h - new] v8i32: 0.661685 v4i32: 0.514678 v4i64: 0.652009 v2i64: 0.324289 == [x86_64h - old] v8i32: 1.29578 v4i32: 0.528807 v4i64: 0.65981 v2i64: 0.330707 == [x86_64 - new] v8i32: 1.003 v4i32: 0.656273 v4i64: 1.11711 v2i64: 0.754064 == [x86_64 - old] v8i32: 2.34886 v4i32: 1.72053 v4i64: 1.41086 v2i64: 1.0244 More work for other vector types will come next. llvm-svn: 224725	2014-12-22 19:45:43 +00:00
Juergen Ributzka	a3ca1b8823	Remove unused header. NFC. llvm-svn: 224722	2014-12-22 19:09:15 +00:00
Adrian Prantl	5c2824ada5	Add a C++ marker to this header file. llvm-svn: 224721	2014-12-22 19:07:45 +00:00
Peter Zotov	c433cd7bfa	[C API] Expose LLVMGetGlobalValueAddress and LLVMGetFunctionAddress. Patch by Ramkumar Ramachandra <artagnon@gmail.com> llvm-svn: 224720	2014-12-22 18:53:11 +00:00
Quentin Colombet	84f89ccd45	[CodeGenPrepare] Handle properly the promotion of operands when this does not generate instructions. Fixes PR21978. Related to <rdar://problem/18310086> llvm-svn: 224717	2014-12-22 18:11:52 +00:00
Elena Demikhovsky	949b0d46bf	AVX-512: Added all forms of BLENDM instructions, intrinsics, encoding tests for AVX-512F and skx instructions. llvm-svn: 224707	2014-12-22 13:52:48 +00:00
Karthik Bhat	bf662901c1	Lower multiply-negate operation to mneg on AArch64 This patch pattern matches code such as- neg w8, w8 mul w8, w9, w8 to mneg w8, w8, w9 Review: http://reviews.llvm.org/D6754 llvm-svn: 224706	2014-12-22 13:38:58 +00:00
Rafael Espindola	2645d33977	Convert a few tests to FileCheck. NFC. llvm-svn: 224705	2014-12-22 13:29:46 +00:00
Rafael Espindola	366e5c1bf1	The leak detector is dead, long live asan and valgrind. In resent times asan and valgrind have found way more memory management bugs in llvm than the special purpose leak detector. llvm-svn: 224703	2014-12-22 13:00:36 +00:00
Saleem Abdulrasool	90c224a143	CodeGen: minor style tweaks to SSP Clean up some style related things in the StackProtector CodeGen. NFC. llvm-svn: 224693	2014-12-21 21:52:38 +00:00
Craig Topper	23fd69560b	[X86] Add hasSideEffects = 0 to CALLpcrel16. This matches what is inferred from patterns for the 32-bit version. llvm-svn: 224692	2014-12-21 20:05:06 +00:00
Matt Arsenault	22b4c256e1	Enable (sext x) == C --> x == (trunc C) combine Extend the existing code which handles this for zext. This makes this more useful for targets with ZeroOrNegativeOne BooleanContent and obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne) since the constant will now be shrunk to i1. llvm-svn: 224691	2014-12-21 16:48:42 +00:00
Craig Topper	01dcd8a31d	[X86] Swap operand order in Intel syntax on a bunch of aliases. llvm-svn: 224687	2014-12-20 23:05:59 +00:00
Craig Topper	643a11268f	[X86] Swap operand order of imul aliases in Intel syntax. Also disable printing of the alias instead of the real instruction. llvm-svn: 224686	2014-12-20 23:05:57 +00:00
Craig Topper	4b8a47050f	[X86] Remove '*' from asm strings in far call/jump aliases for Intel syntax. llvm-svn: 224685	2014-12-20 23:05:55 +00:00
Craig Topper	3564080896	[X86] Don't swap the order of segment and offset in immediate form of far call/jump in Intel syntax. llvm-svn: 224684	2014-12-20 23:05:52 +00:00
Saleem Abdulrasool	57b5fe57e3	CodeGen: constify and use range loop for SSP Use range-based for loop and constify the iterators. NFC. llvm-svn: 224683	2014-12-20 21:37:51 +00:00
Saleem Abdulrasool	0fa832002c	ARM: further improve deprecated diagnosis (LDM) The ARM ARM states: LDM/LDMIA/LDMFD: The SP can be in the list. However, ARM deprecates using these instructions with SP in the list. ARM deprecates using these instructions with both the LR and the PC in the list. LDMDA/LDMFA/LDMDB/LDMEA/LDMIB/LDMED: The SP can be in the list. However, instructions that include the SP in the list are deprecated. Instructions that include both the LR and the PC in the list are deprecated. POP: The SP can only be in the list before ARMv7. ARM deprecates any use of ARM instructions that include the SP, and the value of the SP after such an instruction is UNKNOWN. ARM deprecates the use of this instruction with both the LR and the PC in the list. Attempt to diagnose use of deprecated forms of these instructions. This mirrors the previous changes to diagnose use of the deprecated forms of STM in ARM mode. llvm-svn: 224682	2014-12-20 20:25:36 +00:00
David Majnemer	d4449ed0f2	strnlen isn't available on some platforms, use StringRef instead llvm-svn: 224679	2014-12-20 08:24:43 +00:00
Craig Topper	35545fa20a	[X86] Immediate forms of far call/jump are not valid in x86-64. llvm-svn: 224678	2014-12-20 07:43:27 +00:00
David Majnemer	6eed0e0d20	This should have been part of r224676. llvm-svn: 224677	2014-12-20 04:48:34 +00:00
David Majnemer	b0362e4ee6	InstCombine: Squash an icmp+select into bitwise arithmetic (X & INT_MIN) == 0 ? X ^ INT_MIN : X into X \| INT_MIN (X & INT_MIN) != 0 ? X ^ INT_MIN : X into X & INT_MAX This fixes PR21993. llvm-svn: 224676	2014-12-20 04:45:35 +00:00
David Majnemer	147f8586be	InstSimplify: Don't bother if getScalarSizeInBits returns zero getScalarSizeInBits returns zero when the comparison operands are not integral. No functionality change intended. llvm-svn: 224675	2014-12-20 04:45:33 +00:00
David Majnemer	7bd7144e44	Simplify the code No functionality change intended. llvm-svn: 224673	2014-12-20 03:29:59 +00:00
Eric Fiselier	3ea49da2e9	Split executeShTest into two parts so that it can be better leveraged by libc++ llvm-svn: 224672	2014-12-20 03:23:53 +00:00
David Majnemer	0b6a0b0257	InstSimplify: Optimize away pointless comparisons (X & INT_MIN) ? X & INT_MAX : X into X & INT_MAX (X & INT_MIN) ? X : X & INT_MAX into X (X & INT_MIN) ? X \| INT_MIN : X into X (X & INT_MIN) ? X : X \| INT_MIN into X \| INT_MIN llvm-svn: 224669	2014-12-20 03:04:38 +00:00
Chandler Carruth	113dc64c67	[SROA] Run clang-format over the entire SROA pass as I wrote it before much of the glory of clang-format, and now any time I touch it I risk introducing formatting changes as part of a functional commit. Also, clang-format is way better at formatting my code than I am. Most of this is a huge improvement although I reverted a couple of places where I hit a clang-format bug with lambdas that has been filed but not (fully) fixed. llvm-svn: 224666	2014-12-20 02:39:18 +00:00
Chandler Carruth	b07d836577	[x86] Change the test added in r223774 to first check the spelling of the error message for a bogus processor, and then look specifically for that error message using FileCheck. I actually tried to write the test this way at first, but drew a blank on how to ensure the error message stayed in sync (oops). Now that I've recalled how to do that, this is clearly better. It also fixes an issue with a malloc implementation that actually prints to stderr in all cases, which was causing problems for some builders it seems. llvm-svn: 224665	2014-12-20 02:19:22 +00:00
Matthias Braun	714c494ca1	LiveIntervalAnalysis: No kill flags for partially undefined uses. We must not add kill flags when reading a vreg with some undefined subregisters, if subreg liveness tracking is enabled. This is because the register allocator may reuse these undefined subregisters for other values which are not killed. llvm-svn: 224664	2014-12-20 01:54:50 +00:00
Matthias Braun	7f8dece1d7	LiveIntervalAnalysis: cleanup addKills(), NFC - Use more const modifiers - Use references for things that can't be nullptr - Improve some variable names llvm-svn: 224663	2014-12-20 01:54:48 +00:00
Matthias Braun	ad344aff50	Unbreak cmake build with shared libraries enabled. llvm-svn: 224661	2014-12-20 01:51:02 +00:00
Eric Christopher	3ab98895bc	Remove unused variable and initialization. llvm-svn: 224655	2014-12-20 00:07:09 +00:00
Eric Christopher	8985ba912f	Remove unused variable, initializer, and accessor. llvm-svn: 224650	2014-12-19 23:46:53 +00:00
Matt Arsenault	013ddaf18c	R600: Remove outdated comment llvm-svn: 224648	2014-12-19 23:29:13 +00:00
Elena Demikhovsky	fb73ca516b	Masked load and store codegen - fixed 128-bit vectors The codegen failed on 128-bit types on AVX2. I added patterns and in td files and tests. llvm-svn: 224647	2014-12-19 23:27:57 +00:00
Matt Arsenault	dc10307524	R600/SI: Only form min/max with 1 use. If the condition is used for something else, this increases the number of instructions. llvm-svn: 224646	2014-12-19 23:15:30 +00:00
Reid Kleckner	f2acbbaf22	EH: Sink computation of local PadMap variable into function that uses it No functionality change. llvm-svn: 224635	2014-12-19 22:30:08 +00:00
Eric Fiselier	20ca10bd68	[LIT] Add JSONMetricValue type to wrap types supported by the json encoder. Summary: The following types can be encoded and decoded by the json library: `dict`, `list`, `tuple`, `str`, `unicode`, `int`, `long`, `float`, `bool`, `NoneType`. `JSONMetricValue` can be constructed with any of these types, and used as part of Test.Result. This patch also adds a toMetricValue function that converts a value into a MetricValue. Reviewers: ddunbar, EricWF Reviewed By: EricWF Subscribers: cfe-commits, llvm-commits Differential Revision: http://reviews.llvm.org/D6576 llvm-svn: 224628	2014-12-19 22:29:12 +00:00
Kevin Enderby	52e4ce4a53	Add printing the LC_ROUTINES load commands with llvm-objdump’s -private-headers. llvm-svn: 224627	2014-12-19 22:25:22 +00:00
Reid Kleckner	93acac6cfc	Add the ExceptionHandling::MSVC enumeration It is intended to be used for a family of personality functions that have similar IR preparation requirements. Typically when interoperating with MSVC personality functions, bits of functionality need to be outlined from the main function into helper functions. There is also usually more than one landing pad per invoke, which does not match the LLVM IR landingpad representation. None of this is implemented yet. This change just adds a new enum that is active for *-windows-msvc and delegates to the EH removal preparation pass. No functionality change for other targets. llvm-svn: 224625	2014-12-19 22:19:48 +00:00
Sanjay Patel	1da5f1645b	Model sqrtss as a binary operation with one source operand tied to the destination (PR14221) This is a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ). That patch started to fix PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ), but it was not completed. Differential Revision: http://reviews.llvm.org/D6330 llvm-svn: 224624	2014-12-19 22:16:28 +00:00
Tom Stellard	5352f35a89	R600/SI: isLegalOperand() shouldn't check constant bus for SALU instructions The constant bus restrictions only apply to VALU instructions. This enables SIFoldOperands to fold immediates into SALU instructions. llvm-svn: 224623	2014-12-19 22:15:37 +00:00
Tom Stellard	c3d7eeb6e5	R600/SI: Make sure non-inline constants aren't folded into mubuf soffset operand mubuf instructions now define the soffset field using the SCSrc_32 register class which indicates that only SGPRs and inline constants are allowed. llvm-svn: 224622	2014-12-19 22:15:30 +00:00
Yaron Keren	e8270225a2	Remove isSubroutineType test for isCompositeType, getTag() is enough. llvm-svn: 224621	2014-12-19 22:15:09 +00:00
David Blaikie	86a7f71549	Update SmallPtrSet::insert's doc comment to match the new return type llvm-svn: 224619	2014-12-19 21:45:11 +00:00
Kevin Enderby	186eac3c0c	Add printing the LC_SUB_CLIENT load command with llvm-objdump’s -private-headers. llvm-svn: 224616	2014-12-19 21:06:24 +00:00
Peter Collingbourne	0e5c068592	CodeGen: do not attempt to invalidate virtual registers for zero-sized phis. llvm-svn: 224615	2014-12-19 20:50:07 +00:00
Colin LeMahieu	0f850bde0e	[Hexagon] Removing old variants of instructions and updating references. llvm-svn: 224612	2014-12-19 20:29:29 +00:00
Sanjay Patel	0428a5786e	merge consecutive stores of extracted vector elements Add a path to DAGCombiner::MergeConsecutiveStores() to combine multiple scalar stores when the store operands are extracted vector elements. This is a partial fix for PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ). For the new test case, codegen improves from: vmovss %xmm0, (%rdi) vextractps $1, %xmm0, 4(%rdi) vextractps $2, %xmm0, 8(%rdi) vextractps $3, %xmm0, 12(%rdi) vextractf128 $1, %ymm0, %xmm0 vmovss %xmm0, 16(%rdi) vextractps $1, %xmm0, 20(%rdi) vextractps $2, %xmm0, 24(%rdi) vextractps $3, %xmm0, 28(%rdi) vzeroupper retq To: vmovups %ymm0, (%rdi) vzeroupper retq Patch reviewed by Nadav Rotem. Differential Revision: http://reviews.llvm.org/D6698 llvm-svn: 224611	2014-12-19 20:23:41 +00:00
Colin LeMahieu	38ce8cd2e2	[Hexagon] Adding bit extraction and table indexing instructions. llvm-svn: 224610	2014-12-19 20:01:08 +00:00
Colin LeMahieu	3c7f664d5a	[Hexagon] Adding bit insertion instructions. llvm-svn: 224609	2014-12-19 19:54:38 +00:00
Colin LeMahieu	d63ef93b4b	[Hexagon] Adding more xtype shift instructions. llvm-svn: 224608	2014-12-19 19:51:35 +00:00
Kevin Enderby	36c8d3ae63	Add printing the LC_SUB_LIBRARY load command with llvm-objdump’s -private-headers. llvm-svn: 224607	2014-12-19 19:48:16 +00:00
Colin LeMahieu	cc09d1ccc5	[Hexagon] Adding xtype shift instructions. llvm-svn: 224604	2014-12-19 19:34:50 +00:00
Colin LeMahieu	f3db884efb	[Hexagon] Adding transfers to and from control registers. llvm-svn: 224599	2014-12-19 19:06:32 +00:00
Colin LeMahieu	402f772b82	[Hexagon] Adding doubleregs for control registers. Renaming control register class. llvm-svn: 224598	2014-12-19 18:56:10 +00:00
Frederic Riss	b88adbdeb0	[DebugInfo] Move all DWARF headers to the public include directory. dsymutil needs access to DWARF specific inforamtion, the small DIContext wrapper isn't sufficient. Other DWARF consumers might want to use it too (I'm looking at you lldb). Differential Revision: http://reviews.llvm.org/D6694 llvm-svn: 224594	2014-12-19 18:26:33 +00:00
Tilmann Scheller	be98f3c4ec	[BBVectorize] Remove two more redundant assignments. Found by the Clang static analyzer. llvm-svn: 224590	2014-12-19 17:21:38 +00:00
Tilmann Scheller	945ce0ae00	[BBVectorize] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 224589	2014-12-19 17:13:12 +00:00
Bruno Cardoso Lopes	f6cf8ad4e5	Reapply: [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. Also, fix code to also return the modified switch when only the truncation is performed. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 llvm-svn: 224588	2014-12-19 17:12:35 +00:00
Tilmann Scheller	b811030b47	[LoopVectorize] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 224587	2014-12-19 17:02:31 +00:00
Tilmann Scheller	e24bb41bad	[ARM] Remove dead assignment. Found by the Clang static analyzer. llvm-svn: 224586	2014-12-19 16:57:33 +00:00
Sanjay Patel	ea3c802887	use -0.0 when creating an fneg instruction Backends recognize (-0.0 - X) as the canonical form for fneg and produce better code. Eg, ppc64 with 0.0: lis r2, ha16(LCPI0_0) lfs f0, lo16(LCPI0_0)(r2) fsubs f1, f0, f1 blr vs. -0.0: fneg f1, f1 blr Differential Revision: http://reviews.llvm.org/D6723 llvm-svn: 224583	2014-12-19 16:44:08 +00:00
Bruno Cardoso Lopes	3be15b2fa6	Revert "[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr" Reverts commit r224574 to appease buildbots: The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. llvm-svn: 224576	2014-12-19 14:36:24 +00:00
Bruno Cardoso Lopes	c9005f2f2b	[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 llvm-svn: 224574	2014-12-19 14:23:15 +00:00
Tilmann Scheller	3f7f3341b9	Remove redundant assignment. Found with the Clang static analyzer. llvm-svn: 224570	2014-12-19 11:29:34 +00:00
Duncan P. N. Exon Smith	d34b613b92	LTO: Export local context symbols Export symbols in libLTO.dylib for the local context-related functions added in r221733 (`LTO_API_VERSION=11`)... and add the missing definition for `lto_codegen_create_in_local_context()`. llvm-svn: 224567	2014-12-19 07:19:50 +00:00
Duncan P. N. Exon Smith	46d7af5729	Rename MapValue(Metadata*) to MapMetadata() Instead of reusing the name `MapValue()` when mapping `Metadata`, use `MapMetadata()`. The old name doesn't make much sense after the `Metadata`/`Value` split. llvm-svn: 224566	2014-12-19 06:06:18 +00:00
Juergen Ributzka	4d7f70d47e	[Object] Don't crash on empty export lists. Summary: This fixes the exports iterator if the export list is empty. Reviewers: Bigcheese, kledzik Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6732 llvm-svn: 224563	2014-12-19 02:31:01 +00:00
Matthias Braun	aeb50b3805	RegisterCoalescer: rewrite eliminateUndefCopy(). This also fixes problems with undef copies of subregisters. I can't attach a testcase for that as none of the targets in trunk has subregister liveness tracking enabled. llvm-svn: 224560	2014-12-19 01:39:46 +00:00
Colin LeMahieu	5ccbb1298b	[Hexagon] Adding loop0/1 sp0/1/2loop0 instructions. llvm-svn: 224556	2014-12-19 00:06:53 +00:00
Adrian Prantl	cf44e7870b	Explain why LLVM is emitting a DW_AT_containing_type inside of a class. llvm-svn: 224555	2014-12-19 00:01:20 +00:00
Peter Zotov	f94eed6083	[cmake] Unbreak LLVM-Config.cmake / llvm_expand_dependencies. The algorithm for sorting libraries in topological order, as previously implemented, had a few issues: * It didn't make any sense. * It didn't actually sort libraries in topological order. * It hung on some inputs, e.g. "LLVMipo". This commit replaces the old algorithm with a straightforward port from llvm-config.cpp. llvm-svn: 224554	2014-12-18 23:56:52 +00:00
David Majnemer	824e011ad7	ConstantFold: Shifting undef by zero results in undef llvm-svn: 224553	2014-12-18 23:54:43 +00:00
Colin LeMahieu	174476ed96	Reverting 224550, was not ready for commit. llvm-svn: 224552	2014-12-18 23:36:15 +00:00
Kevin Enderby	07a40f3827	Remove an extra ';' on line 1120 include/llvm/Support/MachO.h . Caught by Mike Edwards! llvm-svn: 224551	2014-12-18 23:34:16 +00:00
Colin LeMahieu	9000481cda	[Hexagon] Adding loop0/1 sp0/1/2loop0 instructions. llvm-svn: 224550	2014-12-18 23:27:51 +00:00
Kevin Enderby	a2bd8d98a1	Add printing the LC_SUB_UMBRELLA load command with llvm-objdump’s -private-headers. llvm-svn: 224548	2014-12-18 23:13:26 +00:00
Roman Divacky	a93d002321	Instead of explicitely comparing both lowercase and uppercase variants. .lower() the Name and compare only the lowecase. Removing 81 compares/lines of code. This changes the accepted string to be mixed lower/upper case but it should be ok. Discussed with Jim Grosbach. llvm-svn: 224547	2014-12-18 23:12:34 +00:00
Sanjay Patel	c242dbb3b6	fix formatting; NFC llvm-svn: 224542	2014-12-18 21:11:09 +00:00
Chris Bieneman	9ef9c1144a	Have llvm-c-test only use libLLVM if libLLVM has all the right components. Summary: We should only have llvm-c-test use libLLVM if the library is built with the default set of components or if LLVM_DYLIB_COMPONENTS includes all the LLVM_LINK_COMPONENTS required for llvm-c-test. Making libLLVM always used causes build failures if libLLVM doesn't include all Reviewers: chapuni, ributzka Reviewed By: ributzka Subscribers: ributzka, llvm-commits Differential Revision: http://reviews.llvm.org/D6668 llvm-svn: 224541	2014-12-18 21:03:49 +00:00
Colin LeMahieu	17c59e87c9	[NFC] Removing extra semicolon. llvm-svn: 224539	2014-12-18 20:15:32 +00:00
Matthias Braun	15abf3743c	LiveIntervalAnalysis: Cleanup computeDeadValues - This also fixes a bug introduced in r223880 where values were not correctly marked as Dead anymore. - Cleanup computeDeadValues(): split up SubRange code variant, simplify arguments. llvm-svn: 224538	2014-12-18 19:58:52 +00:00
Ulrich Weigand	52cba39cb6	Add myself as SystemZ code owner As agreed with Richard Sandiford, I'm taking over code ownership for the SystemZ back end from him. llvm-svn: 224535	2014-12-18 19:27:50 +00:00
Kevin Enderby	b4b7931748	Add printing the LC_SUB_FRAMEWORK load command with llvm-objdump’s -private-headers. llvm-svn: 224534	2014-12-18 19:24:35 +00:00
Juergen Ributzka	84ba342ba9	Add missing implementation of 'sys::path::is_other' to the support library. The header claims that this function exists, but the linker wasn't too happy about it not being in the library. llvm-svn: 224527	2014-12-18 18:19:47 +00:00
Jozef Kolek	2f27d571c8	[mips][microMIPS] Fix bugs related to atomic SC/LL instructions Fix bugs related to atomic microMIPS SC/LL instructions: While expanding atomic operations the mips32r2 encoding was emitted instead of microMIPS. Differential Revision: http://reviews.llvm.org/D6659 llvm-svn: 224524	2014-12-18 16:39:29 +00:00
Saleem Abdulrasool	0b5a8520ac	ARM: fix an off-by-one in the register list access Fix an off-by-one access introduced in 224502 for push.w and pop.w with single register operands. Add test cases for both scenarios. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224521	2014-12-18 16:16:53 +00:00
Toma Tabacu	83cd33c4f6	[mips] Clean up the CodeGen/Mips/inlineasmmemop.ll test. NFC. Summary: Improve comments and remove a redundant attribute list. There are no functional changes (to the CHECK's or to the code). Part of these changes were suggested in http://reviews.llvm.org/D6637. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6705 llvm-svn: 224517	2014-12-18 13:03:51 +00:00
Robert Khasanov	79fb7292d7	[AVX512] Enable FP arithmetic lowering for AVX512VL subsets. Added RegOp2MemOpTable4 to transform 4th operand from register to memory in merge-masked versions of instructions. Added lowering tests. llvm-svn: 224516	2014-12-18 12:28:22 +00:00
Viktor Kutuzov	b4ffb5d5e9	[Msan] Generalize instrumentation code to support FreeBSD mapping Differential Revision: http://reviews.llvm.org/D6666 llvm-svn: 224514	2014-12-18 12:12:59 +00:00
Yaron Keren	06d6930b68	Fix Visual C++ error "'llvm::make_unique' : ambiguous call to overloaded function". llvm-svn: 224506	2014-12-18 10:03:35 +00:00
Saleem Abdulrasool	3a23917d48	ARM: improve instruction validation for thumb mode The ARM Architecture Reference Manual states the following: LDM{,IA,DB}: The SP cannot be in the list. The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. POP: The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. PUSH: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. STM:{,IA,DB}: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. llvm-svn: 224502	2014-12-18 05:24:38 +00:00
Saleem Abdulrasool	69973e0fa0	test: avoid unnecessary temporary files Use pipes and redirect the error output to FileCheck directly. NFC. llvm-svn: 224501	2014-12-18 05:24:32 +00:00
Chandler Carruth	68ea415d04	[SROA] Cleanup - remove the use of std::mem_fun_ref nonsense and use a lambda now that we have them. llvm-svn: 224500	2014-12-18 05:19:47 +00:00
Rafael Espindola	7d727b5f11	Modernize the getStreamedBitcodeModule interface a bit. NFC. llvm-svn: 224499	2014-12-18 05:08:43 +00:00
Craig Topper	f7df7221d1	[PowerPC] Use MCPhysReg for tables of registers. Const-correct the tables. Only put the anonymous namespace around classes. NFC. llvm-svn: 224498	2014-12-18 05:02:14 +00:00
Craig Topper	5645f6b45b	[X86] Use correct opsize on indirect call and jump aliases. llvm-svn: 224497	2014-12-18 05:02:12 +00:00
Craig Topper	9480732be2	[X86] Don't use PS prefix on LDMXCSR/STMXCSR. Near as I can tell prefixes are ignored on these instructions except for a comment in the Intel docs about 0xf3. Binutils disassembler seems to ignore prefixes on these instructions. Our disassembler still doesn't distinguish PS and "no prefix" well enough for this to make a functional change, but it helps with experiments I'm doing on a potential new disassembler table builder. llvm-svn: 224496	2014-12-18 05:02:10 +00:00
Craig Topper	2e2aee0cd6	[X86] Remove unnecessary 'In64BitMode' predicate for instructions that already indicate use of REX.W. llvm-svn: 224495	2014-12-18 05:02:08 +00:00
Justin Hibbits	88030b94c6	Add a corresponding '@LOCAL' parse to match r224415. Pointed out by Jim Grosbach. llvm-svn: 224494	2014-12-18 03:06:37 +00:00
Eric Christopher	661f2d1ca1	Add a new string member to the TargetOptions struct for the name of the abi we should be using. For targets that don't use the option there's no change, otherwise this allows external users to set the ABI via string and avoid some of the -backend-option pain in clang. Use this option to move the ABI for the ARM port from the Subtarget to the TargetMachine and update the testcases accordingly since it's no longer valid to set via -mattr. llvm-svn: 224492	2014-12-18 02:20:58 +00:00
Eric Christopher	1971c3508a	Model ARM backend ABI selection after the front end code doing the same. This will change the "bare metal" ABI from APCS to AAPCS. The only difference between the front and back end code is that the code for Triple::GNU was added for environment. That will migrate to the front end shortly. Tests updated with the ABI they were originally testing in the case of bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux triples. llvm-svn: 224489	2014-12-18 02:08:45 +00:00
Duncan P. N. Exon Smith	fda0cee7c6	Reapply "Linker: Drop superseded subprograms" This reverts commit r224416, reapplying r224389. The buildbots hadn't recovered after my revert, waiting until David reverted a couple of his commits. It looks like it was just bad timing (where we were both modifying code related to the same assertion). Trying again... Here's the original text: When a function gets replaced by `ModuleLinker`, drop superseded subprograms. This ensures that the "first" subprogram pointing at a function is the same one that `!dbg` references point at. This is a stop-gap fix for PR21910. Notably, this fixes Release+Asserts bootstraps that are currently asserting out in `LexicalScopes::initialize()` due to the explicit instantiations in `lib/IR/Dominators.cpp` eventually getting replaced by -argpromotion. llvm-svn: 224487	2014-12-18 01:05:33 +00:00
Duncan P. N. Exon Smith	9fb3665426	IR: Make DICompositeType mutators private Make `DICompositeType` mutators private to prevent misuse. All calls to `setArrays()` and `setContainingType()` should go through `DIBuilder::replaceArrays()` and `DIBuilder::replaceVTableHolder()`. This is a follow-up to r224482 (now that clang has been updated in r224483). llvm-svn: 224486	2014-12-18 00:54:39 +00:00
Kevin Enderby	d0b6b7fb7f	Add printing the LC_LINKER_OPTION load command with llvm-objdump’s -private-headers. Also corrected the name of the load command to not end in an ’S’ as well as corrected the name of the MachO::linker_option_command struct and other places that had the word option as plural which did not match the Mac OS X headers. llvm-svn: 224485	2014-12-18 00:53:40 +00:00
Duncan P. N. Exon Smith	97f07c2778	IR: Handle self-referencing DICompositeTypes in DIBuilder Add API to DIBuilder to handle self-referencing `DICompositeType`s. Self-references aren't expected in the debug info graph, and we take advantage of that by only calling `resolveCycles()` on nodes that were once forward declarations (otherwise, DIBuilder needs an expensive tracking reference to every unresolved node it creates, which in cyclic graphs is all of them). However, clang seems to create self-referencing `DICompositeType`s. Add API to manage this safely. The paired commit to clang will include the regression test. I'll make the `DICompositeType` API `private` in a follow-up to prevent misuse (I've separated that to prevent build failures from missing the clang commit). llvm-svn: 224482	2014-12-18 00:46:16 +00:00
Alexey Samsonov	d927bd8d15	[dsymutil] Fix missing member initializer. This bug was found by the MSan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/5330/steps/check-llvm%20msan/logs/stdio llvm-svn: 224481	2014-12-18 00:45:32 +00:00
Duncan P. N. Exon Smith	140d41b791	LTO: Lazy-load LTOModule in local contexts Start lazy-loading `LTOModule`s that own their contexts. These can only really be used for parsing symbols, so its unnecessary to ever materialize their functions. I looked into using `IRObjectFile::create()` and optionally calling `materializAllPermanently()` afterwards, but this turned out to be awkward. - The default target triple and data layout logic needs to happen before the call to `IRObjectFile::IRObjectFile()`, but after `Module` was created. - I tried passing a lambda in to do the module initialization, but this seemed to require threading the error message from `TargetRegistry::lookupTarget()` through `std::error_code`. - I also looked at setting `errMsg` directly from within the lambda, but this didn't look any better. (I guess there's a reason we weren't already using that function.) llvm-svn: 224466	2014-12-17 22:05:42 +00:00
Kostya Serebryany	fea4fb404e	[sanitizer] allow -fsanitize-coverage=N w/ -fsanitize=leak, llvm part llvm-svn: 224463	2014-12-17 21:50:04 +00:00
Matthias Braun	0a410f6243	RegisterCoalescer: Fix stripCopies() picking up main range instead of subregister range This fixes a problem where stripCopies() would switch to values in the main liverange when it crossed a copy instruction. However when joining subranges we need to stay in the respective subregister ranges. llvm-svn: 224461	2014-12-17 21:25:20 +00:00
Matt Arsenault	303011a005	R600/SI: Fix f64 inline immediates llvm-svn: 224458	2014-12-17 21:04:08 +00:00
Colin LeMahieu	2055538edb	[Hexagon] Reconfiguring register alternate names. llvm-svn: 224455	2014-12-17 20:35:11 +00:00
Will Schmidt	428488c594	Enable the P8Model entry This was missed last time around, for the P8 Instruction Scheduling changes (223257). This will hook the P8Model entry in so those changes will actually be used. llvm-svn: 224452	2014-12-17 19:56:29 +00:00
Matthias Braun	8142efa8ea	ExecutionDepsFix: Correctly handle wide registers. The ExecutionDepsFix previously mapped each register to 1 or zero registers of the register class it was called with and therefore simulating liveness for. This was problematic for cases involving wider registers like Q0 on ARM where ExecutionDepsFix gets invoked for the Dxx registers. In these cases the wide register would get mapped to the last matching D register, while it should have been all matching D registers. This commit changes the AliasMap to use a SmallVector to map registers to potentially multiple destination regclass registers. This is required to avoid regressions with subregister liveness tracking enabled. llvm-svn: 224447	2014-12-17 19:13:47 +00:00
JF Bastien	e6acbdc487	Random Number Generator Refactoring (removing from Module) This patch removes the RNG from Module. Passes should instead create a new RNG for their use as needed. Patch by Stephen Crane @rinon. Differential revision: http://reviews.llvm.org/D4377 llvm-svn: 224444	2014-12-17 18:12:10 +00:00
Jingyue Wu	e4c9cf04f5	[NVPTX] Fix bugs related to isSingleValueType Summary: With isSingleValueType starting to treat vector types as single-value types, code that uses this interface needs to be updated. Test Plan: vector-global.ll nvcl-param-align.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6573 llvm-svn: 224440	2014-12-17 17:59:04 +00:00
Timur Iskhodzhanov	44ee1c0e91	Fix CR/LF line endings in test case llvm-svn: 224437	2014-12-17 17:52:12 +00:00
Saleem Abdulrasool	1ce7d31f33	ARM: correct an off-by-one in an assert The assert was off-by-one, resulting in failures for valid input. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224432	2014-12-17 16:17:44 +00:00
Michael Kuperstein	047b1a0400	[DAGCombine] Slightly improve lowering of BUILD_VECTOR into a shuffle. This handles the case of a BUILD_VECTOR being constructed out of elements extracted from a vector twice the size of the result vector. Previously this was always scalarized. Now, we try to construct a shuffle node that feeds on extract_subvectors. This fixes PR15872 and provides a partial fix for PR21711. Differential Revision: http://reviews.llvm.org/D6678 llvm-svn: 224429	2014-12-17 12:32:17 +00:00
Vladimir Medic	636fefe252	MipsABIInfo class is used in different libraries. Moving the files to MCTargetDesc folder(LLVMMipsDesc library) prevents linkage errors. There are no functional changes. llvm-svn: 224427	2014-12-17 11:49:56 +00:00
Toma Tabacu	a23f13c3b0	[mips] Set GCC-compatible MIPS asssembler options before inline asm blocks. Summary: When generating MIPS assembly, LLVM always overrides the default assembler options by emitting the '.set noreorder', '.set nomacro' and '.set noat' directives, while GCC uses the default options if an assembly-level function contains inline assembly code. This becomes a problem when the code generated by LLVM is interleaved with inline assembly which assumes GCC-like assembler options (from Linux, for example). This patch fixes these conflicts by setting the appropriate assembler options at the beginning of an inline asm block and popping them at the end. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6637 llvm-svn: 224425	2014-12-17 10:56:16 +00:00
Suyog Sarda	43fae93da8	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. llvm-svn: 224424	2014-12-17 10:34:27 +00:00
Yaron Keren	f630971635	Teach lit.cfg to recognize -windows-gnu in addition to -mingw32. llvm-svn: 224421	2014-12-17 09:55:15 +00:00
Elena Demikhovsky	028e966a54	Added 5 more tests related to sink store revision 224247 - by Ella Bolshinsky http://reviews.llvm.org/D6420 llvm-svn: 224418	2014-12-17 08:12:59 +00:00
Erik Eckstein	a451b9b0b5	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. llvm-svn: 224417	2014-12-17 07:29:19 +00:00
Duncan P. N. Exon Smith	92731d26bc	Revert "Linker: Drop superseded subprograms" This reverts commit r224389. Based on feedback from the bots, the assertion seems to be going off more often, not less (previously I was just seeing it in an internal bootstrap, now it's happening in public builds too). http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/936/ http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/5325 Reverting in order to investigate. llvm-svn: 224416	2014-12-17 07:27:31 +00:00
Justin Hibbits	0c0d5deff1	Add parsing of 'foo@local". Summary: Currently, it supports generating, but not parsing, this expression. Test added as well. Test Plan: New test added, no regressions due to this. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6672 llvm-svn: 224415	2014-12-17 06:23:35 +00:00
Rafael Espindola	5f06030989	Remove a debugging assert. Sorry for the noise, I have no idea how it survived to the final version. llvm-svn: 224414	2014-12-17 03:38:04 +00:00
Rafael Espindola	839353bca0	Remove unused includes and out of date comment. NFC. llvm-svn: 224413	2014-12-17 03:07:20 +00:00
Rafael Espindola	81adfb5c2e	Fix the windows build. llvm-svn: 224412	2014-12-17 02:42:20 +00:00
Rafael Espindola	97935a9123	Refactor and simplify the code reading /proc/cpuinfo. NFC. llvm-svn: 224410	2014-12-17 02:32:44 +00:00
Matthias Braun	f4a72cd06e	RegisterCoalescer: Sprinkle some const modifiers. llvm-svn: 224409	2014-12-17 02:18:13 +00:00
Duncan P. N. Exon Smith	f9abf4fb0c	llvm-lto: Add testing coverage for local contexts Add coverage in `llvm-lto` for the API exposed by libLTO to create modules in local contexts. The goal here isn't to test the symbol-related API extensively, just to confirm that these modules work at all. (I'll be shifting code around soon that should be NFC and I realized there was no test coverage.) llvm-svn: 224408	2014-12-17 02:00:38 +00:00
Nick Lewycky	52ee5e446b	Delete debugging cruft that crept in with r223802. llvm-svn: 224407	2014-12-17 01:56:51 +00:00
David Majnemer	65c52ae8ca	InstSimplify: shl nsw/nuw undef, %V -> undef We can always choose an value for undef which might cause %V to shift out an important bit except for one case, when %V is zero. However, shl behaves like an identity function when the right hand side is zero. llvm-svn: 224405	2014-12-17 01:54:33 +00:00
Nick Lewycky	ee0a3a7a2f	Make ValueEnumerator::print use OS for metadata too. Noticed by inspection. llvm-svn: 224404	2014-12-17 01:52:08 +00:00
Quentin Colombet	fc2201e922	[CodeGenPrepare] Reapply r224351 with a fix for the assertion failure: The type promotion helper does not support vector type, so when make such it does not kick in in such cases. Original commit message: [CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] movzbl (%rdi), %eax # zero-extended load movl (%rsi), %es # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequences happen a lot on code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224402	2014-12-17 01:36:17 +00:00
Kevin Enderby	57538299e8	Add printing the LC_ENCRYPTION_INFO_64 load command with llvm-objdump’s -private-headers and add tests for the two AArch64 binaries. llvm-svn: 224400	2014-12-17 01:01:30 +00:00
David Blaikie	8b979f01c6	PR21875: codegen for non-type template parameters of nullptr_t type llvm-svn: 224399	2014-12-17 00:43:22 +00:00
Reid Kleckner	04b69f89aa	Revert "[CodeGenPrepare] Move sign/zero extensions near loads using type promotion." This reverts commit r224351. It causes assertion failures when building ICU. llvm-svn: 224397	2014-12-17 00:29:23 +00:00
Hans Wennborg	224cb82a39	SelectionDAG switch lowering: use 'unsigned' to count destination popularity SwitchInst::getNumCases() returns unsinged, so using uint64_t to count cases seems unnecessary. Also fix a missing CHECK in the test case. llvm-svn: 224393	2014-12-16 23:41:59 +00:00
Colin LeMahieu	aa1bade7b4	[Hexagon] Updating doubleword shift usages to new versions. llvm-svn: 224391	2014-12-16 23:36:15 +00:00
Kevin Enderby	0804f467f2	Add printing the LC_ENCRYPTION_INFO load command with llvm-objdump’s -private-headers. llvm-svn: 224390	2014-12-16 23:25:52 +00:00
Duncan P. N. Exon Smith	8759026893	Linker: Drop superseded subprograms When a function gets replaced by `ModuleLinker`, drop superseded subprograms. This ensures that the "first" subprogram pointing at a function is the same one that `!dbg` references point at. This is a stop-gap fix for PR21910. Notably, this fixes Release+Asserts bootstraps that are currently asserting out in `LexicalScopes::initialize()` due to the explicit instantiations in `lib/IR/Dominators.cpp` eventually getting replaced by -argpromotion. llvm-svn: 224389	2014-12-16 23:23:41 +00:00
Sanjay Patel	494a625fee	fix typo, add spaces; NFC llvm-svn: 224384	2014-12-16 22:48:42 +00:00
Simon Pilgrim	bf1e079005	[X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps) Added a missing memory folding relationship for the (V)CVTPD2PS instruction - we can safely fold these for stack reloads. Differential Revision: http://reviews.llvm.org/D6663 llvm-svn: 224383	2014-12-16 22:30:10 +00:00
Rafael Espindola	9573a9cf9d	Make the assert a bit stronger. We should get no declarations in here. llvm-svn: 224382	2014-12-16 22:29:43 +00:00
Colin LeMahieu	7fc90fc7e9	[Hexagon] Removing old XTYPE/BIT instructions and replacing usages. llvm-svn: 224381	2014-12-16 22:17:09 +00:00
Sanjay Patel	7129c10cae	merge consecutive loads that are offset from a base address SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads when the first load was offset from a base address. This patch recognizes that pattern and subtracts the offset before comparing the second load to see if it is consecutive. The codegen change in the new test case improves from: vmovsd 32(%rdi), %xmm0 vmovsd 48(%rdi), %xmm1 vmovhpd 56(%rdi), %xmm1, %xmm1 vmovhpd 40(%rdi), %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 To: vmovups 32(%rdi), %ymm0 An existing test case is also improved from: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovsd 24(%rdi), %xmm2 vunpcklpd %xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0] vmovhpd 8(%rdi), %xmm1, %xmm3 To: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovhpd 24(%rdi), %xmm0, %xmm0 vmovhpd 8(%rdi), %xmm1, %xmm1 This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ). Differential Revision: http://reviews.llvm.org/D6642 llvm-svn: 224379	2014-12-16 21:57:18 +00:00
Kevin Enderby	1ff0ecc7a1	Fix a bug in llvm-objdump’s -private-headers for the LC_VERSION_MIN_IPHONEOS load command not getting printed. llvm-svn: 224376	2014-12-16 21:48:27 +00:00
Colin LeMahieu	f5acc8c625	[Hexagon] Adding tstbit/bitclr/bitset instructions. llvm-svn: 224374	2014-12-16 21:28:58 +00:00
Kostya Serebryany	7376294086	[sanitizer] prevent function call merging for sanitizer-coverage callbacks llvm-svn: 224372	2014-12-16 21:24:15 +00:00
Kevin Enderby	75594b6142	Fix another use of PRIx32 that should have been PRIx64. llvm-svn: 224368	2014-12-16 21:00:25 +00:00
Colin LeMahieu	615757f2f1	[Hexagon] Adding bit count and twiddling instructions. llvm-svn: 224367	2014-12-16 20:57:56 +00:00
Colin LeMahieu	6fce46baf6	[Hexagon] Adding asr/lsr/asl reg/imm, asl with saturation, asr with rounding. Doubleword abs/neg/not. Interleave and deinterleave instructions. llvm-svn: 224365	2014-12-16 20:40:23 +00:00
Frederic Riss	19b68ddda1	[dsymutil] Pass the verbosity flag down to the processing. NFC for now. llvm-svn: 224361	2014-12-16 20:22:11 +00:00
Frederic Riss	896b2c53ba	[dsymutil] Avoid calling getStringTableData() for each symbol. NFC. llvm-svn: 224360	2014-12-16 20:21:34 +00:00
JF Bastien	5d3280c7a7	x86-32: PUSHF/POPF use/def EFLAGS Summary: As a side-quest for D6629 jvoung pointed out that I should use -verify-machineinstrs and this found a bug in x86-32's handling of EFLAGS for PUSHF/POPF. This patch fixes the use/def, and adds -verify-machineinstrs to all x86 tests which contain 'EFLAGS'. One exception: this patch leaves inline-asm-fpstack.ll as-is because it fails -verify-machineinstrs in a way unrelated to EFLAGS. This patch also modifies cmpxchg-clobber-flags.ll along the lines of what D6629 already does by also testing i386. Test Plan: ninja check Reviewers: t.p.northover, jvoung Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6687 llvm-svn: 224359	2014-12-16 20:15:45 +00:00
Rafael Espindola	a4a94f1b55	Use CastInst::castIsValid to simplify the verifier. Also delete a dead member variable. llvm-svn: 224356	2014-12-16 19:29:29 +00:00
Matt Arsenault	31a52ad48c	NVPTX: Remove duplicate of AsmPrinter::lowerConstant llvm-svn: 224355	2014-12-16 19:16:17 +00:00
Matt Arsenault	dd3b77d64c	Move lowerConstant to AsmPrinter This was a static function before, and NVPTX duplicated it because it wasn't exposed. llvm-svn: 224354	2014-12-16 19:16:14 +00:00
Quentin Colombet	d5e57b731f	[CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] movzbl (%rdi), %eax # zero-extended load movl (%rsi), %es # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequences happen a lot on code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224351	2014-12-16 19:09:03 +00:00
Kevin Enderby	adb7c43c40	Fix the arm build bots for a test that was added. A printing routine was incorrectly using PRIx32 when it should have been using PRIx64 for the value that was passed as uint64_t . llvm-svn: 224350	2014-12-16 18:58:11 +00:00
Robert Khasanov	d04cd2fbfe	[AVX512] Enable integer arithmetic lowering for AVX512BW/VL subsets. Added lowering tests. llvm-svn: 224349	2014-12-16 18:24:07 +00:00
Evgeny Astigeevich	b42003d2bf	On behalf of Matthew Wahab: An instruction alias defined with InstAlias and an optional operand in the middle of the AsmString field, "..${a} <operands>", would get the final "}" printed in the instruction disassembly. This wouldn't happen if the optional operand appeared as the last item in the AsmString which is how the current backends avoided the problem. There don't appear to be any tests for this part of Tablegen but it passes the pre-commit tests. Manually tested the change by enabling the generic alias printer in the ARM backend and checking the output. Differential Revision: http://reviews.llvm.org/D6529 llvm-svn: 224348	2014-12-16 18:16:17 +00:00
Ahmed Bougacha	0dc1979293	[MC] Reset the MCInst in the matcher function before adding opcode/operands. On X86, the Intel asm parser tries to match all memory operand sizes when none is explicitly specified. For LEA, which doesn't really have a memory operand (just a pointer one), this results in multiple successful matches, one for each memory size. There's no error because it's same opcode, so really, it's just one match. However, the tablegen'd matcher function adds opcode/operands to the passed MCInst, and this results in multiple duplicated operands. This commit clears the MCInst in the tablegen'd matcher function. We sometimes clear it when the match failed, so there's no expectation of keeping the previous content anyway. Differential Revision: http://reviews.llvm.org/D6670 llvm-svn: 224347	2014-12-16 18:05:28 +00:00
Colin LeMahieu	1944a8cd04	[Hexagon] Adding absolute value, and negate with saturation llvm-svn: 224346	2014-12-16 17:44:49 +00:00
Sanjay Patel	e46d54f0bf	combine consecutive subvector 16-byte loads into one 32-byte load This is a fix for PR21709 ( http://llvm.org/bugs/show_bug.cgi?id=21709 ). When we have 2 consecutive 16-byte loads that are merged into one 32-byte vector, we can use a single 32-byte load instead. But we don't do this for SandyBridge / IvyBridge because they have slower 32-byte memops. We also don't bother using 32-byte integer loads on a machine that only has AVX1 (btver2) because those operands would have to be split in half anyway since there is no support for 32-byte integer math ops. Differential Revision: http://reviews.llvm.org/D6492 llvm-svn: 224344	2014-12-16 16:30:01 +00:00
Colin LeMahieu	455f24aa77	[Hexagon] Adding saturate and swizzle instructions. llvm-svn: 224343	2014-12-16 16:27:17 +00:00
Robert Khasanov	8d9b93eac8	[AVX512] Add a comment for avx512_broadcast_pat multiclass llvm-svn: 224341	2014-12-16 16:12:11 +00:00
Colin LeMahieu	d9b23509bf	[Hexagon] Removing old multiply defs and updating references to new versions. llvm-svn: 224340	2014-12-16 16:10:01 +00:00
Vladimir Medic	e88609388a	The single check for N64 inside MipsDisassemblerBase's subclasses is actually wrong. It should be testing for FeatureGP64bit.There are no functional changes. llvm-svn: 224339	2014-12-16 15:29:12 +00:00
Zoran Jovanovic	2deca34803	[mips][microMIPS] Implement SWP and LWP instructions Differential Revision: http://reviews.llvm.org/D5667 llvm-svn: 224338	2014-12-16 14:59:10 +00:00
Aaron Ballman	0d6a010c13	Fixing -Wsign-compare warnings; NFC. llvm-svn: 224337	2014-12-16 14:04:11 +00:00
Vladimir Medic	a489a631ae	Add disassembler tests for mips4 platform. There are no functional changes. llvm-svn: 224335	2014-12-16 13:02:25 +00:00
Elena Demikhovsky	f5b72afff4	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 llvm-svn: 224334	2014-12-16 11:50:42 +00:00
Daniel Sanders	97797a8a0f	[mips] Fix arguments-struct.ll for Windows and OSX hosts. llvm-svn: 224333	2014-12-16 11:21:58 +00:00
Bradley Smith	ececb7f6e2	[ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 224332	2014-12-16 10:59:27 +00:00
Elena Demikhovsky	a79fc16bb0	X86: Added FeatureVectorUAMem for all AVX architectures. According to AVX specification: "Most arithmetic and data processing instructions encoded using the VEX prefix and performing memory accesses have more flexible memory alignment requirements than instructions that are encoded without the VEX prefix. Specifically, With the exception of explicitly aligned 16 or 32 byte SIMD load/store instructions, most VEX-encoded, arithmetic and data processing instructions operate in a flexible environment regarding memory address alignment, i.e. VEX-encoded instruction with 32-byte or 16-byte load semantics will support unaligned load operation by default. Memory arguments for most instructions with VEX prefix operate normally without causing #GP(0) on any byte-granularity alignment (unlike Legacy SSE instructions)." The same for AVX-512. This change does not affect anything right now, because only the "memop pattern fragment" depends on FeatureVectorUAMem and it is not used in AVX patterns. All AVX patterns are based on the "unaligned load" anyway. llvm-svn: 224330	2014-12-16 09:10:08 +00:00
Duncan P. N. Exon Smith	8c66273d0c	Remove 'metadata' from comments llvm-svn: 224328	2014-12-16 07:45:05 +00:00
Duncan P. N. Exon Smith	bb7d2fb1e5	IR: Stop printing 'metadata' in Metadata::print() Stop printing `metadata` in `Metadata::print()` and `Metadata::printAsOperand()`. llvm-svn: 224327	2014-12-16 07:40:31 +00:00

... 3 4 5 6 7 ...

111267 Commits