llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Holewinski	124e93de93	[NVPTX] Properly handle bitcast ConstantExpr when checking for the alignment of function parameters llvm-svn: 194410	2013-11-11 19:28:19 +00:00
Justin Holewinski	4f5bc9b33a	[NVPTX] Fix logic error in loading vector parameters of more than 4 components llvm-svn: 194409	2013-11-11 19:28:16 +00:00
Chad Rosier	d3684a0566	[AArch64] The shift right/left and insert immediate builtins expect 3 source operands, a vector, an element to insert, and a shift amount. llvm-svn: 194406	2013-11-11 19:11:11 +00:00
Chad Rosier	35575e737c	[AArch64] Add support for NEON scalar floating-point convert to fixed-point instructions. llvm-svn: 194394	2013-11-11 18:04:07 +00:00
Daniel Sanders	a1840d2f88	Vector forms of SHL, SRA, and SRL can be constant folded using SimplifyVBinOp too Reviewers: dsanders Reviewed By: dsanders CC: llvm-commits, nadav Differential Revision: http://llvm-reviews.chandlerc.com/D1958 llvm-svn: 194393	2013-11-11 17:23:41 +00:00
Matheus Almeida	c051a40506	[mips][msa] CHECK-DAG-ize MSA 3r-a.ll test. No functional changes. llvm-svn: 194391	2013-11-11 16:46:20 +00:00
Matheus Almeida	ce207fa078	[mips][msa] CHECK-DAG-ize MSA 2rf_int_float.ll test. No functional changes. llvm-svn: 194390	2013-11-11 16:38:55 +00:00
Matheus Almeida	fed22ad33b	[mips][msa] CHECK-DAG-ize MSA 2rf_float_int.ll test. No functional changes. llvm-svn: 194389	2013-11-11 16:31:46 +00:00
Matheus Almeida	c596839e67	[mips][msa] CHECK-DAG-ize MSA 2rf.ll test. No functional changes. llvm-svn: 194387	2013-11-11 16:24:53 +00:00
Matheus Almeida	9826d07a2f	[mips][msa] CHECK-DAG-ize MSA 2r.ll test. No functional changes. llvm-svn: 194386	2013-11-11 16:16:53 +00:00
Hal Finkel	c6a243987d	Add PPC option for full register names in asm On non-Darwin PPC systems, we currently strip off the register name prefix prior to instruction printing. So instead of something like this: mr r3, r4 we print this: mr 3, 4 The first form is the default on Darwin, and is understood by binutils, but not yet understood by our integrated assembler. Once our integrated-as understands full register names as well, this temporary option will be replaced by tying this functionality to the verbose-asm option. The numeric-only form is compatible with legacy assemblers and tools, and is also gcc's default on most PPC systems. On the other hand, it is harder to read, and there are some analysis tools that expect full register names. llvm-svn: 194384	2013-11-11 14:58:40 +00:00
Reed Kotler	45c5927c5c	Mostly finish up constant islands port for Mips for load constants. Still need to finish the branch part. Still lots more review of the code, clean up and testing. llvm-svn: 194337	2013-11-10 00:09:26 +00:00
Akira Hatanaka	d1c58ed8a7	[mips] Make sure there is a chain edge dependency between loads that read formal arguments on the stack and stores created afterwards. We need this to ensure tail call optimized function calls do not write over the argument area of the stack before it is read out. llvm-svn: 194309	2013-11-09 02:38:51 +00:00
Juergen Ributzka	87ed906b2e	[Stackmap] Materialize the jump address within the patchpoint noop slide. This patch moves the jump address materialization inside the noop slide. This enables patching of the materialization itself or its complete removal. This patch also adds the ability to define scratch registers that can be used safely by the code called from the patchpoint intrinsic. At least one scratch register is required, because that one is used for the materialization of the jump address. This patch depends on D2009. Differential Revision: http://llvm-reviews.chandlerc.com/D2074 Reviewed by Andy llvm-svn: 194306	2013-11-09 01:51:33 +00:00
Juergen Ributzka	9969d3e6e8	[Stackmap] Add AnyReg calling convention support for patchpoint intrinsic. The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a particular order into a specified set of registers. Instead it is up to the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293	2013-11-08 23:28:16 +00:00
Quentin Colombet	b06a0ed4b0	[VirtRegMap] Fix for PR17825. Do not ignore noreturn definitions when setting isPhysRegUsed if the unwind information is required. Indeed, the runtime may need a correct stack to be able to unwind the call. llvm-svn: 194271	2013-11-08 18:14:17 +00:00
Tim Northover	93bcc66e73	ARM: fold prologue/epilogue sp updates into push/pop for code size ARM prologues usually look like: push {r7, lr} sub sp, sp, #4 If code size is extremely important, this can be optimised to the single instruction: push {r6, r7, lr} where we don't actually care about the contents of r6, but pushing it subtracts 4 from sp as a side effect. This should implement such a conversion, predicated on the "minsize" function attribute (-Oz) since I've yet to find any code it actually makes faster. llvm-svn: 194264	2013-11-08 17:18:07 +00:00
Vincent Lejeune	4f3751f2af	R600: Fix LowerUDIVREM llvm-svn: 194153	2013-11-06 17:36:04 +00:00
Jiangning Liu	f4226f1d7b	Implement AArch64 Neon instruction set Perm. llvm-svn: 194123	2013-11-06 03:35:27 +00:00
Jiangning Liu	a50e22ca4f	Implement AArch64 Neon instruction set Bitwise Extract. llvm-svn: 194118	2013-11-06 02:25:49 +00:00
Andrew Trick	6664df12fb	Slightly change the way stackmap and patchpoint intrinsics are lowered. MorphNodeTo is not safe to call during DAG building. It eagerly deletes dependent DAG nodes which invalidates the NodeMap. We could expose a safe interface for morphing nodes, but I don't think it's worth it. Just create a new MachineNode and replaceAllUsesWith. My understanding of the SD design has been that we want to support early target opcode selection. That isn't very well supported, but generally works. It seems reasonable to rely on this feature even if it isn't widely used. llvm-svn: 194102	2013-11-05 22:44:04 +00:00
Jiangning Liu	d7c52676f6	Implement AArch64 Neon Crypto instruction classes AES, SHA, and 3 SHA. llvm-svn: 194085	2013-11-05 17:42:05 +00:00
Reed Kotler	0f007fc4ce	Fix r194019 as requested by Eric Christopher. Submit the basic port of the rest of ARM constant islands code to Mips. Two test cases are added which reflect the next level of functionality: constants getting moved to water areas that are out of range from the initial placement at the end of the function and basic blocks being split to create water when none exists that can be used. There is a bunch of this code that is not complete and has been marked with IN_PROGRESS. I will finish cleaning this all up during the next week or two and submit the rest of the test cases. I have eliminated some code for dealing with inline assembly because to me it unnecessarily complicates things and some of the newer features of llvm like function attributes and builtin assembler give me better tools to solve the alignment issues created there. Also, for Mips16 I even have the option of not doing constant islands in the presence of inline assembler if I choose. When everything has been completed I will summarize the port and notify people that are knowledgeable regarding the ARM Constant Islands code so they can review it in its entirety if they wish. llvm-svn: 194053	2013-11-05 08:14:14 +00:00
Hao Liu	d6b40b51c7	Implement AArch64 post-index vector load/store multiple N-element structure class SIMD(lselem-post). Including following 14 instructions: 4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N = 2,3,4). llvm-svn: 194043	2013-11-05 03:39:32 +00:00
Kevin Qin	97f6aaa8ad	Implemented aarch64 neon intrinsic vcopy_lane with float type. llvm-svn: 194041	2013-11-05 02:03:59 +00:00
NAKAMURA Takumi	5267613e3a	Revert r194019 to r194021, "Submit the basic port of the rest of ARM constant islands code to Mips." It broke -Asserts build. llvm-svn: 194026	2013-11-04 23:14:36 +00:00
Tim Northover	ace0bd4d33	AArch64: use default asm operand printing when modifier inapplicable If an inline assembly operand has multiple constraints (e.g. "Ir" for immediate or register) and an operand modifier (e.g. "w" for "print register as wN") then we need to decide behaviour when the modifier doesn't apply to the constraint. Previously, this produced some combination of an assertion failure and a fatal error. GCC's behaviour appears to be to ignore the modifier and print the operand in the default way. This patch should implement that. llvm-svn: 194024	2013-11-04 23:04:07 +00:00
Reed Kotler	3fe68871da	Add the test case that goes with the previous submission for constant islands. I forgot to add it to svn on that patch. Ooops. llvm-svn: 194020	2013-11-04 22:13:41 +00:00
Eric Christopher	542c8d934d	Check for both styles of clobbers, those produced by dragonegg and those produced by clang for the inline asm bswap conversion. Modified from a patch by Chris Smowton. llvm-svn: 194016	2013-11-04 21:41:21 +00:00
Cameron McInally	d80f7d34de	Add support for AVX512 masked vector blend intrinsics. llvm-svn: 194006	2013-11-04 19:14:56 +00:00
Elena Demikhovsky	dacddb0bab	AVX-512: added VPCONFLICT instruction and intrinsics, added EVEX_KZ to tablegen llvm-svn: 193959	2013-11-03 13:46:31 +00:00
Venkatraman Govindaraju	5ae77f7564	[SparcV9] Handle i64 <-> float conversions in sparcv9 mode. llvm-svn: 193957	2013-11-03 12:28:40 +00:00
Venkatraman Govindaraju	f1d807ee13	[Sparc] Expand FP_TO_UINT, UINT_TO_FP for fp128. llvm-svn: 193947	2013-11-03 08:00:19 +00:00
Bob Wilson	e7dde0c061	Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+. rdar://12856873 Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller llvm-svn: 193942	2013-11-03 06:14:38 +00:00
Venkatraman Govindaraju	5615aca219	[SparcV9] Add ctpop instruction for i64. Also, expand ctlz, cttz and bswap. llvm-svn: 193941	2013-11-03 05:59:07 +00:00
Michael Liao	b638d05ecb	Fix PR17764 - When selecting BLEND from vselect, the operands need swapping as due to the difference between vselect and SSE/AVX's BLEND insn llvm-svn: 193900	2013-11-02 00:10:02 +00:00
Bradley Smith	2521975a42	[ARM] Add Virtualization subtarget feature and more build attributes in this area Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use. Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly. llvm-svn: 193859	2013-11-01 13:27:35 +00:00
Bradley Smith	c848beba5e	[ARM] Fix Tag_ABI_HardFP_use build attribute Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args. llvm-svn: 193856	2013-11-01 11:21:16 +00:00
Andrew Trick	f990411256	These test cases for experimental features are a bit too darwin-specific still. Use a triple. llvm-svn: 193820	2013-10-31 22:46:51 +00:00
Chad Rosier	74b65cd811	[AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions. llvm-svn: 193816	2013-10-31 22:36:59 +00:00
Andrew Trick	a3a11dedca	Add new calling convention for WebKit Java Script. llvm-svn: 193812	2013-10-31 22:12:01 +00:00
Andrew Trick	153ebe6d2a	Add support for stack map generation in the X86 backend. Originally implemented by Lang Hames. llvm-svn: 193811	2013-10-31 22:11:56 +00:00
Chad Rosier	20e1f20d69	[AArch64] Add support for NEON scalar shift immediate instructions. llvm-svn: 193790	2013-10-31 19:28:44 +00:00
Roman Divacky	2262cfaf19	SparcV9 doesn't have rem instruction either. llvm-svn: 193789	2013-10-31 19:22:33 +00:00
Roman Divacky	8d72f4a06f	Merge and filecheckize. llvm-svn: 193778	2013-10-31 17:50:45 +00:00
Cameron McInally	394d557f41	Add AVX512 unmasked integer broadcast intrinsics and support. llvm-svn: 193748	2013-10-31 13:56:31 +00:00
Elena Demikhovsky	496656900e	AVX-512: Implemented CMOV for 512-bit vectors llvm-svn: 193747	2013-10-31 13:15:32 +00:00
Richard Sandiford	f834ea19db	[SystemZ] Automatically detect zEC12 and z196 hosts As on other hosts, the CPU identification instruction is privileged, so we need to look through /proc/cpuinfo. I copied the PowerPC way of handling "generic". Several tests were implicitly assuming z10 and so failed on z196. llvm-svn: 193742	2013-10-31 12:14:17 +00:00
Amara Emerson	f80f95fcc7	[AArch64] Make the use of FP instructions optional, but enabled by default. This adds a new subtarget feature called FPARMv8 (implied by NEON), and predicates the support of the FP instructions and registers on this feature. llvm-svn: 193739	2013-10-31 09:32:11 +00:00
Jim Grosbach	7236678687	Legalize: Improve legalization of long vector extends. When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. 
If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727	2013-10-31 00:20:48 +00:00
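
The five conditions listed in the last commit message above (r193727, long vector extends) can be read as a single predicate. The following is only an illustrative sketch in Python, not the code from the commit: the function name `should_extend_one_step`, the `(num_elts, elt_bits)` type model, and the `toy_is_legal` target (128- and 256-bit vectors legal, roughly mirroring the AVX example in the message) are all assumptions made for this example.

```python
from typing import Callable, Tuple

# A vector type is modelled as (number of elements, element width in bits).
# This model is an assumption for illustration, not LLVM's EVT/MVT.
VecType = Tuple[int, int]


def toy_is_legal(ty: VecType) -> bool:
    """Hypothetical target: only 128-bit and 256-bit vectors are legal."""
    num_elts, elt_bits = ty
    return elt_bits in (8, 16, 32, 64) and num_elts * elt_bits in (128, 256)


def should_extend_one_step(
    num_elts: int,
    src_bits: int,
    dst_bits: int,
    is_legal: Callable[[VecType], bool],
) -> bool:
    """Return True when the extend should first go up a single 'step'
    (doubling the element width) instead of splitting source and destination.
    """
    # Only applies when the extend more than doubles the element size.
    if dst_bits <= 2 * src_bits:
        return False
    # The number of vector elements must be even so the vector can be split.
    if num_elts % 2 != 0:
        return False
    half = num_elts // 2
    return (
        is_legal((num_elts, src_bits))           # source type is legal
        and not is_legal((half, src_bits))       # split source is illegal
        and is_legal((num_elts, 2 * src_bits))   # one-step extended source is legal
        and is_legal((half, 2 * src_bits))       # ...and stays legal when split
    )


if __name__ == "__main__":
    # zext <16 x i8> to <16 x i32>, the example from the commit message.
    print(should_extend_one_step(16, 8, 32, toy_is_legal))
```

Under this toy legality rule, the v16i8 to v16i32 case from the message satisfies every condition, while an extend that only doubles the element width (for example v16i8 to v16i16) is rejected by the first check and would be legalized the usual way.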

8515 Commits