llvm-project

Commit Graph

Author	SHA1	Message	Date
Hao Liu	9a342778b9	[ARM64]Fix a bug cannot select UQSHL/SQSHL with constant i64 shift amount. llvm-svn: 207399	2014-04-28 07:34:27 +00:00
Benjamin Kramer	3693e77cb4	X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts. This is more expensive than pmuldq but still cheaper than scalarizing the whole thing. llvm-svn: 207370	2014-04-27 18:47:41 +00:00
Benjamin Kramer	99767ddf0b	Update test not to check for a shuffle of an all-zero vector. llvm-svn: 207354	2014-04-27 11:54:45 +00:00
Benjamin Kramer	6bca8ef667	SelectionDAG: Aggressively fold shuffles of constant splats. llvm-svn: 207352	2014-04-27 11:41:06 +00:00
Benjamin Kramer	da4841b3a9	DAGCombiner: Simplify code a bit, make more transforms work with vectors. llvm-svn: 207338	2014-04-26 23:09:49 +00:00
Benjamin Kramer	6d2dff61f9	X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available. llvm-svn: 207318	2014-04-26 14:12:19 +00:00
Benjamin Kramer	c9827ab103	X86: Add patterns for MULHU/MULHS of v8i16 and v16i16. This gets us pretty code for divs of i16 vectors. Turn the existing intrinsics into the corresponding nodes. llvm-svn: 207317	2014-04-26 13:01:03 +00:00
Benjamin Kramer	4dae598bc8	DAGCombiner: Turn divs of vector splats into vectorized multiplications. Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315	2014-04-26 12:06:28 +00:00
Michael Zolotukhin	1a97a7bcbf	Revert r206749 till a final decision about the intrinsics is made. llvm-svn: 207313	2014-04-26 09:56:41 +00:00
Juergen Ributzka	a6bda8bae2	[DAG] During DAG legalization keep opaque constants even after expanding. The included test case would return the incorrect results, because the expansion of an shift with a constant shift amount of 0 would generate undefined behavior. This is because ExpandShiftByConstant assumes that all shifts by constants with a value of 0 have already been optimized away. This doesn't happen for opaque constants and usually this isn't a problem, because opaque constants won't take this code path - they are not supposed to. In the case that the opaque constant has to be expanded by the legalizer, the legalizer would drop the opaque flag. In this case we hit the limitations of ExpandShiftByConstant and create incorrect code. This commit fixes the legalizer by not dropping the opaque flag when expanding opaque constants and adding an assertion to ExpandShiftByConstant to catch this not supported case in the future. This fixes <rdar://problem/16718472> llvm-svn: 207304	2014-04-26 02:58:04 +00:00
Quentin Colombet	ea18933d97	[X86] Implement TargetLowering::getScalingFactorCost hook. Scaling factors are not free on X86 because every "complex" addressing mode breaks the related instruction into 2 allocations instead of 1. <rdar://problem/16730541> llvm-svn: 207301	2014-04-26 01:11:26 +00:00
Filipe Cabecinhas	d71f110fe9	Appease the almighty buildbots. llvm-svn: 207295	2014-04-26 00:02:37 +00:00
Filipe Cabecinhas	363b570d2a	Optimization for certain shufflevector by using insertps. Summary: If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower certain shufflevectors to an insertps instruction: When most of the shufflevector result's elements come from one vector (and keep their index), and one element comes from another vector or a memory operand. Added tests for insertps optimizations on shufflevector. Added support and tests for v4i32 vector optimization. Reviewers: nadav Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3475 llvm-svn: 207291	2014-04-25 23:51:17 +00:00
Saleem Abdulrasool	99f0d458c3	ARM: remove @llvm.arm.sevl This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic which provides a generic, extensible manner for adding hint instructions. This functionality can now be represented as @llvm.arm.hint(i32 5). llvm-svn: 207246	2014-04-25 17:51:25 +00:00
Saleem Abdulrasool	7e7c2f9ca6	ARM: provide a new generic hint intrinsic Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into the instruction stream. This is particularly useful for generating IR from a compiler where the user may inject an intrinsic (e.g. __yield). These are then pattern substituted into the correct instruction which already existed. llvm-svn: 207242	2014-04-25 17:24:24 +00:00
Tilmann Scheller	2c65bbddd8	[ARM64] When compiling for ELF in PIC mode, local symbols shouldn't go through the GOT There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol. This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected. With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3). llvm-svn: 207226	2014-04-25 13:43:18 +00:00
Jiangning Liu	533b560bc6	[ARM64] Handle fp128 for parameter passing on stack llvm-svn: 207222	2014-04-25 12:07:03 +00:00
Tim Northover	eb7354fd3b	ARM64: fix assertion in ISelDAGToDAG Also an unused variable, so double bonus! This should deal with PR19548. llvm-svn: 207221	2014-04-25 10:48:47 +00:00
Bradley Smith	672df15122	[ARM64] Print preferred aliases for SBFM/UBFM in InstPrinter llvm-svn: 207219	2014-04-25 10:25:29 +00:00
Kevin Qin	022d395c9c	[ARM64] Add RUN lines for "-target arm64 -mattr=-fp-armv8" on AArch64 no-fp test. This patch is a supplement of implementing predicate of FP, enabling aarch64 backend no-fp tests on arm64 target for verification. During this, one bug is exposed and fixed by this patch. llvm-svn: 207215	2014-04-25 09:44:20 +00:00
Kevin Qin	0e7b07704e	[ARM64] Support crc predicate on ARM64. According to the specification, CRC is an optional extension of the architecture. llvm-svn: 207214	2014-04-25 09:25:42 +00:00
Benjamin Kramer	76f753e9a9	X86: Don't transform shifts into ands when the sign bit is tested. Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode. llvm-svn: 207145	2014-04-24 20:51:37 +00:00
Reid Kleckner	5772b77789	Add 'musttail' marker to call instructions This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with conservative IR verification rules that ensure that tail call optimization is possible. Reviewers: nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D3240 llvm-svn: 207143	2014-04-24 20:14:34 +00:00
Reid Kleckner	0fbb1e91e5	Fix rdtsc.ll test to match r8 on win64 llvm-svn: 207142	2014-04-24 20:14:08 +00:00
Andrea Di Biagio	d1ab866868	[X86] Add support for Read Time Stamp Counter x86 builtin intrinsics. This patch: - Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and 'int_x86_rdtscp') as GCCBuiltin intrinsics; - Teaches the backend how to lower the two new builtins; - Introduces a common function to lower READCYCLECOUNTER dag nodes and the two new rdtsc/rdtscp intrinsics; - Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll' correctly verifies that both READCYCLECOUNTER and the two new intrinsics work fine for both 64bit and 32bit Subtargets. llvm-svn: 207127	2014-04-24 17:18:27 +00:00
Tim Northover	6331d4b975	AArch64: print NEON lists with a space. This matches ARM64 behaviour, which I think is clearer. It also puts all the churn from that difference into one easily ignored commit. llvm-svn: 207116	2014-04-24 14:06:20 +00:00
Tim Northover	9b594d1163	AArch64/ARM64: port bitfield test to ARM64. llvm-svn: 207103	2014-04-24 12:11:56 +00:00
Tim Northover	eb6611e727	AArch64/ARM64: implement BFI optimisation ARM64 was not producing pure BFI instructions for bitfield insertion operations, unlike AArch64. The approach had to be a little different (in ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but hopefully this gives it similar power. This should address PR19424. llvm-svn: 207102	2014-04-24 12:11:53 +00:00
Tim Northover	1cb984fbcf	AArch64/ARM64: port more tests llvm-svn: 207101	2014-04-24 12:11:46 +00:00
Benjamin Kramer	f4575db2fd	X86: Emit test instead of constant shift + compare if the shift result is unused. This allows us to compile return (mask & 0x8 ? a : b); into testb $8, %dil cmovnel %edx, %esi instead of andl $8, %edi shrl $3, %edi cmovnel %edx, %esi which we formed previously because dag combiner canonicalizes setcc of and into shift. llvm-svn: 207088	2014-04-24 08:15:31 +00:00
Saleem Abdulrasool	9e6a524551	MC: move test from Generic to COFF This is a COFF specific test, move it to COFF to fix the Hexagon buildbots. llvm-svn: 207030	2014-04-23 21:41:07 +00:00
Saleem Abdulrasool	11049a0fef	MC: honour IMAGE_SCN_CNT_INITIALIZED_DATA Emit the flag to indicate to the assembler that a section contains data if there is pre-populated data present. llvm-svn: 207028	2014-04-23 21:29:34 +00:00
Quentin Colombet	ef86b4067c	[ARM64] Fix the information we give to the peephole optimizer for comparison. ANDS does not use the same encoding scheme as other xxxS instructions (e.g., ADDS). Take that into account to avoid wrong peephole optimization. <rdar://problem/16693089> llvm-svn: 207020	2014-04-23 20:43:38 +00:00
Matt Arsenault	4c6ab696e2	R600: Add a test that used to be broken that I forgot to add llvm-svn: 207017	2014-04-23 19:45:05 +00:00
Kevin Qin	a4ee178762	[ARM64] Enable feature predicates for NEON / FP / CRYPTO. AArch64 has feature predicates for NEON, FP and CRYPTO instructions. This allows the compiler to generate code without using FP, NEON or CRYPTO instructions. llvm-svn: 206949	2014-04-23 06:22:48 +00:00
Reid Kleckner	feb1148ed6	Fix test/CodeGen/arm.ll The 'CHECK: add' line was occasionally matching against the filename, breaking the subsequent CHECK-NOT. Also use CHECK-LABEL. llvm-svn: 206936	2014-04-23 01:09:29 +00:00
Matt Arsenault	16353871c3	R600: Emit error instead of unreachable on function call llvm-svn: 206904	2014-04-22 16:42:00 +00:00
Elena Demikhovsky	acc5c9e83e	AVX-512: store and truncstore for i1 values llvm-svn: 206897	2014-04-22 14:13:10 +00:00
Tim Northover	52d3283026	AArch64/ARM64: more testing from AArch64 to ARM64 llvm-svn: 206889	2014-04-22 12:45:47 +00:00
Tim Northover	a962398a3f	AArch64/ARM64: make use of ANDS and BICS instructions for comparisons. llvm-svn: 206888	2014-04-22 12:45:42 +00:00
Tim Northover	31ebef86b8	AArch64/ARM64: add extra testing from AArch64 to ARM64 llvm-svn: 206887	2014-04-22 12:45:32 +00:00
Tim Northover	2b73e74238	AArch64/ARM64: enable various AArch64 tests on ARM64. llvm-svn: 206877	2014-04-22 10:10:26 +00:00
Tim Northover	00b4ee848f	AArch64/ARM64: add patterns for scalar_to_vector/extract pairs llvm-svn: 206876	2014-04-22 10:10:18 +00:00
Tim Northover	e74fb0d7b9	AArch64/ARM64: mark fmul intrinsic as commutative. This gives DAG patterns matching indexed patterns where either side is an indexed vector. llvm-svn: 206875	2014-04-22 10:10:14 +00:00
Tim Northover	978d25f391	ARM: disable emission of __XYZvfp in soft-float environment. The point of these calls is to allow Thumb-1 code to make use of the VFP unit to perform its operations. This is not desirable with -msoft-float, since most of the reasons you'd want that apply equally to the runtime library. rdar://problem/13766161 llvm-svn: 206874	2014-04-22 10:10:09 +00:00
Hao Liu	c636d15284	Fix an infinite loop bug in DAG Combine about keeping transferring between ANY_EXTEND and SIGN_EXTEND. llvm-svn: 206873	2014-04-22 09:57:06 +00:00
Lang Hames	f6f42cac3f	[X86] Don't use BZHI for short masks (>=32 bits). Thanks to Ben Kramer for the review. llvm-svn: 206869	2014-04-22 07:40:34 +00:00
Matt Arsenault	5dbd5db518	R600: Make sign_extend_inreg legal. Don't know why I didn't just do this in the first place. llvm-svn: 206862	2014-04-22 03:49:30 +00:00
Jiangning Liu	87486e0bac	[AArch64] Enable global merge pass. llvm-svn: 206861	2014-04-22 03:33:26 +00:00
Quentin Colombet	d4f44690ef	[CodeGenPrepare] Use APInt to check the value of the immediate in a and while checking candidate for bit field extract. Otherwise the value may not fit in uint64_t and this will trigger an assertion. This fixes PR19503. llvm-svn: 206834	2014-04-22 01:20:34 +00:00


9638 Commits