llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	5d2f7cfd44	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames without a frame pointer when unwind may happen. This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614. llvm-svn: 255175	2015-12-09 23:08:18 +00:00
Dan Gohman	1cf96c0c34	[WebAssembly] Reintroduce ARGUMENT moving logic Reintroduce the code for moving ARGUMENTS back to the top of the basic block. While the ARGUMENTS physical register prevents sinking and scheduling from moving them, it does not appear to be sufficient to prevent SelectionDAG from moving them down in the initial schedule. This patch introduces a pass that moves them back to the top immediately after SelectionDAG runs. This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is one alternative, though the review has not been favorable, and proposed alternatives are longer-term and have other downsides. This fixes the main outstanding -verify-machineinstrs failures, so it adds -verify-machineinstrs to several tests. Differential Revision: http://reviews.llvm.org/D15377 llvm-svn: 255125	2015-12-09 16:23:59 +00:00
Tim Northover	d91d635b36	ARM: don't use a deleted node as the BaseReg in complex pattern. We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120	2015-12-09 15:54:50 +00:00
Robert Lougher	f0033b29d4	Fix cycle in selection DAG introduced by extractelement legalization During selection DAG legalization, extractelement is replaced with a load instruction. To do this, a temporary store to the stack is used unless an existing store is found that can be re-used. If re-using a store, the chain going out of the store must be replaced by the one going out of the new load (this ensures that any stores that must take place after the store happen after the load, else the value might be overwritten before it is loaded). The problem is, if the extractelement index is dependent on the store, replacing the chain will introduce a cycle in the selection DAG (the load uses the index, and by replacing the chain we will make the index dependent on the load). To fix this, if the index is dependent on the store, the store is skipped. This is conservative as we may end up creating an unnecessary extra store to the stack. However, the situation is not expected to occur very often. Differential Revision: http://reviews.llvm.org/D15330 llvm-svn: 255114	2015-12-09 14:34:10 +00:00
Ahmed Bougacha	97564c3a1b	[AArch64][ARM] Don't base interleaved op legality on type alloc size. Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089	2015-12-09 01:19:50 +00:00
Vyacheslav Klochkov	a3cd08b05c	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Pirama Arumuga Nainar	e6ccd7b66a	Define selection for v4f16, v8f16 scalar_to_vector Summary: This fixes failure when trying to select insertelement <4 x half> undef, half %a, i64 0 which gets transformed to a scalar_to_vector node. The accompanying v4 and v8 tests fail instruction selection without this patch. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15322 llvm-svn: 255072	2015-12-08 23:07:06 +00:00
Simon Pilgrim	323e00d9c7	[X86][AVX] Fold loads + splats into broadcast instructions On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector. This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element. Fix for PR23022 Differential Revision: http://reviews.llvm.org/D15310 llvm-svn: 255061	2015-12-08 22:17:11 +00:00
Simon Pilgrim	0aea1b89eb	[X86][SSE4A] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse4a-builtins.c llvm-svn: 255053	2015-12-08 21:43:41 +00:00
Simon Pilgrim	0ca7cb6334	[X86][SSSE3] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/ssse3-builtins.c llvm-svn: 255052	2015-12-08 21:32:08 +00:00
Simon Pilgrim	9d76810949	[X86][SSE3] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse3-builtins.c llvm-svn: 255051	2015-12-08 21:27:19 +00:00
Artyom Skrobov	0a37b80bcb	Fix ARMv4T (Thumb1) epilogue generation Summary: Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986; so we need the special fixup in the epilogue. Reviewers: jroelofs, qcolombet Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15126 llvm-svn: 255047	2015-12-08 19:59:01 +00:00
Ron Lieberman	e6540e244a	[Hexagon] Add NewValueJump support for C4_cmpneq, C4_cmplte, C4_cmplteu llvm-svn: 255027	2015-12-08 16:28:32 +00:00
Justin Bogner	0ebc8605ad	IR: Allow vectors of halfs to be ConstantDataVectors Currently, vectors of halfs end up as ConstantVectors, but there isn't a good reason they can't be ConstantDataVectors. This should save some memory. llvm-svn: 254991	2015-12-08 03:01:16 +00:00
Justin Bogner	3135ba9b38	AsmPrinter: Use emitGlobalConstantFP to emit elements of constant data It's strange to duplicate the logic for emitting FP values into emitGlobalConstantDataSequential, and it's even stranger that we end up printing the verbose assembly comments differently between the two paths. Just call into emitGlobalConstantFP rather than crudely duplicating its logic. llvm-svn: 254988	2015-12-08 02:37:48 +00:00
Manman Ren	cb8470b4b5	[CXX TLS calling convention] Add support for AArch64. rdar://9001553 llvm-svn: 254978	2015-12-08 00:14:38 +00:00
Kit Barton	a1c712fae5	[PPC64] Convert bool literals to i32 Convert i1 values to i32 values if they should be allocated in GPRs instead of CRs. Phabricator: http://reviews.llvm.org/D14064 llvm-svn: 254942	2015-12-07 20:50:29 +00:00
Simon Pilgrim	69aa463780	Fix line endings llvm-svn: 254939	2015-12-07 20:36:00 +00:00
Ron Lieberman	c5e20a41a0	[Hexagon] Adding v60 test, vasr in particular. llvm-svn: 254923	2015-12-07 18:52:39 +00:00
Sanjay Patel	fe2e9121e2	Tighten checks so we can see existing codegen The 2-element vector case shows a surprising bug: we failed to eliminate ops on undefs, so there are 4 fmax calls even though there can only be 2 valid elements in the inputs. llvm-svn: 254920	2015-12-07 17:39:48 +00:00
Elena Demikhovsky	291fe0159f	AVX-512: Fixed a bug in FP logic operation lowering FP logic instructions are supported in DQ extension on AVX-512 target. I use integer operations instead. Added tests. I also enabled FABS in this patch in order to check ANDPS. The operations are FOR, FXOR, FAND, FANDN. The instructions that are supported for 512-bit vectors under DQ are: VORPS/PD, VXORPS/PD, VANDPS/PD, VANDNPS/PD. Differential Revision: http://reviews.llvm.org/D15110 llvm-svn: 254913	2015-12-07 14:33:34 +00:00
Artyom Skrobov	e9b3fb8603	[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM. Summary: This reverts r254234, and adds a simple fix for the annoying case of use-after-free. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15236 llvm-svn: 254912	2015-12-07 14:22:39 +00:00
Elena Demikhovsky	33e61eceb4	AVX-512: Fixed masked load / store instruction selection for KNL. Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store. This intrinsic comes with all legal types: <8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru), but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only. All data operands should be widened to 512-bit vector. The mask operand should be widened to v16i1 with zeroes. Differential Revision: http://reviews.llvm.org/D15265 llvm-svn: 254909	2015-12-07 13:39:24 +00:00
Igor Breger	3ab6f17530	AVX-512: implement kunpck intrinsics. Differential Revision: http://reviews.llvm.org/D14821 llvm-svn: 254908	2015-12-07 13:25:18 +00:00
Bradley Smith	d5a1f47a63	[ARM] Flag vcvt{t,b} with an f16 type specifier as part of the FP16 extension Additionally correct the Cortex-R7 definition to allow the FP16 feature. llvm-svn: 254900	2015-12-07 10:54:36 +00:00
Simon Pilgrim	12301b0814	[X86][AVX] Added tests to load+broadcast non-zero'th vector elements Baseline for an upcoming patch for PR23022 llvm-svn: 254898	2015-12-07 09:09:54 +00:00
Keno Fischer	0ef8ccf968	[Verifier] Fix !dbg validation if Scope is the Subprogram Summary: We are inserting both Scope and SP into the Seen map and check whether it was already there in which case we skip the validation (the idea being that we already checked this Subprogram before). However, if (Scope == SP) as MDNodes, then inserting the Scope, will trigger the Seen check causing us to incorrectly not validate this !dbg attachment. Fix this by not performing the SP Seen check if Scope == SP Reviewers: pcc, dexonsmith, dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D14697 llvm-svn: 254887	2015-12-06 23:05:38 +00:00
Simon Pilgrim	38e93ea1cc	[X86][AVX] Tidied up BROADCASTPD/BROADCASTPS tests Regenerate tests using update_llc_test_checks.py llvm-svn: 254886	2015-12-06 20:12:19 +00:00
Dan Gohman	a4b710a74f	[WebAssembly] Enable folding of offsets into global variable addresses. llvm-svn: 254882	2015-12-06 19:33:32 +00:00
Dan Gohman	6ddce716cb	[WebAssembly] Tighten up some testcase regular expressions. llvm-svn: 254881	2015-12-06 19:31:44 +00:00
Sanjay Patel	6226e6d993	[x86] add missing maxnum/minnum tests for 256-bit vectors Also, switch to x86-64 because once we can lower these to something more reasonable, there will be less noise in the checks. And add AVX runs because those will be different than SSE. llvm-svn: 254879	2015-12-06 18:05:12 +00:00
Asaf Badouh	41ecf460fa	[X86][AVX512] add vmovss/sd missing encoding Differential Revision: http://reviews.llvm.org/D14701 llvm-svn: 254875	2015-12-06 13:26:56 +00:00
Michael Kuperstein	77ce9d3b1a	[X86] Always generate precise CFA adjustments. This removes the code path that generates "synchronous" (only correct at call site) CFA. We will probably want to re-introduce it once we are capable of emitting different .eh_frame and .debug_frame sections. Differential Revision: http://reviews.llvm.org/D14948 llvm-svn: 254874	2015-12-06 13:06:20 +00:00
Igor Breger	076dfe5c12	AVX512: support AVX512BW Intrinsic in 32bit mode. Differential Revision: http://reviews.llvm.org/D15076 llvm-svn: 254873	2015-12-06 11:35:18 +00:00
Dan Gohman	d85c3b1fbc	[WebAssembly] Don't perform the returned-argument optimization on constants. llvm-svn: 254866	2015-12-05 22:12:39 +00:00
Dan Gohman	e2a7a8278f	[WebAssembly] Implement direct calls to external symbols. llvm-svn: 254863	2015-12-05 20:41:36 +00:00
Sanjay Patel	f413410f55	Add vector fmaxnum tests that correspond to the existing fminnum tests Note: missing 256-bit tests for min and max should also be added. llvm-svn: 254862	2015-12-05 20:27:10 +00:00
Dan Gohman	284384b640	[WebAssembly] Support inline asm constraints of type i16 and similar. llvm-svn: 254861	2015-12-05 20:03:44 +00:00
Sanjay Patel	1c7692b881	fix typo; NFC llvm-svn: 254860	2015-12-05 19:54:59 +00:00
Simon Pilgrim	4ba5969224	[X86][ADX] Added memory folding patterns and stack folding tests llvm-svn: 254844	2015-12-05 07:27:50 +00:00
Simon Pilgrim	5a64d98303	[X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions Both were defaulting to the float domain - now matches the packed instructions. llvm-svn: 254841	2015-12-05 07:07:42 +00:00
Cong Hou	833fe143f5	Normalize successors' probabilities when building MBBs for jump table. llvm-svn: 254837	2015-12-05 05:00:55 +00:00
Dan Gohman	f0b165a7f8	[WebAssembly] Implement ReverseBranchCondition, and re-enable MachineBlockPlacement This patch introduces a codegen-only instruction currently named br_unless, which makes it convenient to implement ReverseBranchCondition and re-enable the MachineBlockPlacement pass. Then in a late pass, it lowers br_unless back into br_if. Differential Revision: http://reviews.llvm.org/D14995 llvm-svn: 254826	2015-12-05 03:03:35 +00:00
Dan Gohman	4da4abd87f	[WebAssembly] Fix scheduling dependencies in register-stackified code Add physical register defs to instructions used from stackified instructions to prevent them from being scheduled into the middle of a stack sequence. This is a conservative measure which may be loosened in the future. Differential Revision: http://reviews.llvm.org/D15252 llvm-svn: 254811	2015-12-05 00:51:40 +00:00
Derek Schuff	9d77952332	[WebAssembly] Support constant offsets on loads and stores This is just a prototype for load/store for i32 types. I'll add them to the rest of the types if we like this direction. Differential Revision: http://reviews.llvm.org/D15197 llvm-svn: 254807	2015-12-05 00:26:39 +00:00
Dan Gohman	35bfb24c28	[WebAssembly] Initial varargs support. Full varargs support will depend on prologue/epilogue support, but this patch gets us started with most of the basic infrastructure. Differential Revision: http://reviews.llvm.org/D15231 llvm-svn: 254799	2015-12-04 23:22:35 +00:00
Hans Wennborg	5000ce8a63	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793	2015-12-04 23:00:33 +00:00
Chad Rosier	f3491496dc	[AArch64] Expand vector SDIVREM/UDIVREM operations. http://reviews.llvm.org/D15214 Patch by Ana Pazos <apazos@codeaurora.org>! llvm-svn: 254773	2015-12-04 21:38:44 +00:00
Manman Ren	19c7bbe3b7	[CXX TLS calling convention] Add CXX TLS calling convention. This commit adds a new target-independent calling convention for C++ TLS access functions. It aims to minimize overhead in the caller by preserving as many registers as possible. The target-specific implementation for X86-64 is defined as follows: arguments are passed as for the default C calling convention; the same applies for the return value(s); the callee preserves all GPRs except RAX and RDI. The access function makes C-style TLS function calls in the entry and exit block; C-style TLS functions save a lot more registers than normal calls. The added calling convention ties into the existing implementation of the C-style TLS functions, so we can't simply use existing calling conventions such as preserve_mostcc. rdar://9001553 llvm-svn: 254737	2015-12-04 17:40:13 +00:00
Alexey Bataev	7cf324772f	LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz. Differential Revision: http://reviews.llvm.org/D13294 llvm-svn: 254712	2015-12-04 10:53:15 +00:00

14391 Commits