llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	ea81edf351	[x86] enable machine combiner reassociations for scalar double-precision adds llvm-svn: 241871	2015-07-09 22:48:54 +00:00
Reid Kleckner	8eecb3c160	[WinEH] Give up on using CSRs across 32-bit invokes for now The runtime does not restore CSRs when transferring control back to the function handling the exception. According to the experts on IRC, LLVM's register allocator has no way to model register clobbers that only happen on one edge of the CFG. For now, don't worry about trying to use the meager three CSRs available on 32-bit X86 and just say that such invokes preserve nothing. llvm-svn: 241865	2015-07-09 22:09:41 +00:00
Tom Stellard	dcb9f0907f	AMDGPU: Add helper function for implicit parameter offsets. Patch by: Zoltan Gilian llvm-svn: 241861	2015-07-09 21:20:37 +00:00
JF Bastien	b379643f7c	Unbreak WebAssembly build Summary: D11021 and D11045 didn't update the WebAssembly target's code. It's still experimental so all tests passed. Reviewers: sunfish, joker.eph, echristo Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11084 llvm-svn: 241859	2015-07-09 21:00:09 +00:00
Matt Arsenault	8b03e6c164	AMDGPU/R600: Return correct chain when lowering loads The other LowerLOAD should be returning the correct chain. llvm-svn: 241839	2015-07-09 18:47:03 +00:00
Pat Gavlin	a717f255b6	Allow {e,r}bp as the target of {read,write}_register. This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827	2015-07-09 17:40:29 +00:00
Tom Stellard	ab6e9c0f94	AMDGPU/SI: The SIShrinkInstructions pass should only fold immediates with one use This is convered by existing testcases and will be exposed by a future commit. llvm-svn: 241817	2015-07-09 16:30:36 +00:00
Tom Stellard	9ebf7ca2f0	AMDGPU/SI: Fix crash on physical registers in SIInstrInfo::isOperandLegal() No test case for this. I ran into it while working on some improvements to SIShrinkInstructions.cpp. llvm-svn: 241816	2015-07-09 16:30:27 +00:00
Krzysztof Parzyszek	8b26fbf758	[Hexagon] Add missing preamble to a source file llvm-svn: 241813	2015-07-09 15:40:25 +00:00
Mehdi Amini	eaabc51e78	Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user A documentation for this function would be nice by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807	2015-07-09 15:12:23 +00:00
Pawel Bylica	d1b818bcf4	Reapply fixed r241790: Fix shift legalization and lowering for big constants. Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt. Reviewers: nadav, majnemer, sanjoy, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: http://reviews.llvm.org/D10767 llvm-svn: 241806	2015-07-09 14:58:04 +00:00
Krzysztof Parzyszek	feaf7b8d35	[Hexagon] Add support for atomic RMW operations llvm-svn: 241804	2015-07-09 14:51:21 +00:00
Arnaud A. de Grandmaison	f40f99e3a4	[AArch64] Select SBFIZ or UBFIZ instead of left + right shifts And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology. llvm-svn: 241803	2015-07-09 14:33:38 +00:00
Scott Douglass	8143bc25ee	[ARM] Thumb1 3 to 2 operand convertion for commutative operations Differential Revision: http://reviews.llvm.org/D11057 llvm-svn: 241802	2015-07-09 14:13:55 +00:00
Scott Douglass	2740a63725	[ARM] Don't be overzealous converting Thumb1 3 to 2 operands Differential Revision: http://reviews.llvm.org/D11056 llvm-svn: 241801	2015-07-09 14:13:48 +00:00
Scott Douglass	47a3fce461	[ARM] Add Thumb2 ADD with PC narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11055 llvm-svn: 241800	2015-07-09 14:13:41 +00:00
Scott Douglass	8c7803f4c1	[ARM] Refactor converting Thumb1 from 3 to 2 operand (nfc) Also adds some test cases. Differential Revision: http://reviews.llvm.org/D11054 llvm-svn: 241799	2015-07-09 14:13:34 +00:00
Renato Golin	17d4efe7c1	Add support for nest attribute to AArch64 backend The nest attribute is currently supported on the x86 (32-bit) and x86-64 backends, but not on ARM (32-bit) or AArch64. This patch adds support for nest to the AArch64 backend. Register x18 is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. Patch by Stephen Cross. llvm-svn: 241794	2015-07-09 10:18:02 +00:00
Pawel Bylica	627762fda5	Revert r241790: Fix shift legalization and lowering for big constants. llvm-svn: 241792	2015-07-09 09:50:54 +00:00
Pawel Bylica	eb122f2baf	Fix shift legalization and lowering for big constants. Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt. Reviewers: nadav, majnemer, sanjoy, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: http://reviews.llvm.org/D10767 llvm-svn: 241790	2015-07-09 08:01:36 +00:00
Mehdi Amini	157e5a6d10	Remove getDataLayout() from TargetSelectionDAGInfo (had no users) Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780	2015-07-09 02:10:08 +00:00
Mehdi Amini	a749f2ad47	Remove getDataLayout() from TargetLowering Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779	2015-07-09 02:09:52 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	5c183d5239	Make getByValTypeAlignment() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11038 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241777	2015-07-09 02:09:28 +00:00
Mehdi Amini	9639d650bb	Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776	2015-07-09 02:09:20 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	5010ebf181	Make TargetTransformInfo keeping a reference to the Module DataLayout DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774	2015-07-09 02:08:42 +00:00
Mehdi Amini	56228dabfa	Redirect DataLayout from TargetMachine to Module in ComputeValueVTs() Summary: Avoid using the TargetMachine owned DataLayout and use the Module owned one instead. This requires passing the DataLayout up the stack to ComputeValueVTs(). This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11019 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241773	2015-07-09 01:57:34 +00:00
Sanjay Patel	093fb170a6	[x86] enable machine combiner reassociations for scalar single-precision multiplies llvm-svn: 241752	2015-07-08 22:35:20 +00:00
Diego Novillo	13e20f1bbf	Add missing dependency to Hexagon target. A recent patch added calls to isInstructionTriviallyDead without the corresponding dependency on TransformUtils. llvm-svn: 241731	2015-07-08 21:13:37 +00:00
Reid Kleckner	4f21df2b96	[Win64] Only treat some functions as having the Win64 convention All the usual X86 target-specific conventions are collapsed to the normal Win64 convention, but the custom conventions like GHC and webkit should not be. Previously we would assume that the caller allocated 32 bytes of shadow space for us, which is not how webkit_jscc or other custom conventions are supposed to work. Based on a patch by peavo@outlook.com. Fixes PR24051. llvm-svn: 241725	2015-07-08 21:03:47 +00:00
Krzysztof Parzyszek	79b2433e7c	[Hexagon] Implement commoning of GetElementPtr instructions llvm-svn: 241714	2015-07-08 19:22:28 +00:00
Reid Kleckner	ed012dbf2a	[SEH] Ensure that empty __except blocks have their own BB The 32-bit lowering assumed that WinEHPrepare had this invariant. WinEHPrepare did it for C++, but not SEH. The result was that we would insert calls to llvm.x86.seh.restoreframe in normal basic blocks, which corrupted the frame pointer. llvm-svn: 241699	2015-07-08 18:08:52 +00:00
Duncan P. N. Exon Smith	ad98745561	MC: Constify MCSubtargetInfo in getDeprecationInfo(), NFC There's no reason to be able to mutate `MCSubtargetInfo` in `getDeprecationInfo()`. Constify the reference. llvm-svn: 241693	2015-07-08 17:30:55 +00:00
Eli Bendersky	8e131f8cbc	Cosmetic cleanups - NFC Remove commented lines, trailing whitespace, etc. llvm-svn: 241687	2015-07-08 16:33:21 +00:00
James Y Knight	f238d176eb	[SPARC] Cleanup handling of the Y/ASR registers. - Implement copying ASR to/from GPR regs. - Mark ASRs as non-allocatable, so it won't try to arbitrarily use them inappropriately. - Instead of inserting explicit WRASR/RDASR nodes in the MUL/DIV routines, just do normal register copies. - Also...mark div as using Y, not just writing it. Added a test case with some code which previously died with an assertion failure (with -O0), or produced wrong code (otherwise). (Third time's the charm?) Differential Revision: http://reviews.llvm.org/D10401 llvm-svn: 241686	2015-07-08 16:25:12 +00:00
Krzysztof Parzyszek	21b53a5120	[Hexagon] Generate "insert" instructions more aggressively llvm-svn: 241683	2015-07-08 14:47:34 +00:00
Krzysztof Parzyszek	d19b4767ff	Revert 241681: causes Windows builds to fail llvm-svn: 241682	2015-07-08 14:34:13 +00:00
Krzysztof Parzyszek	712b15b45e	[Hexagon] Generate "insert" instructions more aggressively llvm-svn: 241681	2015-07-08 14:22:27 +00:00
Simon Pilgrim	752de5dff2	[X86][SSE] Added (V)ROUNDSD + (V)ROUNDSS stack folding support llvm-svn: 241671	2015-07-08 08:07:57 +00:00
Mehdi Amini	ffc1402fad	Remove IsLittleEndian from TargetLowering and redirect to DataLayout Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11017 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241655	2015-07-08 01:00:38 +00:00
Reid Kleckner	e69bdb8619	[WinEH] Make llvm.x86.seh.restoreframe work for stack realignment prologues The incoming EBP value points to the end of a local stack allocation, so we can use that to restore ESI, the base pointer. Once we do that, we can use local stack allocations. If we know we need stack realignment, spill the original frame pointer in the prologue and reload it after restoring ESI. llvm-svn: 241648	2015-07-07 23:45:58 +00:00
Reid Kleckner	d5afc62ff6	[WinEH] Add localaddress intrinsic instead of using frameaddress Clang uses this for SEH finally. The new intrinsic will produce the right value when stack realignment is required. llvm-svn: 241643	2015-07-07 23:23:03 +00:00
Arnold Schwaighofer	3d43f66c91	Add more nvcasts Tim Northover has told me that they can occur when the compiler cleverly constructs constants - as demonstrated in the test case. rdar://21703486 llvm-svn: 241641	2015-07-07 23:13:18 +00:00
Dan Gohman	489abd7046	[WebAssembly] Set the scheduling preference. llvm-svn: 241637	2015-07-07 22:38:06 +00:00
Reid Kleckner	60381791b5	Rename llvm.frameescape and llvm.framerecover to localescape and localrecover Summary: Initially, these intrinsics seemed like part of a family of "frame" related intrinsics, but now I think that's more confusing than helpful. Initially, the LangRef specified that this would create a new kind of allocation that would be allocated at a fixed offset from the frame pointer (EBP/RBP). We ended up dropping that design, and leaving the stack frame layout alone. These intrinsics are really about sharing local stack allocations, not frame pointers. I intend to go further and add an `llvm.localaddress()` intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being used to address locals, which should not be confused with the frame pointer. Naming suggestions at this point are welcome, I'm happy to re-run sed. Reviewers: majnemer, nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11011 llvm-svn: 241633	2015-07-07 22:25:32 +00:00
Sanjay Patel	d4e1bb89e3	fix typo; NFC llvm-svn: 241629	2015-07-07 21:31:54 +00:00
Arnold Schwaighofer	4bc34b1515	Add a pattern for a nvcast from v2f64 -> v4f32 Since the NvCast is generated by the selection process the concerns about endianess and bit reversal don't apply. rdar://21703486 llvm-svn: 241611	2015-07-07 18:31:55 +00:00
Reid Kleckner	af04c2a972	Use default member initializers to deduplicate code in X86MachineFunctionInfo, NFC llvm-svn: 241609	2015-07-07 18:12:06 +00:00
Krzysztof Parzyszek	a45971ac94	[Hexagon] Fix unused variable warnings in NDEBUG build caused by r241595 llvm-svn: 241600	2015-07-07 16:02:11 +00:00
Reid Kleckner	9200b2f93b	[WinEH] Add a report_fatal_error for 32-bit stack realignment This type of prologue isn't supported yet. Implementing it should be a matter of copying the adjusted incoming EBP into ESI (the base pointer) instead of EBP. The original EBP can be saved and restored from other memory afterwards. llvm-svn: 241597	2015-07-07 15:47:29 +00:00
Krzysztof Parzyszek	e53b31a593	[Hexagon] Implement bit-tracking facility with specifics for Hexagon This includes code that is intended to be target-independent as well as the Hexagon-specific details. This is just the framework without any users. llvm-svn: 241595	2015-07-07 15:16:42 +00:00
Sanjay Patel	cf0a80728c	use range-based for loops; NFCI llvm-svn: 241592	2015-07-07 15:03:53 +00:00
Denis Protivensky	b612902faa	Fix gcc warnings of different enum and non-enum types in ternaries llvm-svn: 241567	2015-07-07 07:48:48 +00:00
Akira Hatanaka	1bc8af78f4	[ARM] Define a subtarget feature and use it to decide whether long calls should be emitted. This is needed to enable ARM long calls for LTO and enable and disable it on a per-function basis. Out-of-tree projects currently using EnableARMLongCalls to emit long calls should start passing "+long-calls" to the feature string (see the changes made to clang in r241565). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D9364 llvm-svn: 241566	2015-07-07 06:54:42 +00:00
Simon Pilgrim	40343e6b3a	[X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments. As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well. Differential Revision: http://reviews.llvm.org/D10593 llvm-svn: 241516	2015-07-06 22:46:46 +00:00
Sanjay Patel	681a56ac58	[x86] extend machine combiner reassociation optimization to SSE scalar adds Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460) to SSE scalar FP SP adds in addition to AVX scalar FP SP adds. With the 'switch' in place, we can trivially add other opcodes and test cases in future patches. Differential Revision: http://reviews.llvm.org/D10975 llvm-svn: 241515	2015-07-06 22:35:29 +00:00
Simon Pilgrim	8fbf1c1f4a	[X86][SSE] Vectorized i64 uniform constant SRA shifts This patch adds vectorization support for uniform constant i64 arithmetic shift right operators. Differential Revision: http://reviews.llvm.org/D9645 llvm-svn: 241514	2015-07-06 22:35:19 +00:00
JF Bastien	86bc91508d	WebAssembly: add some TODO Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D10971 llvm-svn: 241513	2015-07-06 21:41:59 +00:00
Simon Pilgrim	d85cae3d52	[X86][SSE4A] Shuffle lowering using SSE4A EXTRQ/INSERTQ instructions This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it. As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions. From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level. Differential Revision: http://reviews.llvm.org/D10146 llvm-svn: 241508	2015-07-06 20:46:41 +00:00
Simon Pilgrim	8b756596fc	[X86][SSE] Use the general SMAX/SMIN/UMAX/UMIN opcodes and remove the X86 implementation With the completion of D9746 there is now a common implementation of integer signed/unsigned min/max nodes, removing the need for the equivalent X86 specific implementations. This patch removes the old X86ISD nodes, legalizes the relevant SSE2/SSE41/AVX2/AVX512 instructions for the ISD versions and converts the small amount of existing X86 code. Differential Revision: http://reviews.llvm.org/D10947 llvm-svn: 241506	2015-07-06 20:30:47 +00:00
Alex Lorenz	e2d75239d1	llc: Add a 'run-pass' option. This commit adds a 'run-pass' option to llc, which instructs the compiler to run one specific code generation pass only. Llc already has the 'start-after' and the 'stop-after' options, and this new option complements the other two by making it easier to write tests that want to invoke a single pass only. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D10776 llvm-svn: 241476	2015-07-06 17:44:26 +00:00
Matt Arsenault	db7781c6e9	AMDGPU: Run SIInsertWaits as pre-emit pass Running this after the scheduler enables scheduling waits later so other ALU instructions can run while this would be waiting. When combined with enabling the post-RA scheduler, this gives about a ~20% improvement on sgemm. llvm-svn: 241473	2015-07-06 17:02:20 +00:00
Daniel Sanders	f423f5627c	Change the last few internal StringRef triples into Triple objects. Summary: This concludes the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. At this point, the StringRef-form of GNU Triples should only be used in the public API (including IR serialization) and a couple objects that directly interact with the API (most notably the Module class). The next step is to replace these Triple objects with the TargetTuple object that will represent our authoratative/unambiguous internal equivalent to GNU Triples. Reviewers: rengolin Subscribers: llvm-commits, jholewinski, ted, rengolin Differential Revision: http://reviews.llvm.org/D10962 llvm-svn: 241472	2015-07-06 16:56:07 +00:00
Daniel Sanders	fbdab437f0	Where Triple has a suitable predicate, use it rather than the enum values. NFC. Reviewers: mcrosier Subscribers: llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10960 llvm-svn: 241469	2015-07-06 16:33:18 +00:00
Matt Arsenault	706f930b72	AMDGPU/SI: Add debugging subtarget feature for DS offsets We don't have a good way to detect most situations where DS offsets are usable on SI, so add an option to force using them even if unsafe for debugging performance problems. llvm-svn: 241462	2015-07-06 16:01:58 +00:00
James Y Knight	89ac11de32	[Sparc] Add more instruction aliases. These are mostly from the chart in the SparcV8 spec, section "A.3 Synthetic Instructions". Differential Revision: http://reviews.llvm.org/D9834 llvm-svn: 241461	2015-07-06 16:01:07 +00:00
James Y Knight	7208a12eef	[Sparc] Add support for flush instruction. Differential Revision: http://reviews.llvm.org/D9833 llvm-svn: 241460	2015-07-06 16:01:04 +00:00
Chad Rosier	85a346395e	Fix a bug in the A57FPLoadBalancing register tracking/scavenger. The code in AArch64A57FPLoadBalancing::scavengeRegister() to handle dead defs was not correctly handling aliased registers. E.g. if the dead def was of D2, then S2 was not being marked as unavailable, so it could potentially be used across a live-range in which it would be clobbered. Patch by Geoff Berry <gberry@codeaurora.org>! Phabricator: http://reviews.llvm.org/D10900 llvm-svn: 241449	2015-07-06 14:46:34 +00:00
Asaf Badouh	c6f3c82ffc	[X86][AVX512] Multiply Packed Unsigned Integers with Round and Scale pmulhrsw review: http://reviews.llvm.org/D10948 llvm-svn: 241443	2015-07-06 14:03:40 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Benjamin Kramer	9bfb627a0e	[TargetLowering] StringRefize asm constraint getters. There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411	2015-07-05 19:29:18 +00:00
Asaf Badouh	73f26f8ffc	[x86][AVX512] add Multiply High Op include encoding and intrinsics tests. review http://reviews.llvm.org/D10896 llvm-svn: 241406	2015-07-05 12:23:20 +00:00
Michael Kuperstein	5f05153fbb	[X86] Fix incorrect/inefficient pushw encodings for x86-64 targets Correctly support assembling "pushw $imm8" on x86-64 targets. Also some cleanup of the PUSH instructions (PUSH64i16 and PUSHi16 actually represent the same instruction) This fixes PR23996 Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10878 llvm-svn: 241404	2015-07-05 10:25:41 +00:00
Nemanja Ivanovic	d358b8f80d	Add missing builtins to the PPC back end for ABI compliance (vol. 2) This patch corresponds to review: http://reviews.llvm.org/D10874 Back end portion of the second round of additions to altivec.h. llvm-svn: 241398	2015-07-05 06:03:51 +00:00
Simon Pilgrim	ea1b6ee366	[X86][SSE] Improved i8/i16 to f64 uint2fp vector conversions Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp). llvm-svn: 241394	2015-07-04 15:33:34 +00:00
Craig Topper	de8395229a	[X86] Add proper 64-bit mode checks to jrcxz and jcxz. llvm-svn: 241381	2015-07-04 00:01:07 +00:00
Matt Arsenault	24e33d10a0	AMDGPU: Fix indentation of switch llvm-svn: 241380	2015-07-03 23:33:38 +00:00
Rafael Espindola	ed067c45d4	Return ErrorOr from getSymbolAddress. It can fail trying to get the section on ELF and COFF. This makes sure the error is handled. llvm-svn: 241366	2015-07-03 18:19:00 +00:00
Rafael Espindola	e2df87f24b	Replace a few more MachO only uses of getSymbolAddress. llvm-svn: 241365	2015-07-03 18:02:36 +00:00
Simon Pilgrim	b504263e4a	[X86][SSE] Sign extension for target vector sizes less than 128 bits (pt2) Add support for v2i8/v2i16 to v2f64 by using a sign extension to v2i32 before conversion to v2f64. Differential Revision: http://reviews.llvm.org/D10589 llvm-svn: 241325	2015-07-03 08:01:36 +00:00
Simon Pilgrim	385bf00ea2	[X86][SSE] Sign extension for target vector sizes less than 128 bits (pt1) This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector. Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64. llvm-svn: 241323	2015-07-03 07:51:01 +00:00
Dan Gohman	bfaf7e15d1	[WebAssembly] Set the HasFloatingPointExceptions flag for WebAssembly. llvm-svn: 241302	2015-07-02 21:36:25 +00:00
Rafael Espindola	5d0c2ffadf	Return ErrorOr from SymbolRef::getName. This function can really fail since the string table offset can be out of bounds. Using ErrorOr makes sure the error is checked. Hopefully a lot of the boilerplate code in tools/* can go away once we have a diagnostic manager in Object. llvm-svn: 241297	2015-07-02 20:55:21 +00:00
Bill Schmidt	a1c30053e7	[PPC64LE] Remove implicit-subreg restriction from VSX swap removal In r241285, I removed the SUBREG_TO_REG restriction from VSX swap removal, determining that this was overly conservative. We have another form of the same restriction in that we check for the presence of implicit subregs in vector operations. As with SUBREG_TO_REG for partial register conversions, an implicit subreg is safe in and of itself, provided no other operation makes a lane-sensitive assumption about the result. This patch removes that restriction, by removing the HasImplicitSubreg flag and all code that relies on it. I've added a test case that fails to optimize before this patch is applied, and optimizes properly with the patch. Test based on a report from Anton Blanchard. llvm-svn: 241290	2015-07-02 19:01:22 +00:00
Bill Schmidt	7c691fee1c	[PPC64LE] Teach swap optimization about the doubleword splat idiom With a previous patch, the VSX swap optimization is able to recognize the doubleword load-splat idiom that can be implemented using lxvdsx. However, that does not cover a doubleword splat where the source is a register. We can implement this using xxspltd (a special form of xxpermdi). This patch teaches the swap optimization pass about this idiom. As a prerequisite, it also permits swap optimization to succeed for all forms of SUBREG_TO_REG. Previously we were conservative and only allowed SUBREG_TO_REG when it copied a full register. However, on reflection any form of SUBREG_TO_REG is safe in and of itself, so long as an unsafe operation is not performed on its result. In particular, a widening SUBREG_TO_REG often occurs as an input to a doubleword splat idiom, particularly in auto-vectorized code. The doubleword splat idiom is an XXPERMDI operation where both source registers are identical, and the selection mask is either 0 (splat the first element) or 3 (splat the second element). To determine whether the registers are identical, we use the existing mechanism for looking through "copy-like" operations. That mechanism has a side effect of marking the XXPERMDI operation as using a physical register, which would invalidate its presence in a swap-optimized region. This is correct for the form of XXPERMDI that performs a swap and hence would be removed, but is not what we want for a doubleword-splat variety of XXPERMDI. Therefore we reset the physical-register flag on the XXPERMDI when it represents a splat. A simple test case is added to verify that we generate the splat and that we also remove the xxswapd instructions that would otherwise be associated with the load and store of another operand. llvm-svn: 241285	2015-07-02 17:03:06 +00:00
Eric Christopher	e100226879	Implement TargetTransformInfo::hasCompatibleFunctionAttributes for X86. This checks subtarget feature compatibility for inlining by verifying that the callee is a strict subset of the caller's features. This includes the cpu as part of the subtarget we can get via the incoming functions as the backend takes CPUs as feature sets. This allows us to inline things like: int foo() { return baz(); } int __attribute__((target("sse4.2"))) bar() { return foo(); } so that generic code can be inlined into specialized functions. llvm-svn: 241221	2015-07-02 01:11:50 +00:00
JF Bastien	03855df197	WebAssembly: start instructions Summary: * Add 64-bit address space feature. * Rename SIMD feature to SIMD128. * Handle single-thread model with an IR pass (same way ARM does). * Rename generic processor to MVP, to follow design's lead. * Add bleeding-edge processors, with all features included. * Fix a few DEBUG_TYPE to match other backends. Test Plan: ninja check Reviewers: sunfish Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D10880 llvm-svn: 241211	2015-07-01 23:41:25 +00:00
Dan Gohman	d82494bb75	[WebAssembly] Define separate Target instances for 32-bit and 64-bit. llvm-svn: 241193	2015-07-01 21:42:34 +00:00
Jingyue Wu	a0a56601c0	[NVPTX] expand extload/truncstore for vectors of floats Summary: According to PTX ISA: For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type. So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of. As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like ld.v2.f32 {%fd3, %fd4}, [%rd2] This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated. Patched by Gang Hu. Test Plan: Test case attached. Reviewers: jingyue Reviewed By: jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D10876 llvm-svn: 241191	2015-07-01 21:32:42 +00:00
Jingyue Wu	77b5b385ee	[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass Summary: Offset of frame index is calculated by NVPTXPrologEpilogPass. Before that the correct offset of stack objects cannot be obtained, which leads to wrong offset if there are more than 2 frame objects. This patch move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check VRFrame register instead, and try to remove the VRFrame if there is no usage after NVPTXPeephole pass. Patched by Xuetian Weng. Test Plan: Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the offset calculation based on SP and SPL. Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10853 llvm-svn: 241185	2015-07-01 20:08:06 +00:00
Bill Schmidt	ae94f11d55	[PPC64LE] Enable missing lxvdsx optimization, and related swap optimization When adding little-endian vector support for PowerPC last year, I inadvertently disabled an optimization that recognizes a load-splat idiom and generates the lxvdsx instruction. This patch moves the offending logic so lxvdsx is once again generated. This pattern is frequently generated by the vectorizer for scalar loads of an effective constant. Previously the lxvdsx instruction was wrongly listed as lane-sensitive for the VSX swap optimization (since both doublewords are identical, swaps are safe). This patch fixes this as well, so that vectorized code using lxvdsx can now have swaps removed from the computation. There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll that checks for the missing optimization. However, vsx.ll was only being tested for POWER7 with big-endian code generation. I've added a little-endian RUN statement and expected LE code generation for all the tests in vsx.ll to give us a bit better VSX coverage, including what's needed for this patch. llvm-svn: 241183	2015-07-01 19:40:07 +00:00
Sanjay Patel	910d5daa4b	fix formatting; NFC llvm-svn: 241175	2015-07-01 17:58:53 +00:00
Sanjay Patel	e4d95c6c9a	fix typos in comment; NFC llvm-svn: 241174	2015-07-01 17:55:07 +00:00
Reid Kleckner	f80636682c	[SEH] Don't assert if the parent function lacks a personality The EH code might have been deleted as unreachable and the personality pruned while the filter is still present. Currently I'm hitting this at -O0 due to the clang bug PR24009. llvm-svn: 241170	2015-07-01 16:45:47 +00:00
Arnaud A. de Grandmaison	650c520007	[AArch64] Implement add/adds/sub/subs/cmp/cmn with negative immediate aliases This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn with a negative immediate operand and convert them as shown: add Rd, Rn, -imm -> sub Rd, Rn, imm sub Rd, Rn, -imm -> add Rd, Rn, imm adds Rd, Rn, -imm -> subs Rd, Rn, imm subs Rd, Rn, -imm -> adds Rd, Rn, imm cmp Rn, -imm -> cmn Rn, imm cmn Rn, -imm -> cmp Rn, imm Those instructions are an alternate syntax available to assembly coders, and are needed in order to support code already compiling with some other assemblers (gas). They are documented in the "ARMv8 Instruction Set Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc a programmer-friendly assembler ! This also fixes PR20978: "Assembly handling of adding negative numbers not as smart as gas". llvm-svn: 241166	2015-07-01 15:05:58 +00:00
James Y Knight	a8a8c605ee	[Sparc] Rearrange SparcInstrInfo, no change. Move some instructions into order of sections in the spec, as the rest already were. Differential Revision: http://reviews.llvm.org/D9102 llvm-svn: 241163	2015-07-01 14:38:07 +00:00
Igor Breger	15820b072b	AVX-512: Implemented missing encoding for FMA scalar instructions Added tests for encoding Differential Revision: http://reviews.llvm.org/D10865 llvm-svn: 241159	2015-07-01 13:24:28 +00:00
Michael Kuperstein	21a3c18443	[X86] Avoid over-relaxation of 8-bit immediates in integer arithmetic instructions. Only consider an instruction a candidate for relaxation if the last operand of the instruction is an expression. We previously checked whether any operand is an expression, which is useless, since for all instructions concerned, the only operand that may be affected by relaxation is the last one. In addition, this removes the check for having RIP as an argument, since it was plain wrong - even when one of the arguments is RIP, relaxation may still be needed. This fixes PR9807. Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10766 llvm-svn: 241152	2015-07-01 10:54:42 +00:00
Zoran Jovanovic	2a47d08afd	[mips][microMIPS] Implement SLL and NOP instructions http://reviews.llvm.org/D10474 llvm-svn: 241150	2015-07-01 09:54:51 +00:00

1 2 3 4 5 ...

33623 Commits