llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	08de833c1c	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-neglible binary size increase. It might be worth looking into some more. SimplifyCFG currently does this transformation, but I'm working towards changing that so we can optimize harder based on unreachable defaults. Differential Revision: http://reviews.llvm.org/D6510 llvm-svn: 223566	2014-12-06 01:28:50 +00:00
Eric Christopher	d1fb7e4590	These two calls were grabbing the same register info. Unify them. llvm-svn: 223502	2014-12-05 19:23:55 +00:00
Ahmed Bougacha	55e3c2d9cf	[CodeGenPrepare] Use variables for reused values. NFC. llvm-svn: 223491	2014-12-05 18:04:40 +00:00
Hal Finkel	66d7791176	Revert "r223440 - Consider subregs when calling MI::registerDefIsDead for phys deps" Reverting this because, while it fixes the problem in the reduced test case, it does not fix the problem in the full test case from the bug report. llvm-svn: 223442	2014-12-05 02:07:35 +00:00
Hal Finkel	d013d99fe0	Consider subregs when calling MI::registerDefIsDead for phys deps The scheduling dependency graph is built bottom-up within each scheduling region, and ScheduleDAGInstrs::addPhysRegDeps is called to add output/anti dependencies, based on physical registers, to the SUs for instructions based on those that come before them. In the test case, we start before post-RA scheduling with a block that looks like this: ... INLINEASM <... andc $0,$0,$2 stdcx. $0,0,$3 bne- 1b > [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef-ec:G8RC], %X6<earlyclobber,def,dead>, $1:[mem], %X3<kill>, $2:[reguse:G8RC], %X5<kill>, $3:[reguse:G8RC], %X3, $4:[mem], %X3, $5:[clobber], %CC<earlyclobber,imp-def,dead>, <<badref>> ... %X4<def,dead> = ANDIo8 %X4<kill>, 1, %CR0<imp-def,dead>, %CR0GT<imp-def> ... %R29<def> = ISEL %R3<undef>, %R4<kill>, %CR0GT<kill> where it is relevant that %CC is an alias to %CR0, and that %CR0GT is a subregister of %CR0. However, for post-RA scheduling, no dependency was added to prevent the INLINEASM from being scheduled in between the ANDIo8 and the ISEL (which communicate via the %CR0GT register). In ScheduleDAGInstrs::addPhysRegDeps, when called for the %CC operand, we'd iterate over all of its aliases (which include %CC itself and also %CR0), and look for previously-encountered defs of those registers. We'd find the ANDIo8, but decide not to add a dependency between the INLINEASM and the ANDIo8 because both the INLINEASM's def of %CC is dead, and also the ANDIo8 def of %CR0 is dead. This ignores, however, that ANDIo8 has a non-dead def of %CR0GT, a subregister of %CR0, and thus a dependency still must exist. To fix this problem, when calling registerDefIsDead on the SU with the def, we also check all subregisters for possible non-dead defs, and add the dependency if any are found. Fixes PR21742. llvm-svn: 223440	2014-12-05 01:57:22 +00:00
Adrian Prantl	ab255fcd09	Cleanup: Calls to getDwarfRegNum() may actually fail, if there is no DWARF register number mapping, or if the register was a virtual register that was never materialized. Previously, we would just emit a bogus location, after this patch we don't emit a location at all by doing an early exit. After my bugfix in r223401 today, this doesn't actually happen on any target that I tested this with, but it's still preferable to make the possibility of a failure explicit. llvm-svn: 223428	2014-12-05 01:02:46 +00:00
Adrian Prantl	da7e03f1bf	Simplify implementation and testcase of r223401 based on feedback from dblaikie. llvm-svn: 223405	2014-12-04 22:58:41 +00:00
Adrian Prantl	a3ae0b3b5b	Debug info: If the RegisterCoalescer::reMaterializeTrivialDef() is eliminating all uses of a vreg, update any DBG_VALUE describing that vreg to point to the rematerialized register instead. llvm-svn: 223401	2014-12-04 22:29:04 +00:00
Patrik Hagglund	d06de4b954	Use DomTree in MachineSink to sink over diamonds. According to a previous FIXME comment we now not only look at MBB successors, but also handle code sinking past them: x = computation if () {} else {} use x The instruction could be sunk over the whole diamond for the if/then/else (or loop, etc), allowing it to be sunk into other blocks after that. Modified test added in r204522, due to one spill less present. Minor fixes in comments. Patch provided by Jonas Paulsson. Reviewed by Hal Finkel. llvm-svn: 223350	2014-12-04 10:36:42 +00:00
Simon Pilgrim	be24ab367b	[InstCombine] Minor optimization for bswap with binary ops Added instcombine optimizations for BSWAP with AND/OR/XOR ops: OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) ) OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) ) Since its just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well: fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y)) Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone. Differential Revision: http://reviews.llvm.org/D6407 llvm-svn: 223349	2014-12-04 09:44:01 +00:00
Elena Demikhovsky	f1de34b84d	Masked Load / Store Intrinsics - the CodeGen part. I'm recommiting the codegen part of the patch. The vectorizer part will be send to review again. Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 223348	2014-12-04 09:40:44 +00:00
Matt Arsenault	4e27343eec	Allow target to specify prefix for labels Use the MCAsmInfo instead of the DataLayout, and allow specifying a custom prefix for labels specifically. HSAIL requires that labels begin with @, but global symbols with &. llvm-svn: 223323	2014-12-04 00:06:57 +00:00
Quentin Colombet	079aba733a	[RegAllocFast] Handle implicit definitions conservatively. Prior to this commit, physical registers defined implicitly were considered free right after their definition, i.e.. like dead definitions. Therefore, their uses had to immediately follow their definitions, otherwise the related register may be reused to allocate a virtual register. This commit fixes this assumption by keeping implicit definitions alive until they are actually used. The downside is that if the implicit definition was dead (and not marked at such), we block an otherwise available register. This is however conservatively correct and makes the fast register allocator much more robust in particular regarding the scheduling of the instructions. Fixes PR21700. llvm-svn: 223317	2014-12-03 23:38:08 +00:00
Peter Collingbourne	51d2de7b9e	Prologue support Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are a two primary usecases that these attributes aim to serve, 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189	2014-12-03 02:08:38 +00:00
Hal Finkel	bbdee93638	[PowerPC] Implement readcyclecounter for PPC32 We've long supported readcyclecounter on PPC64, but it is easier there (the read of the 64-bit time-base register can be accomplished via a single instruction). This now provides an implementation for PPC32 as well. On PPC32, the time-base register is still 64 bits, but can only be read 32 bits at a time via two separate SPRs. The ISA manual explains how to do this properly (it involves re-reading the upper bits and looping if the counter has wrapped while being read). This requires PPC to implement a custom integer splitting legalization for the READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then gets turned into a pseudo-instruction, which is then expanded to the necessary sequence (which has three SPR reads, the comparison and the branch). Thanks to Paul Hargrove for pointing out to me that this was still unimplemented. llvm-svn: 223161	2014-12-02 22:01:00 +00:00
Philip Reames	72fbe7a6f0	Restructure some assertion checking based on post commit feedback by Aaron and Tom. llvm-svn: 223150	2014-12-02 21:01:48 +00:00
Philip Reames	f814a511da	Appease a build bot complaining about an unused variable that's used in an assertion. llvm-svn: 223142	2014-12-02 19:28:57 +00:00
Philip Reames	1a1bdb22bf	[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into it's own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straight forward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principal, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure. Reviewed by: atrick, ributzka llvm-svn: 223137	2014-12-02 18:50:36 +00:00
Ahmed Bougacha	54b7d334c7	[MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction. Go through implicit defs of CSMI and MI, and clear the kill flags on their uses in all the instructions between CSMI and MI. We might have made some of the kill flags redundant, consider: subs ... %NZCV<imp-def> <- CSMI csinc ... %NZCV<imp-use,kill> <- this kill flag isn't valid anymore subs ... %NZCV<imp-def> <- MI, to be eliminated csinc ... %NZCV<imp-use,kill> Since we eliminated MI, and reused a register imp-def'd by CSMI (here %NZCV), that register, if it was killed before MI, should have that kill flag removed, because it's lifetime was extended. Also, add an exhaustive testcase for the motivating example. Reviewed by: Juergen Ributzka <juergen@apple.com> llvm-svn: 223133	2014-12-02 18:09:51 +00:00
Philip Reames	0365f1a376	[Statepoints 2/4] Statepoint infrastructure for garbage collection: MI & x86-64 Backend This is the second patch in a small series. This patch contains the MachineInstruction and x86-64 backend pieces required to lower Statepoints. It does not include the code to actually generate the STATEPOINT machine instruction and as a result, the entire patch is currently dead code. I will be submitting the SelectionDAG parts within the next 24-48 hours. Since those pieces are by far the most complicated, I wanted to minimize the size of that patch. That patch will include the tests which exercise the functionality in this patch. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683. The STATEPOINT psuedo node is generated after all gc values are explicitly spilled to stack slots. The purpose of this node is to wrap an actual call instruction while recording the spill locations of the meta arguments used for garbage collection and other purposes. The STATEPOINT is modeled as modifing all of those locations to prevent backend optimizations from forwarding the value from before the STATEPOINT to after the STATEPOINT. (Doing so would break relocation semantics for collectors which wish to relocate roots.) The implementation of STATEPOINT is closely modeled on PATCHPOINT. Eventually, much of the code in this patch will be removed. The long term plan is to merge the functionality provided by statepoints and patchpoints. Merging their implementations in the backend is likely to be a good starting point. Reviewed by: atrick, ributzka llvm-svn: 223085	2014-12-01 22:52:56 +00:00
Ahmed Bougacha	fb6eeb74c5	[MachineVerifier] Accept a MBB with a single landing pad successor. The MachineVerifier used to check that there was always exactly one unconditional branch to a non-landingpad (normal) successor. If that normal successor to an invoke BB is unreachable, it seems reasonable to only have one successor, the landing pad. On targets other than AArch64 (and on AArch64 with a different testcase), the branch folder turns the branch to the landing pad into a fallthrough. The MachineVerifier, which relies on AnalyzeBranch, is unable to check the condition, and doesn't complain. However, it does in this specific testcase, where the branch to the landing pad remained. Make the MachineVerifier accept it. llvm-svn: 223059	2014-12-01 18:43:53 +00:00
Hans Wennborg	5bef5b522b	Revert r223049, r223050 and r223051 while investigating test failures. I didn't foresee affecting the Clang test suite :/ llvm-svn: 223054	2014-12-01 17:36:43 +00:00
Hans Wennborg	1571336fb2	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-neglible binary size increase. It might be worth looking into some more. llvm-svn: 223049	2014-12-01 17:08:32 +00:00
Akira Hatanaka	b9991a2656	[stack protector] Set edge weights for newly created basic blocks. This commit fixes a bug in stack protector pass where edge weights were not set when new basic blocks were added to lists of successor basic blocks. Differential Revision: http://reviews.llvm.org/D5766 llvm-svn: 222987	2014-12-01 04:27:03 +00:00
Hans Wennborg	6dfb041ffc	Switch lowering: reformat some for loops etc. NFC llvm-svn: 222962	2014-11-29 21:24:12 +00:00
Hans Wennborg	6c42d1a5de	Switch lowering: Fix broken 'Figure out which block is next' code This doesn't seem to have worked in a long time, but other optimizations would clean it up. llvm-svn: 222961	2014-11-29 21:17:05 +00:00
Simon Pilgrim	2bfd9129f4	Target triple OS detection tidyup. NFC Use Triple::isOS*() helpers where possible. llvm-svn: 222960	2014-11-29 19:18:21 +00:00
Duncan P. N. Exon Smith	9bc81fbe92	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
Elena Demikhovsky	6de72e5b0c	Converted back to Unix format (after my last commit 222632) llvm-svn: 222636	2014-11-23 15:21:53 +00:00
Elena Demikhovsky	9e5089a938	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Manman Ren	f0a582bada	Debug Info: revert r222195, r222210 and r222239. This is no longer needed after David's fix at r222377 + r222485. rdar://18958417 llvm-svn: 222563	2014-11-21 19:55:23 +00:00
Manman Ren	c98ec0e70a	[Objective-C] Support a new special module flag that will be put into the objc_imageinfo struct. rdar://17954668 llvm-svn: 222558	2014-11-21 19:24:55 +00:00
Sanjay Patel	eb4a4d5aeb	Don't repeat class/function/variable names in comments. NFC. llvm-svn: 222555	2014-11-21 18:58:38 +00:00
Sanjay Patel	b06441aded	Less space; NFC llvm-svn: 222546	2014-11-21 18:05:59 +00:00
Andrea Di Biagio	0225b5bf6f	[DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero. Before this patch, the DAGCombiner only tried to convert build_vector dag nodes into shuffles if all operands were either extract_vector_elt or undef. This patch improves that logic and teaches the DAGCombiner how to deal with build_vector dag nodes where one or more operands are zero. A build_vector dag node with some zero operands is turned into a shuffle only if the resulting shuffle mask is legal for the target. llvm-svn: 222536	2014-11-21 14:32:06 +00:00
Andrea Di Biagio	26e8f4d166	[DAG] Refactor the shuffle combining logic in DAGCombiner. NFC. This patch simplifies the logic that combines a pair of shuffle nodes into a single shuffle if there is a legal mask. Also added comments to better describe the algorithm. No functional change intended. llvm-svn: 222522	2014-11-21 11:33:07 +00:00
Hao Liu	44e5d7a131	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor info FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
Matthias Braun	745ea43687	RegisterCoalescer: Improve debug messages - Show "Considering..." message after flipping so you actually see the final destination vreg as destination. - Add a message on final join, so you can grep for "Success" messages to obtain a list of which register got merged with which. llvm-svn: 222382	2014-11-19 19:46:17 +00:00
Matthias Braun	d2f4c77800	Add a print and verify pass after the RegisterCoalescer llvm-svn: 222381	2014-11-19 19:46:15 +00:00
Matthias Braun	47760d9667	MachineVerifier: Report register for bad liveranges llvm-svn: 222380	2014-11-19 19:46:13 +00:00
Matthias Braun	9f87d75060	Introduce register dump helper llvm-svn: 222379	2014-11-19 19:46:11 +00:00
Simon Pilgrim	3ac3b251a9	[X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain. I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr. Differential Revision: http://reviews.llvm.org/D5699 llvm-svn: 222340	2014-11-19 10:06:49 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
David Blaikie	5106ce7897	Remove StringMap::GetOrCreateValue in favor of StringMap::insert Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already has) seems like it'll make the code easier to understand to anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency) Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc. llvm-svn: 222319	2014-11-19 05:49:42 +00:00
Owen Anderson	b5a259935c	Fix an incorrect chain operand when expanding INSERT_VECTOR operations through the stack. Patch by Daniil Troshkov! llvm-svn: 222254	2014-11-18 20:50:19 +00:00
Frederic Riss	fdccfc1e19	Allow DwarfCompileUnit::constructImportedEntityDIE to instanciate a GlobalVariable DIE. Usually global variables are in a retain list and instanciated before any call to constructImportedEntityDIE is made. This isn't true for forward declarations though. The testcase for this change is generated by a clang patched to emit such forward declarations (patch at http://reviews.llvm.org/D6173 which will land soon). The updated testcase tests more than just global variables, it now tests every type of 'using' clause we support. llvm-svn: 222217	2014-11-18 02:46:11 +00:00
Manman Ren	554865da5b	Debug Info: In DIBuilder, the context field of a global variable is updated to use DIScopeRef. A paired commit at clang will follow to show cases where we will use an identifer for the context of a global variable. rdar://18958417 llvm-svn: 222195	2014-11-18 00:29:08 +00:00
Oliver Stannard	d29db9b949	Fix optimisations of SELECT_CC which assumed result is boolean Some optimisations in DAGCombiner cause miscompilations for targets that use TargetLowering::UndefinedBooleanContent, because they assume that the results of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and XORed. These optimisations are only valid for targets that use ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent. This is a follow-up to D6210/r221693. llvm-svn: 222123	2014-11-17 10:49:31 +00:00
Craig Topper	f98c606479	Add missing semicolon from r222118. llvm-svn: 222119	2014-11-17 05:58:26 +00:00
Craig Topper	cf0444ba2a	Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries. Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table. llvm-svn: 222118	2014-11-17 05:50:14 +00:00
Craig Topper	6438fc3d05	Replace a couple asserts with static_asserts. llvm-svn: 222114	2014-11-17 00:26:50 +00:00
Craig Topper	7f416c8acb	Convert some EVTs to MVTs where only a SimpleValueType is needed. llvm-svn: 222109	2014-11-16 21:17:18 +00:00
Andrea Di Biagio	e13a0b81f4	[DAG] Improved target independent vector shuffle folding logic. This patch teaches the DAGCombiner how to combine shuffles according to rules: shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2) llvm-svn: 222090	2014-11-15 22:56:25 +00:00
Reid Kleckner	c2291f3905	Rename EH related stuff to be more precise Summary: The current "WinEH" exception handling type is more about Itanium-style LSDA tables layered on top of the Windows native unwind info format instead of .eh_frame tables or EHABI unwind info. Use the name "ItaniumWinEH" to better reflect the hybrid nature of the design. Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions, since the LSDA is part of the Itanium C++ ABI document, and not the DWARF standard. Reviewers: echristo Subscribers: llvm-commits, compnerd Differential Revision: http://reviews.llvm.org/D6279 llvm-svn: 222062	2014-11-14 23:31:07 +00:00
Reid Kleckner	283bc2ed28	Allow the use of functions as typeinfo in landingpad clauses This is one step towards supporting SEH filter functions in LLVM. llvm-svn: 221954	2014-11-14 00:35:50 +00:00
Reid Kleckner	971c3ea67b	Use nullptr instead of NULL for variadic sentinels Windows defines NULL to 0, which when used as an argument to a variadic function, is not a null pointer constant. As a result, Clang's -Wsentinel fires on this code. Using '0' would be wrong on most 64-bit platforms, but both MSVC and Clang make it work on Windows. Sidestep the issue with nullptr. llvm-svn: 221940	2014-11-13 22:55:19 +00:00
Aditya Nandakumar	3053155652	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Aditya Nandakumar	a27193297f	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Frederic Riss	0f7abef2cf	Add an assert and a test that verify r221709's fix. llvm-svn: 221854	2014-11-13 03:20:23 +00:00
Quentin Colombet	f5485bb008	[CodeGenPrepare] Handle zero extensions in the TypePromotionHelper. Prior to this patch the TypePromotionHelper was promoting only sign extensions. Supporting zero extensions changes: - How constants are extended. - How sign extensions, zero extensions, and truncate are composed together. - How the type of the extended operation is recorded. Now we need to know the kind of the extension as well as its type. Each change is fairly small, unlike the diff. Most of the diff are comments/variable renaming to say "extension" instead of "sign extension". The performance improvements on the test suite are within the noise. Related to <rdar://problem/18310086>. llvm-svn: 221851	2014-11-13 01:44:51 +00:00
Frederic Riss	3a6b354b3e	Fix emission of Dwarf accelerator table when there are multiple CUs. The DIE offset in the accel tables is an offset relative to the start of the debug_info section, but we were encoding the offset to the start of the containing CU. llvm-svn: 221837	2014-11-12 23:48:14 +00:00
Ahmed Bougacha	026600d967	[CodeGenPrepare] Replace other uses of EVT::getEVT with TL::getValueType. r221820 fixed a problem (PR21548) where an iPTR was used in TLI legality checks, which isn't valid and resulted in a failed assertion. The solution was to lower pointer types into the correct target's VT, by using TL::getValueType instead of EVT::getEVT. This commit changes 3 other uses of EVT::getEVT, but without any tests: - One of these non-lowered EVTs is passed to allowsMisalignedMemoryAccesses, which goes into target's TL implementation and doesn't cause any problem (yet.) - Two others are passed to TLI.isOperationLegalOrCustom: - one only looks at extensions, so doesn't concern pointers. - one only looks at binary operators, so also isn't a problem. The latter might some day be exposed to pointers and cause the same assert as the original PR, because there's a comment hinting at also supporting cast ops. For consistency, update all of them and be done with it. llvm-svn: 221827	2014-11-12 23:05:03 +00:00
Ahmed Bougacha	0788d49a40	[CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered instead. Fixes PR21548. Related to PR20474. llvm-svn: 221820	2014-11-12 22:16:55 +00:00
Timur Iskhodzhanov	0e76a16200	Temporary fix for PR21528 - use mangled C++ function names in COFF debug info to un-break ASan on Windows llvm-svn: 221813	2014-11-12 20:21:20 +00:00
Timur Iskhodzhanov	a11b32b7e5	[COFF] Make it clearer that the symbols subsection holds function display name rather than just name llvm-svn: 221812	2014-11-12 20:10:09 +00:00
Duncan P. N. Exon Smith	de36e8040f	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
Tom Roeder	6312f4a422	Fix build break: remove unused variable in FCFI. llvm-svn: 221710	2014-11-11 21:26:33 +00:00
Frederic Riss	8ad4f498fb	Totally forget deallocated SDNodes in SDDbgInfo. What would happen before that commit is that the SDDbgValues associated with a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep a map entry keyed by the SDNode pointer pointing to this list of invalidated SDDbgNodes. As the memory gets reused, the list might get wrongly associated with another new SDNode. As the SDDbgValues are cloned when they are transfered, this can lead to an exponential number of SDDbgValues being produced during DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893 Note that the previous behavior wasn't really buggy as the invalidation made sure that the SDDbgValues won't be used. This commit can be considered a memory optimization and as such is really hard to validate in a unit-test. llvm-svn: 221709	2014-11-11 21:21:08 +00:00
Tom Roeder	eb7a303d1b	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708	2014-11-11 21:08:02 +00:00
Oliver Stannard	8c2c67e63c	LLVM incorrectly folds xor into select LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with (set_cc !cc x y), which is only correct when the xor has type i1. Instead, we should check that the constant operand to the xor is all ones. llvm-svn: 221693	2014-11-11 17:36:01 +00:00
Saleem Abdulrasool	d2c5d7f6da	Transforms: address some late comments We already use the llvm namespace. Remove the unnecessary prefix. Use the StringRef::equals method to compare with C strings rather than instantiating std::strings. Addresses late review comments from David Majnemer. llvm-svn: 221564	2014-11-08 00:00:50 +00:00
Saleem Abdulrasool	5898e09057	Transform: add SymbolRewriter pass This introduces the symbol rewriter. This is an IR->IR transformation that is implemented as a CodeGenPrepare pass. This allows for the transparent adjustment of the symbols during compilation. It provides a clean, simple, elegant solution for symbol inter-positioning. This technique is often used, such as in the various sanitizers and performance analysis. The control of this is via a custom YAML syntax map file that indicates source to destination mapping, so as to avoid having the compiler to know the exact details of the source to destination transformations. llvm-svn: 221548	2014-11-07 21:32:08 +00:00
Lang Hames	cdd9077f3a	[RegAlloc] Kill off the trivial spiller - nobody is using it any more. llvm-svn: 221474	2014-11-06 19:12:38 +00:00
Rafael Espindola	26cfbea738	Compute the correct jump table entries on 32 bit windows. On 32 bit windows we use label differences and .set does not suppress rolocations, a combination that was not used before r220256. This fixes PR21497. llvm-svn: 221456	2014-11-06 14:39:49 +00:00
Rafael Espindola	83f0ea8982	Add three other sections when L symbols are allowed. llvm-svn: 221436	2014-11-06 05:01:21 +00:00
Rafael Espindola	bf77ed6826	Allow L symbols in no_dead_strip sections. If a section cannot be dead stripped, it is safe to use L symbols, since the linker will keep all of it in the end. llvm-svn: 221431	2014-11-06 02:42:03 +00:00
Duncan P. N. Exon Smith	c5754a65e6	IR: MDNode => Value: NamedMDNode::getOperator() Change `NamedMDNode::getOperator()` from returning `MDNode ` to returning `Value `. To reduce boilerplate at some call sites, add a `getOperatorAsMDNode()` for named metadata that's expected to only return `MDNode` -- for now, that's everything, but debug node named metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This is part of PR21433. Note that there's a follow-up patch to clang for the API change. llvm-svn: 221375	2014-11-05 18:16:03 +00:00
Andrea Di Biagio	ce46b97b48	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. llvm-svn: 221343	2014-11-05 13:04:14 +00:00
Craig Topper	12f0d9ef2c	Improve logic that decides if its profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to physical register. Fix up the tests that this change improves. llvm-svn: 221336	2014-11-05 06:43:02 +00:00
David Blaikie	3a443c29b9	Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2, binary increases by 25%. A large binary inside Google, split-dwarf, -O0, and other internal flags (GDB index, etc) increases by 1.8%, optimized build is 35%. The size impact may be somewhat greater in .o files (I haven't measured that much - since the linked executable -O0 numbers seemed low enough) due to relocations. These relocations could be removed if we taught the llvm-symbolizer to handle indexed addressing in the .o file (GDB can't cope with this just yet, but GDB won't be reading this info anyway). Also debug_ranges could be shared between .o and .dwo, though ideally debug_ranges would get a schema that could used index(+offset) addressing, and move to the .dwo file, then we'd be back to sharing addresses in the address pool again. But for now, these sizes seem small enough to go ahead with this. Verified that no other DW_TAGs are produced into the .o file other than subprograms and inlined_subroutines. llvm-svn: 221306	2014-11-04 22:12:25 +00:00
David Blaikie	9bfd7a9f43	Move cross-unit DIE caching to the DwarfFile level, so it doesn't interfere with fission-gmlt data and produce skeleton<>full unit cross referencing. llvm-svn: 221305	2014-11-04 22:12:18 +00:00
Arnaud A. de Grandmaison	a11cab3120	[PBQP] Callee saved regs should have a higher cost than scratch regs Registers are not all equal. Some are not allocatable (infinite cost), some have to be preserved but can be used, and some others are just free to use. Ensure there is a cost hierarchy reflecting this fact, so that the allocator will favor scratch registers over callee-saved registers. llvm-svn: 221293	2014-11-04 20:51:29 +00:00
Arnaud A. de Grandmaison	829dd81377	[PBQP] Tweak spill costs and coalescing benefits This patch improves how the different costs (register, interference, spill and coalescing) relates together. The assumption is now that: - coalescing (or any other "side effect" of reg alloc) is negative, and instead of being derived from a spill cost, they use the block frequency info. - spill costs are in the [MinSpillCost:+inf( range - register or interference costs are in [0.0:MinSpillCost( or +inf The current MinSpillCost is set to 10.0, which is a random value high enough that the current constraint builders do not need to worry about when settings costs. It would however be worth adding a normalization step for register and interference costs as the last step in the constraint builder chain to ensure they are not greater than SpillMinCost (unless this has some sense for some architectures). This would work well with the current builder pipeline, where all costs are tweaked relatively to each others, but could grow above MinSpillCost if the pipeline is deep enough. The current heuristic is tuned to depend rather on the number of uses of a live interval rather than a density of uses, as used by the greedy allocator. This heuristic provides a few percent improvement on a number of benchmarks (eembc, spec, ...) and will definitely need to change once spill placement is implemented: the current spill placement is really ineficient, so making the cost proportionnal to the number of use is a clear win. llvm-svn: 221292	2014-11-04 20:51:24 +00:00
David Majnemer	b925715c56	CodeGen: Enable DWARF emission for MS ABI targets This is experimental, just barely enough to get things to not immediately combust. A note for those who are curious: Only lld can successfully link the object files, other linkers truncate the section names making the debug sections illegible to debuggers. Even with this in mind, we believe we are having trouble with SECREL relocations. llvm-svn: 221245	2014-11-04 08:03:31 +00:00
Sanjoy Das	e839965faa	The patchpoint lowering logic would crash with live constants equal to the tombstone or empty keys of a DenseMap<int64_t, T>. This patch fixes the issue (and adds a tests case). llvm-svn: 221214	2014-11-04 00:59:21 +00:00
Sanjoy Das	429c9cae09	Change logic in StackMaps::recordStackMapOpers to use the isInt<32> predicate instead of bitwise operations. This is not a functional change. llvm-svn: 221209	2014-11-04 00:06:57 +00:00
David Blaikie	5b02a19f90	Use common range handling for the CU's ranges This generalizes the range handling for ranges in both the skeleton and full unit, laying the foundation for the addition of more ranges (rather than just the CU's special case) in the skeleton CU with fission+gmlt. llvm-svn: 221202	2014-11-03 23:10:59 +00:00
David Blaikie	542616d47c	Push the CURangeList down into the skeleton CU (where available) rather than the full CU So that it may be shared between skeleton/full compile unit, for CU ranges and other ranges to be added for fission+gmlt. (at some point we might want some kind of object shared between the skeleton and full compile units for all those things we only want one of in that scope, rather than having the full unit always look through to the skeleton... - alternatively, we might be able to have the skeleton pointer (or another, separate pointer) point to the skeleton or to the unit itself in non-fission, so we don't have to special case its absence) llvm-svn: 221186	2014-11-03 21:52:56 +00:00
David Blaikie	ce343492ee	Add DwarfCompileUnit::BaseAddress to track the base address used by relative addressing in debug_ranges and debug_loc This is one of a few steps to generalize range handling to include the CU range (thus the CU's range list will be moved into the range list list, losing track of the base address in the process), which means generalizing ranges from both the skeleton and full unit under fission. And... then I can used that generalized support for ranges in fission+gmlt where there'll be a bunch more ranges in the skeleton. llvm-svn: 221182	2014-11-03 21:15:30 +00:00
Paul Robinson	ad06e430ce	Normally an 'optnone' function goes through fast-isel, which does not call DAGCombiner. But we ran into a case (on Windows) where the calling convention causes argument lowering to bail out of fast-isel, and we end up in CodeGenAndEmitDAG() which does run DAGCombiner. So, we need to make DAGCombiner check for 'optnone' after all. Commit includes the test that found this, plus another one that got missed in the original optnone work. llvm-svn: 221168	2014-11-03 18:19:26 +00:00
David Blaikie	077ad48447	Cleanup some unused or trivial functions in DwarfCompileUnit llvm-svn: 221164	2014-11-03 17:10:38 +00:00
David Blaikie	bc532b44a0	Sink DwarfUnit::CURanges into DwarfCompileUnit llvm-svn: 221161	2014-11-03 16:40:43 +00:00
Oliver Stannard	cf6bfb1dd0	Revert r221150, as it broke sanitizer tests llvm-svn: 221151	2014-11-03 12:19:03 +00:00
Oliver Stannard	652ec6ee89	Emit .eh_frame with relocations to functions, rather than sections When LLVM emits DWARF call frame information, it currently creates a local, section-relative symbol in the code section, which is pointed to by a relocation on the .eh_frame section. However, for C++ we emit some functions in section groups, and the SysV ABI has some rules to make it easier to remove these sections (http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules): A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. This means that we need to use the function symbol for the relocation, not a temporary symbol. There was a comment in the code claiming that the local symbol was used to avoid creating a relocation, but a relocation must be created anyway as the code and CFI are in different sections. llvm-svn: 221150	2014-11-03 12:02:51 +00:00
David Blaikie	89a26f012a	Sink range list handling down from DwarfUnit into its only use, in DwarfCompileUnit. llvm-svn: 221123	2014-11-03 02:41:49 +00:00
David Blaikie	4aa49b2054	Formatting llvm-svn: 221095	2014-11-02 08:52:37 +00:00
David Blaikie	cafd962d97	Add DwarfUnit::isDwoUnit and use it to generalize string creation Currently we only need to emit skeleton strings into the CU header and we do this by explicitly calling "addLocalString". With gmlt-in-fission, we'll be emitting a bunch of other strings from other codepaths where it's not statically known that these strings will be local or not. Introduce a virtual function to indicate whether this unit is a DWO unit or not (I'm not sure if we have a good term for this, the opposite/alternative to 'skeleton' unit) and use that to generalize the string emission logic so that strings can be correctly emitted in both the skeleton and dwo unit when in split dwarf mode. And to demonstrate that this works, switch the existing special callers of addLocalString in the skeleton builder to addString - and they still work. Yay. llvm-svn: 221094	2014-11-02 08:51:37 +00:00
David Blaikie	279c451c0b	Remove the last mention of LineTablesOnly from DwarfUnit, sinking it into DwarfCompileUnit This is a useful distinction/invariant/delination to make because LineTablesOnly mode is never relevant to type units, so it's clear that we're not doing weird line-tables-only-with-types by making this API choice. It also lays the foundations nicely for adding gmlt-like data to fission skeleton CUs while limiting the effects to CUs and not TUs. llvm-svn: 221093	2014-11-02 08:18:06 +00:00
David Blaikie	3363a57c8e	Sink DwarfUnit::applySubprogramAttributesToDefinition into DwarfCompileUnit llvm-svn: 221092	2014-11-02 08:09:09 +00:00
David Blaikie	978020807a	Sink DwarfUnit::addExpr into DwarfCompileUnit llvm-svn: 221090	2014-11-02 07:11:55 +00:00
David Blaikie	8c485b5d74	Fix the build from the last commit llvm-svn: 221089	2014-11-02 07:08:12 +00:00
David Blaikie	02a6333ba7	Sink DwarfUnit::applyVariableAttributes into DwarfCompileUnit llvm-svn: 221088	2014-11-02 07:06:51 +00:00
David Blaikie	4bc0881ac7	Sink DwarfUnit::addLocationList down into DwarfCompileUnit llvm-svn: 221087	2014-11-02 07:03:19 +00:00
David Blaikie	77895fb276	Sink DwarfUnit::addComplexAddress down into DwarfCompileUnit llvm-svn: 221086	2014-11-02 06:58:44 +00:00
David Blaikie	f7435ee6ce	Push DwarfUnit::addAddress down into DwarfCompileUnit llvm-svn: 221085	2014-11-02 06:46:40 +00:00
David Blaikie	7d48be2b7b	Sink DwarfUnit::addVariableAddress into DwarfCompileUnit since type units don't have variables llvm-svn: 221084	2014-11-02 06:37:23 +00:00
David Blaikie	192b45c1ef	DebugInfo: Sink accelerator table lists down (GlobalNames/Types) into DwarfCompileUnit llvm-svn: 221083	2014-11-02 06:16:39 +00:00
David Blaikie	98cf172175	Add DwarfUnit::addGlobalType to match DwarfUnit::addGlobalName (these will shortly become virtual, with a null implementation in DwarfUnit (since type units don't have accelerator tables in the current schema) and the current implementation down in DwarfCompileUnit, moving the actual maps there too) llvm-svn: 221082	2014-11-02 06:06:14 +00:00
David Blaikie	871c2d9d63	DebugInfo: Refactor index type DIE initialization by rolling it into the accessor llvm-svn: 221080	2014-11-02 03:09:13 +00:00
David Blaikie	ce47366150	Be sure to initialize DwarfCompileUnit::LabelBegin now that it may be skipped in initSection llvm-svn: 221079	2014-11-02 02:40:26 +00:00
David Blaikie	b6726a9ece	Don't bother creating LabelBegin for .dwo units This would help catch cases where we might otherwise try to reference a dwo CU label, which would be weird - because without relocations in the dwo file it's not generally meaningful to talk about the CU offsets there (or, if it is, we can do so in absolute terms without using a relocation to compute it). llvm-svn: 221078	2014-11-02 02:26:24 +00:00
David Blaikie	27e35f2302	Drop DwarfCompileUnit::getLocalLabel* in favor of just mapping through the skeleton explicitly. Confusing to do this two different ways - I'm not too wedded to either one, but here goes. llvm-svn: 221076	2014-11-02 01:21:43 +00:00
David Blaikie	f4bdc31271	Sink DwarfUnit::LabelBegin down into DwarfCompileUnit since that's the only place it's needed. llvm-svn: 221075	2014-11-02 01:21:40 +00:00
David Blaikie	ae57e66e36	Sink dwarf unit length emission down into DwarfUnit::emitHeader This allows the CU label to be emitted only for compile units, as they're the only ones that need it (so they can be referenced from pubnames) llvm-svn: 221072	2014-11-01 23:59:23 +00:00
David Blaikie	983bfea0d0	Remove DwarfUnit::LabelEnd in favor of computing the length of the section directly This was a compile-unit specific label (unused in type units) and seems unnecessary anyway when we can more easily directly compute the size of the compile unit. llvm-svn: 221067	2014-11-01 23:07:14 +00:00
David Blaikie	a34568b220	Sink DwarfUnit::SectionSym into DwarfCompileUnit as it's only needed/used there. llvm-svn: 221062	2014-11-01 20:06:28 +00:00
David Blaikie	a5437b6bf6	Make DwarfCompileUnit::Skeleton more narrowly typed (DwarfCompileUnit* instead of DwarfUnit*) now that it's specific to DwarfCompileUnit anyway. llvm-svn: 221060	2014-11-01 19:26:05 +00:00
David Blaikie	7cbf58af15	Sink DwarfUnit::Skeleton down into DwarfCompileUnit Type units no longer have skeletons and it's misleading to be able to query for a type unit's skeleton (it might incorrectly lead one to conclude that if a unit doesn't have a skeleton it's not in a .dwo file... ). llvm-svn: 221055	2014-11-01 18:18:07 +00:00
David Blaikie	f6dac29a83	Sink DwarfDebug::AbstractSPDies down into DwarfFile This is the first big step to allowing gmlt-like inline scope information in the skeleton CU. While this commit doesn't change the functionality, it's only a small step to call "constructAbstractSubprogramDIE" on both the InfoHolder and the SkeletonHolder (when in use) and that will at least create the abstract SP dies in that case, though still not creating the other subprograms. llvm-svn: 221051	2014-11-01 17:21:26 +00:00
David Blaikie	2f08093dda	Remove unused function llvm-svn: 221037	2014-11-01 01:15:26 +00:00
David Blaikie	a8be0a794f	And... fix the build some more. llvm-svn: 221036	2014-11-01 01:15:24 +00:00
David Blaikie	1998a2e789	Just iterate the DwarfCompileUnits rather than trying to filter them out of the list of all units. llvm-svn: 221034	2014-11-01 01:11:19 +00:00
David Blaikie	bbad0bbb2f	Add '*' to auto variable that is a pointer, as per the coding conventions. llvm-svn: 221033	2014-11-01 01:03:39 +00:00
David Blaikie	a33cd6aa27	Add DwarfCompileUnit::getSkeleton that returns DwarfCompileUnit* to avoid having to cast from DwarfUnit* on every call. llvm-svn: 221031	2014-11-01 00:50:34 +00:00
Duncan P. N. Exon Smith	3872d0084c	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. llvm-svn: 221024	2014-11-01 00:10:31 +00:00
David Blaikie	1d96cc2ff3	Sink some of DwarfDebug::collectDeadVariables down into DwarfCompileUnit. llvm-svn: 221010	2014-10-31 22:30:30 +00:00
David Blaikie	49be5b357b	Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE into DwarfCompileUnit llvm-svn: 221005	2014-10-31 21:57:02 +00:00
Quentin Colombet	c32615dfef	[CodeGenPrepare] Move extractelement close to store if they can be combined. This patch adds an optimization in CodeGenPrepare to move an extractelement right before a store when the target can combine them. The optimization may promote any scalar operations to vector operations in the way to make that possible. Context Some targets use different register files for both vector and scalar operations. This means that transitioning from one domain to another may incur copy from one register file to another. These copies are not coalescable and may be expensive. For example, according to the scheduling model, on cortex-A8 a vector to GPR move is 20 cycles. Motivating Example Let us consider an example: define void @foo(<2 x i32>* %addr1, i32* %dest) { %in1 = load <2 x i32>* %addr1, align 8 %extract = extractelement <2 x i32> %in1, i32 1 %out = or i32 %extract, 1 store i32 %out, i32* %dest, align 4 ret void } As it is, this IR generates the following assembly on armv7: vldr d16, [r0] @vector load vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles orr r0, r0, #1 @ scalar bitwise or str r0, [r1] @ scalar store bx lr Whereas we could generate much faster code: vldr d16, [r0] @ vector load vorr.i32 d16, #0x1 @ vector bitwise or vst1.32 {d16[1]}, [r1:32] @ vector extract + store bx lr Half of the computation made in the vector is useless, but this allows to get rid of the expensive cross-register-file copy. Proposed Solution To avoid this cross-register-copy penalty, we promote the scalar operations to vector operations. The penalty will be removed if we manage to promote the whole chain of computation in the vector domain. Currently, we do that only when the chain of computation ends by a store and the target is able to combine an extract with a store. Stores are the most likely candidates, because other instructions produce values that would need to be promoted and so, extracted as some point[1]. Moreover, this is customary that targets feature stores that perform a vector extract (see AArch64 and X86 for instance). The proposed implementation relies on the TargetTransformInfo to decide whether or not it is beneficial to promote a chain of computation in the vector domain. Unfortunately, this interface is rather inaccurate for this level of details and although this optimization may be beneficial for X86 and AArch64, the inaccuracy will lead to the optimization being too aggressive. Basically in TargetTransformInfo, everything that is legal has a cost of 1, whereas, even if a vector type is legal, usually a vector operation is slightly more expensive than its scalar counterpart. That will lead to too many promotions that may not be counter balanced by the saving of the cross-register-file copy. For instance, on AArch64 this penalty is just 4 cycles. For now, the optimization is just enabled for ARM prior than v8, since those processors have a larger penalty on cross-register-file copies, and the scope is limited to basic blocks. Because of these two factors, we limit the effects of the inaccuracy. Indeed, I did not want to build up a fancy cost model with block frequency and everything on top of that. [1] We can imagine targets that can combine an extractelement with other instructions than just stores. If we want to go into that direction, the current interfaces must be augmented and, moreover, I think this becomes a global isel problem. Differential Revision: http://reviews.llvm.org/D5921 <rdar://problem/14170854> llvm-svn: 220978	2014-10-31 17:52:53 +00:00
David Blaikie	fb2efde341	Correct assert text from r220923 Noticed in post-commit review by Adrian Prantl. llvm-svn: 220967	2014-10-31 16:45:36 +00:00
Hao Liu	e02b1a068f	PR20557: Fix the bug that bogus cpu parameter crashes llc on AArch64 backend. Initial patch by Oleg Ranevskyy. llvm-svn: 220945	2014-10-31 02:35:34 +00:00
Ahmed Bougacha	9f336c4ec5	[SelectionDAG] When scalarizing trunc, don't assert for legal operands. r212242 introduced a legalizer hook, originally to let AArch64 widen v1i{32,16,8} rather than scalarize, because the legalizer expected, when scalarizing the result of a conversion operation, to already have scalarized the operands. On AArch64, v1i64 is legal, so that commit ensured operations such as v1i32 = trunc v1i64 wouldn't assert. It did that by choosing to widen v1 types whenever possible. However, v1i1 types, for which there's no legal widened type, would still trigger the assert. This commit fixes that, by only scalarizing a trunc's result when the operand has already been scalarized, and introducing an extract_elt otherwise. This is similar to r205625. Fixes PR20777. llvm-svn: 220937	2014-10-30 23:46:50 +00:00
Louis Gerbarg	e8f9c78247	Fix incorrect invariant check in DAG Combine Earlier this summer I fixed an issue where we were incorrectly combining multiple loads that had different constraints such alignment, invariance, temporality, etc. Apparently in one case I made copt paste error and swapped alignment and invariance. Tests included. rdar://18816719 llvm-svn: 220933	2014-10-30 22:21:03 +00:00
David Blaikie	76fd43c653	PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site. llvm-svn: 220923	2014-10-30 20:20:11 +00:00
NAKAMURA Takumi	f51a34ec1f	Whitespace. llvm-svn: 220857	2014-10-29 15:23:11 +00:00
David Blaikie	ff468a5e0b	Minimize the scope of some variables, NFC. llvm-svn: 220759	2014-10-28 02:57:26 +00:00
Lang Hames	5fe30ca56f	[PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these sets as keys into a cache of interference matrice values in the Interference constraint adder. Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%. llvm-svn: 220688	2014-10-27 17:44:25 +00:00
David Blaikie	df1cca3a24	Remove some unnecessary casts. llvm-svn: 220658	2014-10-26 23:37:04 +00:00
Frederic Riss	987fe22789	Sink DwarfUnit::constructImportedEntityDIE into DwarfCompileUnit. So that it has access to getOrCreateGlobalVariableDIE. If we ever support decsribing using directive in C++ classes (thus requiring support in type units), it will certainly use another mechanism anyway. Differential Revision: http://reviews.llvm.org/D5975 llvm-svn: 220594	2014-10-24 21:31:09 +00:00
Matt Arsenault	4b7bd2d52f	Fix copy paste comment llvm-svn: 220581	2014-10-24 18:13:10 +00:00
David Blaikie	80e5b1ebd1	DebugInfo: Sink DwarfDebug::ScopeVariables down into DwarfFile (part of refactoring to allow subprogram emission in both the skeleton and main units to enable -gmlt-like data to be included in the skeleton for live inlined backtracing purposes) llvm-svn: 220578	2014-10-24 17:57:34 +00:00
David Blaikie	0dc623c232	Remove DwarfDebug::FirstCU as it has no use It was only being used as a flag to identify the lack of debug info from within endModule - use the section labels for that instead. llvm-svn: 220575	2014-10-24 17:53:38 +00:00
Sanjay Patel	957efc23bb	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570	2014-10-24 17:02:16 +00:00
Marcello Maggioni	2259400f8e	Added reset of LexicalScope in LiveDebugVariables reset function. llvm-svn: 220545	2014-10-24 02:46:50 +00:00
Timur Iskhodzhanov	2bc90fdbdc	Fix PR21189 -- Emit symbol subsection required to debug LLVM-built binaries with VS2012+ Reviewed at http://reviews.llvm.org/D5772 llvm-svn: 220544	2014-10-24 01:27:45 +00:00
David Blaikie	f4c504e03c	DebugInfo: Remove DwarfDebug::addScopeVariable now that it's just a trivial wrapper llvm-svn: 220542	2014-10-24 00:43:47 +00:00
Ahmed Bougacha	7daf3b89f9	[SelectionDAG] Teach the vector scalarizer about FP conversions. This adds support for legalization of instructions of the form: [fp_conv] <1 x i1> %op to <1 x double> where fp_conv is one of fpto[us]i, [us]itofp. This used to assert because they were simply missing from the vector operand scalarizer. A similar problem arose in r190830, with trunc instead. Fixes PR20778. Differential Revision: http://reviews.llvm.org/D5810 llvm-svn: 220533	2014-10-23 22:49:25 +00:00
Ahmed Bougacha	e05afd4d1e	Update comment and fix typos in assert message. (NFC) llvm-svn: 220531	2014-10-23 22:40:34 +00:00
Tim Northover	e4c7be56bf	ScheduleDAG: record PhysReg dependencies represented by CopyFromReg nodes x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS dependency because it was represented by a pair of CopyFromReg(EFLAGS) -> CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an implicit-def on the instruction, where the result numbers in the DAG and the Uses list in TableGen matched up precisely. The Copy notation seems much more robust, so this patch extends ScheduleDAG rather than refactoring x86. Should fix PR20376. llvm-svn: 220529	2014-10-23 22:31:48 +00:00
David Blaikie	1dd573db45	DebugInfo: Remove DwarfDebug::CurrentFnArguments since we have to handle argument ordering of other arguments (abstract arguments) in the same way and already have code for that too. While refactoring this code I was confused by both the name I had introduced (addNonArgumentVariable... but it has all this logic to handle argument numbering and keep things in order?) and by the redundancy. Seems when I fixed the misordered inlined argument handling, I didn't realize it was mostly redundant with the argument ordering code (which I may've also written, I'm not sure). So let's just rely on the more general case. The only oddity in output this produces is that it means when we emit all the variables for the current function, we don't track when we've finished the argument variables and are about to start the local variables and insert DW_AT_unspecified_parameters (for varargs functions) there. Instead it ends up after the local variables, scopes, etc. But this isn't invalid and doesn't cause DWARF consumers problems that I know of... so we'll just go with that because it makes the code nice & simple. (though, let's see what the buildbots have to say about this - crosses fingers) There will be some cleanup commits to follow to remove the now trivial wrappers, etc. llvm-svn: 220527	2014-10-23 22:27:50 +00:00
David Blaikie	3b5c84008d	DebugInfo: Sink DwarfDebug::addNonArgumentScopeVariable into DwarfFile. llvm-svn: 220520	2014-10-23 22:04:30 +00:00

1 2 3 4 5 ...

17595 Commits