llvm-project

Commit Graph

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	571f97bd90	DI: Fold constant arguments into a single MDString This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 218914	2014-10-02 21:56:57 +00:00
Adrian Prantl	87b7eb9d0f	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787	2014-10-01 18:55:02 +00:00
Adrian Prantl	b458dc2eee	Revert r218778 while investigating buldbot breakage. "Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782	2014-10-01 18:10:54 +00:00
Adrian Prantl	25a7174e7a	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778	2014-10-01 17:55:39 +00:00
Jingyue Wu	fd47fb9976	Revert r216862 due to a performance regression Reported by Alexey Volkov in PR21115 llvm-svn: 218771	2014-10-01 15:22:13 +00:00
David Blaikie	32b0f365a2	Implement DW_TAG_subrange_type with DW_AT_count rather than DW_AT_upper_bound This allows proper disambiguation of unbounded arrays and arrays of zero bound ("struct foo { int x[]; };" and "struct foo { int x[0]; }"). GCC instead produces an upper bound of -1 in the latter situation, but count seems tidier. This way lower_bound is provided if it's not the language default and count is provided if the count is known, otherwise it's omitted. Simple. If someone wants to look at rdar://problem/12566646 and see if this change is acceptable to that bug/fix, that might be helpful (see the empty-and-one-elem-array.ll test case which cites that radar). llvm-svn: 218726	2014-10-01 00:56:55 +00:00
David Blaikie	6cca8109ab	Omit DW_AT_inline under -gmlt to save a little more space. llvm-svn: 218719	2014-09-30 23:29:16 +00:00
David Blaikie	1cae849c04	DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot. No functional change. Pre-emptive refactoring before I start pushing some of this subprogram creation down into DWARFCompileUnit so I can build different subprograms in the skeleton unit from the dwo unit for adding -gmlt-like data to the skeleton. llvm-svn: 218713	2014-09-30 22:32:49 +00:00
David Blaikie	e1c79749ca	Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil. r218129 omits DW_TAG_subprograms which have no inlined subroutines when emitting -gmlt data. This makes -gmlt very low cost for -O0 builds. Darwin's dsymutil reasonably considers a CU empty if it has no subprograms (which occurs with the above optimization in -O0 programs without any force_inline function calls) and drops the line table, CU, and everything in this situation, making backtraces impossible. Until dsymutil is modified to account for this, disable this optimization on Darwin to preserve the desired functionality. (see r218545, which should be reverted after this patch, for other discussion/details) Footnote: In the long term, it doesn't look like this scheme (of simplified debug info to describe inlining to enable backtracing) is tenable, it is far too size inefficient for optimized code (the DW_TAG_inlined_subprograms, even once compressed, are nearly twice as large as the line table itself (also compressed)) and we'll be considering things like Cary's two level line table proposal to encode all this information directly in the line table. llvm-svn: 218702	2014-09-30 21:28:32 +00:00
Sanjay Patel	ab7f460bca	Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC. llvm-svn: 218700	2014-09-30 20:44:23 +00:00
Sanjay Patel	8fde95cb2b	Split the estimate() interface into separate functions for each type. NFC. It was hacky to use an opcode as a switch because it won't always match (rsqrte != sqrte), and it looks like we'll need to add more special casing per arch than I had hoped for. Eg, x86 will prefer a different NR estimate implementation. ARM will want to use it's 'step' instructions. There also don't appear to be any new estimate instructions in any arch in a long, long time. Altivec vloge and vexpte may have been the first and last in that field... llvm-svn: 218698	2014-09-30 20:28:48 +00:00
Andrea Di Biagio	c7c524129b	[DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle. Currently, the DAG Combiner only tries to convert type-legal build_vector nodes into shuffles. This patch simply moves the logic that checks if a build_vector has a legal value type up before we even start analyzing the operands. This allows to early exit immediately from method 'visitBUILD_VECTOR' if the node type is known to be illegal. No functional change intended. llvm-svn: 218677	2014-09-30 15:30:22 +00:00
Matt Arsenault	93ffe58f90	Add MachineOperand::ChangeToFPImmediate and setFPImm llvm-svn: 218579	2014-09-28 19:24:59 +00:00
James Molloy	463db9a77c	[AArch64] Redundant store instructions should be removed as dead code If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed. This problem is found in spec2006-197.parser. For example, stur w10, [x11, #-4] stur w10, [x11, #-4] Then one of the two stur instructions can be removed. Patch by David Xu! llvm-svn: 218569	2014-09-27 17:02:54 +00:00
Sanjay Patel	bdf1e38856	Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2). This is purely refactoring. No functional changes intended. PowerPC is the only target that is currently using this interface. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) And: z = y / x into: z = y * rcpe(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction along with the number of refinement steps needed to make the estimate usable. Differential Revision: http://reviews.llvm.org/D5484 llvm-svn: 218553	2014-09-26 23:01:47 +00:00
David Xu	418da223dd	Revert patch ofr218493 llvm-svn: 218494	2014-09-26 02:28:03 +00:00
David Xu	64f661ee0b	Redundant store instructions should be removed as dead code llvm-svn: 218493	2014-09-26 02:02:09 +00:00
Eric Christopher	3976f78247	Move resetTargetOptions from taking a MachineFunction to a Function since we are accessing the TargetMachine that we're a member function of. llvm-svn: 218489	2014-09-26 01:28:10 +00:00
Bruno Cardoso Lopes	d04f7596e7	[MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller loop depths are preferable. This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available. Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5% execution speedup in average on x86-64 darwin. <rdar://problem/18021659> llvm-svn: 218472	2014-09-25 23:14:26 +00:00
Tom Stellard	529efcf9d0	SelectionDAG: Remove #if NDEBUG from check for a post-isel hook The InstrEmitter will skip the check of MI.hasPostISelHook() before calling AdjustInstrPostInstrSelection() when NDEBUG is not defined. This was added in r140228, and I'm not sure if it is intentional or not, but it is a likely source for bugs, because it means with Release+Asserts builds you can forget to set the hasPostISelHook flag on TableGen definitions and AdjustInstrPostInstrSelection() will still be called. llvm-svn: 218458	2014-09-25 18:59:22 +00:00
Robin Morisset	810739d174	Lower idempotent RMWs to fence+load Summary: I originally tried doing this specifically for X86 in the backend in D5091, but it was rather brittle and generally running too late to be general. Furthermore, other targets may want to implement similar optimizations. So I reimplemented it at the IR-level, fitting it into AtomicExpandPass as it interacts with that pass (which could not be cleanly done before at the backend level). This optimization relies on a new target hook, which is only used by X86 for now, as the correctness of the optimization on other targets remains an open question. If it is found correct on other targets, it should be trivial to enable for them. Details of the optimization are discussed in D5091. Test Plan: make check-all + a new test Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5422 llvm-svn: 218455	2014-09-25 17:27:43 +00:00
Jiangning Liu	3b096172cf	Clear PreferredExtendType for in each function-specific state FunctionLoweringInfo. llvm-svn: 218364	2014-09-24 03:22:56 +00:00
Robin Morisset	6dbbbc28b0	[X86] Make wide loads be managed by AtomicExpand Summary: AtomicExpand already had logic for expanding wide loads and stores on LL/SC architectures, and for expanding wide stores on CmpXchg architectures, but not for wide loads on CmpXchg architectures. This patch fills this hole, and makes use of this new feature in the X86 backend. Only one functionnal change: we now lose the SynchScope attribute. It is regrettable, but I have another patch that I will submit soon that will solve this for all of AtomicExpand (it seemed better to split it apart as it is a different concern). Test Plan: make check-all (lots of tests for this functionality already exist) Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5404 llvm-svn: 218332	2014-09-23 20:59:25 +00:00
Robin Morisset	dedef3325f	Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder Summary: The goal is to eventually remove all the code related to getInsertFencesForAtomic in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works mostly by accident because the backends are overly conservative), and repeats the same logic that goes in emitLeading/TrailingFence. In this patch, I make AtomicExpandPass insert the fences as it knows better where to put them. Because this requires getting the fences and not just passing an IRBuilder around, I had to change the return type of emitLeading/TrailingFence. This code only triggers on ARM for now. Because it is earlier in the pipeline than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so SelectionDAGBuilder does not add barriers anymore on ARM. If this patch is accepted I plan to implement emitLeading/TrailingFence for all backends that setInsertFencesForAtomic(true), which will allow both making them less conservative and simplifying SelectionDAGBuilder once they are all using this interface. This should not cause any functionnal change so the existing tests are used and not modified. Test Plan: make check-all, benefits from existing tests of atomics on ARM Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5179 llvm-svn: 218329	2014-09-23 20:31:14 +00:00
Lang Hames	d5f496d57c	[MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is gone they're no longer needed. llvm-svn: 218320	2014-09-23 18:08:47 +00:00
Sanjay Patel	6a42292795	Use SDValue bool operator to reduce code. No functional change. llvm-svn: 218314	2014-09-23 16:24:20 +00:00
David Majnemer	597be2ded6	MC: ReadOnlyWithRel section kinds should map to rdata in COFF Don't consider ReadOnlyWithRel as a writable section in COFF, they really belong in .rdata. llvm-svn: 218268	2014-09-22 20:39:23 +00:00
Sanjay Patel	b67bd262ea	Refactor reciprocal square root estimate into target-independent function; NFC. This is purely a plumbing patch. No functional changes intended. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . The first step is to add a target hook for RSQRTE, take the already target-independent code selfishly hoarded by PPC, and put it into DAGCombiner. Next steps: The code in DAGCombiner::BuildRSQRTE() should be refactored further; tests that exercise that logic need to be added. Logic in PPCTargetLowering::BuildRSQRTE() should be hoisted into DAGCombiner. X86 and AArch64 overrides for TargetLowering.BuildRSQRTE() should be added. Differential Revision: http://reviews.llvm.org/D5425 llvm-svn: 218219	2014-09-21 15:19:15 +00:00
Sanjay Patel	d649235fc3	mop up: "Don’t duplicate function or class name at the beginning of the comment." llvm-svn: 218218	2014-09-21 14:48:16 +00:00
Sanjay Patel	69df41e92e	mop up: "Don’t duplicate function or class name at the beginning of the comment." llvm-svn: 218194	2014-09-20 22:39:16 +00:00
David Majnemer	b8dbebb31c	MC: Treat ReadOnlyWithRel and ReadOnlyWithRelLocal as ReadOnly for COFF A problem with our old behavior becomes observable under x86-64 COFF when we need a read-only GV which has an initializer which is referenced using a relocation: we would mark the section as writable. Marking the section as writable interferes with section merging. This fixes PR21009. llvm-svn: 218179	2014-09-20 07:31:46 +00:00
Peter Collingbourne	975726345c	Fix crash with an insertvalue that produces an empty object. llvm-svn: 218171	2014-09-20 00:10:47 +00:00
Chris Bieneman	1a98490ce5	Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to avoid static constructors and destructors. llvm-svn: 218163	2014-09-19 22:46:28 +00:00
David Blaikie	3a7ce252cc	Omit DW_TAG_subprograms for subprograms without inlined subroutines when producing -gmlt data To reduce the size of -gmlt data, skip the subprograms without any inlined subroutines. Since we've now got the ability to make these determinations in the backend (funnily enough - we added the flag so we wouldn't produce ranges under -gmlt, but with this change we use the flag, but go back to producing ranges under -gmlt). Instead, just produce CU ranges to inform the consumer which parts of the code are described by this CU's line table. Tools could inspect the line table directly to compute the range, but the CU ranges only seem to be about 0.5% of object/executable size, so I'm not too worried about teaching llvm-symbolizer that trick just yet - it's certainly a possible piece of future work. Update an llvm-symbolizer test just to demonstrate that this schema is acceptable there (if it wasn't, the compiler-rt tests would catch this, but good to have an in-llvm-tree test for llvm-symbolizer's behavior here) Building the clang binary with -gmlt with this patch reduces the total size of object files by 5.1% (5.56% without ranges) without compression and the executable by 4.37% (4.75% without ranges). llvm-svn: 218129	2014-09-19 17:03:16 +00:00
Frederic Riss	9ba9efff56	Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable. Summary: This will allow to request the creation of a forward delacred variable at is point of use (for imported declarations, this will be DwarfDebug::constructImportedEntityDIE) rather than having to put the forward decl in a retention list. Note that getOrCreateGlobalVariable returns the actual definition DIE when the routine creates a declaration and a definition DIE. If you agree this is the right behavior, then I'll have a followup patch that registers the definition in the DIE map instead of the declaration as it is today (this 'breaks' only one test, where we test that the imported entity is the declaration). I'm not sure what's best here, but it's easy enough for a consumer to follow the DW_AT_specification link to get to the declaration, whereas it takes more work to find the actual definition from a declaration DIE. Reviewers: echristo, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5381 llvm-svn: 218126	2014-09-19 15:12:03 +00:00
Hal Finkel	62ac736faa	Optionally enable more-aggressive FMA formation in DAGCombine The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one use, but this is overly-conservative on some systems. Specifically, if the FMA and the FADD have the same latency (and the FMA does not compete for resources with the FMUL any more than the FADD does), there is no need for the restriction, and furthermore, forming the FMA leaving the FMUL can still allow for higher overall throughput and decreased critical-path length. Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to elide the hasOneUse check. This is enabled for PowerPC by default, as most PowerPC systems will benefit. Patch by Olivier Sallenave, thanks! llvm-svn: 218120	2014-09-19 11:42:56 +00:00
Jiangning Liu	ffbc690933	Optimize sext/zext insertion algorithm in back-end. With this optimization, we will not always insert zext for values crossing basic blocks, but insert sext if the users of a value crossing basic block has preference of sign predicate. llvm-svn: 218101	2014-09-19 05:30:35 +00:00
David Blaikie	03c3dbeb62	Omit DW_AT_frame_base under -gmlt for size llvm-svn: 218100	2014-09-19 04:55:05 +00:00
David Blaikie	0b9438b1c1	Describe the -gmlt optimization committed in the previous revision. llvm-svn: 218099	2014-09-19 04:47:46 +00:00
David Blaikie	73b65d236c	Omit all the extra static attributes on subprograms in -gmlt This omission will be done in a fancier manner once we're dealing with "put gmlt in the skeleton CUs under fission" - it'll have to be conditional on the kind of CU we're emitting into (skeleton or gmlt). llvm-svn: 218098	2014-09-19 04:30:36 +00:00
Hans Wennborg	c0f0c511db	Fix an it's vs. its typo. llvm-svn: 218093	2014-09-19 01:14:56 +00:00
Frederic Riss	0baab0cded	Revert part of r218041. The patch moved some logic around in an attempt to generate potentially more DW_AT_declaration attributes. The patch was flawed though and it stopped generating the attribute in some cases. llvm-svn: 218060	2014-09-18 16:41:04 +00:00
Frederic Riss	be26dfb595	Always emit DW_AT_declaration attribute when the variable isn't a definition. Summary: This doesn't show up today as we don't emit decalration only variables. This will be tested when the followup patches implementing import of forward declared entities lands in clang. Reviewers: echristo, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5382 llvm-svn: 218041	2014-09-18 09:38:23 +00:00
Eric Christopher	d85ffb1fc0	Add a new pass FunctionTargetTransformInfo. This pass serves as a shim between the TargetTransformInfo immutable pass and the Subtarget via the TargetMachine and Function. Migrate a single call from BasicTargetTransformInfo as an example and provide shims where TargetMachine begins taking a Function to determine the subtarget. No functional change. llvm-svn: 218004	2014-09-18 00:34:14 +00:00
Robin Morisset	25c8e318e4	[X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928	2014-09-17 00:06:58 +00:00
Quentin Colombet	ac55b15bf4	[CodeGenPrepare][AddressingModeMatcher] The promotion mechanism was expecting instructions when truncate, sext, or zext were created. Fix that. llvm-svn: 217926	2014-09-16 22:36:07 +00:00
Owen Anderson	bfc80a45a7	Add back a fallback case for targets that do not or cannot implement getNoopForMachoTarget(). llvm-svn: 217899	2014-09-16 20:28:00 +00:00
Hal Finkel	cc4f31d3d7	Fix BasicTTI::getCmpSelInstrCost to deal with illegal vector types The default implementation of getCmpSelInstrCost, which provides the cost of icmp/fcmp/select instructions, did not deal sensibly with illegal vector types that were scalarized. We'd ask for the legalization cost of the vector type, which would return something like (4, f64) given an input of <4 x double>, and we'd then check the TLI status of the ISD opcode on that scalar type. This would result in querying (ISD::VSELECT, f64), for example. Amusingly enough, ISD::VSELECT on scalar types is marked as Legal by default (as with most other operations), and most backends never change this because VSELECT is never generated on scalars. However, seeing the resulting operation as Legal, we'd neglect to add the scalarization cost before returning. The result is that we'd grossly under-estimate the cost of cmps/selects on illegal vector types. Now, if type legalization clearly results in scalarization, we skip the early return and add the scalarization cost. llvm-svn: 217859	2014-09-16 04:35:50 +00:00
David Blaikie	ba656e1d7c	DebugInfo: Add comment describing the need to disable address pool usage in skeleton units. Post commit review from Eric Christopher. llvm-svn: 217842	2014-09-15 22:41:25 +00:00
Sanjay Patel	d4f4c4e416	Replace repeated null checks with an assert. NFC. Without a vector to hold the created ops, these functions don't have any use. llvm-svn: 217831	2014-09-15 21:52:51 +00:00

1 2 3 4 5 ...

17230 Commits