llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	6b78bac9fb	[ADT] Add a much simpler loop to DenseMap::clear when the types are POD-like and we can just splat the empty key across memory. Sadly we can't optimize the normal loop well enough because we can't turn the conditional store into an unconditional store according to the memory model. This loop actually showed up in a profile of code that was calling clear as a serious source of time. =[ llvm-svn: 310189	2017-08-05 22:48:37 +00:00
Chandler Carruth	adbf14ab85	[LCG] Completely remove the parent set and leaf tracking for RefSCCs. After the previous series of patches, this is now trivial and deletes a pretty astonishing amount of complexity. This has been a long time coming, as the move toward a PO sequence of RefSCCs started eroding the underlying use cases for this half of the data structure. Among the biggest advantages here is that now there aren't two independent data structures that need to stay in sync. Some of my profiling has also indicated that updating the parent sets was among the most expensive parts of the lazy call graph. Eliminating it whole sale is likely to be a nice win in terms of compile time. Last but not least, I had discussed with some folks previously keeping it around for asserts and other correctness checking, but once the fundamentals of the parent and child checking were implemented without the parent sets their value in correctness checking was tiny and no where near worth the cost of the complexity required to keep everything up-to-date. llvm-svn: 310171	2017-08-05 07:37:00 +00:00
Chandler Carruth	38bd6b50ef	[LCG] Re-implement the basic isParentOf, isAncestorOf, isChildOf, and isDescendantOf methods on RefSCCs in terms of the forward edges rather than the parent sets. This is technically slower, but probably not interestingly slower, and all of these routines were already so expensive that they're guarded behind both !NDEBUG and EXPENSIVE_CHECKS. This removes another non-critical usage of parent sets. I've also added some comments to try and help clarify to any potential users the costs of these routines. They're mostly useful for debugging, asserts, or other queries. llvm-svn: 310170	2017-08-05 06:24:09 +00:00
Chandler Carruth	c718b8e7c3	[LCG] Add the concept of a "dead" node and use it to avoid a complex walk over the parent set. When removing a single function from the call graph, we previously would walk the entire RefSCC's parent set and then walk every outgoing edge just to find the ones to remove. In addition to this being quite high complexity in theory, it is also the last fundamental use of the parent sets. With this change, when we remove a function we transform the node containing it to be recognizably "dead" and then teach the edge iterators to recognize edges to such nodes and skip them the same way they skip null edges. We can't move fully to using "dead" nodes -- when disconnecting two live nodes we need to null out the edge. But the complexity this adds to the edge sequence isn't too bad and the simplification of lazily handling this seems like a significant win. llvm-svn: 310169	2017-08-05 05:47:37 +00:00
Joel Jones	60711ca253	[AArch64] LSE Atomics reorg - part 1 Add memory synchronization semantics to LSE Atomics. The memory semantics feature will be added in a subsequent patch. In this patch, several corrections were added to the existing LSE Atomics implementation, based on the ARM Errata D11904 from 05/12/2017. Patch by: steleman Differential Revision: https://reviews.llvm.org/D35319 llvm-svn: 310167	2017-08-05 04:30:55 +00:00
Chandler Carruth	39df40d8c2	[LCG] Replace an implicit bool operator with a named function. (NFC) The definition of 'false' here was already pretty vague and debatable, and I'm about to add another potential 'false' that would actually make much more sense in a bool operator. Especially given how rarely this is used, a nicely named method seems better. llvm-svn: 310165	2017-08-05 04:04:06 +00:00
Adrian McCarthy	b41f03e768	Enable llvm-pdbutil to list enumerations using native PDB reader This extends the native reader to enable llvm-pdbutil to list the enums in a PDB and it includes a simple test. It does not yet list the values in the enumerations, which requires an actual implementation of NativeEnumSymbol::FindChildren. To exercise this code, use a command like: llvm-pdbutil pretty -native -enums foo.pdb Differential Revision: https://reviews.llvm.org/D35738 llvm-svn: 310144	2017-08-04 22:37:58 +00:00
Adrian Prantl	299bebfbc3	Remove unused include directive and un-break the module build. llvm-svn: 310124	2017-08-04 20:41:37 +00:00
Quentin Colombet	c046208c52	[GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. llvm-svn: 310115	2017-08-04 20:15:46 +00:00
Connor Abbott	66b9bd6e50	[AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic Summary: This intrinsic lets us set inactive lanes to an identity value when implementing wavefront reductions. In combination with Whole Wavefront Mode, it lets inactive lanes be skipped over as required by GLSL/Vulkan. Lowering the intrinsic needs to happen post-RA so that RA knows that the destination isn't completely overwritten due to the EXEC shenanigans, so we need another pseudo-instruction to represent the un-lowered intrinsic. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34719 llvm-svn: 310088	2017-08-04 18:36:54 +00:00
Connor Abbott	92638ab625	[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Wode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087	2017-08-04 18:36:52 +00:00
Connor Abbott	8c217d0a29	[AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM Summary: Previously, we assumed that certain types of instructions needed WQM in pixel shaders, particularly DS instructions and image sampling instructions. This was ok because with OpenGL, the assumption was correct. But we want to start using DPP instructions for derivatives as well as other things, so the assumption that we can infer whether to use WQM based on the instruction won't continue to hold. This intrinsic lets frontends like Mesa indicate what things need WQM based on their knowledge of the API, rather than second-guessing them in the backend. We need to keep around the old method of enabling WQM, but eventually we should remove it once Mesa catches up. For now, this will let us use DPP instructions for computing derivatives correctly. Reviewers: arsenm, tpr, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35167 llvm-svn: 310085	2017-08-04 18:36:49 +00:00
Marcello Maggioni	8de4bbdaa5	[MachineOperand] Add ChangeToTargetIndex method. NFC Differential Revision: https://reviews.llvm.org/D36301 llvm-svn: 310083	2017-08-04 18:24:09 +00:00
Reid Kleckner	af3e93ac93	[Support] Remove getPathFromOpenFD, it was unused Summary: It was added to support clang warnings about includes with case mismatches, but it ended up not being necessary. Reviewers: twoh, rafael Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36328 llvm-svn: 310078	2017-08-04 17:43:49 +00:00
Charles Saternos	75da10d1b2	[ThinLTO] Add FunctionAttrs to ThinLTO index Adds function attributes to index: ReadNone, ReadOnly, NoRecurse, NoAlias. This attributes will be used for future ThinLTO optimizations that will propagate function attributes across modules. llvm-svn: 310061	2017-08-04 16:00:58 +00:00
Nikolai Bozhenov	1545eb3408	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM nodes are illegal thus selection is not happening. So I decided to do such kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 310054	2017-08-04 12:22:17 +00:00
Victor Leschuk	56b03d0dd6	Un-revert r310014: false revert, it wasn't the cause of build break llvm-svn: 310021	2017-08-04 04:51:15 +00:00
Victor Leschuk	21713ebfb1	Revert r310014 as it breaks build lld-x86_64-darwin13 llvm-svn: 310020	2017-08-04 04:43:54 +00:00
Reid Kleckner	02aeadcf3d	[Support] Update comments about stdout, raw_fd_ostream, and outs() The full story is in the comments: // Do not attempt to close stdout or stderr. We used to try to maintain the // property that tools that support writing file to stdout should not also // write informational output to stdout, but in practice we were never able to // maintain this invariant. Many features have been added to LLVM and clang // (-fdump-record-layouts, optimization remarks, etc) that print to stdout, so // users must simply be aware that mixed output and remarks is a possibility. NFC, I am just updating comments to reflect reality. llvm-svn: 310016	2017-08-04 01:39:23 +00:00
Adrian Prantl	fd8c8e9fe6	Teach GlobalSRA to update the debug info for split-up globals. This is similar to what we are doing in "regular" SROA and creates DW_OP_LLVM_fragment operations to describe the resulting variables. rdar://problem/33654891 llvm-svn: 310014	2017-08-04 01:19:54 +00:00
Teresa Johnson	8482e56920	Use profile summary to disable peeling for huge working sets Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further. Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K). Reviewers: davidxl Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36288 llvm-svn: 310005	2017-08-03 23:42:58 +00:00
Easwaran Raman	974d4eea93	[Inliner] Increase threshold for hot callsites without PGO. Summary: This increases the inlining threshold for hot callsites. Hotness is defined in terms of block frequency of the callsite relative to the caller's entry block's frequency. Since this requires BFI in the inliner, this only affects the new PM pipeline. This is enabled by default at -O3. This improves the performance of some internal benchmarks. Notably, an internal benchmark for Gipfeli compression (https://github.com/google/gipfeli) improves by ~7%. Povray in SPEC2006 improves by ~2.5%. I am running more experiments and will update the thread if other benchmarks show improvement/regression. In terms of text size, LLVM test-suite shows an 1.22% text size increase. Diving into the results, 13 of the benchmarks in the test-suite increases by > 10%. Most of these are small, but Adobe-C++/loop_unroll (17.6% increases) and tramp3d(20.7% size increase) have >250K text size. On a large application, the text size increases by 2% Reviewers: chandlerc, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36199 llvm-svn: 309994	2017-08-03 22:23:33 +00:00
Matt Arsenault	a52391f2db	DAG: Provide access to Pass instance from SelectionDAG This allows accessing an analysis pass during lowering. llvm-svn: 309991	2017-08-03 21:54:00 +00:00
Reid Kleckner	175af4bcc7	[PDB] Fix section contributions Summary: PDB section contributions are supposed to use output section indices and offsets, not input section indices and offsets. This allows the debugger to look up the index of the module that it should look up in the modules stream for symbol information. With this change, windbg can now find line tables, but it still cannot print local variables. Fixes PR34048 Reviewers: zturner Subscribers: hiraditya, ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D36285 llvm-svn: 309987	2017-08-03 21:15:09 +00:00
Teresa Johnson	9a18a6f08b	Disable loop peeling during full unrolling pass. Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling. Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled. Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks. Reviewers: chandlerc, davidxl Subscribers: mzolotukhin, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36258 llvm-svn: 309966	2017-08-03 17:52:38 +00:00
NAKAMURA Takumi	2c43dce3ee	Prune linefeed at eof. llvm-svn: 309932	2017-08-03 11:36:44 +00:00
NAKAMURA Takumi	ab91cc3a2f	llvm/Support/CodeGenCWrappers.h: Add missing "llvm/ADT/Optional.h", to fix modules build. llvm-svn: 309931	2017-08-03 11:36:42 +00:00
Max Kazantsev	2cb3653404	[SCEV] Re-enable "Cache results of computeExitLimit" The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly. Differential Revision: https://reviews.llvm.org/D36087 llvm-svn: 309925	2017-08-03 08:41:30 +00:00
Daniel Sanders	820def6fdd	[globalisel][tablegen] Update a comment to use the name of the constant rather than the value. llvm-svn: 309924	2017-08-03 08:38:04 +00:00
Rafael Espindola	8c549de761	Add LLVM_FALLTHROUGH. llvm-svn: 309918	2017-08-03 03:52:34 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Vedant Kumar	4cc7222872	Move two functions to a nicer spot. NFC. llvm-svn: 309906	2017-08-02 23:35:27 +00:00
Vedant Kumar	be6b598bfa	Rely on autobrief, remove \briefs from a header. NFC. llvm-svn: 309905	2017-08-02 23:35:26 +00:00
Vedant Kumar	dde19c5a73	[Coverage] Add an API to retrive all instantiations of a function (NFC) The CoverageMapping::getInstantiations() API retrieved all function records corresponding to functions with more than one instantiation (e.g template functions with multiple specializations). However, there was no simple way to determine which function a given record was an instantiation of. This was an oversight, since it's useful to aggregate coverage information over all instantiations of a function. llvm-cov works around this by building a mapping of source locations to instantiation sets, but this duplicates logic that libCoverage already has (see FunctionInstantiationSetCollector). This change adds a new API, CoverageMapping::getInstantiationGroups(), which returns a list of InstantiationGroups. A group contains records for each instantiation of some particular function, and also provides utilities to get the total execution count within the group, the source location of the common definition, etc. This lets removes some hacky logic in llvm-cov by reusing FunctionInstantiationSetCollector and makes the CoverageMapping API friendlier for other clients. llvm-svn: 309904	2017-08-02 23:35:25 +00:00
Zachary Turner	9fb9d71d3e	[pdb/lld] Write a valid FPM. The PDB reserves certain blocks for the FPM that describe which blocks in the file are allocated and which are free. We weren't filling that out at all, and in some cases we were even stomping it with incorrect data. This patch writes a correct FPM. Differential Revision: https://reviews.llvm.org/D36235 llvm-svn: 309896	2017-08-02 22:31:39 +00:00
Zachary Turner	c3d8eec9e9	[pdbutil] Add a command to dump the FPM. Recently problems have been discovered in the way we write the FPM (free page map). In order to fix this, we first need to establish a baseline about what a correct FPM looks like using an MSVC generated PDB, so that we can then make our own generated PDBs match. And in order to do this, the dumper needs a mode where it can dump an FPM so that we can write tests for it. This patch adds a command to dump the FPM, as well as a test against a known-good PDB. llvm-svn: 309894	2017-08-02 22:25:52 +00:00
Teresa Johnson	ecd901314d	[PM] Split LoopUnrollPass and make partial unroller a function pass Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886	2017-08-02 20:35:29 +00:00
Rafael Espindola	9f92995781	Don't pass the code model to MC I was surprised to see the code model being passed to MC. After all, it assembles code, it doesn't create it. The one place it is used is in the expansion of .cfi directives to handle .eh_frame being more that 2gb away from the code. As far as I can tell, gnu assembler doesn't even have an option to enable this. Compiling a c file with gcc -mcmodel=large produces a regular looking .eh_frame. This is probably because in practice linker parse and recreate .eh_frames. In llvm this is used because the JIT can place the code and .eh_frame very far apart. Ideally we would fix the jit and delete this option. This is hard. Apart from confusion another problem with the current interface is that most callers pass CodeModel::Default, which is bad since MC has no way to map it to the target default if it actually needed to. This patch then replaces the argument with a boolean with a default value. The vast majority of users don't ever need to look at it. In fact, only CodeGen and llvm-mc use it and llvm-mc just to enable more testing. llvm-svn: 309884	2017-08-02 20:32:26 +00:00
David Blaikie	22dc4474a6	DebugInfo: Test & handle (differently) non-zero DW_AT_ranges_base Followup to r309570, fixing it slightly differently (ranges_base and addr_base should never be read from a DWO file - so there shouldn't be any issue with 'overriding' the values - conditionalize the code and assert that the values aren't being overriden). llvm-svn: 309879	2017-08-02 20:16:22 +00:00
Jakub Kuderski	d869913f3b	[Dominators] Teach LoopDeletion to use the new incremental API Summary: This patch makes LoopDeletion use the incremental DominatorTree API. We modify LoopDeletion to perform the deletion in 5 steps: 1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch. 2. Inform the DomTree about the new edge. 3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable. 4. Inform the DomTree about the deleted edge. 5. Remove the unreachable block from the function. Creating the dummy conditional branch is necessary to perform incremental DomTree update. We should consider using the batch updater when it's ready. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35391 llvm-svn: 309850	2017-08-02 18:17:52 +00:00
Adrian Prantl	5577363a2c	Remove the unused Offset field from MachineLocation (NFC) rdar://problem/33580047 llvm-svn: 309831	2017-08-02 17:07:38 +00:00
Adrian Prantl	3d1a2147ea	Assert that the offset in MachineLocation::set() is always 0. (NFC) llvm-svn: 309818	2017-08-02 14:45:50 +00:00
Alexey Bataev	36e6096a03	[SLPVectorizer] Generalize interface of functions, NFC. llvm-svn: 309816	2017-08-02 14:38:07 +00:00
Diana Picus	d5a00b0ff6	[MIR] Print target-specific constant pools This should enable us to test the generation of target-specific constant pools, e.g. for ARM: constants: - id: 0 value: 'g(GOT_PREL)-(LPC0+8-.)' alignment: 4 isTargetSpecific: true I intend to use this to test PIC support in GlobalISel for ARM. This is difficult to test outside of that context, since the existing MIR tests usually rely on parser support as well, and that seems a bit trickier to add. We could try to add a unit test, but the setup for that seems rather convoluted and overkill. We do test however that the parser reports a nice error when encountering a target-specific constant pool. Differential Revision: https://reviews.llvm.org/D36092 llvm-svn: 309806	2017-08-02 11:09:30 +00:00
Daniel Sanders	078572b6b1	[globalisel][tablegen] Do not merge memoperands from instructions that weren't in the match. Summary: Fix a bug discovered in an out-of-tree target where memoperands from pseudo-instructions that weren't part of the match were being merged into the result instructions as part of GIR_MergeMemOperands. This bug was caused by a change to the handling of State.MIs between rules when the state machine tables were fused into a single table. Previously, each rule would reset State.MIs using State.MIs.resize(1) but this is no longer done, as a result stale data is occasionally left in some elements of State.MIs. Most opcodes aren't affected by this but GIR_MergeMemOperands merges all memoperands from the intructions recorded in State.MIs into the result instruction. Suppose for example, we processed but rejected the following pattern: (signextend (load x)) at this point, State.MIs contains the signextend and the load. Now suppose we process and accept this pattern: (add x, y) at this point, State.MIs contains the add as well as the (now irrelevant) load. When GIR_MergeMemOperands is processed, the memoperands from that irrelevant load will be merged into the result instruction even though it was not part of the match. Bringing back the State.MIs.resize(1) would fix the problem but it would limit our ability to optimize the table in the future. Instead, this patch fixes the problem by explicitly stating which instructions should be merged into the result. There's no direct test case in this commit because a test case would be very brittle. However, at the time of writing this should fix the failures in http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/ as well as a failure in test/CodeGen/ARM/GlobalISel/arm-isel.ll when expensive checks are enabled. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: fhahn, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36094 llvm-svn: 309804	2017-08-02 11:03:36 +00:00
Dehao Chen	3246dc3df1	Fix the bug that parseAAPipeline is not invoked in runNewPMPasses in release compiler. Summary: The logic is guarded by "assert". Reviewers: davidxl, davide, chandlerc Reviewed By: davide, chandlerc Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36195 llvm-svn: 309787	2017-08-02 03:03:19 +00:00
Chandler Carruth	95055d8f8b	[PM] Fix a bug where through CGSCC iteration we can get infinite-inlining across multiple runs of the inliner by keeping a tiny history of internal-to-SCC inlining decisions. This is still a bit gross, but I don't yet have any fundamentally better ideas and numerous people are blocked on this to use new PM and ThinLTO together. The core of the idea is to detect when we are about to do an inline that has a chance of re-splitting an SCC which we have split before with a similar inlining step. That is a critical component in the inlining forming a cycle and so far detects all of the various cyclic patterns I can come up with as well as the original real-world test case (which comes from a ThinLTO build of libunwind). I've added some tests that I think really demonstrate what is going on here. They are essentially state machines that march the inliner through various steps of a cycle and check that we stop when the cycle is closed and that we actually did do inlining to form that cycle. A lot of thanks go to Eric Christopher and Sanjoy Das for the help understanding this issue and improving the test cases. The biggest "yuck" here is the layering issue -- the CGSCC pass manager is providing somewhat magical state to the inliner for it to use to make itself converge. This isn't great, but I don't honestly have a lot of better ideas yet and at least seems nicely isolated. I have tested this patch, and it doesn't block any inlining on the entire LLVM test suite and SPEC, so it seems sufficiently narrowly targeted to the issue at hand. We have come up with hypothetical issues that this patch doesn't cover, but so far none of them are practical and we don't have a viable solution yet that covers the hypothetical stuff, so proceeding here in the interim. Definitely an area that we will be back and revisiting in the future. Differential Revision: https://reviews.llvm.org/D36188 llvm-svn: 309784	2017-08-02 02:09:22 +00:00
Adrian Prantl	4f345060dd	Remove unused accessor (NFC) rdar://problem/33580047 llvm-svn: 309763	2017-08-01 23:16:36 +00:00
Adrian Prantl	81ea122121	Assert that the offset of a MachineLocation is always 0. This is to convince me that it may safely be removed in a follow-up commit. rdar://problem/33580047 llvm-svn: 309761	2017-08-01 22:57:05 +00:00
Adrian Prantl	032d2381bf	Remove PrologEpilogInserter's usage of DBG_VALUE's offset field In the last half-dozen commits to LLVM I removed code that became dead after removing the offset parameter from llvm.dbg.value gradually proceeding from IR towards the backend. Before I can move on to DwarfDebug and friends there is one last side-called offset I need to remove: This patch modifies PrologEpilogInserter's use of the DBG_VALUE's offset argument to use a DIExpression instead. Because the PrologEpilogInserter runs at the Machine level I had to play a little trick with a named llvm.dbg.mir node to get the DIExpressions to print in MIR dumps (which print the llvm::Module followed by the MachineFunction dump). I also had to add rudimentary DwarfExpression support to CodeView and as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo was supposed to give up on variables with complex DIExpressions, but would fail to do so for fragments, which are also modeled as DIExpressions). With this last holdover removed we will have only one canonical way of representing offsets to debug locations which will simplify the code in DwarfDebug (and future versions of CodeViewDebug once it starts handling more complex expressions) and make it easier to reason about. This patch is NFC-ish: All test case changes are for assembler comments and the binary output does not change. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D36125 llvm-svn: 309751	2017-08-01 21:45:24 +00:00

1 2 3 4 5 ...

32171 Commits