llvm-project

Commit Graph

Author	SHA1	Message	Date
Teresa Johnson	720d9b4111	Fix code section prefix for proper layout Summary: r284533 added hot and cold section prefixes based on profile information, to enable grouping of hot/cold functions at link time. However, it used "cold" as the prefix for cold sections, but gold only recognizes "unlikely" (which is used by gcc for cold sections). Therefore, cold sections were not properly being grouped. Switch to using "unlikely" Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32983 llvm-svn: 302502	2017-05-09 01:43:24 +00:00
Sanjoy Das	e6bca0eecb	Rename WeakVH to WeakTrackingVH; NFC This relands r301424. llvm-svn: 301812	2017-05-01 17:07:49 +00:00
Daniel Berlin	4d0fe64ae3	Kill off the old SimplifyInstruction API by converting remaining users. llvm-svn: 301673	2017-04-28 19:55:38 +00:00
Sanjoy Das	2cbeb00f38	Reverts commit r301424, r301425 and r301426 Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429	2017-04-26 16:37:05 +00:00
Sanjoy Das	01de557738	Rename WeakVH to WeakTrackingVH; NFC Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit. Reviewers: dblaikie, davide Reviewed By: davide Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D32266 llvm-svn: 301424	2017-04-26 16:20:52 +00:00
Craig Topper	fc947bcfba	[APInt] Use lshrInPlace to replace lshr where possible This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566	2017-04-18 17:14:21 +00:00
Brendon Cahoon	7769a0854e	[CodeGenPrepare] Fix crash due to an invalid CFG The splitIndirectCriticalEdges function generates and invalid CFG when the 'Target' basic block is a loop to itself. When this occurs, the code that updates the predecessor terminator needs to update the terminator in the split basic block. This occurs when there is an edge from block D back to D. Since D is split in to D0 and D1, the code needs to update the terminator in D1. But D1 is not in the OtherPreds vector, so it was not getting updated. Differential Revision: https://reviews.llvm.org/D32126 llvm-svn: 300480	2017-04-17 19:11:04 +00:00
Chandler Carruth	927d8e610a	[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032	2017-04-12 07:27:28 +00:00
Eli Friedman	5fba1e53f2	Turn on -addr-sink-using-gep by default. The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 llvm-svn: 299723	2017-04-06 22:42:18 +00:00
Jun Bum Lim	dee5565869	[CodeGenPrep] move aarch64-type-promotion to CGP Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also support forking sexts when a new sext is required for promotion. Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853. Reviewers: jmolloy, mcrosier, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, aemerson, rengolin, mcrosier Differential Revision: https://reviews.llvm.org/D28680 llvm-svn: 299379	2017-04-03 19:20:07 +00:00
Craig Topper	d33ee1b960	[APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362	2017-04-03 16:34:59 +00:00
Dehao Chen	775341a14c	Use isFunctionHotInCallGraph to set the function section prefix. Summary: The current prefix based function layout algorithm only looks at function's entry count, which is not sufficient. A function should be grouped together if its entry count or any call edge count is hot. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31225 llvm-svn: 298656	2017-03-23 23:14:11 +00:00
Reid Kleckner	b518054b87	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393	2017-03-21 16:57:19 +00:00
Jun Bum Lim	4230101def	[CodeGenPrep]Restructure promoting Ext to form ExtLoad Summary: Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case. This change was motivated from D26524. Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP. Reviewers: jmolloy, qcolombet, mcrosier, javed.absar Reviewed By: qcolombet Subscribers: rengolin, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D27853 llvm-svn: 298114	2017-03-17 19:05:21 +00:00
Matt Arsenault	02d915be90	CodeGenPrepare: Sink addressing modes for atomics llvm-svn: 297903	2017-03-15 22:35:20 +00:00
Michael Kuperstein	13bf8a2684	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296416	2017-02-28 00:11:34 +00:00
Daniel Jasper	3ca4525612	Revert "[CGP] Split some critical edges coming out of indirect branches" This reverts commit r296149 as it leads to crashes when compiling for PPC. llvm-svn: 296295	2017-02-26 11:09:12 +00:00
Eli Friedman	c12a5a7595	[CodeGenPrepare] Make -addr-sink-using-gep work with address spaces. When we construct addressing modes, we use isNoopAddrSpaceCast to ignore addrspacecast instructions. Make sure we insert the correct addrspacecast when we reconstruct the addressing mode. Differential Revision: https://reviews.llvm.org/D30114 llvm-svn: 296167	2017-02-24 20:51:36 +00:00
Michael Kuperstein	46b131e3f8	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296149	2017-02-24 18:41:32 +00:00
Michael Kuperstein	581c9f4b20	Revert r269060 to pacify bots. llvm-svn: 296064	2017-02-24 01:22:19 +00:00
Michael Kuperstein	12e79d5002	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296060	2017-02-24 00:56:21 +00:00
Geoff Berry	5d534b6a11	[CodeGenPrepare] Sink and duplicate more 'and' instructions. Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64. The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact the OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used. One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set to avoid an infinite loop of hoisting and sinking the same 'and'. This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382). Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: aemerson, mcrosier, sebpop, llvm-commits Differential Revision: https://reviews.llvm.org/D28813 llvm-svn: 295746	2017-02-21 18:53:14 +00:00
Matt Arsenault	1672b1b86e	TargetLowering: Remove AddrSpace parameter from GetAddrModeArguments It doesn't make any sense to pass in to what is supposed to be parsing the call, and this can be inferred from the pointer output. llvm-svn: 294412	2017-02-08 07:09:03 +00:00
Igor Laevsky	3be81bae63	[CodeGenPrepare] Hoist all getSubtargetImpl calls to the beginning of the pass Differential Revision: https://reviews.llvm.org/D29456 llvm-svn: 294301	2017-02-07 13:27:20 +00:00
Jun Bum Lim	b99a06b7c9	[CodeGenPrep]No negative cost in the ExtLd promotion Summary: This change prevent the signed value of cost from being negative as the value is passed as an unsigned argument. Reviewers: mcrosier, jmolloy, qcolombet, javed.absar Reviewed By: mcrosier, qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28871 llvm-svn: 293307	2017-01-27 17:16:37 +00:00
Haicheng Wu	8ce2d14356	[CodeGenPrepare] Fix a typo in the comment. NFC. encode => endcode. Differential Revision: https://reviews.llvm.org/D28866 llvm-svn: 292438	2017-01-18 21:12:10 +00:00
Wei Mi	a2f0b594c2	Redo store splitting in CodeGenPrepare. This is a succeeding patch of https://reviews.llvm.org/D22840 to address the issue when a value to be merged into an int64 pair is in a different BB. Redoing the store splitting in CodeGenPrepare so we can match the pattern across multiple BBs and move some instructions into the same BB. We still keep the code in dag combine so that we can catch cases that show up after DAG combining runs. Differential Revision: https://reviews.llvm.org/D25914 llvm-svn: 290365	2016-12-22 19:44:45 +00:00
George Burgess IV	3f08914e7e	[Analysis] Centralize objectsize lowering logic. We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214	2016-12-20 23:46:36 +00:00
Jun Bum Lim	90b6b5074a	[CodeGenPrep] Skip merging empty case blocks This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly : Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289988	2016-12-16 20:38:39 +00:00
Sanjoy Das	007572706b	Inline stripInvariantGroupMetadata out of existence As a one liner function, I don't think it is pulling its weight in terms of helping readability. llvm-svn: 289987	2016-12-16 20:29:39 +00:00
Sanjoy Das	089c699743	Fix CodeGenPrepare::stripInvariantGroupMetadata `dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The only reason it worked at all is that `getMetadataID` returns something unrelated -- it returns the subclass ID of the receiver (which is used in `dyn_cast` etc.). That does not numerically match `LLVMContext::MD_invariant_group` and ends up dropping `invariant_group` along with every other metadata that does not numerically match `LLVMContext::MD_invariant_group`. llvm-svn: 289973	2016-12-16 18:52:33 +00:00
Jun Bum Lim	f9416af191	Revert "[CodeGenPrep] Skip merging empty case blocks" This reverts commit r289951. llvm-svn: 289960	2016-12-16 17:06:14 +00:00
Jun Bum Lim	85347dde27	[CodeGenPrep] Skip merging empty case blocks This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block: Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289951	2016-12-16 16:03:31 +00:00
Peter Collingbourne	ab85225be4	IR: Change the gep_type_iterator API to avoid always exposing the "current" type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458	2016-12-02 02:24:42 +00:00
Joerg Sonnenberger	caaa82d90d	Revert r287553: [CodeGenPrep] Skip merging empty case blocks It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line 670 ("Expected irreducible CFG"). llvm-svn: 288052	2016-11-28 18:56:54 +00:00
Justin Lebar	3e50a5be8f	[CodeGenPrepare] Don't sink non-cheap addrspacecasts. Summary: Previously, CGP would unconditionally sink addrspacecast instructions, even going so far as to sink them into a loop. Now we check that the cast is "cheap", as defined by TLI. We introduce a new "is-cheap" function to TLI rather than using isNopAddrSpaceCast because some GPU platforms want the ability to ask for non-nop casts to be sunk. Reviewers: arsenm, tra Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26923 llvm-svn: 287591	2016-11-21 22:49:15 +00:00
Justin Lebar	838c7f5a85	[CodeGenPrepare] Rewrite a loop in terms of llvm::none_of. NFC. Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26924 llvm-svn: 287590	2016-11-21 22:49:11 +00:00
Jun Bum Lim	82f55c5446	[CodeGenPrep] Skip merging empty case blocks Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 287553	2016-11-21 16:47:28 +00:00
Simon Pilgrim	f2fbf43704	Fix comment typos. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287490	2016-11-20 13:47:59 +00:00
Chandler Carruth	0f139b4f71	Hoist check for TLI above all of the attempts to use it (including one of which that is hidden inside a separate function call) and helpfully before building expensive transaction infrastructure. This will avoid crashing when running CGP in a generic mode if we ever managed to hit this case. Note that I spent some time looking at alternatives. CGP is actually used without a TM or TLI in order to do some target-independent testing. Further, all of the neighboring optimization techniques actually have some paths that are effective even in the absence of TLI so this seemed the correct scope at which to check and bypass logic. It still isn't clear that long-term support for missing TM/TLI is the right cost/benefit tradeoff for CGP -- we seem to get relatively little for it and the code is just littered with checks (and assumptions which I suspect are still missing some checks). This at least fixes the potential bug in this code spotted by PVS-Studio, so we've got that going for us. ;] llvm-svn: 285987	2016-11-04 06:54:00 +00:00
Dehao Chen	302b69c940	Use profile info to set function section prefix to group hot/cold functions. Summary: The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation: 1. add a new metadata for function section prefix 2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function 3. output the section prefix set by CGP Reviewers: davidxl, eraman Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24989 llvm-svn: 284533	2016-10-18 20:42:47 +00:00
Andrea Di Biagio	fa90c692db	[CodeGenPrepare] When moving a zext near to its associated load, do not retain the original debug location. CodeGenPrepare knows how to move a zext of a load into the same basic block where the load lives. The goal is to help ISel match a zero-extending load instead of two separated instructions. CGP attempts to move a zext computation even if it lives in a basic block that does not post-dominate the load's basic block. That means, the hoisted zext may be speculated. Preserving the zext location would hurt the debugging experience and the quality of sample pgo. With this patch, when moving a zext near to its associated load, CGP no longer propagates the zext's debug location. Instead, CGP conservatively reuses the same debug location for the load and the zext. An alternative approach would be to assign an artificial line-0 location to the zext. However we don't want to over-use the 'line-0' for this particular case because it would have a size cost in the line-table section for no additional benefit. Differential Revision: https://reviews.llvm.org/D25611 llvm-svn: 284377	2016-10-17 11:32:26 +00:00
Wolfgang Pieb	e51bede1d8	Preserve the debug location when CodeGenPrepare sinks a compare instruction into the basic block of a user. Patch by Andrea DiBiagio. Differential Revision: https://reviews.llvm.org/D24632 llvm-svn: 283500	2016-10-06 21:43:45 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Dehao Chen	c32d71253c	Fix the bug introduced in r281252. llvm-svn: 281253	2016-09-12 20:29:54 +00:00
Dehao Chen	9bbb941acf	Lower consecutive select instructions correctly. Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095 Reviewers: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24147 llvm-svn: 281252	2016-09-12 20:23:28 +00:00
Michael Kuperstein	f79af6f8c4	[CGP] Be less conservative about tail-duplicating a ret to allow tail calls CGP tail-duplicates rets into blocks that end with a call that feed the ret. This puts the call in tail position, potentially allowing the DAG builder to lower it as a tail call. To avoid tail duplication in cases where we won't form the tail call, CGP tried to predict whether this is going to be possible, and avoids doing it when lowering as a tail call will definitely fail. However, it was being too conservative by always throwing away calls to functions with a signext/zeroext attribute on the return type. Instead, we can use the same logic the builder uses to determine whether the attributes work out. Differential Revision: https://reviews.llvm.org/D24315 llvm-svn: 280894	2016-09-08 00:48:37 +00:00
Michael Kuperstein	71321563de	Don't reuse a variable name in a nested scope. NFC. llvm-svn: 280853	2016-09-07 20:29:49 +00:00
Xinliang David Li	241e6c7086	[Profile] preserve branch metadata lowering select in CGP CGP currently drops select's MD_prof profile data when generating conditional branch which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600	2016-09-03 21:26:36 +00:00
David Majnemer	0d955d0bf5	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433	2016-08-11 22:21:41 +00:00
Tim Northover	918f05063c	CodeGenPrep: use correct function to determine Global's alignment. Elsewhere (particularly computeKnownBits) we assume that a global will be aligned to the value returned by Value::getPointerAlignment. This is used to boost the alignment on memcpy/memset, so any target-specific request can only increase that value. llvm-svn: 275866	2016-07-18 18:28:52 +00:00
Chad Rosier	a00df49dc5	Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC. Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom, so the ordering of the MatchBSwaps and MatchBitReversals arguments are consistent with the function name. llvm-svn: 270715	2016-05-25 16:22:14 +00:00
Jun Bum Lim	be11bdc4b0	Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC. Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes fixed in r269433. Reviewers: joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20248 llvm-svn: 269456	2016-05-13 18:38:35 +00:00
Sanjay Patel	c7b91e65d8	[CGP] avoid crashing from weightlessness It's possible that we have branch weights with 0 values. In that case, don't try to create an impossible BranchProbability. llvm-svn: 268935	2016-05-09 17:31:55 +00:00
David Majnemer	0c80e2eac6	[CodeGenPrepare] Don't sink a cast past its user The sink cast machinery is supposed to sink casts as close to their user as possible. However, an EH pad is the first instruction in it's basic block. Don't sink if the user is an EH pad. This fixes PR27536. llvm-svn: 267767	2016-04-27 19:36:38 +00:00
Sanjay Patel	d66607bd8c	[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572	2016-04-26 17:11:17 +00:00
Sanjay Patel	a31b0c0ece	[CodeGenPrepare] don't convert an unpredictable select into control flow Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504	2016-04-26 00:47:39 +00:00
Sanjay Patel	dc88bd6e1f	replace duplicated static functions for profile metadata access with BranchInst member function; NFCI llvm-svn: 267295	2016-04-23 20:01:22 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
Andrew Kaylor	f0f279291c	Initial implementation of optimization bisect support. This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022	2016-04-21 17:58:54 +00:00
Petar Jovanovic	644b8c1a5d	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
Sanjay Patel	892f167aa5	use range-loops; NFCI llvm-svn: 265985	2016-04-11 20:13:44 +00:00
Chuang-Yu Cheng	d3fb38cae5	Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge Presently, CodeGenPrepare deletes all nearly empty (only phi and branch) basic blocks. This pass can delete loop preheaders which frequently creates critical edges. A preheader can be a convenient place to spill registers to the stack. If the entrance to a loop body is a critical edge, then spills may occur in the loop body rather than immediately before it. This patch protects loop preheaders from deletion in CodeGenPrepare even if they are nearly empty. Since the patch alters the CFG, it affects a large number of test cases. In most cases, the changes are merely cosmetic (basic blocks have different names or instruction orders change slightly). I am somewhat concerned about the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not deleted, then the MIPS backend does not take advantage of a branch delay slot. Consequently, I would like some close review by a MIPS expert. The patch also partially subsumes D16893 from George Burgess IV. George correctly notes that CodeGenPrepare does not actually preserve the dominator tree. I think the dominator tree was usually not valid when CodeGenPrepare ran, but I am using LoopInfo to mark preheaders, so the dominator tree is now always valid before CodeGenPrepare. Author: Tom Jablin (tjablin) Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng http://reviews.llvm.org/D16984 llvm-svn: 265397	2016-04-05 14:06:20 +00:00
Peter Zotov	8efe38a1e2	[CodeGenPrepare] Fix r265264 (again). Don't require TLI for SinkCmpExpression, like it wasn't before r265264. llvm-svn: 265271	2016-04-03 19:32:13 +00:00
Peter Zotov	f87e550e89	[CodeGenPrepare] Fix r265264. The case where there was no TargetLowering was not handled, leading to null pointer dereferences. llvm-svn: 265265	2016-04-03 17:11:53 +00:00
Peter Zotov	0b6d7bc682	[CodeGenPrepare] Avoid sinking soft-FP comparisons Sinking comparisons in CGP can undo the job of hoisting them done earlier by LICM, and soft-FP makes this an expensive mistake. A common pattern that produces floating point comparisons uniform over a loop is an explicit check for division by zero. If the divisor is hoisted out of the loop, the comparison can also be, but hoisting the function that unwinds is never legal, since it may cause side effects in the loop body prior to the unwinding to not be executed. Differential Revision: http://reviews.llvm.org/D18744 llvm-svn: 265264	2016-04-03 16:36:17 +00:00
George Burgess IV	d4febd1612	Keep CodeGenPrepare from preserving the domtree. CGP modifies the domtree in some cases, so saying that it preserves the domtree is a lie. We'll be able to selectively preserve it with the new pass manager. Differential Revision: http://reviews.llvm.org/D16893 llvm-svn: 264099	2016-03-22 21:25:08 +00:00
Junmo Park	6098cbbd2c	Minor code cleanups. NFC. llvm-svn: 263200	2016-03-11 07:05:32 +00:00
Philip Reames	ac115ed72f	[CGP] Duplicate addressing computation in cold paths if required to sink addressing mode This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store. In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path. This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had. Differential Revision: http://reviews.llvm.org/D17652 llvm-svn: 263074	2016-03-09 23:13:12 +00:00
Junmo Park	161dc1c605	[CodeGenPrepare] Remove load-based heuristic Summary: Both the hardware and LLVM have changed since 2012. Now, load-based heuristic don't show big differences any more on OoO cores. There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5). Reviewers: spatel, zansari Differential Revision: http://reviews.llvm.org/D16836 llvm-svn: 261809	2016-02-25 00:23:27 +00:00
Duncan P. N. Exon Smith	a848c47130	ADT: Stop using getNodePtrUnchecked on end() iterators Stop using `getNodePtrUnchecked()` when building IR. Eventually a dereference will be required to get at the downcast node, since the iterator will only store an `ilist_node_base` of some sort. This should have no functionality change for now, but is a path towards removing some more UB from ilist. llvm-svn: 261495	2016-02-21 19:52:15 +00:00
Duncan P. N. Exon Smith	7b269642d2	CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC `ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being for internal use only, but CodeGenPrepare was using it anyway. This code relies on pulling out the `Value` pointer even after the lifetime of the iterator is over. But having this pointer available in ilist_iterator depends on UB in the first place. Instead, safely pull out the `Value` when the iterator is alive and stop using the internal-only API. There should be no functionality change here. llvm-svn: 261493	2016-02-21 19:37:45 +00:00
Yaron Keren	eb2a25467e	Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240	2016-01-29 20:50:44 +00:00
Junmo Park	7d6c5f19f1	Minor code cleanups. NFC. llvm-svn: 259033	2016-01-28 09:42:39 +00:00
Sanjay Patel	3388d1fc6d	function names start with a lowercase letter; NFC llvm-svn: 258552	2016-01-22 21:11:47 +00:00
Sanjay Patel	545a456235	fix formatting; NFC llvm-svn: 258330	2016-01-20 18:59:16 +00:00
Manuel Jacob	5f6eaac611	GlobalValue: use getValueType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999	2016-01-16 20:30:46 +00:00
James Y Knight	ac03dca412	Stop increasing alignment of externally-visible globals on ELF platforms. With ELF, the alignment of a global variable in a shared library will get copied into an executables linked against it, if the executable even accesss the variable. So, it's not possible to implicitly increase alignment based on access patterns, or you'll break existing binaries. This happened to affect libc++'s std::cout symbol, for example. See thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311 (This is a re-commit of r257719, without the bug reported in PR26144. I've tweaked the code to not assert-fail in enforceKnownAlignment when computeKnownBits doesn't recurse far enough to find the underlying Alloca/GlobalObject value.) Differential Revision: http://reviews.llvm.org/D16145 llvm-svn: 257902	2016-01-15 16:33:06 +00:00
James Molloy	3ef84c4cbb	[CodeGenPrepare] Try and appease sanitizers dupRetToEnableTailCallOpts(BB) can invalidate BB. It must run after we iterate across BB! llvm-svn: 257886	2016-01-15 10:36:01 +00:00
James Molloy	f01488e2bc	[InstCombine] Rewrite bswap/bitreverse handling completely. There are several requirements that ended up with this design; 1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early. 2. Bitreversals and byteswaps are very related in their matching logic. 3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses. 4. Bswaps are best matched early in InstCombine. The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals. We can then extend the matching logic in one place only. llvm-svn: 257875	2016-01-15 09:20:19 +00:00
James Y Knight	582f556251	Revert "Stop increasing alignment of externally-visible globals on ELF platforms." This reverts commit r257719, due to PR26144. llvm-svn: 257775	2016-01-14 16:33:21 +00:00
James Y Knight	9de6d7becc	Stop increasing alignment of externally-visible globals on ELF platforms. With ELF, the alignment of a global variable in a shared library will get copied into an executables linked against it, if the executable even accesss the variable. So, it's not possible to implicitly increase alignment based on access patterns, or you'll break existing binaries. This happened to affect libc++'s std::cout symbol, for example. See thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311 llvm-svn: 257719	2016-01-13 23:59:19 +00:00
Junmo Park	aa9243a25d	Remove extra whitespace. NFC. llvm-svn: 257144	2016-01-08 04:20:32 +00:00
Manuel Jacob	83eefa6d20	[Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC. Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811	2016-01-05 04:03:00 +00:00
Eric Christopher	49a7d6c473	Clarify that the bypassSlowDivision optimization operates on a single BB [v2] Update some comments to be more explicit. Change bypassSlowDivision and the functions it calls so that they take BasicBlocks and Instructions, rather than Function::iterator&s and BasicBlock::iterator&s. Change the APIs so that the caller is responsible for updating the iterator, rather than the callee. This makes control flow much easier to follow. Patch by Justin Lebar! llvm-svn: 256789	2016-01-04 23:18:58 +00:00
Manuel Jacob	5b90b147d4	Remove unnecessary casts. NFC. llvm-svn: 256101	2015-12-19 18:38:42 +00:00
Sanjay Patel	af674fbfd9	getParent() ^ 3 == getModule() ; NFCI llvm-svn: 255511	2015-12-14 17:24:23 +00:00
Reid Kleckner	8de1fe23ed	[CGP] Reimplement r255055 a different way llvm-svn: 255070	2015-12-08 23:00:03 +00:00
Reid Kleckner	e18f92bfe9	Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around" This reverts commit r255055. Breakage has been reported. llvm-svn: 255063	2015-12-08 22:33:23 +00:00
Reid Kleckner	7c005324d5	[CGP] Check that we have an insert point before moving llvm.dbg.value around llvm-svn: 255055	2015-12-08 21:50:52 +00:00
Andrew Kaylor	d0430e8580	[WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad. Differential Revision: http://reviews.llvm.org/D14842 llvm-svn: 253902	2015-11-23 19:16:15 +00:00
Geoff Berry	5256fcada0	[CodeGenPrepare] Create more extloads and fewer ands Summary: Add and instructions immediately after loads that only have their low bits used, assuming that the (and (load x) c) will be matched as a extload and the ands/truncs fed by the extload will be removed by isel. Reviewers: mcrosier, qcolombet, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14584 llvm-svn: 253722	2015-11-20 22:34:39 +00:00
Sanjay Patel	4699b8ab6a	[CGP] despeculate expensive cttz/ctlz intrinsics This is another step towards allowing SimplifyCFG to speculate harder, but then have CGP clean things up if the target doesn't like it. Previous patches in this series: http://reviews.llvm.org/D12882 http://reviews.llvm.org/D13297 D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special handling because of weirdness in the intrinsic definition for handling a zero input (that definition can probably be blamed on x86). For example, if we have the usual speculated-by-select expensive op pattern like this: %tobool = icmp eq i64 %A, 0 %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true %cond = select i1 %tobool, i64 64, i64 %0 ret i64 %cond There's an instcombine that will turn it into: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false This CGP patch is looking for that case and despeculating it back into: entry: %tobool = icmp eq i64 %A, 0 br i1 %tobool, label %cond.end, label %cond.true cond.true: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true br label %cond.end cond.end: %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ] ret i64 %cond This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), but if we increase speculation in SimplifyCFG (the next step in this patch series), then we should avoid those kinds of cases in the first place. The need for this patch was originally mentioned here: http://reviews.llvm.org/D7506 with follow-up here: http://reviews.llvm.org/D7554 Differential Revision: http://reviews.llvm.org/D14630 llvm-svn: 253573	2015-11-19 16:37:10 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Igor Laevsky	f637b4a52e	[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks Differential Revision: http://reviews.llvm.org/D14258 llvm-svn: 251957	2015-11-03 18:37:40 +00:00
Sanjay Patel	0ed9aeaa5f	[CGP] widen switch condition and case constants to target's register width (2nd try) This is a redo of r251849 except the tests have been split into arch-specific folders to hopefully make the bots happy. This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251857	2015-11-02 23:22:49 +00:00
Sanjay Patel	dfc825eb36	revert r251849; need to move tests to arch-specific folders llvm-svn: 251851	2015-11-02 23:05:20 +00:00
Sanjay Patel	b90a078de9	[CGP] widen switch condition and case constants to target's register width This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251849	2015-11-02 22:46:24 +00:00
Elena Demikhovsky	092858588a	Scalarizer for masked.gather and masked.scatter intrinsics. When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237	2015-10-25 15:37:55 +00:00
Rafael Espindola	84921b9860	Refactor: Simplify boolean conditional return statements in lib/CodeGen. Patch by Richard. llvm-svn: 251213	2015-10-24 23:11:13 +00:00
Elena Demikhovsky	3ad76a1acd	Masked Load/Store optimization for scalar code When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks. I added optimization for constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893	2015-10-21 11:50:54 +00:00
Sanjay Patel	69a50a1e17	[CGP] transform select instructions into branches and sink expensive operands This was originally checked in at r250527, but reverted at r250570 because of PR25222. There were at least 2 problems: 1. The cost check was checking for an instruction with an exact cost of TCC_Expensive; that should have been >=. 2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions; we can't sink instructions that may have side effects / are not safe to execute speculatively. Fixed those conditions in sinkSelectOperand() and added test cases. Original commit message: This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250743	2015-10-19 21:59:12 +00:00
Elena Demikhovsky	20662e39f1	Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore(). Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686	2015-10-19 07:43:38 +00:00
Benjamin Kramer	b43d33bf0f	Revert "This is a follow-up to the discussion in D12882." Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528. llvm-svn: 250570	2015-10-16 23:00:29 +00:00
Sanjay Patel	374dd8d88e	This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250527	2015-10-16 16:54:30 +00:00
Duncan P. N. Exon Smith	d83547a16e	CodeGen: Remove a few more ilist iterator implicit conversions, NFC llvm-svn: 249875	2015-10-09 18:44:40 +00:00
Sanjay Patel	9fbe22bac6	fix typos; NFC llvm-svn: 249863	2015-10-09 18:01:03 +00:00
Sanjay Patel	4e6527682a	tidy up comments; NFC llvm-svn: 248750	2015-09-28 22:14:51 +00:00
Sanjay Patel	5e5f0e9756	move one-use check under the comment that describes it; NFCI llvm-svn: 248745	2015-09-28 21:44:46 +00:00
Sanjay Patel	fc580a60e2	function names should start with a lower case letter; NFC llvm-svn: 248224	2015-09-21 23:03:16 +00:00
Sanjay Patel	4ac6b115e8	don't repeat function/variable names in header comments; NFC llvm-svn: 248222	2015-09-21 22:47:23 +00:00
Piotr Padlewski	ea09288ee7	Added MD_invariant_group to LLVMContext http://reviews.llvm.org/D12926 llvm-svn: 247931	2015-09-17 20:25:07 +00:00
Piotr Padlewski	6c15ec49ed	Introducing llvm.invariant.group.barrier intrinsic For more info for what reason it was invented, goto: http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html invariant.group.barrier: http://reviews.llvm.org/D12310 docs: http://reviews.llvm.org/D11399 CodeGenPrepare: http://reviews.llvm.org/D12875 llvm-svn: 247711	2015-09-15 18:32:14 +00:00
Sanjay Patel	42574203e5	use "unpredictable" metadata in fast-isel when splitting compares This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12342 llvm-svn: 246692	2015-09-02 19:23:23 +00:00
Sanjay Patel	82d91ddb4f	fix minsize detection: minsize attribute implies optimizing for size Also, add a test for optsize because this was not part of any existing regression test. llvm-svn: 244651	2015-08-11 19:39:36 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Benjamin Kramer	4cd5faaa87	[CodeGenPrepare] Compress a pair. No functional change. llvm-svn: 243759	2015-07-31 17:00:39 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	8ac7a9d57a	Redirect DataLayout from TargetMachine to Module in SelectionDAG Summary: SelectionDAG itself is not invoking directly the DataLayout in the TargetMachine, but the "TargetLowering" class is still using it. I'll address it in a following commit. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241618	2015-07-07 19:07:19 +00:00
Mehdi Amini	4fe3798dca	Redirect DataLayout from TargetMachine to Module in CodeGen Prepare Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10986 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241614	2015-07-07 18:45:17 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Ahmed Bougacha	f32991461f	[CodeGenPrepare] Generalize inserted set from truncs to any inst. It's been used before to avoid infinite loops caused by separate CGP optimizations undoing one another. We found one more such issue caused by r238054. To avoid it, generalize the "InsertedTruncs" set to any inst, and use it to avoid touching those again. llvm-svn: 239938	2015-06-17 20:44:32 +00:00
Matt Arsenault	f72b49bc17	CodeGenPrepare: Provide address space to isLegalAddressingMode Use -1 as the address space if it can't be determined. llvm-svn: 239052	2015-06-04 16:17:38 +00:00
Matt Arsenault	f05b02351f	CodeGenPrepare: Don't match addressing modes through addrspacecast This was resulting in the addrspacecast being removed and incorrectly replaced with a ptrtoint when sinking. llvm-svn: 238217	2015-05-26 16:59:43 +00:00
Ahmed Bougacha	236f9040d0	[AArch64][CGP] Sink zext feeding stxr/stlxr into the same block. The usual CodeGenPrepare trickery, on a target-specific intrinsic. Without this, the expansion of atomics will usually have the zext be hoisted out of the loop, defeating the various patterns we have to catch this precise case. Differential Revision: http://reviews.llvm.org/D9930 llvm-svn: 238054	2015-05-22 21:37:17 +00:00
Pete Cooper	833f34d837	Convert PHI getIncomingValue() to foreach over incoming_values(). NFC. We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it. Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB. llvm-svn: 237169	2015-05-12 20:05:31 +00:00
Sanjoy Das	3d705e37c3	Refactoring gc_relocate related code in CodeGenPrepare.cpp Summary: The original code inserted new instructions by following a Create->Remove->ReInsert flow. This patch removes the unnecessary Remove->ReInsert part by setting up the InsertPoint correctly at the very beginning. This change does not introduce any functionality change. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9687 llvm-svn: 237070	2015-05-11 23:47:30 +00:00
Sanjoy Das	89c5491a72	[RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers Summary: In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for each relocated pointer, and the gc_relocate has the same type with the pointer. During the creation of gc_relocate intrinsic, llvm requires to mangle its type. However, llvm does not support mangling of all possible types. RewriteStatepointsForGC will hit an assertion failure when it tries to create a gc_relocate for pointer to vector of pointers because mangling for vector of pointers is not supported. This patch changes the way RewriteStatepointsForGC pass creates gc_relocate. For each relocated pointer, we erase the type of pointers and create an unified gc_relocate of type i8 addrspace(1)*. Then a bitcast is inserted to convert the gc_relocate to the correct type. In this way, gc_relocate does not need to deal with different types of pointers and the unsupported type mangling is no longer a problem. This change would also ease further merge when LLVM erases types of pointers and introduces an unified pointer type. Some minor changes are also introduced to gc_relocate related part in InstCombineCalls, CodeGenPrepare, and Verifier accordingly. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9592 llvm-svn: 237009	2015-05-11 18:49:34 +00:00
Sanjoy Das	499d703f52	[Statepoint] Clean up Statepoint.h: accessor names. Use getFoo() as accessors consistently and some other naming changes. llvm-svn: 236564	2015-05-06 02:36:26 +00:00
John Brawn	e8fd6c8563	[ARM] Align global variables passed to memory intrinsics Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global variables that are passed to memory intrinsics are aligned in the same way that allocas are. Differential Revision: http://reviews.llvm.org/D8421 llvm-svn: 234735	2015-04-13 10:47:39 +00:00
Alexander Kornienko	f817c1cb9a	Use 'override/final' instead of 'virtual' for overridden methods The patch is generated using clang-tidy misc-use-override check. This command was used: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \ -checks='-*,misc-use-override' -header-filter='llvm\|clang' \ -j=32 -fix -format http://reviews.llvm.org/D8925 llvm-svn: 234679	2015-04-11 02:11:45 +00:00
Benjamin Kramer	b4bf14ceaa	[CodeGenPrepare] Report all changes made during instruction sinking r234638 chained another transform below which was tripping over the deleted instruction. Use after free found by asan in many regression tests. llvm-svn: 234654	2015-04-10 22:25:36 +00:00
Sanjoy Das	b6c5914308	[InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP. Summary: This change moves creating calls to `llvm.uadd.with.overflow` from InstCombine to CodeGenPrep. Combining overflow check patterns into calls to the said intrinsic in InstCombine inhibits optimization because it introduces an intrinsic call that not all other transforms and analyses understand. Depends on D8888. Reviewers: majnemer, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8889 llvm-svn: 234638	2015-04-10 21:07:09 +00:00
David Blaikie	aa41cd57e0	[opaque pointer type] More GEP IRBuilder API migrations... llvm-svn: 234058	2015-04-03 21:33:42 +00:00
David Blaikie	3909da7f4b	[opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: CodeGenPrepare and SimplifyLibCalls llvm-svn: 233596	2015-03-30 20:42:56 +00:00
David Blaikie	68d535c45f	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) llvm-svn: 233126	2015-03-24 22:38:16 +00:00
Quentin Colombet	7bdd50d2a0	[CodeGenPrepare] Remove broken, dead, code. NFC. llvm-svn: 232690	2015-03-18 23:17:28 +00:00
John Brawn	0dbcd65442	[ARM] Align stack objects passed to memory intrinsics Memcpy, and other memory intrinsics, typically tries to use LDM/STM if the source and target addresses are 4-byte aligned. In CodeGenPrepare look for calls to memory intrinsics and, if the object is on the stack, 4-byte align it if it's large enough that we expect that memcpy would want to use LDM/STM to copy it. Differential Revision: http://reviews.llvm.org/D7908 llvm-svn: 232627	2015-03-18 12:01:59 +00:00
Quentin Colombet	1b274f99ad	[CodeGenPrepare] Refine the cost model provided by the promotion helper. - Use TargetLowering to check for the actual cost of each extension. - Provide a factorized method to check for the cost of an extension: TargetLowering::isExtFree. - Provide a virtual method TargetLowering::isExtFreeImpl for targets to be able to tune the cost of non-free extensions. This refactoring offers a better granularity to model what really happens on different targets. No performance changes and very few code differences. Part of <rdar://problem/19267165> llvm-svn: 231855	2015-03-10 21:48:15 +00:00
Mehdi Amini	a28d91d81b	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointer to DataLayout to references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independant, commits but some later commit were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
David Blaikie	dc3f01e9cf	Simplify expressions involving boolean constants with clang-tidy Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617	2015-03-09 01:57:13 +00:00
Sanjoy Das	b818676f6d	Don't modify the DenseMap being iterated over from within the loop that is iterating over it Inserting elements into a `DenseMap` invalidated iterators pointing into the `DenseMap` instance. Differential Revision: http://reviews.llvm.org/D7924 llvm-svn: 230719	2015-02-27 02:24:16 +00:00
Eric Christopher	11e4df73c8	getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup, pass that in rather than use a naked call to getSubtargetImpl. This involved passing down and around either a TargetMachine or TargetRegisterInfo. Update all callers/definitions around the targets and SelectionDAG. llvm-svn: 230699	2015-02-26 22:38:43 +00:00
Eric Christopher	d75c00c638	Add a TargetMachine argument to the AddressingModeMatcher, we'll need this shortly to get a TargetRegisterInfo from the subtarget for TargetLowering routines. llvm-svn: 230698	2015-02-26 22:38:34 +00:00
Duncan P. N. Exon Smith	70eb9c5ae5	CodeGen: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208	2015-02-14 01:44:41 +00:00

1 2 3 4 5 ...

335 Commits