llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	cf18ec445d	[GVN] Check load type in select PRE This is no longer implicitly guaranteed with opaque pointers.	2022-03-14 12:46:54 +01:00
Nikita Popov	e9c0720010	[PHITransAddr] Check GEP source element type It's not the same GEP if the source element type is different.	2022-02-11 16:22:48 +01:00
Nikita Popov	2a1b1f1b1b	[GVN] Store source element type for GEP expressions To avoid incorrectly merging GEPs with different source types under opaque pointers. To avoid increasing the Expression structure size, this reuses the existing type member. The code does not rely on this to be the expression result type, it's only used as a disambiguator.	2022-02-11 13:03:30 +01:00
Nikita Popov	46f9e45ef0	[Statepoint] Update gc.statepoint calls in tests with elementtype (NFC) This updates tests for the LangRef change in D117890.	2022-02-04 14:15:41 +01:00
Florian Hahn	8a12cae862	[GVN] Support load of pointer-select to value-select conversion. This patch extends the available-value logic to detect loads of pointer-selects that can be replaced by a value select. For example, consider the code below: loop: %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ] %l = load %ptr %l.sel = load %sel.phi %sel = select cond, %ptr, %sel.phi ... exit: %res = load %sel use(%res) The load of the pointer phi can be replaced by a load of the start value outside the loop and a new phi/select chain based on the loaded values, as illustrated below %l.start = load %start loop: sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ] %l = load %ptr %sel.prom = select cond, %l, %sel.phi.prom ... exit: use(%sel.prom) This is a first step towards alllowing vectorizing loops using common libc++ library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs) #include <vector> #include <algorithm> int foo(const std::vector<int> &V) { return *std::min_element(V.begin(), V.end()); } Reviewed By: reames Differential Revision: https://reviews.llvm.org/D118143	2022-02-02 09:23:09 +00:00
Florian Hahn	b1fb613924	[GVN] Add additional tests after `216d1a729`. Further extend test coverage added in `216d1a729`	2022-02-01 21:02:41 +00:00
Florian Hahn	216d1a729c	[GVN] Add tests for D118143 not requiring loops.	2022-02-01 20:24:19 +00:00
Florian Hahn	9dd5fffd30	[GVN] Add tests with redundant load of pointer select. Additional test cases for D118144.	2022-01-28 20:15:32 +00:00
Florian Hahn	56659c80d0	[GVN] Add additional tests for PRE with pointer selects. Additional tests for D118143.	2022-01-28 19:11:36 +00:00
Florian Hahn	b0956a9acf	[GVN] Add tests for loop load PRE through select.	2022-01-25 14:49:17 +00:00
Bryce Wilson	dd13744bfb	Revert "[BasicAliasAnalysis] Remove isMallocOrCallocLikeFn" This reverts commit `1f2cfc4fdc`.	2022-01-14 14:42:53 -08:00
Bryce Wilson	1f2cfc4fdc	[BasicAliasAnalysis] Remove isMallocOrCallocLikeFn Allocation functions should be marked with onlyAccessesInaccessibleMemory (when that is correct for the given function) which is checked elsewhere so this check is no longer needed. Differential Revision: https://reviews.llvm.org/D117180	2022-01-14 12:22:01 -08:00
Nick Desaulniers	79ebc3b0dd	[llvm][test] rewrite callbr to use i rather than X constraint NFC In D115311, we're looking to modify clang to emit i constraints rather than X constraints for callbr's indirect destinations. Prior to doing so, update all of the existing tests in llvm/ to match. Reviewed By: void, jyknight Differential Revision: https://reviews.llvm.org/D115410	2022-01-11 11:31:08 -08:00
Nikita Popov	92d55e7336	[MemoryBuiltins] Remove isNoAliasFn() in favor of isNoAliasCall() We currently have two similar implementations of this concept: isNoAliasCall() only checks for the noalias return attribute. isNoAliasFn() also checks for allocation functions. We should switch to only checking the attribute. SLC is responsible for inferring the noalias return attribute for non-new allocation functions (with a missing case fixed in `348bc76e35`). For new, clang is responsible for setting the attribute, if -fno-assume-sane-operator-new is not passed. Differential Revision: https://reviews.llvm.org/D116800	2022-01-10 09:18:15 +01:00
Philip Reames	6b0ff0969d	Extract utility function for checking initial value of allocation [NFC, try 2] This is a reoccuring pattern, we can consolidate three copies into one. The main motivation is to reduce usages of isMallocLike. The original commit (which was quickly reverted) didn't account for the allocation function could be an invoke, test coverage for that case added in this commit.	2022-01-07 08:44:08 -08:00
Nuno Lopes	84b285d6eb	[GVN] Set phi entries of unreachable predecessors to poison instead of undef This matches NewGVN's behavior.	2021-12-30 14:47:24 +00:00
Dmitry Makogon	267ddbb581	[Test] [GVN] Add test showing equivalent PHIs generation by GVN	2021-12-09 16:40:23 +07:00
Dmitry Makogon	0cada82f0a	[Test] Remove incorrect test in GVN Removed it because it runs indvars after GVN and it's obvious it's an indvars test, not GVN. Also the removed test file includes a 'target triple' with aarch64 specified, which is needed for indvars to eliminate the Phis produced by GVN, and the test directory doesn't have a lit config to exclude lit from running the test if aarch64 is not supported by a build.	2021-11-10 15:22:43 +07:00
Arthur Eubanks	1d8750c3da	[NFC] Rename GVN -> GVNPass and SROA -> SROAPass To be more consistent with other pass struct names. There are still more passes that don't end with "Pass", but these are the important ones. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D112935	2021-11-09 10:35:58 -08:00
Douglas Yung	7cd273c339	Revert "Reapply `db28934` "[IndVars] Pass TTI to replaceCongruentIVs"" This reverts commit `5ec2386332`. This change is causing test failures on the PS4 linux build bot: https://lab.llvm.org/buildbot/#/builders/139/builds/12871	2021-11-09 10:28:41 -08:00
Dmitry Makogon	5ec2386332	Reapply `db28934` "[IndVars] Pass TTI to replaceCongruentIVs" This reapplies patch `db289340c8`. The test failures on build with expensive checks caused by the patch happened due to the fact that we sorted loop Phis in replaceCongruentIVs using llvm::sort, which shuffles the given container if the expensive checks are enabled, so equivalent Phis in the sorted vector had different mutual order from run to run. replaceCongruentIVs tries to replace narrow Phis with truncations of wide ones. In some test cases there were several Phis with the same width, so if their order differs from run to run, the narrow Phis would be replaced with a different Phi, depending on the shuffling result. The patch `ae14fae0ff` fixed this issue by replacing llvm::sort with llvm::stable_sort.	2021-11-09 17:42:29 +07:00
Dmitry Makogon	8d4eba6c0d	Revert "[IndVars] Pass TTI to replaceCongruentIVs" This reverts commit `db289340c8`. The patch caused 2 crashes with expensive checks enabled.	2021-11-08 19:35:14 +07:00
Dmitry Makogon	db289340c8	[IndVars] Pass TTI to replaceCongruentIVs In IndVarSimplify after simplifying and extending loop IVs we call 'replaceCongruentIVs'. This function optionally takes a TTI argument to be able to replace narrow IVs uses with truncates of the widest one. For some reason the TTI wasn't passed to the function, so it couldn't perform such transform. This patch fixes it. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D113024	2021-11-08 19:20:53 +07:00
Dmitry Makogon	e15cf498f6	[Test] Fix tests showing generation of already existent PHIs by GVN Add target triple to the test module, so IndVars has a TTI to perform congruent IVs elimination.	2021-11-02 21:57:37 +07:00
Dmitry Makogon	ce12e68a49	[Test] Fix tests showing generation of already existent PHIs by GVN This is a fix for tests added by `96591a14cd`. A function which was called in tests wasn't marked as 'readonly', and the GVN performed PRE for the loads, but they were supposed to be non-local. So added 'readonly' to the called function.	2021-11-01 18:56:11 +07:00
Dmitry Makogon	96591a14cd	[GVN] Add tests showing generation of already existent PHIs for non-local loads When we eliminate a non-local load in a loop, we create a new PHI for the loaded value, while there already may be the exact same PHIs in the loop. IndVarsSimplify currently can handle this case eliminating the duplicated PHIs. However, if the loop PHI is of type of the load and also there exists an use of the z(s)ext'ed of it, IndVarSimplify wouldn't eliminate the duplicating PHI. It would just replace the IV with a widened one, leaving the GVN-generated PHI as is.	2021-10-29 17:20:44 +07:00
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit `8d64314ffe`.	2021-10-06 11:38:11 -07:00
Arthur Eubanks	8d64314ffe	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 11:03:51 -07:00
Arthur Eubanks	72cf8b6044	Revert "[IR] Increase max alignment to 4GB" This reverts commit `df84c1fe78`. Breaks some bots	2021-10-06 10:21:35 -07:00
Arthur Eubanks	df84c1fe78	[IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 09:54:14 -07:00
Roman Lebedev	564d85e090	The maximal representable alignment in LLVM IR is 1GiB, not 512MiB In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide But that means that the maximal alignment exponent is `(1<<5)-2`, which is `30`, not `29`. And indeed, alignment of `1073741824` roundtrips IR serialization-deserialization. While this doesn't seem all that important, this doubles the maximal supported alignment from 512MiB to 1GiB, and there's actually one noticeable use-case for that; On X86, the huge pages can have sizes of 2MiB and 1GiB (!). So while this doesn't add support for truly huge alignments, which i think we can easily-ish do if wanted, i think this adds zero-cost support for a not-trivially-dismissable case. I don't believe we need any upgrade infrastructure, and since we don't explicitly record the IR version, we don't need to bump one either. As @craig.topper speculates in D108661#2963519, this might be an artificial limit imposed by the original implementation of the `getAlignment()` functions. Differential Revision: https://reviews.llvm.org/D108661	2021-08-26 12:53:39 +03:00
Jingu Kang	b52171629f	[GVN] Execute performLoopLoadPRE ahead of PerformLoadPRE Differential Revision: https://reviews.llvm.org/D108204	2021-08-24 09:50:27 +01:00
Nikita Popov	79b55e5038	[GVN] Fix test for loop load PRE on alloca (NFC) This test was not modifying the pointer in the loop, so the loads just ended up as undef, without relation to loop load PRE. Pass the alloca to the called function, so the memory is potentially modified.	2021-08-22 22:28:53 +02:00
Jingu Kang	f45ba18e96	Precommit test for D108204	2021-08-20 09:33:59 +01:00
Philip Reames	e75a2dfe20	[tests] Stablize tests for possible change in deref semantics There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context. This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR. This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.	2021-07-14 13:05:43 -07:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
serge-sans-paille	4ab3041acb	Revert "[NFC] remove explicit default value for strboolattr attribute in tests" This reverts commit `bda6e5bee0`. See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance	2021-05-24 19:43:40 +02:00
serge-sans-paille	bda6e5bee0	[NFC] remove explicit default value for strboolattr attribute in tests Since `d6de1e1a71`, no attributes is quivalent to setting attribute to false. This is a preliminary commit for https://reviews.llvm.org/D99080	2021-05-24 19:31:04 +02:00
Adam Nemet	ab1f6ffa56	[GVN] Improve analysis for missed optimization remark This change tries to handle multiple dominating users of the pointer operand by choosing the most immediately dominating one, if possible. While making this change I also found that the previous implementation had a missing break statement, making all loads with an odd number of dominating users emit an OtherAccess value, so that has also been fixed. Patch by Henrik G Olsson! Differential Revision: https://reviews.llvm.org/D79097	2021-05-17 21:51:15 -07:00
dfukalov	fdae3fc8b3	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-05-14 11:17:14 +03:00
Jordan Rupprecht	fec2945998	Revert "[GVN] Clobber partially aliased loads." This reverts commit `6c57044231`. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
dfukalov	6c57044231	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-04-24 14:14:20 +03:00
Nikita Popov	1d96107cfe	[GVN] Generate LE and BE check lines (NFC) I accidentally dropped some check lines in my previous commit. Apparently update_test_checks no longer warns on label conflicts???	2021-04-22 22:44:08 +02:00
Nikita Popov	9d1b2bc7bf	[GVN] Regenerate test checks (NFC)	2021-04-22 22:38:41 +02:00
Max Kazantsev	8fe62b7af1	[GVN] Introduce loop load PRE This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken at least once, the transform is neutral or profitable. There are several improvements prospect open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames	2021-04-22 12:50:38 +07:00
Max Kazantsev	0ef7e0041a	[Test] Add a negative unit test	2021-04-21 12:11:05 +07:00
Max Kazantsev	988926127b	[Test] Add -lcssa run to force LI in GVN	2021-04-20 13:26:55 +07:00
Max Kazantsev	bcde9f1b6c	[Test] Add loop load PRE test with GC pointers	2021-04-20 11:30:54 +07:00
Max Kazantsev	6148e3fc8e	[Test] Propagate nofree attribute from function to calls	2021-04-15 11:50:37 +07:00

1 2 3 4 5 ...

615 Commits