llvm-project

Author	SHA1	Message	Date
David Bolvansky	0e0fbae1a4	[BuildLibCalls] Noalias annotation Summary: I think this is better solution than annotating callsites in IC/SLC. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66217 llvm-svn: 368875	2019-08-14 16:50:06 +00:00
Bill Wendling	cc2bebe039	Ignore indirect branches from callbr. Summary: We can't speculate around indirect branches: indirectbr and invoke. The callbr instruction needs to be included here. Reviewers: nickdesaulniers, manojgupta, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66200 llvm-svn: 368873	2019-08-14 16:44:07 +00:00
Simon Pilgrim	828a89e244	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368831	2019-08-14 11:31:05 +00:00
Simon Pilgrim	3f40bdb558	Fix "not all control paths return a value" MSVC warning. NFCI. llvm-svn: 368830	2019-08-14 11:29:56 +00:00
Simon Pilgrim	8bba4798c2	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368829	2019-08-14 11:29:16 +00:00
Roman Lebedev	32f1e1a01d	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far. llvm-svn: 368820	2019-08-14 09:57:20 +00:00
Dorit Nuzman	491ca2425d	[LV] Fold-tail flag This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197), to direct the vectorizer to fold the remainder-loop into the main-loop using predication. Differential Revision: https://reviews.llvm.org/D66108 Reviewers: Ayal, hsaito, fhahn, SjoerdMeije llvm-svn: 368801	2019-08-14 05:22:20 +00:00
David L. Jones	d4edd9d97e	Revert '[LICM] Make Loop ICM profile aware' and 'Fix pass dependency for LICM' This reverts r368526 (git commit `7e71aa24bc`) This reverts r368542 (git commit `cb5a90fd31`) llvm-svn: 368800	2019-08-14 04:50:33 +00:00
John McCall	a318c55073	Coroutines: adjust for SVN r358739 CallSite has been removed in favour of CallBase. Adjust the coroutine split to account for that. llvm-svn: 368798	2019-08-14 03:54:25 +00:00
John McCall	3bbf207fbc	Don't run a full verifier pass in coro-splitting's private pipeline. Potentially addresses rdar://49022293. llvm-svn: 368797	2019-08-14 03:54:18 +00:00
John McCall	5f60b68c68	Remove unreachable blocks before splitting a coroutine. The suspend-crossing algorithm is not correct in the presence of uses that cannot be reached on some successor path from their defs. llvm-svn: 368796	2019-08-14 03:54:13 +00:00
John McCall	2133feec93	Support swifterror in coroutine lowering. The support for swifterror allocas should work in all lowerings. The support for swifterror arguments only really works in a lowering with prototypes where you can ensure that the prototype also has a swifterror argument; I'm not really sure how it could possibly be made to work in the switch lowering. llvm-svn: 368795	2019-08-14 03:54:05 +00:00
John McCall	d47801e718	In coro.retcon lowering, don't explode if the optimizer messes around with the linkage of the prototype or the exact types of the yielded values. llvm-svn: 368793	2019-08-14 03:53:52 +00:00
John McCall	ac40483276	Fix a use-after-free in the coro.alloca treatment. llvm-svn: 368792	2019-08-14 03:53:46 +00:00
John McCall	62a5dde0c2	Add intrinsics for doing frame-bound dynamic allocations within a coroutine. These rely on having an allocator provided to the coroutine and thus, for now, only work in retcon lowerings. llvm-svn: 368791	2019-08-14 03:53:40 +00:00
John McCall	137b50f0c3	Guard dumps in the coro intrinsic validation logic behind NDEBUG checks. dump() is not guaranteed to be defined in all builds. llvm-svn: 368790	2019-08-14 03:53:31 +00:00
John McCall	3829214185	Generalize llvm.coro.suspend.retcon to allow an arbitrary number of arguments to be passed back to the continuation function. llvm-svn: 368789	2019-08-14 03:53:26 +00:00
John McCall	94010b2b7f	Extend coroutines to support a "returned continuation" lowering. A quick contrast of this ABI with the currently-implemented ABI: - Allocation is implicitly managed by the lowering passes, which is fine for frontends that are fine with assuming that allocation cannot fail. This assumption is necessary to implement dynamic allocas anyway. - The lowering attempts to fit the coroutine frame into an opaque, statically-sized buffer before falling back on allocation; the same buffer must be provided to every resume point. A buffer must be at least pointer-sized. - The resume and destroy functions have been combined; the continuation function takes a parameter indicating whether it has succeeded. - Conversely, every suspend point begins its own continuation function. - The continuation function pointer is directly returned to the caller instead of being stored in the frame. The continuation can therefore directly destroy the frame when exiting the coroutine instead of having to leave it in a defunct state. - Other values can be returned directly to the caller instead of going through a promise allocation. The frontend provides a "prototype" function declaration from which the type, calling convention, and attributes of the continuation functions are taken. - On the caller side, the frontend can generate natural IR that directly uses the continuation functions as long as it prevents IPO with the coroutine until lowering has happened. In combination with the point above, the frontend is almost totally in charge of the ABI of the coroutine. - Unique-yield coroutines are given some special treatment. llvm-svn: 368788	2019-08-14 03:53:17 +00:00
David Bolvansky	038d604f4f	[SimplifyLibCalls] Add noalias from known callsites Summary: Should be fine for memcpy, strcpy, strncpy. Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66135 llvm-svn: 368724	2019-08-13 17:18:46 +00:00
David Bolvansky	90a30fdcc3	[SLC] Improve dereferenceable bytes annotation llvm-svn: 368715	2019-08-13 16:44:16 +00:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Roman Lebedev	2635c324da	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible Summary: This is rather unconventional. As the comment there says, we don't have many folds for xor-of-icmps, so we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up. As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit math and select-of-threshold that we'll potentially clamp. As can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is that the ICmp we need to invert is not single-use - we give up. We indeed can't perform this fold at will; the general rule is that we should not increase instruction count in InstCombine. But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count does not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S Thus I'm proposing to be a little bit more insistent in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s]. Reviewers: spatel, nikic, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65530 llvm-svn: 368685	2019-08-13 12:49:06 +00:00
David Bolvansky	39130314fe	[SimplifyLibCalls] Add dereferenceable bytes from known callsites Summary: int mm(char *a, char *b) { return memcmp(a,b,16); } Currently: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16) ret i32 %call } After patch: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16) ret i32 %call } Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: javed.absar, spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66079 llvm-svn: 368657	2019-08-13 09:11:49 +00:00
Johannes Doerfert	26e58466de	[Attributor] Use the cached data layout directly This removes the warning by using the new DL member. It also simplifies the code. llvm-svn: 368625	2019-08-12 22:21:09 +00:00
Johannes Doerfert	acc8079f8e	[Attributor][NFC] Add IntegerState raw_ostream << operator llvm-svn: 368622	2019-08-12 22:07:34 +00:00
Johannes Doerfert	ece8190497	[Attributor] Make the InformationCache an Attributor member The functionality is not changed but the interfaces are simplified and repetition is removed. llvm-svn: 368621	2019-08-12 22:05:53 +00:00
Wenlei He	4b99b58a84	[ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends Summary: This commit fixed a race condition from multi-threaded thinLTO backends that causes non-deterministic memory corruption for a data structure used only by AutoFDO with compact binary profile. GUIDToFuncNameMap, a static data member of type DenseMap in FunctionSamples is used as a per-module mapping from function name MD5 to name string when input AutoFDO profile is in compact binary format. However with ThinLTO, we can have parallel backends modifying and accessing the class static map concurrently. The fix is to make GUIDToFuncNameMap a member of SampleProfileLoader instead of a file static data. Reviewers: wmi, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65848 llvm-svn: 368596	2019-08-12 17:45:14 +00:00
David Bolvansky	20d37fab82	[InstCombine] x /c fabs(x) -> copysign(1.0, x) Summary: x / fabs(x) -> copysign(1.0, x) fabs(x) / x -> copysign(1.0, x) Reviewers: spatel, foad, RKSimon, efriedma Reviewed By: spatel Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65898 llvm-svn: 368570	2019-08-12 13:43:35 +00:00
Roman Lebedev	ccdad6ef48	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962) Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42962 llvm-svn: 368554	2019-08-12 11:28:02 +00:00
Wenlei He	cb5a90fd31	Fix pass dependency for LICM Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060. llvm-svn: 368542	2019-08-11 22:54:05 +00:00
Wenlei He	7e71aa24bc	[LICM] Make Loop ICM profile aware Summary: Hoisting/sinking an instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking only moves instructions to a colder block. Test Plan: ninja check Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk Reviewed By: asbirlea Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65060 llvm-svn: 368526	2019-08-11 06:05:35 +00:00
Roman Lebedev	96474d17c6	[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) llvm-svn: 368521	2019-08-10 19:29:10 +00:00
Roman Lebedev	a8d20b4467	[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. llvm-svn: 368519	2019-08-10 19:28:54 +00:00
Roman Lebedev	64fe806c4e	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. llvm-svn: 368518	2019-08-10 19:28:44 +00:00
Sanjay Patel	21c15ef384	[Reassociate] try harder to convert negative FP constants to positive This is an extension of a transform that tries to produce positive floating-point constants to improve canonicalization (and hopefully lead to more reassociation and CSE). The original patches were: D4904 D5363 (rL221721) But as the test diffs show, these were limited to basic patterns by walking from an instruction to its single user rather than recursively moving up the def-use sequence. No fast-math is required here because we're only rearranging implicit FP negations in intermediate ops. A motivating bug is: https://bugs.llvm.org/show_bug.cgi?id=32939 Differential Revision: https://reviews.llvm.org/D65954 llvm-svn: 368512	2019-08-10 13:17:54 +00:00
Peter Collingbourne	0e497d1554	cfi-icall: Allow the jump table to be optionally made non-canonical. The default behavior of Clang's indirect function call checker will replace the address of each CFI-checked function in the output file's symbol table with the address of a jump table entry which will pass CFI checks. We refer to this as making the jump table `canonical`. This property allows code that was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address of a function, but it comes with a couple of caveats that are especially relevant for users of cross-DSO CFI: - There is a performance and code size overhead associated with each exported function, because each such function must have an associated jump table entry, which must be emitted even in the common case where the function is never address-taken anywhere in the program, and must be used even for direct calls between DSOs, in addition to the PLT overhead. - There is no good way to take a CFI-valid address of a function written in assembly or a language not supported by Clang. The reason is that the code generator would need to insert a jump table in order to form a CFI-valid address for assembly functions, but there is no way in general for the code generator to determine the language of the function. This may be possible with LTO in the intra-DSO case, but in the cross-DSO case the only information available is the function declaration. One possible solution is to add a C wrapper for each assembly function, but these wrappers can present a significant maintenance burden for heavy users of assembly in addition to adding runtime overhead. For these reasons, we provide the option of making the jump table non-canonical with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump table is made non-canonical, symbol table entries point directly to the function body. Any instances of a function's address being taken in C will be replaced with a jump table address. 
This scheme does have its own caveats, however. It does end up breaking function address equality more aggressively than the default behavior, especially in cross-DSO mode which normally preserves function address equality entirely. Furthermore, it is occasionally necessary for code not compiled with ``-fsanitize=cfi-icall`` to take a function address that is valid for CFI. For example, this is necessary when a function's address is taken by assembly code and then called by CFI-checking C code. The ``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make the jump table entry of a specific function canonical so that the external code will end up taking an address for the function that will pass CFI checks. Fixes PR41972. Differential Revision: https://reviews.llvm.org/D65629 llvm-svn: 368495	2019-08-09 22:31:59 +00:00
Evandro Menezes	59fbe516bd	[InstCombine] Refactor optimizeExp2() (NFC) Refactor `LibCallSimplifier::optimizeExp2()` to use the new `emitBinaryFloatFnCall()` version that fetches the function name from TLI. llvm-svn: 368457	2019-08-09 17:22:56 +00:00
Evandro Menezes	8a21214174	[Transforms] Add a emitBinaryFloatFnCall() version that fetches the function name from TLI Add the counterpart to a similar function for single operands. Differential revision: https://reviews.llvm.org/D65976 llvm-svn: 368453	2019-08-09 17:06:46 +00:00
Evandro Menezes	c6c00cdf2e	[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC) Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`. llvm-svn: 368449	2019-08-09 16:04:18 +00:00
Sanjay Patel	991834a516	[GlobalOpt] prevent crashing on large integer types (PR42932) This is a minimal fix (copy the predicate for the assert) to prevent the crashing seen in: https://bugs.llvm.org/show_bug.cgi?id=42932 ...when converting a constant integer of arbitrary width to uint64_t. Differential Revision: https://reviews.llvm.org/D65970 llvm-svn: 368437	2019-08-09 12:43:25 +00:00
Bjorn Pettersson	d218a3326e	[InstSimplify] Report "Changed" also when only deleting dead instructions Summary: Make sure that we report that changes have been made by InstSimplify also in situations when only trivially dead instructions have been removed. If for example a call is removed the call graph must be updated. Bug seems to have been introduced by llvm-svn r367173 (commit `02b9e45a7e`), since the code in question was rewritten in that commit. Reviewers: spatel, chandlerc, foad Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65973 llvm-svn: 368401	2019-08-09 07:08:25 +00:00
Peter Collingbourne	bb17e46644	Linker: Add support for GlobalIFunc. GlobalAlias and GlobalIFunc ought to be treated the same by the IR linker, so we can generalize the code to be in terms of their common base class GlobalIndirectSymbol. Differential Revision: https://reviews.llvm.org/D55046 llvm-svn: 368357	2019-08-08 22:09:18 +00:00
Cameron McInally	8416f20f2f	[LICM] Support unary FNeg in LICM Differential Revision: https://reviews.llvm.org/D65908 llvm-svn: 368350	2019-08-08 21:38:31 +00:00
Tim Corringham	4f64f1ba3c	Add llvm.licm.disable metadata For some targets the LICM pass can result in sub-optimal code in some cases where it would be better not to run the pass, but it isn't always possible to suppress the transformations heuristically. Where the front-end has insight into such cases it is beneficial to attach loop metadata to disable the pass - this change adds the llvm.licm.disable metadata to enable that. Differential Revision: https://reviews.llvm.org/D64557 llvm-svn: 368296	2019-08-08 13:46:17 +00:00
Johannes Doerfert	d1b79e0774	[Attributor][Stats] Locate statistics tracking with the attributes Summary: The ever growing switch required Attribute::AttrKind values but they might not be available for all abstract attributes we deduce. With the new method we track statistics at the abstract attribute level. The provided macros simplify the usage and make the messages uniform. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65732 llvm-svn: 368227	2019-08-07 22:46:11 +00:00
Johannes Doerfert	beb5150f47	[Attributor][NFC] Code simplification and style normalization llvm-svn: 368225	2019-08-07 22:36:15 +00:00
Johannes Doerfert	344d038960	[Attributor] Introduce a state wrapper class Summary: The wrapper reduces boilerplate code and also provide a nice way to determine the state type used by an abstract attributes statically via AAType::StateType. This was already discussed as part of the review of D65711. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65786 llvm-svn: 368224	2019-08-07 22:34:26 +00:00
Johannes Doerfert	d620781872	[Attributor][NFC] Avoid unnecessary liveness queries If we know everything is live there is no need to query for liveness. Indicating a pessimistic fixpoint will cause the state to be "invalid" which will cause the Attributor to not return the AAIsDead on request, which will prevent us from querying isAssumedDead(). llvm-svn: 368223	2019-08-07 22:32:38 +00:00
Johannes Doerfert	14a0493a88	[Attributor] Provide easier checkForallReturnedValues functionality Summary: So far, whenever one wants to look at returned values, one had to deal with the AAReturnedValues and potentially with the AAIsDead attribute. In the same spirit as other checkForAllXXX methods, we add this functionality now to the Attributor. By adopting the use sites we got better results when return instructions were dead. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65733 llvm-svn: 368222	2019-08-07 22:27:24 +00:00
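
The clamp-like fold described in commit 73f702ff19 above can be sanity-checked by brute force. The following is a minimal Python sketch and not part of the commit: it models the old and new IR patterns from that message on 4-bit integers and counts inputs where the two disagree under the stated precondition `-C1 s<= C2 s<= C0-C1`. The bit width, the helper names, and the string stand-ins for `%target_low` / `%target_high` are assumptions made for illustration only.

```python
from itertools import product

BITS = 4                      # assumed small width so the search is exhaustive
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def old_pattern(x, c0, c1, c2, low, high):
    """Model of the non-canonical pattern from the commit message."""
    old_cmp1 = to_signed(x) < to_signed(c2)          # icmp slt %x, C2
    old_replacement = low if old_cmp1 else high      # select
    old_x_offseted = (x + c1) & MASK                 # add %x, C1 (wrapping)
    old_cmp0 = old_x_offseted < (c0 & MASK)          # icmp ult ..., C0
    return x if old_cmp0 else old_replacement        # select


def new_pattern(x, c0, c1, c2, low, high):
    """Model of the canonical pattern from the commit message."""
    new_cmp1 = to_signed(x) < to_signed(-c1)         # icmp slt %x, -C1
    new_cmp2 = to_signed(x) >= to_signed(c0 - c1)    # icmp sge %x, C0-C1
    new_clamped_low = low if new_cmp1 else x         # select
    return high if new_cmp2 else new_clamped_low     # select


def precondition(c0, c1, c2):
    """-C1 s<= C2 s<= C0-C1, evaluated on wrapped BITS-wide values."""
    return to_signed(-c1) <= to_signed(c2) <= to_signed(c0 - c1)


def count_mismatches():
    """Count (x, C0, C1, C2) tuples where the two patterns differ."""
    mismatches = 0
    for x, c0, c1, c2 in product(range(1 << BITS), repeat=4):
        if not precondition(c0, c1, c2):
            continue
        if old_pattern(x, c0, c1, c2, "low", "high") != \
           new_pattern(x, c0, c1, c2, "low", "high"):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatches under precondition:", count_mismatches())
```

This only exercises the `SLT`/`ULT` variant at one narrow width; the Alive links in the commit message remain the authoritative proofs for all nine predicate combinations.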

22185 Commits