llvm-project

Commit Graph

Author	SHA1	Message	Date
Krishna Kariya	7bd361200a	[InstCombine] Fix PR47960 - Incorrect transformation of fabs with nnan flag Bug Fix for PR: https://llvm.org/PR47960 This patch makes sure that the fast math flag used in the 'select' instruction is the same as the 'fabs' instruction after the transformation. Differential Revision: https://reviews.llvm.org/D101727	2021-07-25 10:43:33 -04:00
Shilei Tian	f1b8fa55d0	[OpenMP][NVPTX] Disable OpenMPOpt when building deviceRTLs We build `deviceRTLs` with `-O1` by default, which also triggers OpenMPOpt. When the info cache is created, some attributes are removed. As a result, although we mark a few functions `noinline`, they are still inlined when the bitcode library is generated. This can cause an issue in middle end optimization. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106710	2021-07-25 10:38:27 -04:00
Roman Lebedev	fa0910e6de	[NFC][Codegen][X86] Improve test coverage for repeated insertions of the same scalar into different elements	2021-07-25 17:37:04 +03:00
Simon Pilgrim	939291041b	[AMDGPU] Regenerate wave32.ll test checks To simplify diff in future patch	2021-07-25 15:13:09 +01:00
Simon Pilgrim	54e5ced7e6	[AMDGPU] Regenerate mul24 test checks To simplify diffs in future patch	2021-07-25 15:13:09 +01:00
Sanjay Patel	1ce05ad619	[x86] improve CMOV codegen by pushing add into operands, part 2 This is a minimum extension of D106607 to allow folding for 2 non-zero constants that can be materialized as immediates. In the reduced test examples, we save 1 instruction by rolling the constants into LEA/ADD. In the motivating test from the bullet benchmark, we absorb both of the constant moves into add ops via LEA magic, so we reduce by 2 instructions. Differential Revision: https://reviews.llvm.org/D106684	2021-07-25 10:05:41 -04:00
Kazu Hirata	0fc5534ac7	[GlobalISel] Remove FlagsOp (NFC) The class was introduced without a use on Dec 11, 2018 in commit `cef44a2342`.	2021-07-25 07:05:07 -07:00
Kazu Hirata	4e288a8528	[Inline] Fix a warning by removing an explicit copy constructor This patch fixes the warning: llvm/include/llvm/Analysis/InlineCost.h:62:3: error: definition of implicit copy assignment operator for 'CostBenefitPair' is deprecated because it has a user-declared copy constructor [-Werror,-Wdeprecated-copy] by removing the explicit copy constructor.	2021-07-25 06:56:47 -07:00
Simon Pilgrim	15b883f457	[X86][AVX] Adjust AllowBWIVPERMV3 tolerance to account for VariableCrossLaneShuffleDepth As noticed on D105390 - we were hardwiring the depth limit for combining to VPERMI2W/VPERMI2B instructions. Not only had we made the limit too low, we hadn't accounted for slow/fast shuffles via the VariableCrossLaneShuffleDepth control	2021-07-25 14:05:11 +01:00
Simon Pilgrim	9591abd74e	[AMDGPU] Regenerate global-load-saddr-to-vaddr test checks To simplify diff in future patch	2021-07-25 14:05:10 +01:00
Simon Pilgrim	00e37c1cd4	[AMDGPU] Regenerate ctpop16 test checks To simplify diff in future patch	2021-07-25 14:05:09 +01:00
Simon Pilgrim	249ef1fa82	[AMDGPU] Regenerate half test checks To simplify diff in future patch	2021-07-25 14:05:08 +01:00
Simon Pilgrim	97d2277b37	[AMDGPU] Regenerate anyext test checks To simplify diff in future patch	2021-07-25 14:05:08 +01:00
Liqiang Tao	4bdfea2c51	[llvm][Inline] Add interface to return cost-benefit stuff Return cost-benefit stuff which is computed by cost-benefit analysis. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D105349	2021-07-25 20:18:19 +08:00
Amara Emerson	acbc0c5f0e	[AArch64][GlobalISel] Widen non-pow-2 types for shifts before clamping. For types like s96, we don't want to clamp to s64, we want to first widen to s128 and then narrow it. Otherwise we end up with impossible to legalize types.	2021-07-24 15:50:43 -07:00
Eugene Zhulenev	de7a4e53a2	[mlir] Async: lower SCF operations into CFG inside coroutines Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D106747	2021-07-24 14:36:26 -07:00
Craig Topper	c63dbd8501	[RISCV] Custom lower (i32 (fptoui/fptosi X)). I stumbled onto a case where our (sext_inreg (assertzexti32 (fptoui X)), i32) isel pattern can cause an fcvt.wu and fcvt.lu to be emitted if the assertzexti32 has an additional user. If we add a one use check it would just cause a fcvt.lu followed by a sext.w when we only need a fcvt.wu to satisfy both users. To mitigate this I've added custom isel and new ISD opcodes for fcvt.wu. This allows us to know it started life as a conversion to i32 without needing to match multiple nodes. ComputeNumSignBits has been taught that this new node produces 33 sign bits. To prevent regressions when we need to zero extend the result of an (i32 (fptoui X)), I've added a DAG combine to convert it to an (i64 (fptoui X)) before type legalization. In most cases this would happen in InstCombine, but a zero_extend can be created for function returns or arguments. To keep everything consistent I've added new nodes for fptosi as well. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106346	2021-07-24 10:50:43 -07:00
Nikita Popov	c7e69e46c8	[Tests] Add additional tests for incorrect willreturn handling (NFC) Highlight a few of the places that don't handle non-willreturn calls correctly right now.	2021-07-24 17:27:29 +02:00
Nikita Popov	baa51a0cef	[Tests] Add missing willreturn attributes (NFC) To retain the spirit of these tests after an upcoming change to mayHaveSideEffect(), add willreturn attributes to a number of functions.	2021-07-24 17:17:48 +02:00
Nikita Popov	0339fcc728	[LICM] Extract debugify test (NFC) Only one of the tests in the file wants to check debug info, so move it into a separate file. This allows update_test_checks to work.	2021-07-24 17:04:42 +02:00
Kazu Hirata	4ccfb1076f	[ADT] Remove WrappedPairNodeDataIterator (NFC) The last use was removed on Jul 16, 2020 in commit `f1d4db4f0c`.	2021-07-24 08:02:57 -07:00
Simon Pilgrim	f8191ee32b	[X86] Add additional div-mod-pair negative test coverage As suggested on D106745	2021-07-24 15:21:46 +01:00
Benjamin Kramer	e27c700b9a	[mlir] Restore markUnknownOpDynamicallyLegal to call isDynamicallyLegal by default Looks like an oversight from `b7a4649899` This should probably have a test case ...	2021-07-24 15:54:42 +02:00
Sander de Smalen	c3277a8828	[BasicTTI] Set scalarization cost of scalable vector casts to Invalid. When BasicTTIImpl::getCastInstrCost can't determine the cost of a vector cast operation when the types need legalization, it falls back to calculating scalarization costs. Instead of crashing on `cast<FixedVectorType>(DstVTy)` when the type is a scalable vector, return an Invalid cost. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106655	2021-07-24 14:13:21 +01:00
Simon Pilgrim	01f20581dd	[X86] Add i128 div-mod-pair test coverage	2021-07-24 14:00:53 +01:00
Paul Walker	e697a542ca	[SVE][NFC] Cleanup fixed length code gen tests to make them more resilient. Many of the tests have used NEXT when DAG is more appropriate. In some cases single DAG lines have been used. Note that these are manual tests because they're too complex for update_llc_test_checks.py and so it's worth not relying too much on the ordered output. I've also made the CHECK lines more uniform when it comes to the ordering of things like LO/HI.	2021-07-24 13:14:42 +01:00
Simon Pilgrim	478b22d95a	[CGP] despeculateCountZeros - Don't create is-zero branch if cttz/ctlz source is known non-zero If value tracking can confirm that the cttz/ctlz source is known non-zero then we don't need to create a branch (which DAG will struggle to recover from). Differential Revision: https://reviews.llvm.org/D106685	2021-07-24 13:11:49 +01:00
LLVM GN Syncbot	fcb3bb581b	[gn build] Port `6aa9e746eb`	2021-07-24 12:03:50 +00:00
Ayke van Laethem	4d7f5c0a85	[AVR] Only support sp, r0 and r1 in llvm.read_register Most other registers are allocatable and therefore cannot be used. This issue was flagged by the machine verifier, because reading other registers is considered reading from an undefined register. Differential Revision: https://reviews.llvm.org/D96969	2021-07-24 14:03:27 +02:00
Ayke van Laethem	41f905b211	[AVR] Fix rotate instructions This patch fixes some issues with the RORB pseudo instruction. - A minor issue in which the instructions were said to use the SREG, which is not true. - An issue with the BLD instruction, which did not have an output operand. - A major issue in which invalid instructions were generated. The fix also reduces RORB from 4 to 3 instructions, so it's also a small optimization. These issues were flagged by the machine verifier. Differential Revision: https://reviews.llvm.org/D96957	2021-07-24 14:03:26 +02:00
Ayke van Laethem	6aa9e746eb	[AVR] Expand large shifts early in IR This patch makes sure shift instructions such as this one: %result = shl i32 %n, %amount are expanded just before the IR to SelectionDAG conversion to a loop so that calls to non-existing library functions such as __ashlsi3 are avoided. The generated code is currently pretty bad but there's a lot of room for improvement: the shift itself can be done in just four instructions. Differential Revision: https://reviews.llvm.org/D96677	2021-07-24 14:03:26 +02:00
Ayke van Laethem	431a941465	[AVR] Improve 8/16 bit atomic operations There were some serious issues with atomic operations. This patch should fix the biggest issues. For details on the issue take a look at this Compiler Explorer sample: https://godbolt.org/z/n3ndhn Code: void atomicadd(_Atomic char val) { val += 5; } Output: atomicadd: movw r26, r24 ldi r24, 5 ; 'operand' register in r0, 63 cli ld r24, X ; load value add r24, r26 ; value += X st X, r24 ; store value back out 63, r0 ret ; return the wrong value (in r24) There are various problems with this. - The value to add (5) is stored in r24. However, the value to add to is loaded in the same register: r24. - The `add` instruction adds half of the pointer to the loaded value, instead of (attempting to) add the operand with value 5. - The output value of the cmpxchg instruction (which is not used in this code sample) is the new value with 5 added, not the old value. The LangRef specifies that it has to be the old value, before the operation. This patch fixes the first two and leaves the third problem to be fixed at a later date. I believe atomics were mostly broken before this patch, with this patch they should become usable as long as you ignore the output of the atomic operation. In particular it fixes the following things: - It sets the earlyclobber flag for the input ('$operand' operand) so that the register allocator puts it in a different register than the output value. - It fixes a number of issues with the pseudo op expansion pass, for example now it adds the $operand field instead of the pointer. This fixes most machine instruction verifier issues (other flagged issues are unrelated to atomics). Differential Revision: https://reviews.llvm.org/D97127	2021-07-24 14:03:26 +02:00
Ayke van Laethem	8544ce80f8	[AVR] Set R31R30 as clobbered after ADJCALLSTACKDOWN In most cases, using R31R30 is fine because the call (which always precedes ADJCALLSTACKDOWN) will clobber R31R30 anyway. However, in some rare cases the register allocator might insert an instruction between the call and the ADJCALLSTACKDOWN instruction and expect the register pair to be live afterwards. I think this happens as a result of rematerialization. Therefore, to fix this, the instruction needs to have Defs set to R31R30. Setting the Defs field does have the effect of making the instruction look dead, which it certainly is not. This is fixed by setting hasSideEffects to true. Differential Revision: https://reviews.llvm.org/D97745	2021-07-24 14:03:26 +02:00
Ayke van Laethem	feda08b70a	[AVR] Do not chain stores in call frame setup Previously, AVRTargetLowering::LowerCall attempted to keep stack stores in order with chains. Perhaps this worked in the past, but it does not work now: it appears that the SelectionDAG legalization phase removes these chains. Therefore, I've removed these chains entirely to match X86 (which, similar to AVR, also prefers to use push instructions over stack-relative stores to set up a call frame). With this change, all the stack stores are in a somewhat reasonable order. Differential Revision: https://reviews.llvm.org/D97853	2021-07-24 14:03:26 +02:00
Ayke van Laethem	13ca0c87ed	[lld][WebAssembly] Align __heap_base __heap_base was not aligned. In practice, it will often be aligned simply because it follows the stack, but when the stack is placed at the beginning (with the --stack-first option), the __heap_base might be unaligned. It could even be byte-aligned. At least wasi-libc appears to expect that __heap_base is aligned: `659ff41456/dlmalloc/src/malloc.c (L5224)` While WebAssembly itself does not appear to require any alignment for memory accesses, it is sometimes required when sharing a pointer externally. For example, WASI might expect alignment up to 8: https://github.com/WebAssembly/WASI/blob/main/phases/snapshot/docs.md#-timestamp-u64 This issue got introduced with the addition of the --stack-first flag: https://reviews.llvm.org/D46141 I suspect the lack of alignment wasn't intentional here. Differential Revision: https://reviews.llvm.org/D106499	2021-07-24 14:03:26 +02:00
Butygin	b7a4649899	[mlir] ConversionTarget legality callbacks refactoring * Get rid of Optional<std::function> as std::function already has a null state * Add private setLegalityCallback function to set legality callback for unknown ops * Get rid of unknownOpsDynamicallyLegal flag, use unknownLegalityFn state instead. This causes a behavior change when the user first calls markUnknownOpDynamicallyLegal with a callback and then without, but I am not sure if the original behavior was really a 'feature', or just an oversight in the original implementation. Differential Revision: https://reviews.llvm.org/D105496	2021-07-24 14:59:36 +03:00
Melanie Blower	05ae303555	[clang][patch] Remove test artifact before running test for consistent results Fix non-deterministic test behavior by removing previously-created test directory, see comments in D95159	2021-07-24 07:55:10 -04:00
Simon Pilgrim	c261a06b7a	[DAG] Add initial SelectionDAG::isGuaranteedNotToBeUndefOrPoison framework (PR51129) I've setup the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it, further Opcodes can be handled when we have test coverage. I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place. SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added. Differential Revision: https://reviews.llvm.org/D106668	2021-07-24 11:36:35 +01:00
Sanjay Patel	937e7c60c8	[x86] add more tests for add with CMOV of constants; NFC See D106607 / https://llvm.org/PR51069 for details.	2021-07-24 06:23:36 -04:00
Alexander Belyaev	edb05d555e	[llvm] Inline getAssociatedFunction() in LLVM_DEBUG. Function* F is used only inside LLVM_DEBUG, so that it causes unused variable warning.	2021-07-24 11:49:21 +02:00
hyeongyu kim	aca5aeb752	[InstCombine] Add freezeAllUsesOfArgument to visitFreeze In D106041, a freeze was added before the branch condition to solve the miscompilation problem of SimpleLoopUnswitch. However, I found that the added freeze disturbed other optimizations in the following situations. ``` arg.fr = freeze(arg) use(arg.fr) ... use(arg) ``` It is a problem that occurred when arg and arg.fr were recognized as different values. Therefore, changing to use arg.fr instead of arg throughout the function eliminates the above problem. Thus, I add a function that changes all uses of arg to freeze(arg) to visitFreeze of InstCombine. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D106233	2021-07-24 18:08:58 +09:00
George Balatsouras	228bea6a36	Revert D106195 "[dfsan] Add wrappers for v*printf functions" This reverts commit `bf281f3647`. This commit causes dfsan to segfault.	2021-07-24 08:53:48 +00:00
Nikita Popov	9706dd4940	[SimplifyCFG] Add additional if conversion tests (NFC) Test a readonly call in between, as well as the combination of an atomic and simple store.	2021-07-24 10:35:36 +02:00
Markus Böck	ffe32b5c71	[CMake] Add LIBXML2_DEFINITIONS when testing for symbol existence Currently when linking LLVM against Libxml2, a simple check is performed to check whether it can be linked successfully. This check currently adds the include directories and the libraries for libxml2, but not definitions found by the config. This causes issues on Windows when trying to link against a static libxml2. Libxml2 requires LIBXML_STATIC to be defined in the preprocessor to be able to link statically. This definition is put into LIBXML2_DEFINITIONS in the cmake config, but not properly forwarded to check_symbol_exists leading to it failing as it could not find xmlReadMemory in a DLL. This patch simply appends the content of LIBXML2_DEFINITIONS to the symbol check definitions, fixing the issue. Differential Revision: https://reviews.llvm.org/D106740	2021-07-24 09:55:14 +02:00
Azharuddin Mohammed	8da3b7d857	[CMake] Don't LTO optimize targets on Darwin, but only if it's not ThinLTO This is just a workaround. Pass the `-mllvm,-O0` link flags only if it's not ThinLTO. Doing that with ThinLTO currently results in an error: ``` Remaining virtual register operands UNREACHABLE executed at .../llvm/lib/CodeGen/MachineRegisterInfo.cpp:209! ```	2021-07-23 22:38:35 -07:00
Amara Emerson	5ec0f051c8	[GlobalISel] Add GUnmerge, GMerge, GConcatVectors, GBuildVector abstractions. NFC. Use these to slightly simplify some code in the artifact combiner.	2021-07-23 22:32:26 -07:00
Lang Hames	eda6afdad6	Re-re-re-apply "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform." The ccache builders have received a config update that should eliminate the build issues seen previously.	2021-07-24 13:16:12 +10:00
LLVM GN Syncbot	698fef3eb6	[gn build] Port `96709823ec`	2021-07-24 03:08:02 +00:00
Kuter Dinel	96709823ec	[AMDGPU] Deduce attributes with the Attributor This patch introduces a pass that uses the Attributor to deduce AMDGPU specific attributes. Reviewed By: jdoerfert, arsenm Differential Revision: https://reviews.llvm.org/D104997	2021-07-24 06:07:15 +03:00
peter klausler	4d42e16eb8	[flang] runtime: fix problems with I/O around EOF & delimited characters When a WRITE overwrites an endfile record, we need to forget that there was an endfile record. When doing a BACKSPACE after an explicit ENDFILE statement, the position afterwards must be upon the endfile record. The attempt to join list-directed delimited character input across record boundaries was due to a bad reading of the standard and has been deleted, now that the requirements are better understood. This problem would cause a read attempt past EOF if a delimited character input value was at the end of a record. It turns out that delimited list-directed (and NAMELIST) character output is required to emit contiguous doubled instances of the delimiter character when it appears in the output value. When fixed-size records are being emitted, as is the case with internal output, this is not possible when the problematic character falls on the last position of a record. No two other Fortran compilers do the same thing in this situation so there is no good precedent to follow. Because it seems least wrong, with this patch we now emit one copy of the delimiter as the last character of the current record and another as the first character of the next record. (The second-least-wrong alternative might be to flag a runtime error, but that seems harsh since it's not an explicit error in the standard, and the output may not have to be usable later as input anyway.) Consequently, the output is not suitable for use as list-directed or NAMELIST input. If a later standard were to clarify this case, this behavior will of course change as needed to conform. Differential Revision: https://reviews.llvm.org/D106695	2021-07-23 18:23:26 -07:00
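
The `__heap_base` fix in commit 13ca0c87ed comes down to rounding an address up to a multiple of an alignment. The sketch below illustrates that arithmetic only and is not the lld implementation. The function name `align_up` and the alignment value 16 are assumptions made for illustration, since the commit message does not say which alignment the patch uses.

```python
def align_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align.

    Hypothetical helper for illustration; align must be a power of two.
    """
    if align <= 0 or align & (align - 1) != 0:
        raise ValueError("align must be a positive power of two")
    # Adding align - 1 moves past the next boundary unless addr is already
    # aligned; masking off the low bits then snaps down to that boundary.
    return (addr + align - 1) & ~(align - 1)


# With --stack-first, data may end at an odd offset, so the heap base
# would otherwise start unaligned. The offset 1037 is made up.
end_of_data = 1037
heap_base = align_up(end_of_data, 16)  # 16 is an assumed alignment
```

An address that is already aligned is returned unchanged, so the rounding is safe to apply unconditionally.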
