llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazushi (Jam) Marukawa	6c32bc4875	[VE] Change to expand BRCOND VE doesn't have BRCOND instruction, so need to expand it. Also add a regression test. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D89173	2020-10-12 19:18:37 +09:00
Haojian Wu	f1bf41e433	Fix buildbot failure for `702529d899`.	2020-10-12 12:04:44 +02:00
Evgeny Leviant	7102793065	Add test for cortex-a57/ARM sched model. NFC	2020-10-12 12:49:56 +03:00
sstefan1	a64e8583da	[IR][FIX] Intrinsics - don't apply default willreturn if IntrNoReturn is specified Summary: Since willreturn will soon be added as default attribute, we can end up with both noreturn and willreturn on the same intrinsic. This was exposed by llvm.wasm.throw which has IntrNoReturn. Reviewers: jdoerfert, arsenm Differential Revision: https://reviews.llvm.org/D88644	2020-10-12 11:29:33 +02:00
Haojian Wu	8852d30b1c	[AST][RecoveryExpr] Don't perform early typo correction in C. The dependent mechanism for C error-recovery is mostly finished, this is the only place we have missed. Differential Revision: https://reviews.llvm.org/D89045	2020-10-12 11:24:45 +02:00
Haojian Wu	bb406f36dc	[AST][RecoveryExpr] Build dependent callexpr in C for error-recovery. See whole context: https://reviews.llvm.org/D85025 Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D84304	2020-10-12 11:15:01 +02:00
Georgii Rymar	25e437ec1e	[llvm-readobj/elf] - Ignore the hash table when on EM_S390/EM_ALPHA platforms. Specification for `SHT_HASH` table says (https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html#hash) that it contains `Elf32_Word` entries for both `32/64` bit objects. But there is a problem with `EM_S390` and `ELF::EM_ALPHA` platforms: they use 8-byte entries. (see the issue reported: https://bugs.llvm.org/show_bug.cgi?id=47681). Currently we might infer the size of the dynamic symbols table from the hash table, but because of the issue mentioned, the calculation is wrong. And also we don't dump the hash table properly. I am not sure if we want to support 8-byte entries as they violate the specification and also the `.hash` table is kind of deprecated by itself (the `.gnu.hash` table is used nowadays). So, the solution this patch suggests is to ban use of the hash table on `EM_S390/EM_ALPHA` platforms. Differential revision: https://reviews.llvm.org/D88817	2020-10-12 12:13:01 +03:00
Haojian Wu	702529d899	[clang] Fix returning the underlying VarDecl as top-level decl for VarTemplateDecl. Given the following VarTemplateDecl AST, ``` VarTemplateDecl col:26 X \|-TemplateTypeParmDecl typename depth 0 index 0 `-VarDecl X 'bool' cinit `-CXXBoolLiteralExpr 'bool' true ``` previously, we returned the VarDecl as the top-level decl, which was not correct, the top-level decl should be VarTemplateDecl. Differential Revision: https://reviews.llvm.org/D89098	2020-10-12 10:46:18 +02:00
Nicolas Vasilache	60cf8453d0	Revert "Revert "Give attributes C++ namespaces."" This reverts commit `df295fac6c`. Reactivates a spuriously rolled back change.	2020-10-12 08:23:54 +00:00
Alexander Belyaev	b98e5e0f7e	[mlir] Move Linalg tensors-to-buffers tests to Linalg tests. The buffer placement preparation tests in test/Transforms/buffer-placement-preparation* are using Linalg as a test dialect which leads to confusion and "copy-pasta", i.e. Linalg is being extended now and when TensorsToBuffers.cpp is changed, TestBufferPlacement is sometimes kept in-sync, which should not be the case. This has led to the unnoticed bug, because the tests were in a different directory and the patterns were slightly off. Differential Revision: https://reviews.llvm.org/D89209	2020-10-12 10:18:57 +02:00
David Sherwood	d765d12676	Fix build failure caused by `c5ba0d33cc`	2020-10-12 09:05:39 +01:00
Roman Lebedev	1c021c64ca	[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 11:04:03 +03:00
David Sherwood	c5ba0d33cc	[SVE] Make ElementCount and TypeSize use a new PolySize class I have introduced a new template PolySize class, where the template parameter determines the type of quantity, i.e. for an element count this is just an unsigned value. The ElementCount class is now just a simple derivation of PolySize<unsigned>, whereas TypeSize is more complicated because it still needs to contain the uint64_t cast operator, since there are still many places in the code that rely upon this implicit cast. As such the class also still needs some of its own operators. I've tried to minimise the amount of code in the base PolySize class, which led to a couple of changes: 1. In some places we were relying on '==' operator comparisons between ElementCounts and the scalar value 1. I didn't put this operator in the new PolySize class, and thought it was actually clearer to use the isScalar() function instead. 2. I removed the isByteSized function and replaced it with calls to isKnownMultipleOf(8). I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so that it's more consistent with coefficientDivideBy. Differential Revision: https://reviews.llvm.org/D88409	2020-10-12 08:23:38 +01:00
Kito Cheng	6bf25f45a9	[Tablegen][SubtargetEmitter] Print TuneCPU in Subtarget::ParseSubtargetFeatures Lets the user know which -tune-cpu is in use. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88951	2020-10-12 14:46:44 +08:00
Vitaly Buka	d784f74069	[NFC][Asan] Remove unused macro	2020-10-11 22:29:51 -07:00
John McCall	cec49a5836	Revert "[SYCL] Implement __builtin_unique_stable_name." This reverts commit `b5a034e771`. This feature was added without following the proper process.	2020-10-12 01:10:09 -04:00
Fangrui Song	cddb49bcc0	[SchedDAGInstrs] Delete redundant contains(). NFC	2020-10-11 20:58:30 -07:00
Jonas Devlieghere	ba2dff0159	Revert "PR47792: Include the type of a pointer or reference non-type template" This reverts commit `849c60541b` because it results in a stage 2 build failure: llvm-project/clang/include/clang/AST/ExternalASTSource.h:409:20: error: definition with same mangled name '_ZN5clang25LazyGenerationalUpdatePtrIPKNS_4DeclEPS1_XadL_ZNS_17ExternalASTSource19CompleteRedeclChainES3_EEE9makeValueERKNS_10ASTContextES4_' as another definition static ValueType makeValue(const ASTContext &Ctx, T Value);	2020-10-11 20:16:46 -07:00
Qiu Chaofan	6f7e1ce214	[NFC] Move PPC strict-fp MIR test to dedicated file fp-strict-conv-f128.ll is generated by script, but some manual MIR tests exist in it. Move them to another file to satisfy script when updating.	2020-10-12 10:40:19 +08:00
Valentin Clement	4b01190122	[mlir][openacc] Introduce acc.enter_data operation This patch introduces the acc.enter_data operation that represents an OpenACC Enter Data directive. Operands and attributes are derived from clauses in the spec 2.6.6. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D88941	2020-10-11 21:27:06 -04:00
Richard Smith	849c60541b	PR47792: Include the type of a pointer or reference non-type template parameter in its notion of template argument identity. We already did this for all the other kinds of non-type template argument. We're still missing the type from the mangling, so we continue to be able to see collisions at link time; that's an open ABI issue.	2020-10-11 15:59:49 -07:00
Craig Topper	9e72d3eaf3	[ValueTracking] Use KnownBits::countMaxLeadingZeros/countMaxTrailingZeros to make code more readable. NFC	2020-10-11 14:26:18 -07:00
Richard Smith	c25da4b04a	Fix arc lint's clang-format rule: only format the file we were asked to format. This avoids diffs being applied in the work tree to files that are supposed to be excluded (clang tests), allows arc to properly provide interactive feedback for the formatting fixes, and reduces the number of files that we format, in a change affecting N files, from N^2 to N.	2020-10-11 14:24:23 -07:00
Christian Iversen	a9cefc3dee	[ELF] Fix broken bitstream linking with lld when e_machine > 255 In ELF/InputFiles.cpp, getBitcodeMachineKind() is limited to uint8_t return type. This works as long as EM_xxx is < 256, which is true for common architectures, but not for some newly assigned or unofficial EM_* values. The corresponding ELF field (e_machine) can hold uint16_t. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D89185	2020-10-11 14:19:25 -07:00
Tres Popp	8178e41dc1	[mlir] Type erase inputs to select statements in shape.broadcast lowering. This is required or broadcasting with operands of different ranks will lead to failures as the select op requires both possible outputs and its output type to be the same. Differential Revision: https://reviews.llvm.org/D89134	2020-10-11 21:58:06 +02:00
Nathan Ridge	f82346fd73	[clangd] Avoid relations being overwritten in a header shard Fixes https://github.com/clangd/clangd/issues/510 Differential Revision: https://reviews.llvm.org/D87256	2020-10-11 15:32:54 -04:00
Roman Lebedev	544a6aa267	[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load And another step towards transforms not introducing inttoptr and/or ptrtoint casts that weren't there already. As we've been establishing (see D88788/D88789), if there is an int<->ptr cast, it basically must stay as-is, we can't do much with it. I've looked, and the main source of new such casts being introduced, as far as i can tell, is this transform, which, ironically, tries to reduce the count of casts. On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in 33.58% fewer `IntToPtr`s (19014 -> 12629) and 76.20% more `PtrToInt`s (18589 -> 32753), which is an increase of 20.69% in total. However just on RawSpeed, where i know there are basically no `IntToPtr`s in the original source code, this results in 99.27% fewer `IntToPtr`s (2724 -> 20) and 82.92% more `PtrToInt`s (4513 -> 8255), which is again an increase of 14.34% in total. To me this does seem like a step in the right direction, we end up with strictly fewer `IntToPtr`s, but strictly more `PtrToInt`s, which seems like a reasonable trade-off. See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995 for some more discussion on the subject. (Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair` should be taught about this, yes) Reviewed By: nlopes, nikic Differential Revision: https://reviews.llvm.org/D88979	2020-10-11 20:24:28 +03:00
Fangrui Song	cbe4d973ed	[X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode GCC 11 will define this macro. In LLVM, the feature flag only applies to 64-bit mode and we always define the macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can suppress the macro. The discrepancy is unlikely to cause trouble. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89198	2020-10-11 09:46:00 -07:00
David Green	be6e8e50f4	[LV] Tail folded inloop reductions. This expands upon the inloop reductions added in e9761688e41cb9e976, allowing them to be inserted into tail folded loops. Reductions are generated with the form: x = select(mask, vecop, zero) v = vecreduce.add(x) c = add chain, v Where zero here is chosen as the identity value for add reductions. The backend is then expected to fold the select and the vecreduce into a single predicated instruction. Most of the code is fairly straightforward, except for the creation of blockmasks which need to ensure they are created in dominance order. The order they are added is altered to be after any phis, keeping the requirements for the underlying IR. Differential Revision: https://reviews.llvm.org/D84451	2020-10-11 16:58:34 +01:00
Zinovy Nis	32d565b461	[clang-tidy] Fix crash in readability-function-cognitive-complexity on weak refs Fix for https://bugs.llvm.org/show_bug.cgi?id=47779 Differential Revision: https://reviews.llvm.org/D89194	2020-10-11 18:52:38 +03:00
Nikita Popov	d7186fe371	[MemCpyOpt] Add lifetime may alias test (NFC) Test the case where a lifetime intrinsic may alias the memcpy source. Other cases test must or no alias.	2020-10-11 17:08:28 +02:00
David Green	8f2cacae67	[LV] Extra predicated inloop reduction tests. NFC	2020-10-11 15:06:21 +01:00
Nikita Popov	bdb193a6ed	[MemCpyOpt] Add additional byval tests (NFC) Test read/write clobbers and the non-local case.	2020-10-11 15:22:31 +02:00
Sanjay Patel	3f3356bdd9	[InstCombine] allow vector splats for add+xor --> shifts	2020-10-11 09:04:24 -04:00
Sanjay Patel	f81200ae99	[InstCombine] add one-use check to add+xor transform As shown in the affected test, we could increase instruction count without this limitation. There's another test with extra use that shows we still convert directly to a real "sext" if possible.	2020-10-11 09:04:24 -04:00
Sanjay Patel	85c7653d92	[InstCombine] add tests with extra uses for add+xor transform; NFC	2020-10-11 09:04:24 -04:00
Sanjay Patel	c5138e61e1	[InstCombine] add/adjust tests for add+xor -> shifts; NFC	2020-10-11 09:04:24 -04:00
Kazushi (Jam) Marukawa	86f69689f9	[VE][NFC] Clean VEISelLowering.cpp Clean the order of setOperationActions and others. Differential Revision: https://reviews.llvm.org/D89203	2020-10-11 21:47:50 +09:00
Simon Pilgrim	c7f3bc87d3	Fix Wdocumentation warning. NFCI. Add a space after /param names before any commas otherwise the doxygen parsers get confused.	2020-10-11 11:25:22 +01:00
Simon Pilgrim	913d7a110e	[X86][SSE2] Use smarter instruction patterns for lowering UMIN/UMAX with v8i16. This is my first LLVM patch, so please tell me if there are any process issues. The main observation for this patch is that we can lower UMIN/UMAX with v8i16 by using unsigned saturated subtractions in a clever way. Previously this operation was lowered by turning the signbit of both inputs and the output which turns the unsigned minimum/maximum into a signed one. We could use this trick in reverse for lowering SMIN/SMAX with v16i8 instead. In terms of latency/throughput this is the needs one large move instruction. It's just that the sign bit turning has an increased chance of being optimized further. This is particularly apparent in the "reduce" test cases. However due to the slight regression in the single use case, this patch no longer proposes this. Unfortunately this argument also applies in reverse to the new lowering of UMIN/UMAX with v8i16 which regresses the "horizontal-reduce-umax", "horizontal-reduce-umin", "vector-reduce-umin" and "vector-reduce-umax" test cases a bit with this patch. Maybe some extra casework would be possible to avoid this. However independent of that I believe that the benefits in the common case of just 1 to 3 chained min/max instructions outweighs the downsides in that specific case. Patch By: @TomHender (Tom Hender) ActuallyaDeviloper Differential Revision: https://reviews.llvm.org/D87236	2020-10-11 11:21:23 +01:00
Simon Pilgrim	7c71b44980	[InstCombine] Remove accidental unnecessary ConstantExpr qualification added in rGb752daa26b64155 MSVC didn't complain but everything else did....	2020-10-11 10:39:51 +01:00
Simon Pilgrim	b97093e520	[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139). Differential Revision: https://reviews.llvm.org/D88783	2020-10-11 10:37:20 +01:00
Simon Pilgrim	b752daa26b	[InstCombine] Replace getLogBase2 internal helper with ConstantExpr::getExactLogBase2. NFCI. This exposes the helper for other power-of-2 instcombine folds that I'm intending to add vector support to. The helper only operated on power-of-2 constants so getExactLogBase2 is a more accurate name.	2020-10-11 10:31:17 +01:00
Tobias Gysi	93377888ae	[mlir] add scf.if op canonicalization pattern that removes unused results The patch adds a canonicalization pattern that removes the unused results of scf.if operation. As a result, cse may remove unused computations in the then and else regions of the scf.if operation. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D89029	2020-10-11 10:40:28 +02:00
Xun Li	667dfe39ca	[Coroutines] Refactor/Rewrite Spill and Alloca processing This patch is a refactoring of how we process spills and allocas during CoroSplit. In the previous implementation, everything that needs to go to the heap is put into Spills, including all the values defined by allocas. And the way to identify a Spill is to check whether there exists a use-def relationship that crosses suspension points. This approach is fundamentally confusing, and unfortunately, incorrect. First of all, allocas are always processed differently from spills, hence it's quite confusing to put them together. It's much cleaner to separate them and process them separately. Doing so simplifies a lot of code and makes the logic clearer and easier to reason about. Secondly, use-def relationship is insufficient to decide whether a value defined by AllocaInst needs to go to the heap. There are many cases where a value defined by AllocaInst can implicitly be used across suspension points without a direct use-def relationship. For example, you can store the address of an alloca into the heap, and load that address after suspension. Or you can escape the address into an object through a function call. Or you can have a PHINode that takes two allocas, and this PHINode is used across a suspension point (when this happens, the existing implementation will spill the PHINode, a.k.a. a stack address, to the heap!). All these issues suggest that we need to separate spill and alloca in order to properly implement this. This patch does not yet fix these bugs, however it sets up the code in a better shape so that we can start fixing them in the next patch. The core idea of this patch is to add a new struct called FrameDataInfo, which contains all Spills, all Allocas, and a map from each definition to its layout index in the frame (FieldIndexMap). Spills and Allocas are identified, stored and processed independently. When they are initially added to the frame, we record their field index through FieldIndexMap. When the frame layout is finalized, we update each index to its final layout index. In doing so, I also cleaned up a few things and also discovered a few other bugs. Cleanups: 1. Found out that PromiseFieldId is not used, delete it. 2. Previously, SpillInfo is a vector, which is strange because every def can have multiple users. This patch cleans it up by turning it into a map from def to users. 3. Previously, a frame Field struct contains a list of Spills that field corresponds to. This isn't necessary since we only need the layout index for each given definition. This patch removes that list. Instead, we connect each field and definition using the FieldIndexMap. 4. All the loops that process Spills are simplified now because we use a map instead of a vector. Bugs: It seems that we are only keeping llvm.dbg.declare intrinsics in the .resume part of the function. The ramp function will no longer have them. This means we are dropping some debug information in the ramp function. The next step is to start fixing the bugs where the implementation fails to identify some allocas that should live on the frame. Differential Revision: https://reviews.llvm.org/D88872	2020-10-10 22:21:34 -07:00
Craig Topper	9895327914	[X86] Redefine X86ISD::PEXTRB/W and X86ISD::PINSRB/PINSRW to use a i8 TargetConstant for the immediate instead of a ptr constant. This is more consistent with other target specific ISD opcodes that require immediates.	2020-10-10 21:50:58 -07:00
Craig Topper	7f1b2a6125	[X86] AMX intrinsics should have ImmArg for the register numbers and use timm in isel patterns.	2020-10-10 20:12:28 -07:00
Craig Topper	375849518d	[X86] Add a X86ISD::BEXTRI to distinguish the case where the control must be a constant. The bextri intrinsic has an ImmArg attribute which will be converted in SelectionDAG using TargetConstant. We previously converted this to a plain Constant to allow X86ISD::BEXTR to call SimplifyDemandedBits on it. But while trying to decide if D89178 was safe, I realized that this conversion of TargetConstant to Constant would be one case where that would break. So this patch adds a new opcode specifically for the immediate case. And then teaches computeKnownBits and SimplifyDemandedBits to also handle it, but not try to SimplifyDemandedBits on it. To make up for that, I immediately masked the constant to 16 bits when converting from the intrinsic node to the X86ISD node.	2020-10-10 19:18:06 -07:00
Krzysztof Parzyszek	9237e73ae8	[Hexagon] Replace HexagonISD::VSPLAT with ISD::SPLAT_VECTOR This removes VSPLAT and VZERO. VZERO is now SPLAT_VECTOR of (i32 0). Included is also a testcase for the previous (target-independent) commit.	2020-10-10 19:49:47 -05:00
Krzysztof Parzyszek	61eaa2e14a	[SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR	2020-10-10 19:42:24 -05:00
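
Commit b97093e520 above folds `or(shl(a,x),lshr(b,sub(bw,x)))` into `fshl(a,b,x)` when `x < bw`. The following is a minimal scalar sketch of why the two forms agree, assuming a 32-bit width; the function names `orOfShifts` and `fshl32` are hypothetical and are not LLVM APIs, and the reference funnel shift is written from the usual "concatenate, shift, take the high half" definition rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Shift-based pattern from the commit message, with bw = 32.
// Only meaningful for 0 < x < 32: a C++ shift by 32 is undefined,
// so the x == 0 case (b >> 32) must not be evaluated here.
uint32_t orOfShifts(uint32_t a, uint32_t b, unsigned x) {
  return (a << x) | (b >> (32u - x));
}

// Reference funnel shift left (illustrative helper, not an LLVM API):
// concatenate a:b into 64 bits, shift left by x mod 32,
// and keep the high 32 bits of the result.
uint32_t fshl32(uint32_t a, uint32_t b, unsigned x) {
  uint64_t concat = (static_cast<uint64_t>(a) << 32) | b;
  return static_cast<uint32_t>((concat << (x % 32u)) >> 32);
}

// Returns true when both forms agree for every shift amount in 1..31.
bool formsAgree(uint32_t a, uint32_t b) {
  for (unsigned x = 1; x < 32; ++x)
    if (orOfShifts(a, b, x) != fshl32(a, b, x))
      return false;
  return true;
}
```

For example, with `a = 0x12345678`, `b = 0x9ABCDEF0` and `x = 8`, both functions yield `0x3456789A`. The sketch only covers shift amounts strictly between 0 and the bit width; how the actual fold proves its range condition through value tracking is not modelled here.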

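Commit 913d7a110e above lowers UMIN/UMAX for v8i16 through unsigned saturated subtraction. A scalar sketch of the arithmetic identity that makes this possible is given below, assuming one 16-bit lane; the helper names `subus16`, `uminViaSubus` and `umaxViaSubus` are hypothetical, and the exact instruction sequence chosen by the patch is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating subtraction on one 16-bit lane:
// a - b, clamped to 0 instead of wrapping around.
uint16_t subus16(uint16_t a, uint16_t b) {
  return a > b ? static_cast<uint16_t>(a - b) : uint16_t{0};
}

// umin(a, b) == a - subus(a, b):
//   a > b  -> a - (a - b) == b
//   a <= b -> a - 0       == a
uint16_t uminViaSubus(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(a - subus16(a, b));
}

// umax(a, b) == subus(a, b) + b:
//   a > b  -> (a - b) + b == a
//   a <= b -> 0 + b       == b
uint16_t umaxViaSubus(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(subus16(a, b) + b);
}
```

Neither identity needs the sign-bit flipping that the commit message describes for the previous lowering, which is the point of the change; the trade-offs it mentions for the reduction test cases are outside the scope of this sketch.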

368648 Commits All Branches Search
