llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	78b0f3701d	[HIPSPV][1/4] Refactor HIP tool chain This patch refactors the HIP tool chain for new HIP tool chain, HIPSPV tool chain, which is added in the follow up patch part 2. Rename HIPToolChain to HIPAMDToolChain and Renames HIP.* files to HIPAMD.*. Introduce HIPUtility.* file where common HIP utilities, shared among HIP tool chain implementations, are placed in. Move constructHIPFatbinCommand() and constructGenerateObjFileFromHIPFatBinary() to HIPUtility. HIPSPV tool chain is going to use them. Tweak bundle target ID in constructHIPFatbinCommand(): extra dashes are dropped if the Target ID is empty and 'hip' offload kind is made default for non-AMD targets. Patch by: Henry Linjamäki Reviewed by: Yaxun Liu, Artem Belevich, Eric Christopher Differential Revision: https://reviews.llvm.org/D110549	2021-12-13 10:50:25 -05:00
Nikita Popov	220815a91a	[AMDGPUPerfHintAnalysis] Avoid getPointerElementType() Extract the load/store type from the instruction rather than fetching it from the pointer element type.	2021-12-13 16:48:21 +01:00
Neubauer, Sebastian	26924b57e8	[AMDGPU] Ignore special ABI registers for graphics Fixed ABI arguments are compute specific and should not be added to graphics shaders or functions, so do not try to add them. Differential Revision: https://reviews.llvm.org/D115344	2021-12-13 16:44:37 +01:00
Lei Zhang	5e55a20119	[mlir][spirv] Serialize selection with separate header block The previous "optimization" that tries to reuse existing block for selection header block can be problematic for deserialization because it effectively pulls in previous ops in the selection op's enclosing block into the selection op's header. When deserializing, those ops will be placed in the selection op's region. If any of the previous ops has usage after the selection op, it will break. That is, the following IR cannot round trip: ```mlir ^bb: %def = ... spv.mlir.selection { ... } %use = spv.SomeOp %def ``` This commit removes the "optimization" to always create new blocks for the selection header. Along the way, also made error reporting better in deserialization by turning asserts into proper errors and add check of uses outside of sinked structured control flow region blocks. Reviewed By: Hardcode84 Differential Revision: https://reviews.llvm.org/D115582	2021-12-13 10:42:26 -05:00
Chuanqi Xu	9db8162820	[NFC] Format .cppm files in tests	2021-12-13 23:32:25 +08:00
Louis Dionne	7c1d4c2e77	[libc++abi][NFC] Fix comment	2021-12-13 10:29:29 -05:00
Sanjay Patel	f46a9c8edd	[InstCombine] don't automatically drop poison-generating flags in SimplifyVectorDemandedElts I noticed this while reviewing the test diffs in D115460 (and so the diffs in that patch will be reduced if this one is applied first). This is effectively a revert of `3436dc2923` ( https://reviews.llvm.org/rG3436dc29239d ) - since that commit, we've made several enhancements, so the reasoning there is no longer valid. Specifically, we added a poison value to IR, and we clarified the behavior of undef/poison elements in a shuffle mask: https://llvm.org/docs/LangRef.html#shufflevector-instruction Alive2 seems to agree that the propagation of flags in the test diffs shown here are valid: https://alive2.llvm.org/ce/z/UuY-jr https://alive2.llvm.org/ce/z/GXoMD9 https://alive2.llvm.org/ce/z/nVCyVH Differential Revision: https://reviews.llvm.org/D115526	2021-12-13 10:12:19 -05:00
Mogball	843534db3c	[mlir][ods] Fix OpDefinitionsGen infer return types builder with regions Despite handling regions and inferred return types, the builder was never generated for ops with both InferReturnTypeOpInterface and regions. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D115525	2021-12-13 15:11:35 +00:00
Kadir Cetinkaya	a47af1ac34	[clangd][Dex] Fix crashes when building trigrams for empty identifier	2021-12-13 15:58:33 +01:00
gysit	6c85a49e22	[mlir][memref] Use current source type in getCanonicalSubViewResultType. Use the current instead of the new source type to compute the rank-reduction map in getCanonicalSubViewResultType. Otherwise, the computation of the rank-reduction map fails when folding a cast into a subview since the strides of the new source type cannot be related to the strides of the current result type. Depends On D115428 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115446	2021-12-13 14:50:41 +00:00
Jay Foad	16de2c09dd	[AMDGPU] SIShrinkInstructions: sink code to where it's used. NFC.	2021-12-13 14:46:40 +00:00
Jay Foad	63681527ee	[AMDGPU] SIShrinkInstructions: remove redundant check canShrink already calls hasVALU32BitEncoding, so there is no need to call it again here.	2021-12-13 14:46:40 +00:00
Jay Foad	61f8af2657	[AMDGPU] Remove a FIXME implemented in D11061	2021-12-13 14:46:40 +00:00
Nikita Popov	432c41ebe9	[SLP] Avoid getPointerElementType() call Use the load result type instead of the element type of the load pointer operand.	2021-12-13 15:46:13 +01:00
Pavel Labath	529e03ea65	[lldb] Remove named function arguments from TestQemuLaunch This is a swig-4 feature.	2021-12-13 15:30:26 +01:00
Nikita Popov	9cbab13282	[ConstantsTest] Avoid crash with opaque pointers With opaque pointers there will be no bitcast, so don't assume that.	2021-12-13 15:23:12 +01:00
Daniil Fukalov	e5c64b45be	[CostModel][AMDGPU] Fix intrinsics costs estimations. 1. Fixed costs inconsistency for llvm.fma.vXf16 intrinsics. 2. Added tests for llvm.sadd.sat, llvm.ssub.sat, llvm.uadd.sat, llvm.usub.sat intrinsics since they have special processing in cost model. 3. Minor intrinsics' costs tests update and refinement. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D115385	2021-12-13 17:17:34 +03:00
Markus Böck	664cc9312c	[mlir] Implement `DataLayoutTypeInterface` for `LLVMStructType` Using this implementation of the interface it is possible to query the size, ABI alignment as well as the preferred alignment of a struct. It should yield the same results as LLVM's `llvm::DataLayout` on an equivalent `llvm::StructType`, including for packed structs. Additionally it is also possible to increase the ABI and preferred alignment using a data layout entry with the type `llvm.struct<()>`, which serves the same functionality as the `a:` component in LLVM's data layout string. Differential Revision: https://reviews.llvm.org/D115600	2021-12-13 15:09:16 +01:00
Jon Chesterfield	28345d7f6f	[amdgpu] Add regression test for LDS in metadata	2021-12-13 13:35:38 +00:00
Florian Hahn	e2885c7c9b	[VPlan] Add printing test with VPInstruction with debug locs. Test case for D113223.	2021-12-13 13:08:41 +00:00
gysit	db7a2e9176	[mlir][linalg] Only compose PadTensorOps if no ExtractSliceOp is rank-reducing. Do not compose pad tensor operations if the extract slice of the outer pad tensor operation is rank reducing. The inner extract slice op cannot be rank-reducing since its source type must match the desired type of the padding. Depends On D115359 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115428	2021-12-13 13:01:30 +00:00
gysit	6859f8ed1e	[mlir][linalg] Adapt the PadTensorOpVectorizationWithInsertSlicePattern matching. Tighten the matcher of the PadTensorOpVectorizationWithInsertSlicePattern pattern. Only match if the PadOp result is used by the InsertSliceOp source. Fail if the result is used by the InsertSliceOp dest. Depends On D115336 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115359	2021-12-13 12:55:07 +00:00
gysit	f895e95138	[mlir][linalg] Make padding work for rank-reducing slice ops. Adapt the computation of a static bounding box to take rank-reducing slice operations into account by filtering out reduced size one dimensions. The revision is needed to make padding work for decomposed convolution operations. The decomposition introduces rank reducing extract slice operations that previously let padding fail. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115336	2021-12-13 12:34:20 +00:00
Nico Weber	45158b1804	Revert "[NFC] format .cppm files in test" This reverts commit `7c51a12833`. Breaks SemaCXX/modules-ts.cppm in check-clang.	2021-12-13 07:13:17 -05:00
Florian Hahn	42263e7d26	[LV] Add test with debug locations on branches that get scalarized.	2021-12-13 12:06:35 +00:00
Nico Weber	b6f317d94d	[gn build] Make arm_neon_sve_bridge.h header auto-syncable	2021-12-13 07:04:45 -05:00
Evgeniy Brevnov	7002125cff	[LV][NFC] Fix debug message to print out resulting clamped VF	2021-12-13 18:54:05 +07:00
Chuanqi Xu	7c51a12833	[NFC] format .cppm files in test	2021-12-13 19:52:31 +08:00
Dmitry Vyukov	9fb8058a80	tsan: enable the new runtime This enables the new runtime (D112603) by default. Depends on D112603. Differential Revision: https://reviews.llvm.org/D115624	2021-12-13 12:50:13 +01:00
Dmitry Vyukov	b332134921	tsan: new runtime (v3) This change switches tsan to the new runtime which features: - 2x smaller shadow memory (2x of app memory) - faster fully vectorized race detection - small fixed-size vector clocks (512b) - fast vectorized vector clock operations - unlimited number of alive threads/goroutines Depends on D112602. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D112603	2021-12-13 12:48:34 +01:00
Peter Waller	921e89c59a	[SVE] Only combine (fneg (fma)) => FNMLA with nsz -(Za + Zm * Zn) != (-Za + Zm * (-Zn)) when the FMA produces a zero output (e.g. all zero inputs can produce -0 output) Add a PatFrag to check presence of nsz on the fneg, add tests which ensure the combine does not fire in the absence of nsz. See https://reviews.llvm.org/D90901 for a similar discussion on X86. Differential Revision: https://reviews.llvm.org/D109525	2021-12-13 11:33:07 +00:00
Matt Devereau	41def32040	[AArch64][SVE][NEON] Add NEON-SVE-Bridge intrinsics Adds svset_neonq, svget_neonq, svdup_neonq AArch64 intrinsics. These are described in the ACLE specification: https://github.com/ARM-software/acle/pull/72 https://reviews.llvm.org/D114713	2021-12-13 11:31:57 +00:00
Kazushi (Jam) Marukawa	cffce86a1c	[VE] Support srel32 in symbol reference Support R_VE_SREL32 in symbol references in MC layer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D115591	2021-12-13 20:29:17 +09:00
Kazushi (Jam) Marukawa	d1057f9604	[VE] Support R_VE_RELATIVE Change getELFRelativeRelocationType() to return R_VE_RELATIVE as a preparation of lld for VE. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D115592	2021-12-13 20:28:35 +09:00
Matt Devereau	2e585dd91a	[AArch64][SVE] Lower vector.insert to predicated merged MOV Use predicated SEL for vector.insert instead of going through memory Differential Revision: https://reviews.llvm.org/D115259	2021-12-13 11:17:55 +00:00
Florian Hahn	e90630e5a5	[VPlan] Remove unused createNaryOp (NFC).	2021-12-13 11:11:00 +00:00
Dmitry Vyukov	b088833375	tsan: deflake dlopen_static_tls.cpp Currently the test calls dlclose in the thread concurrently with the main thread calling a function from the dynamic library. This is not good. Wait for the main thread to call the function before calling dlclose. Depends on D115612. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D115613	2021-12-13 12:01:40 +01:00
Dmitry Vyukov	7de546e9e8	tsan: deflake flush_memory.cpp The test contains a race and checks that it's detected. But the race may not be detected since we are doing aggressive flushes and if the state flush happens between racing accesses, tsan won't detect the race. So return 1 to make the test deterministic regardless of the race. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D115612	2021-12-13 12:01:30 +01:00
Fraser Cormack	b0319ab79b	[PR52475] Ensure a correct chain in copies to/from hidden sret parameter This patch fixes an issue during SelectionDAG construction. When the target is unable to lower the function's return value, a hidden sret parameter is created. It is initialized and copied to a stored variable (DemoteRegister) with CopyToReg and is later fetched with CopyFromReg. The bug is that the chains used for each copy are inconsistent, and thus in rare cases the scheduler may issue them out of order. The fix is to ensure that the CopyFromReg uses the DAG root which is set as the chain corresponding to the initial CopyToReg. Fixes https://llvm.org/PR52475 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D114795	2021-12-13 10:46:32 +00:00
Nikita Popov	396370e889	[MemCpyOpt] Add additional call slot capture tests (NFC) One test shows a miscompile when bitcasts are involved, the others cases where we can perform the optimization despite a capture.	2021-12-13 10:57:06 +01:00
Simon Moll	9feeb2fb61	[VE][NFC] Cleanup vector patterns Cleanup VE vector isel patterns and follow the downstream LLVM-VE pattern naming convention. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D115516	2021-12-13 10:12:27 +01:00
Siva Chandra Reddy	d37d0aadbf	[libc][NFC] Add back NOLINT annotations to PolyEval. They were accidentally removed in a previous change.	2021-12-13 07:08:08 +00:00
Evgeniy Brevnov	2025e0985c	[LV] Make sure VF doesn't exceed compile time known TC For the simple copy loop (see test case) vectorizer selects VF equal to 32 while the loop is known to have 17 iterations only. Such behavior makes no sense to me since such vector loop will never be executed. The only case we may want to select VF larger than TC is masked vectorization. So I haven't touched that case. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D114528	2021-12-13 13:48:46 +07:00
Fangrui Song	9115d75117	[ELF] Use parallelSort for .rela.dyn An unstable sort suffices. In a large link (11.06s), this decreases .rela.dyn writeTo time from 1.52s to 0.81s, resulting in 6% total time speedup (the benefit will greatly dilute if --pack-dyn-relocs=relr becomes prevailing). Encoding the dynamic relocations then sorting raw Elf_Rel/Elf_Rela doesn't seem to improve much (doing that would require code duplicate because of Elf_Rel/Elf_Rela plus unfortunate mips64le), so don't do that.	2021-12-12 20:53:06 -08:00
Fangrui Song	1eaa9b4374	[ELF] initializeSections: move SHT_LLVM_CALL_GRAPH_PROFILE check into SHF_EXCLUDE && !relocatable. NFC Avoid a comparison in the majority of cases.	2021-12-12 20:05:21 -08:00
Fangrui Song	d29766bb48	[ELF] relocateAlloc: remove variables type and expr. NFC	2021-12-12 19:31:30 -08:00
Fangrui Song	4cfff19b88	[ELF] Move adjustSplitStackFunctionPrologues's splitStack check to the caller. NFC Avoid a function call in the majority of cases and make the output smaller.	2021-12-12 19:26:03 -08:00
Fangrui Song	a8024dfc06	[ELF] Avoid mutable addend parameter. NFC	2021-12-12 19:12:01 -08:00
Fangrui Song	5fadb39e9b	[Driver][test] Make some tests work with CLANG_DEFAULT_PIE_ON_LINUX=on Also delete some cross-linux.c tests which are covered by linux-cross.cpp	2021-12-12 16:28:33 -08:00
Kazu Hirata	bb6447a78c	[llvm] Use llvm::reverse (NFC)	2021-12-12 16:13:49 -08:00


407321 Commits
