llvm-project

Commit Graph

Author	SHA1	Message	Date
Lang Hames	aa061ddde7	[ORC] Fix the LLJITWithRemoteDebugging example. This was broken by the switch from JITTargetAddress to ExecutorAddr in `21a06254a3`.	2021-09-27 20:06:00 -07:00
Xiang1 Zhang	ebe9944a34	[ISel] Legalized arithmetic.fence.f128 for 32-bits target Reviewed By: Craig Topper, Wang Pengfei Differential Revision: https://reviews.llvm.org/D110467	2021-09-28 10:27:25 +08:00
Anna Thomas	90fb73aa73	[LoopPred Test] Fix lld-x86_64-win BB failure Need a more general CHECK line for testcase in `5df9112` for correctly handling lld-x86_64-win buildbot.	2021-09-27 21:28:46 -04:00
Ahsan Saghir	4f6a6ba126	Revert "tsan: fix trace tests on darwin" This reverts commit `94ea36649e`. Reverting due to errors on buildbots.	2021-09-27 20:17:17 -05:00
Anna Thomas	5df9112ce3	Reland "[LoopPredication] Add testcase showing BPI computation. NFC" This relands commit `16a62d4f`. Relanded after fixing CHECK-LINES for opt pipeline output to be more general (based on failures seen in buildbot).	2021-09-27 21:15:46 -04:00
Lang Hames	61e25d2550	clang-format	2021-09-27 18:02:06 -07:00
Lang Hames	22f8276fe4	[llvm-jitlink] Add more information about allocation failures. Slab allocator failures will now report requested size and remaining capacity.	2021-09-27 18:02:06 -07:00
Ahsan Saghir	593b074a09	[PowerPC] MMA - Add __builtin_vsx_build_pair and __builtin_mma_build_acc builtins This patch adds the following built-ins: __builtin_vsx_build_pair __builtin_mma_build_acc Reviewed By: #powerpc, nemanjai, lei Differential Revision: https://reviews.llvm.org/D107647	2021-09-27 19:51:28 -05:00
Lang Hames	21a06254a3	[ORC] Switch from JITTargetAddress to ExecutorAddr for EPC-call APIs. Part of the ongoing move to ExecutorAddr.	2021-09-27 16:53:09 -07:00
Michael Kruse	027c036663	[Polly] Reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964 Recommit with "REQUIRES: asserts" in test that uses statistics.	2021-09-27 18:49:11 -05:00
Joe Loser	9451d9da95	[libc++][NFC] s/enable_if<...>::type/enable_if_t<...> in span There is some use of `enable_if<...>::type` when the rest of the file uses `enable_if_t`. So, use `enable_if_t` consistently throughout.	2021-09-27 19:21:07 -04:00
Haowei Wu	283ed7de32	Revert "[Polly] Reject reject regions entered by an indirectbr/callbr." This reverts commit `91f46bb77e` which causes test failures when assertions are off.	2021-09-27 16:05:33 -07:00
Lang Hames	6fe2e9a9cc	[ORC] Hold shared_ptr<SymbolStringPool> in errors containing SymbolStringPtrs. This allows these error values to remain valid, even if they tear down the JIT itself.	2021-09-27 15:46:56 -07:00
Congzhe Cao	c42772752a	[CodeMoverUtils] Enhance isSafeToMoveBefore() when control flow equivalence is satisfied With improved analysis in determining CFG equivalence that does not require strict dominance and post-dominance conditions, we now relax isSafeToMoveBefore() such that an instruction I can be moved before InsertPoint even if they do not strictly dominate each other, as long as they follow the same control flow path. For example, we can move Instruction 0 before Instruction 1, and vice versa. ``` if (cond1) // Instruction 0: %add = add i32 1, 2 if (cond1) // Instruction 1: %add2 = add i32 2, 1 ``` Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D110456	2021-09-27 18:37:36 -04:00
Kevin Athey	b345952ad4	Revert "tsan: add a test for stack init race" This reverts commit `b72176b9bc`. Broke bot: https://lab.llvm.org/buildbot/#/builders/70/builds/12193	2021-09-27 15:31:23 -07:00
LLVM GN Syncbot	57cd7b018c	[gn build] Port `6cfb4d46ba`	2021-09-27 21:56:39 +00:00
Jozef Lawrynowicz	6cfb4d46ba	[llvm-readobj] Support dumping of MSP430 ELF attributes The MSP430 ABI supports build attributes for specifying the ISA, code model, data model and enum size in ELF object files. Differential Revision: https://reviews.llvm.org/D107969	2021-09-28 00:56:11 +03:00
Jon Chesterfield	2bc4d48a78	[libomptarget][amdgpu] Follow on to D110513, empty kernarg pools are not fatal	2021-09-27 22:44:35 +01:00
Jon Chesterfield	738734f655	[libomptarget][amdgpu] Report zero devices if plugin construction fails, instead of segv	2021-09-27 22:13:12 +01:00
Anna Thomas	a0a9e3e05f	Revert "[LoopPredication] Add testcase showing BPI computation. NFC" This reverts commit `16a62d4f3d`. Needs some update to check lines to fix bb failure.	2021-09-27 17:08:57 -04:00
Louis Dionne	1e628d0c14	[libc++] Do not enable P1951 before C++23, since it's a breaking change In reaction to the issues raised by Richard in https://llvm.org/D109066, this commit does not apply P1951 as a DR in previous standard modes, since it breaks valid code. I do believe it should be applied as a DR, however ideally we'd get some sort of statement from the Committee to this effect (and all implementations would behave consistently). In the meantime, only implement P1951 starting with C++23 -- we can always come back and apply it as a DR if that's what the Committee says. Differential Revision: https://reviews.llvm.org/D110347	2021-09-27 17:06:44 -04:00
Anna Thomas	16a62d4f3d	[LoopPredication] Add testcase showing BPI computation. NFC Precommit testcase for D110438. Since we do not preserve BPI in loop pass manager, we are forced to compute BPI every time Loop predication is invoked. The patch referenced changes that behaviour by preserving lossy BPI for loop passes.	2021-09-27 16:54:22 -04:00
Simon Pilgrim	540ed354d3	[X86] Add slow/fast pmulld test coverage to vector-mul.ll	2021-09-27 21:53:56 +01:00
Kostya Kortchinsky	04f5913395	[gwp-asan] Initialize AllocatorVersionMagic at runtime GWP-ASan's `AllocatorState` was recently extended with an `AllocatorVersionMagic` structure required so that GWP-ASan bug reports can be understood by tools at different versions. On Fuchsia, this is included in the `scudo::Allocator` structure, and by having non-zero initializers, this effectively moved the static allocator structure from the `.bss` segment to the `.data` segment, thus increasing (significantly) the size of the libc. This CL proposes to initialize the structure with its magic numbers at runtime, allowing for the allocator to go back into the `.bss` segment. I will work on adding a test on the Scudo side to ensure that this type of change gets detected early on. Additional work is also needed to reduce the footprint of the (large) memory-tagging related structures that are currently part of the allocator. Differential Revision: https://reviews.llvm.org/D110575	2021-09-27 13:49:55 -07:00
Roman Lebedev	ee6228ff8c	[NFC][X86] Add 'gather' optsize/minsize test coverage	2021-09-27 23:49:10 +03:00
Florian Mayer	4f352d444e	[NFC] [PSI] explain encoding of PercentileCutoff. Reviewed By: mtrofin, davidxl Differential Revision: https://reviews.llvm.org/D109764	2021-09-27 21:41:33 +01:00
Fangrui Song	75f0194d3d	[Driver] Remove confusing *-linux-android detection with non-android --target= These values allow, for example, `--target=aarch64` and `--target=aarch64-linux-gnu` to detect `aarch64-linux-android`. This is confusing. Users should specify `--target=aarch64-linux-android` to get Android GCC installation. Reverts D53463. Reviewed By: nickdesaulniers, danalbert Differential Revision: https://reviews.llvm.org/D110379	2021-09-27 13:28:40 -07:00
Roman Lebedev	f7e82e4fa8	[NFC][X86] Add test showing that legal `GATHER`'s are expanded on Znver3	2021-09-27 22:41:09 +03:00
modimo	20faf78919	[ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities. This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=0`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build: 1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities. 2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time. Implementation-wise this adds the following summary function attributes: 1. noUnwind: function is noUnwind 2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions) 3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well) Testing: Clang self-build passes and 2nd stage build passes check-all ninja check-all with newly added tests passing Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D36850	2021-09-27 12:28:07 -07:00
Tobias Gysi	d20d0e145d	[mlir][linalg] Finer-grained padding control. Adapt the signature of the PaddingValueComputationFunction callback to either return the padding value or failure to signal padding is not desired. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110572	2021-09-27 19:21:37 +00:00
Roman Lebedev	2a7a768dad	[X86][Costmodel] Load/store i16 Stride=4 VF=32 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For this tuple, measuring becomes problematic since there's a lot of spilling going on, but apparently all these memory ops do not affect worst-case estimate at all here. For load we have: https://godbolt.org/z/zP4hd8MT6 - for intels `Block RThroughput: =150.0`; for ryzens, `Block RThroughput: <=59` So pick cost of `150`. For store we have: https://godbolt.org/z/vKb8zTK8E - for intels `Block RThroughput: =32.0`; for ryzens, `Block RThroughput: <=24.0` So pick cost of `64`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110548	2021-09-27 22:20:01 +03:00
Roman Lebedev	ee5a050e2e	[X86][Costmodel] Load/store i16 Stride=4 VF=16 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/Wd9cKab83 - for intels `Block RThroughput: =75.0`; for ryzens, `Block RThroughput: <=29.5` So pick cost of `75`. (note that `# 32-byte Reload` does not affect throughput there.) For store we have: https://godbolt.org/z/Wd9cKab83 - for intels `Block RThroughput: =32.0`; for ryzens, `Block RThroughput: <=12.0` So pick cost of `32`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110543	2021-09-27 22:20:01 +03:00
Roman Lebedev	5615d6a6dd	[X86][Costmodel] Load/store i16 Stride=4 VF=8 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/dd8T5P471 - for intels `Block RThroughput: =33.0`; for ryzens, `Block RThroughput: <=14.5` So pick cost of `33`. For store we have: https://godbolt.org/z/zPxcKWhn4 - for intels `Block RThroughput: =10.0`; for ryzens, `Block RThroughput: <=6.0` So pick cost of `10`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110541	2021-09-27 22:20:01 +03:00
Roman Lebedev	df2b42d12e	[X86][Costmodel] Load/store i16 Stride=4 VF=4 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/rnsf639Wh - for intels `Block RThroughput: =17.0`; for ryzens, `Block RThroughput: <=7.5` So pick cost of `17`. For store we have: https://godbolt.org/z/565KKrcY6 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: =2.0` So pick cost of `6`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110537	2021-09-27 22:20:01 +03:00
Roman Lebedev	45caac91c4	[X86][Costmodel] Load/store i16 Stride=4 VF=2 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/5EYc6r9nh - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=3.0` So pick cost of `6`. For store we have: https://godbolt.org/z/z61e5d6GE - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0` So pick cost of `2`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110536	2021-09-27 22:20:01 +03:00
Chris Bieneman	18cf5b220d	Fixing docs build I always forget that new line...	2021-09-27 14:16:28 -05:00
Chris Bieneman	1e48ef2035	Implement #pragma clang final extension This patch adds a new preprocessor extension ``#pragma clang final`` which enables warning on undefinition and re-definition of macros. The intent of this warning is to extend beyond ``-Wmacro-redefined`` to warn against any and all alterations to macros that are marked `final`. This warning is part of the ``-Wpedantic-macros`` diagnostics group. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D108567	2021-09-27 14:11:16 -05:00
Sanjay Patel	fdba1dccbe	[InstCombine] reduce code for shl-of-sub transform; NFC	2021-09-27 14:56:01 -04:00
Sanjay Patel	b75ed244af	[InstCombine] add tests for shl-of-sub; NFC	2021-09-27 14:56:01 -04:00
Aart Bik	06e2a0684e	[mlir][sparse] sampled matrix multiplication fusion test This integration test runs a fused and non-fused version of sampled matrix multiplication. Both should eventually have the same performance! NOTE: relies on pending tensor.init fix! Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D110444	2021-09-27 11:50:49 -07:00
Nico Weber	36dc5c048a	Revert "[clangd] Refactor IncludeStructure: use File (unsigned) for most computations" This reverts commit `0b1eff1bc5`. Breaks check-clangd on Windows, see comments on https://reviews.llvm.org/D110386	2021-09-27 14:38:18 -04:00
Jon Chesterfield	80fa43fe9a	Revert "[openmp] Add addrspacecast to getOrCreateIdent" This reverts commit `1a761e5b7b`. Failed CI, albeit with a different failure mode to BZ51982	2021-09-27 19:27:35 +01:00
Jon Chesterfield	1a761e5b7b	[openmp] Add addrspacecast to getOrCreateIdent Fixes 51982. Minor refactor to remove `return x = y` construct. Test case derived from https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting parts while checking the assertion failure still occurred. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110556	2021-09-27 19:23:12 +01:00
Aart Bik	ec97a205c3	[mlir][sparse] preserve zero-initialization for materializing buffers This revision makes sure that when the output buffer materializes locally (in contrast with the passing in of output tensors either in-place or not in-place), the zero initialization assumption is preserved. This also adds a bit more documentation on our sparse kernel assumption (viz. TACO assumptions). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D110442	2021-09-27 11:22:05 -07:00
Aaron Ballman	ef0f728abe	Add a missing include to appease the build bots	2021-09-27 14:19:39 -04:00
Sanjay Patel	623f93ed1c	[InstCombine] add use check to shl transform This bug was introduced with the refactoring in: `9075edc89b` ...but there were no tests to detect it.	2021-09-27 14:10:26 -04:00
Sanjay Patel	d992950078	[InstCombine] add tests for opposing shifts separated by trunc; NFC	2021-09-27 14:10:26 -04:00
Jameson Nash	e27a6db529	Bad SLPVectorization shufflevector replacement, resulting in write to wrong memory location We see that it might otherwise do: %10 = getelementptr {}, <2 x {}> %9, <2 x i32> <i32 10, i32 4> %11 = bitcast <2 x {}*> %10 to <2 x i64> ... %27 = extractelement <2 x i64> %11, i32 0 %28 = bitcast i64 %27 to <2 x i64>* store <2 x i64> %22, <2 x i64>* %28, align 4, !tbaa !2 Which is an out-of-bounds store (the extractelement got offset 10 instead of offset 4 as intended). With the fix, we correctly generate extractelement for i32 1 and generate correct code. Differential Revision: https://reviews.llvm.org/D106613	2021-09-27 14:06:13 -04:00
Carlos Galvez	b2a2c38349	Fix bug in readability-uppercase-literal-suffix Fixes https://bugs.llvm.org/show_bug.cgi?id=51790. The check triggers incorrectly with non-type template parameters. A bisect determined that the bug was introduced here: `ea2225a10b` Unfortunately that patch can no longer be reverted on top of the main branch, so add a fix instead. Add a unit test to avoid regression in the future.	2021-09-27 14:03:53 -04:00
peter klausler	9eab0da183	[flang] Catch branching into FORALL/WHERE constructs Enforce constraints C1034 & C1038, which disallow the use of otherwise valid statements as branch targets when they appear in FORALL &/or WHERE constructs. (And make the diagnostic message somewhat more user-friendly.) Differential Revision: https://reviews.llvm.org/D109936	2021-09-27 10:51:44 -07:00

400160 Commits

All Branches