llvm-project

Commit Graph

Author	SHA1	Message	Date
Dmitry Vyukov	94ea36649e	tsan: fix trace tests on darwin The trace tests crashed on darwin because of some thread initialization issues (thread initialization is somewhat different on darwin). Instead of starting real threads, create a new ThreadState in the main thread. This makes the tests more unit-testy and hopefully won't crash on darwin (there is almost no platform-specific code involved now). This will also help with future trace tests that will need more than 1 thread. Creating more than 1 real thread and dispatching test actions across multiple threads in the required deterministic order is painful. Depends on D110539. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110546	2021-09-27 16:40:57 +02:00
Dmitry Vyukov	b72176b9bc	tsan: add a test for stack init race Depends on D110538. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110539	2021-09-27 16:40:17 +02:00
Dmitry Vyukov	b4c1e5cb73	tsan: fix and test detection of TLS races Currently detection of races with TLS/stack initialization is broken because we imitate the write before thread initialization, so it's modelled with a wrong thread/epoch. Fix that and add a test. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110538	2021-09-27 16:40:08 +02:00
Sebastian Neubauer	bf980930e5	[AMDGPU] Ignore KILLs when forming clauses KILL instructions are sometimes present and prevented hard clauses from being formed. Fix this by ignoring all meta instructions in clauses. Differential Revision: https://reviews.llvm.org/D106042	2021-09-27 16:33:52 +02:00
Nico Weber	63bb2d585e	[clang] Put original flags on 'Driver args:' crash report line We used to put the canonical spelling of flags after alias processing on that line. For clang-cl in particular, that meant that we put flags on that line that the clang-cl driver doesn't even accept, and the "Driver args:" line wasn't usable. Differential Revision: https://reviews.llvm.org/D110458	2021-09-27 10:24:46 -04:00
Dmitry Vyukov	1455b552b7	tsan: de-hardcode MemCount const Use MemCount instead of hard-coded value 7. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110532	2021-09-27 16:11:49 +02:00
Michał Górny	33031545bf	[lldb] [DynamicRegisterInfo] Add a convenience method to add suppl. registers Add a convenience method to add supplementary registers that takes care of adding invalidate_regs to all (potentially) overlapping registers. Differential Revision: https://reviews.llvm.org/D110023	2021-09-27 16:01:30 +02:00
Sjoerd Meijer	eba76056a3	[FuncSpec] Don't specialise (or crash) on poison or constexpr values Function specialization was crashing on poison values and constexpr values. The problem is that these values are not added to the solver, so it crashes when a lookup is performed for these values. This fixes that by not specialising on these values. For poison that is obvious, but for constexpr this is a change in behaviour. Thus, in one way this is a bit of a stopgap, but specialising on constexpr values wasn't done very intentionally, and needs some more work and tests if we wanted to support this. As a follow up, we need to check whether the solver should exit more gracefully and return a "don't know", or whether it should really support these constexprs. This should fix PR51600 (https://bugs.llvm.org/show_bug.cgi?id=51600). Differential Revision: https://reviews.llvm.org/D110529	2021-09-27 14:58:53 +01:00
David Green	ebee606e38	[AArch64] Fix neon-reverseshuffle test extension. NFC Apparently I gave a ll file a .patch extension. Oops.	2021-09-27 14:43:26 +01:00
Aaron Ballman	38d09080c9	Removing a default constructor argument; NFC The argument is always used with its default value, so remove the argument entirely.	2021-09-27 09:41:28 -04:00
Sjoerd Meijer	a588ae482b	[LoopFlatten] Precommit new test widen-iv2.ll for D110234.	2021-09-27 14:37:44 +01:00
gbreynoo	05b1c7aebf	[llvm-dwarfdump][docs] Add missing options to the help output and the command guide This change is to add some missing details to the help text and command guide: - Added a note to the command guide that --debug-macro also dumps .debug_macinfo. - Added a note to the command guide that --debug-frame and --eh_frame are aliases, and in cases where both sections are present one command outputs both. - Changed the wording in the help output for --ignore-case and --regex to closer match the command guide.	2021-09-27 14:28:31 +01:00
Jun Ma	3a998c06a8	Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""" This reverts commit `8ba2adcf9e`.	2021-09-27 20:39:05 +08:00
LLVM GN Syncbot	e2eb651cfc	[gn build] Port `9da2fa277e`	2021-09-27 12:33:13 +00:00
Michał Górny	9da2fa277e	[lldb] Move StringConvert inside debugserver The StringConvert API is no longer used anywhere but in debugserver. Since debugserver does not use LLVM API, we cannot replace it with llvm::to_integer() and llvm::to_float() there. Let's just move the sources into debugserver. Differential Revision: https://reviews.llvm.org/D110478	2021-09-27 14:32:42 +02:00
Pushpinder Singh	b1695c2eb8	[AMDGPU][OpenMP] Add memory pool size check to isValidMemoryPool Keeping all the checks in one place for future simplification. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D110513	2021-09-27 12:29:00 +00:00
Michał Górny	93b82f45bc	[lldb] [Host] Refactor XML converting getters Refactor the XML converting attribute and text getters to use LLVM API. While at it, remove some redundant error and missing XML support handling, as the called base functions do that anyway. Add tests for these methods. Note that this patch changes the getter behavior to be IMHO more correct. In particular: - negative and overflowing integers are now reported as failures to convert, rather than being wrapped over or capped - digits followed by text are now reported as failures to convert to double, rather than their numeric part being converted Differential Revision: https://reviews.llvm.org/D110410	2021-09-27 14:26:33 +02:00
Michael Kruse	1b242dccff	[OpenMP][CMake] Use in-project clang as CUDA->IR compiler for new DeviceRTL. Use the in-project clang, llvm-link and opt if available and unless CMake cache variables specify to use a different compiler. This applies D101265 to the new DeviceRTL's CMakeLists.txt which was copied before D101265 was applied. Fixes the openmp-offloading-cuda-runtime builder which was failing since D110006. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D110251	2021-09-27 07:14:19 -05:00
Tobias Gysi	e158b5634a	[mlir][linalg] Make fusion on tensor rewriter friendly (NFC). Let the calling pass or pattern replace the uses of the original root operation. Internally, the tileAndFuse still replaces uses and updates operands but only of newly created operations. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D110169	2021-09-27 11:28:25 +00:00
Emre Kultursay	d5629b5d4d	Fix rendezvous for rebase_exec=true case When rebase_exec=true in DidAttach(), all modules are loaded before the rendezvous breakpoint is set, which means the LoadInterpreterModule() method is not called and m_interpreter_module is not initialized. This causes the very first rendezvous breakpoint hit with m_initial_modules_added=false to accidentally unload the module_sp that corresponds to the dynamic loader. This bug (introduced in D92187) was causing the rendezvous mechanism to not work in Android 28. The mechanism works fine on older/newer versions of Android. Test: Verified rendezvous on Android 28 and 29 Test: Added dlopen test Reviewed By: labath Differential Revision: https://reviews.llvm.org/D109797	2021-09-27 13:27:27 +02:00
Roman Lebedev	7424deb743	[X86][Costmodel] Load/store i16 Stride=2 VF=32 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/q6GbK89br - for intels `Block RThroughput: =18.0`; for ryzens, `Block RThroughput: <=7.0` So pick cost of `18`. For store we have: https://godbolt.org/z/Yzfoo5TnW - for intels `Block RThroughput: =8.0`; for ryzens, `Block RThroughput: <=4.0` So pick cost of `8`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110507	2021-09-27 14:21:12 +03:00
Roman Lebedev	a5113e9445	[X86][Costmodel] Load/store i16 Stride=2 VF=16 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/Y1E7qnjz8 - for intels `Block RThroughput: =9.0`; for ryzens, `Block RThroughput: <=3.5` So pick cost of `9`. For store we have: https://godbolt.org/z/Y1E7qnjz8 - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0` So pick cost of `4`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110506	2021-09-27 14:20:11 +03:00
Roman Lebedev	70c90cc5bd	[X86][Costmodel] Load/store i16 Stride=2 VF=8 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/e5YE99a4P - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: =2.0` So pick cost of `6`. For store we have: https://godbolt.org/z/3vM4KsE1n - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=2.0` So pick cost of `3`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110505	2021-09-27 14:18:29 +03:00
Roman Lebedev	49e532aa52	[X86][Costmodel] Load/store i16 Stride=2 VF=4 interleaving costs The only sched models for CPUs that support avx2 but not avx512 are: haswell, broadwell, skylake, zen1-3 For load we have: https://godbolt.org/z/1j3nf3dro - for intels `Block RThroughput: =2.0`; for ryzens, `Block RThroughput: <=1.0` So pick cost of `2`. For store we have: https://godbolt.org/z/4n1zvP37j - for intels `Block RThroughput: =1.0`; for ryzens, `Block RThroughput: <=0.5` So pick cost of `1`. I'm directly using the shuffling asm the llc produced, without any manual fixups that may be needed to ensure sequential execution. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D110504	2021-09-27 14:15:25 +03:00
Dmitry Vyukov	354ded67b3	tsan: align ThreadState to cache line There are 2 reasons to do this: 1. We place hot data in the first cache line of ThreadState, which assumes that it's cache-line-aligned, but we never actually enforced it (or it was lost at some point). 2. The new vector clock uses vector instructions and requires data alignment. Later the new vector clock will be embedded in ThreadState, then ensuring vector clock alignment will be impossible w/o ThreadState alignment. Depends on D110519. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110520	2021-09-27 12:54:09 +02:00
Dmitry Vyukov	ed7f3f5bc9	tsan: move shadow stack into ThreadState Currently the shadow stack is located in the trace memory mapping. The new tsan runtime will remove the trace memory mapping. Move the shadow stack into ThreadState as a preparation step. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110519	2021-09-27 12:53:02 +02:00
Fraser Cormack	e2b46e336b	[DAGCombiner][VP] Fold zero-length or false-masked VP ops This patch adds a generic DAGCombine for vector-predicated (VP) nodes. Those for which we can determine that no vector element is active can be replaced by either undef or, for reductions, the start value. This is tested rather trivially at the IR level, where it's possible that we want to teach instcombine to perform this optimization. However, we can also see the zero-evl case arise during SelectionDAG legalization, when wide VP operations can be split into two and the upper operation emerges as trivially false. It's possible that we could perform this optimization "proactively" (both on legal vectors and before splitting) and reduce the width of an operation and insert it into a larger undef vector: ``` v8i32 vp_add x, y, mask, 4 -> v8i32 insert_subvector (v8i32 undef), (v4i32 vp_add xsub, ysub, mask, 4), i32 0 ``` This is somewhat analogous to similar vector narrow/widening optimizations, but it's unclear at this point whether that's beneficial to do this for VP ops for any/all targets. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D109148	2021-09-27 11:30:09 +01:00
David Green	bb2d23dcd4	[ARM] Improve detection of fallthrough when aligning blocks We align non-fallthrough branches under Cortex-M at O3 to lead to fewer instruction fetches. This improves that for the block after a LE or LETP. These blocks will still have terminating branches until the LowOverheadLoops pass is run (as they are not handled by analyzeBranch, the branch is not removed until later), so canFallThrough will return false. These extra branches will eventually be removed, leaving a fallthrough, so treat them as such and don't add unnecessary alignments. Differential Revision: https://reviews.llvm.org/D107810	2021-09-27 11:21:21 +01:00
Nicolas Vasilache	1b49a72de9	[mlir] Factor out constraint set creation from hoist padding. This revision adds a ``` FlatAffineValueConstraints(ValueRange ivs, ValueRange lbs, ValueRange ubs) ``` method and uses it in hoist padding. Differential Revision: https://reviews.llvm.org/D110427	2021-09-27 10:11:35 +00:00
Daniel Kiss	77aa9ca92a	[libunwind] Support cfi_undefined and cfi_register for float registers. During a backtrace the `.cfi_undefined` for a float register causes an assert in libunwind. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D110144	2021-09-27 12:04:02 +02:00
Max Kazantsev	4992220ea7	[Test] Regenerate test checks with autogen script	2021-09-27 16:55:59 +07:00
Nicolas Vasilache	b74493ecea	[mlir][Linalg] Refactor padding hoisting - NFC This revision extracts padding hoisting into a new file and cleans it up in anticipation of future improvements and extensions. Differential Revision: https://reviews.llvm.org/D110414	2021-09-27 09:50:31 +00:00
Simon Pilgrim	468ff703e1	[X86] combineVectorHADDSUB - remove the broken HOP(x,x) merging code (PR51974) The intention of this code turns out to be superfluous as we can handle this with shuffle combining, and it has a critical flaw in that it doesn't check for dependencies. Fixes PR51974	2021-09-27 10:41:22 +01:00
Florian Hahn	4b581e87df	[LV] Add tests where rt checks may make vectorization unprofitable. Add a few additional tests which require a large number of runtime checks for D109368.	2021-09-27 10:32:28 +01:00
Pushpinder Singh	9d0eb440ff	[libomptarget][nfc][amdgpu] Reorder function to clarify review diff	2021-09-27 09:30:55 +00:00
Matthias Springer	ffdf0a370d	[mlir][vector] Fix bug in vector-transfer-full-partial-split When splitting with linalg.copy, cannot write into the destination alloc directly. Instead, write into a subview of the alloc. Differential Revision: https://reviews.llvm.org/D110512	2021-09-27 18:12:17 +09:00
Ben Shi	683e506324	[AArch64][test] Add more tests of add/sub with immediate Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D110474	2021-09-27 09:05:21 +00:00
David Spickett	3c65d54ec3	[llvm] Disable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Arm Linux Due to the way detecting the hard float ABI is currently handled, clang fails to find the per target dir. I am working to fix this but in the meantime disable it by default on Arm Linux.	2021-09-27 09:03:26 +00:00
Fraser Cormack	d48f6df1f8	[RISCV] Create the correct mask type when lowering EXTRACT_VECTOR_ELT This particular case was creating a `VMSET_VL` using the old fixed-length type in order to pass a mask to other custom nodes operating on the scalable container type. This kind of thing wasn't caught for us; I only noticed when experimenting with odd-length vectors, where it was trying to generate an invalid `v3i1` MVT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110420	2021-09-27 09:43:40 +01:00
Krasimir Georgiev	8cb234e07d	[Bazel] Fix for `6498b0e991`	2021-09-27 10:44:58 +02:00
Jon Chesterfield	726a34f063	[libomptarget][amdgpu] Replace dead exit call with returning error	2021-09-27 09:43:37 +01:00
Michał Górny	f4b71e3479	[llvm] [ADT] Add a range/iterator-based Split() Add a llvm::Split() implementation that can be used via range-for loop, e.g.: for (StringRef x : llvm::Split("foo,bar,baz", ',')) ... The implementation uses an additional SplittingIterator class that uses StringRef::split() internally. Differential Revision: https://reviews.llvm.org/D110496	2021-09-27 10:43:09 +02:00
Balazs Benics	66d9d1012b	[clang][AST] Add support for ShuffleVectorExpr to ASTImporter Addresses https://bugs.llvm.org/show_bug.cgi?id=51902 Reviewed By: shafik, martong Differential Revision: https://reviews.llvm.org/D110052	2021-09-27 10:17:12 +02:00
Max Kazantsev	0bd9162fd7	[Test] Add test showing that SCEV cannot properly infer ranges of cycled phis	2021-09-27 15:01:43 +07:00
Krasimir Georgiev	92b475f0b0	[lldb] silence -Wsometimes-uninitialized warnings No functional changes intended. Silence warnings from `3a6ba36751`.	2021-09-27 09:35:58 +02:00
Vignesh Balu	62fddd5ff5	[OpenMP][OMPD] Implementation of OMPD debugging library - libompd. This is a continuation of the review: https://reviews.llvm.org/D100182 This patch implements the OMPD API as specified in the standard doc. Reviewed By: @hbae Differential Revision: https://reviews.llvm.org/D100183	2021-09-27 12:32:31 +05:30
serge-sans-paille	e45f67f31e	Make analyze-cc path discovery sensitive to symlinks Fix https://bugs.llvm.org/show_bug.cgi?id=51897 Differential Revision: https://reviews.llvm.org/D110521	2021-09-27 08:35:19 +02:00
Freddy Ye	902ec6142a	[X86][ISel] Lowering FROUND(f16) and FROUNDEVEN(f16) When AVX512FP16 is enabled, FROUND(f16) cannot be dealt with by TypeLegalize, and no libcall in libm is ready for fround(f16) now. FROUNDEVEN(f16) has a related instruction in AVX512FP16. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110312	2021-09-27 13:35:03 +08:00
Max Kazantsev	e787678cef	[Test] Add some simple tests where IndVars cannot remove a check in loop Previously I've added tests that require context for inference, but it seems that SCEV can't prove the same facts even when the context isn't required.	2021-09-27 12:12:51 +07:00
Michael Kruse	91f46bb77e	[Polly] Reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964	2021-09-26 21:21:50 -05:00
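
Commit f4b71e3479 above describes a range-based `llvm::Split()` built on a `SplittingIterator` that calls `StringRef::split()` internally. The following is a minimal sketch of that idea using only the C++17 standard library, not LLVM's actual code: `SplitRange`, its nested `Iterator`, and `splitToVector` are hypothetical names introduced here, and `std::string_view` stands in for `StringRef`. How this sketch treats empty input (it yields a single empty field) is an assumption and may differ from the LLVM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a splitting range; NOT the LLVM implementation.
// The range does not own the text, so the underlying string must outlive it.
class SplitRange {
public:
  class Iterator {
  public:
    Iterator(std::string_view rest, char sep, bool done)
        : rest_(rest), sep_(sep), done_(done) {
      if (!done_)
        advance(); // load the first field eagerly
    }

    std::string_view operator*() const {
      assert(!done_ && "dereferencing the end iterator");
      return current_;
    }

    Iterator &operator++() {
      if (exhausted_)
        done_ = true; // the last field was already handed out
      else
        advance();
      return *this;
    }

    // Only the "done" state is compared, which is enough for range-for.
    bool operator!=(const Iterator &other) const {
      return done_ != other.done_;
    }

  private:
    // Cut the next field off the front of rest_.
    void advance() {
      std::size_t pos = rest_.find(sep_);
      if (pos == std::string_view::npos) {
        current_ = rest_;
        rest_ = std::string_view();
        exhausted_ = true;
      } else {
        current_ = rest_.substr(0, pos);
        rest_ = rest_.substr(pos + 1);
      }
    }

    std::string_view rest_;
    std::string_view current_;
    char sep_;
    bool exhausted_ = false;
    bool done_;
  };

  SplitRange(std::string_view text, char sep) : text_(text), sep_(sep) {}
  Iterator begin() const { return Iterator(text_, sep_, false); }
  Iterator end() const { return Iterator(std::string_view(), sep_, true); }

private:
  std::string_view text_;
  char sep_;
};

// Convenience helper that copies each field out through a range-for loop.
std::vector<std::string> splitToVector(std::string_view text, char sep) {
  std::vector<std::string> fields;
  for (std::string_view field : SplitRange(text, sep))
    fields.emplace_back(field);
  return fields;
}
```

With this sketch, `splitToVector("foo,bar,baz", ',')` collects the three fields in order, mirroring the `for (StringRef x : llvm::Split("foo,bar,baz", ','))` loop quoted in the commit message.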

400080 Commits All Branches Search