This shouldn't really happen in practice, I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that an all-zero vector becomes ConstantAggregateZero instead.
So instead just check for Constant and use getAggregateElement, which will do the dirty work for us.
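A minimal sketch of the intended pattern (the surrounding names are illustrative, not the actual code from this change):
```
// Instead of matching only ConstantVector, accept any Constant and let
// getAggregateElement deal with the different representations
// (ConstantVector, ConstantDataVector, and the all-zero
// ConstantAggregateZero case that was missed).
if (auto *C = dyn_cast<Constant>(Op)) {
  for (unsigned I = 0; I != NumElts; ++I) {
    Constant *Elt = C->getAggregateElement(I);
    if (!Elt) // not a vector/aggregate constant, or index out of range
      break;
    // ... handle Elt; an all-zero vector now shows up here as the zero
    // constants produced by ConstantAggregateZero instead of being skipped.
  }
}
```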
llvm-svn: 343270
flag to true in LLJIT when running in multithreaded mode.
The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.
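A rough sketch of the layer-level API from the client side (LLJIT now sets the flag itself when running multithreaded, so this is illustrative only; makeModule and Sources are placeholders, and the compile-layer accessor is assumed):
```
auto J = cantFail(LLJITBuilder().create());
// Ask the IR layer to re-materialize each module on a fresh context at emit
// time, so compile jobs for different modules stop contending on one context.
J->getIRCompileLayer().setCloneToNewContextOnEmit(true);

ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
for (const auto &Src : Sources) {
  auto M = makeModule(Src, *TSCtx.getContext()); // placeholder IR producer
  cantFail(J->addIRModule(ThreadSafeModule(std::move(M), TSCtx)));
}
```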
llvm-svn: 343266
In a very recent change I introduced a --no-export-default flag,
but after conferring with others it seems that this feature already
exists in GNU ld and lld in the form of the --export-dynamic flag,
which is off by default.
This change replaces export-default with export-dynamic and also
changes the default to match the traditional linker behaviour.
Now, by default, only the entry point is exported. If other symbols
are required by the embedder then --export-dynamic or --export can
be used to export all non-hidden symbols or individual symbols,
respectively.
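For illustration (the function name below is made up; --export and --export-dynamic are the flags described above):
```
/* With the new default only the entry point is exported. If the embedder also
   needs to call `callback`, export it explicitly, e.g. pass
   `--export=callback` to wasm-ld, or `--export-dynamic` to export every
   non-hidden symbol. */
int callback(int x) { return x + 1; }

int main(void) { return 0; }
```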
This change touches a lot of tests that were relying on symbols
being exported by default. I imagine it will also affect many
users, but I do think the change is worth it to match the traditional
behaviour and flag names.
Differential Revision: https://reviews.llvm.org/D52587
llvm-svn: 343265
The Darwin linker was complaining that Toolchains/RISCV.cpp and
Toolchains/Arch/RISCV.cpp had the same name. The fix is to just rename
Toolchains/RISCV.cpp to Toolchains/RISCVToolchain.cpp.
Differential revision: https://reviews.llvm.org/D52574
llvm-svn: 343263
Failure to lock the context can lead to data races if other threads are
operating on other ThreadSafeModules that share the same context.
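As an illustration of the invariant (this uses the later withModuleDo convenience, which takes the context lock around the callback; promoteSymbols is a made-up stand-in for whatever work is being done):
```
// Any IR manipulation must happen while the module's ThreadSafeContext is
// locked: other ThreadSafeModules may share the same context and be used
// concurrently from other threads.
TSM.withModuleDo([](Module &M) {
  for (Function &F : M)
    promoteSymbols(F); // placeholder for the actual per-module work
});
```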
llvm-svn: 343261
Summary: Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.
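Roughly, in SPMD mode the runtime now behaves as if the parallel loop carried an explicit schedule(static, 1), so consecutive threads touch consecutive elements and global memory accesses coalesce. Illustrative fragment only, not a test from this change:
```
#pragma omp target teams distribute parallel for schedule(static, 1)
for (int i = 0; i < n; ++i)
  out[i] = a * in[i]; // within each team's chunk, thread t handles
                      // i = t, t + nthreads, t + 2*nthreads, ...
```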
Reviewers: ABataev, Hahnfeld, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52629
llvm-svn: 343260
one SymbolLinkagePromoter utility.
SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.
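Roughly, the promotion amounts to something like the following (an illustrative sketch, not the actual SymbolLinkagePromoter code; the renaming scheme shown is made up):
```
void promoteModule(Module &M) {
  unsigned NextId = 0;
  for (GlobalValue &GV : M.global_values()) {
    if (!GV.hasName()) // give anonymous symbols a name so they can be referenced
      GV.setName("$promoted." + Twine(NextId++));
    if (GV.hasLocalLinkage()) { // private/internal symbols become externally
      GV.setLinkage(GlobalValue::ExternalLinkage);     // linkable across sub-modules...
      GV.setVisibility(GlobalValue::HiddenVisibility); // ...but stay hidden
    }
  }
}
```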
llvm-svn: 343257
Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.
Reviewers: ABataev, caomhin, Hahnfeld
Reviewed By: ABataev, Hahnfeld
Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52434
llvm-svn: 343253
Summary:
The default values used for Space/Size for the new SizeClassMap do not work
with Android. The Compact map appears to be in the same boat.
Disable the test on Android for now to turn the bots green, but there is no
reason Compact & Dense should not have an Android test.
Added a FIXME; I will revisit this soon.
Reviewers: eugenis
Subscribers: srhines, kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52623
llvm-svn: 343252
Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert the proper qualifier automatically. This is still
experimental and hidden behind a flag.
Things missing:
- Scope proximity based scoring.
- FuzzyFind support for weighted scopes.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52364
llvm-svn: 343248
Summary:
_Note_: I am not attached to the name `DenseSizeClassMap`, so if someone has a
better idea, feel free to suggest it.
The current pre-defined `SizeClassMap`s hold a decent amount of cached entries,
either in sheer number or in amount of memory cached.
Empirical testing shows that more compact per-class arrays (whose sizes are
directly correlated with the number of cached entries) are beneficial to
performance, particularly in highly threaded environments.
The new proposed `SizeClassMap` has the following properties:
```
c00 => s: 0 diff: +0 00% l 0 cached: 0 0; id 0
c01 => s: 16 diff: +16 00% l 4 cached: 8 128; id 1
c02 => s: 32 diff: +16 100% l 5 cached: 8 256; id 2
c03 => s: 48 diff: +16 50% l 5 cached: 8 384; id 3
c04 => s: 64 diff: +16 33% l 6 cached: 8 512; id 4
c05 => s: 80 diff: +16 25% l 6 cached: 8 640; id 5
c06 => s: 96 diff: +16 20% l 6 cached: 8 768; id 6
c07 => s: 112 diff: +16 16% l 6 cached: 8 896; id 7
c08 => s: 128 diff: +16 14% l 7 cached: 8 1024; id 8
c09 => s: 144 diff: +16 12% l 7 cached: 7 1008; id 9
c10 => s: 160 diff: +16 11% l 7 cached: 6 960; id 10
c11 => s: 176 diff: +16 10% l 7 cached: 5 880; id 11
c12 => s: 192 diff: +16 09% l 7 cached: 5 960; id 12
c13 => s: 208 diff: +16 08% l 7 cached: 4 832; id 13
c14 => s: 224 diff: +16 07% l 7 cached: 4 896; id 14
c15 => s: 240 diff: +16 07% l 7 cached: 4 960; id 15
c16 => s: 256 diff: +16 06% l 8 cached: 4 1024; id 16
c17 => s: 320 diff: +64 25% l 8 cached: 3 960; id 49
c18 => s: 384 diff: +64 20% l 8 cached: 2 768; id 50
c19 => s: 448 diff: +64 16% l 8 cached: 2 896; id 51
c20 => s: 512 diff: +64 14% l 9 cached: 2 1024; id 48
c21 => s: 640 diff: +128 25% l 9 cached: 1 640; id 49
c22 => s: 768 diff: +128 20% l 9 cached: 1 768; id 50
c23 => s: 896 diff: +128 16% l 9 cached: 1 896; id 51
c24 => s: 1024 diff: +128 14% l 10 cached: 1 1024; id 48
c25 => s: 1280 diff: +256 25% l 10 cached: 1 1280; id 49
c26 => s: 1536 diff: +256 20% l 10 cached: 1 1536; id 50
c27 => s: 1792 diff: +256 16% l 10 cached: 1 1792; id 51
c28 => s: 2048 diff: +256 14% l 11 cached: 1 2048; id 48
c29 => s: 2560 diff: +512 25% l 11 cached: 1 2560; id 49
c30 => s: 3072 diff: +512 20% l 11 cached: 1 3072; id 50
c31 => s: 3584 diff: +512 16% l 11 cached: 1 3584; id 51
c32 => s: 4096 diff: +512 14% l 12 cached: 1 4096; id 48
c33 => s: 5120 diff: +1024 25% l 12 cached: 1 5120; id 49
c34 => s: 6144 diff: +1024 20% l 12 cached: 1 6144; id 50
c35 => s: 7168 diff: +1024 16% l 12 cached: 1 7168; id 51
c36 => s: 8192 diff: +1024 14% l 13 cached: 1 8192; id 48
c37 => s: 10240 diff: +2048 25% l 13 cached: 1 10240; id 49
c38 => s: 12288 diff: +2048 20% l 13 cached: 1 12288; id 50
c39 => s: 14336 diff: +2048 16% l 13 cached: 1 14336; id 51
c40 => s: 16384 diff: +2048 14% l 14 cached: 1 16384; id 48
c41 => s: 20480 diff: +4096 25% l 14 cached: 1 20480; id 49
c42 => s: 24576 diff: +4096 20% l 14 cached: 1 24576; id 50
c43 => s: 28672 diff: +4096 16% l 14 cached: 1 28672; id 51
c44 => s: 32768 diff: +4096 14% l 15 cached: 1 32768; id 48
c45 => s: 40960 diff: +8192 25% l 15 cached: 1 40960; id 49
c46 => s: 49152 diff: +8192 20% l 15 cached: 1 49152; id 50
c47 => s: 57344 diff: +8192 16% l 15 cached: 1 57344; id 51
c48 => s: 65536 diff: +8192 14% l 16 cached: 1 65536; id 48
c49 => s: 81920 diff: +16384 25% l 16 cached: 1 81920; id 49
c50 => s: 98304 diff: +16384 20% l 16 cached: 1 98304; id 50
c51 => s: 114688 diff: +16384 16% l 16 cached: 1 114688; id 51
c52 => s: 131072 diff: +16384 14% l 17 cached: 1 131072; id 48
c53 => s: 64 diff: +0 00% l 0 cached: 8 512; id 4
Total cached: 864928 (152/432)
```
It holds a bit less than 1 MB of cached entries at most, and the cache fits in
a page.
The plan is to use this map by default for Scudo once we make sure that there
is no unforeseen impact for any of the current use cases.
Benchmarks show the biggest performance increase (with Scudo) in highly
threaded/contended environments. For example, rcp2-benchmark experiences a
10K QPS increase (~3%) and a 50MB decrease in max RSS (~10%). On platforms
like Android, where we only have a couple of caches, performance remains
similar.
Reviewers: eugenis, kcc
Reviewed By: eugenis
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52371
llvm-svn: 343246
Summary:
-lm is needed for these tests on Linux, but the lit config for this package automatically adds it for Linux and excludes it for Windows. So we should be able to get these tests running again by just dropping -lm and letting the lit config add it when possible.
I was under the impression that -lm worked across platforms because it exists in other tests without an 'UNSUPPORTED: windows' annotation (e.g. divsc3_test.c), but those are actually excluded because they 'REQUIRES: c99-complex', which is excluded on Windows platforms (also by the local lit config).
I don't have easy access to a Windows machine to verify this patch, but I can trigger a buildbot run on clang-x64-ninja-win7 shortly after submitting.
Reviewers: hans
Subscribers: dberris, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52563
llvm-svn: 343245
Had we emitted this IR earlier, InstCombine would have removed the icmp, so I'm going to assume that using the i1 directly would be considered canonical.
llvm-svn: 343244
- Add latency timings to the GDB packet log summary if timestamps are enabled in the log
- Add the ability to plot the latencies for each packet type with --plot
- Don't crash the script when the target XML register info is in a weird format
llvm-svn: 343243
'ld{{.*}}"' seems to match the complete line for me, which makes the test
fail. Only allow an optional '.exe' for Windows systems, as most other
tests do.
Another possibility would be to collapse the greedy expression with
the next check to avoid matching the full line.
Differential Revision: https://reviews.llvm.org/D52619
llvm-svn: 343240
Now that D51487 has landed, the last machine verifier tests that failed EXPENSIVE_CHECKS builds have been fixed/removed, so we can remove @MatzeB's isMachineVerifierClean() hack for Sparc targets.
Differential Revision: https://reviews.llvm.org/D52612
llvm-svn: 343232
Bits [23-22] are used in Add and Sub to specify the shift. The value of the
shift field must be 0x; values of 1x are unallocated. MTE adds some instructions
that use such encodings, and this patch refactors the Add/Sub class so that
another class can derive from it to implement other encodings and other
bitfield formats.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52489
llvm-svn: 343231
When looking for the bclib, Clang considered the default library
path first, while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.
Differential Revision: https://reviews.llvm.org/D51686
llvm-svn: 343230
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52483
llvm-svn: 343229
When C is not zero and infinities are not allowed, (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.
E.g.:
foo(double X) {
  if ((-2.0 / X) <= 0) ...
}
=>
foo(double X) {
  if (X >= 0) ...
}
Patch by: @marels (Martin Elshuber)
Differential Revision: https://reviews.llvm.org/D51942
llvm-svn: 343228
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.
Before this patch the code was incorrectly devirtualizing the fallback
indirect call.
See the new test and the example described there for more details.
Reviewers: pcc, vitalybuka
Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D52514
llvm-svn: 343226
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52485
llvm-svn: 343225