llvm-project

Commit Graph

Author	SHA1	Message	Date
zoecarver	82c4701d4e	[libc++][nfc] SFINAE on pair/tuple assignment operators: LWG 2729. This patch ensures that SFINAE is used to delete assignment operators in pair and tuple based on issue 2729. Differential Review: https://reviews.llvm.org/D62454	2021-02-19 13:25:34 -08:00
Craig Topper	7e54d7304b	[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC	2021-02-19 13:23:08 -08:00
Philip Reames	cc574f85fa	Add datalayout to test added in `7e3183d73` Realized after pushing this would probably fail on bots for other than x86-64.	2021-02-19 13:10:19 -08:00
Dave Lee	9d3b9e5799	[lldb] Rename {stop,run}_vote to report_{stop,run}_vote Rename `stop_vote` and `run_vote` to `report_stop_vote` and `report_run_vote` respectively. These variables are limited to logic involving (event) reporting only. This naming is intended to make their context more clear. Differential Revision: https://reviews.llvm.org/D96917	2021-02-19 13:04:53 -08:00
Philip Reames	7e3183d735	Add test triggered by review discussion on D97077	2021-02-19 13:03:58 -08:00
Tim Shen	a0757d8ebd	Patch by @wecing (Chenguang Wang). The current getFoldedSizeOf() implementation uses naive recursion, which could be really slow when the input structure type is too complex. This issue was first brought up in http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by adding memoization. Differential Revision: https://reviews.llvm.org/D6594	2021-02-19 12:44:17 -08:00
Eugene Zhulenev	f99ccf6516	[mlir] Add math polynomial approximation pass This gives ~30x speedup compared to expanding Tanh into exp operations: ``` name old cpu/op new cpu/op delta BM_mlir_Tanh_f32/10 253ns ± 3% 55ns ± 7% -78.35% (p=0.000 n=44+41) BM_mlir_Tanh_f32/100 2.21µs ± 4% 0.14µs ± 8% -93.85% (p=0.000 n=48+49) BM_mlir_Tanh_f32/1k 22.6µs ± 4% 0.7µs ± 5% -96.68% (p=0.000 n=32+42) BM_mlir_Tanh_f32/10k 225µs ± 5% 7µs ± 6% -96.88% (p=0.000 n=49+55) name old time/op new time/op delta BM_mlir_Tanh_f32/10 259ns ± 1% 56ns ± 2% -78.31% (p=0.000 n=41+39) BM_mlir_Tanh_f32/100 2.27µs ± 1% 0.14µs ± 5% -93.89% (p=0.000 n=46+49) BM_mlir_Tanh_f32/1k 22.9µs ± 1% 0.8µs ± 4% -96.67% (p=0.000 n=30+42) BM_mlir_Tanh_f32/10k 230µs ± 0% 7µs ± 3% -96.88% (p=0.000 n=37+55) ``` This approximation is based on the Eigen::generic_fast_tanh function. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D96739	2021-02-19 12:43:36 -08:00
Teresa Johnson	0923a60ea7	[clang] Emit type metadata on available_externally vtables for WPD When WPD is enabled, via WholeProgramVTables, emit type metadata for available_externally vtables. Additionally, add the vtables to the llvm.compiler.used global so that they are not prematurely eliminated (before *LTO analysis). This is needed to avoid devirtualizing calls to a function overriding a class defined in a header file but with a strong definition in a shared library. Without type metadata on the available_externally vtables from the header, the WPD analysis never sees what a derived class is overriding. Even if the available_externally base class functions are pure virtual, because shared library definitions are already treated conservatively (committed patches D91583, D96721, and D96722) we will not devirtualize, which would be unsafe since the library might contain overrides that aren't visible to the LTO unit. An example is std::error_category, which is overridden in LLVM and causing failures after a self build with WPD enabled, because libstdc++ contains hidden overrides of the virtual base class methods. Differential Revision: https://reviews.llvm.org/D96919	2021-02-19 12:42:34 -08:00
Jianzhou Zhao	efc8f3311b	[msan] Set cmpxchg shadow precisely In terms of https://llvm.org/docs/LangRef.html#cmpxchg-instruction, the return type of cmpxchg is a pair {ty, i1}, while I think we only wanted to set the shadow for the address 0th op, and it has type ty. Reviewed-by: eugenis Differential Revision: https://reviews.llvm.org/D97029	2021-02-19 20:23:23 +00:00
Philip Reames	5de47ebff6	precommit test cleanup for D97077	2021-02-19 12:19:39 -08:00
Eric Schweitz	a88991d782	[flang][fir][NFC] run clang-format cleanup post-merge	2021-02-19 12:07:13 -08:00
Sanjay Patel	d79063129c	[Verifier] remove dead code for saturating intrinsics; NFC Test coverage shows that we assert with the string from the tablegen defs file for these intrinsics, so these cases should never be live.	2021-02-19 14:58:25 -05:00
Sanjay Patel	38730b0029	[Verifier] add tests for saturating intrinsics; NFC As noted in D96904, we don't have direct tests for these malformed ops.	2021-02-19 14:58:25 -05:00
Martin Storsjö	f4f5fb9151	[libcxx] Make generic_*string return paths with forward slashes on windows This matches what MS STL returns; in std::filesystem, forward slashes are considered generic dir separators that are valid on all platforms. Differential Revision: https://reviews.llvm.org/D91181	2021-02-19 21:49:51 +02:00
Haowei Wu	784c7debb2	[elfabi] Fix a bug when .dynsym contains no non-local symbol This patch fixes a bug that occurred when elfabi was supplied with a tbe file that contains no non-local symbol. Before this patch, it wrote 0 to sh_info of the .dynsym section, making the ELF stub file invalid. Differential Revision: https://reviews.llvm.org/D96930	2021-02-19 11:36:53 -08:00
zoecarver	dbc89028d7	[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be constrained. Fixes LWG issue 2875. Differential Revision: https://reviews.llvm.org/D81414	2021-02-19 11:11:39 -08:00
Martin Storsjö	513463fd26	[libcxx] Have lexically_normal return the path with preferred separators Differential Revision: https://reviews.llvm.org/D91179	2021-02-19 21:06:54 +02:00
Sanjay Patel	5b250a27ec	[Analysis][LoopVectorize] do not form reductions of pointers This is a fix for https://llvm.org/PR49215 either before/after we make a verifier enhancement for vector reductions with D96904. I'm not sure what the current thinking is for pointer math/logic in IR. We allow icmp on pointer values. Therefore, we match min/max patterns, so without this patch, the vectorizer could form a vector reduction from that sequence. But the LangRef definitions for min/max and vector reduction intrinsics do not allow pointer types: https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic So we would crash/assert at some point - either in IR verification, in the cost model, or in codegen. If we do want to allow this kind of transform, we will need to update the LangRef and all of those parts of the compiler. Differential Revision: https://reviews.llvm.org/D97047	2021-02-19 14:01:57 -05:00
Michael Kruse	91c472c86c	[Polly] Fix test after D96534.	2021-02-19 12:49:29 -06:00
Craig Topper	e7c86f4ac4	[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC The VLX and VSX searchable tables, share the same format so we can have a common base class for them.	2021-02-19 10:42:18 -08:00
Simon Pilgrim	d7350efc40	[X86] Regenerate 2007-06-28-X86-64-isel.ll	2021-02-19 18:35:15 +00:00
Simon Pilgrim	3dae0b5703	[X86] Remove unused intrinsic declaration	2021-02-19 18:35:14 +00:00
Simon Pilgrim	6ad4bf330b	[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll	2021-02-19 18:35:14 +00:00
Craig Topper	7f5b3886e4	[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction. We had more combinations of data and index lmuls than we needed. Also add some asserts to verify that the IndexVT and data VT have the same element count when we isel these pseudo instructions.	2021-02-19 10:28:48 -08:00
Craig Topper	d056d5decf	[RISCV] Use custom isel for vector indexed load/store intrinsics. There are many legal combinations of index and data VTs supported for these intrinsics. This results in a lot of isel patterns in RISCVGenDAGISel.inc. By adding a separate table similar to what we use for segment load/stores, we can more efficiently manually select these intrinsics. We should also be able to reuse this table for scalable vector gather/scatter. This reduces the llc binary size by ~56K. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97033	2021-02-19 10:10:06 -08:00
Craig Topper	dbf910f0d9	[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics. Just like we do for isel patterns, we need to call selectVLOp to prevent 0 from being selected to X0 by the default isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97021	2021-02-19 10:07:12 -08:00
Craig Topper	98dff5e804	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Wei Mi	4ffad1fb48	[SampleFDO] Add PromotedInsns to prevent repeated ICP. In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9, we use 0 count value profile to memorize which target has been promoted and prevent repeated ICP for the same target, so we delete PromotedInsns. However, I found the implementation in the patch has some shortcomings that need to be fixed, otherwise there will still be repeated ICP. So I add PromotedInsns back temporarily. Will remove it after I get a thorough fix.	2021-02-19 10:01:49 -08:00
Artem Belevich	1a368ae3b7	[CUDA] fix builtin constraints for PTX 7.2 This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D97009	2021-02-19 09:57:21 -08:00
Luís Marques	43fa23a01f	[Sanitizer][NFC] Fix typo	2021-02-19 17:46:02 +00:00
Jessica Paquette	8d3442eddb	[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner This is to ensure that we can eliminate G_ASSERT_SEXT. In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for signext parameters. Differential Revision: https://reviews.llvm.org/D96913	2021-02-19 09:34:47 -08:00
Nicolas Vasilache	0ee4bf151c	[mlir] Add folding of tensor.cast -> subtensor_insert Differential Revision: https://reviews.llvm.org/D97059	2021-02-19 17:24:16 +00:00
Geoffrey Martin-Noble	236aab0b0c	[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor These are unused since https://reviews.llvm.org/rG81264dfbe80df08668a325a61613b64243b99c01 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97014	2021-02-19 09:23:54 -08:00
Benjamin Kramer	59f442e6bb	[LV] Fold single-use variable into assert. NFC.	2021-02-19 18:11:39 +01:00
Nikita Popov	71a8e4e7d6	[MemCopyOpt] Enable MemorySSA by default This enables use of MemorySSA instead of MemDep in MemCpyOpt. To allow this without significant compile-time impact, the MemCpyOpt pass is moved directly before DSE (in the cases where this was not already the case), which allows us to reuse the existing MemorySSA analysis. Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt can also perform simple optimizations across basic blocks. Differential Revision: https://reviews.llvm.org/D94376	2021-02-19 18:06:25 +01:00
Matthew Malcomson	c1653b8cc7	Hwasan InitPrctl check for error using internal_iserror When adding this function in https://reviews.llvm.org/D68794 I did not notice that internal_prctl has the API of the syscall to prctl rather than the API of the glibc (posix) wrapper. This means that the error return value is not necessarily -1 and that errno is not set by the call. For InitPrctl this means that the checks do not catch running on a kernel without the required ABI (not caught since I only tested that this function correctly enables the ABI when it exists). This commit updates the two calls which check for an error condition to use internal_iserror. That function sets a provided integer to an equivalent errno value and returns a boolean to indicate success or not. Tested by running on a kernel that has this ABI and on one that does not. Verified that running on the kernel without this ABI the current code prints the provided error message and does not attempt to run the program. Verified that running on the kernel with this ABI the current code does not print an error message and turns on the ABI. This was done on an x86 kernel (where the ABI does not exist), an AArch64 kernel without this ABI, and an AArch64 kernel with this ABI. In order to keep running the testsuite on kernels that do not provide this new ABI we add another option to the HWASAN_OPTIONS environment variable; this option determines whether the library kills the process if it fails to enable the relaxed syscall ABI or not. This new flag is `fail_without_syscall_abi`. The check-hwasan testsuite results do not change with this patch on any of x86, AArch64 without a kernel supporting this ABI, and AArch64 with a kernel supporting this ABI. Differential Revision: https://reviews.llvm.org/D96964	2021-02-19 16:30:56 +00:00
Philip Reames	4a5edea193	[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBits for signed ranges. This means we miss opportunities to improve range results. One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size. Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results. Differential Revision: https://reviews.llvm.org/D96534	2021-02-19 08:29:12 -08:00
Marek Kurdej	bcb5a124ae	[libc++] Turn off clang-format for auto-generated version header. NFC.	2021-02-19 17:26:16 +01:00
Joel E. Denny	ef8b3b5ffd	[OpenMP] Fix nvptx CUDA_VERSION conversion As mentioned in PR#49250, without this patch, ptxas for CUDA 9.1 fails in the following two tests: - openmp/libomptarget/test/mapping/lambda_mapping.cpp - openmp/libomptarget/test/offloading/bug49021.cpp The error looks like: ``` ptxas /tmp/lambda_mapping-081ea9.s, line 828; error : Not a name of any known instruction: 'activemask' ``` The problem is that our cmake script converts CUDA version strings incorrectly: 9.1 becomes 9100, but it should be 9010, as shown in `getCudaVersion` in `clang/lib/Driver/ToolChains/Cuda.cpp`. Thus, `openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu` inadvertently enables `activemask` because it apparently becomes available in 9.2. This patch fixes the conversion. This patch does not fix the other two tests in PR#49250. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97012	2021-02-19 11:09:26 -05:00
Joel E. Denny	d2147b1a87	[OpenMP] Fix always,from and delete for data absent at exit Without this patch, there's a runtime error for those map types at exit from an "omp target data" or at "omp target exit data", but the spec says the list item should be ignored. This patch tests that fix in data_absent_at_exit.c, and it also improves other testing for data that is not fully present at exit. Reviewed By: grokos, RaviNarayanaswamy Differential Revision: https://reviews.llvm.org/D96999	2021-02-19 11:09:26 -05:00
Mircea Trofin	82492f24ff	[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit VirtRegAuxInfo is an extensibility point, so the register allocator's decision on which implementation to use should be communicated to the other users - namely, LiveRangeEdit. Differential Revision: https://reviews.llvm.org/D96898	2021-02-19 07:44:28 -08:00
madhur13490	3c297a2564	Make fixed-abi default for AMD HSA OS fixed-abi uses pre-defined and predictable SGPR/VGPRs for passing arguments. This patch makes this scheme default when HSA OS is specified in triple. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96340	2021-02-19 15:05:25 +00:00
David Green	a1c34a9d6a	[ARM] Correct vector predicate type in MVE getCmpSelInstrCost	2021-02-19 14:43:51 +00:00
Jay Foad	b2c7f06db1	[AMDGPU] Add some GFX9 test coverage. NFC.	2021-02-19 14:38:52 +00:00
Simon Pilgrim	5d3930bb8f	[DAG] visitTRUNCATE - attempt to truncate USUBSAT Fold trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit)))	2021-02-19 14:26:05 +00:00
Nicolas Vasilache	62f5c46eec	[mlir][Linalg] NFC - Expose more options to the CodegenStrategy	2021-02-19 14:01:44 +00:00
Djordje Todorovic	b6db47d7e0	[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc The presence or absence of an inline variable (as well as formal parameter) with only an abstract_origin ref (without DW_AT_location) should not change the location coverage. It means, for both: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) DW_TAG_formal_parameter DW_AT_abstract_origin (0x0000005a "b") and, DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) we should report 0% location coverage. If we add DW_AT_location, for both cases the coverage should be improved. Differential Revision: https://reviews.llvm.org/D96045	2021-02-19 05:38:01 -08:00
Jan Kratochvil	08331281af	[lldb/Commands] Fix help text typo for 'breakpoint set' -a\|--address.	2021-02-19 14:33:42 +01:00
David Green	7a5c26e99a	Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer" This reverts commit `3b34b06fc5` as runtime errors were reported.	2021-02-19 13:15:10 +00:00
Florian Hahn	edc92a1c42	[LV] Remove VPCallback. Now that all state for generated instructions is managed directly in VPTransformState, VPCallBack is no longer needed. This patch updates the last use of `getOrCreateScalarValue` to instead manage the value directly in VPTransformState and removes VPCallback. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D95383	2021-02-19 12:50:41 +00:00
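
Commit ef8b3b5ffd above says that CUDA 9.1 should be encoded as 9010 rather than 9100. The sketch below illustrates that encoding (major * 1000 + minor * 10). It is not the cmake script the commit changes; the function name `cuda_version_to_int` is hypothetical and exists only for this illustration.

```python
def cuda_version_to_int(version: str) -> int:
    """Encode a CUDA version string such as "9.1" as an integer such as 9010.

    Hypothetical helper: the encoding is major * 1000 + minor * 10, which is
    the result the commit message says 9.1 should produce.
    """
    parts = version.split(".")
    major = int(parts[0])
    # A missing minor component is treated as 0; any patch component is ignored.
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major * 1000 + minor * 10


if __name__ == "__main__":
    for v in ("9.1", "9.2", "11.0"):
        print(v, cuda_version_to_int(v))
```

With this encoding, 9.1 (9010) compares below 9.2 (9020), whereas the incorrect value 9100 would compare above it, which is consistent with the commit's explanation of why `activemask` was enabled too early.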

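The fold in commit 5d3930bb8f, trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit))), can be checked by brute force on small bit widths. The sketch below models the operations on plain Python integers and assumes that satlimit is the maximum unsigned value of the narrow type; the helper names are hypothetical and this is not the DAG combiner code.

```python
def usubsat(a: int, b: int) -> int:
    """Unsigned saturating subtraction: clamps at 0 instead of wrapping."""
    return max(a - b, 0)


def check_fold(narrow_bits: int, wide_bits: int) -> bool:
    """Exhaustively compare both sides of the fold for all x (narrow) and y (wide)."""
    sat_limit = (1 << narrow_bits) - 1  # assumed: max unsigned value of the narrow type
    narrow_mask = sat_limit
    for x in range(1 << narrow_bits):
        for y in range(1 << wide_bits):
            # Before: zext(x) has the same integer value as x; subtract in the
            # wide type, then truncate to the narrow type.
            before = usubsat(x, y) & narrow_mask
            # After: clamp y to the narrow range, truncate, subtract in the narrow type.
            after = usubsat(x, min(y, sat_limit) & narrow_mask)
            if before != after:
                return False
    return True


if __name__ == "__main__":
    print(check_fold(4, 8))
```

The intuition is that zext(x) never exceeds satlimit, so any y at or above satlimit drives the result to 0 on both sides.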
380548 Commits

All Branches