llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Schweitz	a88991d782	[flang][fir][NFC] run clang-format cleanup post-merge	2021-02-19 12:07:13 -08:00
Sanjay Patel	d79063129c	[Verifier] remove dead code for saturating intrinsics; NFC Test coverage shows that we assert with the string from the tablegen defs file for these intrinsics, so these cases should never be live.	2021-02-19 14:58:25 -05:00
Sanjay Patel	38730b0029	[Verifier] add tests for saturating intrinsics; NFC As noted in D96904, we don't have direct tests for these malformed ops.	2021-02-19 14:58:25 -05:00
Martin Storsjö	f4f5fb9151	[libcxx] Make generic_*string return paths with forward slashes on windows This matches what MS STL returns; in std::filesystem, forward slashes are considered generic dir separators that are valid on all platforms. Differential Revision: https://reviews.llvm.org/D91181	2021-02-19 21:49:51 +02:00
Haowei Wu	784c7debb2	[elfabi] Fix a bug when .dynsym contains no non-local symbol This patch fixed a bug when elfabi was supplied with a tbe file that contains no non-local symbol. Before this patch, it wrote 0 to sh_info of the .dynsym section, making the ELF stub file invalid. This patch fixed this issue. Differential Revision: https://reviews.llvm.org/D96930	2021-02-19 11:36:53 -08:00
zoecarver	dbc89028d7	[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be constrained. Fixes LWG issue 2875. Differential Revision: https://reviews.llvm.org/D81414	2021-02-19 11:11:39 -08:00
Martin Storsjö	513463fd26	[libcxx] Have lexically_normal return the path with preferred separators Differential Revision: https://reviews.llvm.org/D91179	2021-02-19 21:06:54 +02:00
Sanjay Patel	5b250a27ec	[Analysis][LoopVectorize] do not form reductions of pointers This is a fix for https://llvm.org/PR49215 either before/after we make a verifier enhancement for vector reductions with D96904. I'm not sure what the current thinking is for pointer math/logic in IR. We allow icmp on pointer values. Therefore, we match min/max patterns, so without this patch, the vectorizer could form a vector reduction from that sequence. But the LangRef definitions for min/max and vector reduction intrinsics do not allow pointer types: https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic So we would crash/assert at some point - either in IR verification, in the cost model, or in codegen. If we do want to allow this kind of transform, we will need to update the LangRef and all of those parts of the compiler. Differential Revision: https://reviews.llvm.org/D97047	2021-02-19 14:01:57 -05:00
Michael Kruse	91c472c86c	[Polly] Fix test after D96534.	2021-02-19 12:49:29 -06:00
Craig Topper	e7c86f4ac4	[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC The VLX and VSX searchable tables, share the same format so we can have a common base class for them.	2021-02-19 10:42:18 -08:00
Simon Pilgrim	d7350efc40	[X86] Regenerate 2007-06-28-X86-64-isel.ll	2021-02-19 18:35:15 +00:00
Simon Pilgrim	3dae0b5703	[X86] Remove unused intrinsic declaration	2021-02-19 18:35:14 +00:00
Simon Pilgrim	6ad4bf330b	[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll	2021-02-19 18:35:14 +00:00
Craig Topper	7f5b3886e4	[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction. We had more combinations of data and index lmuls than we needed. Also add some asserts to verify that the IndexVT and data VT have the same element count when we isel these pseudo instructions.	2021-02-19 10:28:48 -08:00
Craig Topper	d056d5decf	[RISCV] Use custom isel for vector indexed load/store intrinsics. There are many legal combinations of index and data VTs supported for these intrinsics. This results in a lot of isel patterns in RISCVGenDAGISel.inc. By adding a separate table similar to what we use for segment load/stores, we can more efficiently manually select these intrinsics. We should also be able to reuse this table for scalable vector gather/scatter. This reduces the llc binary size by ~56K. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97033	2021-02-19 10:10:06 -08:00
Craig Topper	dbf910f0d9	[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics. Just like we do for isel patterns, we need to call selectVLOp to prevent 0 from being selected to X0 by the default isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97021	2021-02-19 10:07:12 -08:00
Craig Topper	98dff5e804	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Wei Mi	4ffad1fb48	[SampleFDO] Add PromotedInsns to prevent repeated ICP. In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9, we use 0 count value profile to memorize which target has been promoted and prevent repeated ICP for the same target, so we delete PromotedInsns. However, I found the implementation in the patch has some shortcomings to be fixed, otherwise there will still be repeated ICP. So I add PromotedInsns back temporarily. Will remove it after I get a thorough fix.	2021-02-19 10:01:49 -08:00
Artem Belevich	1a368ae3b7	[CUDA] fix builtin constraints for PTX 7.2 This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D97009	2021-02-19 09:57:21 -08:00
Luís Marques	43fa23a01f	[Sanitizer][NFC] Fix typo	2021-02-19 17:46:02 +00:00
Jessica Paquette	8d3442eddb	[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner This is to ensure that we can eliminate G_ASSERT_SEXT. In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for signext parameters. Differential Revision: https://reviews.llvm.org/D96913	2021-02-19 09:34:47 -08:00
Nicolas Vasilache	0ee4bf151c	[mlir] Add folding of tensor.cast -> subtensor_insert Differential Revision: https://reviews.llvm.org/D97059	2021-02-19 17:24:16 +00:00
Geoffrey Martin-Noble	236aab0b0c	[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor These are unused since https://reviews.llvm.org/rG81264dfbe80df08668a325a61613b64243b99c01 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97014	2021-02-19 09:23:54 -08:00
Benjamin Kramer	59f442e6bb	[LV] Fold single-use variable into assert. NFC.	2021-02-19 18:11:39 +01:00
Nikita Popov	71a8e4e7d6	[MemCopyOpt] Enable MemorySSA by default This enables use of MemorySSA instead of MemDep in MemCpyOpt. To allow this without significant compile-time impact, the MemCpyOpt pass is moved directly before DSE (in the cases where this was not already the case), which allows us to reuse the existing MemorySSA analysis. Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt can also perform simple optimizations across basic blocks. Differential Revision: https://reviews.llvm.org/D94376	2021-02-19 18:06:25 +01:00
Matthew Malcomson	c1653b8cc7	Hwasan InitPrctl check for error using internal_iserror When adding this function in https://reviews.llvm.org/D68794 I did not notice that internal_prctl has the API of the syscall to prctl rather than the API of the glibc (posix) wrapper. This means that the error return value is not necessarily -1 and that errno is not set by the call. For InitPrctl this means that the checks do not catch running on a kernel without the required ABI (not caught since I only tested this function correctly enables the ABI when it exists). This commit updates the two calls which check for an error condition to use internal_iserror. That function sets a provided integer to an equivalent errno value and returns a boolean to indicate success or not. Tested by running on a kernel that has this ABI and on one that does not. Verified that running on the kernel without this ABI the current code prints the provided error message and does not attempt to run the program. Verified that running on the kernel with this ABI the current code does not print an error message and turns on the ABI. This was done on an x86 kernel (where the ABI does not exist), an AArch64 kernel without this ABI, and an AArch64 kernel with this ABI. In order to keep running the testsuite on kernels that do not provide this new ABI we add another option to the HWASAN_OPTIONS environment variable; this option determines whether the library kills the process if it fails to enable the relaxed syscall ABI or not. This new flag is `fail_without_syscall_abi`. The check-hwasan testsuite results do not change with this patch on either x86, AArch64 without a kernel supporting this ABI, and AArch64 with a kernel supporting this ABI. Differential Revision: https://reviews.llvm.org/D96964	2021-02-19 16:30:56 +00:00
Philip Reames	4a5edea193	[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBits for signed ranges. This means we miss opportunities to improve range results. One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size. Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results. Differential Revision: https://reviews.llvm.org/D96534	2021-02-19 08:29:12 -08:00
Marek Kurdej	bcb5a124ae	[libc++] Turn off clang-format for auto-generated version header. NFC.	2021-02-19 17:26:16 +01:00
Joel E. Denny	ef8b3b5ffd	[OpenMP] Fix nvptx CUDA_VERSION conversion As mentioned in PR#49250, without this patch, ptxas for CUDA 9.1 fails in the following two tests: - openmp/libomptarget/test/mapping/lambda_mapping.cpp - openmp/libomptarget/test/offloading/bug49021.cpp The error looks like: ``` ptxas /tmp/lambda_mapping-081ea9.s, line 828; error : Not a name of any known instruction: 'activemask' ``` The problem is that our cmake script converts CUDA version strings incorrectly: 9.1 becomes 9100, but it should be 9010, as shown in `getCudaVersion` in `clang/lib/Driver/ToolChains/Cuda.cpp`. Thus, `openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu` inadvertently enables `activemask` because it apparently becomes available in 9.2. This patch fixes the conversion. This patch does not fix the other two tests in PR#49250. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97012	2021-02-19 11:09:26 -05:00
Joel E. Denny	d2147b1a87	[OpenMP] Fix always,from and delete for data absent at exit Without this patch, there's a runtime error for those map types at exit from an "omp target data" or at "omp target exit data", but the spec says the list item should be ignored. This patch tests that fix in data_absent_at_exit.c, and it also improves other testing for data that is not fully present at exit. Reviewed By: grokos, RaviNarayanaswamy Differential Revision: https://reviews.llvm.org/D96999	2021-02-19 11:09:26 -05:00
Mircea Trofin	82492f24ff	[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit VirtRegAuxInfo is an extensibility point, so the register allocator's decision on which implementation to use should be communicated to the other users - namely, LiveRangeEdit. Differential Revision: https://reviews.llvm.org/D96898	2021-02-19 07:44:28 -08:00
madhur13490	3c297a2564	Make fixed-abi default for AMD HSA OS fixed-abi uses pre-defined and predictable SGPR/VGPRs for passing arguments. This patch makes this scheme default when HSA OS is specified in triple. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96340	2021-02-19 15:05:25 +00:00
David Green	a1c34a9d6a	[ARM] Correct vector predicate type in MVE getCmpSelInstrCost	2021-02-19 14:43:51 +00:00
Jay Foad	b2c7f06db1	[AMDGPU] Add some GFX9 test coverage. NFC.	2021-02-19 14:38:52 +00:00
Simon Pilgrim	5d3930bb8f	[DAG] visitTRUNCATE - attempt to truncate USUBSAT Fold trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit)))	2021-02-19 14:26:05 +00:00
Nicolas Vasilache	62f5c46eec	[mlir][Linalg] NFC - Expose more options to the CodegenStrategy	2021-02-19 14:01:44 +00:00
Djordje Todorovic	b6db47d7e0	[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc The presence or absence of an inline variable (as well as formal parameter) with only an abstract_origin ref (without DW_AT_location) should not change the location coverage. It means, for both: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) DW_TAG_formal_parameter DW_AT_abstract_origin (0x0000005a "b") and, DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) we should report 0% location coverage. If we add DW_AT_location, for both cases the coverage should be improved. Differential Revision: https://reviews.llvm.org/D96045	2021-02-19 05:38:01 -08:00
Jan Kratochvil	08331281af	[lldb/Commands] Fix help text typo for 'breakpoint set' -a|--address.	2021-02-19 14:33:42 +01:00
David Green	7a5c26e99a	Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer" This reverts commit `3b34b06fc5` as runtime errors were reported.	2021-02-19 13:15:10 +00:00
Florian Hahn	edc92a1c42	[LV] Remove VPCallback. Now that all state for generated instructions is managed directly in VPTransformState, VPCallBack is no longer needed. This patch updates the last use of `getOrCreateScalarValue` to instead manage the value directly in VPTransformState and removes VPCallback. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D95383	2021-02-19 12:50:41 +00:00
Kadir Cetinkaya	6329ce75da	[clangd] Expose absoluteParent helper Will be used in other components that need ancestor traversal. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D96123	2021-02-19 13:40:21 +01:00
Simon Pilgrim	c1664c5a27	[X86][SSE] Add tests for trunc(usubsat()) patterns.	2021-02-19 12:26:48 +00:00
Nico Weber	3b7580951c	[gn build] Port `1a2b3536ef`	2021-02-19 07:23:48 -05:00
Fraser Cormack	d9531a3097	[RISCV] Address some clang-tidy warnings. NFCI.	2021-02-19 12:10:28 +00:00
Nikita Popov	ac065b7a37	[LLD] Fix tests after D96993 We now need mustprogress to eliminate these calls. The code doesn't really make sense, but that's not the point of the test...	2021-02-19 13:08:17 +01:00
Alexander Belyaev	53367b8fe1	[mlir][nfc] Fix indentation in LinalgOps.td.	2021-02-19 13:02:58 +01:00
Ron Lieberman	30c0d5b4c3	[OPENMP][AMDGCN] Improvements to print_kernel_trace (bit mask) allow bit masking to select various trace features. bit 0 => Launch tracing (stderr) bit 1 => timing of runtime (stdout) bit 2 => detailed launch tracing (stderr) bit 3 => timing goes to stdout instead of stderr example: LIBOMPTARGET_KERNEL_TRACE=7 does it all LIBOMPTARGET_KERNEL_TRACE=5 Launch + details LIBOMPTARGET_KERNEL_TRACE=2 timings + launch to stderr LIBOMPTARGET_KERNEL_TRACE=10 timings + launch to stdout Differential Revision: https://reviews.llvm.org/D96998	2021-02-19 06:47:22 -05:00
Carl Ritson	8181dcd30f	[AMDGPU] WQM/WWM: Fix marking of partial definitions Track lanes when processing definitions for marking WQM/WWM. If all lanes have been defined then marking can stop. This prevents marking unnecessary instructions as WQM/WWM. In particular this fixes a bug where values passing through V_SET_INACTIVE would be marked as requiring WWM. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D95503	2021-02-19 20:45:24 +09:00
Nikita Popov	2f17ed294f	[DCE] Don't remove non-willreturn calls In both ADCE and BDCE (via DemandedBits) we should not remove instructions that are not guaranteed to return. This issue was pointed out by fhahn in the recent llvm-dev thread. Differential Revision: https://reviews.llvm.org/D96993	2021-02-19 12:35:40 +01:00
Faris Rehman	529f71811b	[flang][driver] Add debug measure-parse-tree and pre-fir-tree options Add the following options: * -fdebug-measure-parse-tree * -fdebug-pre-fir-tree Summary of changes: - Add 2 new frontend actions: DebugMeasureParseTreeAction and DebugPreFIRTreeAction - Add MeasurementVisitor to FrontendActions.h - Make reportFatalSemanticErrors return true if there are any fatal errors - Port most of the `-fdebug-pre-fir-tree` tests to use the new driver if built, otherwise use f18. Differential Revision: https://reviews.llvm.org/D96884	2021-02-19 11:27:54 +00:00
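
The CUDA version conversion described in ef8b3b5ffd (9.1 should become 9010, not 9100) can be illustrated with a minimal sketch. The encoding assumed here is major * 1000 + minor * 10, which matches the 9010 value quoted in the commit message; the function name `cuda_version_number` and the `ACTIVEMASK_MIN_VERSION` constant are hypothetical and are not taken from the cmake script or from `Cuda.cpp`.

```python
def cuda_version_number(version):
    """Encode a "major.minor" CUDA version string as major * 1000 + minor * 10.

    Hypothetical helper for illustration; any patch component ("9.1.85") is ignored.
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major * 1000 + minor * 10


# Assumed threshold: the commit message says activemask "apparently becomes
# available in 9.2", which this encoding represents as 9020.
ACTIVEMASK_MIN_VERSION = cuda_version_number("9.2")

correct = cuda_version_number("9.1")  # 9010
wrong = 9100                          # the incorrect value cited in the commit message

print(correct >= ACTIVEMASK_MIN_VERSION)  # False: activemask stays disabled for 9.1
print(wrong >= ACTIVEMASK_MIN_VERSION)    # True: activemask is enabled by mistake
```

With the incorrect value 9100, a version check against 9020 passes for CUDA 9.1, which is consistent with the ptxas error quoted in the commit message.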
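
The fold named in 5d3930bb8f, trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit))), can be sanity-checked by brute force on small bit widths. This is a sketch under stated assumptions, not the DAG combiner code: `satlimit` is taken to be the largest unsigned value of the narrow type, and the helper names `usubsat` and `fold_holds` are hypothetical.

```python
def usubsat(a, b):
    """Unsigned saturating subtraction: clamps at zero instead of wrapping."""
    return max(a - b, 0)


def fold_holds(narrow_bits, wide_bits):
    """Exhaustively compare both sides of the fold for the given bit widths.

    x ranges over the narrow type, y over the wide type. zext(x) is simply x
    as a Python int; trunc is a mask to the narrow width.
    """
    narrow_mask = (1 << narrow_bits) - 1
    sat_limit = narrow_mask  # assumption: max unsigned value of the narrow type
    for x in range(1 << narrow_bits):
        for y in range(1 << wide_bits):
            before = usubsat(x, y) & narrow_mask                 # trunc(usubsat(zext(x), y))
            after = usubsat(x, min(y, sat_limit) & narrow_mask)  # usubsat(x, trunc(umin(y, satlimit)))
            if before != after:
                return False
    return True


print(fold_holds(4, 8))  # True
```

The clamp with umin matters: once y is at least satlimit, both sides are zero, and without the clamp a plain trunc(y) could drop high bits and produce a smaller subtrahend.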


380488 Commits