llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	9097a07e4e	AArch64: work around how Cyclone handles "movi.2d vD, #0". For Cyclone, the instruction "movi.2d vD, #0" is executed incorrectly in some rare circumstances. Work around the issue conservatively by avoiding the instruction entirely. This patch changes CodeGen so that problematic instructions are never generated, and the AsmParser so that an equivalent instruction is used (with a warning). llvm-svn: 320965	2017-12-18 10:36:00 +00:00
Igor Laevsky	7bd3fb15e1	[TargetLibraryInfo] Discard library functions with incorrectly sized integers Differential Revision: https://reviews.llvm.org/D41184 llvm-svn: 320964	2017-12-18 10:31:58 +00:00
Sam Parker	fd967f2f7a	[ARM] Adjust test checks Correct the CHECK-LABELS of a couple of dag combine tests. llvm-svn: 320963	2017-12-18 10:08:03 +00:00
Sam Parker	00804efd72	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off, meaning that the 'and' isn't removed, but the load(s) are still narrowed. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320962	2017-12-18 10:04:27 +00:00
Pavel Labath	d8b3c1a135	NPL: Clean up handling of inferior exit Summary: lldb-server was sending the "exit" packet (W??) twice. This happened because it was handling both the pre-exit (PTRACE_EVENT_EXIT) and post-exit (WIFEXITED) as exit events. We had some code which was trying to detect when we've already sent the exit packet, but this stopped working quite a while ago. This never really caused any problems in practice because the client automatically closes the connection after receiving the first packet, so the only effect of this was some warning messages about extra packets from the lldb-server test suite, which were ignored because they didn't fail the test. The new test suite will be stricter about this, so I fix this issue ignoring the first event. I think this is the correct behavior, as the inferior is not really dead at that point, so it's premature to send the exit packet. There isn't an actual test yet which would verify the exit behavior, but in my next patch I will add a test which will also test this functionality. Reviewers: eugene Subscribers: lldb-commits Differential Revision: https://reviews.llvm.org/D41069 llvm-svn: 320961	2017-12-18 09:44:29 +00:00
Clement Courbet	6f42de3062	[NFC][CodeGen][ExpandMemCmp] Fix documentation. llvm-svn: 320960	2017-12-18 07:32:48 +00:00
Craig Topper	7034d401f8	[X86] Use mattr instead of mcpu in some of the cost model tests. Based on the names of the check lines, features seem more appropriate than cpu. Spotted while prototyping my patch to make 512-bit vectors illegal on SKX sometimes. llvm-svn: 320959	2017-12-18 07:21:58 +00:00
Hiroshi Inoue	c6faf15459	[SROA] Disable non-whole-alloca splits by default This patch introduces a switch to control splitting of non-whole-alloca slices, off by default. The switch will be on by default again after fixing an issue reported in PR35657. llvm-svn: 320958	2017-12-18 06:47:37 +00:00
Craig Topper	8e2837cc6e	[X86] Fix mistake that I made when splitting up the setOperationAction calls recently. The block into which I moved things that need BWI and 512-bit or VLX was incorrectly qualified with just hasBWI \|\| hasVLX. Here I've qualified it with hasBWI && (hasAVX512 \|\| hasVLX) where the hasAVX512 will be replaced with allowing 512-bit vectors in an upcoming patch. llvm-svn: 320957	2017-12-18 04:50:05 +00:00
Serguei Katkov	b0b67a8d38	[CGP] Fix the handling of select inst in complex addressing mode When we put the value in the select placeholder we must pass the value through the simplification tracker because the value might already be simplified and erased. This is a fix for PR35658. Reviewers: john.brawn, uabelho Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41251 llvm-svn: 320956	2017-12-18 04:25:07 +00:00
Sanjay Patel	9da049fa8a	[x86] add tests for finite libcall lowering (PR35672); NFC llvm-svn: 320955	2017-12-18 00:38:45 +00:00
Benjamin Kramer	acfa339e15	Refactor overridden methods iteration to avoid double lookups. Convert most uses to range-for loops. No functionality change intended. llvm-svn: 320954	2017-12-17 23:52:45 +00:00
Bjorn Steinbrink	3603de2fa2	Re-commit "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()" llvm-clang-x86_64-expensive-checks-win is still broken, so the failure seems unrelated. llvm-svn: 320953	2017-12-17 21:20:16 +00:00
Davide Italiano	5cc82f24ff	[testsuite] Un-XFAIL the global variables tests. <rdar://problem/28725399> Differential Revision: https://reviews.llvm.org/D41312 llvm-svn: 320952	2017-12-17 18:58:27 +00:00
Craig Topper	255a76d6d1	[X86] Add test cases that show cases where buildvector of extract and inserts should be turned into fmsubadd. This is a follow up to the fmaddsub support added in r320950. Hopefully in the future we can fix lowering to handle this fmsubadd too. llvm-svn: 320951	2017-12-17 18:31:36 +00:00
Craig Topper	fd8d040820	[X86] Make the code that creates fmaddsub from build_vector of extracts and inserts functional and add tests. Summary: We had no tests for this and we couldn't do the optimization because of a bad use count check. We need to know how many non-undef pieces of the build vector were filled in and ensure our use count is equal to that. But on the shuffle combine version we need the use count to be 2. The missing coverage was noticed during the review of D40335. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41133 llvm-svn: 320950	2017-12-17 18:23:45 +00:00
Simon Pilgrim	406d04a916	[X86] Regenerate truncated rotation tests + add missing 32-bit checks llvm-svn: 320949	2017-12-17 18:20:42 +00:00
Sam Clegg	f61676a18a	[WebAssembly] Move code for copying of data segment relocation. NFC. This is a preparatory change for function gc which also requires relocations to be copied in ranges like this. Differential Revision: https://reviews.llvm.org/D41313 llvm-svn: 320948	2017-12-17 17:52:01 +00:00
Sam Clegg	b07a016ed1	use uint32_t llvm-svn: 320947	2017-12-17 17:50:07 +00:00
Sam Clegg	c551522d25	[WebAssembly] Export some more info on wasm functions Summary: These fields are useful for lld's gc-sections support. Also remove an unused field. Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish Differential Revision: https://reviews.llvm.org/D41320 llvm-svn: 320946	2017-12-17 17:50:07 +00:00
Bjorn Steinbrink	6f7bbf349f	Revert "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()" This reverts commit 217067d5179882de9deb60d2e866befea4c126e7. Fails on llvm-clang-x86_64-expensive-checks-win llvm-svn: 320945	2017-12-17 15:16:58 +00:00
Bjorn Steinbrink	e880f262e5	Revert "Treat sret arguments as being dereferenceable in getPointerDereferenceableBytes()" This reverts commit 8b7a7660a3904b2088bc594311bcea2c651def08. I didn't mean to commit this. llvm-svn: 320944	2017-12-17 15:16:51 +00:00
Bjorn Steinbrink	7afcb71a42	Treat sret arguments as being dereferenceable in getPointerDereferenceableBytes() llvm-svn: 320943	2017-12-17 15:11:52 +00:00
Aleksei Sidorin	dec81835d1	[ASTImporter] Support importing FunctionTemplateDecl and CXXDependentScopeMemberExpr * Also introduces ImportTemplateArgumentListInfo facility (A. Sidorin) Patch by Peter Szecsi! Differential Revision: https://reviews.llvm.org/D38692 llvm-svn: 320942	2017-12-17 14:16:17 +00:00
Simon Pilgrim	b1b30286bf	Remove superfluous break after a return. NFCI. llvm-svn: 320941	2017-12-17 11:01:33 +00:00
Craig Topper	5992535e1a	[X86DomainReassignment] Store legal domains in a std::bitset instead of using a SmallVector that really only ever has one element as a set. llvm-svn: 320940	2017-12-17 03:16:23 +00:00
Bjorn Steinbrink	c27f81b92b	Properly handle byval arguments in getPointerDereferenceableBytes() Summary: For byval arguments, the number of dereferenceable bytes is equal to the size of the pointee, not the pointer. Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41305 llvm-svn: 320939	2017-12-17 02:37:42 +00:00
Bjorn Steinbrink	5d86532467	Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes() Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41288 llvm-svn: 320938	2017-12-17 01:54:25 +00:00
Craig Topper	ee1e71e576	[X86] Use extract_vector_elt instead of X86ISD::VEXTRACT for isel of vXi1 extractions. llvm-svn: 320937	2017-12-17 01:35:48 +00:00
Craig Topper	c0c2d19e08	[X86] Canonicalize extract_vector_elt from vXi1 to always return MVT::i32. This allows us to remove some isel patterns that allowed MVT::i8 result type. llvm-svn: 320936	2017-12-17 01:35:47 +00:00
Craig Topper	c609dc8f55	[X86] Don't create X86ISD::VEXTRACT nodes directly. Use EXTRACT_VECTOR_ELT and allow that to be legalized to VEXTRACT. I think we can remove the VEXTRACT node completely and use a canonicalized EXTRACT_VECTOR_ELT instead. This is a first step. llvm-svn: 320935	2017-12-17 01:35:44 +00:00
Simon Pilgrim	5c0c93ed4c	Fix unused variable warning. llvm-svn: 320934	2017-12-16 23:37:51 +00:00
Simon Pilgrim	4c9e8215e9	[X86][AVX] lowerVectorShuffleAsBroadcast - aggressively peek through BITCASTs Assuming we can safely adjust the broadcast index for the new type to keep it suitably aligned, then peek through BITCASTs when looking for the broadcast source. Fixes PR32007 llvm-svn: 320933	2017-12-16 23:32:18 +00:00
Simon Pilgrim	88c10bc969	[X86][AVX] Use extract128BitVector helper. NFCI. llvm-svn: 320932	2017-12-16 23:09:57 +00:00
Kostya Kortchinsky	8bcbcea929	[sanitizer] Define __sanitizer_clockid_t on FreeBSD Summary: https://reviews.llvm.org/D41121 broke the FreeBSD build due to that type not being defined on FreeBSD. As far as I can tell, it is an int, but I do not have a way to test the change. Reviewers: alekseyshl, kparzysz Reviewed By: kparzysz Subscribers: kparzysz, emaste, kubamracek, krytarowski, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D41325 llvm-svn: 320931	2017-12-16 23:01:14 +00:00
Simon Pilgrim	f3b6da00f5	[X86][AVX] Fix failed broadcast fold Strip excess BITCASTs from EXTRACT_SUBVECTOR input llvm-svn: 320930	2017-12-16 22:57:17 +00:00
Sean Fertile	68d7f9da76	[Memcpy Loop Lowering] Only calculate residual size/bytes copied when needed. If the loop operand type is int8 then there will be no residual loop for the unknown size expansion. Don't create the residual-size and bytes-copied values when they are not needed. llvm-svn: 320929	2017-12-16 22:41:39 +00:00
Craig Topper	849b717c86	[X86] Don't pass a zero input to the passthru operand of getVectorMaskingNode/getScalarMaskingNode when it's going to emit an ISD::OR/ISD::AND. NFCI In those cases, the passthru operand of the methods isn't used. The calls to the scalar version were passing a MVT::i1 zero, which is an illegal type at the stage this code runs. llvm-svn: 320928	2017-12-16 21:12:24 +00:00
Craig Topper	93253e189c	[X86] Have getVectorMaskingNode return an ISD::AND for X86ISD::VPSHUFBITQMB instead of creating a select with one input being 0. llvm-svn: 320927	2017-12-16 21:12:23 +00:00
Craig Topper	1260a4e826	[X86] When using vpopcntdq for ctpop of v8i16 vectors, only promote to v8i32. Previously we promoted to v8i64, but we don't need to go all the way to 512-bits. If we have VLX we can use the 256-bit instruction. And even if we don't have VLX we can widen v8i32 to v16i32 and drop the upper half. llvm-svn: 320926	2017-12-16 19:31:36 +00:00
Sam Clegg	5029d676f8	[libcxx] Add WebAssembly support It turns out that this is the only change required in libcxx for it to compile with the new `wasm32-unknown-unknown-wasm` target recently added to Clang. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D41073 llvm-svn: 320925	2017-12-16 18:59:50 +00:00
Craig Topper	a42a2ba221	[X86] Combine some more scheduler model entries using regular expressions. We had a lot of separate 32 and 64 instructions that had the same scheduling data. This merges them into the same regular expression. This is pretty consistent with a lot of other instructions. llvm-svn: 320924	2017-12-16 18:35:31 +00:00
Craig Topper	17a311831c	[X86] Use instrs instead of instregex for gather/scatter instructions in the scheduler models. Combine into single InstrRW entries. This reduces the number of scheduler groups in subtarget info. llvm-svn: 320923	2017-12-16 18:35:29 +00:00
Simon Pilgrim	5f022d278b	[InstCombine] Regenerate FMUL/FMA combine tests with update_test_checks.py llvm-svn: 320922	2017-12-16 17:18:15 +00:00
Sanjay Patel	5a0cdac174	[InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+sel We want to do this for 2 reasons: 1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766. 2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern. More detail about what happens in the backend: 1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs into the shift variant. That is the opposite of this IR canonicalization. 2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization. 3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2 into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node when that's legal/custom. 4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a variety of ways. a. For #2, the vector path is missing the case for setlt with a '1' constant. b. For #3, we are missing a match for commuted versions of the shift variants. 5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the shift sequence when not. 6. In the following examples with this patch applied, we may get conditional moves rather than the shift produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate. 
define i32 @abs_shifty(i32 %x) { %signbit = ashr i32 %x, 31 %add = add i32 %signbit, %x %abs = xor i32 %signbit, %add ret i32 %abs } define i32 @abs_cmpsubsel(i32 %x) { %cmp = icmp slt i32 %x, zeroinitializer %sub = sub i32 zeroinitializer, %x %abs = select i1 %cmp, i32 %sub, i32 %x ret i32 %abs } define <4 x i32> @abs_shifty_vec(<4 x i32> %x) { %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %add = add <4 x i32> %signbit, %x %abs = xor <4 x i32> %signbit, %add ret <4 x i32> %abs } define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) { %cmp = icmp slt <4 x i32> %x, zeroinitializer %sub = sub <4 x i32> zeroinitializer, %x %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x ret <4 x i32> %abs } > $ ./opt -instcombine shiftyabs.ll -S \| ./llc -o - -mtriple=x86_64 -mattr=avx > abs_shifty: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_cmpsubsel: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_shifty_vec: > vpabsd %xmm0, %xmm0 > retq > > abs_cmpsubsel_vec: > vpabsd %xmm0, %xmm0 > retq > > $ ./opt -instcombine shiftyabs.ll -S \| ./llc -o - -mtriple=aarch64 > abs_shifty: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_cmpsubsel: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_shifty_vec: > abs v0.4s, v0.4s > ret > > abs_cmpsubsel_vec: > abs v0.4s, v0.4s > ret > > $ ./opt -instcombine shiftyabs.ll -S \| ./llc -o - -mtriple=powerpc64le > abs_shifty: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_cmpsubsel: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_shifty_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > > abs_cmpsubsel_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > Differential Revision: https://reviews.llvm.org/D40984 llvm-svn: 320921	2017-12-16 16:41:17 +00:00
Sanjay Patel	cb8c009801	[Driver, CodeGen] pass through and apply -fassociative-math There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF: 1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math. This was mostly already done - we just need to translate the flag as a codegen option. The test file is complicated because there are many potential combinations of flags here. Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math. 2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and corresponding test. For the motivating example from PR27372: float foo(float a, float x) { return ((a + x) - x); } $ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm \| egrep 'fadd\|fsub' %add = fadd nnan ninf nsz arcp contract float %0, %1 %sub = fsub nnan ninf nsz arcp contract float %add, %2 So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch). This case now works as expected end-to-end although the underlying logic is still wrong: $ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math \| grep xmm addss %xmm1, %xmm0 subss %xmm1, %xmm0 We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example: $ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm \| grep fadd %add = fadd reassoc float %0, %1 $ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math \| grep xmm addss %xmm1, %xmm0 subss %xmm1, %xmm0 Differential Revision: https://reviews.llvm.org/D39812 llvm-svn: 320920	2017-12-16 16:11:17 +00:00
Craig Topper	5028ace602	[X86] Implement kand/kandn/kor/kxor/kxnor/knot intrinsics using native IR. llvm-svn: 320919	2017-12-16 08:26:22 +00:00
Craig Topper	d2a2a39c93	[X86] Remove GCCBuiltin from kand/kandn/kor/kxor/kxnor/knot intrinsics so clang can implement with native IR. llvm-svn: 320918	2017-12-16 08:25:30 +00:00
Craig Topper	1c7d07c601	[X86] Remove unneeded code for handling the old kunpck intrinsics. llvm-svn: 320917	2017-12-16 06:58:30 +00:00
Craig Topper	798f2c037c	[X86] Add the two files I forgot to commit in r320915. llvm-svn: 320916	2017-12-16 06:10:24 +00:00
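
The InstCombine canonicalization in r320921 (ashr+add+xor --> cmp+neg+sel) rests on the two forms of abs() computing the same value for every 32-bit input. The following is a minimal C sketch of that equivalence, not code from the patch. The function names mirror the IR functions in the commit message; the use of uint32_t to model LLVM's wrapping i32 arithmetic (so that INT_MIN does not trigger signed-overflow undefined behavior in C) and the helper all_samples_agree are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift form: models ashr i32 %x, 31; add; xor.
 * The sign mask is built explicitly because right-shifting a negative
 * signed value is implementation-defined in C. */
uint32_t abs_shifty(uint32_t x) {
    uint32_t signbit = (x >> 31) ? 0xFFFFFFFFu : 0u; /* all ones if negative */
    uint32_t add = signbit + x;                      /* wraps modulo 2^32 */
    return signbit ^ add;
}

/* Compare/select form: models icmp slt; sub 0, %x; select. */
uint32_t abs_cmpsubsel(uint32_t x) {
    int cmp = (x >> 31) != 0; /* x < 0 when read as a signed i32 */
    uint32_t sub = 0u - x;    /* wrapping negate */
    return cmp ? sub : x;
}

/* Hypothetical helper: returns 1 if both forms agree on a few edge cases. */
int all_samples_agree(void) {
    const uint32_t samples[] = {
        0u, 1u, 5u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFBu, 0xFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (abs_shifty(samples[i]) != abs_cmpsubsel(samples[i]))
            return 0;
    }
    return 1;
}
```

Both functions map 0xFFFFFFFB (that is, -5) to 5, and both leave 0x80000000 (INT_MIN) unchanged, which matches the wrapping behavior of the IR in the commit message.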

278922 Commits

All Branches