llvm-project

Commit Graph

Author	SHA1	Message	Date
Luke Cheeseman	10981cc884	Revert r343317 - asan buildbots are breaking and I need to investigate the issue llvm-svn: 343341	2018-09-28 17:01:50 +00:00
Aditya Nandakumar	1cbb057142	[GISel]: Remove an incorrect assert in CallLowering https://reviews.llvm.org/D51147 Asserting if any extend of vectors should be up to the target's legalizer/target specific code not in CallLowering. reviewed by : dsanders. llvm-svn: 343325	2018-09-28 15:08:49 +00:00
Luke Cheeseman	21f2955bb2	Reapply changes reverted by r343235 - Add fix so that all code paths that create DWARFContext with an ObjectFile initialise the target architecture in the context - Add an assert that the Arch is known in the Dwarf CallFrameString method llvm-svn: 343317	2018-09-28 13:37:27 +00:00
Petar Jovanovic	ff1bc621a0	[MIPS GlobalISel] Lower i64 arguments Lower integer arguments larger than 32 bits for MIPS32. setMostSignificantFirst is used in order for G_UNMERGE_VALUES and G_MERGE_VALUES to always hold registers in same order, regardless of endianness. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D52409 llvm-svn: 343315	2018-09-28 13:28:47 +00:00
Jonas Devlieghere	f1c414cd0d	Split invocations in CodeGen/X86/cpus.ll among multiple tests. (NFC) On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan and UBSan enabled. With the same configuration on my machine, the test passes but takes more than 3 minutes to do so. I could increase the timeout, but I believe it makes more sense to split up the test because it allows for more parallelism. Differential revision: https://reviews.llvm.org/D52603 llvm-svn: 343313	2018-09-28 12:08:51 +00:00
Simon Pilgrim	17e5981ebf	[X86][Btver2] Fix BSF/BSR schedule Double throughput to account for 2 pipes + fix BSF's latency/uop counts Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343311	2018-09-28 10:26:48 +00:00
David Spickett	ea605913be	[ARM] Allow execute only code on Cortex-m23 The NoMovt feature prevents the use of MOVW/MOVT instructions on Cortex-M23 for performance reasons. These instructions are required for execute only code so NoMovt should be disabled when that option is enabled. Differential Revision: https://reviews.llvm.org/D52551 llvm-svn: 343302	2018-09-28 08:55:19 +00:00
Simon Pilgrim	280af1c7f0	[X86][BtVer2] Fix PHMINPOS schedule resources typo PHMINPOS can run on either JFPU pipe llvm-svn: 343299	2018-09-28 08:21:39 +00:00
Craig Topper	1b29615330	[X86] Add the test case from PR38986. The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch. llvm-svn: 343281	2018-09-27 23:25:10 +00:00
Craig Topper	6911bfe263	[ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it. Previously we started with undef and did a final merge with the passthru at the end. llvm-svn: 343273	2018-09-27 21:28:59 +00:00
Craig Topper	7d234d6628	[ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element. Previously we started with undef and did one final merge at the end with a select. llvm-svn: 343271	2018-09-27 21:28:52 +00:00
Stanislav Mekhanoshin	b080adfc0c	[AMDGPU] Fold copy (copy vgpr) This reduces the number of VGPRs used in some cases. Differential Revision: https://reviews.llvm.org/D52577 llvm-svn: 343249	2018-09-27 18:55:20 +00:00
Craig Topper	0423681d4a	[ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly. Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical. llvm-svn: 343244	2018-09-27 18:01:48 +00:00
Luke Cheeseman	8e5676b1aa	Revert r343192 as an ubsan build is currently failing llvm-svn: 343235	2018-09-27 16:47:30 +00:00
Daniel Cederman	0c05bdea2b	[Sparc] Remove the support for builtin setjmp/longjmp Summary: It is currently broken and for Sparc there is not much benefit in using a builtin version compared to a library version. Both versions need to store the same four values in setjmp and flush the register windows in longjmp. If the need for a builtin setjmp/longjmp arises there is an improved implementation available at https://reviews.llvm.org/D50969. Reviewers: jyknight, joerg, venkatra Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D51487 llvm-svn: 343210	2018-09-27 13:32:54 +00:00
Luke Cheeseman	f6844b307a	Reapply changes reverted in r343114, lldb patch to follow shortly llvm-svn: 343192	2018-09-27 10:39:20 +00:00
Craig Topper	e4c96f4a48	[X86] Update tzcnt fast-isel tests to match clang r343126. We now generate cttz with the zero_undef flag set to false. This allows -O0 to avoid the zero check. llvm-svn: 343127	2018-09-26 17:19:28 +00:00
Luke Cheeseman	77aaa22081	Revert r343112 as CallFrameString API change has broken lldb builds llvm-svn: 343114	2018-09-26 14:48:03 +00:00
Luke Cheeseman	03ad8812f5	[AArch64] - Return address signing dwarf support - Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll llvm-svn: 343112	2018-09-26 14:30:29 +00:00
Francis Visoiu Mistrih	6acaa18afc	[CodeGen] Always print register ties in MI::dump() It was the case when calling MO::dump(), but MI::dump() was still depending on hasComplexRegisterTies(). The MIR output is not affected. llvm-svn: 343107	2018-09-26 13:33:09 +00:00
Hans Wennborg	00b88bbcaf	Revert r343089 "[AArch64] - Return address signing dwarf support" This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail. > Functions that have signed return addresses need additional dwarf support: > - After signing the LR, and before authenticating it, the LR register is in a > state that is unusable by a debugger or unwinder > - To account for this a new directive, .cfi_negate_ra_state, is added > - This directive says the signed state of the LR register has now changed, > i.e. unsigned -> signed or signed -> unsigned > - This directive has the same CFA code as the SPARC directive GNU_window_save > (0x2d), adding a macro to account for multiply defined codes > - This patch matches the gcc implementation of this support: > https://patchwork.ozlabs.org/patch/800271/ > > Differential Revision: https://reviews.llvm.org/D50136 llvm-svn: 343103	2018-09-26 12:57:45 +00:00
Hiroshi Inoue	20982f0995	[PowerPC] optimize conditional branch on CRSET/CRUNSET This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET. Other optimizers, such as block placement, may generate such code and hence I do this at the very end of the optimization in pre-emit peephole pass. A conditional branch based on a constant is eliminated or converted into unconditional branch. Also CRSET/CRUNSET is eliminated if the condition code register is not used by instruction other than the branch to be optimized. Differential Revision: https://reviews.llvm.org/D52345 llvm-svn: 343100	2018-09-26 12:32:45 +00:00
Simon Pilgrim	26223bccde	[X86][SSE] Refresh PR34947 test code to handle D52504 The previously reduced version used urem <9 x i32> zeroinitializer, %tmp which D52504 will simplify. llvm-svn: 343097	2018-09-26 11:53:51 +00:00
Simon Pilgrim	5beaac433d	[X86][SSE] Use ISD::MULHS for constant vXi16 ISD::SRA lowering (PR38151) Similar to the existing ISD::SRL constant vector shifts from D49562, this patch adds ISD::SRA support with ISD::MULHS. As we're dealing with signed values, we have to handle shift by zero and shift by one special cases, so XOP+AVX2/AVX512 splitting/extension is still a better solution - really we should still use ISD::MULHS if one of the special cases are used but for now I've just left a TODO and filtered by isKnownNeverZero. Differential Revision: https://reviews.llvm.org/D52171 llvm-svn: 343093	2018-09-26 10:57:05 +00:00
Sam Parker	75aca94093	[ARM] Fix for PR39060 When calculating whether a value can safely overflow for use by an icmp, we weren't checking that the value couldn't wrap around. To do this we need the icmp to be using a constant, as well as the incoming add or sub. bugzilla report: https://bugs.llvm.org/show_bug.cgi?id=39060 Differential Revision: https://reviews.llvm.org/D52463 llvm-svn: 343092	2018-09-26 10:56:00 +00:00
David Green	353cb3d4e5	[CodeGen] Enable tail calls for functions with NonNull attributes. Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no effect on the call sequence. Differential Revision: https://reviews.llvm.org/D52238 llvm-svn: 343091	2018-09-26 10:46:18 +00:00
Yury Gribov	67572004df	Fixes removal of dead elements from PressureDiff (PR37252). Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 343090	2018-09-26 10:42:41 +00:00
Luke Cheeseman	f755e687fc	[AArch64] - Return address signing dwarf support Functions that have signed return addresses need additional dwarf support: - After signing the LR, and before authenticating it, the LR register is in a state that is unusable by a debugger or unwinder - To account for this a new directive, .cfi_negate_ra_state, is added - This directive says the signed state of the LR register has now changed, i.e. unsigned -> signed or signed -> unsigned - This directive has the same CFA code as the SPARC directive GNU_window_save (0x2d), adding a macro to account for multiply defined codes - This patch matches the gcc implementation of this support: https://patchwork.ozlabs.org/patch/800271/ Differential Revision: https://reviews.llvm.org/D50136 llvm-svn: 343089	2018-09-26 10:14:15 +00:00
Hans Wennborg	4b2e7daa7e	Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP" This broke Chromium's Android build (https://crbug.com/889390) and the polly-aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable). > Originally committed in rL342210 but was reverted in rL342260 because > it was causing issues in vectorized code, because I had forgotten to > ensure that we're operating on scalar values. > > Original commit message: > > On failing to find sequences that can be converted into dual macs, > try to find sequential 16-bit loads that are used by muls which we > can then use smultb, smulbt, smultt with a wide load. > > Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 343082	2018-09-26 08:41:50 +00:00
Zhaoshi Zheng	95710337b4	Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant"" This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346. Reland r342994: disabled the optimization and explicitly enable it in test. -mllvm -consthoist-min-num-to-rebase<unsigned>=0 [ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 343053	2018-09-26 00:59:09 +00:00
Thomas Lively	c949857a7f	[WebAssembly] SIMD conversions Summary: Lowers (s|u)itofp and fpto(s|u)i instructions for vectors. The fp to int conversions produce poison values if their arguments are out of the convertible range, so a future CL will have to add an LLVM intrinsic to make the saturating behavior of this conversion usable. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52372 llvm-svn: 343052	2018-09-26 00:34:36 +00:00
Stanislav Mekhanoshin	8dfcd83371	[AMDGPU] Fix ds combine with subregs Differential Revision: https://reviews.llvm.org/D52522 llvm-svn: 343047	2018-09-25 23:33:18 +00:00
Craig Topper	12c18840fa	[X86] Allow movmskpd/ps ISD nodes to be created and selected with integer input types. This removes an int->fp bitcast between the surrounding code and the movmsk. I had already added a hack to combineMOVMSK to try to look through this bitcast to improve the SimplifyDemandedBits there. But I found an additional issue where the bitcast was preventing combineMOVMSK from being called again after earlier nodes in the DAG are optimized. The bitcast gets revisited, but not the user of the bitcast. By using integer types throughout, the bitcast doesn't get in the way. llvm-svn: 343046	2018-09-25 23:28:27 +00:00
Craig Topper	d8c68840c8	[X86] Add some more movmsk test cases. NFC These IR patterns represent the exact behavior of a movmsk instruction using (zext (bitcast (icmp slt X, 0))). For the v4i32/v8i32/v2i64/v4i64 we currently emit a PCMPGT for the icmp slt which is unnecessary since we only care about the sign bit of the result. This is because of the int->fp bitcast we put on the input to the movmsk nodes for these cases. I'll be fixing this in a future patch. llvm-svn: 343045	2018-09-25 23:28:24 +00:00
Changpeng Fang	6f4922ccc9	AMDGPU: Add Selection patterns to support add of one bit. Summary: We generate s_xor to lower add of i1s in general cases, and s_not to lower add with a one-bit imm of -1 (true). Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D52518 llvm-svn: 343030	2018-09-25 21:21:18 +00:00
Sanjay Patel	10c11b867a	[x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37749) This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749 We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. I.e., splitting to 128-bit does not look like a win in most cases with >1 256-bit op. The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence: t29: v8i32 = concat_vectors t27, t28 t30: v4i64 = bitcast t29 t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ... t31: v4i64 = bitcast t18 t32: v4i64 = xor t30, t31 t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ... t34: v4i64 = bitcast t9 t35: v4i64 = and t32, t34 t36: v8i32 = bitcast t35 t37: v4i32 = extract_subvector t36, Constant:i64<0> t38: v4i32 = extract_subvector t36, Constant:i64<4> Differential Revision: https://reviews.llvm.org/D52318 llvm-svn: 343008	2018-09-25 19:09:34 +00:00
Jessica Paquette	e02de05b32	Revert "[ConstHoist] Do not rebase single (or few) dependent constant" This caused a couple test failures on a bot: CodeGen/X86/constant-hoisting-bfi.ll Transforms/ConstantHoisting/X86/ehpad.ll Example: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/ llvm-svn: 343005	2018-09-25 18:41:40 +00:00
Daniil Fukalov	349b5943b4	[RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled For the AMDGPU target, if an MBB contains an exec mask restore preamble, SplitEditor may reach a state in which it cannot insert a spill instruction. E.g. for the MIR bb.100: %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec if the regalloc tries to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert a spill instruction before the S_OR_SAVEEXEC_B64 instruction. But that is not possible, since it can generate incorrect code in terms of the exec mask. The change makes regalloc ignore such physreg candidates. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D52052 llvm-svn: 343004	2018-09-25 18:37:38 +00:00
Zhaoshi Zheng	2c1a09188f	[ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 342994	2018-09-25 17:45:37 +00:00
Craig Topper	6fb1358a98	[X86] Add AVX512 support to combineVectorSizedSetCCEquality. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52424 llvm-svn: 342989	2018-09-25 16:27:12 +00:00
Simon Pilgrim	b56be79e0c	Revert rL342916: [X86] Remove shift/rotate by CL memory (RMW) overrides As suggested by Craig Topper - I'm going to look at cleaning up the RMW sequences instead. The uops are slightly different to the register variant, so requires a +1uop tweak llvm-svn: 342969	2018-09-25 13:01:26 +00:00
Sameer Sahasrabuddhe	b4f2d1cb68	[AMDGPU] restore r342722 which was reverted with r342743 [AMDGPU] lower-switch in preISel as a workaround for legacy DA Summary: The default target of the switch instruction may sometimes be an "unreachable" block, when it is guaranteed that one of the cases is always taken. The dominator tree concludes that such a switch instruction does not have an immediate post dominator. This confuses divergence analysis, which is unable to propagate sync dependence to the targets of the switch instruction. As a workaround, the AMDGPU target now invokes lower-switch as a preISel pass. LowerSwitch is designed to handle the unreachable default target correctly, allowing the divergence analysis to locate the correct immediate dominator of the now-lowered switch. llvm-svn: 342956	2018-09-25 09:39:21 +00:00
Thomas Lively	12da0f9c3d	[WebAssembly] SIMD sqrt Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52387 llvm-svn: 342937	2018-09-25 03:39:28 +00:00
Stanislav Mekhanoshin	14fefe7f8e	[AMDGPU] Remove useless check from test. NFC. The check for assignment of zero is practically useless while the assignment moves around with different scheduling. llvm-svn: 342935	2018-09-25 01:24:54 +00:00
Craig Topper	9ce5da7b62	[X86] Don't create FILD ISD nodes when X87 is disabled. The included test case previously asserted because the type legalizer tried to soften the FILD ISD node. Fixes PR38819. llvm-svn: 342934	2018-09-25 00:16:57 +00:00
Thomas Lively	586153652c	[WebAssembly][NFC] Fix hardcoded stack indices in tests Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52388 llvm-svn: 342928	2018-09-24 23:42:07 +00:00
Christy Lee	e94374809e	Re-submitting changes in D51550 because it failed to patch. Reviewers: javed.absar, trentxintong, courbet Reviewed By: trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52433 llvm-svn: 342919	2018-09-24 20:47:12 +00:00
Simon Pilgrim	0b4ad7596f	[X86] Remove shift/rotate by CL memory (RMW) overrides The uops are slightly different to the register variant, so requires a +1uop tweak llvm-svn: 342916	2018-09-24 20:11:50 +00:00
Stefan Pintilie	b5305771fb	[Power9] [LLVM] Add __float128 exponent GET and SET builtins Added __builtin_vsx_scalar_extract_expq __builtin_vsx_scalar_insert_exp_qp Builtins should behave the same way as in GCC. Differential Revision: https://reviews.llvm.org/D48185 llvm-svn: 342910	2018-09-24 18:14:13 +00:00
Simon Pilgrim	51cbd838d0	[X86][AVX] Add truncation as shuffle test for PR31451 llvm-svn: 342908	2018-09-24 17:26:31 +00:00


25981 Commits