llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	833c260a4b	[X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classes llvm-svn: 319815	2017-12-05 19:21:28 +00:00
Matt Arsenault	7f0a527300	AMDGPU: Fix infinite loop with dbg_value Surprisingly SIOptimizeExecMaskingPreRA can infinite loop in some case with DBG_VALUE. Most tests using dbg_value are run at -O0, so don't run this pass. This seems to only happen when the value argument is undef. llvm-svn: 319808	2017-12-05 18:23:17 +00:00
Simon Pilgrim	65f805fe30	[X86][X87] Tag FCMOV instruction scheduler classes llvm-svn: 319804	2017-12-05 18:01:26 +00:00
Dan Gohman	c2c997718d	[WebAssembly] Implement WASM_STACK_POINTER. Use the .stack_pointer directive to implement WASM_STACK_POINTER for specifying a global variable to be the stack pointer. llvm-svn: 319797	2017-12-05 17:23:43 +00:00
Dan Gohman	f7172f4ab0	[WebAssembly] Don't emit .import_global for the wasm target. .import_global is used by the ELF-based target and not needed by the wasm target. llvm-svn: 319796	2017-12-05 17:21:57 +00:00
Simon Pilgrim	d9f1ae3266	[X86][AVX512] Tag VNNIW instruction scheduler classes llvm-svn: 319784	2017-12-05 16:17:21 +00:00
Simon Pilgrim	4a9b1e1273	[X86][AVX512] Drop some default NoItinerary arguments that aren't needed any more llvm-svn: 319782	2017-12-05 16:10:57 +00:00
Jina Nahias	51c1a627c2	[x86][AVX512] Lowering kunpack intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39720 Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1 llvm-svn: 319778	2017-12-05 15:42:56 +00:00
Simon Pilgrim	4d08aedba3	[X86][AVX512] Tag VPMADD52/VPSADBW instruction scheduler classes llvm-svn: 319772	2017-12-05 14:59:40 +00:00
Simon Pilgrim	71660c61e6	[X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes llvm-svn: 319770	2017-12-05 14:34:42 +00:00
Simon Pilgrim	b9b46394e3	[X86][AVX512] Cleanup bit logic scheduler classes llvm-svn: 319767	2017-12-05 14:04:23 +00:00
Simon Pilgrim	fd3a2632e5	[X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes llvm-svn: 319765	2017-12-05 13:49:44 +00:00
Simon Pilgrim	aa91155960	[X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319760	2017-12-05 12:14:36 +00:00
Simon Pilgrim	a2b5862641	[X86][AVX512] Cleanup VPCMP scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319758	2017-12-05 12:02:22 +00:00
Simon Pilgrim	54b8aa2bb2	[X86][AVX512] Tag VFIXUPIMM instructions scheduler classes llvm-svn: 319757	2017-12-05 11:46:57 +00:00
Jonas Paulsson	b5b91cd402	[SystemZ] set 'guessInstructionProperties = 0' and set flags as needed. This has proven a healthy exercise, as many cases of incorrect instruction flags were corrected in the process. As part of this, IntrWriteMem was added to several SystemZ intrinsics. Furthermore, a bug was exposed in TwoAddress with this change (as incorrect hasSideEffects flags were removed and instructions could now be sunk), and the test case for that bugfix (r319646) is included here as test/CodeGen/SystemZ/twoaddr-sink.ll. One temporary test regression (one extra copy) which will hopefully go away in upcoming patches for similar cases: test/CodeGen/SystemZ/vec-trunc-to-i1.ll Review: Ulrich Weigand. https://reviews.llvm.org/D40437 llvm-svn: 319756	2017-12-05 11:24:39 +00:00
Jonas Paulsson	86c40db49d	[Regalloc] Generate and store multiple regalloc hints. MachineRegisterInfo used to allow just one regalloc hint per virtual register. This patch extends this to a vector of regalloc hints, which is filled in by common code with sorted copy hints. Such hints will make for more ID copies that can be removed. NB! This improvement is currently (and hopefully temporarily) disabled by default, except for SystemZ. The only reason for this is the big impact this has on tests, which has unfortunately proven unmanageable. It was a long while since all the tests were updated and just waiting for review (which didn't happen), but now targets have to enable this themselves instead. Several targets could get a head-start by downloading the tests updates from the Phabricator review. Thanks to those who helped, and sorry you now have to do this step yourselves. This should be an improvement generally for any target! The target may still create its own hint, in which case this has highest priority and is stored first in the vector. If it has target-type, it will not be recomputed, as per the previous behaviour. The temporary hook enableMultipleCopyHints() will be removed as soon as all targets return true. Review: Quentin Colombet, Ulrich Weigand. https://reviews.llvm.org/D38128 llvm-svn: 319754	2017-12-05 10:52:24 +00:00
Guy Blank	f3cefdd350	[X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass When trying to determine the correct Mask register class corresponding to a GPR register class, not all register classes were handled. This caused an assertion to be raised on some scenarios. Differential Revision: https://reviews.llvm.org/D40290 llvm-svn: 319745	2017-12-05 09:08:24 +00:00
Craig Topper	a404ce955a	[X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319740	2017-12-05 06:37:21 +00:00
Daniel Sanders	3c1c4c0ee0	Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. Some concerns were raised with the direction. Revert while we discuss it and look into an alternative llvm-svn: 319739	2017-12-05 05:52:07 +00:00
Craig Topper	e1ba2450c2	[X86] Fix a crash if avx512bw and xop are both enabled when the IR contains a v32i8 bitreverse. llvm-svn: 319737	2017-12-05 04:47:12 +00:00
Matt Arsenault	e42b08d96d	AMDGPU: Fix missing subtarget feature initializer llvm-svn: 319733	2017-12-05 03:15:44 +00:00
Matt Arsenault	9a60c3ea36	AMDGPU: Fix crash when scheduling DBG_VALUE This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732	2017-12-05 03:09:23 +00:00
Craig Topper	276c770e57	[X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319728	2017-12-05 01:45:46 +00:00
Craig Topper	913b42b0e1	[X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef. This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction. llvm-svn: 319726	2017-12-05 01:28:06 +00:00
Craig Topper	6302012442	[X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI The getConstant function can take care of creating the APInt internally. getZeroVector will take care of using the correct type for the build vector to avoid re-lowering. The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change. llvm-svn: 319725	2017-12-05 01:28:04 +00:00
Craig Topper	adadaae586	[X86] Rearrange some of the code around AVX512 sign/zero extends. NFCI Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry. Move all of the i1 input type code in Lower_AVX512ZeroExend together. llvm-svn: 319724	2017-12-05 01:28:00 +00:00
Jan Vesely	39aeab4f30	AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712	2017-12-04 23:07:28 +00:00
Jan Vesely	d1c9b61e2b	AMDGPU: Disable fp64 support on pre GCN asics It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709	2017-12-04 22:57:29 +00:00
Hans Wennborg	361d4392cf	Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack" This broke the Chromium build (crbug.com/791714). Reverting while investigating. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 > > git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 319706	2017-12-04 22:21:15 +00:00
Matt Arsenault	68f0505263	AMDGPU: Fix creating invalid copy when adjusting dmask Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705	2017-12-04 22:18:27 +00:00
Matt Arsenault	e6667ded4d	AMDGPU: Use return value of MorphNodeTo llvm-svn: 319704	2017-12-04 22:18:22 +00:00
Daniel Sanders	04e4f47e93	[globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. This patch splits atomics out of the generic G_LOAD/G_STORE and into their own G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a necessary one. Atomic load/store has little implementation in common with non-atomic load/store. They tend to be handled very differently throughout the backend. It also has the nice side-effect of slightly improving the common-case performance at ISel since there's no longer a need for an atomicity check in the matcher table. All targets have been updated to remove the atomic load/store check from the G_LOAD/G_STORE path. AArch64 has also been updated to mark G_ATOMIC_LOAD/G_ATOMIC_STORE legal. There is one issue with this patch though which also affects the extending loads and truncating stores. The rules only match when an appropriate G_ANYEXT is present in the MIR. For example, (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X)))) will match but: (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X)) will not. This shouldn't be a problem at the moment, but as we get better at eliminating extends/truncates we'll likely start failing to match in some cases. The current plan is to fix this in a patch that changes the representation of extending-load/truncating-store to allow the MMO to describe a different type to the operation. llvm-svn: 319691	2017-12-04 20:39:32 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Pablo Barrio	2b4385846c	Fix function pointer tail calls in armv8-M.base Summary: The compiler fails with the following error message: fatal error: error in backend: ran out of registers during register allocation Tail call optimization for Armv8-M.base fails to meet all the required constraints when handling calls to function pointers where the arguments take up r0-r3. This is because the pointer to the function to be called can only be stored in r0-r3, but these are all occupied by arguments. This patch makes sure that tail call optimization does not try to handle this type of calls. Reviewers: chill, MatzeB, olista01, rengolin, efriedma Reviewed By: olista01, efriedma Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40706 llvm-svn: 319664	2017-12-04 16:55:49 +00:00
Sam Kolton	5f7f32c382	[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole. Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 llvm-svn: 319662	2017-12-04 16:22:32 +00:00
Jonas Hahnfeld	5db24d7c22	[NVPTX] Assign valid global names PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The existing pass already ensured this for globals and this patch adds the cleanup for functions with local linkage. However, there was a different problem in the case of collisions of the adjusted name: The ValueSymbolTable then automatically appended ".N" with increasing Ns to get a unique name while helping the ABI demangling. Special case this behavior to omit the dots and append N directly. This will always give us legal names according to the PTX requirements. Differential Revision: https://reviews.llvm.org/D40573 llvm-svn: 319657	2017-12-04 14:19:33 +00:00
Oliver Stannard	7ab60605f8	Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands This is causing a failure in the llvm-clang-x86_64-expensive-checks-win buildbot, and I can't reproduce it locally, so reverting until I can work out what is wrong. llvm-svn: 319654	2017-12-04 13:42:22 +00:00
Tim Corringham	6c6d5e24cd	AMDGPU: fix missing s_waitcnt Summary: The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possibility of a required s_waitcnt not being inserted. The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written. There is no test case provided for this fix because: 1) the only example that reproduces this is large and resistant to being reduced 2) the change is trivial Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40544 llvm-svn: 319651	2017-12-04 12:30:49 +00:00
Oliver Stannard	7cd4db94f8	[Asm, ARM] Add fallback diag for multiple invalid operands This adds a "invalid operands for instruction" diagnostic for instructions where there is an instruction encoding with the correct mnemonic and which is available for this target, but where multiple operands do not match those which were provided. This makes it clear that there is some combination of operands that is valid for the current target, which the default diagnostic of "invalid instruction" does not. Since this is a very general error, we only emit it if we don't have a more specific error. Differential revision: https://reviews.llvm.org/D36747 llvm-svn: 319649	2017-12-04 12:02:32 +00:00
Martin Storsjo	eca862de07	[AArch64] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Set the right Data*bitsDirective for windows to match the existing tests for other platforms. Make parts of the existing tests a regex, to allow matching .section .rdata for windows, to avoid having to duplicate the rest of the tests for windows. Differential Revision: https://reviews.llvm.org/D40770 llvm-svn: 319644	2017-12-04 09:09:04 +00:00
Martin Storsjo	c85cc41801	[ARM] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Differential Revision: https://reviews.llvm.org/D40769 llvm-svn: 319643	2017-12-04 09:08:55 +00:00
Craig Topper	4520d4f8ad	[X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled. These instructions can be used by widening to 512-bits and extracting back to 128/256. We do similar to several other instructions already. llvm-svn: 319641	2017-12-04 07:21:01 +00:00
Craig Topper	1151facf76	[X86] Don't turn UINT_TO_FP into SINT_TO_FP during lowering. We already do this as a DAG combine. The version during lowering can only trigger if known bits changes something that improves known bits analysis. But this means we should be improving known bits analysis to work on the unlowered form instead. llvm-svn: 319640	2017-12-04 05:38:44 +00:00
Simon Pilgrim	569e53b0f6	[X86][AVX512] Tag PH2PS/PS2PH conversion instructions scheduler classes llvm-svn: 319637	2017-12-03 21:43:54 +00:00
Simon Pilgrim	465a88bb92	[X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class llvm-svn: 319636	2017-12-03 21:16:12 +00:00
Simon Pilgrim	bc8d0223fb	[X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries llvm-svn: 319634	2017-12-03 20:57:04 +00:00
Simon Pilgrim	299a54c5b9	[X86][SSE] Cleanup float/int conversion scheduler itinerary classes Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well. llvm-svn: 319614	2017-12-02 12:27:44 +00:00
Craig Topper	7d9a3b82c6	[X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15. llvm-svn: 319612	2017-12-02 08:27:46 +00:00
Craig Topper	3e846ecb5b	[X86] Support %dr8-%dr15 in the assembler. Apparently I failed to make this work when I fixed it in the disassembler way back in r224862. llvm-svn: 319611	2017-12-02 08:27:45 +00:00


45037 Commits