llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	050c9c8f83	[X86] Prevent selecting masked aligned load instructions if the load should be non-temporal Summary: The aligned load predicates don't suppress themselves if the load is non-temporal the way the unaligned predicates do. For the most part this isn't a problem because the aligned predicates are mostly used for instructions that only load the the non-temporal loads have priority over those. The exception are masked loads. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35712 llvm-svn: 309079	2017-07-26 04:31:04 +00:00
Eugene Zelenko	96d933da4f	[AArch64] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 309062	2017-07-25 23:51:02 +00:00
Eric Christopher	97ae58686f	Update the comments on default subtargets based on feedback. llvm-svn: 309041	2017-07-25 22:21:08 +00:00
Marek Olsak	6096f542d1	AMDGPU/SI: Fix Depth and Height computation for SI scheduler Patch by: Axel Davy Differential Revision: https://reviews.llvm.org/D34967 llvm-svn: 309028	2017-07-25 20:37:03 +00:00
Marek Olsak	e6f74384b1	AMDGPU/SI: Force exports at the end for SI scheduler Patch by: Axel Davy Differential Revision: https://reviews.llvm.org/D34965 llvm-svn: 309027	2017-07-25 20:36:58 +00:00
Eric Christopher	adfe5368ee	Revert "This patch enables the usage of constant Enum identifiers within Microsoft style inline assembly statements." This reverts commit r308966. llvm-svn: 309005	2017-07-25 19:22:09 +00:00
Nemanja Ivanovic	009016bb70	[PowerPC] Pretty-print CR bits the way the binutils disassembler does This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001	2017-07-25 18:26:35 +00:00
Nemanja Ivanovic	864c953773	[PowerPC] - Recommit r304907 now that the issue has been fixed This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995	2017-07-25 17:54:51 +00:00
Simon Pilgrim	18b97f78fe	[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914) D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining which resulted in regressions for some optimized libc memcmp implementations (PR33914). Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os). This patch should be considered for the 5.0.0 release branch as well Differential Revision: https://reviews.llvm.org/D35830 llvm-svn: 308986	2017-07-25 17:04:37 +00:00
Fedor Sergeev	7856a3205f	[Sparc] invalid adjustments in TLS_LE/TLS_LDO relocations removed Summary: Some SPARC TLS relocations were applying nontrivial adjustments to zero value, leading to unexpected non-zero values in ELF and then Solaris linker failures. Getting rid of these adjustments. Fixes PR33825. Reviewers: rafael, asb, jyknight Subscribers: joerg, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D35567 llvm-svn: 308978	2017-07-25 15:28:28 +00:00
Andrew V. Tischenko	32e9b1ad0b	X86 Asm uses assertions instead of proper diagnostic. This patch fixes that. Differential Revision: https://reviews.llvm.org/D35115 llvm-svn: 308972	2017-07-25 13:05:12 +00:00
Matan Haroush	2f21017be2	This patch enables the usage of constant Enum identifiers within Microsoft style inline assembly statements. Differential Revision: https://reviews.llvm.org/D33277 https://reviews.llvm.org/D33278 llvm-svn: 308966	2017-07-25 10:44:09 +00:00
Sam Parker	19a08e42a8	[ARM] Enable partial and runtime unrolling Enable runtime and partial loop unrolling of simple loops without calls on M-class cores. The thresholds are calculated based on whether the target is Thumb or Thumb-2. Differential Revision: https://reviews.llvm.org/D34619 llvm-svn: 308956	2017-07-25 08:51:30 +00:00
Martin Storsjo	8cb3667541	[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs Create a dummy 8 byte fixed object for the unused slot below the first stored vararg. Alternative ideas tested but skipped: One could try to align the whole fixed object to 16, but I haven't found how to add an offset to the stack frame used in LowerWin64_VASTART. If only the size of the fixed stack object size is padded but not the offset, via MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false), PrologEpilogInserter crashes due to "Attempted to reset backwards range!". This fixes misconceptions about where registers are spilled, since AArch64FrameLowering.cpp assumes the offset from fixed objects is aligned to 16 bytes (and the Win64 case there already manually aligns the offset to 16 bytes). This fixes cases where local stack allocations could overwrite callee saved registers on the stack. Differential Revision: https://reviews.llvm.org/D35720 llvm-svn: 308950	2017-07-25 05:20:01 +00:00
Reid Kleckner	c990b5d916	Revert "[X86][InlineAsm][Ms Compatibility]Prefer variable name over a register when the two collides" This reverts r308867 and r308866. It broke the sanitizer-windows buildbot on C++ code similar to the following: namespace cl { } void f() { __asm { mov al, cl } } t.cpp(4,13): error: unexpected namespace name 'cl': expected expression mov al, cl ^ In this case, MSVC parses 'cl' as a register, not a namespace. llvm-svn: 308926	2017-07-24 20:48:15 +00:00
Krzysztof Parzyszek	1fd0c7e598	[Hexagon] Recognize C4_cmpneqi, C4_cmpltei and C4_cmplteui in NewValueJump llvm-svn: 308914	2017-07-24 19:35:48 +00:00
Evandro Menezes	29ffb0e66a	[AArch64] Adjust the cost model for Exynos M1 and M2 Fine tune the resources in a couple of ASIMD loads. llvm-svn: 308904	2017-07-24 18:06:16 +00:00
Matt Arsenault	7052a6a505	AMDGPU: Fix allocating pseudo-registers There's no need for these to be part of a class since they are immediately replaced. New unreachable hit in existing tests.' llvm-svn: 308903	2017-07-24 18:06:15 +00:00
Ayman Musa	b16ce777e3	[X86][AVX512] Add patterns for masked AVX512 floating point compare instructions that were missing. patterns were missed by D33188. Adding for completion. +Updating test. Differential Revesion: https://reviews.llvm.org/D35179 llvm-svn: 308868	2017-07-24 08:10:32 +00:00
Coby Tayree	c48388d3d3	[X86][InlineAsm][Ms Compatibility]Prefer variable name over a register when the two collides On MS-style, the following snippet: int eax; __asm mov eax, ebx should yield loading of ebx, into the location pointed by the variable eax This patch sees to it. Currently, a reg-to-reg move would have been invoked. clang: D34740 Differential Revision: https://reviews.llvm.org/D34739 llvm-svn: 308866	2017-07-24 07:04:55 +00:00
Dylan McKay	6c5c6aa9d8	[AVR] Remove the instrumentation pass I have a much better way of running integration tests now. https://github.com/dylanmckay/avr-test-suite llvm-svn: 308857	2017-07-23 23:39:11 +00:00
Petr Hosek	710479cede	[CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos optimization Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D35748 llvm-svn: 308854	2017-07-23 22:30:00 +00:00
Chad Rosier	9b2b4c961a	[AArch64] Redundant Copy Elimination - remove more zero copies. This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne and we know the result of the compare instruction is zero. For example, BB#0: subs w0, w1, w2 str w0, [x1] b.ne .LBB0_2 BB#1: mov w0, wzr ; <-- redundant str w0, [x2] .LBB0_2 Differential Revision: https://reviews.llvm.org/D35075 llvm-svn: 308849	2017-07-23 16:38:08 +00:00
Craig Topper	07a7d56144	[X86] Add some hasSideEffects=0 flags. llvm-svn: 308835	2017-07-23 03:59:39 +00:00
Craig Topper	6912d7faa3	[X86] Add patterns for memory forms of SARX/SHLX/SHRX with careful complexity adjustment to keep shift by immediate using the legacy instructions. These patterns were only missing to favor using the legacy instructions when the shift was a constant. With careful adjustment of the pattern complexity we can make sure the immediate instructions still have priority over these patterns. llvm-svn: 308834	2017-07-23 03:59:37 +00:00
Craig Topper	abfe380f9a	[X86] Add nopq instruction which is a rex encoded version of nopl for gas compatibility. llvm-svn: 308818	2017-07-22 01:30:53 +00:00
Craig Topper	e88aef4b5f	[X86] Add register form of NOPL and NOPW for assembler/disassembler. Fixes PR32805. llvm-svn: 308817	2017-07-22 01:30:51 +00:00
Matt Arsenault	416d755675	AMDGPU: Remove leftover td file All of the instructions were moved out of this a while ago, so it's just a useless comment now. llvm-svn: 308815	2017-07-22 00:40:46 +00:00
Erich Keane	d8f61f8f7e	Remove Bitrig: LLVM Changes Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned. Differential Revision: https://reviews.llvm.org/D35707 llvm-svn: 308799	2017-07-21 22:48:47 +00:00
Farhana Aleen	e4a89a6462	X86InterleaveAccess: A fix for bug33826 Reviewers: DavidKreitzer Differential Revision: https://reviews.llvm.org/D35638 llvm-svn: 308784	2017-07-21 21:35:00 +00:00
Konstantin Zhuravlyov	e9a5a77ee3	AMDGPU: Implement memory model llvm-svn: 308781	2017-07-21 21:19:23 +00:00
Guozhi Wei	e0094ce22e	[PPC] Add Defs = [CARRY] to MIR SRADI_32 MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl. This patch adds the implicit definition. Differential Revision: https://reviews.llvm.org/D35699 llvm-svn: 308780	2017-07-21 21:06:08 +00:00
Konstantin Zhuravlyov	070d88e335	AMDGPU: Introduce maybeAtomic instruction flag Testing is in the follow up change llvm-svn: 308779	2017-07-21 21:05:45 +00:00
Matt Arsenault	f014d7cbde	AMDGPU: Preserve undef flag in eliminateFrameIndex Fixes verifier errors in some call tests. Not sure why we haven't run into this before. Test split into separate patch for once call support is committed. llvm-svn: 308774	2017-07-21 19:31:44 +00:00
Matt Arsenault	0ed39d329d	AMDGPU: Partially fix improper reliance on memoperands There are 2 more places doing this, but I'm not sure what they are doing and don't make any sense to me llvm-svn: 308770	2017-07-21 18:54:54 +00:00
Matt Arsenault	6ab9ea9614	AMDGPU: Don't track lgkmcnt for global_/scratch_ instructions llvm-svn: 308766	2017-07-21 18:34:51 +00:00
Matt Arsenault	37a58e03c7	AMDGPU: Fix getMemOpBaseRegImmOfs for flat with offsets llvm-svn: 308762	2017-07-21 18:06:36 +00:00
Krzysztof Parzyszek	3ad0d01e9e	[Hexagon] Add inline-asm constraint 'a' for modifier register class For example asm ("memw(%0++%1) = %2" : : "r"(addr),"a"(mod),"r"(val) : "memory") llvm-svn: 308761	2017-07-21 17:51:27 +00:00
Simon Dardis	0310eb7a67	[mips] Support -membedded-data and fix a related bug -membedded-data changes the location of constant data from the .sdata to the .rodata section. Previously it was (incorrectly) always located in the .rodata section. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D35686 llvm-svn: 308758	2017-07-21 17:19:00 +00:00
Matt Arsenault	ca7b0a1777	AMDGPU: Add instruction definitions for some scratch_* instructions Omit atomics for now since they probably aren't useful. llvm-svn: 308747	2017-07-21 15:36:16 +00:00
Petar Jovanovic	9494258223	[mips] Enable IAS by default for Android MIPS64 Follow up to r306280 in Clang. Enable IAS by default for Android MIPS64 (uses N64 ABI). Differential Revision: https://reviews.llvm.org/D35482 llvm-svn: 308742	2017-07-21 14:25:42 +00:00
Dmitry Preobrazhensky	abf2839478	[AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591 Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D35424 llvm-svn: 308740	2017-07-21 13:54:11 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Simon Pilgrim	32c377a1cf	[X86][SSE] Add pre-AVX2 support for (i32 bitcast(v32i1)) -> 2xMOVMSK Currently we only support (i32 bitcast(v32i1)) using the AVX2 VPMOVMSKB ymm instruction. This patch adds support for splitting pre-AVX2 targets into 2 x (V)PMOVMSKB xmm instructions and merging the integer results. In future we could probably generalize this to handle more cases. Differential Revision: https://reviews.llvm.org/D35303 llvm-svn: 308723	2017-07-21 09:58:50 +00:00
Craig Topper	31140ade70	[AVX-512] Fix a bug that prevented some non-temporal loads from using the movntdqa instruction. The bitconverts here had an input type of 128-bits and an output type of 256 bits. The input type should also have been 256 bits. llvm-svn: 308702	2017-07-21 00:40:42 +00:00
Evandro Menezes	55459609c8	[AArch64] Adjust the cost model for Exynos M1 and M2 Add the cost for the EXT instructions and explicitly add the cost for a few instructions that were implied by the coarse model. llvm-svn: 308697	2017-07-20 23:41:50 +00:00
Tim Northover	7b6d66c0c9	Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64. It revealed a bug in the Localizer pass which has now been fixed. This includes the fix for SUBREG_TO_REG committed separately last time. llvm-svn: 308688	2017-07-20 22:58:38 +00:00
Artem Belevich	d7a73824e4	[NVPTX] Add lowering of i128 params. The patch adds support of i128 params lowering. The changes are quite trivial to support i128 as a "special case" of integer type. With this patch, we lower i128 params the same way as aggregates of size 16 bytes: .param .b8 _ [16]. Currently, NVPTX can't deal with the 128 bit integers: * in some cases because of failed assertions like ValVTs.size() == OutVals.size() && "Bad return value decomposition" * in other cases emitting PTX with .i128 or .u128 types (which are not valid [1]) [1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types Differential Revision: https://reviews.llvm.org/D34555 Patch by: Denys Zariaiev (denys.zariaiev@gmail.com) llvm-svn: 308675	2017-07-20 21:16:03 +00:00
Matt Arsenault	e5456ce5e5	AMDGPU: Rename _RTN atomic instructions Move the _RTN to the end of the name. It reads better if the other addressing mode components line up with the non-RTN version. It is also more convenient to define saddr variants of FLAT atomics to have the RTN last, and it is good to have a consistent naming scheme. llvm-svn: 308674	2017-07-20 21:06:04 +00:00
Matt Arsenault	db78273b6e	Add an ID field to StackObjects On AMDGPU SGPR spills are really spilled to another register. The spiller creates the spills to new frame index objects, which is used as a placeholder. This will eventually be replaced with a reference to a position in a VGPR to write to and the frame index deleted. It is most likely not a real stack location that can be shared with another stack object. This is a problem when StackSlotColoring decides it should combine a frame index used for a normal VGPR spill with a real stack location and a frame index used for an SGPR. Add an ID field so that StackSlotColoring has a way of knowing the different frame index types are incompatible. llvm-svn: 308673	2017-07-20 21:03:45 +00:00
Artem Belevich	fef0804e35	Changed EOL back to LF. NFC. llvm-svn: 308671	2017-07-20 20:57:51 +00:00
Mandeep Singh Grang	d41ac895bb	[COFF, ARM64, CodeView] Add support to emit CodeView debug info for ARM64 COFF Reviewers: compnerd, ruiu, rnk, zturner Reviewed By: rnk Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35518 llvm-svn: 308665	2017-07-20 20:20:00 +00:00
James Y Knight	bb76d48d59	[SPARC] Clean up the support for disabling fsmuld and fmuls instructions. Summary: Also enable no-fsmuld for sparcv7 (which doesn't have the instruction). The previous code which used a post-processing pass to do this was unnecessary; disabling the instruction is entirely sufficient. Reviewers: jacob_hansen, ekedaigle Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35576 llvm-svn: 308661	2017-07-20 20:09:11 +00:00
Krzysztof Parzyszek	f3a778d757	Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane This should eliminate most uses of countPopulation and Log2_32 on the lane mask values. llvm-svn: 308658	2017-07-20 19:43:19 +00:00
Craig Topper	27c12e088e	[X86] Allow masks with more than 6 bits set on the x << (y & mask) optimization for the 64-bit memory shifts. llvm-svn: 308657	2017-07-20 19:29:58 +00:00
Krzysztof Parzyszek	e9f0c1e031	Use LaneBitmask::getLane in a few more places llvm-svn: 308655	2017-07-20 19:15:56 +00:00
Matt Arsenault	c37fe66ec5	AMDGPU: Add encoding for carryless add/sub instructions llvm-svn: 308639	2017-07-20 17:42:47 +00:00
Matt Arsenault	f65c5ac9c9	AMDGPU: Add encodings for global atomics llvm-svn: 308638	2017-07-20 17:31:56 +00:00
Stefan Maksimovic	be0bc71e02	Reland r308585 Builder clang-x86_64-linux-abi-test apparently failed due to a spurious error unrelated to the changes r308585 introduced. llvm-svn: 308612	2017-07-20 13:08:18 +00:00
Javed Absar	e9599e39fe	[ARM] Simplify ExpandPseudoInst. NFC. Remove headers not required and convert to range-loop Reviewed by: @mcrosier Differential Revision: https://reviews.llvm.org/D35626 llvm-svn: 308607	2017-07-20 12:35:37 +00:00
Simon Atanasyan	fb953926b1	[mips] Support `long_call/far/near` attributes passed by front-end This patch adds handling of the `long_call`, `far`, and `near` attributes passed by front-end. The patch depends on D35479. Differential revision: https://reviews.llvm.org/D35480. llvm-svn: 308606	2017-07-20 12:19:26 +00:00
Diana Picus	7534b28291	Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64." This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it broke the test-suite on the GlobalISel bot. llvm-svn: 308603	2017-07-20 11:36:03 +00:00
Stefan Maksimovic	3793a82b28	Revert r308585 Builder clang-x86_64-linux-abi-test seems to fail after this change llvm-svn: 308597	2017-07-20 09:57:14 +00:00
Stefan Maksimovic	8539f77bc3	[mips] Fix fp select machine verifier errors Introduced FSELECT node necesary when lowering ISD::SELECT which has i32, f64, f64 as its operands. SEL_D instruction required that its output and first operand of a SELECT node, which it used, have matching types. MTC1_D64 node introduced to aid FSELECT lowering. This fixes machine verifier errors on following tests: CodeGen/Mips/llvm-ir/select-dbl.ll CodeGen/Mips/llvm-ir/select-flt.ll CodeGen/Mips/select.ll Differential Revision: https://reviews.llvm.org/D35408 llvm-svn: 308595	2017-07-20 09:21:10 +00:00
Craig Topper	33225ef314	[X86] Use SARX/SHLX/SHLX instructions for (shift x (and y, (BitWidth-1))) Fixes PR33841. llvm-svn: 308591	2017-07-20 06:19:55 +00:00
Matt Arsenault	04004716ff	AMDGPU: Correct encoding for global instructions The soffset field needs to be be set to 0x7f to disable it, not 0. 0 is interpreted as an SGPR offset. This should be enough to get basic usage of the global instructions working. Technically it is possible to use an SGPR_32 offset, but I'm not sure if it's correct with 64-bit pointers, but that is not handled now. This should also be cleaned up to be more similar to how different MUBUF modes are handled, and to have InstrMappings between the different types. llvm-svn: 308583	2017-07-20 05:17:54 +00:00
Tim Northover	967d4aa7a0	GlobalISel: partially revert r308540. An unfinished and untested implementation of ISel for G_UNMERGE_VALUES crept in by mistake. llvm-svn: 308542	2017-07-19 22:11:08 +00:00
Tim Northover	0e0b3c97dd	GlobalISel: fix SUBREG_TO_REG implementation. The first argument needs to be an immediate rather than a register. Should fix some crashes in the verifier bot. llvm-svn: 308540	2017-07-19 22:08:08 +00:00
Martin Storsjo	b2e9fcfca4	[AArch64] Force relocations for all ADRP instructions This generalizes an existing fix from ELF to MachO and COFF. Test that an ADRP to a local symbol whose offset is known at assembly time still produces relocations, both for MachO and COFF. Test that an ADRP without a @page modifier on MachO fails (previously it didn't). Differential Revision: https://reviews.llvm.org/D35544 llvm-svn: 308518	2017-07-19 20:14:32 +00:00
Martin Storsjo	2ff5f5d681	[AArch64, COFF] Interpret .align as power of two for COFF as well Differential Revision: https://reviews.llvm.org/D35545 llvm-svn: 308517	2017-07-19 20:14:24 +00:00
Krzysztof Parzyszek	ac01994db9	[Hexagon] Fix a bug in r308502: post-inc offset is always 0 llvm-svn: 308510	2017-07-19 19:17:32 +00:00
Davide Italiano	5fc5d0a406	[X86] Don't try to scale down if that exceeds the bitwidth. Fixes the crash reported in PR33844. llvm-svn: 308503	2017-07-19 18:09:46 +00:00
Krzysztof Parzyszek	3fce9d9c49	[Hexagon] Handle subregisters in areMemAccessesTriviallyDisjoint llvm-svn: 308502	2017-07-19 18:03:46 +00:00
Tim Northover	d59fbec8e2	GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64. llvm-svn: 308493	2017-07-19 16:47:07 +00:00
Krzysztof Parzyszek	b449dc189a	[Hexagon] Handle subregisters and non-immediates in getBaseAndOffset llvm-svn: 308485	2017-07-19 15:39:28 +00:00
Javed Absar	2cb0c95031	[ARM] Unify handling of M-Class system registers This patch cleans up and fixes issues in the M-Class system register handling: 1. It defines the system registers and the encoding (SYSm values) in one place: a new ARMSystemRegister.td using SearchableTable, thereby removing the hand-coded values which existed in multiple places. 2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed! Ref: ARMv6/7/8M architecture reference manual. Reviewed by: @t.p.northover, @olist01, @john.brawn Differential Revision: https://reviews.llvm.org/D35209 llvm-svn: 308456	2017-07-19 12:57:16 +00:00
Simon Pilgrim	e5c7925c5e	[X86][XOP] Use default AVX2 lowering for v4i64 ashr by splat constants XOP shifts only support 128-bit vectors, so we were ending up with less optimal codegen requiring constants llvm-svn: 308430	2017-07-19 10:29:31 +00:00
Jonas Paulsson	4690193dec	[SystemZ] Minor fixing in SystemZScheduleZ14.td Some minor corrections for recently added instructions. Review: Ulrich Weigand llvm-svn: 308429	2017-07-19 10:19:21 +00:00
James Y Knight	0e4ce61d2a	[SPARC] Add missing variable initialization after r308343. llvm-svn: 308415	2017-07-19 04:08:42 +00:00
Craig Topper	106b5b6856	AMD znver1 Initial Scheduler model Summary: This patch adds the following 1. Adds a skeleton scheduler model for AMD Znver1. 2. Introduces the znver1 execution units and pipes. 3. Caters the instructions based on the generic scheduler classes. 4. Further additions to the scheduler model with instruction itineraries will be carried out incrementally based on a. Instructions types b. Registers used 5. Since itineraries are not added based on instructions, throughput information are bound to change when incremental changes are added. 6. Scheduler testcases are modified accordingly to suit the new model. Patch by Ganesh Gopalasubramanian. With minor formatting tweaks from me. Reviewers: craig.topper, RKSimon Subscribers: javed.absar, shivaram, ddibyend, vprasad Differential Revision: https://reviews.llvm.org/D35293 llvm-svn: 308411	2017-07-19 02:45:14 +00:00
Evandro Menezes	e8411cba87	[AArch64] Adjust the feature set for Exynos M2 Add fusion of AES operations. llvm-svn: 308388	2017-07-18 22:51:25 +00:00
Mandeep Singh Grang	d857b4ca98	[COFF, ARM64] Reserve X18 register by default Reviewers: compnerd, rnk, ruiu, mstorsjo Reviewed By: mstorsjo Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35531 llvm-svn: 308358	2017-07-18 20:41:33 +00:00
James Y Knight	dda87cab7d	[Sparc] Added software multiplication/division feature Added a feature to the Sparc back-end that replaces the integer multiply and divide instructions with calls to .mul/.sdiv/.udiv. This is a step towards having full v7 support. Patch by: Eric Kedaigle Differential Revision: https://reviews.llvm.org/D35500 llvm-svn: 308343	2017-07-18 19:08:38 +00:00
Matt Arsenault	254ad3de5c	AMDGPU: Annotate necessity of flat-scratch-init As an approximation of the existing handling to avoid regressions. Fixes using too many registers with calls on subtargets with the SGPR allocation bug. llvm-svn: 308326	2017-07-18 16:44:58 +00:00
Matt Arsenault	1cc47f8413	AMDGPU: Figure out private memory regs after lowering Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are currently only used for the used register location, and not for determining their input argument register. This is better because it avoids the need to try to predict whether a call will be emitted from the IR, and also detects stack objects introduced by legalization. Test changes are from the HasStackObjects check being more accurate since stack objects introduced during legalization are now known. llvm-svn: 308325	2017-07-18 16:44:56 +00:00
Geoff Berry	9962faed2b	[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2) Summary: Avoid HW prefetcher instruction tag collisions in loops by inserting MOVs to change the base address register of strided loads. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D35366 llvm-svn: 308324	2017-07-18 16:14:22 +00:00
Simon Pilgrim	483927aefb	[x86, CGP] increase memcmp() expansion up to 4 load pairs It should be a win to avoid going out to the system lib for all small memcmp() calls using scalar ops. For x86 32-bit, this means most everything up to 16 bytes. For 64-bit, that doubles because we can do 8-byte loads. Notes: Reduced from 4 to 2 loads for -Os behavior, which might not be optimal in all cases. It's effectively a question of how much do we trust the system implementation. Linux and macOS (and Windows I assume, but did not test) have optimized memcmp() code for x86, so it's probably not bad either way? PPC is using 8/4 for defaults on these. We do not expand at all for -Oz. There are still potential improvements to make for the CGP expansion IR and/or lowering such as avoiding select-of-constants (D34904) and not doing zexts to the max load type before doing a compare. We have special-case SSE/AVX codegen for (memcmp(x, y, 16/32) == 0) that will no longer be produced after this patch. I've shown the experimental justification for that change in PR33329: https://bugs.llvm.org/show_bug.cgi?id=33329#c12 TLDR: While the vector code is a likely winner, we can't guarantee that it's a winner in all cases on all CPUs, so I'm willing to sacrifice it for the greater good of expanding all small memcmp(). If we want to resurrect that codegen, it can be done by adjusting the CGP params or poking a hole to let those fall-through the CGP expansion. Committed on behalf of Sanjay Patel Differential Revision: https://reviews.llvm.org/D35067 llvm-svn: 308322	2017-07-18 15:55:30 +00:00
Sumanth Gundapaneni	d5aa0f3464	[Hexagon] Emit lookup tables in text section based on a flag The flag "-hexagon-emit-lut-text" (defaulted to false) is added to decide on where to keep the switch generated lookup table. Differential Revision: https://reviews.llvm.org/D34818 llvm-svn: 308316	2017-07-18 15:31:37 +00:00
Nicolai Haehnle	a253e4c028	AMDGPU: Fix crash when folding immediates into multiple uses Summary: When an immediate is folded by constant folding, we re-scan the entire use list for two reasons: 1. The constant folding may have created a new use of the same reg. 2. The constant folding may have removed an additional use in the list we're currently traversing (e.g., constant folding an S_ADD_I32 c, c). However, this could previously lead to a crash when an unrelated use was added twice into the FoldList. Since we re-scan the whole list anyway, we might as well just clear the FoldList again before we do so. Using a MIR test to show this because real code seems to trigger the issue only in connection with some really subtle control flow structures. Fixes GL45-CTS.shading_language_420pack.binding_images on gfx9. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35416 llvm-svn: 308314	2017-07-18 14:54:41 +00:00
Sam Kolton	4685b70a77	[AMDGPU] resubmit r308179: CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions llvm-svn: 308310	2017-07-18 14:23:26 +00:00
Daniel Sanders	40b66d646e	[globalisel][tablegen] Enable the import of rules involving fma. Summary: G_FMA was recently added to GlobalISel which enables the import of rules involving fma. Add the mapping to allow it. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35130 llvm-svn: 308308	2017-07-18 14:10:07 +00:00
Hiroshi Inoue	393ef84cb6	fix formatting issue; NFC llvm-svn: 308305	2017-07-18 13:31:40 +00:00
Dmitry Preobrazhensky	30fc523984	[AMDGPU][MC] Corrected disassembler for proper decoding of v_mqsad_u32_u8 See Bug 33639: https://bugs.llvm.org//show_bug.cgi?id=33639 Reviewers: vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D34892 llvm-svn: 308303	2017-07-18 13:12:48 +00:00
Stefan Maksimovic	58f225b371	[mips] Alter register classes for MSA pseudo f16 instructions This change introduces additional machine instructions in functions dealing with the expansion of msa pseudo f16 instructions due to register classes being inappropriate when checked with machine verifier. Differential Revision: https://reviews.llvm.org/D34276 llvm-svn: 308301	2017-07-18 12:05:35 +00:00
Dmitry Preobrazhensky	00deef8f00	[AMDGPU][MC] Optimized IsRegIntersect function Optimized IsRegIntersect by using MCRegAliasIterator See Bug 33800: https://bugs.llvm.org//show_bug.cgi?id=33800 Reviewers: arsenm, artem.tamazov Differential Revision: https://reviews.llvm.org/D35452 llvm-svn: 308294	2017-07-18 11:14:02 +00:00
Javed Absar	5b8e487b47	[ARM\|CodeGen] Improve the code in FastISel Cleaned up the code in FastISel a bit. Had to add make_range to MCInstrDesc as that was needed and seems missing. Reviewed by: @t.p.northover Differential Revision: https://reviews.llvm.org/D35494 llvm-svn: 308291	2017-07-18 10:19:48 +00:00
Diana Picus	da25d5b8b0	[ARM] GlobalISel: Support G_(S\|U)REM for s8 and s16 Widen to s32, and then do whatever Lowering/Custom/Libcall action the subtarget wants. llvm-svn: 308285	2017-07-18 10:07:01 +00:00
Florian Hahn	3530094de6	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A73. Summary: Using 16 byte alignment is beneficial on Cortex-A73, similar to Cortex-A72 (added in D34961). Reviewers: mcrosier, t.p.northover, aadg, silviu.baranga Reviewed By: t.p.northover Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35493 llvm-svn: 308283	2017-07-18 09:31:18 +00:00
Dmitry Preobrazhensky	095ec3da81	[AMDGPU][MC] Added missing VOP3P opcodes Added support of the following opcodes: v_pk_sub_u16 v_pk_mad_i16 v_pk_mad_u16 See Bug 33593: https://bugs.llvm.org//show_bug.cgi?id=33593 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D34890 llvm-svn: 308281	2017-07-18 09:24:10 +00:00
Jonas Paulsson	d667417e80	[SystemZ, AsmParser] Enable the mnemonic spell corrector. This enables the suggestions of other mnemonics when invalid ones are specified. Review: Ulrich Weigand llvm-svn: 308280	2017-07-18 09:17:00 +00:00
Chandler Carruth	9a7442d088	Revert r308179 which causes tablegen to spam stderr on every build. Original commit log: [AMDGPU] CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions llvm-svn: 308270	2017-07-18 07:40:47 +00:00
Craig Topper	f54a500101	[X86] Prevent an assertion failure if a gather intrinsic is passed a non-constant scale value. This isn't legal code, but we shouldn't crash on it. Now we just don't convert the gather intrinsic if the scale isn't constant and let it go through to isel where we'll report an isel failure. Fixes PR33772. llvm-svn: 308267	2017-07-18 06:49:23 +00:00
Matt Arsenault	e15855d9e3	AMDGPU: Annotate features from x work item/group IDs. This wasn't necessary before since they are always enabled for kernels, but this is necessary if they need to be forwarded to a callable function. llvm-svn: 308226	2017-07-17 22:35:50 +00:00
Mandeep Singh Grang	6d6f2fa198	[COFF, ARM64] Correct the data layout string for COFF ARM64 target llvm-svn: 308223	2017-07-17 21:25:19 +00:00
Geoff Berry	0cf9e702bf	[AArch64][Falkor] Address some stylistic review comments. NFC. llvm-svn: 308211	2017-07-17 20:19:05 +00:00
Martin Storsjo	2f24e93481	[AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208	2017-07-17 20:05:19 +00:00
Ulrich Weigand	f2968d58cb	[SystemZ] Add support for IBM z14 processor (3/3) This adds support for the new 128-bit vector float instructions of z14. Note that these instructions actually only operate on the f128 type, since only each 128-bit vector register can hold only one 128-bit float value. However, this is still preferable to the legacy 128-bit float instructions, since those operate on pairs of floating-point registers (so we can hold at most 8 values in registers), while the new instructions use single vector registers (so we hold up to 32 value in registers). Adding support includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions. This includes allocating the f128 type now to the VR128BitRegClass instead of FP128BitRegClass. - Scheduler description support for the instructions. Note that for a small number of operations, we have no new vector instructions (like integer <-> 128-bit float conversions), and so we use the legacy instruction and then reformat the operand (i.e. copy between a pair of floating-point registers and a vector register). llvm-svn: 308196	2017-07-17 17:44:20 +00:00
Ulrich Weigand	33435c4c9c	[SystemZ] Add support for IBM z14 processor (2/3) This adds support for the new 32-bit vector float instructions of z14. This includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions, including new LLVM intrinsics. - Scheduler description support for the instructions. - Update to the vector cost function calculations. In general, CodeGen support for the new v4f32 instructions closely matches support for the existing v2f64 instructions. llvm-svn: 308195	2017-07-17 17:42:48 +00:00
Ulrich Weigand	2b3482fe85	[SystemZ] Add support for IBM z14 processor (1/3) This patch series adds support for the IBM z14 processor. This part includes: - Basic support for the new processor and its features. - Support for new instructions (except vector 32-bit float and 128-bit float). - CodeGen for new instructions, including new LLVM intrinsics. - Scheduler description for the new processor. - Detection of z14 as host processor. Support for the new 32-bit vector float and 128-bit vector float instructions is provided by separate patches. llvm-svn: 308194	2017-07-17 17:41:11 +00:00
Krzysztof Parzyszek	5eef92eb7f	[Hexagon] Remove custom lowering of loads of v4i16 The target-independent lowering works fine, except concatenating 32-bit words. Add a pattern to generate A2_combinew instead of 64-bit asl/or. llvm-svn: 308186	2017-07-17 15:45:45 +00:00
Nirav Dave	8d0ecbedbe	Avoid store merge to f128 in context of noimpiccitfloat NFCI. Prevent store merge from merging stores into an invalid 128-bit store (realized as a f128 value in the context of the noimplicitfloat attribute). Previously, such stores are immediately split back into valid stores. llvm-svn: 308184	2017-07-17 15:09:47 +00:00
Sam Kolton	a2b9e2f755	[AMDGPU] CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions Summary: Previously, CodeGen checked first src operand type to determine if omod is supported by instruction. This isn't correct for some instructions: e.g. V_CMP_EQ_F32 has floating-point src operands but desn't support omod. Changed .td files to check if dst operand instead of src operand. Reviewers: arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D35350 llvm-svn: 308179	2017-07-17 14:23:38 +00:00
Simon Pilgrim	1cbe8c2ca5	[X86][AVX512] Add lowering of vXi32/vXi64 ISD::ROTL/ISD::ROTR Add support for lowering to ISD::ROTL/ISD::ROTR, including rotate by immediate Differential Revision: https://reviews.llvm.org/D35463 llvm-svn: 308177	2017-07-17 14:11:30 +00:00
Javed Absar	dd2c29ef83	[CodeGen] Add begin-end iterators to MachineInstr Convert iteration over operands to range-loop. Reviewed by: @rovka, @echristo Differential Revision: https://reviews.llvm.org/D35419 llvm-svn: 308173	2017-07-17 13:15:26 +00:00
Mandeep Singh Grang	a210f1d7bf	[COFF, ARM64] Add initial relocation types Reviewers: compnerd, ruiu, rnk Reviewed By: compnerd Subscribers: mstorsjo, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34857 llvm-svn: 308154	2017-07-17 00:05:32 +00:00
Konstantin Zhuravlyov	2ec725c9d8	AMDGPU: Fix amdgpu-flat-work-group-size/amdgpu-waves-per-eu check Differential Revision: https://reviews.llvm.org/D35433 llvm-svn: 308147	2017-07-16 19:38:47 +00:00
Konstantin Zhuravlyov	163af2ed7a	AMDGPU: Remove duplicate print outs from .AMDGPU.csdata Differential Revision: https://reviews.llvm.org/D35428 llvm-svn: 308145	2017-07-16 19:24:08 +00:00
Simon Pilgrim	64fff14bde	Strip trailing whitespace. NFCI llvm-svn: 308143	2017-07-16 18:37:23 +00:00
Amjad Aboud	4563c062b1	[X86] X86::CMOV to Branch heuristic based optimization. LLVM compiler recognizes opportunities to transform a branch into IR select instruction(s) - later it will be lowered into X86::CMOV instruction, assuming no other optimization eliminated the SelectInst. However, it is not always profitable to emit X86::CMOV instruction. For example, branch is preferable over an X86::CMOV instruction when: 1. Branch is well predicted 2. Condition operand is expensive, compared to True-value and the False-value operands In CodeGenPrepare pass there is a shallow optimization that tries to convert SelectInst into branch, but it is not enough. This commit, implements machine optimization pass that converts X86::CMOV instruction(s) into branch, based on a conservative heuristic. Differential Revision: https://reviews.llvm.org/D34769 llvm-svn: 308142	2017-07-16 17:39:56 +00:00
Simon Pilgrim	73ef87978f	[X86][SSE4A] Add EXTRQ/INSERTQ values to BTVER2 scheduling model llvm-svn: 308132	2017-07-16 12:06:06 +00:00
Hiroshi Inoue	7f46baff2c	fix typos in comments; NFC llvm-svn: 308127	2017-07-16 08:11:56 +00:00
Hiroshi Inoue	a9ee279e70	fix typos in comments; NFC llvm-svn: 308126	2017-07-16 07:48:48 +00:00
Simon Atanasyan	f217c7b7e2	[mips] Handle the `long-calls` feature flags in the MIPS backend If the `long-calls` feature flags is enabled, disable use of the `jal` instruction. Instead of that call a function by by first loading its address into a register, and then using the contents of that register. Differential revision: https://reviews.llvm.org/D35168 llvm-svn: 308087	2017-07-15 07:14:25 +00:00
NAKAMURA Takumi	19a652381b	SystemZCodeGen: Update libdeps. r308024 introduced LoopDataPrefetchPass. llvm-svn: 308086	2017-07-15 06:32:12 +00:00
Yonghong Song	fbfae984b4	bpf: fix a compilation bug due to unused variable for release build Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 308083	2017-07-15 06:08:08 +00:00
Matt Arsenault	b34635550a	AMDGPU: Return correct type during argument lowering The type needs to be casted back to the original argument type. Fixes an assert that for some reason is only run when using -debug. Includes an additional combine to avoid test regressions from having conversions mixed with multiple Assert[SZ]ext nodes. On subtargets where i16 is legal, this was producing an i32 register with an i16 AssertZExt, truncated to i16 with another i8 AssertZExt. t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0 t3: i16 = truncate t2 t5: i16 = AssertZext t3, ValueType:ch:i8 t6: i8 = truncate t5 t7: i32 = zero_extend t6 llvm-svn: 308082	2017-07-15 05:52:59 +00:00
Yonghong Song	9276ef05c8	bpf: generate better lowering code for certain select/setcc instructions Currently, for code like below, === inner_map = bpf_map_lookup_elem(outer_map, &port_key); if (!inner_map) { inner_map = &fallback_map; } === the compiler generates (pseudo) code like the below: === I1: r1 = bpf_map_lookup_elem(outer_map, &port_key); I2: r2 = 0 I3: if (r1 == r2) I4: r6 = &fallback_map I5: ... === During kernel verification process, After I1, r1 holds a state map_ptr_or_null. If I3 condition is not taken (path [I1, I2, I3, I5]), supposedly r1 should become map_ptr. Unfortunately, kernel does not recognize this pattern and r1 remains map_ptr_or_null at insn I5. This will cause verificaiton failure later on. Kernel, however, is able to recognize pattern "if (r1 == 0)" properly and give a map_ptr state to r1 in the above case. LLVM here generates suboptimal code which causes kernel verification failure. This patch fixes the issue by changing BPF insn pattern matching and lowering to generate proper codes if the righthand parameter of the above condition is a constant. A test case is also added. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 308080	2017-07-15 05:41:42 +00:00
Yi Kong	3b680d8d81	[AArch64] Avoid selecting XZR inline ASM memory operand Restricting register class to PointerRegClass for memory operands. Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since XZR cannot hold a memory pointer while SP is. Fixes PR33134. Differential Revision: https://reviews.llvm.org/D34999 llvm-svn: 308060	2017-07-14 21:46:16 +00:00
Geoff Berry	b1e8714af9	[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1) Summary: This patch is the first step in reducing HW prefetcher instruction tag collisions in inner loops for Falkor. It adds a pass that annotates IR loads with metadata to indicate that they are known to be strided loads, and adds a target lowering hook that translates this metadata to a target-specific MachineMemOperand flag. A follow on change will use this MachineMemOperand flag to re-write instructions to reduce tag collisions. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34963 llvm-svn: 308059	2017-07-14 21:44:12 +00:00
Davide Italiano	6fdfede10d	[AMDGPU] Throw away more dead code. NFCI. llvm-svn: 308055	2017-07-14 21:20:29 +00:00
Krzysztof Parzyszek	302a9d41c6	[Hexagon] Replace ISD opcode VPACK with VPACKE/VPACKO, NFC This breaks up pack-even and pack-odd into two separate operations. llvm-svn: 308049	2017-07-14 19:02:32 +00:00
Davide Italiano	502ac724ac	[AMDGPU] Garbage collect dead code. NFCI. Unbreaks the build with GCC7. llvm-svn: 308047	2017-07-14 18:47:29 +00:00
Alfred Huang	5b27072f57	[AMDGPU] Do not insert an instruction into worklist twice in movetovalu In moveToVALU(), move to vector ALU is performed, all instrs in the use chain will be visited. We do not want the same node to be pushed to the visit worklist more than once. Differential Revision: https://reviews.llvm.org/D34726 llvm-svn: 308039	2017-07-14 17:56:55 +00:00
Krzysztof Parzyszek	9c084fc55d	[Hexagon] Add intrinsics for data cache operations This is the LLVM part, adding definitions for void @llvm.hexagon.Y2.dccleana(i8) void @llvm.hexagon.Y2.dccleaninva(i8) void @llvm.hexagon.Y2.dcinva(i8) void @llvm.hexagon.Y2.dczeroa(i8) void @llvm.hexagon.Y4.l2fetch(i8, i32) void @llvm.hexagon.Y5.l2fetch(i8, i64) The clang part will follow. llvm-svn: 308032	2017-07-14 15:58:48 +00:00
Simon Dardis	45b2277a33	Revert "Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."" FileCheck is crashing on in the input file, so reverting again while I investigate. This reverts r308023. llvm-svn: 308030	2017-07-14 15:08:05 +00:00
Jonas Paulsson	b144af49c1	[SystemZ] Minor fixing in SystemZScheduleZ196.td Some minor corrections for the recently added instructions. Review: Ulrich Weigand llvm-svn: 308028	2017-07-14 14:30:46 +00:00
Jonas Paulsson	89ca10de33	[SystemZ] Enable LoopDataPrefetch pass. Loop data prefetching has shown some improvements on benchmarks, and is enabled at -O1 and above. Review: Ulrich Weigand llvm-svn: 308024	2017-07-14 13:52:38 +00:00
Simon Dardis	b3529841db	Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."" Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 The last version of this patch broke one of the expensive checks buildbots, this version changes the failing test/MC/Mips/mt/invalid.s and other invalid tests to write the errors to a file and run FileCheck on that, rather than relying on the 'not llvm-mc ... <%s 2>&1 \| Filecheck %s' idiom. Hopefully this will sarisfy the buildbot. llvm-svn: 308023	2017-07-14 13:44:12 +00:00
Zoran Jovanovic	0e03935182	Reverting commit 308011. llvm-svn: 308017	2017-07-14 10:52:22 +00:00
Zoran Jovanovic	d374c5993b	[mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SP Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. The following instructions are examined and transformed, if possible: ADDIU instruction is transformed into 16-bit instruction ADDIUSP ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP Function InRange is changed to avoid left shifting of negative values, since that caused some sanitizer tests to fail (so the previous patch Differential Revision: https://reviews.llvm.org/D34511 llvm-svn: 308011	2017-07-14 10:13:11 +00:00
Diana Picus	87a7067983	[ARM] GlobalISel: Support G_BRCOND Insert a TSTri to set the flags and a Bcc to branch based on their values. This is a bit inefficient in the (common) cases where the condition for the branch comes from a compare right before the branch, since we set the flags both as part of the compare lowering and as part of the branch lowering. We're going to live with that until we settle on a principled way to handle this kind of situation, which occurs with other patterns as well (combines might be the way forward here). llvm-svn: 308009	2017-07-14 09:46:06 +00:00
Jonas Paulsson	a84f9f5364	[SystemZ] Minor fixing in SystemZScheduleZEC12.td Some minor corrections for the recently added instructions. Review: Ulrich Weigand llvm-svn: 308007	2017-07-14 09:18:18 +00:00
Sam Parker	2893448576	[ARM] Allow rematerialization of ARM Thumb literal pool loads Constants are crucial for code size in the ARM Thumb-1 instruction set. The 16 bit instruction size often does not offer enough space for immediate arguments. This means that additional instructions are frequently used to load constants into registers. Since constants are hoisted, this can lead to significant register spillage if they are used multiple times in a single function. This can be avoided by rematerialization, i.e. recomputing a constant instead of reloading it from the stack. This patch fixes the rematerialization of literal pool loads in the ARM Thumb instruction set. Patch by Philip Ginsbach Differential Revision: https://reviews.llvm.org/D33936 llvm-svn: 308004	2017-07-14 08:23:56 +00:00
Eric Christopher	4e332c7cf1	Add a set of comments explaining why getSubtargetImpl() is deleted on these targets. llvm-svn: 307999	2017-07-14 04:33:43 +00:00
Matt Arsenault	23e4df6a59	AMDGPU: Detect kernarg segment pointer This is necessary to pass the kernarg segment pointer to callee functions. Also don't unconditionally enable for kernels. llvm-svn: 307978	2017-07-14 00:11:13 +00:00
Stanislav Mekhanoshin	dc2890a887	[AMDGPU] fcaninicalize optimization for GFX9+ Since GFX9 supports denorm modes for v_min_f32/v_max_f32 that is possible to further optimize fcanonicalize and remove it if applied to min/max given their operands are known not to be an sNaN or that sNaNs are not supported. Additionally we can remove fcanonicalize if denorms are supported for the VT and we know that its argument is never a NaN. Differential Revision: https://reviews.llvm.org/D35335 llvm-svn: 307976	2017-07-13 23:59:15 +00:00
Matt Arsenault	6b93046f29	AMDGPU: Annotate call graph with used features Previously this wouldn't detect used features indirectly used in callee functions. llvm-svn: 307967	2017-07-13 21:43:42 +00:00
Jakub Kuderski	34327d28fd	[NFC] Move DEBUG_TYPE below includes in Hexagon llvm-svn: 307947	2017-07-13 20:26:45 +00:00
Simon Dardis	1558ee3365	Revert "[mips][mt][6/7] Add support for mftr, mttr instructions." This reverts r307836, it broke one of the buildbots. Reverting while I investigate. llvm-svn: 307939	2017-07-13 19:27:41 +00:00
Krzysztof Parzyszek	89b2d7c938	[Hexagon] Use VSPLAT instead of COMBINE for vectors of type v2i32, NFC This cleans up the vector shift patterns. llvm-svn: 307935	2017-07-13 18:17:58 +00:00

1 2 3 4 5 ...

43414 Commits