llvm-project

Commit Graph

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	83c4b68720	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
Ahmed Bougacha	cf49b523a0	[AArch64][FastISel] Don't even try to select vector icmps. We used to try to constant-fold them to i32 immediates. Given that fast-isel doesn't otherwise support vNi1, when selecting the result users, we'd fallback to SDAG anyway. However, if the users were in another block, we'd insert broken cross-class copies (GPR32 to FPR64). Give up, let SDAG agree with itself on a vNi1 legalization strategy. llvm-svn: 252364	2015-11-06 23:16:53 +00:00
Ahmed Bougacha	b49eb3ab4b	[X86] Fold (trunc (i32 (zextload i16))) into vbroadcast. When matching non-LSB-extracting truncating broadcasts, we now insert the necessary SRL. If the scalar resulted from a load, the SRL will be folded into it, creating a narrower, offset, load. However, i16 loads aren't Desirable, so we get i16->i32 zextloads. We already catch i16 aextloads; catch these as well. llvm-svn: 252363	2015-11-06 23:16:48 +00:00
Ahmed Bougacha	05a0514b12	[X86] SRL non-LSB extracts when folding to truncating broadcasts. Now that we recognize this, we can support it instead of bailing out. That is, we can fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc (srl Y, 16))))) llvm-svn: 252362	2015-11-06 23:16:43 +00:00
Ahmed Bougacha	68614a36d1	[X86] Don't fold non-LSB extracts into truncating broadcasts. We used to incorrectly assume that the offset we're extracting from was a multiple of the element size. So, we'd fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc Y)))) whereas we should have extracted the higher bits from X. Instead, bail out if the assumption doesn't hold. llvm-svn: 252361	2015-11-06 23:16:38 +00:00
Tom Stellard	41b7e63040	AMDGPU/SI: Refactor VOP[12C] tablegen definitions Summary: Pass the VOPProfile object all the way through to *_m multiclasses. This will allow us to do more simplifications in the future. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13437 llvm-svn: 252339	2015-11-06 20:56:18 +00:00
Andrew Kaylor	4731bea3e5	Improved the operands commute transformation for X86-FMA3 instructions. All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335	2015-11-06 19:47:25 +00:00
Dan Gohman	4b96d8d1ff	[WebAssembly] Make expression-stack pushing explicit Modelling of the expression stack is evolving. This patch takes another step by making pushes explicit. Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252334	2015-11-06 19:45:01 +00:00
Matt Arsenault	f59e538937	AMDGPU: Cleanup includes llvm-svn: 252328	2015-11-06 18:23:00 +00:00
Matt Arsenault	0c90e9501e	AMDGPU: Create emergency stack slots during frame lowering Test has a bogus verifier error which will be fixed by later commits. llvm-svn: 252327	2015-11-06 18:17:45 +00:00
Matt Arsenault	08f14de244	AMDGPU: Remove unused scratch resource operands The SGPR spill pseudos don't actually use them. llvm-svn: 252324	2015-11-06 18:07:53 +00:00
Matt Arsenault	3931948bb6	AMDGPU: Add pass to detect used kernel features Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323	2015-11-06 18:01:57 +00:00
Matt Arsenault	4dc7a5a5c6	AMDGPU: Fix hardcoded alignment of spill. Instead of forcing 4 alignment when spilled, set register class alignments. llvm-svn: 252322	2015-11-06 17:54:47 +00:00
Matt Arsenault	623e6fd466	AMDGPU: Hack for VS_32 register pressure For some reason VS_32 ends up factoring into the pressure heuristics even though we should never see a virtual register with this class. When SGPRs are reserved for register spilling, this for some reason triggers reg-crit scheduling. Setting isAllocatable = 0 may help with this since that seems to remove it from the default implementation's generated table. llvm-svn: 252321	2015-11-06 17:54:43 +00:00
Reid Kleckner	b8fd162fc5	[WinEH] Mark funclet entries and exits as clobbering all registers Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318	2015-11-06 17:06:38 +00:00
Jun Bum Lim	22fe15ee86	[AArch64]Enable the narrow ld promotion only on profitable microarchitectures The benefit from converting narrow loads into a wider load (r251438) could be micro-architecturally dependent, as it assumes that a single load with two bitfield extracts is cheaper than two narrow loads. Currently, this conversion is enabled only in cortex-a57 on which performance benefits were verified. llvm-svn: 252316	2015-11-06 16:27:47 +00:00
Daniel Sanders	5762a4f9d1	[mips][ias] Range check uimm4 operands and fixed a bug this revealed. Summary: The bug was that the sldi instructions have immediate widths dependent on their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit immediate. All of these were using 4-bit immediates previously. Reviewers: vkalintiris Subscribers: llvm-commits, atanasyan, dsanders Differential Revision: http://reviews.llvm.org/D14018 llvm-svn: 252297	2015-11-06 12:41:43 +00:00
Daniel Sanders	38ce0f629c	[mips][ias] Range check uimm3 operands. Summary: Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14016 llvm-svn: 252296	2015-11-06 12:31:27 +00:00
Daniel Sanders	ea4f653d18	[mips][ias] Range check uimm2 operands and fix a bug this revealed. Summary: The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA (unlike the MSA version) failed to account for the off-by-one encoding of the immediate. The range is actually 1..4 rather than 0..3. Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14015 llvm-svn: 252295	2015-11-06 12:22:31 +00:00
Daniel Sanders	52da7af4d2	[mips][ias] Range check uimmz operands. Reviewers: vkalintiris Subscribers: dsanders, atanasyan, llvm-commits Differential Revision: http://reviews.llvm.org/D14013 llvm-svn: 252294	2015-11-06 12:11:03 +00:00
Vasileios Kalintiris	b04672cade	[mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes. Summary: Without these patterns we would generate a complete LL/SC sequence. This would be problematic for memory regions marked as WRITE-only or READ-only, as the instructions LL/SC would read/write to the protected memory regions correspondingly. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14397 llvm-svn: 252293	2015-11-06 12:07:20 +00:00
Tom Stellard	1e1b05db24	AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291	2015-11-06 11:45:14 +00:00
Reid Kleckner	51460c139e	[WinEH] Split EH_RESTORE out of CATCHRET for 32-bit EH This adds the EH_RESTORE x86 pseudo instr, which is responsible for restoring the stack pointers: EBP and ESP, and ESI if stack realignment is involved. We only need this on 32-bit x86, because on x64 the runtime restores CSRs for us. Previously we had to keep the CATCHRET instruction around during SEH so that we could convince X86FrameLowering to restore our frame pointers. Now we can split these instructions earlier. This was confusing, because we had a return instruction which wasn't really a return and was ultimately going to be removed by X86FrameLowering. This change also simplifies X86FrameLowering, which really shouldn't be building new MBBs. No observable functional change currently, but with the new register mask stuff in D14407, CATCHRET will become a register allocator barrier, and our existing tests rely on us having reasonable register allocation around SEH. llvm-svn: 252266	2015-11-06 01:49:05 +00:00
Tim Northover	775aaeb765	Remove windows line endings introduced by r252177. NFC. llvm-svn: 252217	2015-11-05 21:54:58 +00:00
Reid Kleckner	6ddae31045	[WinEH] Fix funclet prologues with stack realignment We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requiring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210	2015-11-05 21:09:49 +00:00
Dan Gohman	b9ce5a8b6c	[WebAssembly] Fix copypasta. Noticed by dschuff in http://reviews.llvm.org/rL252203 llvm-svn: 252208	2015-11-05 20:59:49 +00:00
Dan Gohman	da7f428a4a	[WebAssembly] Rename Immediate instructions to Const. This more closely reflects the naming convention in the spec. llvm-svn: 252204	2015-11-05 20:44:29 +00:00
Dan Gohman	af29bd4fd4	[WebAssembly] Add AsmString strings for most instructions. Mangling type information into MachineInstr opcode names was a temporary measure, and it's starting to get hairy. At the same time, the MC instruction printer wants to use AsmString strings for printing. This patch takes the first step, starting the process of adding AsmStrings for instructions. llvm-svn: 252203	2015-11-05 20:42:30 +00:00
Dan Gohman	d7ffb919c1	[WebAssembly] Update wasm builtin functions to match spec changes. The page_size operator has been removed from the spec, and the resize_memory operator has been changed to grow_memory. llvm-svn: 252202	2015-11-05 20:16:59 +00:00
Sanjay Patel	387e66e79f	replace MachineCombinerPattern namespace and enum with enum class; NFCI Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196	2015-11-05 19:34:57 +00:00
Dan Gohman	e9361d58ff	[WebAssembly] Add WebAssemblyMCInstLower.cpp. This isn't used yet; it's just a start towards eventually using MC to do instruction printing, and eventually binary encoding. llvm-svn: 252194	2015-11-05 19:28:16 +00:00
Oleg Ranevskyy	057c5a6b2b	[DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177	2015-11-05 17:50:17 +00:00
Petar Jovanovic	99fba3c141	Add cfi instr for CFA calculation when movpc is expanded to call and pop This fixes the issue of wrong CFA calculation in the following case: 0x08048400 <+0>: push %ebx 0x08048401 <+1>: sub $0x8,%esp 0x08048404 <+4>: call 0x8048409 <test+9> 0x08048409 <+9>: pop %eax 0x0804840a <+10>: add $0x1bf7,%eax 0x08048410 <+16>: mov %eax,%ebx 0x08048412 <+18>: call 0x80483f0 <bar> 0x08048417 <+23>: add $0x8,%esp 0x0804841a <+26>: pop %ebx 0x0804841b <+27>: ret The highlighted instructions are a product of movpc instruction. The call instruction changes the stack pointer, and pop instruction restores its value. However, the rule for computing CFA is not updated and is wrong on the pop instruction. So, e.g. backtrace in gdb does not work when on the pop instruction. This adds cfi instructions for both call and pop instructions. cfi_adjust_cfa_offset instruction is used with the appropriate offset for setting the rules to calculate CFA correctly. Patch by Violeta Vukobrat. Differential Revision: http://reviews.llvm.org/D14021 llvm-svn: 252176	2015-11-05 17:19:59 +00:00
Derek Schuff	8a76b04a63	[WebAssembly] Rename ior operator to or to match the spec Summary: The spec uses "or" for inclusive-or and "xor" for exclusive-or Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D14362 llvm-svn: 252174	2015-11-05 17:08:11 +00:00
James Molloy	bef6e43107	[ARM] Compute known bits for ARMISD::CMOV We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities. I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case. llvm-svn: 252168	2015-11-05 15:21:58 +00:00
Asaf Badouh	f99c054ebc	revert rev. 252153 due to build failure on ubuntu [X86][AVX512] add comi with Sae llvm-svn: 252154	2015-11-05 08:55:54 +00:00
Asaf Badouh	7fdabf0a35	[X86][AVX512] add comi with Sae add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 252153	2015-11-05 08:45:06 +00:00
Asaf Badouh	a8209d92cc	[X86][AVX512] small bugfix in VPBROADCASTM VPBROADCASTMW2D and VPBROADCASTMB2Q Differential Revision: http://reviews.llvm.org/D14335 llvm-svn: 252151	2015-11-05 08:08:21 +00:00
Matt Arsenault	5b22dfa65d	AMDGPU: Also track whether SGPRs were spilled llvm-svn: 252145	2015-11-05 05:27:10 +00:00
Matt Arsenault	d41c0dbff0	AMDGPU: Print number user SGPRs This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144	2015-11-05 05:27:07 +00:00
Matt Arsenault	68802d3177	AMDGPU: Disallow s[102:103] on VI in assembler llvm-svn: 252142	2015-11-05 03:11:27 +00:00
Matt Arsenault	a40450cba2	AMDGPU: Fix assert when legalizing atomic operands The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140	2015-11-05 02:46:56 +00:00
Matt Arsenault	bed42a7320	AMDGPU: Make addr64 atomic operand order consistent vaddr comes before srsrc in every other MUBUF instruction, and is the order it is printed. llvm-svn: 252139	2015-11-05 02:46:53 +00:00
Joseph Tremoulet	6afccf6120	[WinEH] Fix establisher param reg in CLR funclets Summary: The CLR's personality routine passes the pointer to the establisher frame in RCX, not RDX. Reviewers: pgavlin, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14343 llvm-svn: 252135	2015-11-05 02:20:07 +00:00
Rafael Espindola	e61a902371	Go back to producing relocations for out of range symbols. This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back. llvm-svn: 252119	2015-11-05 01:10:15 +00:00
Matt Arsenault	6c2e200d38	AMDGPU: Fix typo llvm-svn: 252116	2015-11-05 01:03:08 +00:00
Rafael Espindola	49b8548903	Slightly saner handling of thumb branches. The generic infrastructure already did a lot of work to decide if the fixup value is known or not. It doesn't make sense to reimplement a very basic case: same fragment. llvm-svn: 252090	2015-11-04 23:00:39 +00:00
Quentin Colombet	421723cdd8	[x86] Teach the shrink-wrapping hooks to do the proper thing with Win64. Win64 has some strict requirements for the epilogue. As a result, we disable shrink-wrapping for Win64 unless the block that gets the epilogue is already an exit block. Fixes PR24193. llvm-svn: 252088	2015-11-04 22:37:28 +00:00
Simon Pilgrim	f669d381f9	Warning fix. llvm-svn: 252078	2015-11-04 21:27:22 +00:00
Simon Pilgrim	7e6606f4f1	[X86][SSE] Add general memory folding for (V)INSERTPS instruction This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 llvm-svn: 252074	2015-11-04 20:48:09 +00:00
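
Note on bef6e43107 ([ARM] Compute known bits for ARMISD::CMOV): the message says a CMOV's known bits are the intersection of the known bits of its operands. The sketch below illustrates only that idea and is not LLVM's code. The `KnownBits` class and both helper functions are hypothetical names invented for this example, and the operands are assumed to be 32-bit constants so that every bit of each operand is known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    """Hypothetical stand-in for a known-bits result (not LLVM's type)."""
    zero: int  # mask of bits known to be 0
    one: int   # mask of bits known to be 1


def known_bits_of_constant(value: int, width: int = 32) -> KnownBits:
    """For a constant, every bit is known: set bits are one, the rest zero."""
    mask = (1 << width) - 1
    return KnownBits(zero=~value & mask, one=value & mask)


def known_bits_of_select(a: KnownBits, b: KnownBits) -> KnownBits:
    """A conditional move yields either operand, so a bit is known in the
    result only if it is known, with the same value, in both operands."""
    return KnownBits(zero=a.zero & b.zero, one=a.one & b.one)


# Either 0xF0 or 0x30 may be selected. Bits 4 and 5 are one in both values,
# bits 6 and 7 differ (so they become unknown), and all other bits are zero.
result = known_bits_of_select(
    known_bits_of_constant(0x000000F0),
    known_bits_of_constant(0x00000030),
)
```

Here `result.one` is `0x30` and `result.zero` is `0xFFFFFF0F`: bits 6 and 7 appear in neither mask because the two operands disagree on them. Taking the intersection is the conservative choice, since which operand is selected is not known when the bits are computed.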


34868 Commits