llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	f2ddbf00ed	AMDGPU: Prepare for reducing private element size. Tests for the new scalarize all private access options will be included with a future commit. The only functional change is to make the split/scalarize behavior for private access of > 4 element vectors to be consistent with the flat/global handling. This makes the spilling worse in the two changed tests. llvm-svn: 260804	2016-02-13 04:18:53 +00:00
Matt Arsenault	ce56a0ef54	AMDGPU: Add intrinsics for sin/cos These provide direct access to the hardware instruction without the unit version required like llvm.sin/llvm.cos lowering requires. llvm-svn: 260782	2016-02-13 01:19:56 +00:00
Matt Arsenault	79963e80b8	AMDGPU: Rename intrinsic to better match instruction name Also fixes missing f32 test. llvm-svn: 260780	2016-02-13 01:03:00 +00:00
Matt Arsenault	16f48d7c55	AMDGPU: Fix broken condition causing warning llvm-svn: 260773	2016-02-13 00:36:10 +00:00
Tom Stellard	bc4497b13c	AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 llvm-svn: 260765	2016-02-12 23:45:29 +00:00
Matt Arsenault	296b849163	AMDGPU: Set flat_scratch from flat_scratch_init reg This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. llvm-svn: 260658	2016-02-12 06:31:30 +00:00
Matt Arsenault	9524566314	AMDGPU: Split R600 and SI store lowering These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490	2016-02-11 05:32:46 +00:00
Matt Arsenault	a14364115e	AMDGPU: Fix indentation and variable names llvm-svn: 260399	2016-02-10 18:21:45 +00:00
Matt Arsenault	6dfda9625d	AMDGPU: Split R600 and SI load lowering These weren't actually sharing anything in the common LowerLOAD. llvm-svn: 260398	2016-02-10 18:21:39 +00:00
Ahmed Bougacha	f8dfb47c02	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. llvm-svn: 260316	2016-02-09 22:54:12 +00:00
Oliver Stannard	7e7d983a87	Refactor backend diagnostics for unsupported features Re-commit of r258951 after fixing layering violation. The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. llvm-svn: 259498	2016-02-02 13:52:43 +00:00
Matt Arsenault	e013246462	AMDGPU: Fix emitting invalid workitem intrinsics for HSA The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. llvm-svn: 259297	2016-01-30 05:19:45 +00:00
Matt Arsenault	43976df0da	AMDGPU: Add new amdgcn workitem intrinsics These use the correct prefix and follow the HSA naming convention rather than the config register option names. llvm-svn: 259293	2016-01-30 04:25:19 +00:00
Matt Arsenault	5b39b34ca5	AMDGPU: Match fmed3 patterns with legacy fmin/fmax llvm-svn: 259090	2016-01-28 20:53:48 +00:00
Matt Arsenault	f639c32739	AMDGPU: Match some med3 patterns llvm-svn: 259089	2016-01-28 20:53:42 +00:00
Oliver Stannard	02fa1c80c4	Revert r259035, it introduces a cyclic library dependency llvm-svn: 259045	2016-01-28 13:19:47 +00:00
Oliver Stannard	b4b092ea1b	Add backend dignostic printer for unsupported features Re-commit of r258951 after fixing layering violation. The related LLVM patch adds a backend diagnostic type for reporting unsupported features, this adds a printer for them to clang. In the case where debug location information is not available, I've changed the printer to report the location as the first line of the function, rather than the closing brace, as the latter does not give the user any information. This also affects optimisation remarks. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 259035	2016-01-28 10:07:27 +00:00
NAKAMURA Takumi	628a7a0aef	Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features" It broke layering violation in LLVMIR. clang r258950 "Add backend dignostic printer for unsupported features" llvm r258951 "Refactor backend diagnostics for unsupported features" llvm-svn: 259016	2016-01-28 04:41:32 +00:00
Oliver Stannard	1e67a9f196	Refactor backend diagnostics for unsupported features The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. The implementation of DiagnosticInfoUnsupported::print must be in lib/Codegen rather than in the existing file in lib/IR/ to avoid introducing a dependency from IR to CodeGen. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 258951	2016-01-27 17:30:33 +00:00
Matt Arsenault	c5f6152911	AMDGPU: Make v32i8/v64i8 illegal types Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken. llvm-svn: 258788	2016-01-26 04:43:48 +00:00
Matt Arsenault	018179fc46	AMDGPU: Remove old sample intrinsics I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters. llvm-svn: 258787	2016-01-26 04:38:08 +00:00
Matt Arsenault	9a10cea7fb	AMDGPU: Implement read_register and write_register intrinsics Some of the special intrinsics now that now correspond to a instruction also have special setting of some registers, e.g. llvm.SI.sendmsg sets m0 as well as use s_sendmsg. Using these explicit register intrinsics may be a better option. Reading the exec mask and others may be useful for debugging. For this I'm not sure this is entirely correct because we would want this to be convergent, although it's possible this is already treated sufficently conservatively. llvm-svn: 258785	2016-01-26 04:29:24 +00:00
Matt Arsenault	0c3e2338fe	AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now Also move into backend intrinsics to discourage use of the old name. llvm-svn: 258783	2016-01-26 04:14:16 +00:00
Matt Arsenault	f75257aaa6	AMDGPU: Move amdgcn intrinsic handling into SITargetLowering llvm-svn: 258608	2016-01-23 05:32:20 +00:00
Matt Arsenault	bef34e21c7	AMDGPU: Rename intrinsics to use amdgcn prefix The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatability for now. llvm-svn: 258557	2016-01-22 21:30:34 +00:00
Matt Arsenault	0cbaa1762b	AMDGPU: Remove AMDGPU.fract intrinsic Mesa doesn't use this, and this is pattern matched already from fsub x, (ffloor x) llvm-svn: 258513	2016-01-22 18:42:38 +00:00
Tom Stellard	d1efda8e9e	AMDGPU/SI: Promote i1 SETCC operations Summary: While working on uniform branching, I've hit a few cases where we emit i1 SETCC operations. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16233 llvm-svn: 258352	2016-01-20 21:48:24 +00:00
Matt Arsenault	15fbe49daf	AMDGPU: Remove AMDIL.fraction intrinsic llvm-svn: 258347	2016-01-20 21:05:49 +00:00
Tom Stellard	2e045bbc5f	AMDGPU/SI: Prevent the DAGCombiner from creating setcc with i1 inputs Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15035 llvm-svn: 258256	2016-01-20 00:13:22 +00:00
Matt Arsenault	6e3a45193a	AMDGPU: Split 64-bit and of constant up This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095	2016-01-18 22:01:13 +00:00
Marek Olsak	46dadbfab2	AMDGPU/SI: Fix a GPU hang with POS_W_FLOAT enabled Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16037 llvm-svn: 257625	2016-01-13 17:23:20 +00:00
Marek Olsak	8e9cc63bfb	AMDGPU/SI: Add s_waitcnt at the end of non-void functions Summary: v2: Make ReturnsVoid private, so that I can another 8 lines of code and look more productive. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16034 llvm-svn: 257622	2016-01-13 17:23:09 +00:00
Marek Olsak	8a0f335ad6	AMDGPU/SI: Add support for non-void functions Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621	2016-01-13 17:23:04 +00:00
Marek Olsak	b6c8c3d165	AMDGPU/SI: Allow any number of PS inputs Summary: With the ability to concatenate shader binaries, the limit of 15 no longer applies. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16031 llvm-svn: 257592	2016-01-13 11:46:10 +00:00
Marek Olsak	fccabaf57e	AMDGPU/SI: Add new target attribute InitialPSInputAddr Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there is never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591	2016-01-13 11:45:36 +00:00
Matt Arsenault	800fecf9de	AMDGPU: Fix crash with dispatch.ptr intrinsic with non-HSA target It might be better to let this be a select failure instead. llvm-svn: 257386	2016-01-11 21:18:33 +00:00
Matt Arsenault	de5fbe9c60	AMDGPU: Pattern match ffbh pattern to instruction. The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatability function with HSAIL's firstbit instruction. llvm-svn: 257352	2016-01-11 17:02:00 +00:00
Matt Arsenault	02d45dfeda	AMDGPU: Remove dead target dag combine llvm-svn: 257344	2016-01-11 16:37:40 +00:00
Tom Stellard	a6f24c6565	AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions Summary: We were previously selecting all constant loads to SMRD instructions and legalizing the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass. This new solution is more simple and also generates much better code, because the instruction selector is able to take advantage of all the MUBUF addressing modes that are legalization pass wasn't able to. We also no longer need to generate v_add_* instructions when we have a uniform pointer and a non-uniform offset, as this is now folded into the MUBUF instruction during instruction selection. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15425 llvm-svn: 255672	2015-12-15 20:55:55 +00:00
Tom Stellard	ad7d03daa6	AMDGPU/SI: Add llvm.amdgcn.v.interp.p[12] intrinsics Summary: These are meant to be used instead of the llvm.SI.fs.interp intrinsic which will be deprecated at some point. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15474 llvm-svn: 255651	2015-12-15 17:02:49 +00:00
Matt Arsenault	d079285e05	AMDGPU: Use generic bitreverse intrinsic Also fix bug in vector legalization for bitreverse. llvm-svn: 255512	2015-12-14 17:25:38 +00:00
Tom Stellard	c93fc11f36	AMDGPU/SI: Emit constant arrays in the .text section Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204	2015-12-10 02:13:01 +00:00
Tom Stellard	b3c3bda512	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203	2015-12-10 02:12:53 +00:00
Matt Arsenault	f9bfeafd00	AMDGPU: Implement isNoopAddrSpaceCast llvm-svn: 254468	2015-12-01 23:04:00 +00:00
Tom Stellard	38b7cbe3e0	AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now dead Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15050 llvm-svn: 254427	2015-12-01 17:45:22 +00:00
Matt Arsenault	ada6cf1b22	AMDGPU: Fix unused function llvm-svn: 254333	2015-11-30 21:32:10 +00:00
Matt Arsenault	26f8f3db39	AMDGPU: Rework how private buffer passed for HSA If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331	2015-11-30 21:16:03 +00:00
Matt Arsenault	ac234b604d	AMDGPU: Rename enums to be consistent with HSA code object terminology llvm-svn: 254330	2015-11-30 21:15:57 +00:00
Matt Arsenault	0e3d38937e	AMDGPU: Remove SIPrepareScratchRegs It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329	2015-11-30 21:15:53 +00:00
Matt Arsenault	ff6da2fe89	AMDGPU: Use assert zext for workgroup sizes llvm-svn: 254328	2015-11-30 21:15:45 +00:00
Matt Arsenault	ea03cf2fa1	AMDGPU: Don't reserve SCRATCH_PTR input register This hasn't been doing anything since using relocations was added. llvm-svn: 254304	2015-11-30 15:46:47 +00:00
Tom Stellard	48f29f21ee	AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispach. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116	2015-11-26 00:43:29 +00:00
Matt Arsenault	61001bbc03	AMDGPU: Make v2i64/v2f64 legal types. They can be loaded and stored, so count them as legal. This is mostly to fix a number of common cases for load/store merging. llvm-svn: 254086	2015-11-25 19:58:34 +00:00
Matt Arsenault	ff05da806c	AMDGPU: Split LDS vector loads If properly aligned this could allow using ds_read_b64. llvm-svn: 253975	2015-11-24 12:18:54 +00:00
Matt Arsenault	4d801cd357	AMDGPU: Split x8 and x16 vector loads instead of scalarize The one regression in the builtin tests is in the read2 test which now (again) has many extra copies, but this should be solved once the pass is replaced with a DAG combine. llvm-svn: 253974	2015-11-24 12:05:03 +00:00
Matt Arsenault	d48da14269	AMDGPU: Error on graphics shaders with HSA I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858	2015-11-02 23:23:02 +00:00
Marek Olsak	6f6d318e16	AMDGPU/SI: handle undef for llvm.SI.packf16 llvm-svn: 251632	2015-10-29 15:29:09 +00:00
Matt Arsenault	6005fcbe12	AMDGPU: Simplify VOP3 operand legalization. This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers copys are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is if when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951	2015-10-21 21:51:02 +00:00
Matt Arsenault	3add6439d0	AMDGPU: Add MachineInstr overloads for instruction format tests llvm-svn: 250797	2015-10-20 04:35:43 +00:00
Tom Stellard	0fbf899c0f	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments() Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468	2015-10-06 21:16:34 +00:00
Marek Olsak	d1a69a2839	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858	2015-09-29 23:37:32 +00:00
Matt Arsenault	2d6fdb8495	AMDGPU: Re-justify workaround and fix worked around problem When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything actually is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. llvm-svn: 248585	2015-09-25 17:08:42 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
Craig Topper	0013be16ff	Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC llvm-svn: 248140	2015-09-21 05:32:41 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Matt Arsenault	acd68b58ae	SelectionDAG: Support Expand of f16 extloads Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113	2015-09-09 01:12:27 +00:00
Sanjay Patel	ce74db9d8d	check for fastness before merging in DAGCombiner::MergeConsecutiveStores() Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771	2015-09-03 15:03:19 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Matt Arsenault	711b390a7c	AMDGPU: Assume SMRD access for constant address space Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354	2015-08-07 20:18:34 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
Matt Arsenault	e1ce344b5a	AMDGPU: Fix v16i32 to v16i8 truncstore llvm-svn: 243731	2015-07-31 04:12:04 +00:00
Tom Stellard	70580f83cc	AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673	2015-07-20 14:28:41 +00:00
Tom Stellard	c98ee20328	AMDPGU/SI: Use AssertZext node to mask high bit for scratch offsets Summary: We can safely assume that the high bit of scratch offsets will never be set, because this would require at least 128 GB of GPU memory. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11225 llvm-svn: 242433	2015-07-16 19:40:07 +00:00
Matt Arsenault	cf13d18730	AMDGPU: Fix chains for memory ops dependent on argument loads Most loads and stores are derived from pointers derived from a kernel argument load inserted during argument lowering. This was just using the EntryToken chain for the argument loads, and any users of these loads were also on the EntryToken chain. Return the chain of the lowered argument load so that dependent loads end up on the correct chain. No test since I'm not aware of any case where this actually broke. llvm-svn: 241960	2015-07-10 22:51:36 +00:00
Matt Arsenault	0d5197380c	AMDGPU: Use requested chain when lowering arguments No test since I'm not aware of any case where this will end up being a different chain. llvm-svn: 241954	2015-07-10 22:28:41 +00:00
Tom Stellard	dcb9f0907f	AMDGPU: Add helper function for implicit parameter offsets. Patch by: Zoltan Gilian llvm-svn: 241861	2015-07-09 21:20:37 +00:00
Mehdi Amini	eaabc51e78	Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user A documentation for this function would be nice by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807	2015-07-09 15:12:23 +00:00
Mehdi Amini	a749f2ad47	Remove getDataLayout() from TargetLowering Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779	2015-07-09 02:09:52 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	9639d650bb	Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776	2015-07-09 02:09:20 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Benjamin Kramer	9bfb627a0e	[TargetLowering] StringRefize asm constraint getters. There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411	2015-07-05 19:29:18 +00:00
Tom Stellard	b5798b09d3	AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10706 llvm-svn: 240830	2015-06-26 21:15:03 +00:00
Matt Arsenault	0b554ed364	AMDGPU: Use getAsInteger instead of atoi llvm-svn: 240365	2015-06-23 02:05:55 +00:00
Tom Stellard	45bb48ea19	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00
Tom Stellard	1be1aa84ec	Revert "AMDGPU: Add core backend files for R600/SI codegen v6" This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303	2012-07-16 18:19:53 +00:00
Tom Stellard	bcce80fa95	AMDGPU: Add core backend files for R600/SI codegen v6 llvm-svn: 160270	2012-07-16 14:17:08 +00:00

... 3 4 5 6 7

337 Commits