llvm-project

Commit Graph

Author	SHA1	Message	Date
Artem Tamazov	2e217b87cb	[AMDGPU][mc] Add support for ds_add_[rtn_]f32. Lit tests added. Resolves https://github.com/RadeonOpenCompute/hcc/issues/122. Differential Revision: https://reviews.llvm.org/D24765 llvm-svn: 282086	2016-09-21 16:35:44 +00:00
Nico Weber	903859c0e4	Revert r281715, it caused PR30475 llvm-svn: 282076	2016-09-21 15:33:24 +00:00
Tim Northover	9a46718378	GlobalISel: produce correct code for signext/zeroext ABI flags. We still don't really have an equivalent of "AssertXExt" in DAG, so we don't exploit the guarantees on the receiving side yet, but this should produce conservatively correct code on iOS ABIs. llvm-svn: 282069	2016-09-21 12:57:45 +00:00
Tim Northover	862758ec14	GlobalISel: pass Function to lowerFormalArguments directly (NFC). The only implementation that exists immediately looks it up anyway, and the information is needed to handle various parameter attributes (stored on the function itself). llvm-svn: 282068	2016-09-21 12:57:35 +00:00
Sam Kolton	12b633beda	[AMDGPU] Assembler: remove unused AMDGPUMCObjectWriter. Summary: It is replaced by AMDGPUELFObjectWriter Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl Differential Revision: https://reviews.llvm.org/D24654 llvm-svn: 282065	2016-09-21 10:33:32 +00:00
Simon Dardis	9a66bbecae	[mips] LLVM PR/30197 - Tail call incorrectly clobbers arguments for mips The postRA scheduler performs alias analysis to determine if stores and loads can be moved past each other. When a function has more arguments than argument registers for the calling convention used, excess arguments are spilled onto the stack. LLVM by default assumes that argument slots are immutable, unless the function contains a tail call. Without the knowledge that a function contains a tail call site, stores and loads to fixed stack slots may be re-ordered, causing the out-going arguments to clobber the incoming arguments before the incoming arguments are supposed to be dead. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24077 llvm-svn: 282063	2016-09-21 09:43:40 +00:00
Diana Picus	2a3f066349	Revert "AArch64: Set shift bit of TLSLE HI12 add instruction" This reverts commit r282057 because it broke the buildbots - see e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/12063 llvm-svn: 282058	2016-09-21 08:24:41 +00:00
Lei Liu	6c87f23526	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282057	2016-09-21 07:41:41 +00:00
Craig Topper	29f1a1f834	[AVX-512] Split the 3 different usages of the X86ISD::FSETCC opcode into 3 different opcodes. It turns out isel is really not robust against having different type profiles for the same opcode. It turns out that if you put an illegal rounding mode(i.e. not CUR_DIRECTION or NO_EXC) on a comiss intrinsic we would generate the FSETCC form with the rounding mode added, but then pattern match to an instruction with ROUND_CUR_DIRECTION. We can probably get away with just one FSETCCM opcode that always contains the rounding mode and explicitly put ROUND_CUR_DIRECTION in the pattern, but I'll leave that for future work. With this change the clang tests for the comiss intrinsics that used an incorrect rounding mode of 3 properly fail isel instead of silently doing the wrong thing. Those clang tests will be fixed in a follow up commit and I also plan to add rounding mode checking to clang. llvm-svn: 282055	2016-09-21 06:37:54 +00:00
Craig Topper	d868870f17	[AVX-512] Don't add an additional rounding mode operand to the avx512 vcvtps2ph intrinsic lowering. There was no way to control its value so it was always FROUND_CURRENT making it unnecessary. The true rounding mode is encoded in the immediate operand of the instruction. This also removes the pattern from the rb form of the instructions since there is no way to specify the FROUND_NO_EXC rounding mode it required. llvm-svn: 282052	2016-09-21 03:58:44 +00:00
Craig Topper	a27f54b4d9	[AVX-512] Simplify handling of INTR_TYPE_1OP_MASK_RM to remove support for the second opcode since it's never used. This makes it consistent with INTR_TYPE_2OP_MASK_RM and INTR_TYPE_3OP_MASK_RM. And even if it was used we were passing the same operands to both so it wouldn't make sense to have two opcodes. llvm-svn: 282051	2016-09-21 03:58:41 +00:00
Craig Topper	e18258dc1c	[AVX-512] Don't lower avx512 vcvtps2ph/vcvtph2ps nodes to ISD::FP16_TO_FP/ISD::FP_TO_FP16 with an extra x86 specific rounding mode operand. We should use a target specific ISD opcode. llvm-svn: 282046	2016-09-21 02:05:22 +00:00
Jacques Pienaar	98345fc0a1	[NVPTX] Check if callsite is defined when computing argument alignment Summary: In getArgumentAlignment check if the ImmutableCallSite pointer CS is non-null before dereferencing. If CS is 0x0 fall back to the ABI type alignment else compute the alignment as before. Reviewers: eliben, jpienaar Subscribers: jlebar, vchuravy, cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D9168 llvm-svn: 282045	2016-09-21 01:57:57 +00:00
Eric Christopher	5653e5dffc	Remove the default subtarget from the x86 port as it isn't necessary (or correct) anymore. llvm-svn: 282031	2016-09-20 22:19:33 +00:00
Eric Christopher	c4636b3002	Revert "Remove extra argument used once on TargetMachine::getNameWithPrefix and inline the result into the singular caller." and "Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version." temporarily until I can get the whole call migrated out of the TargetMachine as we could hit places where TLOF isn't valid. This reverts commits r281981 and r281983. llvm-svn: 282028	2016-09-20 22:03:28 +00:00
Evandro Menezes	9b5d89513b	Revert part of "AArch64: Do not test for CPUs, use SubtargetFeatures" This reverts part of commit 119e358d9635c8d1f3e7aee67e3ea3b8a62f8db6 by removing FeatureUseRSqrt et al per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW). llvm-svn: 282001	2016-09-20 19:02:09 +00:00
Evandro Menezes	ba4926efde	Revert "[AArch64] Use the reciprocal estimation machinery" This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW). llvm-svn: 282000	2016-09-20 19:02:06 +00:00
Evandro Menezes	61a1273d27	Revert "[AArch64] Properly validate the reciprocal estimation." This reverts commit ad8ca1528242e2a4cb363e3779309e70eb7a430e per request by Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW). llvm-svn: 281999	2016-09-20 19:02:02 +00:00
Saleem Abdulrasool	03ffa797ad	X86: loosen an overly aggressive MachO assertion We would assert that the FP setup CFI used esp/rsp always. This held up in practice when the code was generated from IR. However, with the integrated assembler, it is possible to have the input be user specified assembly. In such a case, we cannot assume that the function implementation has a compact unwind representation. Loosen the assertion into a check and bail if we cannot represent the frame pointer in the compact unwinding. Addresses PR30453! llvm-svn: 281986	2016-09-20 17:05:04 +00:00
Eric Christopher	a1ccdc3433	Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version. NFC intended. llvm-svn: 281983	2016-09-20 16:05:02 +00:00
Eric Christopher	ef579d2195	Remove a use of subtarget initialization in the X86 backend so we can get rid of the default subtarget. NFC intended. llvm-svn: 281982	2016-09-20 16:04:59 +00:00
Eric Christopher	0be7793d75	Remove extra argument used once on TargetMachine::getNameWithPrefix and inline the result into the singular caller. llvm-svn: 281981	2016-09-20 16:04:50 +00:00
Tim Northover	b18ea162df	GlobalISel: split aggregates for PCS lowering This should match the existing behaviour for passing complicated struct and array types, in particular HFAs come through like that from Clang. For C & C++ we still need to somehow support all the weird ABI flags, or at least those that are present in the IR (signext, byval, ...), and stack-based parameter passing. llvm-svn: 281977	2016-09-20 15:20:36 +00:00
Elena Demikhovsky	d3ff7c288b	AVX-512: Fixed a bug in lowering saturated operations on KNL. The generated code is still not optimal. Differential Revision: https://reviews.llvm.org/D24723 llvm-svn: 281966	2016-09-20 11:02:26 +00:00
Valery Pykhtin	e330cfa294	[AMDGPU] Refactor VOP3 instruction TD definitions Differential revision: https://reviews.llvm.org/D24664 llvm-svn: 281965	2016-09-20 10:41:16 +00:00
Craig Topper	67882bd94e	[AVX-512] Teach X86InstrInfo::copyPhysReg to use a 512-bit move if XMM16-XMM31 or YMM16-YMM31 are the source or dest of the copy and VLX is not supported. This can happen with SUBREG_TO_REG of ZMM16-ZMM31. Fixes PR30430. llvm-svn: 281959	2016-09-20 06:49:17 +00:00
Craig Topper	9820e341f9	[AVX-512] Use 512-bit vcvtps2ph/vcvtph2ps to implement fp_to_f16/f16_to_fp when F16C and VLX are not supported. Fixes PR23941. llvm-svn: 281958	2016-09-20 05:44:47 +00:00
Sanjay Patel	e97f7947b1	[x86] fix variable names; NFC llvm-svn: 281953	2016-09-20 00:27:22 +00:00
Sanjay Patel	0fa3365923	[x86] use getSignBit() to simplify code; NFCI llvm-svn: 281944	2016-09-19 22:07:27 +00:00
Valery Pykhtin	2828b9be1e	[AMDGPU] Refactor VOPC instruction TD definitions Differential Revision: https://reviews.llvm.org/D24546 llvm-svn: 281903	2016-09-19 14:39:49 +00:00
Diana Picus	a53660e4a3	[AArch64] Fix encoding for lsl #12 in add/sub immediates Whenever an add/sub immediate needs a fixup, we set that immediate field to zero, which is correct, but we also set the shift bits to zero, which is not true for instructions that use lsl #12. This patch makes sure that if lsl #12 was used, it will appear in the encoding of the instruction. Differential Revision: https://reviews.llvm.org/D23930 llvm-svn: 281898	2016-09-19 11:10:18 +00:00
Sam Kolton	be7ffb90bf	[AMDGPU] Fix s_branch with -1 offset Summary: In case the s_branch instruction target is itself, the backend should emit offset -1 but instead it emits 0. ''' label: s_branch label // should emit [0xff,0xff,0x82,0xbf] ''' Tom, Matt: why are we adjusting fixup values in applyFixup() method instead of processFixup()? processFixup() is calling adjustFixupValue() but does nothing with its result. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl Differential Revision: https://reviews.llvm.org/D24671 llvm-svn: 281896	2016-09-19 10:20:55 +00:00
Oliver Stannard	e1f6dc59ce	[Thumb] Set correct initial mapping symbol for big-endian thumb The initial mapping symbol state is set from the triple, but we only checked for the little-endian thumb triple, so could end up with an ARM mapping symbol for big-endian thumb. Differential Revision: https://reviews.llvm.org/D24553 llvm-svn: 281894	2016-09-19 09:21:45 +00:00
Tim Northover	eaee28b5ca	ARM: check alignment before transforming ldr -> ldm (or similar). ldm and stm instructions always require 4-byte alignment on the pointer, but we weren't checking this before trying to reduce code-size by replacing a post-indexed load/store with them. Unfortunately, we were also dropping this information in DAG ISel too, but that's easy enough to fix. llvm-svn: 281893	2016-09-19 09:11:09 +00:00
Craig Topper	61403201ea	[X86,AVX-512] Use INSERT_SUBREG instead of SUBREG_TO_REG when the input is not the output of an instruction. SUBREG_TO_REG is supposed to indicate that the super register has been zeroed, but we can't prove that if we don't know where it came from. llvm-svn: 281885	2016-09-19 02:53:43 +00:00
Craig Topper	b3b5033179	[AVX-512] Add support for lowering fp_to_f16 and f16_to_fp when VLX is supported regardless of whether F16C is also supported. Still need to add support for lowering using AVX512F when neither VLX nor F16C is supported. llvm-svn: 281884	2016-09-19 02:53:37 +00:00
Dean Michael Berris	4640154446	[XRay] ARM 32-bit no-Thumb support in LLVM This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: https://reviews.llvm.org/D23932 (Clang test) https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 281878	2016-09-19 00:54:35 +00:00
Craig Topper	af5ee86bc9	[AVX-512] Don't lower CVTPD2PS intrinsics to ISD::FP_ROUND with an X86 rounding mode encoding in the second operand. This immediate should only be 0 or 1 and indicates if the truncation loses precision. Also enhance an assert in SelectionDAG::getNode to flag this sort of problem in the future. llvm-svn: 281868	2016-09-18 21:49:32 +00:00
Craig Topper	c26cd68422	[AVX-512] Stop lowering avx512_mask_sqrt intrinsics to ISD::FSQRT with a second operand containing an X86 specific rounding mode encoding that doesn't belong. llvm-svn: 281867	2016-09-18 21:49:28 +00:00
Craig Topper	cc03165d3f	[X86] Fix typo in comment. NFC llvm-svn: 281862	2016-09-18 18:59:38 +00:00
Craig Topper	8542041bb2	[AVX-512] Add memory load patterns for the legacy SSE scalar fp to integer conversion intrinsics to be consistent across all instruction sets. llvm-svn: 281861	2016-09-18 18:59:36 +00:00
Craig Topper	8c252bc4dd	[AVX-512] Remove COPY_TO_REGCLASS from a few patterns that already had the correct register class. llvm-svn: 281860	2016-09-18 18:59:33 +00:00
Simon Pilgrim	6c21e6a54e	[X86][SSE] Improve recognition of uitofp conversions that can be performed as sitofp With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations. This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly supported on pre-AVX512 hardware). While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions. Differential Revision: https://reviews.llvm.org/D24343 llvm-svn: 281852	2016-09-18 12:45:23 +00:00
Simon Pilgrim	6736096ac3	[X86][SSE] Improve target shuffle mask extraction Add ability to extract vXi64 'vzext_movl' masks on 32-bit targets llvm-svn: 281834	2016-09-17 18:50:54 +00:00
Ron Lieberman	da5df7c99e	[Hexagon] segv while processing SUnit with nullNodePtr Added BoundaryNode check to isBestZeroLatency function. llvm-svn: 281825	2016-09-17 16:21:09 +00:00
Matt Arsenault	ac0fc849cf	AMDGPU: Fix broken FrameIndex handling We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. llvm-svn: 281824	2016-09-17 16:09:55 +00:00
Matt Arsenault	bcfd94c298	AMDGPU: Rename spill operands to match real instruction llvm-svn: 281823	2016-09-17 15:52:37 +00:00
Matt Arsenault	d99ef1144b	AMDGPU: Push bitcasts through build_vector This reduces the number of copies and reg_sequences when using fp constant vectors. This significantly reduces the code size in local-stack-alloc-bug.ll llvm-svn: 281822	2016-09-17 15:44:16 +00:00
Matt Arsenault	7b1dc2c983	AMDGPU: Use i64 scalar compare instructions VI added eq/ne for i64, so use them. llvm-svn: 281800	2016-09-17 02:02:19 +00:00
Tom Stellard	7998db634c	AMDGPU/SI: Fix kernel argument ABI for HSA Summary: i8, i16, and f16 values are not extended to 32-bit in the HSA kernel ABI. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24621 llvm-svn: 281789	2016-09-16 22:20:24 +00:00

39388 Commits