llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	2131044814	X86: support double extension of f16 type. x86 has no native ability to extend an f16 to f64, but the same result is obtained if we expand it into two separate extensions: f16 -> f32 -> f64. Unfortunately the same is not true for truncate, so that still results in a compilation failure. llvm-svn: 213251	2014-07-17 11:04:04 +00:00
Tim Northover	fd7e424935	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. llvm-svn: 213248	2014-07-17 10:51:23 +00:00
Yi Kong	2355066e43	Port memory barriers intrinsics to AArch64 Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to implement their corresponding ACLE and MSVC intrinsics. This patch ports ARM dmb, dsb, isb intrinsic to AArch64. Differential Revision: http://reviews.llvm.org/D4520 llvm-svn: 213247	2014-07-17 10:50:20 +00:00
Daniel Sanders	701e961650	[mips] .reginfo is 8 byte aligned on N32. Differential Revision: http://reviews.llvm.org/D4540 llvm-svn: 213246	2014-07-17 10:10:04 +00:00
Daniel Sanders	7f70573ed9	[mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple Summary: Generally speaking, mips-* vs mips64-* should not be used to make decisions about the content or format of the ELF. This should be based on the ABI and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64` should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`. Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as should `mips-linux-gnu-clang -mips64r2 -mabi=n32`. This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now since there is no apparent way to base this decision on the ABI and CPU. Differential Revision: http://reviews.llvm.org/D4539 llvm-svn: 213244	2014-07-17 10:02:08 +00:00
Daniel Sanders	185f23adbc	[mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6 Summary: The cpr1_size field describes the minimum register width to run the program rather than the size of the registers on the target. MIPS32r6 was acting as if -mfp64 has been given because it starts off with 64-bit FPU registers. Differential Revision: http://reviews.llvm.org/D4538 llvm-svn: 213243	2014-07-17 09:57:23 +00:00
Daniel Sanders	16ec6c1939	[mips] Fix ELF e_flags related to -mabicalls and -mplt. Summary: These options are not implemented yet but we act as if they are always given. The integrated assembler is driven by the clang driver so the e_flag test cases should match the e_flags emitted by GCC+GAS rather than GAS by itself. Differential Revision: http://reviews.llvm.org/D4536 llvm-svn: 213242	2014-07-17 09:52:56 +00:00
Matt Arsenault	ac6e39cf3b	Use range for llvm-svn: 213230	2014-07-17 06:19:06 +00:00
Matt Arsenault	5e2b0f51e7	R600: Short circuit alloca check if address space isn't private. Skip calling GetUnderlyingObject in cases where it obviously isn't from an alloca. This should only be a compile time improvement. llvm-svn: 213229	2014-07-17 06:13:41 +00:00
Sanjay Patel	6360441f99	Remove Atom references in description. Any CPU can run this pass. llvm-svn: 213190	2014-07-16 20:18:49 +00:00
Chris Bieneman	df4b763be5	[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo. llvm-svn: 213188	2014-07-16 20:13:31 +00:00
Justin Holewinski	ac451066f4	[NVPTX] Honor alignment on vector loads/stores We were not considering the stated alignment on vector loads/stores, leading us to generate vector instructions even when we do not have sufficient alignment. Now, for IR like: %1 = load <4 x float>, <4 x float>* %ptr, align 4 we will generate correct, conservative PTX like: ld.f32 ... [%ptr] ld.f32 ... [%ptr+4] ld.f32 ... [%ptr+8] ld.f32 ... [%ptr+12] Or if we have an alignment of 8 (for example), we can generate code like: ld.v2.f32 ... [%ptr] ld.v2.f32 ... [%ptr+8] llvm-svn: 213186	2014-07-16 19:45:35 +00:00
Chris Bieneman	80a866a316	Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations. No functional code changes. llvm-svn: 213169	2014-07-16 16:27:31 +00:00
Justin Holewinski	3e037d98e6	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd This matches the internal behavior of NVIDIA tools like libnvvm. llvm-svn: 213168	2014-07-16 16:26:58 +00:00
Tim Northover	7f3e11e7c0	CodeGen: don't form illegal EXTLOAD operations. It turns out that in most cases (the main exception being i1-related types) once these operations are formed we cannot separate them and the targets end up having to deal with them whether they want to or not. This is not a good situation, and a more reasonable default can be formed by acknowledging this and having targets leave them as Legal. Only x86 seems to be affected (other targets don't even try marking the operation Expand). Mostly there's no visible change here yet, but it will be useful to have truly expanded EXTLOADS for MVT::f16 softening support. llvm-svn: 213162	2014-07-16 15:37:24 +00:00
Daniel Sanders	68dcb4ffa3	[mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI. Summary: A few instructions (mostly cvt.d.w and similar) are causing problems with -mfp64 and -mno-odd-spreg and it looks like fixing it properly may take several weeks. In the meantime, let's disable the odd-numbered double-precision registers so that the generated code is at least valid. The problem is that instructions like cvt.d.w read from the 32-bit low subregister of a double-precision FPU register. This often leads the compiler to insert moves to transfer a GPR32 to a FGR32 using mtc1. Such moves violate the rules against 32-bit writes to odd-numbered FPU registers imposed by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it becomes impossible for the 32-bit low subregister to be odd-numbered. This fixes numerous test-suite failures when compiling for the FP64A ABI ('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to test that odd-numbered FPU registers are not allocatable. Instead, we depend on the assembler (GAS and -fintegrated-as) raising errors when the rules are violated. Differential Revision: http://reviews.llvm.org/D4532 llvm-svn: 213160	2014-07-16 15:34:07 +00:00
Andrea Di Biagio	a03624d8ab	[X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'. Before this change, method 'isShuffleMaskLegal' didn't know that shuffles implementing a 'movhlps' operation were perfectly legal for SSE targets. This patch adds the missing check for 'isMOVHLPSMask' inside method 'isShuffleMaskLegal' to fix the problem. The reason why it is important to do this is because the DAGCombiner conservatively avoids combining a pair of shuffles if the resulting shuffle node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were wrongly considered not to be legal. This was the root cause of some poor-code generation bugs. llvm-svn: 213137	2014-07-16 11:29:39 +00:00
Matt Arsenault	22ca3f8860	R600/SI: Allow using f32 rcp / rsq when denormals not handled. These are precise enough to use for OpenCL unless denormals are handled. llvm-svn: 213107	2014-07-15 23:50:10 +00:00
David Majnemer	3821ff03cd	X86: Simplify X86WindowsTargetObjectFile::getSectionForConstant There exists a helper function to abstract away the various differences between ConstantVector, ConstantDataVector, ConstantAggregateZero, etc. Use it to simplify X86WindowsTargetObjectFile::getSectionForConstant. llvm-svn: 213104	2014-07-15 23:01:10 +00:00
Sanjay Patel	a2f658d69d	Move Post RA Scheduling flag bit into SchedMachineModel Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 llvm-svn: 213101	2014-07-15 22:39:58 +00:00
Matt Arsenault	0d89e849bd	R600/SI: Fix select on i1 llvm-svn: 213096	2014-07-15 21:44:37 +00:00
Matt Arsenault	e9fa3b8e6b	R600/SI: Implement less wrong f32 fdiv Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test. llvm-svn: 213089	2014-07-15 20:18:31 +00:00
Matt Arsenault	1d077749ea	R600: Add predicate for UnsafeFPMath llvm-svn: 213088	2014-07-15 20:18:24 +00:00
Matt Arsenault	84446a026b	R600: Remove intrinsics that appear to be unused llvm-svn: 213087	2014-07-15 20:10:27 +00:00
Chris Bieneman	03695ab57e	[RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing. The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing. This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825. llvm-svn: 213078	2014-07-15 17:18:41 +00:00
Cameron McInally	44f3e30cf2	Revert r213070. It's breaking the build in MCELFStreamer::EmitInstToData(...). llvm-svn: 213073	2014-07-15 16:24:24 +00:00
Jan Vesely	6ddb8dd442	R600: Implement zero undef variants of ctlz/cttz v2: use ffbh/l if available v3: Rebase on top of Matt's SI patches Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 213072	2014-07-15 15:51:09 +00:00
Daniel Sanders	a6e125f07e	[mips] Correct .MIPS.abiflags fp_abi field for -mfpxx and without .module Summary: Previously all the test cases set it after initialization with '.module fp=xx'. Differential Revision: http://reviews.llvm.org/D4489 llvm-svn: 213071	2014-07-15 15:31:39 +00:00
Cameron McInally	53bc7a3330	Add x86 patterns to match a specific add-with-carry. llvm-svn: 213070	2014-07-15 15:03:32 +00:00
NAKAMURA Takumi	04b8b37f56	Prune Redundant libdeps in CMake's target_link_libraries and LLVMBuild.txt. I checked this with Release+Asserts on x86_64-mingw32. Please restore partially if this were overkill. llvm-svn: 213064	2014-07-15 11:37:03 +00:00
Andrea Di Biagio	04d5a7b337	Silence a warning in conditional expression. Fixes a gcc warning caused by a typo. A redundant assignment operation was accidentally used as the third operand of a conditional expression. No functional change intended. llvm-svn: 213061	2014-07-15 10:53:44 +00:00
Tim Northover	e4b8e138e1	AArch64: fall back to generic code for out of range extract/insert. rdar://problem/17624784 llvm-svn: 213059	2014-07-15 10:00:26 +00:00
David Majnemer	d4d9944416	Fix typo in comment No functionality changed. llvm-svn: 213052	2014-07-15 07:11:32 +00:00
Juergen Ributzka	8f073c8d60	[FastISel][X86] Remove no longer needed functions. llvm-svn: 213051	2014-07-15 06:35:53 +00:00
Juergen Ributzka	3566c08dd9	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. llvm-svn: 213050	2014-07-15 06:35:50 +00:00
Juergen Ributzka	23d43318c7	[FastISel][X86] Implement the FastLowerCall hook. This implements the FastLowerCall hook, which is based on the DoSelectCall function. The implementation is very similar, but the target-independent call lowering part has been factored out. This should also enable patchpoint intrinsic lowering for FastISel on X86. Related to <rdar://problem/17427052>. llvm-svn: 213049	2014-07-15 06:35:47 +00:00
Juergen Ributzka	5ee9d90248	Revert "[FastISel][X86] Remove no longer needed functions." Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook." Revert "[FastISel][X86] Implement the FastLowerCall hook." This reverts commit r213035, r213036, and r213037 to make the buildbots happy again. llvm-svn: 213048	2014-07-15 05:23:40 +00:00
David Majnemer	4e3ccc0505	CodeGen: Handle ConstantVector and undef in WinCOFF constant pools The constant pool entry code for WinCOFF assumed that vector constants would be formed using ConstantDataVector, it did not expect to see a ConstantVector. Furthermore, it did not expect undef as one of the elements of the vector. ConstantVectors should be handled like ConstantDataVectors, treat Undef as zero. llvm-svn: 213038	2014-07-15 02:34:12 +00:00
Juergen Ributzka	9fbf33d70f	[FastISel][X86] Remove no longer needed functions. llvm-svn: 213037	2014-07-15 02:22:56 +00:00
Juergen Ributzka	170f9354bb	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. llvm-svn: 213036	2014-07-15 02:22:53 +00:00
Juergen Ributzka	a9cced8a94	[FastISel][X86] Implement the FastLowerCall hook. This implements the FastLowerCall hook, which is based on the DoSelectCall function. The implementation is very similar, but the target-independent call lowering part has been factored out. This should also enable patchpoint intrinsic lowering for FastISel on X86. Related to <rdar://problem/17427052>. llvm-svn: 213035	2014-07-15 02:22:49 +00:00
Matt Arsenault	ca3976f7ae	R600: Add dag combine for copy of an illegal type. This helps avoid redundant instructions to unpack, and repack the vectors. Ideally we could recognize that pattern and eliminate it. Currently v4i8 and other small element type vectors are scalarized, so this has the added bonus of avoiding that. llvm-svn: 213031	2014-07-15 02:06:31 +00:00
Matt Arsenault	f171cf23b8	R600: Add denormal handling subtarget features. llvm-svn: 213018	2014-07-14 23:40:49 +00:00
Matt Arsenault	c6ae7b4763	R600/SI: Default to no single precision denormals. llvm-svn: 213017	2014-07-14 23:40:43 +00:00
Adam Nemet	cf7c905cfb	[X86] Specify all TSFlags bit-offsets symbolically No functional change. The offsets for the other bitfields are specified symbolically. I need to increase the size for one of the earlier fields which is easier after this cleanup. Why these bits are relative to VEXShift is a bit strange but that is for another cleanup. I made sure that the values for the enums are unchanged after this change. llvm-svn: 213011	2014-07-14 23:18:39 +00:00
David Majnemer	8bce66b093	CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF COFF lacks a feature that other object file formats support: mergeable sections. To work around this, MSVC sticks constant pool entries in special COMDAT sections so that each constant is in its own section. This permits unused constants to be dropped and it also allows duplicate constants in different translation units to get merged together. This fixes PR20262. Differential Revision: http://reviews.llvm.org/D4482 llvm-svn: 213006	2014-07-14 22:57:27 +00:00
Saleem Abdulrasool	b51d464f1e	X86: correct 64-bit atomics on 32-bit We would emit a libcall for a 64-bit atomic on x86 after SVN r212119. This was due to the misuse of hasCmpxchg16 to indicate if cmpxchg8b was supported on a 32-bit target. They were added at different times and would result in the border condition being mishandled. This fixes the border case to emit the cmpxchg8b instruction for 64-bit atomic operations on x86 at the cost of restoring a long-standing bug in the codegen. We emit a cmpxchg8b on all x86 targets even where the CPU does not support this instruction (pre-Pentium CPUs). Although this bug should be fixed, this was present prior to SVN r212119 and this change, so this is not really introducing a regression. llvm-svn: 212956	2014-07-14 16:28:13 +00:00
Tim Northover	6c647eae8b	X86: remove temporary atomicrmw used during lowering. We construct a temporary "atomicrmw xchg" instruction when lowering atomic stores for widths that aren't supported natively. This isn't on the top-level worklist though, so it won't be removed automatically and we have to do it ourselves once that itself has been lowered. Thanks Saleem for pointing this out! llvm-svn: 212948	2014-07-14 15:31:13 +00:00
Daniel Sanders	41ffa5d1ba	Re-commit: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags The lld tests will temporarily fail again but Simon Atanasyan will commit a fix for those shortly. llvm-svn: 212946	2014-07-14 15:05:51 +00:00
Daniel Sanders	cb0d36e592	Revert: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags This commit causes multiple lld tests to fail. Reverting while I investigate the issue. llvm-svn: 212945	2014-07-14 14:43:45 +00:00
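
Note on r213248 / r213251 (f16 conversions): the messages above say that extending f16 -> f32 -> f64 gives the same result as a direct f16 -> f64 extension, while truncating in two steps does not, because of double rounding. The following is a minimal sketch of that arithmetic in Python, not LLVM code. It assumes the platform uses IEEE 754 binary16/binary32/binary64 and relies on the standard `struct` module's `e` and `f` formats for rounding; the helper names `to_f16`, `to_f32` and `extension_is_exact` are made up for this illustration.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (binary64) to binary32 and widen it back."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


def to_f16(x):
    """Round a Python float (binary64) to binary16 and widen it back."""
    return struct.unpack('<e', struct.pack('<e', x))[0]


def extension_is_exact():
    """Check that every non-NaN f16 value survives f16 -> f32 -> f64 unchanged."""
    for bits in range(1 << 16):
        # Reinterpret the 16-bit pattern as a half-precision value (now a binary64).
        h = struct.unpack('<e', struct.pack('<H', bits))[0]
        if math.isnan(h):
            continue
        # Passing through binary32 must not change the value.
        if to_f32(h) != h:
            return False
    return True


# A binary64 value just above the midpoint between two adjacent f16 values.
x = 1.0 + 2.0**-11 + 2.0**-25

direct = to_f16(x)            # single rounding: f64 -> f16
two_step = to_f16(to_f32(x))  # double rounding: f64 -> f32 -> f16

print(extension_is_exact())   # True
print(direct)                 # 1.0009765625
print(two_step)               # 1.0
```

The chosen `x` sits slightly above the halfway point between 1.0 and the next f16 value, so a single rounding goes up. Rounding to f32 first discards the small excess and lands exactly on the halfway point, and the second rounding then ties to even and goes down. That is the double-rounding problem the commit message refers to for the truncate direction.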

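Note on r213186 ([NVPTX] Honor alignment on vector loads/stores): the message above describes splitting a `<4 x float>` load into four scalar loads at alignment 4, or two `v2` loads at alignment 8. The sketch below only mirrors that arithmetic in Python; it is not the NVPTX backend's logic. The function name `split_vector_load` and the rule "pick the widest of 4, 2, 1 elements whose byte size does not exceed the alignment" are assumptions made for this illustration.

```python
def split_vector_load(num_elts, elt_bytes, align):
    """Return (width, byte_offset) chunks for a vector load.

    Hypothetical helper: chooses the widest chunk (4, 2 or 1 elements)
    that divides the vector evenly and whose size fits the alignment.
    """
    width = 1
    for candidate in (4, 2):
        fits_vector = candidate <= num_elts and num_elts % candidate == 0
        fits_align = candidate * elt_bytes <= align
        if fits_vector and fits_align:
            width = candidate
            break
    # One chunk per `width` elements, at increasing byte offsets.
    return [(width, i * elt_bytes) for i in range(0, num_elts, width)]


# <4 x float> with align 4: four scalar loads at offsets 0, 4, 8, 12.
print(split_vector_load(4, 4, 4))   # [(1, 0), (1, 4), (1, 8), (1, 12)]
# <4 x float> with align 8: two v2 loads at offsets 0 and 8.
print(split_vector_load(4, 4, 8))   # [(2, 0), (2, 8)]
# <4 x float> with align 16: a single v4 load.
print(split_vector_load(4, 4, 16))  # [(4, 0)]
```

The first two results correspond to the `ld.f32` and `ld.v2.f32` sequences quoted in the commit message; the align-16 case is included only for completeness.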
29164 Commits