llvm-project

Commit Graph

Author	SHA1	Message	Date
Rafael Espindola	5dec7eaae2	Rename createIRObjectFile to just create. It is a static method of IRObjectFile, so having to use IRObjectFile::createIRObjectFile was redundant. llvm-svn: 223822	2014-12-09 20:36:13 +00:00
Colin LeMahieu	4af437fee5	[Hexagon] Updating rr/ri 32/64 transfer encodings and adding tests. llvm-svn: 223821	2014-12-09 20:23:30 +00:00
Juergen Ributzka	c6f314b8ed	[FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'. The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818	2014-12-09 19:44:38 +00:00
Colin LeMahieu	b580d7d8c8	[Hexagon] Adding word combine dot-new form and replacing old combine opcode. llvm-svn: 223815	2014-12-09 19:23:45 +00:00
Chandler Carruth	a7f247ea56	Revert r223764 which taught instcombine about integer-based element extraction patterns. This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly. This reverts commit r223764. Original commit log follows: Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... 
oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. llvm-svn: 223813	2014-12-09 19:21:16 +00:00
David Majnemer	2defbada38	AsmParser: Don't crash on short hex constants for fp128 types If we see 0xL01, treat it like 0xL00000000000000000000000000000001 instead of crashing. llvm-svn: 223811	2014-12-09 19:10:03 +00:00
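The zero-extension described in r223811 can be sketched outside the parser as follows. This is a minimal illustration and not the AsmParser code: `padHexDigits` is a hypothetical helper, and the default width of 32 digits is an assumption derived from fp128 being 128 bits wide (4 bits per hex digit).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of LLVM): left-pad a string of hex digits
// with zeros so that it spans `width` digits. 32 hex digits cover 128 bits.
std::string padHexDigits(const std::string &digits, std::size_t width = 32) {
  assert(digits.size() <= width && "more digits than the type can hold");
  return std::string(width - digits.size(), '0') + digits;
}
```

With this sketch, the digits `01` from `0xL01` become a 32-digit string ending in `01`, which matches the interpretation given in the commit message.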
Frederic Riss	35f0a9aeba	Remove unneeded curly braces. llvm-svn: 223809	2014-12-09 18:57:39 +00:00
Frederic Riss	ff58fd207e	Reorder the code to avoid inserting at the beginning of a vector. As per dblaikie's suggestion, thanks! llvm-svn: 223808	2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith	562283189d	Fix a GCC build failure from r223802 llvm-svn: 223806	2014-12-09 18:52:38 +00:00
Robert Khasanov	8e8c39963d	[AVX512] Added lowering for VBROADCASTSS/SD instructions. Lowering patterns were written through avx512_broadcast_pat multiclass as pattern generates VBROADCAST and COPY_TO_REGCLASS nodes. Added lowering tests. llvm-svn: 223804	2014-12-09 18:45:30 +00:00
Duncan P. N. Exon Smith	5bf8fef580	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. 
Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa<ConstantInt>(N->getOperand(0))); baz(cast<ConstantInt>(N->getOperand(1))); bak(cast_or_null<ConstantInt>(N->getOperand(2))); bat(dyn_cast<ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa<ConstantInt>(N->getOperand(0))); baz(mdconst::extract<ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa<MDInt>(N->getOperand(0))); baz(cast<MDInt>(N->getOperand(1))); bak(cast_or_null<MDInt>(N->getOperand(2))); bat(dyn_cast<MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. 
`MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
David Majnemer	b39e22bdc5	AsmParser: Don't crash on malformed attribute groups This fixes PR21785. llvm-svn: 223801	2014-12-09 18:33:57 +00:00
Colin LeMahieu	30dcb232b0	[Hexagon] Updating predicate register transfers and adding tstbit to allow select selection. Updating ll tests with predicate transfers that previously had nop encodings. llvm-svn: 223800	2014-12-09 18:16:49 +00:00
Frederic Riss	7c78db5065	Correctly handle complex location expressions in replaceDbgDeclareForAlloca() replaceDbgDeclareForAlloca() replaces an alloca by a value storing the address of what was the alloca. If there is a dbg.declare corresponding to that alloca, we need to lower it to a dbg.value describing the additional dereference operation to be performed to get to the underlying variable. This is done by adding a DW_OP_deref to the complex location part of the location description. This deref was added to the end of the operation list, which is wrong. The expression applies to what is described by the dbg.{declare,value}, and as we are changing this, we need to apply the DW_OP_deref as the first operation in the list. Part of the fix for rdar://19162268. llvm-svn: 223799	2014-12-09 17:55:48 +00:00
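The ordering fix in r223799 can be illustrated with a flat list of DWARF opcodes. This is only a sketch under assumptions: `prependDeref` is a hypothetical helper, and modelling the expression as a `std::vector<std::uint64_t>` is a simplification rather than the representation LLVM uses. The opcode values (`DW_OP_deref` = 0x06, `DW_OP_plus_uconst` = 0x23) come from the DWARF standard.

```cpp
#include <cstdint>
#include <vector>

// Opcode values as defined by the DWARF standard.
constexpr std::uint64_t DW_OP_deref = 0x06;
constexpr std::uint64_t DW_OP_plus_uconst = 0x23;

// Hypothetical helper: build a new expression whose first operation is
// DW_OP_deref, followed by the existing operations. Building a fresh vector
// avoids inserting at the front of the existing one.
std::vector<std::uint64_t>
prependDeref(const std::vector<std::uint64_t> &expr) {
  std::vector<std::uint64_t> result;
  result.reserve(expr.size() + 1);
  result.push_back(DW_OP_deref);                         // applied first
  result.insert(result.end(), expr.begin(), expr.end()); // then the rest
  return result;
}
```

For an existing expression `{DW_OP_plus_uconst, 8}`, the sketch yields `{DW_OP_deref, DW_OP_plus_uconst, 8}`: the dereference happens before the offset is applied, which is the order the commit message asks for.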
Juergen Ributzka	8bda738221	[CGP] Rewrite pattern match for splitBranchCondition to work with Values instead. Rewrite the pattern match code to also work with Values instead of with Instructions only. Also remove the no longer needed matcher (m_Instruction). llvm-svn: 223797	2014-12-09 17:50:10 +00:00
Juergen Ributzka	194350a936	Revert "Move function to obtain branch weights into the BranchInst class. NFC." This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare. llvm-svn: 223795	2014-12-09 17:32:12 +00:00
Bill Schmidt	efe9ce216e	[PowerPC 4/4] Enable little-endian support for VSX. With the foregoing three patches, VSX instructions can be used for little endian. This patch removes the restriction that prevented this, and re-enables the test cases from the first three patches. llvm-svn: 223792	2014-12-09 16:59:57 +00:00
Bill Schmidt	3014435ca9	[PowerPC 3/4] Little-endian adjustments for VSX vector shuffle When performing instruction selection for ISD::VECTOR_SHUFFLE, there is special code for handling v2f64 and v2i64 using VSX instructions. This code must be adjusted for little-endian. Because the two inputs are treated as a double-wide register, we must swap their order for little endian. To get the appropriate mask elements to use with the big-endian biased XXPERMDI instruction, we must reverse their order and invert the bits. A new test is added to test the 16 possible values of the shuffle mask. It is initially disabled for reasons specified in the test. It is re-enabled by patch 4/4. llvm-svn: 223791	2014-12-09 16:52:29 +00:00
Bill Schmidt	10f6eb91a0	[PowerPC 2/4] Little-endian adjustments for VSX insert/extract operations For little endian, we need to make some straightforward adjustments in the code expansions for scalar_to_vector and vector_extract of v2f64. First, scalar_to_vector must place the scalar into vector element zero. However, our implementation of SUBREG_TO_REG will place it into big-endian vector element zero (high-order bits), and for little endian we need it in the low-order bits. The LE implementation splats the high-order doubleword into the low-order doubleword. Second, the meaning of (vector_extract x, 0) and (vector_extract x, 1) must be reversed for similar reasons. A new test is added that tests code generation for insertelement and extractelement for both element 0 and element 1. It is disabled in this patch but enabled in patch 4/4, for reasons stated in the test. llvm-svn: 223788	2014-12-09 16:43:32 +00:00
Robert Khasanov	cbc5703aeb	[AVX512] Added VPBROADCAST{BWDQ} (Load with Broadcast Integer Data from General Purpose Register) encodings for AVX512-BW/VL subsets Added encoding tests. llvm-svn: 223787	2014-12-09 16:38:41 +00:00
Juergen Ributzka	c1bbcbbd32	[CodeGenPrepare] Split branch conditions into multiple conditional branches. This optimization transforms code like: bb1: %0 = icmp ne i32 %a, 0 %1 = icmp ne i32 %b, 0 %or.cond = or i1 %0, %1 br i1 %or.cond, label %TrueBB, label %FalseBB into multiple branch instructions like: bb1: %0 = icmp ne i32 %a, 0 br i1 %0, label %TrueBB, label %bb2 bb2: %1 = icmp ne i32 %b, 0 br i1 %1, label %TrueBB, label %FalseBB This optimization is already performed by SelectionDAG, but not by FastISel. FastISel cannot perform this optimization, because it cannot generate new MachineBasicBlocks. Performing this optimization at CodeGenPrepare time makes it available to both - SelectionDAG and FastISel - and the implementation in SelectionDAG could be removed. There are currently a few differences in codegen for X86 and PPC, so this commit only enables it for FastISel. Reviewed by Jim Grosbach This fixes rdar://problem/19034919. llvm-svn: 223786	2014-12-09 16:36:13 +00:00
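At the source level, the transformation in r223786 corresponds roughly to the two functions below. This is an illustrative sketch and not the CodeGenPrepare implementation: the function names are hypothetical, and the return values 1 and 0 stand in for reaching `TrueBB` and `FalseBB`.

```cpp
// Before: both comparisons are evaluated, or'ed together, and a single
// conditional branch tests the combined value.
int branchOnOr(int a, int b) {
  bool orCond = (a != 0) | (b != 0);
  if (orCond)
    return 1; // TrueBB
  return 0;   // FalseBB
}

// After: one conditional branch per comparison. The second comparison lives
// in its own block (bb2) and only runs when the first one is false.
int branchSplit(int a, int b) {
  if (a != 0)
    return 1; // bb1 -> TrueBB
  if (b != 0)
    return 1; // bb2 -> TrueBB
  return 0;   // bb2 -> FalseBB
}
```

Both functions return the same result for every input; the split form needs an extra basic block, which is the kind of block the commit message says FastISel cannot create by itself.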
Juergen Ributzka	e2aa3aa38a	Move function to obtain branch weights into the BranchInst class. NFC. Make this function available to other parts of LLVM. llvm-svn: 223784	2014-12-09 16:36:06 +00:00
Bill Schmidt	fae5d71584	[PowerPC 1/4] Little-endian adjustments for VSX loads/stores This patch addresses the inherent big-endian bias in the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. These instructions load vector elements into registers left-to-right (with the first element loaded into the high-order bits of the register), regardless of the endian setting of the processor. However, these are the only vector memory instructions that permit unaligned storage accesses, so we want to use them for little-endian. To make this work, an lxvd2x or lxvw4x is replaced with an lxvd2x followed by an xxswapd, which swaps the doublewords. This works for lxvw4x as well as lxvd2x, because for lxvw4x on an LE system the vector elements are in LE order (right-to-left) within each doubleword. (Thus after lxvd2x of a <4 x float> the elements will appear as 1, 0, 3, 2. Following the swap, they will appear as 3, 2, 1, 0, as desired.) For stores, an stxvd2x or stxvw4x is replaced with an stxvd2x preceded by an xxswapd. Introduction of extra swap instructions provides correctness, but obviously is not ideal from a performance perspective. Future patches will address this with optimizations to remove most of the introduced swaps, which have proven effective in other implementations. The introduction of the swaps is performed during lowering of LOAD, STORE, INTRINSIC_W_CHAIN, and INTRINSIC_VOID operations. The latter are used to translate intrinsics that specify the VSX loads and stores directly into equivalent sequences for little endian. Thus code that uses vec_vsx_ld and vec_vsx_st does not have to be modified to be ported from BE to LE. We introduce new PPCISD opcodes for LXVD2X, STXVD2X, and XXSWAPD for use during this lowering step. In PPCInstrVSX.td, we add new SDType and SDNode definitions for these (PPClxvd2x, PPCstxvd2x, PPCxxswapd). These are recognized during instruction selection and mapped to the correct instructions. 
Several tests that were written to use -mcpu=pwr7 or pwr8 are modified to disable VSX on LE variants because code generation changes with this and subsequent patches in this set. I chose to include all of these in the first patch rather than try to rigorously sort out which tests were broken by one or another of the patches. Sorry about that. The new test vsx-ldst-builtin-le.ll, and the changes to vsx-ldst.ll, are disabled until LE support is enabled because of breakages that occur as noted in those tests. They are re-enabled in patch 4/4. llvm-svn: 223783	2014-12-09 16:35:51 +00:00
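The effect of the doubleword swap in r223783 on a <4 x float> can be modelled with plain element indices. This is a sketch under assumptions: `Vec4`, `loadedOrderOnLE`, and `swapDoublewords` are hypothetical names, the array lists element indices from the high-order to the low-order end of the register, and the starting order 1, 0, 3, 2 is taken from the commit message rather than produced by a real load.

```cpp
#include <array>
#include <utility>

// Element indices, listed from high-order to low-order bits of the register.
using Vec4 = std::array<int, 4>;

// Starting point taken from the commit message: on an LE system the elements
// of a <4 x float> appear as 1, 0, 3, 2 after the load.
Vec4 loadedOrderOnLE() { return {1, 0, 3, 2}; }

// Model of xxswapd: exchange the two 64-bit halves of the register. Each
// half holds two 32-bit elements, so positions {0,1} trade with {2,3}.
Vec4 swapDoublewords(Vec4 v) {
  std::swap(v[0], v[2]);
  std::swap(v[1], v[3]);
  return v;
}
```

Applying `swapDoublewords` to 1, 0, 3, 2 gives 3, 2, 1, 0, so element 0 ends up in the low-order position, which is the little-endian register layout. Applying it twice restores the original order, consistent with stores using the same swap before the store instruction.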
Rafael Espindola	25a7e0a89f	Move method out of line to make buildbot happy. llvm-svn: 223781	2014-12-09 16:18:11 +00:00
Rafael Espindola	527e846ef7	Don't lookup an object symbol name in the module. Instead, walk the obj symbol list in parallel to find the GV. This shouldn't change anything on ELF where global symbols are not mangled, but it is a step toward supporting other object formats. Gold itself is ELF only, but bfd ld supports COFF and the logic in the gold plugin could be reused on lld. llvm-svn: 223780	2014-12-09 16:13:59 +00:00
Chandler Carruth	f57ac3bd22	[x86] Fix the test to actually test things for the CPU names, add the missing barcelona CPU which that test uncovered, and remove the 32-bit x86 CPUs which I really wasn't prepared to audit and test thoroughly. If anyone wants to clean up the 32-bit only x86 CPUs, go for it. Also, if anyone else wants to try to de-duplicate the AMD CPUs, that'd be cool, but from the looks of it wouldn't save as much as it did for the Intel CPUs. llvm-svn: 223774	2014-12-09 14:25:55 +00:00
Aaron Ballman	f588251b99	Removing an unused variable to silence a -Wunused-but-set-variable warning. NFC. llvm-svn: 223773	2014-12-09 13:20:11 +00:00
Asiri Rathnayake	7835e9b232	Fix modified immediate bug reported by MC Hammer. Instructions of the form [ADD Rd, pc, #imm] are manually aliased in processInstruction() to use ADR. To accommodate this, mod_imm handling had to be tweaked a bit. Turns out it was the manual aliasing that must be tweaked to accommodate mod_imms instead. More information about the parsed instruction is available at the point where processInstruction() is invoked, which makes it easier to detect a mod_imm at that point rather than trying to detect a potential alias when a mod_imm is being prepped. Added a test case and fixed some white spaces as well. llvm-svn: 223772	2014-12-09 13:14:58 +00:00
Chandler Carruth	af892403c2	[x86] Bring some sanity to the x86 CPU processor definitions. Notably, this adds simple micro-architecture names for the Intel CPU variants, and defines the old 'core'-based names as aliases. GCC has started to simplify their documented interface to use these names as well, so it seems like we can start to converge on a consistent pattern. I'd appreciate Intel double checking the entries that aren't yet documented widely, especially Atom (Bonnell and Silvermont), Knights Landing, and Skylake. But this change shouldn't break any existing users. Also, ran clang-format to re-format this code and it actually worked (modulo a tiny bug) so hopefully we can start to stop thinking about formatting this stuff. llvm-svn: 223769	2014-12-09 10:58:36 +00:00
Chandler Carruth	7415205113	Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. 
- We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. Differential Revision: http://reviews.llvm.org/D6548 llvm-svn: 223764	2014-12-09 08:55:32 +00:00
Michael Ilseman	2770c2d6d4	Skip declarations in the case of functions. This is a revert of r223521 in spirit, if not in content. I am not sure why declarations ended up in LazilyLinkGlobalValues in the first place; that will take some more investigation. llvm-svn: 223763	2014-12-09 08:20:06 +00:00
Elena Demikhovsky	fa4a6c18f7	AVX-512: Added some comments to ERI scalar intrinsics. No functional change. llvm-svn: 223761	2014-12-09 07:06:32 +00:00
Owen Anderson	558012a3fc	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64. llvm-svn: 223760	2014-12-09 06:50:39 +00:00
Mohit K. Bhakkad	e38c32ffec	test commit (spelling correction) llvm-svn: 223758	2014-12-09 06:31:07 +00:00
Michael Kuperstein	c69bb43f35	[X86] Convert esp-relative movs of function arguments into pushes, step 1 This handles the simplest case for mov -> push conversion: 1. x86-32 calling convention, everything is passed through the stack. 2. There is no reserved call frame. 3. Only registers or immediates are pushed, no attempt to combine a mem-reg-mem sequence into a single PUSHmm. Differential Revision: http://reviews.llvm.org/D6503 llvm-svn: 223757	2014-12-09 06:10:44 +00:00
David Majnemer	598bd05bd7	Reland r223754 The commit is identical except a reference to `GV' should have been to `GVal'. llvm-svn: 223756	2014-12-09 05:56:09 +00:00
David Majnemer	8d3e580cc7	Revert "AsmParser: Reject invalid mismatch between forward ref and def" This reverts commit r223754. I've upset the buildbots. llvm-svn: 223755	2014-12-09 05:50:11 +00:00
David Majnemer	e9efecaa52	AsmParser: Reject invalid mismatch between forward ref and def Don't assume that the forward referenced entity was of the same global-kind as the new entity. This fixes PR21779. llvm-svn: 223754	2014-12-09 05:43:56 +00:00
Bill Schmidt	0913500021	Restore r223709 as it was meant to be, and enable FeatureP8Vector for P8 llvm-svn: 223751	2014-12-09 03:02:48 +00:00
NAKAMURA Takumi	cc4487eb8b	Revert r223709, "[PowerPC]Activate FeatureVSX for the Power target", to unbreak bots. CodeGen/PowerPC/vsx-p8.ll was failing. '+power8-vector' is not a recognized feature for this target (ignoring feature) llvm/test/CodeGen/PowerPC/vsx-p8.ll:33:14: error: expected string not found in input ; CHECK-REG: lxvw4x 34, 0, 3 ^ <stdin>:50:2: note: scanning from here .align 3 ^ <stdin>:61:2: note: possible intended match here lvx 3, 0, 3 ^ llvm-svn: 223729	2014-12-09 01:03:27 +00:00
Hal Finkel	c8cf2b88bc	Handle early-clobber registers in the aggressive anti-dep breaker The aggressive anti-dep breaker, used by the PowerPC backend during post-RA scheduling (but is available to all targets), did not handle early-clobber MI operands (at all). When constructing the list of available registers for the replacement of some def operand, check the using instructions, and remove registers assigned to early-clobbered defs from the set. Fixes PR21452. llvm-svn: 223727	2014-12-09 01:00:59 +00:00
Tom Stellard	3e41dc419c	R600/SI: Set MayStore = 0 on MUBUF loads llvm-svn: 223722	2014-12-09 00:03:54 +00:00
Tom Stellard	3260ec41cf	R600/SI: Move setting of the lds bit to the base MUBUF class llvm-svn: 223721	2014-12-09 00:03:51 +00:00
Colin LeMahieu	5cf5632696	[Hexagon] Removing old def versions and replacing usages with versions that have encodings. llvm-svn: 223720	2014-12-08 23:55:43 +00:00
Tom Stellard	3e01d47d98	MISched: Fix moving stores across barriers This fixes an issue with ScheduleDAGInstrs::buildSchedGraph where stores without an underlying object would not be added as a predecessor to the current BarrierChain. llvm-svn: 223717	2014-12-08 23:36:48 +00:00
Colin LeMahieu	f5b4d655d2	[Hexagon] Adding any8, all8, and/or/xor/andn/orn/not predicate register forms, mask, and vitpack instructions and patterns. llvm-svn: 223710	2014-12-08 23:07:59 +00:00
Bill Seurer	05663d8589	[PowerPC]Activate FeatureVSX for the Power target This change activates FeatureVSX for Power 7 and Power 8 in PPC.td. http://reviews.llvm.org/D6570 llvm-svn: 223709	2014-12-08 23:07:12 +00:00
Hal Finkel	aa10b3caaf	[PowerPC] Don't use a non-allocatable register to implement the 'cc' alias GCC accepts 'cc' as an alias for 'cr0', and we need to do the same when processing inline asm constraints. This had previously been implemented using a non-allocatable register, named 'cc', that was listed as an alias of 'cr0', but the infrastructure does not seem to support this properly (neither the register allocator nor the scheduler properly accounts for the alias). Instead, we can just process this as a naming alias inside of the inline asm constraint-processing code, so we'll do that instead. There are two regression tests, one where the post-RA scheduler did the wrong thing with the non-allocatable alias, and one where the register allocator did the wrong thing. Fixes PR21742. llvm-svn: 223708	2014-12-08 22:54:22 +00:00
Colin LeMahieu	b6c4dd96f9	[Hexagon] Adding xtype doubleword add, sub, and, or, xor and patterns. llvm-svn: 223702	2014-12-08 22:19:14 +00:00
Colin LeMahieu	9bfe5473da	[Hexagon] Adding xtype doubleword comparisons. Removing unused multiclass. llvm-svn: 223701	2014-12-08 21:56:47 +00:00
Colin LeMahieu	025f860638	[Hexagon] Adding xtype parity, min, minu, max, maxu instructions. llvm-svn: 223693	2014-12-08 21:19:18 +00:00
Colin LeMahieu	8d1376c60e	[Hexagon] Adding xtype halfword add/sub ll/hl/lh/hh/sat/<<16 instructions. llvm-svn: 223692	2014-12-08 20:33:01 +00:00
Matt Arsenault	13bd95bbc7	R600/SI: Move continue after checking s_mov_b32. There's nothing else to bother trying to shrink these. llvm-svn: 223686	2014-12-08 19:55:43 +00:00
David Majnemer	770fd82f39	ConstantFold: Zero-sized globals might land on top of another global A zero sized array is zero sized and might share its address with another global. llvm-svn: 223684	2014-12-08 19:35:31 +00:00
Rafael Espindola	ef23711eee	Lazily link GlobalVariables and GlobalAliases. We were already lazily linking functions, but all GlobalValues can be treated uniformly for this. The test updates are to ensure that a given GlobalValue is still linked in. This fixes pr21494. llvm-svn: 223681	2014-12-08 18:45:16 +00:00
Colin LeMahieu	cc46cd8eec	[Hexagon] Adding add/sub with saturation. Removing unused def. Cleaning up shift patterns. llvm-svn: 223680	2014-12-08 18:33:49 +00:00
David Majnemer	d5b3aa49ac	InstSimplify: Try to bring back the rest of r223583 This reverts r223624 with a small tweak, hopefully this will make stage3 equivalent. llvm-svn: 223679	2014-12-08 18:30:43 +00:00
Bruno Cardoso Lopes	27de9b0f70	[CompactUnwind] Fix register encoding logic Fix a compact unwind encoding logic bug which would try to encode more callee saved registers than it should, leading to an early bailout in the encoding logic and unnecessary use of DWARF frame mode. Also remove no-compact-unwind.ll which was testing the wrong thing based on this bug and move it to valid 'compact unwind' tests. Added a few more tests too. llvm-svn: 223676	2014-12-08 18:18:32 +00:00
Rafael Espindola	beadd56a7d	Don't crash when the key of a comdat is lazily linked. llvm-svn: 223673	2014-12-08 18:05:48 +00:00
Justin Bogner	61ba2e3996	InstrProf: An intrinsic and lowering for instrumentation based profiling Introduce the ``llvm.instrprof_increment`` intrinsic and the ``-instrprof`` pass. These provide the infrastructure for writing counters for profiling, as in clang's ``-fprofile-instr-generate``. The implementation of the instrprof pass is ported directly out of the CodeGenPGO classes in clang, and with the followup in clang that rips that code out to use these new intrinsics this ends up being NFC. Doing the instrumentation this way opens some doors in terms of improving the counter performance. For example, this will make it simple to experiment with alternate lowering strategies, and allows us to try handling profiling specially in some optimizations if we want to. Finally, this drastically simplifies the frontend and puts all of the lowering logic in one place. llvm-svn: 223672	2014-12-08 18:02:35 +00:00
Tim Northover	67be569a31	AArch64: treat HFAs containing "half" types as blocks too. llvm-svn: 223669	2014-12-08 17:54:58 +00:00
Andrea Di Biagio	d80836ed09	[X86] Improved tablegen patterns for matching TZCNT/LZCNT. Teach ISel how to match a TZCNT/LZCNT from a conditional move if the condition code is X86_COND_NE. Existing tablegen patterns only allowed to match TZCNT/LZCNT from a X86cond with condition code equal to X86_COND_E. To avoid introducing extra rules, I added an 'ImmLeaf' definition that checks if the condition code is COND_E or COND_NE. llvm-svn: 223668	2014-12-08 17:47:18 +00:00
Colin LeMahieu	b56e6cd9b9	[Hexagon] Adding combine reg, reg with predicated forms. llvm-svn: 223667	2014-12-08 17:33:06 +00:00
Colin LeMahieu	a55070dbdd	[Hexagon] Adding packhl instruction. llvm-svn: 223664	2014-12-08 17:01:18 +00:00
Daniel Sanders	c8a040c390	[mips] Add Mips-specific CCIf's for accessing the MipsCCState. NFC. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6213 llvm-svn: 223662	2014-12-08 15:40:09 +00:00
Andrea Di Biagio	64bc246f3f	[X86] Improved lowering of packed v8i16 vector shifts by non-constant count. Before this patch, the backend sub-optimally expanded the non-constant shift count of a v8i16 shift into a sequence of two 'movd' plus 'movzwl'. With this patch the backend checks if the target features sse4.1. If so, then it lets the shuffle legalizer deal with the expansion of the shift amount. Example: ;; define <8 x i16> @test(<8 x i16> %A, <8 x i16> %B) { %shamt = shufflevector <8 x i16> %B, <8 x i16> undef, <8 x i32> zeroinitializer %shl = shl <8 x i16> %A, %shamt ret <8 x i16> %shl } ;; Before (with -mattr=+avx): vmovd %xmm1, %eax movzwl %ax, %eax vmovd %eax, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq Now: vpxor %xmm2, %xmm2, %xmm2 vpblendw $1, %xmm1, %xmm2, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq llvm-svn: 223660	2014-12-08 14:36:51 +00:00
Rafael Espindola	3519da82b8	Move the ValueMap lookup inside linkFunctionBody. NFC. llvm-svn: 223659	2014-12-08 14:25:26 +00:00
Rafael Espindola	a314d1aca4	Use range loops. NFC. llvm-svn: 223658	2014-12-08 14:20:10 +00:00
Rafael Espindola	21ec84eb81	Use range loops. NFC. llvm-svn: 223657	2014-12-08 14:05:33 +00:00
Rafael Espindola	869d1ce811	Fix linking of prologue data. It would crash when the function was lazy linked. llvm-svn: 223656	2014-12-08 13:44:38 +00:00
Rafael Espindola	f97d0cbe58	Simple style fixes. * Use a range loop. * Move simple continue checks earlier. * clang-format. llvm-svn: 223654	2014-12-08 13:35:09 +00:00
Rafael Espindola	40d7ebed8a	Move materialize/Dematerialize calls to linkFunctionBody. NFC. Just less code duplication. llvm-svn: 223653	2014-12-08 13:29:33 +00:00
Elena Demikhovsky	68e04b8613	X86 intrinsics moved from X86ISelLowering.cpp to X86IntrinsicsInfo.h X86ISelLowering.cpp has a long switch for intrinsics. I moved a part of this long switch to the new intrinsics table in X86IntrinsicsInfo.h. No functional changes, just code and compile time optimization. llvm-svn: 223641	2014-12-08 09:03:08 +00:00
NAKAMURA Takumi	2b6e662672	Revert a part of r223583, for now. It seems to cause different emission between stage2(gcc-clang) and stage3 clang. Investigating. llvm-svn: 223624	2014-12-08 02:07:22 +00:00
Duncan P. N. Exon Smith	9c51b50a71	IR: Revert r223618 behaviour of MDNode::concatenate() r223618 included special handling of `MDNode::intersect()`: if the first operand is a self-reference with the same operands you're trying to return, return it instead. Reuse that handling in `MDNode::concatenate()` in the hopes that it fixes a polly test that seems to rely on the old behaviour [1]. [1]: http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/25167 llvm-svn: 223619	2014-12-07 20:32:11 +00:00
Duncan P. N. Exon Smith	ac8ee289eb	IR: Drop uniquing for self-referencing MDNodes It doesn't make sense to unique self-referencing nodes. Drop uniquing for them. Note that `MDNode::intersect()` occasionally returns self-referencing nodes. Previously these would be returned by `MDNode::get()`. I'm not convinced this was intended behaviour -- to me it seems it should return a node whose only operand is the self-reference -- but I don't know much about alias scopes so I'm preserving it for now. This is part of PR21532. llvm-svn: 223618	2014-12-07 19:52:06 +00:00
Duncan P. N. Exon Smith	545a9b0f51	IR: Add missing tests for function-local metadata Add assembly and bitcode tests that I neglected to add in r223564 (IR: Disallow complicated function-local metadata) and r223574 (IR: Disallow function-local metadata attachments). Found a couple of bugs: - The error message for function-local attachments gave the wrong line number -- it indicated the next token (typically on the next line) instead of the token that started the attachment. Fixed. - Metadata arguments of the form `!{i32 0, i32 %v}` (or with the arguments reversed) fired an assertion in `ValueEnumerator` in LLVM v3.5, so I suppose this never really worked. I suppose this was "fixed" by r223564. (Thanks to dblaikie for pointing out my omission.) Part of PR21532. llvm-svn: 223616	2014-12-07 17:56:16 +00:00
Marek Olsak	fa58e5e111	R600/SI: Disable VMEM and SMEM clauses by breaking them with S_NOP This is only a workaround. llvm-svn: 223615	2014-12-07 17:17:43 +00:00
Marek Olsak	58f61a84e7	R600/SI: Set 20-bit immediate byte offset for SMRD on VI llvm-svn: 223614	2014-12-07 17:17:38 +00:00
Marek Olsak	be047806d1	R600/SI: Update instruction conversions for VI There are 3 changes: - Convert 32-bit S_LSHL/LSHR/ASHR to their V_*REV variants for VI - Lower RSQ_CLAMP for VI - Don't generate MIN/MAX_LEGACY on VI llvm-svn: 223604	2014-12-07 12:19:03 +00:00
Marek Olsak	5df00d63e2	R600/SI: Add VI instructions llvm-svn: 223603	2014-12-07 12:18:57 +00:00
Marek Olsak	b08604c4cd	R600/SI: Add SCC Defs/Uses to SOP1 and SOP2 opcodes llvm-svn: 223602	2014-12-07 12:18:45 +00:00
Benjamin Kramer	3280a5d9f5	Turn some DenseMaps that are only used for set operations into DenseSets. DenseSet has better memory efficiency now. llvm-svn: 223589	2014-12-06 19:22:54 +00:00
Benjamin Kramer	89e5306f43	Make the DenseMap bucket type configurable and use a smaller bucket for DenseSet. DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled the memory footprint of the map. Now we use a compressed set so the second element uses no memory at all. This required some surgery on DenseMap as all accesses to the bucket now have to go through methods; this should have no impact on the behavior of DenseMap though. The new default bucket type for DenseMap is a slightly extended std::pair as we expose it through DenseMap's iterator and don't want to break any existing users. llvm-svn: 223588	2014-12-06 19:22:44 +00:00
Benjamin Kramer	8e5dc53784	Reapply "LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps." This reapplies r223478 with a fix for 32 bit targets. llvm-svn: 223586	2014-12-06 13:12:56 +00:00
David Majnemer	64ba326b1e	ConstantFold: Don't optimize comparisons with weak linkage objects Consider: void f() {} void __attribute__((weak)) g() {} bool b = &f != &g; It's possible for g to resolve to f if --defsym=g=f is passed on to the linker. llvm-svn: 223585	2014-12-06 11:58:33 +00:00
David Majnemer	ed00cd20ad	I didn't intend to commit this change. llvm-svn: 223584	2014-12-06 10:52:32 +00:00
David Majnemer	1af36e5baf	InstSimplify: Optimize away useless unsigned comparisons Code like X < Y && Y == 0 should always be folded away to false. llvm-svn: 223583	2014-12-06 10:51:40 +00:00
NAKAMURA Takumi	fc3062f65a	Reformat. llvm-svn: 223580	2014-12-06 05:57:06 +00:00
Tom Stellard	8d5f5e4238	R600/SI: Restore PrivateGlobalPrefix to the default ELF value of ".L" This was changed in r223323. llvm-svn: 223579	2014-12-06 05:34:34 +00:00
Duncan P. N. Exon Smith	35303fd739	IR: Disallow function-local metadata attachments Metadata attachments to instructions cannot be function-local. This is part of PR21532. llvm-svn: 223574	2014-12-06 02:29:44 +00:00
NAKAMURA Takumi	6980404cfe	LLVMInstrumentation requires MC since r223532. llvm-svn: 223573	2014-12-06 02:22:11 +00:00
Ahmed Bougacha	8b54286d1c	[X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns. Most patterns will go away once the extload legalization changes land. Differential Revision: http://reviews.llvm.org/D6125 llvm-svn: 223567	2014-12-06 01:31:07 +00:00
Hans Wennborg	08de833c1c	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-negligible binary size increase. It might be worth looking into some more. SimplifyCFG currently does this transformation, but I'm working towards changing that so we can optimize harder based on unreachable defaults. Differential Revision: http://reviews.llvm.org/D6510 llvm-svn: 223566	2014-12-06 01:28:50 +00:00
Duncan P. N. Exon Smith	da41af9e94	IR: Disallow complicated function-local metadata Disallow complex types of function-local metadata. The only valid function-local metadata is an `MDNode` whose sole argument is a non-metadata function-local value. Part of PR21532. llvm-svn: 223564	2014-12-06 01:26:49 +00:00
Duncan P. N. Exon Smith	b236211c4c	Utils: Style cleanups, NFC llvm-svn: 223556	2014-12-06 00:48:17 +00:00
Duncan P. N. Exon Smith	b13f7d2e36	Utils: Avoid RAUW on metadata in CloneFunction() llvm-svn: 223555	2014-12-06 00:48:13 +00:00
Nick Lewycky	05044c248e	Canonicalize multiplies by looking at whether the operands have any constants themselves. Patch by Tim Murray! llvm-svn: 223554	2014-12-06 00:45:50 +00:00
Tim Northover	5e84fe3ed4	AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes. All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in their choice. No functional change. llvm-svn: 223551	2014-12-06 00:33:37 +00:00
Benjamin Kramer	0dc0e54272	Revert "LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps." Somehow made DenseMap probe on forever on 32 bit machines. This reverts commit r223478. llvm-svn: 223546	2014-12-06 00:02:31 +00:00

74854 Commits
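
The r223588 entry (89e5306f43) describes DenseSet moving from a DenseMap<Key, char> layout to a bucket that stores only the key, with all bucket accesses going through methods. The sketch below is one rough way to picture that idea. It is not LLVM's implementation: `MapStyleBucket`, `SetStyleBucket`, `key()` and `bytesFor` are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical bucket types for illustration only; LLVM's real bucket
// types are different and are not reproduced here.

// Old-style layout: a set stored as a map from KeyT to a dummy char.
// The char plus alignment padding makes each bucket larger than the key.
template <typename KeyT> struct MapStyleBucket {
  std::pair<KeyT, char> KV;
  KeyT &key() { return KV.first; }
};

// Key-only layout: the bucket holds nothing but the key, so the unused
// "value" occupies no memory.
template <typename KeyT> struct SetStyleBucket {
  KeyT Key;
  KeyT &key() { return Key; }
};

// Container code that reaches the key only through key() can be
// instantiated with either bucket type. This helper (also hypothetical)
// reports the storage needed for a table of NumBuckets buckets.
template <typename BucketT> std::size_t bytesFor(std::size_t NumBuckets) {
  return NumBuckets * sizeof(BucketT);
}
```

For pointer keys, `bytesFor<MapStyleBucket<void *>>(N)` is typically twice `bytesFor<SetStyleBucket<void *>>(N)` on common ABIs because of alignment padding, which matches the "usually doubled the memory footprint" remark in the commit message. The exact ratio depends on the key type and platform.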