llvm-project

Commit Graph

Author	SHA1	Message	Date
Erik Eckstein	d181752be0	InstCombine: simplify signed range checks Try to convert two compares of a signed range check into a single unsigned compare. Examples: (icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n (icmp slt x, 0) \| (icmp sgt x, n) --> icmp ugt x, n llvm-svn: 223224	2014-12-03 10:39:15 +00:00
Hal Finkel	c91fc11181	[PowerPC] Print all inline-asm consts as signed numbers Almost all immediates in PowerPC assembly (both 32-bit and 64-bit) are signed numbers, and it is important that we print them as such. To make sure that happens, we change PPCTargetLowering::LowerAsmOperandForConstraint so that it does all intermediate checks on a signed-extended int64_t value, and then creates the resulting target constant using MVT::i64. This will ensure that all negative values are printed as negative values (mirroring what is done in other backends to achieve the same sign-extension effect). This came up in the context of inline assembly like this: "add%I2 %0,%0,%2", ..., "Ir"(-1ll) where we used to print: addi 3,3,4294967295 and gcc would print: addi 3,3,-1 and gas accepts both forms, but our builtin assembler (correctly) does not. Now we print -1 like gcc does. While here, I replaced a bunch of custom integer checks with isInt<16> and friends from MathExtras.h. Thanks to Paul Hargrove for the bug report. llvm-svn: 223220	2014-12-03 09:37:50 +00:00
Charlie Turner	f02c92489a	Emit ABI_FP_rounding attribute. LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218	2014-12-03 08:12:26 +00:00
Charlie Turner	1620a69fe8	Add tests for default value of Tag_ABI_FP_rounding. Change-Id: I051866d073fc6ce87ce3e693a3762da6d81f4393 llvm-svn: 223217	2014-12-03 07:59:50 +00:00
Rafael Espindola	2fa1e43a22	Ask the module for its the identified types. When lazy reading a module, the types used in a function will not be visible to a TypeFinder until the body is read. This patch fixes that by asking the module for its identified struct types. If a materializer is present, the module asks it. If not, it uses a TypeFinder. This fixes pr21374. I will be the first to say that this is ugly, but it was the best I could find. Some of the options I looked at: * Asking the LLVMContext. This could be made to work for gold, but not currently for ld64. ld64 will load multiple modules into a single context before merging them. This causes us to see types from future merges. Unfortunately, MappedTypes is not just a cache when it comes to opaque types. Once the mapping has been made, we have to remember it for as long as the key may be used. This would mean moving MappedTypes to the Linker class and having to drop the Linker::LinkModules static methods, which are visible from C. * Adding an option to ignore function bodies in the TypeFinder. This would fix the PR by picking the worst result. It would work, but unfortunately we are currently quite dependent on the upfront type merging. I will try to reduce our dependency, but it is not clear that we will be able to get rid of it for now. The only clean solution I could think of is making the Module own the types. This would have other advantages, but it is a much bigger change. I will propose it, but it is nice to have this fixed while that is discussed. With the gold plugin, this patch takes the number of types in the LTO clang binary from 52817 to 49669. llvm-svn: 223215	2014-12-03 07:18:23 +00:00
Matt Arsenault	becd656c7c	R600/SI: Remove i1 pseudo VALU ops Select i1 logical ops directly to 64-bit SALU instructions. Vector i1 values are always really in SGPRs, with each bit for each item in the wave. This saves about 4 instructions when and/or/xoring any condition, and also helps write conditions that need to be passed in vcc. This should work correctly now that the SGPR live range fixing pass works. More work is needed to eliminate the VReg_1 pseudo regclass and possibly the entire SILowerI1Copies pass. llvm-svn: 223206	2014-12-03 05:22:35 +00:00
Tom Stellard	1f0dded057	StructurizeCFG: Use LoopInfo analysis for better loop detection We were assuming that each back-edge in a region represented a unique loop, which is not always the case. We need to use LoopInfo to correctly determine which back-edges are loops. llvm-svn: 223199	2014-12-03 04:28:32 +00:00
Tom Stellard	369308061b	R600/SI: Enable inline assembly We just needed to remove the assertion in AMDGPURegisterInfo::getFrameRegister(), which is called when initializing the parser for inline assembly. llvm-svn: 223197	2014-12-03 04:08:00 +00:00
Matt Arsenault	fb13b22d9a	R600/SI: Change mubuf offsets to print as decimal This matches SC's behavior. llvm-svn: 223194	2014-12-03 03:12:13 +00:00
Nick Lewycky	2e8a6219fc	Emit the entry block first and the exit block second, then all the blocks in between afterwards. This is what gcc always does, and some out of tree tools depend on that. llvm-svn: 223193	2014-12-03 02:45:01 +00:00
Peter Collingbourne	51d2de7b9e	Prologue support Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are a two primary usecases that these attributes aim to serve, 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189	2014-12-03 02:08:38 +00:00
Ahmed Bougacha	d65f787a5f	[X86][MC] Intel syntax: accept implicit memory operand sizes larger than 80. The X86AsmParser intel handling was refactored in r216481, making it try each different memory operand size to see which one matches. Operand sizes larger than 80 ("[xyz]mmword ptr") were forgotten, which led to an "invalid operand" error for code such as: movdqa [rax], xmm0 llvm-svn: 223187	2014-12-03 02:03:26 +00:00
Hal Finkel	01fa7701e6	[PowerPC] Fix readcyclecounter to be custom expanded for all 32-bit targets We need to use the custom expansion of readcyclecounter on all 32-bit targets (even those with 64-bit registers). This should fix the ppc64 buildbot. llvm-svn: 223182	2014-12-03 00:19:17 +00:00
Tim Northover	4a8ac260cc	AArch64: strengthen Darwin ABI alignment assumptions A global variable without an explicit alignment specified should be assumed to be ABI-aligned according to its type, like on other platforms. This allows us to use better memory operations when accessing it. rdar://18533701 llvm-svn: 223180	2014-12-02 23:53:43 +00:00
Tim Northover	ec7ebebe55	AArch64: don't be too greedy when folding :lo12: accesses into mem ops. This frequently leads to cases like: ldr xD, [xN, :lo12:var] add xA, xN, :lo12:var ldr xD, [xA, #8] where the ADD would have been needed anyway, and the two distinct addressing modes can prevent the formation of an ldp. Because of how we handle ADRP (aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also results in duplicated ADRP instructions (one on its own to cover the ldr, and one combined with the add). llvm-svn: 223172	2014-12-02 23:13:39 +00:00
Michael Zolotukhin	ea8327b80f	PR21302. Vectorize only bottom-tested loops. rdar://problem/18886083 llvm-svn: 223171	2014-12-02 22:59:06 +00:00
Michael Zolotukhin	540580ca06	Apply loop-rotate to several vectorizer tests. Such loops shouldn't be vectorized due to the loops form. After applying loop-rotate (+simplifycfg) the tests again start to check what they are intended to check. llvm-svn: 223170	2014-12-02 22:59:02 +00:00
Simon Pilgrim	6b988ad8f2	[X86][SSE] Keep 4i32 vector insertions in integer domain on SSE4.1 targets 4i32 shuffles for single insertions into zero vectors lowers to X86vzmovl which was using (v)blendps - causing domain switch stalls. This patch fixes this by using (v)pblendw instead. The updated tests on test/CodeGen/X86/sse41.ll still contain a domain stall due to the use of insertps - I'm looking at fixing this in a future patch. Differential Revision: http://reviews.llvm.org/D6458 llvm-svn: 223165	2014-12-02 22:31:23 +00:00
Hal Finkel	bbdee93638	[PowerPC] Implement readcyclecounter for PPC32 We've long supported readcyclecounter on PPC64, but it is easier there (the read of the 64-bit time-base register can be accomplished via a single instruction). This now provides an implementation for PPC32 as well. On PPC32, the time-base register is still 64 bits, but can only be read 32 bits at a time via two separate SPRs. The ISA manual explains how to do this properly (it involves re-reading the upper bits and looping if the counter has wrapped while being read). This requires PPC to implement a custom integer splitting legalization for the READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then gets turned into a pseudo-instruction, which is then expanded to the necessary sequence (which has three SPR reads, the comparison and the branch). Thanks to Paul Hargrove for pointing out to me that this was still unimplemented. llvm-svn: 223161	2014-12-02 22:01:00 +00:00
Lang Hames	a7395bf49b	[AArch64][Stackmaps] Optimize stackmap shadows on AArch64. Reduce the number of nops emitted for stackmap shadows on AArch64 by counting non-stackmap instructions up to the next branch target towards the requested shadow. <rdar://problem/14959522> llvm-svn: 223156	2014-12-02 21:36:24 +00:00
Tom Stellard	4df465bd5e	R600/SI: Move more information into SIProgramInfo struct llvm-svn: 223154	2014-12-02 21:28:53 +00:00
Matt Arsenault	6f1e96b437	R600: Cleanup some tests and add missing testcases llvm-svn: 223151	2014-12-02 21:02:20 +00:00
Daniel Sanders	d134c9dac4	[mips] Fix passing of small structures for big-endian O32. Summary: Like N32/N64, they must be passed in the upper bits of the register. The new code could be merged with the existing if-statements but I've refrained from doing this since it will make porting the O32 implementation to tablegen harder later. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6463 llvm-svn: 223148	2014-12-02 20:40:27 +00:00
Roman Divacky	7e6b5955d4	Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing. Previously .cpu directive in ARM assembler didnt switch to the new CPU and therefore acted as a nop. This implemented real action for .cpu and eg. allows to assembler FreeBSD kernel with -integrated-as. llvm-svn: 223147	2014-12-02 20:03:22 +00:00
Philip Reames	1a1bdb22bf	[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into it's own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straight forward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principal, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure. Reviewed by: atrick, ributzka llvm-svn: 223137	2014-12-02 18:50:36 +00:00
Bruno Cardoso Lopes	15520db9ad	[SwitchLowering] Handle destinations on multiple phi instructions Follow up from r222926. Also handle multiple destinations from merged cases on multiple and subsequent phi instructions. rdar://problem/19106978 llvm-svn: 223135	2014-12-02 18:31:53 +00:00
Ahmed Bougacha	54b7d334c7	[MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction. Go through implicit defs of CSMI and MI, and clear the kill flags on their uses in all the instructions between CSMI and MI. We might have made some of the kill flags redundant, consider: subs ... %NZCV<imp-def> <- CSMI csinc ... %NZCV<imp-use,kill> <- this kill flag isn't valid anymore subs ... %NZCV<imp-def> <- MI, to be eliminated csinc ... %NZCV<imp-use,kill> Since we eliminated MI, and reused a register imp-def'd by CSMI (here %NZCV), that register, if it was killed before MI, should have that kill flag removed, because it's lifetime was extended. Also, add an exhaustive testcase for the motivating example. Reviewed by: Juergen Ributzka <juergen@apple.com> llvm-svn: 223133	2014-12-02 18:09:51 +00:00
Tim Northover	24ec87debb	AArch64: make register block rules apply to vector types too. The blocking code originated in ARM, which is more aggressive about casting types to a canonical representative before doing anything else, so I missed out most vector HFAs and broke the ABI. This should fix it. llvm-svn: 223126	2014-12-02 17:15:22 +00:00
Tom Stellard	794c8c0f78	R600/SI: Set the ATC bit on all resource descriptors for the HSA runtime llvm-svn: 223125	2014-12-02 17:05:41 +00:00
Bruno Cardoso Lopes	d035fbb96f	[LICM] Avoind store sinking if no preheader is available Load instructions are inserted into loop preheaders when sinking stores and later removed if not used by the SSA updater. Avoid sinking if the loop has no preheader and avoid crashes. This fixes one more side effect of not handling indirectbr instructions properly on LoopSimplify. llvm-svn: 223119	2014-12-02 14:22:34 +00:00
Asiri Rathnayake	a0199b9a59	Add support for ARM modified-immediate assembly syntax. Certain ARM instructions accept 32-bit immediate operands encoded as a 8-bit integer value (0-255) and a 4-bit rotation (0-30, even). Current ARM assembly syntax support in LLVM allows the decoded (32-bit) immediate to be specified as a single immediate operand for such instructions: mov r0, #4278190080 The ARMARM defines an extended assembly syntax allowing the encoding to be made more explicit, as in: mov r0, #255, #8 ; (same 32-bit value as above) The behaviour of the two instructions can be different w.r.t flags, which is documented under "Modified immediate constants" in ARMARM. This patch enables support for this extended syntax at the MC layer. llvm-svn: 223113	2014-12-02 10:53:20 +00:00
Charlie Turner	15f91c5240	Emit Tag_ABI_FP_denormal correctly in fast-math mode. The default ARM floating-point mode does not support IEEE 754 mode exactly. Of relevance to this patch is that input denormals are flushed to zero. The way in which they're flushed to zero depends on the architecture, * For VFPv2, it is implementation defined as to whether the sign of zero is preserved. * For VFPv3 and above, the sign of zero is always preserved when a denormal is flushed to zero. When FP support has been disabled, the strategy taken by this patch is to assume the software support will mirror the behaviour of the hardware support for the target if it existed. That is, for architectures which can only have VFPv2, it is assumed the software will flush to positive zero. For later architectures it is assumed the software will flush to zero preserving sign. Change-Id: Icc5928633ba222a4ba3ca8c0df44a440445865fd llvm-svn: 223110	2014-12-02 08:22:29 +00:00
Sonam Kumari	f2eacabd66	[signext.ll] Removal Of Duplicate Test Cases Removed the duplicate test case existing in signext.ll file. llvm-svn: 223109	2014-12-02 05:29:47 +00:00
Hal Finkel	afcd8dbbcf	Simplify pointer comparisons involving memory allocation functions System memory allocation functions, which are identified at the IR level by the noalias attribute on the return value, must return a pointer into a memory region disjoint from any other memory accessible to the caller. We can use this property to simplify pointer comparisons between allocated memory and local stack addresses and the addresses of global variables. Neither the stack nor global variables can overlap with the region used by the memory allocator. Fixes PR21556. llvm-svn: 223093	2014-12-01 23:38:06 +00:00
Philip Reames	337c4bd4ab	[Statepoints 1/4] Statepoint infrastructure for garbage collection: IR Intrinsics The statepoint intrinsics are intended to enable precise root tracking through the compiler as to support garbage collectors of all types. The addition of the statepoint intrinsics to LLVM should have no impact on the compilation of any program which does not contain them. There are no side tables created, no extra metadata, and no inhibited optimizations. A statepoint works by transforming a call site (or safepoint poll site) into an explicit relocation operation. It is the frontend's responsibility (or eventually the safepoint insertion pass we've developed, but that's not part of this patch series) to ensure that any live pointer to a GC object is correctly added to the statepoint and explicitly relocated. The relocated value is just a normal SSA value (as seen by the optimizer), so merges of relocated and unrelocated values are just normal phis. The explicit relocation operation, the fact the statepoint is assumed to clobber all memory, and the optimizers standard semantics ensure that the relocations flow through IR optimizations correctly. This is the first patch in a small series. This patch contains only the IR parts; the documentation and backend support will be following separately. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683. Reviewed by: atrick, ributzka llvm-svn: 223078	2014-12-01 21:18:12 +00:00
Jingyue Wu	5b62eb9b48	[NVPTX] Do not emit .weak symbols for NVPTX Summary: ".weak" symbols cannot be consumed by ptxas (PR21685). This patch makes the weak directive in MCAsmPrinter customizable, and disables emitting ".weak" symbols for NVPTX. Test Plan: weak-linkage.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: majnemer, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6455 llvm-svn: 223077	2014-12-01 21:16:17 +00:00
Reid Kleckner	35fc363ce8	Parse 'ghccc' in .ll files as the GHC convention (cc 10) Previously we just used "cc 10" in the .ll files, but that isn't very human readable. llvm-svn: 223076	2014-12-01 21:04:44 +00:00
Ahmed Bougacha	d0ce058f2c	[AArch64] Don't combine "select (setcc i1 LHS, RHS), vL, vR". r208210 introduced an optimization that improves the vector select codegen by doing the setcc on vectors directly. This is a problem they the setcc operands are i1s, because the optimization would create vectors of i1, which aren't legal. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6308 llvm-svn: 223075	2014-12-01 20:59:00 +00:00
Ahmed Bougacha	879463206e	[AArch64] Fix v2i8->i16 bitcast legalization. r213378 improved f16 bitcasts, so that they go directly through subregs, instead of through the stack. That code now causes an assertion failure for bitcasts from other 16-bits types (most importantly v2i8). Correct that by doing the custom lowering for i16 bitcasts only when the input is an f16. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6307 llvm-svn: 223074	2014-12-01 20:52:32 +00:00
Peter Zotov	0d040f66a5	[OCaml] Move Llvm.clone_module to its own Llvm_transform_utils module. This way most code won't link this (substantially large) library, if compiled statically with LLVM. llvm-svn: 223072	2014-12-01 19:50:39 +00:00
Peter Zotov	b20073c63c	[OCaml] [cmake] Add CMake buildsystem for OCaml. Closes PR15325. llvm-svn: 223071	2014-12-01 19:50:23 +00:00
Ahmed Bougacha	fb6eeb74c5	[MachineVerifier] Accept a MBB with a single landing pad successor. The MachineVerifier used to check that there was always exactly one unconditional branch to a non-landingpad (normal) successor. If that normal successor to an invoke BB is unreachable, it seems reasonable to only have one successor, the landing pad. On targets other than AArch64 (and on AArch64 with a different testcase), the branch folder turns the branch to the landing pad into a fallthrough. The MachineVerifier, which relies on AnalyzeBranch, is unable to check the condition, and doesn't complain. However, it does in this specific testcase, where the branch to the landing pad remained. Make the MachineVerifier accept it. llvm-svn: 223059	2014-12-01 18:43:53 +00:00
Tim Northover	3024b5535c	ARM: lower tail calls correctly when using GHC calling convention. Patch by Ben Gamari. llvm-svn: 223055	2014-12-01 17:46:39 +00:00
Hans Wennborg	5bef5b522b	Revert r223049, r223050 and r223051 while investigating test failures. I didn't foresee affecting the Clang test suite :/ llvm-svn: 223054	2014-12-01 17:36:43 +00:00
Hans Wennborg	269ebb612e	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable They would get optimized away later, but we might as well not emit them. llvm-svn: 223051	2014-12-01 17:08:38 +00:00
Hans Wennborg	5a1e5c05d8	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations, and SDag lowering is now prepared to handle them efficiently. For example, branches to the unreachable destination will be optimized away, such as in the case of range checks for switch lookup tables. On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and Chromium by 30 kB). llvm-svn: 223050	2014-12-01 17:08:35 +00:00
Hans Wennborg	1571336fb2	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-neglible binary size increase. It might be worth looking into some more. llvm-svn: 223049	2014-12-01 17:08:32 +00:00
Rafael Espindola	a4e85e3db0	Partial revert of r222986. The explicit set of destination types is not fully redundant when lazy loading since the TypeFinder will not find types used only in function bodies. This keeps the logic to drop the name of mapped types since it still helps with avoiding further renaming. llvm-svn: 223043	2014-12-01 16:32:20 +00:00
Vladimir Medic	b682ddf33a	The andi16, addiusp and jraddiusp micromips instructions were missing dedicated decoder methods in MipsDisassembler.cpp to properly decode immediate operands. These methods are added together with corresponding tests. llvm-svn: 223006	2014-12-01 11:12:04 +00:00
Jay Foad	1f0a44e662	[PowerPC] Fix unwind info with dynamic stack realignment Summary: PowerPC DWARF unwind info defined CFA as SP + offset even in a function where the stack had been dynamically realigned. This clearly doesn't work because the offset from SP to CFA is not a constant. Fix it by defining CFA as BP instead. This was causing the AddressSanitizer null_deref test to fail 50% of the time, depending on whether SP happened to be 32-byte aligned on entry to a particular function or not. Reviewers: willschm, uweigand, hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6410 llvm-svn: 222996	2014-12-01 09:42:32 +00:00
Sonam Kumari	237cfa9916	Removed extra whitespace. (Testing commit access). NFC. llvm-svn: 222994	2014-12-01 09:27:46 +00:00
Charlie Turner	30895f9ab8	Add post-decode checking of HVC instruction. Add checkDecodedInstruction for post-decode checking of instructions, to catch the corner cases like HVC that don't fit into the general pattern. Needed to check for an invalid condition field in instruction encoding despite HVC not taking a predicate. Patch by Matthew Wahab. Change-Id: I48e28de981d7a9e43569594da3c45fb478b4f795 llvm-svn: 222992	2014-12-01 08:50:27 +00:00
Yury Gribov	3ae427d811	[asan] Change dynamic alloca instrumentation to only consider allocas that are dominating all exits from function. Reviewed in http://reviews.llvm.org/D6412 llvm-svn: 222991	2014-12-01 08:47:58 +00:00
Charlie Turner	7de905cd17	Add Thumb HVC and ERET virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: I131f71c1150d5fa797066a18e09d526c19bf9016 llvm-svn: 222990	2014-12-01 08:39:19 +00:00
Charlie Turner	4d88ae2002	Add ARM ERET and HVC virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: Iad75f078fbaa4ecc7d7a4820ad9b3930679cbbbb llvm-svn: 222989	2014-12-01 08:33:28 +00:00
Akira Hatanaka	b9991a2656	[stack protector] Set edge weights for newly created basic blocks. This commit fixes a bug in stack protector pass where edge weights were not set when new basic blocks were added to lists of successor basic blocks. Differential Revision: http://reviews.llvm.org/D5766 llvm-svn: 222987	2014-12-01 04:27:03 +00:00
Rafael Espindola	04a74af734	Change how we keep track of which types are in the dest module. Instead of keeping an explicit set, just drop the names of types we choose to map to some other type. This has the advantage that the name of the unused will not cause the context to rename types on module read. llvm-svn: 222986	2014-12-01 04:15:59 +00:00
Rafael Espindola	c59dc43eeb	Add a test showing what the linker IdentifiedStructTypes is for. Without this it could just be deleted and all tests would pass. llvm-svn: 222985	2014-12-01 03:20:57 +00:00
Rafael Espindola	a4b2ee4548	Relax an assert a bit to avoid a crash on unreachable code. Patch by Duncan Exon Smith with a small tweak by me. llvm-svn: 222984	2014-12-01 02:55:24 +00:00
Hal Finkel	378107daa4	[PowerPC] Add asm support for cache-inhibited ld/st instructions Add assembler support for the fixed-point cache-inhibited load/store instructions. These are hypervisor-level only, so don't get too excited ;) Fixes PR21650. llvm-svn: 222976	2014-11-30 10:15:56 +00:00
Hans Wennborg	6c42d1a5de	Switch lowering: Fix broken 'Figure out which block is next' code This doesn't seem to have worked in a long time, but other optimizations would clean it up. llvm-svn: 222961	2014-11-29 21:17:05 +00:00
Jozef Kolek	c7e220f6e0	[mips][microMIPS] Implement NOP aliases This patch implements microMIPS 16-bit (MOVE16 $0, $0) and 32-bit (SLL $0, $0, 0) NOP aliases. http://reviews.llvm.org/D6440 llvm-svn: 222953	2014-11-29 13:29:24 +00:00
Duncan P. N. Exon Smith	910f05d181	DebugIR: Delete -debug-ir llvm-svn: 222945	2014-11-29 03:15:47 +00:00
Matt Arsenault	8596f71910	R600/SI: Fix assertion on sign extend of 3 vectors This was trying to create an MVT with 3x vectors which created an invalid EVT llvm-svn: 222942	2014-11-28 22:51:38 +00:00
Duncan P. N. Exon Smith	9bc81fbe92	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
David Majnemer	3d6f80b619	InstCombine: FoldOrOfICmps harder We may be in a situation where the icmps might not be near each other in a tree of or instructions. Try to dig out related compare instructions and see if they combine. N.B. This won't fire on deep trees of compares because rewritting the tree might end up creating a net increase of IR. We may have to resort to something more sophisticated if this is a real problem. llvm-svn: 222928	2014-11-28 19:58:29 +00:00
Bruno Cardoso Lopes	46d5bf2982	[LICM] Store sink and indirectbr instructions Loop simplify skips exit-block insertion when exits contain indirectbr instructions. This leads to an assertion in LICM when trying to sink stores out of non-dedicated loop exits containing indirectbr instructions. This patch fix this issue by re-checking for dedicated exits in LICM prior to store sink attempts. Differential Revision: http://reviews.llvm.org/D6414 rdar://problem/18943047 llvm-svn: 222927	2014-11-28 19:47:46 +00:00
Bruno Cardoso Lopes	bc7ba2c766	[SwitchLowering] Handle multiple destinations on condensed case stmts Switch cases statements with sequential values that branch to the same destination BB may often be handled together in a single new source BB. In this scenario we need to remove remaining incoming values from PHI instructions in the destination BB, as to match the number of source branches. Differential Revision: http://reviews.llvm.org/D6415 rdar://problem/19040894 llvm-svn: 222926	2014-11-28 19:47:33 +00:00
Sanjay Patel	e57f3c0a42	Enable FeatureFastUAMem for btver2 Allow unaligned 16-byte memop codegen for btver2. No functional changes for any other subtargets. Replace the existing supposed small memcpy test with an actual test of a small memcpy. The previous test wasn't using FileCheck either. This patch should allow us to close PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6360 llvm-svn: 222925	2014-11-28 18:40:18 +00:00
Rafael Espindola	a96f235c15	Add back r222727 with a fix. The original patch would fail when: * A dst opaque type (%A) is matched with a src type (%A). * A src opaque (%E) type is then speculatively matched with %A and the speculation fails afterward. * When rolling back the speculation we would cancel the source %A to dest %A mapping. The fix is to keep an explicit list of which resolutions are speculative. Original message: Fix overly aggressive type merging. If we find out that two types are not isomorphic, we learn nothing about opaque sub types in both the source and destination. llvm-svn: 222923	2014-11-28 16:41:24 +00:00
Rafael Espindola	93cd657cf0	Add a testcase reduced from clang lto bootstrap on OS X. llvm-svn: 222921	2014-11-28 15:45:31 +00:00
Charlie Turner	db6c5e7afa	Fix wrong encoding of MRSBanked. Patch by Matthew Wahab. Change-Id: Ia2a001ca2760028ea360fe77b56f203a219eefbc llvm-svn: 222920	2014-11-28 15:01:06 +00:00
Evgeniy Stepanov	a0b6899234	[msan] Fix origin propagation for select of floats. MSan does not assign origin for instrumentation temps (i.e. the ones that do not come from the application code), but "select" instrumentation erroneously tried to use one of those. https://code.google.com/p/memory-sanitizer/issues/detail?id=78 llvm-svn: 222918	2014-11-28 11:17:58 +00:00
Charlie Turner	ab9524ecdb	Test all <build attribute, value> pairs. Add more tests to make sure the encoding/decoding of build attributes works correctly for all permissible values of build attributes. For cases where there are an infinite number of such values, a representative subset has been settled for. Change-Id: I2643c9624c211b2d56405306e16eec2d487bc5d6 llvm-svn: 222917	2014-11-28 11:14:47 +00:00
Tim Northover	3c55ccac48	AArch64: treat [N x Ty] as a block during procedure calls. The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903	2014-11-27 21:02:42 +00:00
Zoran Jovanovic	f9a02500b6	[mips][microMIPS] Implement SWM16 and LWM16 instructions Differential Revision: http://reviews.llvm.org/D5579 llvm-svn: 222901	2014-11-27 18:28:59 +00:00
Jozef Kolek	56a6a7d3bd	[mips][microMIPS] Implement BREAK16 and SDBBP16 instructions Patch by Radovan Obradovic. Differential Revision: http://reviews.llvm.org/D5048 llvm-svn: 222900	2014-11-27 18:18:42 +00:00
Daniel Sanders	b4484d62ad	[mips] Add synci instruction. Patch by Amaury Pouly Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6421 llvm-svn: 222899	2014-11-27 17:28:10 +00:00
Will Newton	a7d232fe56	Widen ELFYAML relocation type to 32 bits The current 8 bits is sufficient for ELF32 targets but ELF64 requires 32 bits. Add a test for AArch64 that exposes the issue. llvm-svn: 222898	2014-11-27 17:20:48 +00:00
Rafael Espindola	d11591b293	Commit back the correct bits of r222760 (was r222538). I also added a test. Original message: Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise. Patch from Akos Kiss. Differential Revision: http://reviews.llvm.org/D6079 llvm-svn: 222897	2014-11-27 17:13:56 +00:00
Rafael Espindola	b3323f0bd9	Revert "Reapply 222538 and update tests to explicitly request small code model and PIC:" This reverts commit r222760. It changed our behaviour on PIC so we don't match gas anymore. It also included lots of unnecessary changes to tests. If those changes are desirable, there should be an independent discussion as they are out of scope for that patch. I will recommit the other bits. llvm-svn: 222896	2014-11-27 17:13:51 +00:00
Duncan P. N. Exon Smith	c586eaa1f1	Revert "Fix overly aggressive type merging." This reverts commit r222727, which causes LTO bootstrap failures. Last passing @ r222698: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/532/ First failing @ r222843: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/533/ Internal bootstraps pointed at a much narrower range: r222725 is passing, and r222731 is failing. LTO crashes while handling libclang.dylib: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/533/consoleFull#-158682280549ba4694-19c4-4d7e-bec5-911270d8a58c GEP is not of right type for indices! %InfoObj.i.i = getelementptr inbounds %"class.llvm::OnDiskIterableChainedHashTable"* %.lcssa, i64 0, i32 0, i32 4, !dbg !123627 %"class.clang::serialization::reader::ASTIdentifierLookupTrait" = type { %"class.clang::ASTReader.31859", %"class.clang::serialization::ModuleFile.31870", %"class.clang::IdentifierInfo"* }LLVM ERROR: Broken function found, compilation aborted! clang: error: linker command failed with exit code 1 (use -v to see invocation) Looks like the new algorithm doesn't merge types aggressively enough. llvm-svn: 222895	2014-11-27 17:01:10 +00:00
Erik Eckstein	0d86c7623f	reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. Fixed missing dominance check. Original commit message: This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222891	2014-11-27 15:13:14 +00:00
Evgeniy Stepanov	e402d9ef4c	[msan] Remove indirect call wrapping code. This functionality was only used in MSanDR, which is deprecated. llvm-svn: 222889	2014-11-27 14:54:02 +00:00
Jozef Kolek	aa2b9278fe	[mips][microMIPS] Implement disassembler support for 16-bit instructions LI16, ADDIUR1SP, ADDIUR2 and ADDIUS5 Differential Revision: http://reviews.llvm.org/D6419 llvm-svn: 222887	2014-11-27 14:41:44 +00:00
Charlie Turner	8d43369163	Stop uppercasing build attribute data. The string data for string-valued build attributes were being unconditionally uppercased. There is no mention in the ARM ABI addenda about case conventions, so it's technically implementation defined as to whether the data are capitialised in some way or not. However, there are good reasons not to captialise the data. * It's less work. * Some vendors may legitimately have case-sensitive checks for these attributes which would fail on LLVM generated object files. * There could be locale issues with uppercasing. The original reasons for uppercasing appear to have stemmed from an old codesourcery toolchain behaviour, see http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/87133 This patch makes the object file emitted no longer captialise string data, it encodes as seen in the assembly source. Change-Id: Ibe20dd6e60d2773d57ff72a78470839033aa5538 llvm-svn: 222882	2014-11-27 12:13:56 +00:00
Suyog Sarda	f8516e1662	Use FileCheck instead of grep. Change by Ankur Garg. Differential Revision: http://reviews.llvm.org/D6430 llvm-svn: 222879	2014-11-27 11:22:49 +00:00
Erik Eckstein	2190cd9ffa	Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible." It is breaking the clang bootstrag. llvm-svn: 222877	2014-11-27 10:59:08 +00:00
Suyog Sarda	c3024c75e0	Use FileCheck instead of grep. Change by Sonam. Differential Revision: http://reviews.llvm.org/D6432 llvm-svn: 222876	2014-11-27 10:57:24 +00:00
Erik Eckstein	e73e308ab9	Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... \endcode Jump threading will then eliminate the second if(cond). llvm-svn: 222872	2014-11-27 08:33:51 +00:00
David Majnemer	40157d5c4d	InstCombine: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) == 0 ? X ^ C : X into X \| C (X & C) != 0 ? X ^ C : X into X & ~C llvm-svn: 222871	2014-11-27 07:25:21 +00:00
David Majnemer	c6a5e1dd4f	InstSimplify: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) ? X & ~C : X into X & ~C (X & C) ? X : X & ~C into X (X & C) ? X \| C : X into X (X & C) ? X : X \| C into X \| C llvm-svn: 222868	2014-11-27 06:32:46 +00:00
David Majnemer	5468e86469	Revert "Added inst combine transforms for single bit tests from Chris's note" This reverts commit r210006, it miscompiled libapr which is used in who knows how many projects. A test has been added to ensure that we don't regress again. I'll work on a rewrite of what the optimization was trying to do later. llvm-svn: 222856	2014-11-26 23:00:38 +00:00
Colin LeMahieu	6e0f9f8d61	[Hexagon] Adding cmp* immediate form instructions. llvm-svn: 222849	2014-11-26 19:43:12 +00:00
Jozef Kolek	315e7eca1b	[mips][microMIPS] Implement disassembler support for 16-bit instructions LBU16, LHU16, LW16, SB16, SH16 and SW16 Differential Revision: http://reviews.llvm.org/D6405 llvm-svn: 222847	2014-11-26 18:56:38 +00:00
Colin LeMahieu	31abe33726	[Hexagon] Adding and64, or64, and xor64 instructions. llvm-svn: 222846	2014-11-26 18:55:59 +00:00
Will Newton	40f08faa70	Update AArch64 ELF relocations to ABI 1.0 This mostly entails adding relocations, however there are a couple of changes to existing relocations: 1. R_AARCH64_NONE is defined to be zero rather than 256 R_AARCH64_NONE has been defined to be zero for a long time elsewhere e.g. binutils and glibc since the submission of the AArch64 port in 2012 so this is required for compatibility. 2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21 I don't think there is any way for relocation names to leak out of LLVM so this should not break anything. Tested with check-all with no regressions. llvm-svn: 222821	2014-11-26 10:49:18 +00:00
Elena Demikhovsky	905a5a606f	AVX-512: Scalar ERI intrinsics including SAE mode and memory operand. Added AVX512_maskable_scalar template, that should cover all scalar instructions in the future. The main difference between AVX512_maskable_scalar<> and AVX512_maskable<> is using X86select instead of vselect. I need it, because I can't create vselect node for MVT::i1 mask for scalar instruction. http://reviews.llvm.org/D6378 llvm-svn: 222820	2014-11-26 10:46:49 +00:00
Will Newton	7ad0ddc7e6	Update ARM ELF relocations to ABI 2.09 Add R_ARM_IRELATIVE. llvm-svn: 222817	2014-11-26 10:36:03 +00:00
Simon Pilgrim	371417db34	[X86][SSE] Improvements to byte shift shuffle matching Since (v)pslldq / (v)psrldq instructions resolve to a single input argument it is useful to match it much earlier than we currently do - this prevents more complicated shuffles (notably insertion into a zero vector) matching before it. Differential Revision: http://reviews.llvm.org/D6409 llvm-svn: 222796	2014-11-25 22:34:59 +00:00
Colin LeMahieu	b3d08bb44b	[Hexagon] Adding add64 and sub64 instructions. llvm-svn: 222795	2014-11-25 22:15:44 +00:00
Colin LeMahieu	6f6c4ff1fc	Reverting 222792 llvm-svn: 222793	2014-11-25 21:39:57 +00:00
Colin LeMahieu	aaf33928ee	[Hexagon] Adding compare with immediate instructions. llvm-svn: 222792	2014-11-25 21:30:28 +00:00
Rafael Espindola	7f2fa4fa38	This test requires asserts because of -stats. Sorry about that. llvm-svn: 222788	2014-11-25 20:56:56 +00:00
Rafael Espindola	947bdb69cb	gold plugin: call llvm_shutdown so that -stats works. llvm-svn: 222787	2014-11-25 20:52:49 +00:00
Cameron McInally	9b7c15a364	[AVX512] Add 512b integer shift by variable intrinsics and patterns. llvm-svn: 222786	2014-11-25 20:41:51 +00:00
Colin LeMahieu	f7f156ffc3	[Hexagon] [NFC] Adding trailing whitespace to test files. llvm-svn: 222785	2014-11-25 20:22:24 +00:00
Colin LeMahieu	e83bc7476f	[Hexagon] Adding C2_mux instruction. llvm-svn: 222784	2014-11-25 20:20:09 +00:00
Hans Wennborg	bda193edff	Remove useless rdar:// comment from switch_to_lookup_table.ll test. llvm-svn: 222772	2014-11-25 18:45:23 +00:00
Colin LeMahieu	902157c249	[Hexagon] Replacing cmp* instructions with ones that contain encoding bits. llvm-svn: 222771	2014-11-25 18:20:52 +00:00
Hans Wennborg	45172aceb3	LazyValueInfo: Actually re-visit partially solved block-values in solveBlockValue() If solveBlockValue() needs results from predecessors that are not already computed, it returns false with the intention of resuming when the dependencies have been resolved. However, the computation would never be resumed since an 'overdefined' result had been placed in the cache, preventing any further computation. The point of placing the 'overdefined' result in the cache seems to have been to break cycles, but we can check for that when inserting work items in the BlockValue stack instead. This makes the "stop and resume" mechanism of solveBlockValue() work as intended, unlocking more analysis. Using this patch shaves 120 KB off a 64-bit Chromium build on Linux. I benchmarked compiling bzip2.c at -O2 but couldn't measure any difference in compile time. Tests by Jiangning Liu from r215343 / PR21238, Pete Cooper, and me. Differential Revision: http://reviews.llvm.org/D6397 llvm-svn: 222768	2014-11-25 17:23:05 +00:00
Joerg Sonnenberger	4af2f12153	Small model and JIT generally don't go well with each other. On LP64 platforms, it will work or not depending on the choosen memory layout, so neither PASS nor XFAIL is appropiate. As UNSUPPORTED as per-test target doesn't exist (yet), remove the test instead to unbreak the builds. llvm-svn: 222767	2014-11-25 17:14:22 +00:00
Rafael Espindola	c81c3f554c	Set the body of a new struct as soon as it is created. This changes the order in which different types are passed to get, but one order is not inherently better than the other. The main motivation is that this simplifies linkDefinedTypeBodies now that it is only linking "real" opaque types. It is also means that we only have to call it once and that we don't need getImpl. A small change in behavior is that we don't copy type names when resolving opaque types. This is an improvement IMHO, but it can be added back if desired. A test is included with the new behavior. llvm-svn: 222764	2014-11-25 15:33:40 +00:00
Joerg Sonnenberger	cf0ea262b1	Reapply 222538 and update tests to explicitly request small code model and PIC: Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise. Patch from Akos Kiss. Differential Revision: http://reviews.llvm.org/D6079 llvm-svn: 222760	2014-11-25 13:37:55 +00:00
Joerg Sonnenberger	b6e36e1421	Mark as explicit failing on x86-64 -- small memory model doesn't agree with default address selections. llvm-svn: 222759	2014-11-25 13:28:56 +00:00
Zoran Jovanovic	b554bba90f	[mips][micromips] Use call instructions with short delay slots Differential Revision: http://reviews.llvm.org/D6338 llvm-svn: 222752	2014-11-25 10:50:00 +00:00
Chandler Carruth	816d26fe5e	[InstCombine] Change LLVM To canonicalize toward the value type being stored rather than the pointer type. This change is analogous to r220138 which changed the canonicalization for loads. The rationale is the same: memory does not have a type, operations (and thus the values they produce) have a type. We should match that type as closely as possible rather than reading some form of semantics into the pointer type. With this change, loads and stores should no longer be made with nonsensical types for the values that tehy load and store. This is particularly important when trying to match specific loaded and stored types in the process of doing other instcombines, which is what led me down this twisty maze of miscanonicalization. I've put quite some effort into looking through IR to find places where LLVM's optimizer was being unreasonably conservative in the face of mismatched load and store types, however it is possible (let's say, likely!) I have missed some. If you see regressions here, or from r220138, the likely cause is some part of LLVM failing to cope with load and store types differing. Test cases appreciated, it is important that we root all of these out of LLVM. llvm-svn: 222748	2014-11-25 10:09:51 +00:00
Suyog Sarda	99c9c1f2b0	Change the test case file to use FileCheck instead of grep. NFC. Change by Ankur Garg. Differential Revision: http://reviews.llvm.org/D6382 llvm-svn: 222740	2014-11-25 08:44:56 +00:00
Chandler Carruth	1a3c2c414c	Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite clearly only exactly equal width ptrtoint and inttoptr casts are no-op casts, it says so right there in the langref. Make the code agree. Original log from r220277: Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 222739	2014-11-25 08:20:27 +00:00
David Majnemer	8cf0dbb015	Forgot to add a file for r222734 llvm-svn: 222736	2014-11-25 07:45:56 +00:00
David Majnemer	0349a019ca	COFF: Add another test for r222124 llvm-svn: 222734	2014-11-25 07:42:36 +00:00
Rafael Espindola	86911440c2	Fix overly aggressive type merging. If we find out that two types are not isomorphic, we learn nothing about opaque sub types in both the source and destination. llvm-svn: 222727	2014-11-25 05:59:24 +00:00
Simon Atanasyan	84f4651f9a	[Object][Mips] Return address of MIPS symbol with cleared microMIPS indicator bit llvm-svn: 222726	2014-11-25 05:57:55 +00:00
Rafael Espindola	e96d7eb8bd	Link the type of aliases. They are not more or less "well typed" than GlobalVariables. llvm-svn: 222725	2014-11-25 04:43:59 +00:00
Juergen Ributzka	eb67bd8d74	[FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching. The pattern matching failed to recognize all instances of "-1", because when comparing against "-1" we didn't use an APInt of the same bitwidth. This commit fixes this and also adds inverse versions of the conditon to catch more cases. llvm-svn: 222722	2014-11-25 04:16:15 +00:00
Rafael Espindola	27ce577356	Add an interesting test that we already get right. NFC. llvm-svn: 222720	2014-11-25 03:47:57 +00:00
David Majnemer	bd9ce4ea51	InstSimplify: Handle some simple tautological comparisons This handles cases where we are comparing a masked value against itself. The analysis could be further improved by making it recursive but such expense is not currently justified. llvm-svn: 222716	2014-11-25 02:55:48 +00:00
Hal Finkel	5901676581	[PowerPC] Add the 'attn' instruction The attn instruction is not part of the Power ISA, but is documented in the A2 user manual, and is accepted by the GNU assembler for the A2 and the POWER4+. Reported as part of PR21650. llvm-svn: 222712	2014-11-25 00:30:11 +00:00
Hal Finkel	360f213d03	[PowerPC] Implement combineRepeatedFPDivisors This does not matter on newer cores (where we can use reciprocal estimates in fast-math mode anyway), but for older cores this allows us to generate better fast-math code where we have multiple FDIVs with a common divisor. llvm-svn: 222710	2014-11-24 23:45:21 +00:00
Matt Arsenault	238ff1ad1e	Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons llvm-svn: 222705	2014-11-24 23:15:18 +00:00
Matt Arsenault	ea515d33c9	Convert test to FileCheck and use CHECK-LABEL llvm-svn: 222704	2014-11-24 23:03:17 +00:00
Rafael Espindola	6953a3a6e0	Add a disable-output option to the gold plugin. This corresponds to the opt option and is handy for profiling. llvm-svn: 222687	2014-11-24 21:18:14 +00:00
Rafael Espindola	3410466495	Pass the .ll files to llvm-link directly. NFC. llvm-svn: 222681	2014-11-24 20:35:59 +00:00
Kostya Serebryany	4cadd4afa0	[asan/coverage] change the way asan coverage instrumentation is done: instead of setting the guard to 1 in the generated code, pass the pointer to guard to __sanitizer_cov and set it there. No user-visible functionality change expected llvm-svn: 222675	2014-11-24 18:49:53 +00:00
Ulrich Weigand	a69bcd5ed0	[PowerPC] Fix PR 21652 - copy st_other bits on symbol assignment When processing an assignment in the integrated assembler that sets a symbol to the value of another symbol, we need to copy the st_other bits that encode the local entry point offset. Modeled after MipsTargetELFStreamer::emitAssignment handling of the ELF::STO_MIPS_MICROMIPS flag. llvm-svn: 222672	2014-11-24 18:09:47 +00:00
Colin LeMahieu	397a25e7cd	[Hexagon] Adding asrh instruction, removing unused multiclasses. llvm-svn: 222670	2014-11-24 18:04:42 +00:00
Colin LeMahieu	3b3197ef95	[Hexagon] Adding aslh instruction. llvm-svn: 222668	2014-11-24 17:44:19 +00:00
Colin LeMahieu	098256c5e6	[Hexagon] Adding zxth instruction. llvm-svn: 222662	2014-11-24 17:11:34 +00:00
Colin LeMahieu	bb7d6f5514	[Hexagon] Adding zxtb instruction. llvm-svn: 222660	2014-11-24 16:48:43 +00:00
David Majnemer	8e6f6a98b5	InstCombine: Don't create an unused instruction We would create an instruction but not inserting it. Not inserting the unused instruction would lead us to verification failure. This fixes PR21653. llvm-svn: 222659	2014-11-24 16:41:13 +00:00
Jozef Kolek	11bdb8bf33	[mips][microMIPS] Fix JRADDIUSP instruction Fix JRADDIUSP instruction, remove delay slot flag because this instruction doesn't have delay slot. Differential Revision: http://reviews.llvm.org/D6365 llvm-svn: 222658	2014-11-24 16:14:10 +00:00
Jozef Kolek	e8c9d1eaf7	[mips][microMIPS] Implement LBU16, LHU16, LW16, SB16, SH16 and SW16 instructions Differential Revision: http://reviews.llvm.org/D5122 llvm-svn: 222653	2014-11-24 14:39:13 +00:00
Jozef Kolek	ea22c4cfbb	[mips][microMIPS] Implement disassembler support for 16-bit instructions With the help of new method readInstruction16() two bytes are read and decodeInstruction() is called with DecoderTableMicroMips16, if this fails four bytes are read and decodeInstruction() is called with DecoderTableMicroMips32. Differential Revision: http://reviews.llvm.org/D6149 llvm-svn: 222648	2014-11-24 13:29:59 +00:00
Andrea Di Biagio	23e2cfa834	[X86] Improved target specific combine on VSELECT dag nodes. This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1. On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd. Also, removed a target specific combine that performed a premature lowering of VSELECT nodes to target specific MOVSS/MOVSD nodes. llvm-svn: 222647	2014-11-24 12:23:15 +00:00
David Majnemer	b2a6e7458d	InstCombine: Don't assume DataLayout is always available We tried to get the result of DataLayout::getLargestLegalIntTypeSize but we didn't have a DataLayout. This resulted in opt crashing. This fixes PR21651. llvm-svn: 222645	2014-11-24 07:26:20 +00:00
Michael Kuperstein	9ef90647b9	[X86] Fixes bug in build_vector v4x32 lowering r222375 made some improvements to build_vector lowering of v4x32 and v4xf32 into an insertps, but it missed a case where: 1. A single extracted element is used twice. 2. The lower of the two non-zero indexes should be preserved, and the higher should be used for the dest mask. This caused a crash, since the source value for the insertps ends-up uninitialized. Differential Revision: http://reviews.llvm.org/D6377 llvm-svn: 222635	2014-11-23 13:09:06 +00:00
Elena Demikhovsky	9e5089a938	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Matt Arsenault	2a495975ed	R600: Fix extloads of i1 on R600/Evergreen llvm-svn: 222631	2014-11-23 02:57:54 +00:00
Matt Arsenault	b7ebdffe3c	R600/SI: Add additional tests for i1 loads llvm-svn: 222629	2014-11-23 02:57:50 +00:00
Matt Arsenault	79db0a70bc	R600/SI: Fix broken check lines and modernize prefixes Use -LABEL and remove -CHECK llvm-svn: 222628	2014-11-23 02:57:49 +00:00
Matt Arsenault	8499ea6a90	R600/SI: Fix missing -verify-machineinstrs on a test llvm-svn: 222627	2014-11-23 02:57:47 +00:00
David Majnemer	fb3805576b	InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2) llvm-svn: 222625	2014-11-22 20:00:41 +00:00
David Majnemer	ec6e481bc5	InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y) llvm-svn: 222624	2014-11-22 20:00:38 +00:00
David Majnemer	fa4699e65f	InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C) llvm-svn: 222623	2014-11-22 20:00:34 +00:00
David Majnemer	a3aeb15613	InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2) llvm-svn: 222620	2014-11-22 18:16:54 +00:00
David Majnemer	546f81064c	InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y llvm-svn: 222613	2014-11-22 08:57:02 +00:00
David Majnemer	8279a7506d	InstCombine: Propagate NSW for -X * -Y -> X * Y llvm-svn: 222612	2014-11-22 07:25:19 +00:00
David Majnemer	4efa9ff8ca	InstSimplify: Simplify (sub 0, X) -> X if it's NUW This is a generalization of the X - (0 - Y) -> X transform. llvm-svn: 222611	2014-11-22 07:15:16 +00:00
Chandler Carruth	8c44d86ab8	[x86] Add some tests for a common unpack pattern of vector shuffle that has a remarkably unique and efficient lowering. While we get this some of the time already, we miss a few cases and there wasn't a principled reason we got it. We should at least test this. v8 already has tests for this pattern. llvm-svn: 222607	2014-11-22 05:44:43 +00:00
David Majnemer	80c8f627db	InstCombine: Preserve nsw when folding X*(2^C) -> X << C llvm-svn: 222606	2014-11-22 04:52:55 +00:00
David Majnemer	fd4a6d2b7a	InstCombine: Preserve nsw/nuw for ((X << C2)C1) -> (X (C1 << C2)) llvm-svn: 222605	2014-11-22 04:52:52 +00:00
David Majnemer	027bc80928	InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V) llvm-svn: 222604	2014-11-22 04:52:38 +00:00
Gerolf Hoflehner	ec6217c929	[InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence) Fixes the self-host fail. Note that this commit activates dominator analysis in the combiner by default (like the original commit did). llvm-svn: 222590	2014-11-21 23:36:44 +00:00
Joerg Sonnenberger	02b13a8d9b	Fix transformation of add with pc argument to adr for non-immediate arguments. llvm-svn: 222587	2014-11-21 22:39:34 +00:00
Kostya Serebryany	60ef25bd54	[asan] remove old experimental code llvm-svn: 222586	2014-11-21 22:34:29 +00:00
Tom Stellard	f1206edfd0	R600/SI: Add a failing test case for offset order in ds_read2 instructions llvm-svn: 222585	2014-11-21 22:31:47 +00:00
Tom Stellard	a99ada528c	R600/SI: Emit s_mov_b32 m0, -1 before every DS instruction This s_mov_b32 will write to a virtual register from the M0Reg class and all the ds instructions now take an extra M0Reg explicit argument. This change is necessary to prevent issues with the scheduler mixing together instructions that expect different values in the m0 registers. llvm-svn: 222583	2014-11-21 22:31:44 +00:00
Tom Stellard	6596ba7933	R600/SI: Add SIFoldOperands pass This pass attempts to fold the source operands of mov and copy instructions into their uses. llvm-svn: 222581	2014-11-21 22:06:37 +00:00
Jozef Kolek	3b8ddb665b	[mips][microMIPS] This patch implements functionality in MIPS delay slot filler such as if delay slot filler have to put NOP instruction into the delay slot of microMIPS BEQ or BNE instruction which uses the register $0, then instead of emitting NOP this instruction is replaced by the corresponding microMIPS compact branch instruction, i.e. BEQZC or BNEZC. Differential Revision: http://reviews.llvm.org/D3566 llvm-svn: 222580	2014-11-21 22:04:35 +00:00
Tom Stellard	3ae588789e	R600/SI: Use hex notation for constant in test llvm-svn: 222578	2014-11-21 22:00:13 +00:00
Colin LeMahieu	310991c66f	[Hexagon] Adding sxth instruction. llvm-svn: 222577	2014-11-21 21:54:59 +00:00
Colin LeMahieu	91ffec908f	[Hexagon] Adding sxtb instruction. Renaming some identically named classes that will be removed after converting referencing defs. llvm-svn: 222575	2014-11-21 21:35:52 +00:00
Manman Ren	f0a582bada	Debug Info: revert r222195, r222210 and r222239. This is no longer needed after David's fix at r222377 + r222485. rdar://18958417 llvm-svn: 222563	2014-11-21 19:55:23 +00:00
Sanjay Patel	501890e909	Add a feature flag for slow 32-byte unaligned memory accesses [x86]. This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544	2014-11-21 17:40:04 +00:00
Chandler Carruth	ce5a26b0e7	[x86] Restructure the checking patterns for v16 and v32 avx2 vector shuffle lowering to allow much better blend matching. Specifically, with the new structure the code seems clearer to me and we correctly can hit the cases where merging two 128-bit lanes is a clear win and can be shuffled cheaply afterward. llvm-svn: 222539	2014-11-21 14:53:03 +00:00
Chandler Carruth	6c4d1ea8c4	[x86] Make the previous logic significantly less conservative and get a bunch more improvements. Non-lane-crossing is fine, the key is that lane merging only makes sense for single-input shuffles. Not sure why I got so turned around here. The code all works, I was just using the wrong model for it. This only updates v4 and v8 lowering. The v16 and v32 lowering requires restructuring the entire check sequence. llvm-svn: 222537	2014-11-21 14:33:24 +00:00
Andrea Di Biagio	0225b5bf6f	[DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero. Before this patch, the DAGCombiner only tried to convert build_vector dag nodes into shuffles if all operands were either extract_vector_elt or undef. This patch improves that logic and teaches the DAGCombiner how to deal with build_vector dag nodes where one or more operands are zero. A build_vector dag node with some zero operands is turned into a shuffle only if the resulting shuffle mask is legal for the target. llvm-svn: 222536	2014-11-21 14:32:06 +00:00
Chandler Carruth	d2b19bc867	[x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit lanes. By special casing these we can often either reduce the total number of shuffles significantly or reduce the number of (high latency on Haswell) AVX2 shuffles that potentially cross 128-bit lanes. Even when these don't actually cross lanes, they have much higher latency to support that. Doing two of them and a blend is worse than doing a single insert across the 128-bit lanes to blend and then doing a single interleaved shuffle. While this seems like a narrow case, it kept cropping up on me and the difference is huge as you can see in many of the test cases. I first hit this trying to perfectly fix the interleaving shuffle patterns used by Halide for AVX2. llvm-svn: 222533	2014-11-21 13:56:05 +00:00
Chandler Carruth	77e1a0ad1f	[x86] Remove more windows line endings that slipped into this file... llvm-svn: 222528	2014-11-21 12:33:46 +00:00
Chandler Carruth	61c7b6252c	[x86] Add a bunch of test cases to 256-bit shuffles that exercise merging 128-bit subvectors and also shuffling all the elements of those subvectors. Currently we generate pretty bad code for many of these, but I'm testing a patch that should dramatically improve this in addition to making the shuffle lowering robust to other changes. llvm-svn: 222525	2014-11-21 12:17:50 +00:00
Alexey Volkov	fd1731d876	[X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521	2014-11-21 11:19:34 +00:00
Yury Gribov	55441bb601	[asan] Add new hidden compile-time flag asan-instrument-allocas to sanitize variable-sized dynamic allocas. Patch by Max Ostapenko. Reviewed at http://reviews.llvm.org/D6055 llvm-svn: 222519	2014-11-21 10:29:50 +00:00
Hao Liu	44e5d7a131	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor info FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
Hal Finkel	f413be11f0	[PPC] Use SeparateConstOffsetFromGEP This mirrors r222331, which enabled SeparateConstOffsetFromGEP on AArch64, in the PowerPC backend. Yields, on a POWER7 machine, a 30% speedup on SingleSource/Benchmarks/Shootout/nestedloop (this might just be from LICM, there is a store moved out of the inner loop) and a potential speedup on MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode. Regardless, it makes some code look cleaner, and synchronizing the backends in this regard seems like a generally good thing. llvm-svn: 222504	2014-11-21 04:35:51 +00:00
David Majnemer	c0a313b57c	SROA: The alloca type isn't a candidate promotion type for vectors The alloca's type is irrelevant, only those types which are used in a load or store of the exact size of the slice should be considered. This manifested as an assertion failure when we compared the various types: we had a size mismatch. This fixes PR21480. llvm-svn: 222499	2014-11-21 02:34:55 +00:00
Quentin Colombet	a7439d4483	[X86] Do not custom lower UINT_TO_FP when the target type does not match the custom lowering. <rdar://problem/19026326> llvm-svn: 222489	2014-11-21 00:47:19 +00:00
Michael Zolotukhin	0dcae71449	Fix a trip-count overflow issue in LoopUnroll. Currently LoopUnroll generates a prologue loop before the main loop body to execute first N%UnrollFactor iterations. Also, this loop is used if trip-count can overflow - it's determined by a runtime check. However, we've been mistakenly optimizing this loop to a linear code for UnrollFactor = 2, not taking into account that it also serves as a safe version of the loop if its trip-count overflows. llvm-svn: 222451	2014-11-20 20:19:55 +00:00
Saleem Abdulrasool	2f3b3f3182	X86: use the correct alloca symbol for Windows Itanium Windows itanium targets the MSVCRT, and the stack probe symbol is provided by MSVCRT. This corrects the emission of stack probes on i686-windows-itanium. llvm-svn: 222439	2014-11-20 18:01:26 +00:00
Renato Golin	a03161c6ee	MCJIT tests passing on ARM after r222414 fixed the relocation llvm-svn: 222430	2014-11-20 13:32:16 +00:00
Jyoti Allur	5b9f35220e	[ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32 llvm-svn: 222414	2014-11-20 05:58:11 +00:00
David Majnemer	ccce9ae4c7	Add a test for r221870 bad-relocs.obj.coff-i386 has a relocation whose symbol index is outside the symbol table. llvm-svn: 222413	2014-11-20 05:32:10 +00:00
Colin LeMahieu	ac00643603	[Hexagon] Adding A2_xor instruction with IR selection pattern and test. llvm-svn: 222399	2014-11-19 23:22:23 +00:00
Chad Rosier	90a2f9b110	Revert "[Reassociate] As the expression tree is rewritten make sure the operands are" This reverts commit r222142. This is causing/exposing an execution-time regression in spec2006/gcc and coremark on AArch64/A57/Ofast. Conflicts: test/Transforms/Reassociate/optional-flags.ll llvm-svn: 222398	2014-11-19 23:21:20 +00:00
Colin LeMahieu	21866546ae	[Hexagon] Adding A2_or instruction with IR selection pattern and test. llvm-svn: 222396	2014-11-19 22:58:04 +00:00
Andrea Di Biagio	1b657bfcc8	[X86] Improved lowering of v4x32 build_vector dag nodes. This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes that are known to have at least two non-zero elements. With this patch, a build_vector that performs a blend with zero is converted into a shuffle. This is done to let the shuffle legalizer expand the dag node in a optimal way. For example, if we know that a build_vector performs a blend with zero, we can try to lower it as a movq/blend instead of always selecting an insertps. This patch also improves the logic that lowers a build_vector into a insertps with zero masking. See for example the extra test cases added to test sse41.ll. Differential Revision: http://reviews.llvm.org/D6311 llvm-svn: 222375	2014-11-19 19:34:29 +00:00
Tom Stellard	e0ddfd11ea	R600/SI: Make SIInstrInfo::isOperandLegal() more strict A register operand that has a common sub-class with its instruction's defined register class is not always legal. For example, SReg_32 and M0Reg both have a common sub-class, but we can't use an SReg_32 in instructions that expect a M0Reg. This prevents the llvm.SI.sendmsg.ll test from failing when the fold operand pass is added. llvm-svn: 222368	2014-11-19 16:58:49 +00:00
Zoran Jovanovic	a4c4b5fc01	[mips][micromips] Implement SWM32 and LWM32 instructions Differential Revision: http://reviews.llvm.org/D5519 llvm-svn: 222367	2014-11-19 16:44:02 +00:00
Suyog Sarda	aba97f4aba	Vectorize a reduction chain feeding into a 'return' statement. e.x return (a[0]+b[0]) + (a[1]+b[1]) Differential Revision: http://reviews.llvm.org/D6227 llvm-svn: 222364	2014-11-19 16:07:38 +00:00
Jozef Kolek	ffeed44190	[mips][microMIPS] Fix opcodes of MFHC1 and MTHC1 instructions. Differential Revision: http://reviews.llvm.org/D6169 llvm-svn: 222355	2014-11-19 13:37:51 +00:00
Arnaud A. de Grandmaison	7b9dc28060	Fix tail recursion elimination When the BasicBlock containing the return instrution has a PHI with 2 incoming values, FoldReturnIntoUncondBranch will remove the no longer used incoming value and remove the no longer needed phi as well. This leaves us with a BB that no longer has a PHI, but the subsequent call to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not remove the return instruction (which still uses the result of the call instruction). This prevents EliminateRecursiveTailCall to remove the value, as it is still being used in a basicblock which has no predecessors. The basicblock can not be erased on the spot, because its iterator is still being used in runTRE. This issue was exposed when removing the threshold on size for lifetime marker insertion for named temporaries in clang. The testcase is a much reduced version of peelOffOuterExpr(const Expr, const ExplodedNode ) from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp. llvm-svn: 222354	2014-11-19 13:32:51 +00:00

... 2 3 4 5 6 ...

27447 Commits