llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	ab8a8c84d4	R600/SI: SI support for 64bit ConstantFP Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186178	2013-07-12 18:15:02 +00:00
Tom Stellard	7512c0803c	R600/SI: Add initial double precision support for SI Patch by: Niels Ole Salscheider Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186177	2013-07-12 18:14:56 +00:00
Benjamin Kramer	068a2253e9	X86: Shrink certain forms of movsx. In particular: movsbw %al, %ax --> cbtw movswl %ax, %eax --> cwtl movslq %eax, %rax --> cltq According to Intel's manual those have the same performance characteristics but come with a smaller encoding. llvm-svn: 186174	2013-07-12 18:06:44 +00:00
Stephen Lin	fda967fdea	X86: fold SSE2/AVX2 logical shift by immediate amount into zero vector when possible Patch by Andrea Di Biagio llvm-svn: 186165	2013-07-12 15:31:36 +00:00
Stephen Lin	764d8d3d6f	Start using CHECK-LABEL in some tests. llvm-svn: 186163	2013-07-12 14:54:12 +00:00
Stephen Lin	f8bd2e5b86	Add new directive called CHECK-LABEL to FileCheck. CHECK-LABEL is meant to be used in place of CHECK on lines containing identifiers or other unique labels (they need not actually be labels in the source or output language, though.) This is used to break up the input stream into separate blocks delineated by CHECK-LABEL lines, each of which is checked independently. This greatly improves the accuracy of errors and fix-it hints in many cases, and allows for FileCheck to recover from errors in one block by continuing to subsequent blocks. Some tests will be converted to use this new directive in forthcoming patches. llvm-svn: 186162	2013-07-12 14:51:05 +00:00
Rafael Espindola	f0c617264a	Don't reject an empty archive. llvm-svn: 186159	2013-07-12 13:32:28 +00:00
Chandler Carruth	cf3715cadd	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
Vladimir Medic	bcf1ca08e0	Add support for Mips break and syscall instructions. The corresponding test cases are added. llvm-svn: 186151	2013-07-12 09:25:35 +00:00
Richard Sandiford	17276d3567	[SystemZ] Add test missing from r186148 Sigh, twice in two days sorry. One day I'll remember... llvm-svn: 186150	2013-07-12 09:20:14 +00:00
Richard Sandiford	6d4bd28322	[SystemZ] Optimize sign-extends of vector setccs Normal (sext (setcc ...)) sequences are optimised into (select_cc ..., -1, 0) by DAGCombiner::visitSIGN_EXTEND. However, this is deliberately not done for vectors, and after vector type legalization we have (sext_inreg (setcc ...)) instead. I wondered about trying to extend DAGCombiner to handle this case too, but it seemed to be a loss on some other targets I tried, even those for which SETCC isn't "legal" and SELECT_CC is. llvm-svn: 186149	2013-07-12 09:17:10 +00:00
Richard Sandiford	3f0edc2903	[SystemZ] Improve spilling of LGDR and LDGR If the source of these instructions is spilled we should load the destination. If the destination is spilled we should store the source. llvm-svn: 186147	2013-07-12 08:37:17 +00:00
Nadav Rotem	89c41bf06a	SLPVectorizer: Sink and enable CSE for ExtractElements. llvm-svn: 186145	2013-07-12 06:09:24 +00:00
Charles Davis	e8f297ca94	Target/X86: Add explicit Win64 and System V/x86-64 calling conventions. Summary: This patch adds explicit calling convention types for the Win64 and System V/x86-64 ABIs. This allows code to override the default, and use the Win64 convention on a target that wants to use SysV (and vice-versa). This is needed to implement the `ms_abi` and `sysv_abi` GNU attributes. Reviewers: CC: llvm-svn: 186144	2013-07-12 06:02:35 +00:00
NAKAMURA Takumi	aaaec3db98	llvm/test/Object/archive-toc.test: Use env(1) to satisfy win32 hosts. llvm-svn: 186143	2013-07-12 02:34:45 +00:00
Nadav Rotem	fa3c2db211	SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler. llvm-svn: 186139	2013-07-12 00:04:18 +00:00
David Dean	f3ed656189	Add the ability to use guarded malloc when running llvm lit tests. llvm-svn: 186134	2013-07-11 23:36:57 +00:00
Adrian Prantl	29b3fdc8c2	In response to dblaikie's comment on r186035, replacing the (reduced LLVM IR) + (full source in comment) with the (full LLVM IR) + (reduced src in comment) llvm-svn: 186119	2013-07-11 21:16:14 +00:00
Rafael Espindola	dee53e76f6	Add tests for the before and after modifiers. llvm-svn: 186118	2013-07-11 21:11:55 +00:00
Rafael Espindola	621ca94358	Add a test for llvm-ar's m operation. llvm-svn: 186110	2013-07-11 19:09:04 +00:00
Hal Finkel	4715081787	PPC: Add some missing V_SET0 patterns We had patterns to match v4i32 immAllZerosV -> V_SET0, but not patterns for v8i16 (which occurs in the test case) or v16i8. The same was true for V_SETALLONES (so I added the associated patterns for those as well). Another bug found by llvm-stress. llvm-svn: 186108	2013-07-11 17:43:32 +00:00
Andrew Trick	3095993d6f	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match the operand types correctly, it converts the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order to have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Hal Finkel	ff3ea8060c	PPCDAGToDAGISel::isRunOfOnes should return false on zero This fixes a bug (found by csmith) at -O0 where we attempt to create a RLWIMI with an out-of-range operand. Most uses of the isRunOfOnes function are guarded by a condition that the value is not zero. This was not true in two places, and in both places a zero input would result in an out-of-range MB value (= 32). To fix this, isRunOfOnes returns false on a zero input (and I've removed one now-redundant guard). llvm-svn: 186101	2013-07-11 16:31:51 +00:00
Rafael Espindola	b1c1c5f377	Fix a FIXME about the format and add a test. While at it, use strftime on Unix too and use the thread safe versions of localtime. llvm-svn: 186090	2013-07-11 15:35:23 +00:00
Arnold Schwaighofer	e97c71b8fd	LoopVectorize: Vectorize all accesses in address space zero with unit stride We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me). Fixes PR16592. llvm-svn: 186088	2013-07-11 15:21:55 +00:00
Rafael Espindola	a7f6913c08	Merge these tests. llvm-svn: 186084	2013-07-11 13:44:10 +00:00
Rafael Espindola	70a765dc47	Use a more unique name to avoid conflicting with directory.ll tests when running in parallel. llvm-svn: 186083	2013-07-11 13:31:38 +00:00
Rafael Espindola	0ec47c801d	Add a test for llvm-ar's 'd' operation. llvm-svn: 186082	2013-07-11 13:24:27 +00:00
Rafael Espindola	54dbca5eeb	Add tests for the 'x' operation. llvm-svn: 186081	2013-07-11 13:13:09 +00:00
Richard Sandiford	4209e7f6c6	[SystemZ] Add testcase missing from r186073 llvm-svn: 186074	2013-07-11 09:10:38 +00:00
Richard Sandiford	ea9b6aa20b	[SystemZ] Use zeroing form of RISBG for shift-and-AND sequences Extend r186072 to handle shifts and ANDs. llvm-svn: 186073	2013-07-11 09:10:09 +00:00
Richard Sandiford	84f54a3bc9	[SystemZ] Use zeroing form of RISBG for some AND sequences RISBG can handle some ANDs for which no AND IMMEDIATE exists. It also acts as a three-operand AND for some cases where an AND IMMEDIATE could be used instead. It might be worth adding a pass to replace RISBG with AND IMMEDIATE in cases where the register operands end up being the same and where AND IMMEDIATE is smaller. llvm-svn: 186072	2013-07-11 08:59:12 +00:00
Richard Sandiford	67ddcd6dd0	[SystemZ] Allow 8-bit operands to RISBG RISBG has three 8-bit operands (I3, I4 and I5). I'd originally restricted all three to 6 bits, since that's the only range we intended to use at the time. However, the top bit of I4 acts as a "zero" flag for RISBG, while the top bit of I3 acts as a "test" flag for RNSBG & co. This patch therefore allows them to have the full 8-bit range. I've left the fifth operand as a 6-bit value for now since the upper 2 bits have no defined meaning. llvm-svn: 186070	2013-07-11 08:37:13 +00:00
Duncan Sands	e773c08021	TryToSimplifyUncondBranchFromEmptyBlock was checking that any common predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where there is one or more incoming values that are undef. The undef values are rewritten to match the non-undef value that flows from the other edge. Patch by Mark Lacey. llvm-svn: 186069	2013-07-11 08:28:20 +00:00
Hal Finkel	743b194084	RegScavenger should not exclude undef uses When computing currently-live registers, the register scavenger excludes undef uses. As a result, undef uses are ignored when computing the restore points of registers spilled into the emergency slots. While the register scavenger normally excludes from consideration, when scavenging, registers used by the current instruction, we need to not exclude undef uses. Otherwise, we might end up requiring more emergency spill slots than we have (in the case where the undef use is the currently-spilled register). Another bug found by llvm-stress. llvm-svn: 186067	2013-07-11 05:55:57 +00:00
Nadav Rotem	108ef760ff	Consolidate more lit tests. llvm-svn: 186063	2013-07-11 05:15:11 +00:00
Nadav Rotem	e0a49499fe	Consolidate some of the lit tests. llvm-svn: 186062	2013-07-11 05:11:33 +00:00
Nadav Rotem	c6b5e2499e	Consolidate some of the lit tests. llvm-svn: 186060	2013-07-11 05:01:50 +00:00
Michael Gottesman	b40db26eae	Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE or mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (NOTE B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. NOTE This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057	2013-07-11 04:40:01 +00:00
Hal Finkel	94383e542b	Move r186044 tests into CodeGen/X86 I had thought that these tests could be target-neutral, but in practice this is not the case: on some targets (like Hexagon and Darwin), they trigger an assert (a different assert than the one that r186044 fixes). llvm-svn: 186051	2013-07-11 01:55:55 +00:00
Hal Finkel	a2aeb8e8e1	Set REQUIRES shell on the test cases for r186044 Trying to fix the i686-mingw32 build. llvm-svn: 186046	2013-07-10 23:25:03 +00:00
Hal Finkel	31ffcec999	XFAIL the test cases for r186044 on Hexagon For some reason, the Hexagon backend does not reject these invalid static initializer expressions, but instead crashes in AsmPrinter::EmitGlobalConstant. llvm-svn: 186045	2013-07-10 23:11:14 +00:00
Hal Finkel	b31366da82	Don't assert if we can't constant fold extract/insertvalue A non-constant-foldable static initializer expression containing insertvalue or extractvalue had been causing an assert: Constants.cpp:1971: Assertion `FC && "ExtractValue constant expr couldn't be folded!"' failed. Now we report a more-sensible "Unsupported expression in static initializer" error instead. Fixes PR15417. llvm-svn: 186044	2013-07-10 22:51:01 +00:00
Rafael Espindola	555aa899c6	Remove this test for now. It is not reliable to depend on the output of llvm_unreachable. The original change will have proper tests when llvm-ar moves to lib/Object (soon). llvm-svn: 186043	2013-07-10 22:15:29 +00:00
Rafael Espindola	555099207b	Find the symbol table on archives created on OS X. llvm-svn: 186041	2013-07-10 22:07:59 +00:00
Rafael Espindola	3b5475c0f2	Move tests from test/Archive to test/Object. There is no lib/Archive anymore and some archive tests were in test/Archive and others in test/Object. Since archive is just one of the formats supported by lib/Object, test/Object is probably the best location. llvm-svn: 186038	2013-07-10 21:47:16 +00:00
Adrian Prantl	ef99752e69	Add a comment. llvm-svn: 186035	2013-07-10 21:08:02 +00:00
Tim Northover	a630fb0b67	Put ELF COMDAT relocations into the relevant COMDAT group. Patch from Игорь Пашев (I do hope we support utf-8 commit messages; I also hope he'll forgive me for transliterating it as Igor Pashev in case things go horribly wrong). llvm-svn: 186034	2013-07-10 20:58:17 +00:00
Adrian Prantl	5a4c862a90	Add a testcase for r186014. llvm-svn: 186031	2013-07-10 20:43:29 +00:00
Rafael Espindola	fbcafc0793	Don't crash in 'llvm -s' when an archive has no symtab. llvm-svn: 186029	2013-07-10 20:14:22 +00:00
Reid Kleckner	755d324cd2	Fix %t typo in Ocaml bindings test. llvm-svn: 186027	2013-07-10 18:55:06 +00:00
Michel Danzer	49812b5bbd	R600/SI: Initial local memory support Enough for the radeonsi driver to use it for calculating derivatives. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186012	2013-07-10 16:37:07 +00:00
Michel Danzer	8d69617b27	R600/SI: Add intrinsic for retrieving the current thread ID Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186010	2013-07-10 16:36:52 +00:00
Michel Danzer	83f87c4c2e	R600/SI: Add intrinsics for texture sampling with user derivatives Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 186008	2013-07-10 16:36:36 +00:00
Vladimir Medic	f38ee30485	Reverting commit r185999 due to buildbot failure. llvm-svn: 186001	2013-07-10 12:27:25 +00:00
Vladimir Medic	e84de1e101	Add support for Mips break and syscall instructions. The corresponding test cases are added. llvm-svn: 185999	2013-07-10 10:18:10 +00:00
Adrian Prantl	a1ffd1a450	Un-break the buildbot by tweaking the indirection flag. Pulled in a testcase from the debuginfo-test suite. llvm-svn: 185993	2013-07-10 01:53:37 +00:00
Jim Grosbach	ebcad2e063	ARM: Fix incorrect pack pattern for thumb2 Propagate the fix from r185712 to Thumb2 codegen as well. Original commit message applies here as well: A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and logic shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16bits instead of zeros if "y" is negative. rdar://14338767 llvm-svn: 185982	2013-07-09 22:59:22 +00:00
David Majnemer	a80fed7e58	InstSimplify: X >> X -> 0 llvm-svn: 185973	2013-07-09 22:01:22 +00:00
Adrian Prantl	1014fcfd99	move test into the appropriate subdir. llvm-svn: 185972	2013-07-09 21:44:11 +00:00
Nadav Rotem	d7b574e5b3	Fix PR16571, which is a bug in the code that checks that all of the types in the bundle are uniform. llvm-svn: 185970	2013-07-09 21:38:08 +00:00
Adrian Prantl	418d1d1ea9	Reapply an improved version of r180816/180817. Change the informal convention of DBG_VALUE machine instructions so that we can express a register-indirect address with an offset of 0. The old convention was that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain register values the combination reg, reg is used. MachineInstrBuilder::BuildMI knows how to build the new DBG_VALUES. rdar://problem/13658587 llvm-svn: 185966	2013-07-09 20:28:37 +00:00
Stephen Lin	4ee5e873d5	Appease buildbots after r185956: just set -mcpu explicitly, as it should have been from the beginning. llvm-svn: 185962	2013-07-09 19:27:10 +00:00
Stephen Lin	228765f61f	Appease Atom buildbot after r185956 (explicitly turn on AVX) llvm-svn: 185961	2013-07-09 18:55:52 +00:00
Hal Finkel	e4dd5c29f0	WidenVecRes_BUILD_VECTOR must use the first operand's type Because integer BUILD_VECTOR operands may have a larger type than the result's vector element type, and all operands must have the same type, when widening a BUILD_VECTOR node by adding UNDEFs, we cannot use the vector element type, but rather must use the type of the existing operands. Another bug found by llvm-stress. llvm-svn: 185960	2013-07-09 18:55:10 +00:00
Bill Schmidt	4122169308	[PowerPC] Better fix for PR16556. A more complete example of the bug in PR16556 was recently provided, showing that the previous fix was not sufficient. The previous fix is reverted herein. The real problem is that ReplaceNodeResults() uses LowerFP_TO_INT as custom lowering for FP_TO_SINT during type legalization, without checking whether the input type is handled by that routine. LowerFP_TO_INT requires the input to be f32 or f64, so we fail when the input is ppcf128. I'm leaving the test case from the initial fix (r185821) in place, and adding the new test as another crash-only check. llvm-svn: 185959	2013-07-09 18:50:20 +00:00
Stephen Lin	73fa842e2e	Attempt to appease buildbot after r185956 by explicitly setting -fma,-fma4 attrs (I'm assuming they're set because the bot is running on a machine that has one or the other.) llvm-svn: 185958	2013-07-09 18:41:43 +00:00
Stephen Lin	73de7bf5de	AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics: 1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations. 2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation. 3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs. The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples. llvm-svn: 185956	2013-07-09 18:16:56 +00:00
Hal Finkel	ff666bd962	Don't crash in SE dealing with ashr x, -1 ScalarEvolution::getSignedRange uses ComputeNumSignBits from ValueTracking on ashr instructions. ComputeNumSignBits can return zero, but this case was not handled correctly by the code in getSignedRange which was calling: APInt::getSignedMinValue(BitWidth).ashr(NS - 1) with NS = 0, resulting in an assertion failure in APInt::ashr. Now, we just return the conservative result (as with NS == 1). Another bug found by llvm-stress. llvm-svn: 185955	2013-07-09 18:16:16 +00:00
David Majnemer	a92b3c914e	ValueTracking: Fix bugs in isKnownToBeAPowerOfTwo (add nsw x, (and x, y)) isn't a power of two if x is zero, it's zero (add nsw x, (xor x, y)) isn't a power of two if y has bits set that aren't set in x llvm-svn: 185954	2013-07-09 18:11:10 +00:00
Hal Finkel	6c29bd9088	DAGCombine tryFoldToZero cannot create illegal types after type legalization When folding sub x, x (and other similar constructs), where x is a vector, the result is a vector of zeros. After type legalization, make sure that the input zero elements have a legal type. This type may be larger than the result's vector element type. This was another bug found by llvm-stress. llvm-svn: 185949	2013-07-09 17:02:45 +00:00
Ulrich Weigand	52cf8e4488	[PowerPC] Revert r185476 and fix up TLS variant kinds In the commit message to r185476 I wrote: >The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD >correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD. >This causes some confusion with the asm parser, since VK_PPC_TLSGD >is output as @tlsgd, which is then read back in as VK_TLSGD. > >To avoid this confusion, this patch removes the PowerPC-specific >modifiers and uses the generic modifiers throughout. (The only >drawback is that the generic modifiers are printed in upper case >while the usual convention on PowerPC is to use lower-case modifiers. >But this is just a cosmetic issue.) This was unfortunately incorrect, there is in fact another, serious drawback to using the default VK_TLSLD/VK_TLSGD variant kinds: using these causes ELFObjectWriter::RelocNeedsGOT to return true, which in turn causes the ELFObjectWriter to emit an undefined reference to _GLOBAL_OFFSET_TABLE_. This is a problem on powerpc64, because it uses the TOC instead of the GOT, and the linker does not provide _GLOBAL_OFFSET_TABLE_, so the symbol remains undefined. This means shared libraries using TLS built with the integrated assembler are currently broken. While the whole RelocNeedsGOT / _GLOBAL_OFFSET_TABLE_ situation probably ought to be properly fixed at some point, for now I'm simply reverting the r185476 commit. Now this in turn exposes the breakage of handling @tlsgd/@tlsld in the asm parser that this check-in was originally intended to fix. To avoid this regression, I'm also adding a different fix for this problem: while common code now parses @tlsgd as VK_TLSGD, a special hack in the asm parser translates this code to the platform-specific VK_PPC_TLSGD that the back-end now expects. While this is not really pretty, it's self-contained and shouldn't hurt anything else for now. Once the underlying problem is fixed, this hack can be reverted again. llvm-svn: 185945	2013-07-09 16:41:09 +00:00
Vincent Lejeune	ce499744b3	R600: Do not predicate basic blocks with multiple ALU clauses Test is not included as it is several 1000 lines long. To test this functionality, a test case must generate at least 2 ALU clauses, where an ALU clause is ~110 instructions long. NOTE: This is a candidate for the stable branch. llvm-svn: 185943	2013-07-09 15:03:33 +00:00
Vincent Lejeune	b8aac8d720	R600: Fix a rare bug where swizzle optimization returns wrong values llvm-svn: 185942	2013-07-09 15:03:25 +00:00
Vincent Lejeune	a4d8d2ef2b	R600: Fix wrong export reswizzling llvm-svn: 185941	2013-07-09 15:03:19 +00:00
Vincent Lejeune	b55940cc7d	R600: Use DAG lowering pass to handle fcos/fsin NOTE: This is a candidate for the stable branch. llvm-svn: 185940	2013-07-09 15:03:11 +00:00
Joey Gouly	0f12aa2b0f	Add MC assembly/disassembly support for VRINT{A, N, P, M} to V8FP. llvm-svn: 185929	2013-07-09 11:26:18 +00:00
Joey Gouly	3b693c42b5	Add MC assembly/disassembly support for VRINT{Z, X, R} to V8FP. llvm-svn: 185926	2013-07-09 11:03:21 +00:00
Ulrich Weigand	55daa77901	[PowerPC] Support ".machine any" The PowerPC assembler is supposed to provide a directive .machine that allows switching the supported CPU instruction set on the fly. Since we do not yet check CPU feature sets at all and always accept any available instruction, this is not really useful at this point. However, it makes sense to accept (and ignore) ".machine any" to avoid spuriously rejecting existing assembler files that use this. llvm-svn: 185924	2013-07-09 10:00:34 +00:00
Alexander Potapenko	8d2d79d05f	Revert r185872 - "Stop emitting weak symbols into the "coal" sections" This patch broke `make check-asan` on Mac, causing ld warnings like the following one: ld: warning: direct access in __GLOBAL__I_a to global weak symbol ___asan_mapping_scale means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings. The resulting test binaries crashed with incorrect ASan warnings. llvm-svn: 185923	2013-07-09 10:00:16 +00:00
Joey Gouly	2d0175e8fb	Add MC assembly/disassembly support for VCVT{A, N, P, M} to V8FP. llvm-svn: 185922	2013-07-09 09:59:04 +00:00
Richard Sandiford	9784649157	[SystemZ] Use MVC for simple load/store pairs Look for patterns of the form (store (load ...), ...) in which the two locations are known not to partially overlap. (Identical locations are OK.) These sequences are better implemented by MVC unless either the load or the store could use RELATIVE LONG instructions. The testcase showed that we weren't using LHRL and LGHRL for extload16, only sextloadi16. The patch fixes that too. llvm-svn: 185919	2013-07-09 09:46:39 +00:00
Richard Sandiford	47660c148c	[SystemZ] Use "STC;MVC" for memset Use "STC;MVC" for memsets that are too big for two STCs or MV...Is yet small enough for a single MVC. As with memcpy, I'm leaving longer cases till later. The number of tests might seem excessive, but f33 & f34 from memset-04.ll failed the first cut because I'd not added the "?:" on the calculation of Size1. llvm-svn: 185918	2013-07-09 09:32:42 +00:00
David Majnemer	72d76275ac	InstCombine: variations on 0xffffffff - x >= 4 The following transforms are valid if -C is a power of 2: (icmp ugt (xor X, C), ~C) -> (icmp ult X, C) (icmp ult (xor X, C), -C) -> (icmp uge X, C) These are nice, they get rid of the xor. llvm-svn: 185915	2013-07-09 09:20:58 +00:00
David Majnemer	414d4e58aa	InstCombine: X & -C != -C -> X <= u ~C Tests were added in r185910 somehow. llvm-svn: 185912	2013-07-09 08:09:32 +00:00
Ulrich Weigand	78a5a116a0	[PowerPC] Support .llong and fix .word This adds support for the .llong PowerPC-specific assembler directive. In doing so, I noticed that .word is currently incorrect: it is supposed to define a 2-byte data element, not a 4-byte one. llvm-svn: 185911	2013-07-09 07:59:25 +00:00
David Majnemer	bafa537eb7	Commit r185909 was a misapplied patch, fix it llvm-svn: 185910	2013-07-09 07:58:32 +00:00
David Majnemer	f2a9a513c7	InstCombine: add more transforms C1-X <u C2 -> (X|(C2-1)) == C1 C1-X >u C2 -> (X|C2) == C1 X-C1 <u C2 -> (X & -C2) == C1 X-C1 >u C2 -> (X & ~C2) == C1 llvm-svn: 185909	2013-07-09 07:50:59 +00:00
Hal Finkel	dbbf09b28e	PPC: Allocate RS spill slot for unaligned i64 load/store This fixes another bug found by llvm-stress! If we happen to be doing an i64 load or store into a stack slot that has less than a 4-byte alignment, then the frame-index elimination may need to use an indexed load or store instruction (because the offset may not be a multiple of 4, a requirement of the STD/LD instructions). The extra register needed to hold the offset comes from the register scavenger, and it is possible that the scavenger will need to use an emergency spill slot. As a result, we need to make sure that a spill slot is allocated when doing an i64 load/store into a less-than-4-byte-aligned stack slot. Because test cases for things like this tend to be fairly fragile, I've concatenated a few small bugpoint-reduced test cases together to form the regression test. llvm-svn: 185907	2013-07-09 06:34:51 +00:00
Eric Christopher	a95d39251d	CEHCK->CHECK typo fix. llvm-svn: 185875	2013-07-08 21:47:33 +00:00
Eric Christopher	7ca38c6e9f	Fix up whitespace. llvm-svn: 185874	2013-07-08 21:47:31 +00:00
Bill Wendling	0176708e85	Stop emitting weak symbols into the "coal" sections. The Mach-O linker has been able to support the weak-def bit on any symbol for quite a while now. The compiler however continued to place these symbols into a "coal" section, which required the linker to map them back to the base section name. Replace the sections like this: __TEXT/__textcoal_nt instead use __TEXT/__text __TEXT/__const_coal instead use __TEXT/__const __DATA/__datacoal_nt instead use __DATA/__data <rdar://problem/14265330> llvm-svn: 185872	2013-07-08 21:34:52 +00:00
Ulrich Weigand	266db7fe04	[PowerPC] Always use "assembler dialect" 1 A setting in MCAsmInfo defines the "assembler dialect" to use. This is used by common code to choose between alternatives in a multi-alternative GNU inline asm statement like the following: __asm__ ("{sfe|subfe} %0,%1,%2" : "=r" (out) : "r" (in1), "r" (in2)); The meaning of these dialects is platform specific, and GCC defines those for PowerPC to use dialect 0 for old-style (POWER) mnemonics and 1 for new-style (PowerPC) mnemonics, like in the example above. To be compatible with inline asm used with GCC, LLVM ought to do the same. Specifically, this means we should always use assembler dialect 1 since old-style mnemonics really aren't supported on any current platform. However, the current LLVM back-end uses: AssemblerDialect = 1; // New-Style mnemonics. in PPCMCAsmInfoDarwin, and AssemblerDialect = 0; // Old-Style mnemonics. in PPCLinuxMCAsmInfo. The Linux setting really isn't correct, we should be using new-style mnemonics everywhere. This is changed by this commit. Unfortunately, the setting of this variable is overloaded in the back-end to decide whether or not we are on a Darwin target. This is done in PPCInstPrinter (the "SyntaxVariant" is initialized from the MCAsmInfo AssemblerDialect setting), and also in PPCMCExpr. Setting AssemblerDialect to 1 for both Darwin and Linux no longer allows us to make this distinction. Instead, this patch uses the MCSubtargetInfo passed to createPPCMCInstPrinter to distinguish Darwin targets, and ignores the SyntaxVariant parameter. As to PPCMCExpr, this patch adds an explicit isDarwin argument that needs to be passed in by the caller when creating a target MCExpr. (To do so this patch implicitly also reverts commit 184441.) llvm-svn: 185858	2013-07-08 20:20:51 +00:00
Hal Finkel	21ada79757	PPC: Mark vector CC action for SETO and SETONE as Expand Another bug found by llvm-stress! This fixes hitting llvm_unreachable("Invalid integer vector compare condition"); at the end of getVCmpInst in PPCISelDAGToDAG. llvm-svn: 185855	2013-07-08 20:00:03 +00:00
Joey Gouly	392cdad2b1	Add a comment to this change, requested by Eric Christopher. llvm-svn: 185853	2013-07-08 19:52:51 +00:00
Jim Grosbach	24e102a947	ARM: Improve codegen for generic vselect. Fall back to by-element insert rather than building it up on the stack. rdar://14351991 llvm-svn: 185846	2013-07-08 18:18:52 +00:00
Hal Finkel	e39302258e	PPC: Mark vector FREM as Expand by default Another bug found by llvm-stress! This fixes crashing with: LLVM ERROR: Cannot select: v4f32 = frem ... llvm-svn: 185840	2013-07-08 17:30:25 +00:00
Ulrich Weigand	e840ee2ca2	[PowerPC] Support time base instructions This adds support for the old-style time base instructions; while new programs are supposed to use mfspr, the mftb instructions are still supported and in use by existing assembler files. llvm-svn: 185829	2013-07-08 15:20:38 +00:00
Ulrich Weigand	c0944b50fe	[PowerPC] Support basic compare mnemonics This adds support for the basic mnemonics (with the L operand) for the fixed-point compare instructions. These are defined as aliases for the already existing CMPW/CMPD patterns, depending on the value of L. This requires use of InstAlias patterns with immediate literal operands. To make this work, we need two further changes: - define a RegisterPrefix, because otherwise literals 0 and 1 would be parsed as literal register names - provide a PPCAsmParser::validateTargetOperandClass routine to recognize immediate literals (like ARM does) llvm-svn: 185826	2013-07-08 14:49:37 +00:00
Bill Schmidt	2db29ef467	[PowerPC] Fix PR16556 (handle undef ppcf128 in LowerFP_TO_INT). PPCTargetLowering::LowerFP_TO_INT() expects its source operand to be either an f32 or f64, but this is not checked. A long double (ppcf128) operand will normally be custom-lowered to a conversion to f64 in this context. However, this isn't the case for an UNDEF node. This patch recognizes a ppcf128 as a legal source operand for FP_TO_INT only if it's an undef, in which case it creates an undef of the target type. At some point we might want to do a wholesale custom lowering of ISD::UNDEF when the type is ppcf128, but it's not really clear that's a great idea, and probably more work than it's worth for a situation that only arises in the case of a programming error. At this point I think simple is best. The test case comes from PR16556, and is a crash-test only. llvm-svn: 185821	2013-07-08 14:22:45 +00:00
Reid Kleckner	8b4ccc0645	Convert an OCaml binding grep test to FileCheck I shaved this yak because I mistakenly thought that this was one of the last grep tests. Turns out my search was skipping .ll files, for which there are ~1200 more tests using grep. llvm-svn: 185819	2013-07-08 14:14:22 +00:00
David Majnemer	fa90a0b325	InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1 Back in r179493 we determined that two transforms collided with each other. The fix back then was to reorder the transforms so that the preferred transform would give it a try and then we would try the secondary transform. However, it was noted that the best approach would canonicalize one transform into the other, removing the collision and allowing us to optimize IR given to us in that form. llvm-svn: 185808	2013-07-08 11:53:08 +00:00
Nico Rieck	51969be724	Reuse %rax after calling __chkstk on win64 Reapply this as I reverted the wrong commit. llvm-svn: 185807	2013-07-08 11:20:11 +00:00
Nico Rieck	4801303ce1	Revert "Proper va_arg/va_copy lowering on win64" This reverts commit 2b52880592a525cfe04d8f9008a35da8c2ea94c3. Needs review. llvm-svn: 185806	2013-07-08 11:19:44 +00:00
Richard Sandiford	d131ff8cf8	[SystemZ] Use MVC for memcpy Use MVC for memcpy in cases where a single MVC is enough. Using MVC is a win for longer copies too, but I'll leave that for later. llvm-svn: 185802	2013-07-08 09:35:23 +00:00
NAKAMURA Takumi	fed0ccfb9c	llvm/test/CMakeLists.txt: Add llvm-cov in "check-clang". llvm-svn: 185801	2013-07-08 08:44:36 +00:00
NAKAMURA Takumi	b39be04164	llvm/test/CMakeLists.txt: Reformat LLVM_TEST_DEPENDS. llvm-svn: 185800	2013-07-08 08:44:30 +00:00
NAKAMURA Takumi	1be81b4d1c	llvm/test/Other/llvm-cov.test: It requires +Asserts to let XFAILed. llvm-svn: 185799	2013-07-08 08:44:24 +00:00
Hal Finkel	8cb9a0e1d3	Fix PromoteIntRes_BUILD_VECTOR crash with i1 vectors This fixes a bug (found by llvm-stress) in DAGTypeLegalizer::PromoteIntRes_BUILD_VECTOR where it assumed that the result type would always be larger than the original operands. This is not always true, however, with boolean vectors. For example, promoting a node of type v8i1 (where the operands will be of type i32, the type to which i1 is promoted) will yield a node with a result vector element type of i16 (and operands of type i32). As a result, we cannot blindly assume that we can ANY_EXTEND the operands to the result type. llvm-svn: 185794	2013-07-08 06:16:58 +00:00
Kai Nacke	c5cca5ab42	Revert: Fix wrong code offset for unwind code SET_FPREG. llvm-svn: 185793	2013-07-08 04:48:34 +00:00
Kai Nacke	939ecd7ea0	Revert: Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH data structures. llvm-svn: 185791	2013-07-08 04:46:55 +00:00
Kai Nacke	07bad44e9b	Revert: Fix alignment of unwind data. llvm-svn: 185790	2013-07-08 04:45:05 +00:00
Hal Finkel	ec474f28e3	Add the nearbyint -> FNEARBYINT mapping to BasicTargetTransformInfo This fixes an oversight that Intrinsic::nearbyint was not being mapped to ISD::FNEARBYINT (thus fixing the over-optimistic cost we were assigning to nearbyint calls for some targets). llvm-svn: 185783	2013-07-08 03:24:07 +00:00
Michael Gottesman	8c96263ee3	[objc-arc] Committed test for r185770 as per dblaikie's suggestion. llvm-svn: 185782	2013-07-08 02:13:47 +00:00
Nico Rieck	43b51056d6	Revert "Reuse %rax after calling __chkstk on win64" This reverts commit 01f8d579f7672872324208ac5bc4ac311e81b22e. llvm-svn: 185781	2013-07-08 01:30:57 +00:00
Nico Rieck	7adf6111a8	Reuse %rax after calling __chkstk on win64 llvm-svn: 185778	2013-07-07 16:48:39 +00:00
Nick Lewycky	c0514629c9	Eliminate trivial redundant loads across nocapture+readonly calls to uncaptured pointer arguments. llvm-svn: 185776	2013-07-07 10:15:16 +00:00
Nadav Rotem	2041b742d4	SLPVectorizer: Implement DCE as part of vectorization. This is a complete re-write of the bottom-up vectorization class. Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization. There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design. In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree. llvm-svn: 185774	2013-07-07 06:57:07 +00:00
Michael Gottesman	618df456e2	[objc-arc] Remove the alias analysis part of r185764. Upon further reflection, the alias analysis part of r185764 is not a safe change. llvm-svn: 185770	2013-07-07 04:18:03 +00:00
Michael Gottesman	a72630d453	[objc-arc] Teach the ARC optimizer that objc_sync_enter/objc_sync_exit do not modify the ref count of an objc object and additionally are inert for modref purposes. llvm-svn: 185769	2013-07-07 01:52:55 +00:00
Joey Gouly	2efaa733a2	Add MC support for the v8fp instructions: vmaxnm and vminnm. llvm-svn: 185767	2013-07-06 20:50:18 +00:00
Nico Rieck	99ef2890c0	Proper va_arg/va_copy lowering on win64 llvm-svn: 185763	2013-07-06 18:08:19 +00:00
Kai Nacke	4417cccba3	Fix alignment of unwind data. For alignment purposes, the instruction array will always have an even number of entries, with the final entry potentially unused (in which case the array will be one longer than indicated by the count of unwind codes field). Reviewed by Charles Davis and Nico Rieck. llvm-svn: 185760	2013-07-06 17:16:50 +00:00
Kai Nacke	2a933a6549	Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH data structures. The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB instead of IMAGE_REL_AMD64_ADDR32. This is easily achieved by adding the VK_COFF_IMGREL32 modifier to the symbol reference. Change also references to start and end of the SEH range of a function as offsets to start of the function. Reviewed by Charles Davis and Nico Rieck. llvm-svn: 185759	2013-07-06 17:16:12 +00:00
Kai Nacke	66bfdb8354	Fix wrong code offset for unwind code SET_FPREG. The code offset for unwind code SET_FPREG is wrong because it is set to constant 0. The fix is to do the same as for the other unwind codes: emit a label and later the absolute difference between the label and the begin of the prologue. Also enables the failing test case MC/COFF/seh.s Reviewed by Charles Davis and Nico Rieck. llvm-svn: 185758	2013-07-06 17:15:36 +00:00
Benjamin Kramer	c7332b2796	DAGCombiner: Don't drop extension behavior when shrinking a load when unsafe. ReduceLoadWidth unconditionally drops extensions from loads. Limit it to the case when all of the bits the extension would otherwise produce are dropped by the shrink. It would be possible to shrink the load in more cases by merging the extensions, but this isn't trivial and a very rare case. I left a TODO for that case. Fixes PR16551. llvm-svn: 185755	2013-07-06 14:05:09 +00:00
Tim Northover	dab4db5372	Stop putting operations after a tail call. This prevents the emission of DAG-generated vreg definitions after a tail call by dropping them entirely (on the grounds that nothing could use them anyway, and they interfere with O0 CodeGen). llvm-svn: 185754	2013-07-06 12:58:45 +00:00
Nico Rieck	a37acf702d	MC: Implement COFF .linkonce directive llvm-svn: 185753	2013-07-06 12:13:10 +00:00
David Majnemer	69430609ff	InstCombine: typo in or_icmp_eq_B_0_icmp_ult_A_B test llvm-svn: 185737	2013-07-06 00:54:07 +00:00
Nick Lewycky	c2ec0725ce	Extend 'readonly' and 'readnone' to work on function arguments as well as functions. Make the function attributes pass add it to known library functions and when it can deduce it. llvm-svn: 185735	2013-07-06 00:29:58 +00:00
Michael Gottesman	275b22e310	[TRE] Combined another test into basic.ll llvm-svn: 185729	2013-07-05 22:24:06 +00:00
Michael Gottesman	e283e1958a	[TRE] Merged several tests into the test basic.ll. llvm-svn: 185723	2013-07-05 20:45:13 +00:00
Arnold Schwaighofer	97c1343c45	ARM: Add a pack pattern for matching arithmetic shift right llvm-svn: 185714	2013-07-05 18:57:49 +00:00
Arnold Schwaighofer	50b76b5226	ARM: Fix incorrect pack pattern A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and logic shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16bits instead of zeros if "y" is negative. radar://14338767 llvm-svn: 185712	2013-07-05 18:28:39 +00:00
Richard Sandiford	c40f27b52d	[SystemZ] Remove no-op MVCs The stack coloring pass has code to delete stores and loads that become trivially dead after coloring. Extend it to cope with single instructions that copy from one frame index to another. The testcase happens to show an example of this kicking in at the moment. It did occur in Real Code too though. llvm-svn: 185705	2013-07-05 14:38:48 +00:00
Richard Sandiford	b5d9bd6f59	Fix double renaming bug in stack coloring pass The stack coloring pass renumbered frame indexes with a loop of the form: for each frame index FI for each instruction I that uses FI for each use of FI in I rename FI to FI' This caused problems if an instruction used two frame indexes F0 and F1 and if F0 was renamed to F1 and F1 to F2. The first time we visited the instruction we changed F0 to F1, then we changed both F1s to F2. In other words, the problem was that SSRefs recorded which instructions used an FI, but not which MachineOperands and MachineMemOperands within that instruction used it. This is easily fixed for MachineOperands by walking the instructions once and processing each operand in turn. There's already a loop to do that for dead store elimination, so it seemed more efficient to fuse the two at the block level. MachineMemOperands are more tricky because they can be shared between instructions. The patch handles them by making SSRefs an array of MachineMemOperands rather than an array of MachineInstrs. We might end up processing the same MachineMemOperand twice, but that's OK because we always know from the SSRefs index what the original frame index was. llvm-svn: 185703	2013-07-05 14:24:47 +00:00
Richard Sandiford	8976ea72ab	[SystemZ] Enable the use of MVC for frame-to-frame spills ...now that the problem that prompted the restriction has been fixed. The original spill-02.py was a compromise because at the time I couldn't find an example that actually failed without the two scavenging slots. The version included here did. llvm-svn: 185701	2013-07-05 14:02:01 +00:00
Ulrich Weigand	b204431106	[PowerPC] Add some special @got@tprel fixup cases When a target@got@tprel or target@got@tprel@l symbol variant is used in a fixup_ppc_half16 (not fixup_ppc_half16ds) context, we currently fail, since the corresponding R_PPC64_GOT_TPREL16 / R_PPC64_GOT_TPREL16_LO relocation types do not exist. However, since such symbol variants resolve to GOT offsets which are always 4-aligned, we can simply instead use the _DS variants of the relocation types, which do exist. The same applies for the @got@dtprel variants. llvm-svn: 185700	2013-07-05 13:49:46 +00:00
Richard Sandiford	23943229f6	[SystemZ] Allocate a second register scavenging slot This is another prerequisite for frame-to-frame MVC copies. I'll commit the patch that makes use of the slot separately. The downside of trying to test many corner cases with each of the available addressing modes is that a fair few tests need to account for the new frame layout. I do still think it's useful to have all these tests though, since it's something that wouldn't get much coverage otherwise. llvm-svn: 185698	2013-07-05 13:11:52 +00:00
Rafael Espindola	8ef843fc72	Don't create an archive if, for example, we are asked to print the index. llvm-svn: 185697	2013-07-05 13:03:07 +00:00
Ulrich Weigand	5abd12fc32	[PowerPC] Make test case buildable with GNU as The ppc64-fixups.s test currently fails to build with GNU as, since it does not support plain symbols as arguments to li/lis. Rewrite the test for R_PPC64_ADDR16 and R_PPC64_REL16 to use lwz instead. Allowing the test case to be built with both LLVM and GNU as makes it easier to spot unwanted difference in the output. llvm-svn: 185694	2013-07-05 12:33:03 +00:00
Ulrich Weigand	5b427591d6	[PowerPC] Support @tls in the asm parser This adds support for the last missing construct to parse TLS-related assembler code: add 3, 4, symbol@tls The ADD8TLS currently hard-codes the @tls into the assembler string. This cannot be handled by the asm parser, since @tls is parsed as a symbol variant. This patch changes ADD8TLS to have the @tls suffix printed as symbol variant on output too, which allows us to remove the isCodeGenOnly marker from ADD8TLS. This in turn means that we can add an AsmOperand to accept @tls marked symbols on input. As a side effect, this means that the fixup_ppc_tlsreg fixup type is no longer necessary and can be merged into fixup_ppc_nofixup. llvm-svn: 185692	2013-07-05 12:22:36 +00:00
Joey Gouly	606f3fbc2b	PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm. In the SelectionDAG immediate operands to inline asm are constructed as two separate operands. The first is a constant of value InlineAsm::Kind_Imm and the second is a constant with the value of the immediate. In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we should skip over the next operand too. llvm-svn: 185688	2013-07-05 10:19:40 +00:00
David Majnemer	c2a990bc00	InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) This transform allows us to turn IR that looks like: %1 = icmp eq i64 %b, 0 %2 = icmp ult i64 %a, %b %3 = or i1 %1, %2 ret i1 %3 into: %0 = add i64 %b, -1 %1 = icmp uge i64 %0, %a ret i1 %1 which means we go from lowering: cmpq %rsi, %rdi setb %cl testq %rsi, %rsi sete %al orb %cl, %al ret to lowering: decq %rsi cmpq %rdi, %rsi setae %al ret llvm-svn: 185677	2013-07-05 00:31:17 +00:00
David Blaikie	9a300bda38	DebugInfo: Consider global variables without locations to be valid We were being a bit too aggressive here in classifying global variables with no global reference or constant value to be invalid - this would cause LLVM to not emit the DWARF description of the global variable if it had been optimized away, which isn't helpful for users who might benefit from the global variable's description even if there's no location information. This also fixes a crasher issue here that I was unable to reduce a test case for - involving a using decl (& subsequent DW_TAG_imported_declaration ) of such a global variable that, once optimized away, would crash when an attempt to emit the imported declaration was made. llvm-svn: 185675	2013-07-04 23:15:18 +00:00
Nico Rieck	1558c5a6ee	MC: Add .section directive to COFF Supports GAS flags "abdnrswxy". No support for alignment or subsections. Fixes PR16366. llvm-svn: 185669	2013-07-04 21:32:07 +00:00
David Majnemer	37f8f445de	InstCombine: Reimplementation of visitUDivOperand This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667	2013-07-04 21:17:49 +00:00
Rafael Espindola	1cbed22836	Add support for archives with no symbol table or string table. llvm-svn: 185664	2013-07-04 19:40:23 +00:00
Ulrich Weigand	d3ac7c058b	[PowerPC] Implement writeNopData This implements a proper PPCAsmBackend::writeNopData routine that actually writes PowerPC nop instructions. This fixes the last remaining difference in object file output (text section) between the integrated assembler and GNU as that I've seen anywhere. llvm-svn: 185662	2013-07-04 18:28:46 +00:00
Rafael Espindola	31c3b2ddee	Add 'not' in front of a command that is expected to fail. llvm-svn: 185659	2013-07-04 17:21:01 +00:00
Joey Gouly	cc4ff9e907	Add support for MC assembling and disassembling of vsel{ge, gt, eq, vs} instructions. This adds a new decoder table/namespace 'VFPV8', as these instructions have their top 4 bits as 0b1111, while other Thumb instructions have 0b1110. llvm-svn: 185642	2013-07-04 14:57:20 +00:00
Ulrich Weigand	56b0e7b011	[PowerPC] Add all trap mnemonics This adds support for all basic and extended variants of the trap instructions to the asm parser. llvm-svn: 185638	2013-07-04 14:40:12 +00:00
Ulrich Weigand	b86cb7d04b	[PowerPC] Add asm parser support for CR expressions This adds support for specifying condition registers and condition register fields via expressions using the symbols defined by the PowerISA, like "4*cr2+eq". llvm-svn: 185633	2013-07-04 14:24:00 +00:00
Benjamin Kramer	371722288c	SimplifyCFG: Teach switch generation some patterns that instcombine forms. This allows us to create switches even if instcombine has munged two of the incoming compares into one and some bit twiddling. This was motivated by enum compares that are common in clang. llvm-svn: 185632	2013-07-04 14:22:02 +00:00
Joey Gouly	39f7488294	Add a V8FP instruction 'vcvt{b,t}' to convert between half and double precision. llvm-svn: 185620	2013-07-04 10:04:08 +00:00
Quentin Colombet	04b3a0fdb2	[ARM] Improve the instruction selection of vector loads. In the ARM back-end, build_vector nodes are lowered to a target specific build_vector that uses floating point type. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. In other words, this conversion may introduce artificial dependencies when the code leading to the build vector cannot be completed with a floating point type. In particular, this happens when loads are not aligned. Before this patch, in that case, the compiler generates general purpose loads and creates the floating point vector from them, instead of directly using the vector unit. The patch uses a vector friendly sequence of code when the inserted bitcasts to floating point survived DAGCombine. This is done by a target specific DAGCombine that changes the target specific build_vector into a sequence of insert_vector_elt that get rid of the bitcasts. <rdar://problem/14170854> llvm-svn: 185587	2013-07-03 21:42:57 +00:00
Tilmann Scheller	ef5666fbbf	ARM: Prevent ARMAsmParser::shouldOmitCCOutOperand() from misidentifying certain Thumb2 add immediate T3 encodings. Before the fix Thumb2 instructions of type "add rD, rN, #imm" (T3 encoding, see ARM ARM A8.8.4) with rD and rN both being low registers (r0-r7) were classified as having the T4 encoding. The T4 encoding doesn't have a cc_out operand so for above instructions the operand gets erroneously removed, corrupting the token stream and leading to parse errors later in the process. This bug prevented "add r1, r7, #0xcbcbcbcb" from being assembled correctly. Fixes <rdar://problem/14224440>. llvm-svn: 185575	2013-07-03 20:38:01 +00:00
Ulrich Weigand	2542b3b17f	[PowerPC] Support lmw/stmw in the asm parser This adds support for the load/store multiple instructions, currently used by the asm parser only. llvm-svn: 185564	2013-07-03 18:29:47 +00:00
Ulrich Weigand	49f487e6cd	[PowerPC] Use mtocrf when available Just as with mfocrf, it is also preferable to use mtocrf instead of mtcrf when only a single CR register is to be written. Current code however always emits mtcrf. This probably does not matter when using an external assembler, since the GNU assembler will in fact automatically replace mtcrf with mtocrf when possible. It does create inefficient code with the integrated assembler, however. To fix this, this patch adds MTOCRF/MTOCRF8 instruction patterns and uses those instead of MTCRF/MTCRF8 everywhere. Just as done in the MFOCRF patch committed as 185556, these patterns will be converted back to MTCRF if MTOCRF is not available on the machine. As a side effect, this allows us to modify the MTCRF pattern to accept the full range of mask operands for the benefit of the asm parser. llvm-svn: 185561	2013-07-03 17:59:07 +00:00
Rafael Espindola	b0fccb225c	Prefix failing commands with not to make clear they are expected to fail. llvm-svn: 185554	2013-07-03 16:41:29 +00:00
Rafael Espindola	8490bbd16b	Remove another old test. It was only passing because 'grep andpd' was not finding any andpd, but we don't fail if part of a pipe fails. llvm-svn: 185552	2013-07-03 16:35:26 +00:00
Rafael Espindola	447dbc38b6	Remove test for the old EH system. It doesn't parse anymore. llvm-svn: 185551	2013-07-03 16:30:01 +00:00
Rafael Espindola	0bdc4bb684	Fix test: It was missing run lines and llvm-dis has no -disable-verify option. llvm-svn: 185550	2013-07-03 16:27:55 +00:00
Rafael Espindola	88ae7dd230	Add support for gnu archives with a string table and no symtab. While there, use early returns to reduce nesting. llvm-svn: 185547	2013-07-03 15:57:14 +00:00
Rafael Espindola	8b82a4d36e	Make llvm-nm return 1 on error. This is a small compatibility improvement with gnu nm and makes llvm-nm more useful as a testing tool. llvm-svn: 185546	2013-07-03 15:46:03 +00:00
Evgeniy Stepanov	dc6d7eb860	[msan] Unpoison stack allocations and undef values in blacklisted functions. This changes behavior of -msan-poison-stack=0 flag from not poisoning stack allocations to actively unpoisoning them. llvm-svn: 185538	2013-07-03 14:39:14 +00:00
Ulrich Weigand	ae9cf5828c	[PowerPC] Support mtspr/mfspr in the asm parser This adds support for the generic forms of mtspr/mfspr for the asm parser. The compiler will continue to use the specialized patterns for mtlr etc. since those are needed to correctly describe data flow. llvm-svn: 185532	2013-07-03 12:32:41 +00:00
Richard Sandiford	ed1fab6b5b	[SystemZ] Fold more spills Add a mapping from register-based <INSN>R instructions to the corresponding memory-based <INSN>. Use it to cut down on the number of spill loads. Some instructions extend their operands from smaller fields, so this required a new TSFlags field to say how big the unextended operand is. This optimisation doesn't trigger for C(G)R and CL(G)R because in practice we always combine those instructions with a branch. Adding a test for every other case probably seems excessive, but it did catch a missed optimisation for DSGF (fixed in r185435). llvm-svn: 185529	2013-07-03 10:10:02 +00:00
Mihai Popa	d36cbaa423	This corrects the implementation of Thumb ADR instruction. There are three issues: 1. it should accept only 4-byte aligned addresses 2. the maximum offset should be 1020 3. it should be encoded with the offset scaled by two bits llvm-svn: 185528	2013-07-03 09:21:44 +00:00
Tim Northover	36b2417f18	ARM: relax the atomic release barrier to "dmb ishst" on Swift Swift cores implement store barriers that are stronger than the ARM specification but weaker than general barriers. They are, in fact, just about enough to provide the ordering needed for atomic operations with release semantics. This patch makes use of that quirk. llvm-svn: 185527	2013-07-03 09:20:36 +00:00
Richard Osborne	a1cff61dec	[XCore] Add ISel pattern for LDWCP Patch by Robert Lytton. llvm-svn: 185518	2013-07-03 07:48:50 +00:00
Michael Gottesman	bed2e82501	Change the gettimeofday test to only test on a posix platform. llvm-svn: 185503	2013-07-03 04:15:22 +00:00
Michael Gottesman	2db11161a8	Added support in FunctionAttrs for adding relevant function/argument attributes for the posix call gettimeofday. This implies annotating it as nounwind and its arguments as nocapture. To be conservative, we do not annotate the arguments with noalias since some platforms do not have restrict on the declaration for gettimeofday. llvm-svn: 185502	2013-07-03 04:00:54 +00:00
Manman Ren	94119ceebb	Trying to fix the bots llvm-svn: 185489	2013-07-03 00:16:11 +00:00
Manman Ren	ac8062bb72	Debug Info: use module flag to set up Dwarf version. Correctly handles ref_addr depending on the Dwarf version. Emit Dwarf with version from module flag. TODO: turn on/off features depending on the Dwarf version. llvm-svn: 185484	2013-07-02 23:40:10 +00:00
Ulrich Weigand	42a09dc12f	[PowerPC] PR16512 - Support TLS call sequences in the asm parser This patch now adds support for recognizing TLS call sequences in the asm parser. This needs a new pattern BL8_TLS, which is like BL8_NOP_TLS except without nop. That pattern is used for the asm parser only. llvm-svn: 185478	2013-07-02 21:31:59 +00:00
Ulrich Weigand	4050995650	[PowerPC] Remove VK_PPC_TLSGD and VK_PPC_TLSLD The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD. This causes some confusion with the asm parser, since VK_PPC_TLSGD is output as @tlsgd, which is then read back in as VK_TLSGD. To avoid this confusion, this patch removes the PowerPC-specific modifiers and uses the generic modifiers throughout. (The only drawback is that the generic modifiers are printed in upper case while the usual convention on PowerPC is to use lower-case modifiers. But this is just a cosmetic issue.) llvm-svn: 185476	2013-07-02 21:29:06 +00:00
Jyotsna Verma	ddca5fa24a	Add 'REQUIRES: object-emission' to DebugInfo/inlined-arguments.ll. llvm-svn: 185465	2013-07-02 19:21:43 +00:00
Ulrich Weigand	0f0398246c	[PowerPC] Support TLS variables in debug info This adds an implementation of getDebugThreadLocalSymbol for (64-bit) PowerPC. This needs to return a generic MCExpr since on ppc64, we need to add a bias of 0x8000 to the value returned by the R_PPC64_DTPREL64 relocation. llvm-svn: 185461	2013-07-02 18:47:35 +00:00
Richard Sandiford	e6e7885591	[SystemZ] Use DSGFR over DSGR in more cases Fixes some cases where we were using full 64-bit division for (sdiv i32, i32) and (sdiv i64, i32). The "32" in "SDIVREM32" just refers to the second operand. The first operand of all DIVREMs is a GR128. llvm-svn: 185435	2013-07-02 15:40:22 +00:00
Richard Sandiford	f6bae1e434	[SystemZ] Use MVC to spill loads and stores Try to use MVC when spilling the destination of a simple load or the source of a simple store. As explained in the comment, this doesn't yet handle the case where the load or store location is also a frame index, since that could lead to two simultaneous scavenger spills, something the backend can't handle yet. spill-02.py tests that this restriction kicks in, but unfortunately I've not yet found a case that would fail without it. The volatile trick I used for other scavenger tests doesn't work here because we can't use MVC for volatile accesses anyway. I'm planning on relaxing the restriction later, hopefully with a test that does trigger the problem... Tests @f8 and @f9 also showed that L(G)RL and ST(G)RL were wrongly classified as SimpleBDX{Load,Store}. It wouldn't be easy to test for that bug separately, which is why I didn't split out the fix as a separate patch. llvm-svn: 185434	2013-07-02 15:28:56 +00:00
Richard Sandiford	1d959008d6	[SystemZ] Add the MVC instruction This is the first use of D(L,B) addressing, which required a fair bit of surgery. For that reason, the patch just adds the instruction definition and the associated assembler and disassembler support. A later patch will actually make use of it for codegen. llvm-svn: 185433	2013-07-02 14:56:45 +00:00
Richard Osborne	e4cc98686a	[XCore] Fix instruction selection for zext, mkmsk instructions. r182680 replaced CountLeadingZeros_32 with a template function countLeadingZeros that relies on using the correct argument type to give the right result. The type passed in the XCore backend after this revision was incorrect in a couple of places. Patch by Robert Lytton. llvm-svn: 185430	2013-07-02 14:46:34 +00:00
Logan Chien	c931fce404	Fix ARM EHABI compact model 1 and 2 without handlerdata. According to ARM EHABI section 9.2, if the __aeabi_unwind_cpp_pr1() or __aeabi_unwind_cpp_pr2() is used, then the handler data must be emitted after the unwind opcodes. The handler data consists of several words, and should be terminated by zero. In case that the .handlerdata directive is not specified by the programmer, we should emit zero to terminate the handler data. llvm-svn: 185422	2013-07-02 12:43:27 +00:00
Tim Northover	6823900e55	DAGCombiner: fix use-counting issue when forming zextload DAGCombiner was counting all uses of a load node when considering whether it's worth combining into a zextload. Really, it wants to ignore the chain and just count real uses. rdar://problem/13896307 llvm-svn: 185419	2013-07-02 09:58:53 +00:00
Hal Finkel	fdbe161b1a	Revert r185257 (InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is being rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415	2013-07-02 05:21:11 +00:00
Hal Finkel	52727c6b82	Cleanup PPC Altivec registers in CSR lists and improve VRSAVE handling There are a couple of (small) related changes here: 1. The printed name of the VRSAVE register has been changed from VRsave to vrsave in order to match the name accepted by GNU binutils. 2. Support for parsing vrsave has been added to the asm parser (it seems that there was no test case specifically covering this code, so I've added one). 3. The list of Altivec registers, which was common to all calling conventions, has been separated out. This allows us to define the base CSR lists, and then lists for each ABI with Altivec included. This allows SjLj, for example, to work correctly on non-Altivec targets without using unnatural definitions of the NoRegs CSR list. 4. VRSAVE is now always reserved on non-Darwin targets and all Altivec registers are reserved when Altivec is disabled. With these changes, it is now possible to compile a function containing __builtin_unwind_init() on Linux/PPC64 with debugging information. This did not work previously because GNU binutils assumes that all .cfi_offset offsets will be 8-byte aligned on PPC64 (and errors out if you provide a non-8-byte-aligned offset). This is not true for the vrsave register, however, because this register is used only on Darwin, GCC does not bother printing a .cfi_offset entry for it (even though there is a slot in the stack frame for it as specified by the ABI). This change allows us to do the same: we will also not print .cfi_offset directives for vrsave. llvm-svn: 185409	2013-07-02 03:39:34 +00:00
David Blaikie	8466ca86fe	PR14728: DebugInfo: TLS variables with -gsplit-dwarf llvm-svn: 185398	2013-07-01 23:55:52 +00:00
Ulrich Weigand	f11efe7f48	[PowerPC] Add support for TLS data relocations This adds support for TLS data relocations and modifiers: .quad target@dtpmod .quad target@tprel .quad target@dtprel Currently exploited by the asm parser only. llvm-svn: 185394	2013-07-01 23:33:29 +00:00
David Blaikie	1b01ae8648	PR16493: DebugInfo with TLS on PPC crashing due to invalid relocation Restrict the current TLS support to X86 ELF for now. Test that we don't produce it on PPC & we can flesh that test case out with the right thing once someone implements it. llvm-svn: 185389	2013-07-01 21:45:25 +00:00
Ulrich Weigand	85c6f7f7a7	[PowerPC] Support all condition register logical instructions This adds support for all missing condition register logical instructions and extended mnemonics to the asm parser. llvm-svn: 185387	2013-07-01 21:40:54 +00:00
Bill Schmidt	48fc20a034	Index: test/CodeGen/PowerPC/reloc-align.ll =================================================================== --- test/CodeGen/PowerPC/reloc-align.ll (revision 0) +++ test/CodeGen/PowerPC/reloc-align.ll (revision 0) @@ -0,0 +1,34 @@ +; RUN: llc -mcpu=pwr7 -O1 < %s | FileCheck %s + +; This test verifies that the peephole optimization of address accesses +; does not produce a load or store with a relocation that can't be +; satisfied for a given instruction encoding. Reduced from a test supplied +; by Hal Finkel. + +target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64" +target triple = "powerpc64-unknown-linux-gnu" + +%struct.S1 = type { [8 x i8] } + +@main.l_1554 = internal global { i8, i8, i8, i8, i8, i8, i8, i8 } { i8 -1, i8 -6, i8 57, i8 62, i8 -48, i8 0, i8 58, i8 80 }, align 1 + +; Function Attrs: nounwind readonly +define signext i32 @main() #0 { +entry: + %call = tail call fastcc signext i32 @func_90(%struct.S1* byval bitcast ({ i8, i8, i8, i8, i8, i8, i8, i8 }* @main.l_1554 to %struct.S1)) +; CHECK-NOT: ld {{[0-9]+}}, main.l_1554@toc@l + ret i32 %call +} + +; Function Attrs: nounwind readonly +define internal fastcc signext i32 @func_90(%struct.S1 byval nocapture %p_91) #0 { +entry: + %0 = bitcast %struct.S1* %p_91 to i64* + %bf.load = load i64* %0, align 1 + %bf.shl = shl i64 %bf.load, 26 + %bf.ashr = ashr i64 %bf.shl, 54 + %bf.cast = trunc i64 %bf.ashr to i32 + ret i32 %bf.cast +} + +attributes #0 = { nounwind readonly "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" } Index: lib/Target/PowerPC/PPCAsmPrinter.cpp =================================================================== --- lib/Target/PowerPC/PPCAsmPrinter.cpp (revision 185327) +++ lib/Target/PowerPC/PPCAsmPrinter.cpp (working copy) @@ -679,7 +679,26 @@ void PPCAsmPrinter::EmitInstruction(const MachineI OutStreamer.EmitRawText(StringRef("\tmsync")); return; } + break; + case PPC::LD: + case PPC::STD: + case PPC::LWA: { + // Verify alignment is legal, so we don't create relocations + // that can't be supported. + // FIXME: This test is currently disabled for Darwin. The test + // suite shows a handful of test cases that fail this check for + // Darwin. Those need to be investigated before this sanity test + // can be enabled for those subtargets. + if (!Subtarget.isDarwin()) { + unsigned OpNum = (MI->getOpcode() == PPC::STD) ? 2 : 1; + const MachineOperand &MO = MI->getOperand(OpNum); + if (MO.isGlobal() && MO.getGlobal()->getAlignment() < 4) + llvm_unreachable("Global must be word-aligned for LD, STD, LWA!"); + } + // Now process the instruction normally. + break; } + } LowerPPCMachineInstrToMCInst(MI, TmpInst, this); OutStreamer.EmitInstruction(TmpInst); Index: lib/Target/PowerPC/PPCISelDAGToDAG.cpp =================================================================== --- lib/Target/PowerPC/PPCISelDAGToDAG.cpp (revision 185327) +++ lib/Target/PowerPC/PPCISelDAGToDAG.cpp (working copy) @@ -1530,6 +1530,14 @@ void PPCDAGToDAGISel::PostprocessISelDAG() { if (GlobalAddressSDNode GA = dyn_cast<GlobalAddressSDNode>(ImmOpnd)) { SDLoc dl(GA); const GlobalValue GV = GA->getGlobal(); + // We can't perform this optimization for data whose alignment + // is insufficient for the instruction encoding. + if (GV->getAlignment() < 4 && + (StorageOpcode == PPC::LD || StorageOpcode == PPC::STD || + StorageOpcode == PPC::LWA)) { + DEBUG(dbgs() << "Rejected this candidate for alignment.\n\n"); + continue; + } ImmOpnd = CurDAG->getTargetGlobalAddress(GV, dl, MVT::i64, 0, Flags); } else if (ConstantPoolSDNode CP = dyn_cast<ConstantPoolSDNode>(ImmOpnd)) { llvm-svn: 185380	2013-07-01 20:52:27 +00:00
Chad Rosier	fa705ee36c	[ARMAsmParser] Sort the ARM register lists based on the encoding value, not the tablegen enum values. This should be the last fix due to fallout from r185094. llvm-svn: 185379	2013-07-01 20:49:23 +00:00
Ulrich Weigand	f7152a8596	[PowerPC] Also add "msync" alias This adds an alias for "msync" (which is used on Book E systems instead of "sync"). llvm-svn: 185375	2013-07-01 20:39:50 +00:00
Akira Hatanaka	263c6af8f3	[mips] Increase the number of floating point control registers available to 32. Create a dedicated register class for floating point condition code registers and move FCC0 from register class CCR to the new register class. llvm-svn: 185373	2013-07-01 20:31:44 +00:00
Akira Hatanaka	8b5b1e072f	[mips] Fix test case to check that mips64 instructions are generated. llvm-svn: 185371	2013-07-01 20:18:58 +00:00
Anton Korobeynikov	ba8f4c5e29	Really fix the test. Sorry for the breakage... llvm-svn: 185369	2013-07-01 19:51:36 +00:00
Anton Korobeynikov	0267837076	Fix the test which relies on uncommitted change llvm-svn: 185368	2013-07-01 19:50:31 +00:00
Anton Korobeynikov	82bedb1f3b	Add jump tables handling for MSP430. Patch by Job Noorman! llvm-svn: 185364	2013-07-01 19:44:44 +00:00
Cameron Zwarich	867bfcd546	Fix PR16508. When phis get lowered, destination copies are inserted using an iterator that is determined once for all phis in the block, which BuildMI interprets as a request to insert an instruction directly before the iterator. In the case of a cyclic phi, source copies may also be inserted directly before this iterator, which can cause source copies to be inserted before destination copies. The fix is to keep an iterator to the last phi and then advance it while lowering each phi in order to insert destination copies directly after the phis. llvm-svn: 185363	2013-07-01 19:42:46 +00:00
Hal Finkel	25e4a0d418	Don't form PPC CTR loops for over-sized exit counts Although you can't generate this from C on PPC64, if you have a loop using a 64-bit counter on PPC32 then you can't form a CTR-based loop for it. This had been causing the PPCCTRLoops pass to assert. Thanks to Joerg Sonnenberger for providing a test case! llvm-svn: 185361	2013-07-01 19:34:59 +00:00
Tim Northover	8625fd8cad	AArch64: correct CodeGen of MOVZ/MOVK combinations. According to the AArch64 ELF specification (4.6.8), it's the assembler's responsibility to make sure the shift amount is correct in relocated MOVZ/MOVK instructions. This wasn't being obeyed by either the MCJIT CodeGen or RuntimeDyldELF (which happened to work out well for JIT tests). This commit should make us compliant in this area. llvm-svn: 185360	2013-07-01 19:23:10 +00:00
Matt Beaumont-Gay	8b30c13e12	(1) Add ".test" to test/Other/lit.local.cfg, so llvm-cov.test is actually run. (2) Rename llvm-cov test inputs so the string "llvm-cov" doesn't get substituted by lit within the input filenames on the RUN line. (3) XFAIL llvm-cov.test because it asserts: include/llvm/ADT/SmallVector.h:140: reference llvm::SmallVectorTemplateCommon<llvm::GCOVBlock , void>::operator[](unsigned int) [T = llvm::GCOVBlock ]: Assertion `begin() + idx < end()' failed. llvm-svn: 185358	2013-07-01 18:58:53 +00:00
Tim Northover	7f3d9e1f36	Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst") Turns out I'd misread the architecture reference manual and thought that was a load/store-store barrier, when it's not. Thanks for pointing it out Eli! llvm-svn: 185356	2013-07-01 18:37:33 +00:00
Ulrich Weigand	3a75861b06	[PowerPC] Fix @got references to local symbols A @got reference must always result in a relocation, so that the linker has a chance to set up the GOT entry, even if the symbol happens to be local. Add a PPCELFObjectWriter::ExplicitRelSym routine that enforces a relocation to be emitted for GOT references. llvm-svn: 185353	2013-07-01 18:19:56 +00:00
Ulrich Weigand	7a9fcdf6fb	[PowerPC] Add "wait" instruction This adds the "wait" instruction and its extended mnemonics. llvm-svn: 185350	2013-07-01 17:21:23 +00:00
Ulrich Weigand	98fcc7b6bc	[PowerPC] Support "eieio" instruction This adds support for the "eieio" instruction to the asm parser. llvm-svn: 185349	2013-07-01 17:06:26 +00:00
Ulrich Weigand	421843229c	[PowerPC] Add some existing instructions to ppc64-encoding-bookII.s The test case had a couple of FIXMEs where the instruction is in fact already supported by the back-end. In some other cases, while the generic form of the instruction is not yet supported, a specialized form is. This adds tests for those already supported instructions / instruction forms. llvm-svn: 185347	2013-07-01 16:52:55 +00:00
Ulrich Weigand	797f1a3f5b	[PowerPC] Add variants of "sync" instruction This adds support for the "sync $L" instruction with operand, and provides aliases for "lwsync" and "ptesync". llvm-svn: 185344	2013-07-01 16:37:52 +00:00
Tim Northover	953abab40a	ARM: relax the atomic release barrier to "dmb ishst" I believe the full "dmb ish" barrier is not required to guarantee release semantics for atomic operations. The weaker "dmb ishst" prevents previous operations being reordered with a store executed afterwards, which is enough. A key point to note (fortunately already correct) is that this barrier alone is insufficient for sequential consistency, no matter how liberally placed. llvm-svn: 185339	2013-07-01 14:48:48 +00:00
Justin Holewinski	d2bbdf05e0	[NVPTX] Add support for module-scope inline asm Since we were explicitly not calling AsmPrinter::doInitialization, any module-scope inline asm was not being printed. llvm-svn: 185336	2013-07-01 13:00:14 +00:00
Justin Holewinski	51cb1349dc	[NVPTX] 64-bit ADDC/ADDE are not legal llvm-svn: 185333	2013-07-01 12:59:04 +00:00
Justin Holewinski	dff28d215f	[NVPTX] Fix vector loads from parameters that span multiple loads, and fix some typos llvm-svn: 185332	2013-07-01 12:59:01 +00:00
Justin Holewinski	a2911283e4	[NVPTX] Handle signext/zeroext attributes properly Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value. Also change some SIGN_EXTEND to ANY_EXTEND because we really don't care and may have more opportunity to fold subexpressions llvm-svn: 185331	2013-07-01 12:58:58 +00:00
Justin Holewinski	318c625ff4	[NVPTX] Add support for native SIGN_EXTEND_INREG where available llvm-svn: 185330	2013-07-01 12:58:56 +00:00
Justin Holewinski	e40e929eb1	[NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu. llvm-svn: 185329	2013-07-01 12:58:52 +00:00
Justin Holewinski	e8c93e3378	[NVPTX] Make sure we zero out high-order 24 bits for 8-bit load into 32-bit value llvm-svn: 185328	2013-07-01 12:58:48 +00:00
NAKAMURA Takumi	234acdfdc8	llvm-symbolizer: Recognize a drive letter on win32. Then "REQUIRES: shell" can be removed. FIXME: Could we use llvm::sys::Path here? llvm-svn: 185322	2013-07-01 09:51:42 +00:00
Serge Pavlov	ff9a65c6a6	Added the test missed from r185080. llvm-svn: 185316	2013-07-01 09:02:33 +00:00
Arnold Schwaighofer	ef51cf202b	LoopVectorize: Math functions only read rounding mode Math functions are marked as readonly because they read the floating point rounding mode. Because we don't vectorize loops that would contain function calls that set the rounding mode it is safe to ignore this memory read. llvm-svn: 185299	2013-07-01 00:54:44 +00:00
Stephen Lin	2e551adcd9	DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be llvm-svn: 185290	2013-06-30 20:26:21 +00:00
Benjamin Kramer	cc846016bf	ConstantFold: Check that truncating the other side is safe under a sext when trying to remove a sext from a compare. Fixes PR16462. llvm-svn: 185284	2013-06-30 13:47:43 +00:00
David Majnemer	7a69d2c06a	ValueTracking: Teach isKnownToBeAPowerOfTwo about (ADD X, (XOR X, Y)) where X is a power of two This allows us to simplify urem instructions involving the add+xor to turn into simpler math. llvm-svn: 185272	2013-06-29 23:44:53 +00:00
Benjamin Kramer	4093f29366	InstCombine: Also turn selects fed by an and into arithmetic when the types don't match. Inserting a zext or trunc is sufficient. This pattern is somewhat common in LLVM's pointer mangling code. llvm-svn: 185270	2013-06-29 21:17:04 +00:00
Vincent Lejeune	77a8352476	R600: Support schedule and packetization of trans-only inst llvm-svn: 185268	2013-06-29 19:32:43 +00:00
David Majnemer	5953d3712a	InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like: %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0 %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0 %cmp.i = icmp ult i8* %gep.i, %gep2.i %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = icmp ne i1 %cmp.i, %cmp.i1 ret i1 %cmp into: %cmp.i = icmp slt [1 x i8]* %a, %b %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = xor i1 %cmp.i, %cmp.i1 ret i1 %cmp By preserving the original sign, we now get: ret i1 false This fixes PR16483. llvm-svn: 185259	2013-06-29 10:28:04 +00:00
David Majnemer	797227eea6	InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257	2013-06-29 08:40:07 +00:00
David Majnemer	b889e405eb	InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2) We may, after other optimizations, find ourselves with IR that looks like: %shl = shl i32 1, %y %cmp = icmp ult i32 %shl, 32 Instead, we should just compare the shift count: %cmp = icmp ult i32 %y, 5 llvm-svn: 185242	2013-06-28 23:42:03 +00:00
Jakob Stoklund Olesen	0b075103cd	Minimize precision loss when computing cyclic probabilities. Allow block frequencies to exceed 32 bits by using the new BlockFrequency division function. llvm-svn: 185236	2013-06-28 22:40:43 +00:00
Hal Finkel	ac1a24b508	PPC: Ignore spill/restore requests for VRSAVE (except on Darwin) This fixes PR16418, which reports that a function calling __builtin_unwind_init() asserts. The cause is that this generates a spill/restore for VRSAVE, and we support that only on Darwin (because VRSAVE is only really used on Darwin). The test case checks only that we don't crash. We can add correctness checks once someone verifies what behavior the function is supposed to have. llvm-svn: 185235	2013-06-28 22:29:56 +00:00
Nadav Rotem	060be733a5	SLP Vectorizer: Add support for trees with external users. To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated. llvm-svn: 185230	2013-06-28 22:07:09 +00:00
Daniel Malea	4146b0404e	Adding tests for DebugIR pass - lit tests verify that each line of input LLVM IR gets a !dbg node and a corresponding entry of metadata that contains the line number - unit tests verify that DebugIR works as advertised in the interface - refactored some useful IR generation functionality from the MCJIT unit tests so it can be reused llvm-svn: 185212	2013-06-28 20:37:20 +00:00
Hal Finkel	147c287d91	Fix CodeGen/PowerPC/stack-protector.ll on OpenBSD On OpenBSD, the stack-smash protection transform uses "__guard_local" and "__stack_smash_handler" instead of "__stack_chk_guard" and "__stack_chk_fail". However, CodeGen/PowerPC/stack-protector.ll doesn't specify a target OS, so on OpenBSD it fails. Add -mtriple=ppc32-unknown-linux to make the test host-OS agnostic. While there, convert to FileCheck. Patch by Matthew Dempsky. llvm-svn: 185206	2013-06-28 20:18:14 +00:00
David Blaikie	f269497068	DebugInfo: PR14728: TLS support Based on GCC's output for TLS variables (OP_constNu, x@dtpoff, OP_lo_user), this implements debug info support for TLS in ELF. Verified that this output is correct/sufficient on Linux (using gold - if you're using binutils-ld, you'll need something with the fix for http://sourceware.org/bugzilla/show_bug.cgi?id=15685 in it). Support on non-ELF is sort of "arbitrary" at the moment - if Apple folks want to discuss (or just go ahead & implement) how this should work in MachO, etc, I'm open. llvm-svn: 185203	2013-06-28 20:05:11 +00:00
Hal Finkel	4ca70100de	Fix a PPC rlwimi instruction-selection bug Under certain (evidently rare) circumstances, this code used to convert OR(a, AND(x, y)) into OR(a, x). This was incorrect. While there, I've added a comment to the code immediately above. llvm-svn: 185201	2013-06-28 20:00:07 +00:00
Preston Briggs	6c286b6029	(no commit message) llvm-svn: 185187	2013-06-28 18:44:48 +00:00
Lang Hames	c22e39d83d	Add missing case to switch statement - DAGTypeLegalizer::ExpandIntegerResult should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP. Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash during isel. <rdar://problem/14074644> llvm-svn: 185186	2013-06-28 18:36:42 +00:00
Justin Holewinski	af258be134	[NVPTX] Add (1.0 / sqrt(x)) => rsqrt(x) generation when allowable by FP flags llvm-svn: 185178	2013-06-28 17:58:13 +00:00
Justin Holewinski	e04e4bdf71	[NVPTX] Calling conventions fix Fix ABI handling for function returning bool -- use st.param.b32 to return the value and use ld.param.b32 in caller to load the return value. llvm-svn: 185177	2013-06-28 17:58:10 +00:00
Justin Holewinski	dc372df63b	[NVPTX] Add support for cttz/ctlz/ctpop llvm-svn: 185176	2013-06-28 17:58:07 +00:00
Justin Holewinski	dc5e3b68f5	[NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns Test case is no breakage llvm-svn: 185175	2013-06-28 17:58:04 +00:00
Justin Holewinski	f8f7091722	[NVPTX] Remove i8 register class. PTX support for i8 (.b8, .u8, .s8) is rather poor and we're better off just ignoring it and letting LLVM expand all i8 ops out to i16. llvm-svn: 185174	2013-06-28 17:57:59 +00:00
Justin Holewinski	120baee819	[NVPTX] Add support for vectorized function return values llvm-svn: 185173	2013-06-28 17:57:55 +00:00
Justin Holewinski	44f5c60e58	[NVPTX] Clean up handling of formal arguments and enable generation of vector parameter loads llvm-svn: 185172	2013-06-28 17:57:53 +00:00
Weiming Zhao	a3d87a1024	Bug 13662: Enable GPRPair for all i64 operands of inline asm on ARM This patch assigns paired GPRs for inline asm with 64-bit data on ARM. It's enabled for both ARM and Thumb to support modifiers like %H, %Q, %R. llvm-svn: 185169	2013-06-28 17:26:02 +00:00
Tom Stellard	c026e8bc8e	R600: Add local memory support via LDS Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185162	2013-06-28 15:47:08 +00:00
Tom Stellard	ce540330df	R600: Add support for GROUP_BARRIER instruction Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 185161	2013-06-28 15:46:59 +00:00
Tim Northover	7cbc21529d	ARM: ensure fixed-point conversions have sane types We were generating intrinsics for NEON fixed-point conversions that didn't exist (e.g. float -> i16). There are two cases to consider: + iN is smaller than float. In this case we can do the conversion but need an extend or truncate as well. + iN is larger than float. In this case using the NEON conversion would be incorrect so we don't perform any combining. llvm-svn: 185158	2013-06-28 15:29:25 +00:00
Tilmann Scheller	de09fae38d	ARM: Fix pseudo-instructions for SRS (Store Return State). The mapping between SRS pseudo-instructions and SRS native instructions was incorrect, the correct mapping is: srsfa -> srsib srsea -> srsia srsfd -> srsdb srsed -> srsda This fixes <rdar://problem/14214734>. llvm-svn: 185155	2013-06-28 15:09:46 +00:00
Alexey Samsonov	7323383bd7	llvm-symbolizer: skip leading underscore in Mach-O symbol table entries llvm-svn: 185151	2013-06-28 14:25:52 +00:00
Alexey Samsonov	2ca6536d7a	llvm-symbolizer: add support for Mach-O universal binaries llvm-svn: 185137	2013-06-28 08:15:40 +00:00
Manman Ren	983a16c08a	Debug Info: clean up usage of Verify. No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. For cases where we know the type of a DI metadata, use assert. Also update testing cases to make them conform to the format of DI classes. llvm-svn: 185135	2013-06-28 05:43:10 +00:00
David Blaikie	c3ccdbe2bf	Integrated Assembler: Support X86_64_DTPOFF64 relocations llvm-svn: 185131	2013-06-28 04:24:32 +00:00
Matt Arsenault	fbfdced30f	Convert tests to FileCheck llvm-svn: 185124	2013-06-28 01:29:35 +00:00
Arnold Schwaighofer	12ecb331af	LoopVectorize: Preserve debug location info radar://14169017 llvm-svn: 185122	2013-06-28 00:38:54 +00:00
Arnold Schwaighofer	38de7cd464	LoopVectorize: Cache edge masks created during if-conversion Otherwise, we end up with an exponential IR blowup. Fixes PR16472. llvm-svn: 185097	2013-06-27 20:31:06 +00:00
Chad Rosier	ccd0664393	Improve the compression of the tablegen DiffLists by introducing a new sort algorithm when assigning EnumValues to the synthesized registers. The current algorithm, LessRecord, uses the StringRef compare_numeric function. This function compares strings, while handling embedded numbers. For example, the R600 backend registers are sorted as follows: T1 T1_W T1_X T1_XYZW T1_Y T1_Z T2 T2_W T2_X T2_XYZW T2_Y T2_Z In this example, the 'scaling factor' is dEnum/dN = 6 because T0, T1, T2 have an EnumValue offset of 6 from one another. However, in other parts of the register bank, the scaling factors are different: dEnum/dN = 5: KC0_128_W KC0_128_X KC0_128_XYZW KC0_128_Y KC0_128_Z KC0_129_W KC0_129_X KC0_129_XYZW KC0_129_Y KC0_129_Z The diff lists do not work correctly because different kinds of registers have different 'scaling factors'. This new algorithm, LessRecordRegister, tries to enforce a scaling factor of 1. For example, the registers are now sorted as follows: T1 T2 T3 ... T0_W T1_W T2_W ... T0_X T1_X T2_X ... KC0_128_W KC0_129_W KC0_130_W ... For the Mips and R600 I see a 19% and 6% reduction in size, respectively. I did see a few small regressions, but the differences were on the order of a few bytes (e.g., AArch64 was 16 bytes). I suspect there will be even greater wins for targets with larger register files. Patch reviewed by Jakob. rdar://14006013 llvm-svn: 185094	2013-06-27 19:38:13 +00:00
Nadav Rotem	f9ecbcb835	CostModel: improve the cost model for load/store of non power-of-two types such as <3 x float>, which are popular in graphics. llvm-svn: 185085	2013-06-27 17:52:04 +00:00
Tom Stellard	1baa03aba6	R600: Remove alu-split.ll test The purpose of this test was to check boundary conditions for the size of an ALU clause. This test is very sensitive to changes to the optimizer or scheduler, because it requires an exact number of ALU instructions in order to remain valid. It's not good to have a test this sensitive, because it is confusing to developers who implement optimizations and then 'break' the test. I'm not sure if there is a good way to test these limits using lit, but if I can come up with replacement test that isn't as sensitive I'll add it back to the tree. llvm-svn: 185084	2013-06-27 17:00:38 +00:00
Arnold Schwaighofer	a2dd195fb3	LoopVectorize: Use vectorized loop invariant gep index anchored in loop Use vectorized instruction instead of original instruction anchored in the original loop. Fixes PR16452 and t2075.c of PR16455. llvm-svn: 185081	2013-06-27 15:11:55 +00:00
Joey Gouly	b1b0dd8758	Add a Subtarget feature 'v8fp' to the ARM backend. llvm-svn: 185073	2013-06-27 11:49:26 +00:00
Richard Sandiford	ec8693d5f3	[SystemZ] Fix some embarrassing test typos llvm-svn: 185070	2013-06-27 09:49:34 +00:00
Richard Sandiford	891a7e7454	[SystemZ] Allow LA and LARL to be rematerialized llvm-svn: 185069	2013-06-27 09:42:10 +00:00
Richard Sandiford	a57e13b670	[SystemZ] Allow immediate moves to be rematerialized llvm-svn: 185068	2013-06-27 09:38:48 +00:00
Richard Sandiford	b86a83488e	[SystemZ] Add conditional store patterns Add pseudo conditional store instructions, so that we use: branch foo: store foo: instead of: load branch foo: move foo: store z196 has real 32-bit and 64-bit conditional stores, but we don't use any z196 instructions yet. llvm-svn: 185065	2013-06-27 09:27:40 +00:00
Manman Ren	31dee5bec9	Update testing case to make DI nodes have the correct format. llvm-svn: 185061	2013-06-27 06:40:18 +00:00
Arnold Schwaighofer	8db6347b9d	Fix spelling. llvm-svn: 185052	2013-06-27 01:01:11 +00:00
Arnold Schwaighofer	ccd6c9929b	LoopVectorize: Don't store a reversed value in the vectorized value map When we store values for reversed induction stores we must not store the reversed value in the vectorized value map. Another instruction might use this value. This fixes 3 test cases of PR16455. llvm-svn: 185051	2013-06-27 00:45:41 +00:00
Michael Gottesman	41748d7c86	Added support for the Builtin attribute. The Builtin attribute is an attribute that can be placed on function call site that signal that even though a function is declared as being a builtin, rdar://problem/13727199 llvm-svn: 185049	2013-06-27 00:25:01 +00:00
Chad Rosier	253777fdc3	[Mips Disassembler] Have the DecodeCCRRegisterClass function use the getReg function to lookup the proper tablegen'ed register enumeration. Previously, it was using the encoded value directly. llvm-svn: 185026	2013-06-26 22:23:32 +00:00
Akira Hatanaka	c3114b3341	[mips] Do not emit ".option pic0" if target is mips64. llvm-svn: 185012	2013-06-26 19:08:49 +00:00
Akira Hatanaka	5832fc607b	[mips] Improve code generation for constant multiplication using shifts, adds and subs. llvm-svn: 185011	2013-06-26 18:48:17 +00:00
Nadav Rotem	4c5b2d1de6	Erase all of the instructions that we RAUWed llvm-svn: 184969	2013-06-26 17:16:09 +00:00
Joey Gouly	b3f550e8cd	Add a subtarget feature 'v8' to the ARM backend. This allows for targeting the ARMv8 AArch32 variant. llvm-svn: 184967	2013-06-26 16:58:26 +00:00
Nadav Rotem	f4ca3994b8	Do not add cse-ed instructions into the visited map because we don't want to consider them as a candidate for replacement of instructions to be visited. llvm-svn: 184966	2013-06-26 16:54:53 +00:00
Tim Northover	2c45a383a8	ARM: fix more cases where predication may or may not be allowed Unfortunately this addresses two issues (by the time I'd disentangled the logic it wasn't worth putting it back to half-broken): + Coprocessor instructions should all be predicable in Thumb mode. + BKPT should never be predicable. llvm-svn: 184965	2013-06-26 16:52:40 +00:00
Tim Northover	52f77f5cda	ARM: allow predicated barriers in Thumb mode The barrier instructions are only "always-execute" in ARM mode, they can quite happily sit inside an IT block in Thumb. llvm-svn: 184964	2013-06-26 16:52:32 +00:00
Joey Gouly	05b04cf3a5	Remove the 'generic' CPU from the ARM eabi attributes printer. Make v4 the default ARM architecture attribute, to match CodeGen. llvm-svn: 184962	2013-06-26 16:39:06 +00:00
Ulrich Weigand	5a02a02b41	[PowerPC] Accept 17-bit signed immediates for addis The assembler currently strictly verifies that immediates for s16imm operands are in range (-32768 ... 32767). This matches the behaviour of the GNU assembler, with one exception: gas allows, as a special case, operands in an extended range (-65536 .. 65535) for the addis instruction only (and its extended mnemonic lis). The main reason for this seems to be to allow using unsigned 16-bit operands for lis, e.g. like lis %r1, 0xfedc. Since this has been supported by gas for a long time, and assembler source code seen "in the wild" actually exploits this feature, this patch adds equivalent support to LLVM for compatibility reasons. llvm-svn: 184946	2013-06-26 13:49:53 +00:00
Ulrich Weigand	fd3ad693e8	[PowerPC] Support symbolic u16imm operands Currently, all instructions taking s16imm operands support symbolic operands. However, for u16imm operands, we only support actual immediate integers. This causes the assembler to reject code like ori %r5, %r5, symbol@l This patch changes the u16imm operand definition to likewise accept symbolic operands. In fact, s16imm and u16imm can share the same encoding routine, now renamed to getImm16Encoding. llvm-svn: 184944	2013-06-26 13:49:15 +00:00
Amaury de la Vieuville	a6f5542be4	ARM: operands should be explicit when disassembled llvm-svn: 184943	2013-06-26 13:39:07 +00:00
NAKAMURA Takumi	1c9de1f078	Suppress llvm/test/Other/can-execute.txt on msys bash. llvm-svn: 184932	2013-06-26 10:56:44 +00:00
Elena Demikhovsky	6769c50d9e	Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector. llvm-svn: 184931	2013-06-26 10:55:03 +00:00
Kostya Serebryany	5e276f9dbc	[asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size llvm-svn: 184928	2013-06-26 09:49:52 +00:00
Kostya Serebryany	9f5213f20f	[asan] add option -asan-keep-uninstrumented-functions llvm-svn: 184927	2013-06-26 09:18:17 +00:00
Nadav Rotem	0794acc1da	SLPVectorizer: support slp-vectorization of PHINodes between basic blocks llvm-svn: 184888	2013-06-25 23:04:09 +00:00
Jakob Stoklund Olesen	6e630d46d2	Print block frequencies in decimal form. This is easier to read than the internal fixed-point representation. If anybody knows the correct algorithm for converting fixed-point numbers to base 10, feel free to fix it. llvm-svn: 184881	2013-06-25 21:57:38 +00:00
Tom Stellard	02661d9605	R600: Use new getNamedOperandIdx function generated by TableGen llvm-svn: 184880	2013-06-25 21:22:18 +00:00
Arnold Schwaighofer	a04b9ef1e8	X86 cost model: Vectorizing integer division is a bad idea radar://14057959 llvm-svn: 184872	2013-06-25 19:14:09 +00:00
Bob Wilson	acfc01dedf	Fix SROA to avoid unnecessary scalar conversions for 1-element vectors. When a 1-element vector alloca is promoted, a store instruction can often be rewritten without converting the value to a scalar and using an insertelement instruction to stuff it into the new alloca. This patch just adds a check to skip that conversion when it is unnecessary. This turns out to be really important for some ARM Neon operations where <1 x i64> is used to get around the fact that i64 is not a legal type. llvm-svn: 184870	2013-06-25 19:09:50 +00:00
Ulrich Weigand	93372b4583	[PowerPC] Support @got modifier Add VK_... values and relocation types necessary to support the @got family of modifiers. Used by the asm parser only. llvm-svn: 184860	2013-06-25 16:49:50 +00:00
Aaron Watry	0517275a57	R600: Add v2i32 test for vselect Note: Only adding test for evergreen, not SI yet. When I attempted to expand vselect for SI, I got the following: llc: /home/awatry/src/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:522: llvm::SDValue llvm::DAGTypeLegalizer::PromoteIntRes_SETCC(llvm::SDNode*): Assertion `SVT.isVector() == N->getOperand(0).getValueType().isVector() && "Vector compare must return a vector result!"' failed. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184847	2013-06-25 13:55:54 +00:00
Aaron Watry	daabb20e1b	R600/SI: Expand xor v2i32/v4i32 Add test cases for both vector sizes on SI and also add v2i32 test for EG. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184846	2013-06-25 13:55:52 +00:00
Aaron Watry	91d2886169	R600: Add v2i32 test for setcc on evergreen No test/expansion for SI has been added yet. Attempts to expand this operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes which was complaining about vector comparisons being required to return a vector type. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184845	2013-06-25 13:55:49 +00:00
Aaron Watry	83fa6006bc	R600/SI: Expand urem of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UREM produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184844	2013-06-25 13:55:46 +00:00
Aaron Watry	5527b6c6b6	R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UDIV produces really complex code, so let's just check that the instruction was lowered successfully. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184843	2013-06-25 13:55:43 +00:00
Aaron Watry	16d80c0529	R600/SI: Expand ashr of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184842	2013-06-25 13:55:40 +00:00
Aaron Watry	f63791e778	R600/SI: Expand srl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184841	2013-06-25 13:55:37 +00:00
Aaron Watry	5584553984	R600/SI: Expand shl of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184840	2013-06-25 13:55:32 +00:00
Aaron Watry	2fa162e88e	R600/SI: Expand or of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184839	2013-06-25 13:55:29 +00:00
Aaron Watry	265eef5efe	R600/SI: Expand mul of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184838	2013-06-25 13:55:26 +00:00
Aaron Watry	00aeb119db	R600/SI: Expand and of v2i32/v4i32 for SI Also add lit test for both cases on SI, and v2i32 for evergreen. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 184837	2013-06-25 13:55:23 +00:00
Benjamin Kramer	866793109e	BlockFrequency: Bump up the entry frequency a bit. This is a band-aid to fix the most severe regressions we're seeing from basing spill decisions on block frequencies, until we have a better solution. llvm-svn: 184835	2013-06-25 13:34:40 +00:00
Ulrich Weigand	ad873cdb2b	[PowerPC] Add extended rotate/shift mnemonics This adds all missing extended rotate/shift mnemonics to the asm parser. llvm-svn: 184834	2013-06-25 13:17:41 +00:00
Ulrich Weigand	6c31c4aae8	[PowerPC] Add rldcr/rldic instructions This adds patterns for the rldcr and rldic instructions (the last instructions from the rotate/shift family that were missing). They are currently used only by the asm parser. llvm-svn: 184833	2013-06-25 13:17:10 +00:00
Ulrich Weigand	4069e24bd3	[PowerPC] Add extended subtract mnemonics This adds support for the extended subtract mnemonics to the asm parser: subi subis subic subic. sub sub. subc subc. llvm-svn: 184832	2013-06-25 13:16:48 +00:00
Andrew Trick	121124acf8	Revert "Temporarily enable MI-Sched on X86." This reverts commit 98a9b72e8c56dc13a2617de84503a3d78352789c. llvm-svn: 184823	2013-06-25 02:48:58 +00:00
Tom Stellard	0125f2a6e4	R600/SI: Report unaligned memory accesses as legal for > 32-bit types In reality, some unaligned memory accesses are legal for 32-bit types and smaller too, but it all depends on the address space. Allowing unaligned loads/stores for > 32-bit types is mainly to prevent the legalizer from splitting one load into multiple loads of smaller types. https://bugs.freedesktop.org/show_bug.cgi?id=65873 llvm-svn: 184822	2013-06-25 02:39:35 +00:00
Tom Stellard	9810ec613c	R600: Add support for i32 loads from the constant address space on Cayman Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184821	2013-06-25 02:39:30 +00:00
Tom Stellard	b06f3fc1be	R600/SI: Add support for v4i32 and v4f32 kernel args Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 184820	2013-06-25 02:39:25 +00:00
Tom Stellard	9d2e1500b4	R600: Fix typo in R600Schedule.td This should only make a difference in programs that use a lot of the vector ALU instructions like BFI_INT and BIT_ALIGN. There is a slight improvement in the phatk bitcoin mining kernel with this patch on Evergreen (vector size == 1): Before: 1173 Instruction Groups / 9520 dwords After: 1167 Instruction Groups / 9510 dwords Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 184819	2013-06-25 02:39:20 +00:00
Ulrich Weigand	6ca71579db	[PowerPC] Support some miscellaneous mnemonics in the asm parser This adds support for the following extended mnemonics: xnop mr. not not. la llvm-svn: 184767	2013-06-24 18:08:03 +00:00
Ulrich Weigand	ba19f79655	[PowerPC] Add some FIXMEs A bunch of extended mnemonics ought to support '.' forms. Add FIXMEs to the test case for those. llvm-svn: 184757	2013-06-24 17:00:22 +00:00
Ulrich Weigand	86247b6e27	[PowerPC] Add predicted forms of branches This adds support for the predicted forms of branches (+/-). There are three cases to consider: - Branches using a PPC::Predicate code For these, I've added new PPC::Predicate codes corresponding to the BO values for predicted branch forms, and updated insn printing to print them correctly. I've also added new aliases for the asm parser matching the new forms. - bt/bf I've added new aliases matching to gBC etc. - bd(n)z variants I've added new instruction patterns for the predicted forms. In all cases, the new patterns are used for the asm parser only. (The new infrastructure ought to be sufficient to allow use by the compiler too at some point.) llvm-svn: 184754	2013-06-24 16:52:04 +00:00
NAKAMURA Takumi	b64e776268	Move llvm/test/DebugInfo/arguments.ll to X86, for now. It is still Windows' PECOFF incompatible. llvm-svn: 184750	2013-06-24 16:05:21 +00:00
NAKAMURA Takumi	c316274d76	llvm/test/CodeGen/X86: Add explicit -mtriple=x86_64-unknown-unknown. llvm-svn: 184731	2013-06-24 13:19:59 +00:00
NAKAMURA Takumi	da9833f22c	llvm/test/CodeGen/X86/legalize-shift-64.ll: Add explicit -mtriple=i686-unknown-unknown. llvm-svn: 184730	2013-06-24 13:19:52 +00:00
NAKAMURA Takumi	1ea45844f5	llvm/test/DebugInfo/arguments.ll: Add explicit -mtriple=x86_64-unknown-unknown. llvm-svn: 184729	2013-06-24 13:19:47 +00:00
Ulrich Weigand	fedd5a756e	[PowerPC] Add t/f branch mnemonics to asm parser This adds the bt/bf/bd(n)zt/bd(n)zf mnemonics as aliases for the asm parser, resolving to the generic conditional patterns. llvm-svn: 184725	2013-06-24 12:49:20 +00:00
Arnold Schwaighofer	b252c11ccc	Reapply 184685 after the SetVector iteration order fix. This should hopefully have fixed the stage2/stage3 miscompare on the dragonegg testers. "LoopVectorize: Use the dependence test utility class We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSCV loop kernel that speeds up. radar://13681598" llvm-svn: 184724	2013-06-24 12:09:15 +00:00
Ulrich Weigand	824b7d8dfd	[PowerPC] Support generic conditional branches in asm parser This adds instruction patterns to cover the generic forms of the conditional branch instructions. This allows the assembler to support the generic mnemonics. The compiler will still generate the various specific forms of the instruction that were already supported. llvm-svn: 184722	2013-06-24 11:55:21 +00:00
Ulrich Weigand	b6a30d159e	[PowerPC] Support absolute branches There is currently only limited support for the "absolute" variants of branch instructions. This patch adds support for the absolute variants of all branches that are currently otherwise supported. This requires adding new fixup types so that the correct variant of relocation type can be selected by the object writer. While the compiler will continue to usually choose the relative branch variants, this will allow the asm parser to fully support the absolute branches, with either immediate (numerical) or symbolic target addresses. No change in code generation intended. llvm-svn: 184721	2013-06-24 11:03:33 +00:00
Ulrich Weigand	5b9d591ad1	[PowerPC] Support bd(n)zl and bd(n)zlrl This adds support for the bd(n)zl and bd(n)zlrl instructions. The patterns are currently used for the asm parser only. llvm-svn: 184720	2013-06-24 11:02:38 +00:00
Ulrich Weigand	d20e91edad	[PowerPC] Support b(cond)l in the asm parser This patch adds support for the conditional variants of bl. The pattern is currently used by the asm parser only. llvm-svn: 184719	2013-06-24 11:02:19 +00:00
Ulrich Weigand	1847bb811e	[PowerPC] Support blrl and variants in the asm parser This patch adds support for blrl and its conditional variants. The patterns are (currently) used for the asm parser only. llvm-svn: 184718	2013-06-24 11:01:55 +00:00
Andrew Trick	c08bd450a3	Add -mcpu to some unit tests that only fail on certain hosts. llvm-svn: 184709	2013-06-24 09:51:30 +00:00
Amaury de la Vieuville	8449c0d5ed	ARM: check predicate bits for thumb instructions When encoded to thumb, VFP instructions and VMOV/VDUP between scalar and core registers must have their predicate bits set to 0b1110. llvm-svn: 184707	2013-06-24 09:15:01 +00:00
Amaury de la Vieuville	8175bda3db	ARM: rGPR is meant to be unpredictable, not undefined llvm-svn: 184706	2013-06-24 09:14:54 +00:00
Andrew Trick	5a1e0af838	Temporarily enable MI-Sched on X86. Sorry for the unit test churn. I'll try to make the change permanently next time. llvm-svn: 184705	2013-06-24 09:13:20 +00:00
Amaury de la Vieuville	f2f00b4e28	ARM: fix thumb1 nop decoding In thumb1, NOP is a pseudo-instruction equivalent to mov r8, r8. However the disassembler should not use this alias. llvm-svn: 184703	2013-06-24 09:11:53 +00:00
Amaury de la Vieuville	2f0ac8d961	ARM: fix IT decoding mask == 0 -> UNPRED llvm-svn: 184702	2013-06-24 09:11:45 +00:00
Amaury de la Vieuville	4b6c076da3	ARM: enable decoding of pc-relative PLD/PLI llvm-svn: 184701	2013-06-24 09:11:38 +00:00
David Blaikie	3656123dfc	DebugInfo: add some testing from an overly broad end-to-end test in Clang llvm-svn: 184692	2013-06-24 06:47:22 +00:00
Arnold Schwaighofer	58ca945f38	Revert "LoopVectorize: Use the dependence test utility class" This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac. We are seeing a stage2 and stage3 miscompare on some dragonegg bots. llvm-svn: 184690	2013-06-24 06:10:41 +00:00
Arnold Schwaighofer	b914a7e2ef	LoopVectorize: Use the dependence test utility class We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSCV loop kernel that speeds up. radar://13681598 llvm-svn: 184685	2013-06-24 03:55:48 +00:00
Nadav Rotem	210e86d7c4	SLP Vectorizer: Add support for vectorizing parts of the tree. Until now we detected the vectorizable tree and evaluated the cost of the entire tree. With this patch we can decide to trim out branches of the tree that are not profitable to vectorize. Also, increase the max depth from 6 to 12. In the worst possible case, where all of the code is made of diamond-shaped graphs, this can bring the cost to 2**10, but diamonds are not very common. llvm-svn: 184681	2013-06-24 02:52:43 +00:00
Andrew Trick	97a1d7c475	Fix tail merging to assign the (more) correct BasicBlock when splitting. This makes it possible to write unit tests that are less susceptible to minor code motion, particularly copy placement. block-placement.ll covers this case with -pre-RA-sched=source which will soon be default. One incorrectly named block is already fixed, but without this fix, enabling new coalescing and scheduling would cause more failures. llvm-svn: 184680	2013-06-24 01:55:01 +00:00
Nadav Rotem	0323925d51	SLP Vectorizer: Fix a bug in the code that does CSE on the generated gather sequences. Make sure that we don't replace and RAUW two sequences if one does not dominate the other. llvm-svn: 184674	2013-06-23 21:57:27 +00:00
David Blaikie	5acff7e691	DebugInfo: PR14404: Avoid truncating 64 bit values into 32 bits for ULEB128/SLEB128 generation llvm-svn: 184669	2013-06-23 18:31:11 +00:00
Tim Northover	295f049d1f	AArch64: fix overzealous NEXTing for Windows testing. llvm-svn: 184667	2013-06-23 15:32:01 +00:00
Andrew Trick	47740deb26	Add MI-Sched support for x86 macro fusion. This is an awful implementation of the target hook. But we don't have abstractions yet for common machine ops, and I don't see any quick way to make it table-driven. llvm-svn: 184664	2013-06-23 09:00:28 +00:00
Nadav Rotem	eb65e67eea	SLP Vectorizer: Implement a simple CSE optimization for the gather sequences. llvm-svn: 184660	2013-06-23 06:15:46 +00:00
Nadav Rotem	80de0a28f1	SLP Vectorizer: Implement multi-block slp-vectorization. Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks. It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function. I removed the support for extracting values from trees. We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2). llvm-svn: 184647	2013-06-22 21:34:10 +00:00
Reed Kotler	de085b2afb	Replace with a shorter test case produced by Doug Gillmore. llvm-svn: 184645	2013-06-22 19:35:08 +00:00
David Blaikie	2b380232c3	DebugInfo: Support (using GNU extensions) for template template parameters and parameter packs llvm-svn: 184643	2013-06-22 18:59:11 +00:00
Sean Silva	8217757379	[yaml2obj][ELF] Make symbol table top-level key. Although in reality the symbol table in ELF resides in a section, the standard requires that there be no more than one SHT_SYMTAB. To enforce this constraint, it is cleaner to group all the symbols under a top-level `Symbols` key on the object file. llvm-svn: 184627	2013-06-22 01:38:00 +00:00
Sean Silva	e5c41896b3	This was a nifty test, but remove it. It wouldn't really test anything that doesn't already have a more targeted test: `yaml2obj-elf-section-basic.yaml`: Already tests that section content is correctly passed through. `yaml2obj-elf-symbol-basic.yaml` (this file): Tests that the st_value and st_size attributes of `main` are set correctly. Between those two tests, disassembling the file doesn't really add anything, so just remove mention of disassembling the file. llvm-svn: 184607	2013-06-21 23:17:13 +00:00
Sean Silva	2d47ffd3da	Revert "Put r184469 disassembler test back on X86" This reverts commit r184602. In an upcoming commit, I will just remove the disassembler part of the test; it was mostly just a "nifty" thing marking a milestone but it doesn't test anything that isn't tested elsewhere. llvm-svn: 184606	2013-06-21 23:17:10 +00:00
David Blaikie	97c6c5bd98	DebugInfo: Don't lose unreferenced non-trivial by-value parameters A FastISel optimization was causing us to emit no information for such parameters & when they go missing we end up emitting a different function type. By avoiding that shortcut we not only get types correct (very important) but also location information (handy) - even if it's only live at the start of a function & may be clobbered later. Reviewed/discussion by Evan Cheng & Dan Gohman. llvm-svn: 184604	2013-06-21 22:56:30 +00:00
Renato Golin	fe941943a6	Put r184469 disassembler test back on X86 llvm-svn: 184602	2013-06-21 22:42:20 +00:00

20297 Commits
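
Note on 5a02a02b41 ([PowerPC] Accept 17-bit signed immediates for addis): the two operand ranges quoted in that commit message can be illustrated with a small sketch. This is not LLVM or gas code. The function names are hypothetical, and the masking step assumes that the 16-bit instruction field simply holds the low 16 bits of an accepted operand.

```python
# Illustrative only: the ranges come from the commit message for 5a02a02b41.
# The function names are hypothetical and are not LLVM or gas APIs.

S16_MIN, S16_MAX = -32768, 32767        # strict s16imm range
ADDIS_MIN, ADDIS_MAX = -65536, 65535    # extended range accepted for addis/lis


def fits_s16imm(value: int) -> bool:
    """True if value is a valid operand under the strict s16imm check."""
    return S16_MIN <= value <= S16_MAX


def fits_addis_imm(value: int) -> bool:
    """True if value is in the extended range allowed for addis and lis."""
    return ADDIS_MIN <= value <= ADDIS_MAX


def imm16_field(value: int) -> int:
    """Assumption: the encoded field is the low 16 bits of the operand."""
    return value & 0xFFFF


if __name__ == "__main__":
    operand = 0xFEDC  # from the example `lis %r1, 0xfedc` in the commit message
    print(fits_s16imm(operand), fits_addis_imm(operand), hex(imm16_field(operand)))
```

Under these assumptions, 0xfedc is rejected by the strict check but accepted by the extended one, which matches the motivation given in the commit message for supporting unsigned 16-bit operands with lis.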