llvm-project

Author	SHA1	Message	Date
Pete Cooper	91e4ba2f88	Revert "GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to." This reverts commit 5b55a47e94e28fbb56d0cd5d72c3db9105c15b4c. A test case was found to crash after this was applied. I'll file a bug to track fixing this with the test case needed. llvm-svn: 212550	2014-07-08 17:06:03 +00:00
Andrea Di Biagio	d261e98f3d	[DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said, this patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids aggressively optimizing shuffles. llvm-svn: 212539	2014-07-08 15:22:29 +00:00
Daniel Sanders	c7dbc630e5	[mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums Summary: Follow on to r212519 to improve the encapsulation and limit the scope of the enums. Also merged two very similar parser functions, fixed a bug where ASE's were not being reported, and marked CPR1's as being 128-bit when MSA is enabled. Differential Revision: http://reviews.llvm.org/D4384 llvm-svn: 212522	2014-07-08 10:11:38 +00:00
Arnaud A. de Grandmaison	d7827606de	Truncate the immediate in logical operation to the register width And continue to produce an error if the 32 most significant bits are not all ones or zeros. llvm-svn: 212520	2014-07-08 09:53:04 +00:00
Vladimir Medic	fb8a2a95cd	Mips.abiflags is a new implicitly generated section that will be present on all new modules. The section contains a versioned data structure which represents essentially information to allow a program loader to determine the requirements of the application. This patch implements mips.abiflags section and provides test cases for it. llvm-svn: 212519	2014-07-08 08:59:22 +00:00
Chandler Carruth	142e966261	[x86,SDAG] Sink the logic for folding shuffles of splats more aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could make N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517	2014-07-08 08:45:38 +00:00
Adam Nemet	79580db918	[X86] AVX512: Only allow k1-k7 as predicates to vpcmp* As destination k0 is allowed but not as predicate/writemask. I also modified the test to allow checking of error messages by the assembler. I applied a similar approach to the test ret.s in the same directory. llvm-svn: 212504	2014-07-08 00:22:32 +00:00
Andrea Di Biagio	2620b877b6	[x86] Fix assertion failure caused by a wrong combine of PSHUFD nodes with different types. When combining a sequence of two PSHUFD dag nodes into a single PSHUFD, make sure that we assign the correct type to the resulting PSHUFD. X86ISD::PSHUFD dag nodes can be either MVT::v4i32 or MVT::v4f32. Before this change, an assertion failure was triggered in method 'DAGCombinerInfo::CombineTo' when trying to combine the shuffles from the test below into a single PSHUFD. define <4 x float> @test1(<4 x float> %V) { %1 = shufflevector <4 x float> %V, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> %2 = shufflevector <4 x float> %1, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> ret <4 x float> %2 } llvm-svn: 212498	2014-07-07 23:25:23 +00:00
Juergen Ributzka	665ea71fcd	[FastISel][X86] Fix smul.with.overflow.i8 lowering. Add custom lowering code for signed multiply instruction selection, because the default FastISel instruction selection for ISD::MUL will use unsigned multiply for the i8 type and signed multiply for all other types. This would set the incorrect flags for the overflow check. This fixes <rdar://problem/17549300> llvm-svn: 212493	2014-07-07 21:52:21 +00:00
Louis Gerbarg	4c5b4054b2	Allow AArch64FastISel to degrade gracefully in the presence of an MVT::i128 Currently AArch64FastISel crashes if it tries to extend an integer into an MVT::i128. This can happen by creating 128 bit integers like so: typedef unsigned int uint128_t __attribute__((mode(TI))); typedef int sint128_t __attribute__((mode(TI))); This patch makes EmitIntExt check for their presence and then falls back to SelectionDAG. Tests included. rdar://17516686 llvm-svn: 212492	2014-07-07 21:37:51 +00:00
Sanjay Patel	a932da8f35	Fix for PR17073 ( http://llvm.org/pr17073 ), simplifycfg illegally hoists an operation in a phi node that can trap. This patch adds to an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bails out of the optimization if we find one of those. The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected. llvm-svn: 212490	2014-07-07 21:19:00 +00:00
Ulrich Weigand	5fd91e0c0b	[PowerPC] Fix testcase regression Use -mcpu to avoid different codegen depending on host platform. llvm-svn: 212478	2014-07-07 19:41:54 +00:00
Ulrich Weigand	ec2bf93895	[PowerPC] Fix "byval align" arguments Arguments passed as "byval align" should get the specified alignment in the parameter save area. There was some code in PPCISelLowering.cpp that attempted to implement this, but this didn't work correctly: while code did update the ArgOffset value, it neglected to update the PtrOff value (which was already computed from the old ArgOffset), and it also neglected to update GPR_idx -- fields skipped due to alignment in the save area must likewise be skipped in GPRs. This patch fixes and simplifies this logic by: - handling argument offset alignment right at the beginning of argument processing, using a new helper routine CalculateStackSlotAlignment (this avoids having to update PtrOff and other derived values later on) - not tracking GPR_idx separately, but always computing the correct GPR_idx for each argument from its ArgOffset - removing some redundant computation in LowerFormalArguments: MinReservedArea must equal ArgOffset after argument processing, so there's no use in computing it twice. [This doesn't change the behavior of the current clang front-end, since that never creates "byval align" arguments at the moment. This will change with a follow-on patch, however.] llvm-svn: 212476	2014-07-07 19:26:41 +00:00
Chandler Carruth	beeacac0b3	[x86] Revert r212324 which was too aggressive w.r.t. allowing undef lanes in vector splats. The core problem here is that undef lanes can't unilaterally be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475	2014-07-07 19:03:32 +00:00
Matt Arsenault	d2c9e08b63	R600: Fix mishandling of load / store chains. Fixes various bugs with reordering loads and stores. Scalarized vector loads weren't collecting the chains at all. llvm-svn: 212473	2014-07-07 18:34:45 +00:00
Evgeniy Stepanov	6fa6c677cc	[asan] Generate asm instrumentation in MC. Generate entire ASan asm instrumentation in MC without relying on runtime helper functions. Patch by Yuri Gorshenin. llvm-svn: 212455	2014-07-07 13:57:37 +00:00
Evgeniy Stepanov	d948a5f3c3	[msan] Fix handling of phi in blacklisted functions. llvm-svn: 212454	2014-07-07 13:28:31 +00:00
Chandler Carruth	0dcb366268	[x86] Teach the new vector shuffle lowering code to handle what is essentially a DAG combine that never gets a chance to run. We might typically expect DAG combining to remove shuffles-of-splats and other similar patterns, but we don't get a chance to run the DAG combiner when we recursively form sub-shuffles during the lowering of a shuffle. So instead hand-roll a really important combine directly into the lowering code to detect shuffles-of-splats, especially shuffles of an all-zero splat which needn't even have the same element width, etc. This lets the new vector shuffle lowering handle shuffles which implement things like zero-extension really nicely. This will become even more important when I wire the legalization of zero-extension to vector shuffles with the new widening legalization strategy. llvm-svn: 212444	2014-07-07 09:06:58 +00:00
Tim Northover	55beb64bd0	CodeGen: it turns out that NAND is not the same thing as BIC. At all. We've been performing the wrong operation on ARM for "atomicrmw nand" for years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b". This bled over into the generic expansion pass. So I assume no-one has ever actually tried to do an atomic nand in the real world. Oh well. llvm-svn: 212443	2014-07-07 09:06:35 +00:00
Saleem Abdulrasool	763f9a50a5	ARM: properly lower dllimport'ed global values This completes the handling for DLL import storage symbols when lowering instructions. A DLL import storage symbol must have an additional load performed prior to use. This is applicable to variables and functions. This is particularly important for non-function symbols as it is possible to handle function references by emitting a thunk which performs the translation from the unprefixed __imp_ symbol to the proper symbol (although, this is a non-optimal lowering). For a variable symbol, no such thunk can be accommodated. llvm-svn: 212431	2014-07-07 05:18:35 +00:00
Kevin Qin	4473c1943f	[AArch64] Normalize all constants to build a vector. The value of constant operands will be truncated to fit element width. llvm-svn: 212428	2014-07-07 02:45:40 +00:00
Ehsan Akhgari	b3efe0602d	Revert r212375 because of test failures llvm-svn: 212376	2014-07-05 19:46:10 +00:00
Ehsan Akhgari	7c35b0f004	Add a test case for the tilde operator in Microsoft inline assembly llvm-svn: 212375	2014-07-05 19:40:35 +00:00
Simon Atanasyan	5a63aa305d	[llvm-readobj] Fix output of MIPS GOT without local and global entries. llvm-svn: 212374	2014-07-05 19:28:49 +00:00
David Majnemer	d1bea693e2	IR: Fold away compares between GV GEPs and GVs A GEP of a non-weak global variable will not be equivalent to another non-weak global variable or a GEP of such a variable. Differential Revision: http://reviews.llvm.org/D4238 llvm-svn: 212360	2014-07-04 22:05:26 +00:00
Rafael Espindola	2dc0d9bddb	Ignore llvm.* globals. It is not clear if llvm.global_ctors should or should not be in llvm.metadata, but in practice it is not and we need to ignore it for LTO. llvm-svn: 212351	2014-07-04 19:08:22 +00:00
Rafael Espindola	3885090b86	Mark intrinsic functions as llvm-specific. llvm-svn: 212347	2014-07-04 15:58:00 +00:00
Daniel Sanders	950f48d3c7	[mips][mips64r6] Set ELF e_flags for MIPS32r6/MIPS64r6. Also do MIPS-I to MIPS-V Differential Revision: http://reviews.llvm.org/D4386 llvm-svn: 212346	2014-07-04 15:21:53 +00:00
Daniel Sanders	20c82ee4fa	[mips] Add tests for the 'ret', 'call', and 'indirectbr' LLVM IR instruction. Summary: The tests in this directory are intended to test a single IR instruction with as few dependencies on other instructions as possible. The aim is to be very confident that each LLVM-IR instruction is implemented correctly and with the optimal sequence of instructions, as well as to make it easy to tell what is tested, and make it easier to bring up new ISA revisions in the future. This gives us a good foundation on which to test bigger things. These particular tests will allow testing that MIPS32r6/MIPS64r6 generate the correct return instruction for returns, calls, and indirect branches. This will be a bit tricky since the assembly text is identical but the instruction is actually different. On MIPS32r6/MIPS64r6 'jr $rs' has been removed in favour of the equivalent 'jalr $zero, $rs'. 'jr $rs' remains as an alias for 'jalr $zero, $rs'. Differential Revision: http://reviews.llvm.org/D4266 llvm-svn: 212345	2014-07-04 15:16:14 +00:00
Rafael Espindola	b674c17deb	Don't include llvm.metadata variables in archive symbol tables. llvm-svn: 212344	2014-07-04 15:03:17 +00:00
Benjamin Kramer	3c5b126239	GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to. This is useful for functions that are not actually available externally but referenced by a vtable of some kind. Clang emits functions like this for the MS ABI. PR20182. llvm-svn: 212337	2014-07-04 12:36:05 +00:00
NAKAMURA Takumi	91e9f4d6f8	llvm/test/CodeGen/XCore/dwarf_debug.ll: Fix not to be affected by *-win32. llvm-svn: 212335	2014-07-04 11:58:03 +00:00
NAKAMURA Takumi	9e5b987642	llvm/test/CodeGen/X86/vector-gep.ll: Appease to add -mtriple=i686-linux. This doesn't pass if stack alignment is not 16, like cygming, *bsd. llvm-svn: 212334	2014-07-04 11:55:40 +00:00
Tim Northover	1bc367a41b	ARM: when falling back to scattered relocs, keep the type. The linker relies on relocation type info (e.g. is it a branch?) to perform the correct actions, so we should keep that even when we end up using a scattered relocation for whatever reason. rdar://problem/17553104 llvm-svn: 212333	2014-07-04 10:58:05 +00:00
Tim Northover	07f99fb769	llvm-readobj: fix MachO relocation printing a bit. There were two issues here: 1. At the very least, scattered relocations cannot use the same code to determine the corresponding symbol being referred to. For some reason we pretend there is no symbol, even when one actually exists in the symtab, so to match this behaviour getRelocationSymbol should simply return symbols_end for scattered relocations. 2. Printing "-" when we can't get a symbol (including the scattered case, but not exclusively), isn't that helpful. In both cases there is interesting information in that field, so we should print it. As hex will do. Small part of rdar://problem/17553104 llvm-svn: 212332	2014-07-04 10:57:56 +00:00
Benjamin Kramer	a420df2999	InstCombine: Strength reduce sadd.with.overflow into a regular nsw add if we can prove that it cannot overflow. PR20194 llvm-svn: 212331	2014-07-04 10:22:21 +00:00
Daniel Sanders	2e03d66453	[mips][mips64r6] Correct the encoding of dmuh, dmuhu, dmul, and dmulu. We have detected a documentation bug in the encoding tables of the released MIPS64r6 specification that has resulted in the wrong encodings being used for these instructions in LLVM. This commit corrects them. llvm-svn: 212330	2014-07-04 10:08:27 +00:00
Chandler Carruth	8d37ae4471	[x86] Relax the line in this check to pacify build bots. I still don't love testing the comments, but it's the only sane way to check shuffle instructions... llvm-svn: 212326	2014-07-04 08:39:30 +00:00
Chandler Carruth	d32b08c62a	[x86] Move some check lines to be slightly easier for me to find. (meant to put this cleanup in the previous patch, sorry) llvm-svn: 212325	2014-07-04 08:19:37 +00:00
Chandler Carruth	5d79bb5d32	[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324	2014-07-04 08:11:49 +00:00
Alexey Volkov	302309f39f	[X86] Limit maximum nop length on Silvermont Silvermont can only decode one instruction per cycle if the instruction exceeds 8 bytes. Also in Silvermont instructions with more than 3 prefixes will cause 3 cycle penalty. Maximum nop length is limited to 7 bytes when used for padding on Silvermont. For other x86 processors max nop length remains unchanged 15 bytes. Differential Revision: http://reviews.llvm.org/D4374 llvm-svn: 212321	2014-07-04 07:14:56 +00:00
Robert Lytton	37d3fa7e36	XCore target: remove incorrect DebugLoc entries from prologue Summary: This was causing the prologue_end to be incorrectly positioned. Differential Revision: http://reviews.llvm.org/D4122 llvm-svn: 212318	2014-07-04 06:38:22 +00:00
NAKAMURA Takumi	b52c761a3c	Let test/Unit/lit.cfg add config.shlibdir to $PATH on DLL platforms like cygming. This makes unittests run with BUILD_SHARED_LIBS on DLL platforms. llvm-svn: 212316	2014-07-04 05:11:55 +00:00
David Majnemer	651ed5e8fd	InstSimplify: Fix a bug when INT_MIN is in a sdiv When INT_MIN is the numerator in a sdiv, we would not properly handle overflow when calculating the bounds of possible values; abs(INT_MIN) is not a meaningful number. Instead, check and handle INT_MIN by reasoning that the largest value is INT_MIN/-2 and the smallest value is INT_MIN. This fixes PR20199. llvm-svn: 212307	2014-07-04 00:23:39 +00:00
Eric Christopher	09f7131984	Temporarily revert "Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information." as it appears to be breaking some LTO constructs. This reverts commit r212203. llvm-svn: 212298	2014-07-03 22:24:54 +00:00
Andrea Di Biagio	c8e8bda58f	[CostModel][x86] Improved cost model for alternate shuffles. This patch: 1) Improves the cost model for x86 alternate shuffles (originally added at revision 211339); 2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles. Alternate shuffles are a special kind of blend; on x86, we can often easily lower alternate shuffles into a single blend instruction (depending on the subtarget features). The existing cost model didn't take into account subtarget features. Also, it had a couple of "dead" entries for vector types that are never legal (example: on x86 types v2i32 and v2f32 are not legal; those are always either promoted or widened to 128-bit vector types). The new x86 cost model takes into account what target features we have before returning the shuffle cost (i.e. the number of instructions after the blend is lowered/expanded). This patch also teaches the Cost Model Analysis how to identify and analyze alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions): - added function 'isAlternateVectorMask'; - added some logic to check if an instruction is an alternate shuffle and, in case, call the target specific TTI to get the corresponding shuffle cost; - added a test to verify the cost model analysis on alternate shuffles. llvm-svn: 212296	2014-07-03 22:24:18 +00:00
Kevin Enderby	0fd8aac5da	Add the -just-symbol-name (aka -j) flag to llvm-nm to just print the symbol’s name. On darwin the -j flag is used (often in combination with other flags) to produce a complete list of symbol names which can then be reordered and used with ld(1)’s -order_file. llvm-svn: 212294	2014-07-03 21:51:07 +00:00
Andrea Di Biagio	a37a2fc81f	[X86] Add ISel patterns to select 'f32_to_f16' and 'f16_to_f32' dag nodes. This patch adds tablegen patterns to select F16C float-to-half-float conversion instructions from 'f32_to_f16' and 'f16_to_f32' dag nodes. If the target doesn't have F16C, then 'f32_to_f16' and 'f16_to_f32' are expanded into library calls. llvm-svn: 212293	2014-07-03 21:51:06 +00:00
Rafael Espindola	d69a347128	Move test since it now depends on the x86 backend. llvm-svn: 212289	2014-07-03 20:26:21 +00:00
Rafael Espindola	8e8debc756	Add support for inline asm symbols in llvm-ar. This should allow llvm-ar to be used instead of gnu ar + plugin in a LTO build. I will add a release note about it once I finish a LTO bootstrap with it. llvm-svn: 212287	2014-07-03 19:40:08 +00:00
Rafael Espindola	13b69d63e6	Add support for inline asm symbols to IRObjectFile. This also enables it in llvm-nm so that it can be tested. llvm-svn: 212282	2014-07-03 18:59:23 +00:00
Kevin Enderby	acaaf903e8	Add the -U flag to llvm-nm as an alias to -defined-only as darwin’s nm(1) uses -U for this functionality. llvm-svn: 212280	2014-07-03 18:18:50 +00:00
Yi Kong	93e52da641	[ARM] Implement ISB memory barrier intrinsic Adds support for __builtin_arm_isb. Also corrects DSB and ISB instructions modelling by adding has-side-effects property. llvm-svn: 212276	2014-07-03 16:00:41 +00:00
Sanjay Patel	dc574ab500	bug fix for PR20020: anti-dependency-breaker causes miscompilation This patch sets the 'KeepReg' bit for any tied and live registers during the PrescanInstruction() phase of the dependency breaking algorithm. It then checks those 'KeepReg' bits during the ScanInstruction() phase to avoid changing any tied registers. For more details, please see comments in: http://llvm.org/bugs/show_bug.cgi?id=20020 I added two FIXME comments for code that I think can be removed by using register iterators that include self. I don't want to include those code changes with this patch, however, to keep things as small as possible. The test case is larger than I'd like, but I don't know how to reduce it further and still produce the failing asm. Differential Revision: http://reviews.llvm.org/D4351 llvm-svn: 212275	2014-07-03 15:19:40 +00:00
Ulrich Weigand	f236bb1b5b	Fix ppcf128 component access on little-endian systems The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274	2014-07-03 15:06:47 +00:00
Evgeniy Stepanov	174242c74c	[msan] Stop propagating shadow in blacklisted functions. With this change all values passed through blacklisted functions become fully initialized. Previous behavior was to initialize all loads in blacklisted functions, but apply normal shadow propagation logic for all other operations. This makes blacklist applicable in a wider range of situations. It also makes code for blacklisted functions a lot shorter, which works as yet another workaround for PR17409. llvm-svn: 212268	2014-07-03 11:56:30 +00:00
Evgeniy Stepanov	89c40a8b2d	[msan] Add missing attributes in MemorySanitizer tests. llvm-svn: 212267	2014-07-03 11:49:50 +00:00
NAKAMURA Takumi	254bd27ce2	Make llvm/test/CodeGen/X86/lower-bitcast.ll tolerant of win32 calling convention. llvm-svn: 212258	2014-07-03 07:25:00 +00:00
Chandler Carruth	99b1104c46	[x86] Fix the completely broken vector widening legalization of bswap. This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unnecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257	2014-07-03 07:04:38 +00:00
Chandler Carruth	739b6ada99	[x86] Fix crashes in lowering bitcast instructions with the widening mode. This also runs the test in that mode which would reproduce the crash. What I love is that every single FIXME in the test is addressed by switching to widening. llvm-svn: 212254	2014-07-03 03:43:47 +00:00
Chandler Carruth	395421fd98	[aarch64] Add a test that should have been in r212242 but I forgot to add it. Sorry about that. llvm-svn: 212251	2014-07-03 02:12:26 +00:00
Richard Trieu	f2a795241a	Add new lines to debugging information. Differential Revision: http://reviews.llvm.org/D4262 llvm-svn: 212250	2014-07-03 02:11:49 +00:00
Chandler Carruth	9d010fffe1	[codegen,aarch64] Add a target hook to the code generator to control vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized. Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =] http://reviews.llvm.org/D4322 llvm-svn: 212242	2014-07-03 00:23:43 +00:00
Kevin Enderby	25a614bccc	Add the -reverse-sort flag (aka -r) to llvm-nm which exists in other Unix nm(1)’s. llvm-svn: 212235	2014-07-02 23:23:58 +00:00
Adam Nemet	11dd5cf9f1	[X86] AVX512: Allow writemask argument in vpermt* intrinsics llvm-svn: 212223	2014-07-02 21:26:01 +00:00
Adam Nemet	2415a497b5	[X86] AVX512: Add writemask variants for vperm2 This includes assembler and codegen support (see the new tests in avx512-encodings.s and avx512-shuffle.ll). <rdar://problem/17492620> llvm-svn: 212221	2014-07-02 21:25:54 +00:00
Tom Stellard	10ae6a0e6a	R600: Promote i64 loads to v2i32 llvm-svn: 212216	2014-07-02 20:53:54 +00:00
David Blaikie	9408f5282e	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212205	2014-07-02 18:32:05 +00:00
David Blaikie	d47fb5b339	Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212203	2014-07-02 18:31:35 +00:00
Duncan P. N. Exon Smith	de58870394	AArch64: Re-enable AArch64AddressTypePromotion This reverts commits r212189 and r212190. While this pass was accidentally disabled (until r212073), r205437 slipped in a use of `auto` that should have been `auto&`. This fixes PR20188. llvm-svn: 212201	2014-07-02 18:17:40 +00:00
Duncan P. N. Exon Smith	292fa19077	XFAIL the test to go with r202189 llvm-svn: 212190	2014-07-02 17:07:03 +00:00
Alexey Samsonov	4f319cca42	[ASan] Print exact source location of global variables in error reports. See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the original feature request. Introduce llvm.asan.globals metadata, which Clang (or any other frontend) may use to report extra information about global variables to ASan instrumentation pass in the backend. This metadata replaces llvm.asan.dynamically_initialized_globals that was used to detect init-order bugs. llvm.asan.globals contains the following data for each global: 1) source location (file/line/column info); 2) whether it is dynamically initialized; 3) whether it is blacklisted (shouldn't be instrumented). Source location data is then emitted in the binary and can be picked up by ASan runtime in case it needs to print error report involving some global. For example: 0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40 These source locations are printed even if the binary doesn't have any debug info. This is an ABI-breaking change. ASan initialization is renamed to __asan_init_v4(). Pre-built libraries compiled with older Clang will not work with the fresh runtime. llvm-svn: 212188	2014-07-02 16:54:41 +00:00
Chad Rosier	aba845e835	Revert "Revert "MachineScheduler: better book-keeping for asserts."" This reverts commit r212109, which reverted r212088. However, disable the assert as it's not necessary for correctness. There are several corner cases that the assert needed to handle better for in-order scheduling, but none of them are incorrect scheduler behavior. The assert is mainly there to collect good unit tests like this and ensure that the target-independent scheduler is working as expected with the various machine models. llvm-svn: 212187	2014-07-02 16:46:08 +00:00
Benjamin Kramer	e739cf3eb5	X86: When combining shuffles just remove shuffles that are completely redundant. CombineTo doesn't allow replacing a node with itself so this would crash if the combined shuffle is the same as the input shuffle. llvm-svn: 212181	2014-07-02 15:09:44 +00:00
Elena Demikhovsky	678bd5ba4a	AVX-512: dec/inc instructions are slow on KNL After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC. Added a test. llvm-svn: 212178	2014-07-02 14:11:05 +00:00
David Majnemer	f28e2a4282	InstCombine: Optimize x/INT_MIN to x==INT_MIN The result of x/INT_MIN is either 0 or 1, we can just use an icmp instead. llvm-svn: 212167	2014-07-02 06:42:13 +00:00
David Majnemer	e18d302ef4	InstCombine: Add a vector variant test for PR20186 No functional change, just adding more test coverage that was meant to go in with r212164. llvm-svn: 212165	2014-07-02 06:14:13 +00:00
David Majnemer	bdeef602e9	InstCombine: Don't turn -(x/INT_MIN) -> x/INT_MIN It is not safe to negate the smallest signed integer, doing so yields the same number back. This fixes PR20186. llvm-svn: 212164	2014-07-02 06:07:09 +00:00
Saleem Abdulrasool	2e09c514c0	aarch64: support target-specific .req assembler directive Based on the support for .req on ARM. The aarch64 variant has to keep track if the alias register was a vector register (v0-31) or a general purpose or VFP/Advanced SIMD ([bhsdq]0-31) register. Patch by Janne Grunau! llvm-svn: 212161	2014-07-02 04:50:23 +00:00
Tim Northover	df58625e3c	X86: delegate expanding atomic libcalls to generic code. On targets without cmpxchg16b or cmpxchg8b, the borderline atomic operations were slipping through the gaps. X86AtomicExpand.cpp was delegating to ISelLowering. Generic ISelLowering was delegating to X86ISelLowering and X86ISelLowering was asserting. The correct behaviour is to expand to a libcall, preferably in generic ISelLowering. This can be achieved by X86ISelLowering deciding it doesn't want the faff after all. llvm-svn: 212134	2014-07-01 21:44:59 +00:00
David Blaikie	e844cd5305	DebugInfo: Keep track of subprograms whose arguments have been promoted. Matching behavior with DeadArgumentElimination (and leveraging some now-common infrastructure), keep track of the function from debug info metadata if arguments are promoted. This may produce interesting debug info - since the arguments may be missing or of different types... but at least backtraces, inlining, etc, will be correct. llvm-svn: 212128	2014-07-01 21:13:37 +00:00
Tim Northover	277066ab43	X86: expand atomics in IR instead of as MachineInstrs. The logic for expanding atomics that aren't natively supported in terms of cmpxchg loops is much simpler to express at the IR level. It also allows the normal optimisations and CodeGen improvements to help out with atomics, instead of using a limited set of possible instructions.. rdar://problem/13496295 llvm-svn: 212119	2014-07-01 18:53:31 +00:00
Adam Nemet	16de2486cb	[X86] AVX512: Allow writemasks with vpcmp For now I only updated the _alt variants. The main variants are used by codegen and that will need a bit more work to trigger. <rdar://problem/17492620> llvm-svn: 212114	2014-07-01 18:03:45 +00:00
Chad Rosier	f575a73751	Revert "MachineScheduler: better book-keeping for asserts." This reverts commit r212088, which is causing a number of spec failures. Will provide reduced test cases shortly. PR20057 llvm-svn: 212109	2014-07-01 17:23:11 +00:00
Kevin Enderby	afef4c99dc	Add the -arch flag support to llvm-size like what was done to llvm-nm to select the slice out of a Mach-O universal file. This also includes support for -arch all, selecting the host architecture by default from a universal file and checking if -arch is used with a standard Mach-O it matches that architecture. llvm-svn: 212108	2014-07-01 17:19:10 +00:00
David Majnemer	5c92115972	GlobalOpt: Don't swap private for internal linkage There were transforms whose intent was to downgrade the linkage of external objects to have internal linkage. However, it fired on things with private linkage as well. llvm-svn: 212104	2014-07-01 15:26:50 +00:00
David Majnemer	9797abb0bf	GlobalOpt: FileCheck-ize test No functionality change. llvm-svn: 212103	2014-07-01 15:26:47 +00:00
Rafael Espindola	83120cdf68	Avoid revocations when possible. This is a small targeted fix for pr20119. The code needs quite a bit of refactoring and I added some FIXMEs about it, but I want to get the testcase passing first. llvm-svn: 212101	2014-07-01 14:34:30 +00:00
David Blaikie	c8caa1702a	Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." This reverts commit r212085. This breaks the sanitizer bot... & I thought I'd tried pretty hard not to do that. Guess I need to try harder. llvm-svn: 212089	2014-07-01 04:11:45 +00:00
Andrew Trick	f1b307bcb0	MachineScheduler: better book-keeping for asserts. Fixes another test case under PR20057. llvm-svn: 212088	2014-07-01 03:23:13 +00:00
David Blaikie	b89e6d93d9	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), and I now believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212085	2014-07-01 03:11:59 +00:00
Reid Kleckner	b5dd9452b4	Fix .seh_stackalloc 0 seh_stackalloc 0 is not representable in Win64 SEH info, so emitting it is a bug. Reviewers: rnk Differential Revision: http://reviews.llvm.org/D4334 Patch by Vadim Chugunov! llvm-svn: 212081	2014-07-01 00:42:47 +00:00
David Majnemer	0e2cc2a519	GlobalOpt: Handle non-zero offsets for aliases An alias with an aliasee of a non-zero GEP is not trivially replaceable with its aliasee. llvm-svn: 212079	2014-07-01 00:30:56 +00:00
Gerolf Hoflehner	734f4c8984	Suppress inlining when the block address is taken Inlining functions with block addresses can cause many problems and requires a rich infrastructure to support including escape analysis. At this point the safest approach to address these problems is by blocking inlining from happening. Background: There have been reports on Ruby segmentation faults triggered by inlining functions with block addresses like //Ruby code snippet vm_exec_core() { finish_insn_seq_0 = &&INSN_LABEL_finish; INSN_LABEL_finish: ; } This kind of scenario can also happen when LLVM picks a subset of blocks for inlining, which is the case with the actual code in the Ruby environment. LLVM suppresses inlining for such functions when there is an indirect branch. The attached patch does so even when there is no indirect branch. Note that user code like above would not make much sense: using the global for jumping across function boundaries would be illegal. Why was there a segfault: In the snippet above the block with the label is recognized as dead, so it is eliminated. Instead of a block address the cloner stores a constant (sic!) into the global resulting in the segfault (when the global is used in a goto). Why had it worked in the past then: By luck. In older versions vm_exec_core was also inlined but the label address used was the block label address in vm_exec_core. So the global jump ended up in the original function rather than in the caller which accidentally happened to work. Test case ./tools/clang/test/CodeGen/indirect-goto.c will fail as a result of this commit. rdar://17245966 llvm-svn: 212077	2014-07-01 00:19:34 +00:00
Duncan P. N. Exon Smith	7d7ae93139	AArch64: Actually do address type promotion AArch64AddressTypePromotion was doing nothing because it was using the old semantics of `Use` and `uses()`, when it really wanted to get at the `users()`. llvm-svn: 212073	2014-06-30 23:42:14 +00:00
Reid Kleckner	cdb4e64a20	Convert some byval argpromotion grep tests to FileCheck Surprisingly, the i32* byval parameter is not transformed by argpromotion. llvm-svn: 212067	2014-06-30 20:44:28 +00:00
David Blaikie	644d2eee59	DebugInfo: Preserve debug location information when transforming a call into an invoke during inlining. This both improves basic debug info quality, but also fixes a larger hole whenever we inline a call/invoke without a location (debug info for the entire inlining is lost and other badness that the debug info emission code is currently working around but shouldn't have to). llvm-svn: 212065	2014-06-30 20:30:39 +00:00
David Blaikie	ba405c22b8	Remove unnecessary datalayout string from a test case. llvm-svn: 212063	2014-06-30 20:26:12 +00:00
Reid Kleckner	833740ac5e	msan: Stop stripping the 'tail' modifier off of calls This probably isn't necessary since msan started to unpoison the return value shadow memory before all calls. llvm-svn: 212061	2014-06-30 20:12:27 +00:00
Ed Maste	557c54d6ca	objdump: Add test for ELF file with no section table This is a test for the fix in r211904. Differential Revision: http://reviews.llvm.org/D4349 llvm-svn: 212059	2014-06-30 20:03:02 +00:00
Kevin Enderby	4c8dfe4d0f	Add the -arch flag support to llvm-nm to select the slice out of a Mach-O universal file. This also includes support for -arch all, selecting the host architecture by default from a universal file and checking if -arch is used with a standard Mach-O it matches that architecture. llvm-svn: 212054	2014-06-30 18:45:23 +00:00
Adrian Prantl	da7d92e3e2	Debug info: split out complex DIVariable address expressions into a separate MDNode so they can be uniqued via folding set magic. To conserve space, DIVariable nodes are still variable-length, with the last two fields being optional. No functional change. http://reviews.llvm.org/D3526 llvm-svn: 212050	2014-06-30 17:17:35 +00:00
Andrea Di Biagio	53b6830069	[X86] Add support for builtin to read performance monitoring counters. This patch adds support for a new builtin instruction called __builtin_ia32_rdpmc. Builtin '__builtin_ia32_rdpmc' is defined as a 'GCC builtin'; on X86, it can be used to read performance monitoring counters. It takes as input the index of the performance counter to read, and returns the value of the specified performance counter as a 64-bit number. Calls to this new builtin will map to instruction RDPMC. The index in input to the builtin call is moved to register %ECX. The result of the builtin call is the value of the specified performance counter (RDPMC would return that quantity in registers RDX:RAX). This patch: - Adds builtin int_x86_rdpmc as a GCCBuiltin; - Adds a new x86 DAG node called 'RDPMC_DAG'; - Teaches how to lower this new builtin; - Adds an ISel pattern to select instruction RDPMC; - Fixes the definition of instruction RDPMC adding %RAX and %RDX as implicit definitions, and adding %ECX as implicit use; - Adds a LLVM test to verify that the new builtin is correctly selected. llvm-svn: 212049	2014-06-30 17:14:21 +00:00
Chad Rosier	304fe3ff71	[AArch64] Unsized types don't specify an alignment. PR20109 llvm-svn: 212045	2014-06-30 15:03:00 +00:00
Chad Rosier	e6b8761ab9	[AArch64] Convert mul x, -(pow2 +/- 1) to shift + add/sub. The combine for mul x, pow2 +/- 1 is unchanged. Test cases for both combines as well as mul x, pow2 have been added as well. llvm-svn: 212044	2014-06-30 14:51:14 +00:00
Scott Douglass	7650a9b871	ARM: take care not to set the ThumbFunc bit on TLS data symbols This fixes LNT SingleSource/UnitTests/Threads with -mthumb. Differential Revision: http://reviews.llvm.org/D4324 llvm-svn: 212029	2014-06-30 09:37:24 +00:00
Erik Eckstein	5b18a09748	test commit: add a comment line in GVN test file llvm-svn: 212019	2014-06-30 07:19:02 +00:00
Chandler Carruth	bd0717d7cc	[x86] Fix a bug in the v8i16 shuffling exposed by the new splat-like lowering for v16i8. ASan and some bots caught this bug with existing test cases. Fixing it even fixed a miscompile with one of the test cases. I'm still a bit suspicious of this test case as I've not taken a proper amount of time to think about it, but the fix here is strict goodness. llvm-svn: 211976	2014-06-28 05:46:28 +00:00
Chandler Carruth	d5821f36d9	Fix this test to not write to the source tree, and instead to write to a temporary file. This fixes the test in cases where the source tree is mounted read-only. llvm-svn: 211975	2014-06-28 05:18:49 +00:00
Chandler Carruth	887c2c3482	[x86] Add handling for splat-like widenings of v16i8 shuffles. These show up really frequently, not the least with actual splats. =] We lowered these quite badly before. The new code path tries to widen i8 shuffles to i16 shuffles in a splat-like way. There are still some inefficiencies in our i16 splat logic though, so we aren't really done here. Also, for certain patterns (bit of a gather-and-splat) we still generate pretty silly code, and I've left a fixme for addressing it. However, I'm not actually worried about this code pattern as much. The old shuffle lowering generates a 29 instruction monstrosity for it that should execute much more slowly. llvm-svn: 211974	2014-06-28 05:16:40 +00:00
David Majnemer	72304efabc	This file wasn't supposed to be checked in This was generated while trying to debug a test, it shouldn't have been checked in. Thanks to Alexander Kornienko for spotting this. llvm-svn: 211973	2014-06-28 01:56:50 +00:00
Lang Hames	116d1354b6	[RuntimeDyld] Make sure that RuntimeDyld regression tests only run for targets that have been enabled. Without this, testers will fail when llvm-rtdyld is invoked with triples for unsupported targets. llvm-svn: 211969	2014-06-27 23:29:18 +00:00
Matt Arsenault	018e91f808	Revert "Temporary hack to try cleaning extra .s file from bots." llvm-svn: 211967	2014-06-27 23:11:26 +00:00
Matt Arsenault	c9c44d682c	Temporary hack to try cleaning extra .s file from bots. llvm-svn: 211963	2014-06-27 21:43:50 +00:00
Chad Rosier	5235973ee0	[AArch64] Fix memset ICE when memset value is f128. llvm-svn: 211960	2014-06-27 21:05:09 +00:00
Justin Bogner	035fcf7115	llvm-cov: Support specifying multiple source files Make llvm-cov compatible with gcov for cases where multiple files are specified on the command line. That is, loop over each one and report coverage, and report errors on stderr only rather than via return code. llvm-svn: 211959	2014-06-27 20:41:25 +00:00
Lang Hames	e1c1138a38	[RuntimeDyld] Add a framework for testing relocation logic in RuntimeDyld. This patch adds a "-verify" mode to the llvm-rtdyld utility. In verify mode, llvm-rtdyld will test supplied expressions against the linked program images that it creates in memory. This scheme can be used to verify the correctness of the relocation logic applied by RuntimeDyld. The expressions to test will be read out of files passed via the -check option (there may be more than one of these). Expressions to check are extracted from lines of the form: # rtdyld-check: <expression> This system is designed to fit the llvm-lit regression test workflow. It is format and target agnostic, and supports verification of images linked for remote targets. The expression language is defined in llvm/include/llvm/RuntimeDyldChecker.h . Examples can be found in test/ExecutionEngine/RuntimeDyld. llvm-svn: 211956	2014-06-27 20:20:57 +00:00
Chandler Carruth	a94ef908d9	[x86] Fix another bug hit when bootstrapping with the new shuffle lowering. For maximum irony, I had already discovered this bug, diagnosed it, and left FIXMEs about it in the test cases. =[ I just failed to go back over those until after I had reduced a bootstrap miscompile down to a single TU, stared at the assembly for an hour, and figured out the bug. Again. Oh well. llvm-svn: 211955	2014-06-27 20:07:40 +00:00
Justin Holewinski	a0d531f031	[NVPTX] Add reflect intrinsic (better than matching by function name) Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there. llvm-svn: 211948	2014-06-27 18:36:11 +00:00
Justin Holewinski	2739c0175c	[NVPTX] Add 'b' asm constraint llvm-svn: 211946	2014-06-27 18:36:06 +00:00
Justin Holewinski	549c773619	[NVPTX] Error out if initializer is given for variable in an address space that does not support initialization llvm-svn: 211943	2014-06-27 18:36:01 +00:00
Justin Holewinski	773ca40f5d	[NVPTX] Add support for .managed variables for UVM llvm-svn: 211942	2014-06-27 18:35:58 +00:00
Justin Holewinski	d73767a80a	[NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage llvm-svn: 211941	2014-06-27 18:35:56 +00:00
Justin Holewinski	b926d9d446	[NVPTX] Fix handling of ldg/ldu intrinsics. The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g. %val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0 !0 = metadata !{i32 4} llvm-svn: 211939	2014-06-27 18:35:51 +00:00
Justin Holewinski	6e40f63e41	[NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors llvm-svn: 211938	2014-06-27 18:35:44 +00:00
Justin Holewinski	360a5cfcd3	[NVPTX] Add support for [SHL,SRA,SRL]_PARTS llvm-svn: 211936	2014-06-27 18:35:40 +00:00
Justin Holewinski	eafe26d082	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer llvm-svn: 211935	2014-06-27 18:35:37 +00:00
Justin Holewinski	832e09b4d9	[NVPTX] Add support for efficient rotate instructions on SM 3.2+ llvm-svn: 211934	2014-06-27 18:35:33 +00:00
Justin Holewinski	7be57de6b8	[NVPTX] Add missing isel patterns for 64-bit atomics llvm-svn: 211933	2014-06-27 18:35:30 +00:00
Justin Holewinski	ca7a4f136d	[NVPTX] Add isel patterns for bit-field extract (bfe) llvm-svn: 211932	2014-06-27 18:35:27 +00:00
Justin Holewinski	10c25968d8	[NVPTX] Add support for isspacep instruction llvm-svn: 211931	2014-06-27 18:35:24 +00:00
Justin Holewinski	124fc1951f	[NVPTX] Add support for envreg reads llvm-svn: 211930	2014-06-27 18:35:21 +00:00
Justin Holewinski	7d5bf66f61	[NVPTX] Emit .weak when linkage is not external, internal, or private llvm-svn: 211926	2014-06-27 18:35:10 +00:00
Chandler Carruth	dd6470a9dd	[x86] Fix a miscompile in the new shuffle lowering uncovered by a bootstrap. I managed to mis-remember how PACKUS worked on x86, and was using undef for the high bytes instead of zero. The fix is fairly obvious. llvm-svn: 211922	2014-06-27 18:25:23 +00:00
David Majnemer	dad0a645a7	IR: Add COMDATs to the IR This new IR facility allows us to represent the object-file semantic of a COMDAT group. COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++. This functionality is only representable in COFF and ELF, Mach-O has no similar mechanism. Differential Revision: http://reviews.llvm.org/D4178 llvm-svn: 211920	2014-06-27 18:19:56 +00:00
David Blaikie	6a21e14d53	Fix test so it doesn't try to write out temporary files into the test tree. llvm-svn: 211916	2014-06-27 17:45:43 +00:00
David Majnemer	c57d038240	MC: Fix associative sections on COFF COFF sections in MC were represented by a tuple of section-name and COMDAT-name. This is not sufficient to represent a .text section associated with another .text section; we need a way to distinguish between the key section and the one marked associative. llvm-svn: 211913	2014-06-27 17:19:44 +00:00
Matt Arsenault	642d2e78b3	R600: Don't crash on unhandled instruction in promote alloca llvm-svn: 211906	2014-06-27 16:52:49 +00:00
Ulrich Weigand	14bd521f4c	[PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex I've run into a bug where current LLVM at -O0 (with fast-isel) generated invalid code like: ld 0, 20936(1) # 8-byte Folded Reload stw 12, 10348(0) stw 12, 10344(0) The underlying vreg had been introduced as base register by the Local Stack Slot Allocation pass. That register was constrained to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match the ADDI instruction used to set it, but it was not constrained to G8RC_NOX0 to fit the use of the register in an address. That should have happened in PPCRegisterInfo::resolveFrameIndex. This patch adds an appropriate constrainRegClass call. Reviewed by Hal Finkel. llvm-svn: 211897	2014-06-27 13:04:12 +00:00
Chandler Carruth	688001f042	[x86] Teach the target combine step to aggressively fold pshufd instructions. Summary: This allows it to fold pshufd instructions across intervening half-shuffles and other noise. This pattern actually shows up in the generic lowering tests, but I've also added direct tests using intrinsics to make sure that the specific desired functionality is working even if the lowering stuff changes in the future. Differential Revision: http://reviews.llvm.org/D4292 llvm-svn: 211892	2014-06-27 11:40:13 +00:00
Simon Atanasyan	24199883e5	[ELF][Mips] Fix recognition of MIPS 64-bit arch in the ELFObjectFile::getArch() method. llvm-svn: 211891	2014-06-27 11:36:45 +00:00
Chandler Carruth	0d6d1f2b17	[x86] Teach the target-specific combining how to aggressively fold half-shuffles, even looking through intervening instructions in a chain. Summary: This doesn't happen to show up with any test cases I've found for the current shuffle lowering, but previous attempts would benefit from this and it seems generally useful. I've tested it directly using intrinsics, which also shows that it will work with hand vectorized code as well. Note that even though pshufd isn't directly used in these tests, it gets exercised because we combine some of the half shuffles into a pshufd first, and then merge them. Differential Revision: http://reviews.llvm.org/D4291 llvm-svn: 211890	2014-06-27 11:34:40 +00:00
Chandler Carruth	97ebc2362c	[x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are trivially redundant. This fixes several cases in the new vector shuffle lowering algorithm which would generate redundant shuffle instructions for the sake of simplicity. I'm also deleting a testcase which was somewhat ridiculous. It was checking for a bug in 2007 about incorrectly transforming shuffles by looking for the string "-86" in the output of a pretty substantial function. This test case doesn't seem to have any value at this point. Differential Revision: http://reviews.llvm.org/D4240 llvm-svn: 211889	2014-06-27 11:27:52 +00:00
Chandler Carruth	83860cfcfa	[x86] Begin a significant overhaul of how vector lowering is done in the x86 backend. This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths. Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address: - We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward. - There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out... Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course. Differential Revision: http://reviews.llvm.org/D4225 llvm-svn: 211888	2014-06-27 11:23:44 +00:00
Dinesh Dwivedi	adc07739a9	Added instruction combine to transform few more negative values addition to subtraction (Part 3) This patch enables transforms for x + (~(y | c) + 1) --> x - (y | c) if c is odd Differential Revision: http://reviews.llvm.org/D4210 llvm-svn: 211881	2014-06-27 07:47:35 +00:00
David Majnemer	9930398c5c	GlobalOpt: Fix constantfold-initializers.ll test The test added in r211762 was sloppy, the correct initializer wasn't added to @llvm.global_ctors Spotted by Pasi Parviainen! llvm-svn: 211879	2014-06-27 07:36:26 +00:00
David Blaikie	dada538bb4	Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""" Reverting this again, didn't mean to commit it - while r211872 fixes one of the issues here, there are still others to figure out and address. This reverts commit r211871. llvm-svn: 211873	2014-06-27 05:34:05 +00:00
David Blaikie	b0cdf530c3	ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted. llvm-svn: 211872	2014-06-27 05:32:09 +00:00
David Blaikie	8832992df5	Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."" This reverts commit r211724. llvm-svn: 211871	2014-06-27 05:31:49 +00:00
Andrew Trick	5632722cab	MachineScheduler: add some book-keeping to fix an assert. Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"' Thanks to Chad for the test case. llvm-svn: 211865	2014-06-27 04:57:05 +00:00
Matt Arsenault	6995dd90c0	R600: Add some testcases for promote alloca pass. More complicated GEPs are skipped. Add some tests to actually stress this skipping. llvm-svn: 211859	2014-06-27 03:55:55 +00:00
Adam Nemet	73f72e15ac	[X86] AVX512: Add vbroadcasti* For now I used a separate template for these sub-vector/tuple broadcasts rather than sharing the mem variants with avx512_int_broadcast_rm. <rdar://problem/17402869> llvm-svn: 211828	2014-06-27 00:43:38 +00:00
Juergen Ributzka	009bff223b	[StackMaps] Enable patchpoint liveness analysis per default. llvm-svn: 211817	2014-06-26 23:39:52 +00:00
Juergen Ributzka	14871f73bb	[Stackmaps] Remove the liveness calculation for stackmap intrinsics. There is no need to calculate the liveness information for stackmaps. The liveness information is still available for the patchpoint intrinsic and that is also the intended usage model. Related to <rdar://problem/17473725> llvm-svn: 211816	2014-06-26 23:39:44 +00:00
Arnold Schwaighofer	ed988fb97d	GVN: Preserve invariant.load metadata If both instructions to be replaced are marked invariant the resulting instruction is invariant. rdar://13358910 Fix by Erik Eckstein! llvm-svn: 211801	2014-06-26 19:51:19 +00:00
Matt Arsenault	0989d51520	R600/SI: Add FP mode bits to binary. The default rounding mode to initialize the mode register needs to be reported to the runtime. Fill in other bits a kernel may be interested in setting for future use. llvm-svn: 211791	2014-06-26 17:22:30 +00:00
Renato Golin	ac561c3ac7	Added parsing co-processor names starting with "cr" Additional compliant GAS names for coprocessor register names are enabled for all instructions with parameter MCK_CoprocReg: LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2 Patch by Andrey Kuharev. llvm-svn: 211776	2014-06-26 13:10:53 +00:00
Andrea Di Biagio	7fb85256bc	[X86] Improve the selection of SSE3/AVX addsub instructions. This patch teaches the backend how to canonicalize shuffle vectors according to the rule: - (shuffle (FADD A, B), (FSUB A, B), Mask) -> (shuffle (FSUB A, -B), (FADD A, -B), Mask) Where 'Mask' is: <0,5,2,7> ;; for v4f32 and v4f64 shuffles. <0,3> ;; for v2f64 shuffles. <0,9,2,11,4,13,6,15> ;; for v8f32 shuffles. In general, ISel only knows how to pattern-match a canonical 'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction. This new rule allows converting a non-canonical dag sequence into a canonical one that will be matched by a single ADDSUB at ISel stage. The idea of converting a non-canonical ADDSUB into a canonical one by swapping the first two operands of the shuffle, and then negating the second operand of the FADD and FSUB, was originally proposed by Hal Finkel. llvm-svn: 211771	2014-06-26 10:45:21 +00:00
Dinesh Dwivedi	99281a0615	This patch removed duplicate code for matching patterns which are now handled in SimplifyUsingDistributiveLaws() (after r211261) Differential Revision: http://reviews.llvm.org/D4253 llvm-svn: 211768	2014-06-26 08:57:33 +00:00
Dinesh Dwivedi	a716173581	Added instruction combine to transform few more negative values addition to subtraction (Part 2) This patch enables transforms for x + (~(y | c) + 1) --> x - (y | c) if c is even Differential Revision: http://reviews.llvm.org/D4209 llvm-svn: 211765	2014-06-26 05:40:22 +00:00
David Majnemer	6098b2f519	GlobalOpt: Don't optimize thread_local for initializers Folding a reference to a thread_local variable into another global variable's initializer is very problematic, there is no relocation that exists to represent such an access. llvm-svn: 211762	2014-06-26 03:02:19 +00:00
Matt Arsenault	c6f8fdb4e5	R600: Fix vector FMA llvm-svn: 211757	2014-06-26 01:28:05 +00:00
Hans Wennborg	b03ebfb77e	Don't build switch tables for dllimport and TLS variables in GEPs This is a follow-up to r211331, which failed to notice that we were returning early from ValidLookupTableConstant for GEPs. llvm-svn: 211753	2014-06-26 00:30:52 +00:00
Adam Nemet	905832bf87	[X86] AVX512: Fix asm syntax for packed vcmp The *_alt defs for vcmp are used by the InstParser (the asm string in the main def is used by the InstPrinter). The former was accepting vector registers as destination rather than mask registers. llvm-svn: 211750	2014-06-26 00:21:12 +00:00
Juergen Ributzka	296833cde9	[FastISel][X86] Only fold the cmp into the select when both instructions are in the same basic block. If the cmp is in a different basic block, then it is possible that not all operands of that compare have defined registers. This can happen when one of the operands to the cmp is a load and the load gets folded into the cmp. In this case FastISel will skip the load instruction and the vreg is never defined. llvm-svn: 211730	2014-06-25 20:06:12 +00:00
David Blaikie	2952956fd8	Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location." This reverts commit r211723. Breaks the ASan/compiler-rt build... guess I didn't test very far at all :/. llvm-svn: 211724	2014-06-25 18:20:54 +00:00
David Blaikie	442584588a	PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 211723	2014-06-25 18:03:10 +00:00
Tyler Nowicki	4b07b00786	Add Rpass-missed and Rpass-analysis reports to the loop vectorizer. The remarks give the vector width of vectorized loops and a brief analysis of loops that fail to be vectorized. For example, an analysis will be generated for loops containing control flow that cannot be simplified to a select. The optimization remarks also give the debug location of expressions that cannot be vectorized, for example the location of an unvectorizable call. Reviewed by: Arnold Schwaighofer llvm-svn: 211721	2014-06-25 17:50:15 +00:00
Andrea Di Biagio	07cdffc324	[X86] Always prefer to lower a VECTOR_SHUFFLE into a BLENDI instead of SHUFP (or VPERM2X128). This patch teaches method 'LowerVECTOR_SHUFFLE' to give higher precedence to the check for 'isBlendMask'; the idea is that, when possible, we should first check whether a shuffle performs a blend and, if so, try to lower it into a BLENDI instead of selecting a SHUFP or (worse) a VPERM2X128. In general: - AVX VBLENDPS/D always have better latency and throughput than VPERM2F128; - BLENDPS/D instructions tend to always have better 'reciprocal throughput' than the equivalent SHUFPS/D; - Both BLENDPS/D and SHUFPS/D are often decoded into the same number of m-ops; however, a m-op obtained from a BLENDPS/D can be scheduled to more than one execution port. This patch: - Moves the check for 'isBlendMask' immediately before the check for 'isSHUFPMask' within method 'LowerVECTOR_SHUFFLE'; - Updates existing tests for sse/avx shuffle/blend instructions to verify that we select (v)blendps/d when possible (instead of (v)shufps/d or vperm2f128). llvm-svn: 211720	2014-06-25 17:41:58 +00:00
Eli Bendersky	451ef5b2c5	Add some test files for r211710. llvm-svn: 211711	2014-06-25 15:41:39 +00:00
Eli Bendersky	5d5e18da3e	Rename loop unrolling and loop vectorizer metadata to have a common prefix. [LLVM part] These patches rename the loop unrolling and loop vectorizer metadata such that they have a common 'llvm.loop.' prefix. Metadata name changes: llvm.vectorizer.* => llvm.loop.vectorizer.* llvm.loopunroll.* => llvm.loop.unroll.* This was a suggestion from an earlier review (http://reviews.llvm.org/D4090) which added the loop unrolling metadata. Patch by Mark Heffernan. llvm-svn: 211710	2014-06-25 15:41:00 +00:00
Evgeniy Stepanov	b163f0276f	[msan] Fix bad interaction between with-calls mode and chained origin tracking. Origin history should only be recorded for uninitialized values, because it is meaningless otherwise. This change moves __msan_chain_origin to the runtime library side and makes it conditional on the corresponding shadow value. Previous code was correct, but _very_ inefficient. llvm-svn: 211700	2014-06-25 14:41:57 +00:00
Chandler Carruth	e5724d7532	[x86] Add intrinsics for the pshufd, pshuflw, and pshufhw instructions. llvm-svn: 211694	2014-06-25 13:12:54 +00:00
NAKAMURA Takumi	1db5995d14	Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter. -- This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211691	2014-06-25 12:41:52 +00:00
Andrea Di Biagio	6d9b9e125d	[X86] Add target combine rule to select ADDSUB instructions from a build_vector This patch teaches the backend how to combine a build_vector that implements an 'addsub' between packed float vectors into a sequence of vector add and vector sub followed by a VSELECT. The new VSELECT is expected to be lowered into a BLENDI. At ISel stage, the sequence 'vector add + vector sub + BLENDI' is pattern-matched against ISel patterns added at r211427 to select 'addsub' instructions. Added three more ISel patterns for ADDSUB. Added test sse3-avx-addsub-2.ll to verify that we correctly emit 'addsub' instructions. llvm-svn: 211679	2014-06-25 10:02:21 +00:00
Evgeniy Stepanov	10280dac1d	[LICM] Don't create more than one copy of an instruction per loop exit block when sinking. Fixes exponential compilation complexity in PR19835, caused by LICM::sink not handling the following pattern well: f = op g e = op f, g d = op e c = op d, e b = op c a = op b, c When an instruction with N uses is sunk, each of its operands gets N new uses (all of them - phi nodes). In the example above, if a had 1 use, c would have 2, e would have 4, and g would have 8. llvm-svn: 211673	2014-06-25 07:54:58 +00:00
Rafael Espindola	6804d450cd	Fix another asserting method in the null streamer. llvm-svn: 211668	2014-06-25 05:37:58 +00:00
Rafael Espindola	c00d875d35	Fix a regression from r211653. The method was empty in the null streamer but I mistakenly replaced it with the aborting one in MCStreamer. llvm-svn: 211666	2014-06-25 05:31:22 +00:00
NAKAMURA Takumi	78d8ebfc28	CodeGen/X86/pr20088.ll: Add -march=x86-64, or llc fails due to non-x86 default target. llvm-svn: 211659	2014-06-25 03:05:47 +00:00
Juergen Ributzka	2bce27e5a0	[FastISel][X86] Fold XALU condition into branch and compare. Optimize the codegen of select and branch instructions to directly use the EFLAGS from the {s|u}{add|sub|mul}.with.overflow intrinsics. llvm-svn: 211645	2014-06-24 23:51:21 +00:00
Tom Stellard	9b3816b5ee	R600: Promote i64 stores to v2i32 Now we need only one 64-bit pattern for stores. llvm-svn: 211643	2014-06-24 23:33:04 +00:00
NAKAMURA Takumi	d7abf56595	ldr-pseudo-obj-errors.s: Fix silly copypasto. llvm-svn: 211642	2014-06-24 23:18:07 +00:00
NAKAMURA Takumi	e49e30357b	llvm/test/MC/AArch64/ldr-pseudo-obj-errors.s: Add -triple=aarch64-linux. AArch64 is unaware of PECOFF for now. FIXME: This should pass for also targeting aarch64-darwin. llvm-svn: 211640	2014-06-24 23:11:42 +00:00
Rafael Espindola	f491704e22	Print a=b as an assignment. In assembly the expression a=b is parsed as an assignment, so it should be printed as one. This removes a truly horrible hack for producing a label with "a=.". It would be used by codegen but would never be reached by the asm parser. Sorry I missed this when it was first committed. llvm-svn: 211639	2014-06-24 22:45:16 +00:00
Matt Arsenault	257d48d22c	R600: Fix inconsistency in rsq instructions. R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here. llvm-svn: 211637	2014-06-24 22:13:39 +00:00
David Blaikie	6800e39865	Fix up scoping in a few tests (and delete one that validates unnecessary behavior). Most of this is just tests that were silently succeeding in spite of schema changes I made over a year ago. Cleaning them up as they lead to failures in a change I'm working on/will come soon. test/DebugInfo/2010-01-19-DbgScope.ll was removed as it tested miscoping where a DebugLoc described a location not in the current function. The test case doesn't describe why this is a valid situation and should be supported, so I'm removing it and shortly going to commit changes that make this firmly unsupported/assert-fail. llvm-svn: 211628	2014-06-24 20:10:27 +00:00
Bill Schmidt	83973ef23b	[PPC64] Fix PR20071 (fctiduz generated for targets lacking that instruction) PR20071 identifies a problem in PowerPC's fast-isel implementation for floating-point conversion to integer. The fctiduz instruction was added in Power ISA 2.06 (i.e., Power7 and later). However, this instruction is being generated regardless of which 64-bit PowerPC target is selected. The intent is for fast-isel to punt to DAG selection when this instruction is not available. This patch implements that change. For testing purposes, the existing fast-isel-conversion.ll test adds a RUN line for -mcpu=970 and tests for the expected code generation. Additionally, the existing test fast-isel-conversion-p5.ll was found to be incorrectly expecting the unavailable instruction to be generated. I've removed these test variants since we have adequate coverage in fast-isel-conversion.ll. llvm-svn: 211627	2014-06-24 20:05:18 +00:00
Robert Khasanov	21c836823f	vpblend intrinsics were combined as shift intrinsics due to a missing return statement between them. Fixes PR20088. Differential Revision: http://reviews.llvm.org/D4277 llvm-svn: 211617	2014-06-24 18:08:04 +00:00
Weiming Zhao	abb603da1c	Fix test case in r211605/r211533 The test case in "Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64" should only work with Linux. llvm-svn: 211613	2014-06-24 17:05:43 +00:00
Diego Novillo	56653fdada	Add new debug kind LocTrackingOnly. Summary: This new debug emission kind supports emitting line location information in all instructions, but stops code generation from emitting debug info to the final output. This mode is useful when the backend wants to track source locations during code generation, but it does not want to produce debug info. This is currently used by optimization remarks (-pass-remarks, -pass-remarks-missed and -pass-remarks-analysis). To prevent debug info emission, DIBuilder never inserts the annotation 'llvm.dbg.cu' when LocTrackingOnly is enabled. Reviewers: echristo, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4234 llvm-svn: 211609	2014-06-24 17:02:03 +00:00
Weiming Zhao	b1d4dbdcc7	Resubmit commit r211533 "Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64" Missed files are added in this commit. llvm-svn: 211605	2014-06-24 16:21:38 +00:00
Christian Pirker	c6308f59b2	ARM: Fix TPsoft for Thumb mode Reviewed at http://reviews.llvm.org/D4230 llvm-svn: 211601	2014-06-24 15:45:59 +00:00
Daniel Sanders	e6198bf886	[mips] Added support for assembling sdbbp. Summary: This instruction is re-encoded in MIPS32r6/MIPS64r6 without changing the restrictions. We hadn't implemented it for earlier ISA's so it has been added to those too. Differential Revision: http://reviews.llvm.org/D4265 llvm-svn: 211590	2014-06-24 13:00:32 +00:00
Benjamin Kramer	c96a7f88b9	InstCombine: Disable umul.with.overflow recognition for vectors. It doesn't make a lot of sense on most targets and the code isn't ready for it. PR20113. llvm-svn: 211583	2014-06-24 10:47:52 +00:00
Benjamin Kramer	6de786666a	InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr. We can't analyze the individual values of a vector expression. PR20114. llvm-svn: 211581	2014-06-24 10:38:10 +00:00
David Majnemer	23fc9afa4d	GlobalOpt: Don't optimize dllimport for initializers Referencing a dllimport variable requires actual instructions, not just a relocation. This fixes PR19955. Differential Revision: http://reviews.llvm.org/D4249 llvm-svn: 211571	2014-06-24 06:53:45 +00:00
Kevin Qin	93d45ecdbf	[AArch64] Fix a build_vector pattern match fail caused by defect in isBuildVectorAllZeros(). llvm-svn: 211567	2014-06-24 05:37:27 +00:00
Adam Nemet	8ae70506ea	[Disasm][AVX512] Implement decoding of top bit for non-destructive reg fields V' bit in the P2 byte of the EVEX prefix provides the top bit of the NDD and NDS register fields. This was simply not used in the decoder until now. Fixes <rdar://problem/17402661> llvm-svn: 211565	2014-06-24 01:42:32 +00:00
Juergen Ributzka	aed5c96684	[FastISel][X86] Lower unsupported selects to control-flow. This extends the select lowering coverage by emitting pseudo cmov instructions. These instructions will later be lowered to control-flow to simulate the select. llvm-svn: 211545	2014-06-23 21:55:44 +00:00
Juergen Ributzka	21d560843f	[FastISel][X86] Add support for floating-point select. This extends the select lowering to support floating-point selects. The lowering depends on SSE instructions and on the condition coming from a floating-point compare. Under these conditions it is possible to emit an optimized instruction sequence that doesn't require any branches to simulate the select. llvm-svn: 211544	2014-06-23 21:55:40 +00:00
Juergen Ributzka	6ef06f9159	[FastISel][X86] Optimize selects when the condition comes from a compare. Optimize the select instructions sequence to use the EFLAGS directly from a compare when possible. llvm-svn: 211543	2014-06-23 21:55:36 +00:00
NAKAMURA Takumi	0c2a080158	nm-trivial-object.test requires shell since Lit internal runner isn't capable of chdir. llvm-svn: 211537	2014-06-23 21:07:04 +00:00
Kevin Enderby	4fc2edb023	Change the default input for llvm-nm to be a.out instead of standard input to match llvm-size and other UNIX systems for their nm(1). Tweak test cases that used llvm-nm with standard input to add a "-" to indicate that and add a test case to check the default of a.out for llvm-nm. llvm-svn: 211529	2014-06-23 20:27:53 +00:00
Rafael Espindola	60890b8910	[Mips] Add a target streamer when creating a null streamer. Should fix DebugInfo/global.ll on the mips bot. llvm-svn: 211527	2014-06-23 19:43:40 +00:00
Matt Arsenault	f2b0aebb8a	R600/SI: Fix div_scale intrinsic. The operand that must match one of the others does matter, and implement selecting for it. llvm-svn: 211523	2014-06-23 18:28:28 +00:00
Christian Pirker	6f81e75dab	ARMEB: Vector extend operations Reviewed at http://reviews.llvm.org/D4043 llvm-svn: 211520	2014-06-23 18:05:53 +00:00
Matt Arsenault	c4d3d3a16e	R600: Move add/sub with overflow out of AMDILISelLowering Add more tests for these. llvm-svn: 211517	2014-06-23 18:00:49 +00:00
Matt Arsenault	b8b5153935	R600/SI: Handle i64 sub. We can handle it the same way as add llvm-svn: 211514	2014-06-23 18:00:38 +00:00
Rafael Espindola	c5f1a6c66f	Delete utils/FileUpdate. It is unused and it looks like it was never used. llvm-svn: 211508	2014-06-23 17:58:39 +00:00
Rafael Espindola	886048276f	Allow using .cfi_startproc without a leading symbol. This is possible now that we don't produce .eh symbols. This fixes pr19430. llvm-svn: 211502	2014-06-23 15:34:32 +00:00
Rafael Espindola	440bb21b5a	Stop producing func.eh symbols on Darwin. According to Nick Kledzik (http://llvm.org/bugs/show_bug.cgi?id=19430#c2): "... mach-o no longer needs names in the __eh_frame section (and has not for years)." Iain Sandoe confirms it is also unnecessary for their old darwin support. llvm-svn: 211500	2014-06-23 15:13:23 +00:00
Ulrich Weigand	f316e1db75	[PowerPC] Allow stack frames without parameter save area The PPCFrameLowering::determineFrameLayout routine currently ensures that every function that allocates a stack frame provides space for the parameter save area (via PPCFrameLowering::getMinCallFrameSize). This is actually not necessary. There may be functions that never call another routine but still allocate a frame; those do not require the parameter save area. In the future, with the ELFv2 ABI, even some routines that do call other functions do not need to allocate the parameter save area. While it is not a bug to allocate the parameter area when it is not needed, it is better to avoid it to save stack space. Note that when any particular function call requires the parameter save area, this space will already have been included by ABI code in the size the CALLSEQ_START insn is annotated with, and therefore included in the size returned by MFI->getMaxCallFrameSize(). This means that determineFrameLayout simply does not need to care about the parameter save area. (It still needs to ensure that every frame provides the linkage area.) This is implemented by this patch. Note that this exposed a bug in the new fast-isel code where the parameter area was not included in the CALLSEQ_START size; this is also fixed. A couple of test cases needed to be adapted for the new (smaller) stack frame size those tests now see. llvm-svn: 211495	2014-06-23 13:47:52 +00:00
Ulrich Weigand	9ba552db89	[PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4 Current 64-bit SVR4 code seems to have some remnants of Darwin code in AltiVec argument handling. This had the effect that AltiVec arguments (or subsequent arguments) were not correctly placed in the parameter area in some cases. The correct behaviour with the 64-bit SVR4 ABI is: - All AltiVec arguments take up space in the parameter area, just like any other arguments, whether vararg or not. - They are always 16-byte aligned, skipping a parameter area doubleword (and the associated GPR, if any), if necessary. This patch implements the correct behaviour and adds a test case. (Verified against GCC behaviour via the ABI compat test suite.) llvm-svn: 211492	2014-06-23 12:36:34 +00:00
Tim Northover	2099862a50	ARM: mark UBFX as not allowing PC. Strictly, it's unpredictable. But we don't quite model that yet and an error is better than ignoring the issue. This one somehow got left out before though. rdar://problem/15997748 llvm-svn: 211490	2014-06-23 09:20:02 +00:00
Saleem Abdulrasool	bdbc0088da	MC: adjust text section flags for WoA Correct the section flags for code built for Windows on ARM with `-ffunction-sections`. Windows on ARM uses solely Thumb-2 instructions, and indicates that the function is thumb by placing it in a text section that has IMAGE_SCN_MEM_16BIT flag set. When we encounter a .section directive, a new section is constructed. This may be a text segment. In order to identify that we need the additional flag, expose the target triple through the ObjectFileInfo as this information is lost otherwise. Since any modern ARM targeting environment on Windows would be Thumb-2 (Windows ARM NT or Windows Embedded Compact), introducing a new flag to indicate the section attribute seems to be a bit overkill. Simply depend on the target triple. Since there is one location that this information is currently needed, creating a target specific assembly parser and delegating the parsing of section switches also feels a bit heavy handed. If it turns out that this information ends up changing additional behaviour, then it may be worth considering that alternative. llvm-svn: 211481	2014-06-22 22:25:01 +00:00
NAKAMURA Takumi	d77cefe633	Revert r211399, "Generate native unwind info on Win64" It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64. llvm-svn: 211480	2014-06-22 22:00:56 +00:00
Jan Vesely	b32714054a	R600: Add udivrem test v2: move < %s to the end of the line space after ; add v4i32 test Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211476	2014-06-22 21:42:58 +00:00
Filipe Cabecinhas	1af2dfd274	Fix PR20087 by using the source index when changing the vector load llvm-svn: 211472	2014-06-22 17:21:37 +00:00
NAKAMURA Takumi	e80af7f3eb	Introduce a Lit feature "debug_frame" and apply it to llvm/test/MC/ELF/cfi-version.ll. .debug_frame is not emitted for targeting Windows x64. llvm-svn: 211466	2014-06-22 12:35:39 +00:00
Benjamin Kramer	f504ec2298	Add a description to the test from r211433 explaining why it's written that way. llvm-svn: 211465	2014-06-22 12:22:04 +00:00
Arnold Schwaighofer	c11107cb1e	LoopVectorizer: Fix a dominance issue The induction variable's start value needs to be defined before we branch (overflow check) to the scalar preheader where we use it. llvm-svn: 211460	2014-06-22 03:38:59 +00:00
Weiming Zhao	58eb5ab326	Report error for non-zero data in .bss User may initialize a var with non-zero value and specify .bss section. E.g. : int a __attribute__((section(".bss"))) = 2; This patch converts an assertion to error report for better user experience. Differential Revision: http://reviews.llvm.org/D4199 llvm-svn: 211455	2014-06-22 00:33:44 +00:00
Benjamin Kramer	0bf086f80f	LoopUnrollRuntime: Check for overflow in the trip count calculation. Fixes PR19823. llvm-svn: 211436	2014-06-21 13:46:25 +00:00
Benjamin Kramer	b7f5fb5751	Legalizer: Add support for splitting insert_subvectors. We handle this by spilling the whole thing to the stack and doing the insertion as a store. PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled. llvm-svn: 211435	2014-06-21 12:56:42 +00:00
Benjamin Kramer	8dd637aa04	SCEVExpander: Fold constant PHIs harder. The logic below only understands proper IVs. PR20093. llvm-svn: 211433	2014-06-21 11:47:18 +00:00
Andrea Di Biagio	e5015d8aba	[X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions. This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions from a sequence of "vadd + vsub + blend". Example: /// typedef float float4 __attribute__((ext_vector_type(4))); float4 foo(float4 A, float4 B) { float4 X = A - B; float4 Y = A + B; return (float4){X[0], Y[1], X[2], Y[3]}; } /// Before this patch, (with flag -mcpu=corei7) llc produced the following assembly sequence: movaps %xmm0, %xmm2 addps %xmm1, %xmm2 subps %xmm1, %xmm0 blendps $10, %xmm2, %xmm0 With this patch, we now get a single addsubps %xmm1, %xmm0 llvm-svn: 211427	2014-06-21 01:31:15 +00:00
Rafael Espindola	b4076b290e	Always use a temp symbol for CIE. Fixes pr19185. llvm-svn: 211423	2014-06-20 23:54:32 +00:00
Rafael Espindola	c3510c74f7	Use compact unwind for the iOS simulator. Another step in fixing pr19185. llvm-svn: 211416	2014-06-20 22:40:55 +00:00
Kevin Enderby	26646108c9	Fix some double printing of filenames for archives in llvm-nm when the tool is given multiple files. Also fix the same issue with Mach-O universal files. And fix the newline spacing to separate the output in these cases. llvm-svn: 211405	2014-06-20 21:29:27 +00:00
Rafael Espindola	b4357fc293	Don't produce eh_frame relocations when targeting the IOS simulator. First step for fixing pr19185. llvm-svn: 211404	2014-06-20 21:15:27 +00:00
Reid Kleckner	4a01230db4	Generate native unwind info on Win64 This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211399	2014-06-20 20:35:47 +00:00
Stepan Dyatkovskiy	6baeb8805c	Committed patch from Björn Steinbrink: Summary: Different range metadata can lead to different optimizations in later passes, possibly breaking the semantics of the merged function. So range metadata must be taken into consideration when comparing Load instructions. Thanks! llvm-svn: 211391	2014-06-20 19:11:56 +00:00
Ulrich Weigand	dbc8e1ae28	[RuntimeDyld] Support more PPC64 relocations This adds support for several missing PPC64 relocations in the straight-forward manner to RuntimeDyldELF.cpp. Note that this actually fixes a failure of a large-model test case on PowerPC, allowing the XFAIL to be removed. llvm-svn: 211382	2014-06-20 17:51:47 +00:00
Tom Stellard	ae4c9e7bc3	R600/SI: Add patterns for ctpop inside a branch llvm-svn: 211378	2014-06-20 17:06:11 +00:00
Tom Stellard	9c603ebca4	R600/SI: Add a pattern for f32 ftrunc llvm-svn: 211377	2014-06-20 17:06:09 +00:00
Tom Stellard	a79e9f0f6d	R600: Expand vector flog2 llvm-svn: 211376	2014-06-20 17:06:07 +00:00
Tom Stellard	5222a88653	R600: Expand vector fexp2 llvm-svn: 211375	2014-06-20 17:06:05 +00:00
Tom Stellard	c9dedb8e29	R600/SI: Add a VALU pattern for i64 xor llvm-svn: 211373	2014-06-20 17:05:57 +00:00
Ulrich Weigand	59c6ab20d6	[PowerPC] Fix small argument stack slot offset for LE When small arguments (structures < 8 bytes or "float") are passed in a stack slot in the ppc64 SVR4 ABI, they must reside in the least significant part of that slot. On BE, this means that an offset needs to be added to the stack address of the parameter, but on LE, the least significant part of the slot has the same address as the slot itself. This changes the PowerPC back-end ABI code to only add the small argument stack slot offset for BE. It also adds test cases to verify the correct behavior on both BE and LE. llvm-svn: 211368	2014-06-20 16:34:05 +00:00
Rafael Espindola	e5bb30d9a7	Move test so that it is skipped if the ARM target is not enabled. llvm-svn: 211366	2014-06-20 15:30:38 +00:00
Rafael Espindola	1fc003e6c5	Allow a target to create a null streamer. Targets can assume that a target streamer is present, so they have to be able to construct a null streamer in order to set the target streamer in it too. Fixes a crash when using the null streamer with arm. llvm-svn: 211358	2014-06-20 13:11:28 +00:00
Oliver Stannard	5dc2934ba2	Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size. Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size based on module flags metadata. llvm-svn: 211349	2014-06-20 10:08:11 +00:00
Zoran Jovanovic	6a29b55a5a	[mips][mips64r6] Added LSA/DLSA instructions Differential Revision: http://reviews.llvm.org/D3897 llvm-svn: 211346	2014-06-20 09:28:09 +00:00
Karthik Bhat	e03a25da70	Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer. This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and vectorizes them as vector shuffles if they are profitable. These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 llvm-svn: 211339	2014-06-20 04:32:48 +00:00
Hans Wennborg	4dc895164a	Don't build switch lookup tables for dllimport or TLS variables We would previously put dllimport variables in switch lookup tables, which doesn't work because the address cannot be used in a constant initializer. This is basically the same problem that we have in PR19955. Putting TLS variables in switch tables also doesn't work, because the address of such a variable is not constant. Differential Revision: http://reviews.llvm.org/D4220 llvm-svn: 211331	2014-06-20 00:38:12 +00:00
Kevin Enderby	14a96ac343	Added the -m option as an alias for -format=darwin to llvm-nm and llvm-size which is what the darwin tools use for the Mach-O format output. llvm-svn: 211326	2014-06-20 00:04:16 +00:00
Kevin Enderby	1e1b992ad7	Fix the output of llvm-nm for Mach-O files to use the characters ‘d’ and ‘b’ for data and bss symbols instead of the generic ‘s’ for a symbol in a section. llvm-svn: 211321	2014-06-19 22:49:21 +00:00
Rafael Espindola	a064b0c476	Set missing options in LTOCodeGenerator::setTargetOptions. Patch by Tom Roeder, I just added the test. llvm-svn: 211317	2014-06-19 22:14:12 +00:00
Kevin Enderby	1983fcf86c	Change the output of llvm-nm and llvm-size for Mach-O universal files (aka fat files) to print “ (for architecture XYZ)” for fat files with more than one architecture to be like what the darwin tools do for fat files. Also clean up the Mach-O printing of archive membernames in llvm-nm to use the darwin form of "libx.a(foo.o)". llvm-svn: 211316	2014-06-19 22:03:18 +00:00
Eric Christopher	b0a78ca11a	Since we're using DW_AT_string rather than DW_AT_strp for debug_info for assembly files we can't depend on the offset within the section after a string since it could be different between producers etc. Relax these tests accordingly. llvm-svn: 211308	2014-06-19 20:00:13 +00:00

25194 Commits