llvm-project

Commit Graph

Author	SHA1	Message	Date
Nico Rieck	a0abeb3548	Fix more broken CHECK lines llvm-svn: 201493	2014-02-16 13:28:39 +00:00
Nico Rieck	5ba5226ab9	Add extra CHECK prefix to tests with explicit prefix These tests mistakenly assume that CHECK is still available even if an explicit prefix is specified. llvm-svn: 201492	2014-02-16 13:28:15 +00:00
Nico Rieck	35a237d4ed	Actually call FileCheck in tests llvm-svn: 201491	2014-02-16 13:27:39 +00:00
Elena Demikhovsky	1fad075974	AVX-512: simpyfied BUILD_VECTOR for masks; fixed cmp/test sequence llvm-svn: 201487	2014-02-16 11:34:23 +00:00
Eric Christopher	4a74104933	Add a DIELoc class to cover the DW_FORM_exprloc set of expressions alongside DIEBlock and replace uses accordingly. Use DW_FORM_exprloc in DWARF4 and later code. Update testcases. Adding a DIELoc instead of using extra forms inside DIEBlock so that we can keep location expressions separate from other uses. No direct use at the moment, however, it's not a lot of code and using a separately named class keeps it somewhat more obvious what's going on in various locations. llvm-svn: 201481	2014-02-16 08:46:55 +00:00
Nico Rieck	7647178738	Fix broken CHECK lines llvm-svn: 201479	2014-02-16 07:31:05 +00:00
Saleem Abdulrasool	27304cb189	MCAsmParser: relax declaration parsing The Linux kernel defines empty macros for compatibility with ARM UAL syntax. The comma after the name is optional, and if present can be safely lexed. This improves compatibility with the GNU assembler. llvm-svn: 201474	2014-02-16 04:56:31 +00:00
Saleem Abdulrasool	49480bf01c	ARM IAS: (partially) support .arch_extension directive This adds a partial implementation of the .arch_extension directive to the integrated ARM assembler. There are a number of limitations to this implementation arising from the target backend support rather than the implementation itself. Namely, iWMMXT (v1 and v2), Maverick, and XScale support is not present in the ARM backend. Currently, there is no check for A-class only (needed for virt), and no ARMv6k detection (needed for os and sec). The remainder of the extensions are fully supported. llvm-svn: 201471	2014-02-16 00:16:41 +00:00
David Blaikie	f1a6dea82c	DebugInfo: Deduplicate entries in the fission address table This broke in r185459 while TLS support was being generalized to handle non-symbol TLS representations. I thought about/tried having an enum rather than a bool to track the TLS-ness of the address table entry, but namespaces and naming seemed more hassle than it was worth for only one caller that needed to specify this. llvm-svn: 201469	2014-02-15 19:34:03 +00:00
Arnold Schwaighofer	847d96142c	Revert "SCEVExpander: Try hard not to create derived induction variables in other loops" This reverts commit r201465. It broke an arm bot. llvm-svn: 201466	2014-02-15 18:16:56 +00:00
Arnold Schwaighofer	1e12f8563d	SCEVExpander: Try hard not to create derived induction variables in other loops During LSR of one loop we can run into a situation where we have to expand the start of a recurrence of a loop induction variable in this loop. This start value is a value derived of the induction variable of a preceeding loop. SCEV has cannonicalized this value to a different recurrence than the recurrence of the preceeding loop's induction variable (the type and/or step direction) has changed). When we come to instantiate this SCEV we created a second induction variable in this preceeding loop. This patch tries to base such derived induction variables of the preceeding loop's induction variable. This helps twolf on arm and seems to help scimark2 on x86. radar://15970709 llvm-svn: 201465	2014-02-15 17:11:56 +00:00
Craig Topper	34875ab0b5	Add opcode extension forms of MOV8ri/MOV16ri/MOV32ri. llvm-svn: 201463	2014-02-15 07:29:18 +00:00
David Blaikie	60e6386b87	DebugInfo: Implement DW_AT_stmt_list for type units Type units will share the statement list of their defining compile unit. This is a tradeoff that reduces .o debug info size at the cost of some linked debug info size (since the contents of those string tables won't be deduplicated along with the type unit) which seems right for now. llvm-svn: 201445	2014-02-14 23:58:13 +00:00
Quentin Colombet	867c550947	[CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if the transformation does not bring any immediate benefits and introduce an illegal operation. llvm-svn: 201439	2014-02-14 22:23:22 +00:00
Tom Stellard	728d4172df	TargetLowering: n * r where n > 2 should be an illegal addressing mode llvm-svn: 201433	2014-02-14 21:10:34 +00:00
David Blaikie	9acebfdd94	DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded Recommitting r201380 (reverted in r201389) Recommitting r201351 and r201355 (reverted in r201351 and r201355) We weren't emitting the an empty (header only) line table when the line table was empty - this made the DWARF invalid (the compile unit would point to the zero-size debug_lines section where there should've been an empty line table but there was nothing at all). Fix that, and as a consequence this works around/addresses PR18809. Also, we emit a non-empty line table to workaround a darwin linker bug, so XFAILing on darwin too. Also, mark the test as 'REQUIRES: object-emission' because it does. llvm-svn: 201429	2014-02-14 19:51:35 +00:00
Diego Novillo	5b5cf503b5	Support DWARF discriminators in object streamer. Summary: This adds support for emitting DWARF path discriminator values in the object streamer. It also changes the DWARF dumper to show discriminator values in the line table output. Reviewers: echristo CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2794 llvm-svn: 201427	2014-02-14 19:27:53 +00:00
Reed Kotler	4cdaa7d778	This patch has two main functions: 1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc. 2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more logical place to handle this and we have known for some time that we needed to move the code later and not implement it using inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list during the original putback of the Mips16HardFloat pass but was not practical for us do at that time. llvm-svn: 201426	2014-02-14 19:16:39 +00:00
Artyom Skrobov	f6830f47b8	Generate the DWARF stack frame decode operations in the function prologue for ARM/Thumb functions. Patch by Keith Walker! llvm-svn: 201423	2014-02-14 17:19:07 +00:00
Kevin Qin	edc95ee196	[AArch64 NEON] Fix a bug to avoid using floating type as condition type in lowering SELECT_CC. llvm-svn: 201395	2014-02-14 09:41:15 +00:00
Eric Christopher	abc621668d	Revert "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded" This reverts commit r201380 for now while we investigate. llvm-svn: 201389	2014-02-14 05:33:16 +00:00
NAKAMURA Takumi	602978ae0e	llvm/test/DebugInfo/empty.ll: Mark it as XFAIL:win32 lacking of line table. llvm-svn: 201388	2014-02-14 05:26:49 +00:00
NAKAMURA Takumi	b54440f9ed	[PR18809] Remove XFAIL from DebugInfo/empty.ll. I added it in r201211. llvm-svn: 201383	2014-02-14 03:59:43 +00:00
Hao Liu	7146ef8542	[AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node. As v1i1 is illegal, the type legalizer tries to scalarize such node. But if the type operands of SETCC is legal, the scalarization algorithm will cause an assertion failure. llvm-svn: 201381	2014-02-14 02:21:56 +00:00
David Blaikie	177585d1d9	DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded Recommitting r201351 and r201355 (reverted in r201351 and r201355) We weren't emitting the an empty (header only) line table when the line table was empty - this made the DWARF invalid (the compile unit would point to the zero-size debug_lines section where there should've been an empty line table but there was nothing at all). Fix that, and as a consequence this works around/addresses PR18809. llvm-svn: 201380	2014-02-14 01:57:59 +00:00
Eric Christopher	02dbadb3a0	Disable emission of aranges by default and add a command line option to enable again that will be matched with a commit to enable in clang. llvm-svn: 201378	2014-02-14 01:26:55 +00:00
Matt Arsenault	aa689f5079	Do more addrspacecast transforms that happen for bitcast. Makes addrspacecast (gep) do addrspacecast (gep) instead. llvm-svn: 201376	2014-02-14 00:49:12 +00:00
Tom Stellard	967bf5813f	R600/SI: Expand all v8[if]32 operations llvm-svn: 201371	2014-02-13 23:34:15 +00:00
Tom Stellard	f16d38cbb5	R600/SI: Add a pattern for i32 anyext Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370	2014-02-13 23:34:13 +00:00
Tom Stellard	6c7a7e82a7	R600/SI: Completely Disable TypeRewriter on compute llvm-svn: 201369	2014-02-13 23:34:12 +00:00
Tom Stellard	80be9650e3	R600/SI: Split global vector loads with more than 4 elements llvm-svn: 201368	2014-02-13 23:34:10 +00:00
Tom Stellard	60b6153044	R600/SI: Add ShaderType attribute to some tests llvm-svn: 201367	2014-02-13 23:34:07 +00:00
Rafael Espindola	1f3de49f37	Use __literal16. It has been supported by the linker since 2005. llvm-svn: 201365	2014-02-13 23:16:11 +00:00
Diego Novillo	8d9be9a4cf	Simplify checks in MC/AsmParser/directive_loc.s llvm-svn: 201361	2014-02-13 20:16:42 +00:00
Diego Novillo	b1b5007c52	Fix generation of 'isa' and 'discriminator' keywords. Summary: There should be a space before each of these two keywords to avoid generating invalid assembly files. NOTE: I could not find an obvious maintainers in CODE_OWNERS.TXT, but this seems related to debug info. Reviewers: echristo CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2791 llvm-svn: 201359	2014-02-13 20:05:03 +00:00
NAKAMURA Takumi	f71dc448b9	Tweak llvm/test/DebugInfo/X86/generate-odr-hash.ll corresponding to r201351 (Revert r201187). llvm-svn: 201355	2014-02-13 18:28:28 +00:00
NAKAMURA Takumi	4f2a067df1	[PR18809] Revert r201187, "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded" It really crashes cygwin's stage2 configure with "clang -g". llvm-svn: 201351	2014-02-13 18:18:56 +00:00
Rafael Espindola	8459762c88	Add triples to try to fix the windows bots. llvm-svn: 201345	2014-02-13 16:49:47 +00:00
Rafael Espindola	98f0cfdd90	.file is only available on ELF, use a triple instead of -march. llvm-svn: 201337	2014-02-13 15:38:16 +00:00
Rafael Espindola	e6f37b908a	"foo" is not a ppc instruction, don't try to parse it. llvm-svn: 201336	2014-02-13 15:33:35 +00:00
Rafael Espindola	432acf5cef	Specify a triple. MachO AArch64 support is missing. llvm-svn: 201335	2014-02-13 15:30:06 +00:00
Daniel Sanders	753e17629d	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333	2014-02-13 14:44:26 +00:00
NAKAMURA Takumi	6fe94621b9	llvm/test/CodeGen/AArch64/cpus.ll: Tweak to use -mtriple=aarch64-unknown-unknown, or this would crash for targeting pecoff like *-mingw32. llvm-svn: 201315	2014-02-13 11:06:23 +00:00
Tim Northover	914af6273b	ARM: remove floating-point patterns for @llvm.arm.neon.vabs The front-end is now generating the generic @llvm.fabs for this operation now, so the extra patterns are no longer needed. llvm-svn: 201314	2014-02-13 10:44:30 +00:00
Oliver Stannard	5bbb72f37e	Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend llvm-svn: 201305	2014-02-13 09:46:11 +00:00
Hao Liu	7b6dfcf06a	[AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types. As this problems are similar to shl/sra/srl, also add patterns for shift nodes. llvm-svn: 201298	2014-02-13 05:42:33 +00:00
Rafael Espindola	75ec01f3de	Copy dll storage in copyAttributes. llvm-svn: 201295	2014-02-13 05:11:35 +00:00
Juergen Ributzka	2b97f9b211	[DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder. This fix checks the original LLVM IR node to identify opaque constants by looking for the bitcast-constant pattern. Originally we looked at the generated SDNode, but this might lead to incorrect results. The SDNode could have been generated by an constant expression that was folded to a constant. This fixes <rdar://problem/16050719> llvm-svn: 201291	2014-02-13 04:19:26 +00:00
Hao Liu	4f345f3c03	[AArch64]Add support for spilling FPR8/FPR16. llvm-svn: 201287	2014-02-13 02:36:58 +00:00
Reid Kleckner	22b19da9fc	GlobalOpt: Aliases don't have sections, don't copy them when replacing As defined in LangRef, aliases do not have sections. However, LLVM's GlobalAlias class inherits from GlobalValue, which means we can read and set its section. We should probably ban that as a separate change, since it doesn't make much sense for an alias to have a section that differs from its aliasee. Fixes PR18757, where the section was being lost on the global in code from Clang like: extern "C" { __attribute__((used, section("CUSTOM"))) static int in_custom_section; } Reviewers: rafael.espindola Differential Revision: http://llvm-reviews.chandlerc.com/D2758 llvm-svn: 201286	2014-02-13 02:18:36 +00:00
Owen Anderson	883b5add8e	Remove a very old instcombine where we would turn sequences of selects into logical operations on the i1's driving them. This is a bad idea for every target I can think of (confirmed with micro tests on all of: x86-64, ARM, AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into a general purpose register, whereas consuming it directly into a select generally allows it to exist only transiently in a predicate or flags register. Chandler ran a set of performance tests with this change, and reported no measurable change on x86-64. llvm-svn: 201275	2014-02-12 23:54:07 +00:00
Andrea Di Biagio	b7882b3bd1	[Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called 'OK_NonUniformConstValue' to identify operands which are constants but not constant splats. The cost model now allows returning 'OK_NonUniformConstValue' for non splat operands that are instances of ConstantVector or ConstantDataVector. With this change, targets are now able to compute different costs for instructions with non-uniform constant operands. For example, On X86 the cost of a vector shift may vary depending on whether the second operand is a uniform or non-uniform constant. This patch applies the following changes: - The cost model computation now takes into account non-uniform constants; - The cost of vector shift instructions has been improved in X86TargetTransformInfo analysis pass; - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish between non-uniform and uniform constant operands. Added a new test to verify that the output of opt '-cost-model -analyze' is valid in the following configurations: SSE2, SSE4.1, AVX, AVX2. llvm-svn: 201272	2014-02-12 23:43:47 +00:00
Andrea Di Biagio	386d566395	[X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it. Instead of expanding a packed shift into a sequence of scalar shifts, the backend now tries (when possible) to convert the vector shift into a vector multiply. Before this change, a shift of a MVT::v8i16 vector by a build_vector of constants was always scalarized into a long sequence of "vector extracts + scalar shifts + vector insert". With this change, if there is SSE2 support, we emit a single vector multiply. This change also affects SSE4.1, AVX, AVX2 shifts: - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants is now lowered when possible into a single SSE4.1 vector multiply. - Packed v16i16 shift left by constant build_vector are now expanded when possible into a single AVX2 vpmullw. This change also improves the lowering of AVX512f vector shifts. Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected by this change. llvm-svn: 201271	2014-02-12 23:42:28 +00:00
David Blaikie	3c7e961b29	DebugInfo: Demonstrate that we're not currently uniquing address table entries in fission Since I just discovered this while poking at other things, here's the test case so I have it to come back to later. llvm-svn: 201267	2014-02-12 23:03:54 +00:00
David Blaikie	23afec97ba	DebugInfo: Merge fission and non-fission (and 32 and 64 bit) tests for TLS support. llvm-svn: 201266	2014-02-12 23:03:51 +00:00
Adrian Prantl	7199fd532c	Debug info: Bugfix for r201190: DW_OP_piece takes bytes, not bits. rdar://problem/16015314 llvm-svn: 201253	2014-02-12 19:34:44 +00:00
Akira Hatanaka	a07ffb5b31	Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to preserve branch probability information. <rdar://problem/15893208> llvm-svn: 201245	2014-02-12 18:09:18 +00:00
Renato Golin	c2ae1b1b41	PC-rel implemented in AArch64, test now pass llvm-svn: 201243	2014-02-12 17:17:41 +00:00
Daniel Sanders	abe212a3b8	Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241	2014-02-12 15:39:20 +00:00
Daniel Sanders	a7d504cf58	Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237	2014-02-12 14:44:54 +00:00
NAKAMURA Takumi	2dfe92e010	[PR18809] Mark DebugInfo/empty.ll as XFAIL:cygming. llvm-svn: 201211	2014-02-12 07:15:05 +00:00
Craig Topper	522af29465	Test case I forgot to 'add' for r201126. llvm-svn: 201207	2014-02-12 03:58:47 +00:00
David Blaikie	5b85858b77	DwarfUnit: Include type unit's file strings in the defining compile unit's file_names table There's still one piece missing here, which is adding the DW_AT_stmt_list to the type unit that refer's to the compile unit's line table. Working on that. llvm-svn: 201198	2014-02-12 00:40:47 +00:00
Evan Cheng	57add3e4ee	Tweak ARM fastcc by adopting these two AAPCS rules: * CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers * When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g. 7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various calling conventions. rdar://16039676 llvm-svn: 201193	2014-02-11 23:49:31 +00:00
Adrian Prantl	cbcd578f0c	Reapply r201180 with an additional error path. Debug info: Emit values in subregisters that do not have a separate DWARF register number by emitting a super-register + DW_OP_bit_piece. This is necessary because on x86_64, there are no DWARF register numbers for i386-style subregisters. Fixes a bunch of FIXMEs. rdar://problem/16015314 llvm-svn: 201190	2014-02-11 22:22:15 +00:00
Adrian Prantl	80b6fd02fa	Revert "Debug info: Emit values in subregisters that do not have a separate" This reverts commit r201179 for buildbot breakage. llvm-svn: 201188	2014-02-11 22:03:30 +00:00
David Blaikie	284cfc1089	DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded This comes up in empty files or files containing #file directives that never reference the actual source file name. Came up in a small test of line tables I was playing with. llvm-svn: 201187	2014-02-11 21:49:46 +00:00
David Blaikie	6bd395f3f0	DebugInfo: Remove dependence on file numbering in the line table. These tests were unnecessarily sensitive to the presence and ordering of elements in the line table file_names list which will break on a future change I'm working on. llvm-svn: 201185	2014-02-11 21:46:46 +00:00
Adrian Prantl	a83cc8a356	Debug info: Emit values in subregisters that do not have a separate DWARF register number by emitting a super-register + DW_OP_bit_piece. This is necessary because on x86_64, there are no DWARF register numbers for i386-style subregisters. Fixes a bunch of FIXMEs. rdar://problem/16015314 llvm-svn: 201180	2014-02-11 21:22:59 +00:00
Matt Arsenault	71b71d25eb	R600/SI: Fix assertion on infinite loops. This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177	2014-02-11 21:12:38 +00:00
Benjamin Kramer	94fc18d040	InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2. This happens in bitfield code. While there reorganize the existing code a bit. llvm-svn: 201176	2014-02-11 21:09:03 +00:00
Jim Grosbach	7422074df0	Tidy up a bit. Formatting only. llvm-svn: 201174	2014-02-11 20:48:41 +00:00
Jim Grosbach	8bfcb735fa	ARM: Thumb2 LDR(literal) can target SP. Fix a slightly overzealous destination register restriction for the 'without .w' alias. Add some explicit testcases. rdar://16033140 llvm-svn: 201173	2014-02-11 20:48:39 +00:00
Benjamin Kramer	5a188549ad	ScalarEvolution: Analyze trip count of loops with a switch guarding the exit. llvm-svn: 201159	2014-02-11 15:44:32 +00:00
Robert Lougher	7d9084ffa1	Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes, e.g.: (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4)) -> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4) This fixes an issue with AVX, where a sequence was not recognized as a 256-bit vbroadcast due to the concat_vectors. llvm-svn: 201158	2014-02-11 15:42:46 +00:00
Chandler Carruth	fc25854b09	[LPM] Switch LICM to actively use LCSSA in addition to preserving it. Fixes PR18753 and PR18782. This is necessary for LICM to preserve LCSSA correctly and efficiently. There is still some active discussion about whether we should be using LCSSA, but we can't just immediately stop using it and we need LICM to preserve it while we are using it. We can restore the old SSAUpdater driven code if and when there is a serious effort to remove the reliance on LCSSA from all of the loop passes. However, this also serves as a great example of why LCSSA is very nice to have. This change significantly simplifies the process of sinking instructions for LICM, and makes it quite a bit less expensive. It wouldn't even be as complex as it is except that I had to start the process of removing the big recursive LCSSA formation hammer in order to switch even this much of the re-forming code to asserting that LCSSA was preserved. I'll fully remove that next just to tidy things up until the LCSSA debate settles one way or the other. llvm-svn: 201148	2014-02-11 12:52:27 +00:00
Robert Lytton	70b5ba49c3	XCore target: fix const section handling Xcore target ABI requires const data that is externally visible to be handled differently if it has C-language linkage rather than C++ language linkage. Clang now emits ".cp.rodata" section information. All other externally visible constant data will be placed in the DP section. llvm-svn: 201144	2014-02-11 10:36:26 +00:00
Robert Lytton	9b6bb461b1	XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE llvm-svn: 201143	2014-02-11 10:36:18 +00:00
Elena Demikhovsky	1f32c313f1	AVX: fixed a bug in LowerVECTOR_SHUFFLE llvm-svn: 201140	2014-02-11 10:21:53 +00:00
Elena Demikhovsky	2aafc22ed9	AVX-512: Optimized BUILD_VECTOR pattern; fixed encoding of VEXTRACTPS instruction. llvm-svn: 201134	2014-02-11 07:25:59 +00:00
Quentin Colombet	64ca544d95	[CodeGenPrepare] Test case for the promotions that bypass the profitability check due to some other checks in the addressing mode matcher. I.e., test case for commit r201121. <rdar://problem/16020230> llvm-svn: 201132	2014-02-11 06:55:43 +00:00
Tom Stellard	5d7aaaed7d	R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used DS instructions that access local memory can only uses addresses that are less than or equal to the value of M0. When M0 is uninitialized, then we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097	2014-02-10 16:58:30 +00:00
Tim Northover	b0430415e6	ARM: use natural LLVM IR for vshll instructions Similarly to the vshrn instructions, these are simple zext/sext + trunc operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093	2014-02-10 16:20:29 +00:00
Chad Rosier	bcde0c49cb	[AArch64] Handle aliases of conditional branches without b.pred form. llvm-svn: 201091	2014-02-10 15:43:11 +00:00
Oliver Stannard	8dcaa761a2	ARM: r12 is callee-saved for interrupt handlers For A- and R-class processors, r12 is not normally callee-saved, but is for interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker". llvm-svn: 201089	2014-02-10 14:24:23 +00:00
Tim Northover	170daafe01	ARM: use LLVM IR to represent the vshrn operation vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085	2014-02-10 14:04:07 +00:00
Robert Lougher	48ee75b7e3	Test commit - added a new line to vec_shuf-insert.ll. llvm-svn: 201083	2014-02-10 12:42:13 +00:00
Matheus Almeida	4b27eb588c	[mips][msa] Add DLSA instruction. llvm-svn: 201081	2014-02-10 12:05:17 +00:00
Matheus Almeida	b4133b25e7	[mips][msa] Update FileCheck prefix in preparation for the addition of Mips64 tests. No functional changes. llvm-svn: 201080	2014-02-10 11:30:09 +00:00
Kostya Serebryany	8baa386670	[asan] support for FreeBSD, LLVM part. patch by Viktor Kutuzov llvm-svn: 201067	2014-02-10 07:37:04 +00:00
Elena Demikhovsky	9f423d6f25	AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors. llvm-svn: 201066	2014-02-10 07:02:39 +00:00
Craig Topper	a0869dceea	Recommit r201059 and r201060 with hopefully a fix for its original failure. Original commits messages: Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code. Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information. llvm-svn: 201065	2014-02-10 06:55:41 +00:00
Hao Liu	6e73761dc8	[AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg. llvm-svn: 201061	2014-02-10 03:16:22 +00:00
Benjamin Kramer	9d94a4eee9	AsmParser: Parse (and ignore) nested .macro definitions. This enables a slightly odd feature of gas. The macro is defined when the outermost macro is instantiated. PR18599 llvm-svn: 201045	2014-02-09 16:22:00 +00:00
Saleem Abdulrasool	40726a1c11	ARM: change attribute tests to use parsed form This makes the tests more readable by using the -arm-attributes decoding support in llvm-readobj since that is now available. Change the invocation commands to be similar to other test and use a more precise triple (the tests only require ARM EABI support). llvm-svn: 201029	2014-02-08 23:17:02 +00:00
Arnold Schwaighofer	348e1b60be	LoopVectorizer: Keep track of conditional store basic blocks Before conditional store vectorization/unrolling we had only one vectorized/unrolled basic block. After adding support for conditional store vectorization this will not only be one block but multiple basic blocks. The last block would have the back-edge. I updated the code to use a vector of basic blocks instead of a single basic block and fixed the users to use the last entry in this vector. But, I forgot to add the basic blocks to this vector! Fixes PR18724. llvm-svn: 201028	2014-02-08 20:41:13 +00:00
Juergen Ributzka	9479b31f97	[Constant Hoisting] Fix insertion point for constant materialization. The bitcast instruction during constant materialization was not placed correcly in the presence of phi nodes. This commit fixes the insertion point to be in the idom instead. This fixes PR18768 llvm-svn: 201009	2014-02-08 00:20:49 +00:00
Rafael Espindola	5054362920	Always create a temporary symbol to use with the cfi frame. This is a small simplification and a small step in fixing pr18743 since private functions on MachO should be using a 'l' prefix. llvm-svn: 200994	2014-02-07 21:23:18 +00:00
Rafael Espindola	e393aab13b	Use FileCheck variables to simplify this test. llvm-svn: 200992	2014-02-07 21:11:33 +00:00
Renato Golin	3dc5ade8bb	Fix Darwin bots from EHABI change llvm-svn: 200990	2014-02-07 20:32:32 +00:00
Matt Arsenault	9ad27e8d76	R600/SI: Add failing test for 3 x i64 vectors. Stores of <4 x i64> do work (although they do expand to 4 stores instead of 2), but 3 x i64 vectors fail to select. llvm-svn: 200989	2014-02-07 20:29:40 +00:00
Renato Golin	78a6eba862	Remove -arm-disable-ehabi option llvm-svn: 200988	2014-02-07 20:12:49 +00:00
Rafael Espindola	66f273be34	Don't internalize linkonce_odr non constant variables. llvm-svn: 200983	2014-02-07 19:04:43 +00:00
Sasa Stankovic	4c80bdae72	[mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl. Differential Revision: http://llvm-reviews.chandlerc.com/D2694 llvm-svn: 200978	2014-02-07 17:16:40 +00:00
Rafael Espindola	61acf5d9b0	Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it. Thanks to John McCall for noticing it. llvm-svn: 200977	2014-02-07 16:21:30 +00:00
Oliver Stannard	1dc1034218	LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack According to the AAPCS, when a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable. I have also modified the rules for allocating non-CPRCs to the stack, to make it more explicit that all GPRs must be made unavailable. I cannot think of a case where the old version would produce incorrect answers, so there is no test for this. llvm-svn: 200970	2014-02-07 11:19:53 +00:00
Sasa Stankovic	6a284eecec	Changed comment. llvm-svn: 200969	2014-02-07 11:16:02 +00:00
Venkatraman Govindaraju	de98fae368	[Sparc] Add support for parsing synthetic instruction 'mov'. llvm-svn: 200965	2014-02-07 09:06:52 +00:00
Venkatraman Govindaraju	ced9226b0f	[Sparc] Emit correct encoding for atomic instructions. Also, add support for parsing CAS instructions to test the CAS encoding. llvm-svn: 200963	2014-02-07 07:34:49 +00:00
Venkatraman Govindaraju	fd07500dd1	[Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used. llvm-svn: 200962	2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju	104643d0aa	[Sparc] Emit correct relocations for PIC code when integrated assembler is used. llvm-svn: 200961	2014-02-07 04:24:35 +00:00
Manman Ren	37c9267107	PGO branch weight: fix PR18752. Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors by getting the weights before removing a successor. llvm-svn: 200958	2014-02-07 00:38:56 +00:00
Jim Grosbach	e9008de652	X86: Resolve a long standing FIXME and properly isel pextr[bw]. Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use them to match the relevant pextr store instructions. The test widen_load-2.ll requires a slight change because with the stores gone, the remaining instructions are scheduled in a different order. Add test cases for SSE4 and AVX variants. Resolves rdar://13414672. Patch by Adam Nemet <anemet@apple.com>. llvm-svn: 200957	2014-02-07 00:16:33 +00:00
Rafael Espindola	803fb10874	Convert test to FileCheck. llvm-svn: 200955	2014-02-06 23:35:22 +00:00
Quentin Colombet	3a4bf0405e	[CodeGenPrepare] Move away sign extensions that get in the way of addressing mode. Basically the idea is to transform code like this: %idx = add nsw i32 %a, 1 %sextidx = sext i32 %idx to i64 %gep = gep i8* %myArray, i64 %sextidx load i8* %gep Into: %sexta = sext i32 %a to i64 %idx = add nsw i64 %sexta, 1 %gep = gep i8* %myArray, i64 %idx load i8* %gep That way the computation can be folded into the addressing mode. This transformation is done as part of the addressing mode matcher. If the matching fails (not profitable, addressing mode not legal, etc.), the matcher will revert the related promotions. <rdar://problem/15519855> llvm-svn: 200947	2014-02-06 21:44:56 +00:00
Tom Stellard	e236794578	R600/SI: Add a MUBUF store pattern for Reg+Imm offsets llvm-svn: 200935	2014-02-06 18:36:41 +00:00
Tom Stellard	2937cbc005	R600/SI: Add a MUBUF store pattern for Imm offsets llvm-svn: 200934	2014-02-06 18:36:39 +00:00
Tom Stellard	11624bc577	R600/SI: Add a MUBUF load pattern for Reg+Imm offsets llvm-svn: 200933	2014-02-06 18:36:38 +00:00
Tom Stellard	044e418f15	R600/SI: Use immediates offsets for SMRD instructions whenever possible There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932	2014-02-06 18:36:34 +00:00
Tim Northover	f0e21616f3	X86: add costs for 64-bit vector ext/trunc & rebalance The most important part of this is probably adding any cost at all for operations like zext <8 x i8> to <8 x i32>. Before they were being recorded as extremely costly (24, I believe) which made LLVM fall back on a 4-wide vectorisation of a loop. It also rebalances the values for sext, zext and trunc. Lacking any other sane metric that might work across CPU microarchitectures I went for instructions. This seems to be in reasonable accord with the rest of the table (sitofp, ...) though no doubt at least one value is sub-optimal for some bizarre reason. Finally, separate AVX and AVX2 values are provided where appropriate. The CodeGen is quite different in many cases. rdar://problem/15981990 llvm-svn: 200928	2014-02-06 18:18:36 +00:00
Nick Lewycky	993849490e	A memcpy out of an fresh alloca is a no-op, delete it. Patch by Patrick Walton! llvm-svn: 200907	2014-02-06 06:29:19 +00:00
Chandler Carruth	bf71a34eb9	[PM] Add a new "lazy" call graph analysis pass for the new pass manager. The primary motivation for this pass is to separate the call graph analysis used by the new pass manager's CGSCC pass management from the existing call graph analysis pass. That analysis pass is (somewhat unfortunately) over-constrained by the existing CallGraphSCCPassManager requirements. Those requirements make it really hard to cleanly layer the needed functionality for the new pass manager on top of the existing analysis. However, there are also a bunch of things that the pass manager would specifically benefit from doing differently from the existing call graph analysis, and this new implementation tries to address several of them: - Be lazy about scanning function definitions. The existing pass eagerly scans the entire module to build the initial graph. This new pass is significantly more lazy, and I plan to push this even further to maximize locality during CGSCC walks. - Don't use a single synthetic node to partition functions with an indirect call from functions whose address is taken. This node creates a huge choke-point which would preclude good parallelization across the fanout of the SCC graph when we got to the point of looking at such changes to LLVM. - Use a memory dense and lightweight representation of the call graph rather than value handles and tracking call instructions. This will require explicit update calls instead of some updates working transparently, but should end up being significantly more efficient. The explicit update calls ended up being needed in many cases for the existing call graph so we don't really lose anything. - Doesn't explicitly model SCCs and thus doesn't provide an "identity" for an SCC which is stable across updates. This is essential for the new pass manager to work correctly. - Only form the graph necessary for traversing all of the functions in an SCC friendly order. This is a much simpler graph structure and should be more memory dense. It does limit the ways in which it is appropriate to use this analysis. I wish I had a better name than "call graph". I've commented extensively this aspect. This is still very much a WIP, in fact it is really just the initial bits. But it is about the fourth version of the initial bits that I've implemented with each of the others running into really frustrating problms. This looks like it will actually work and I'd like to split the actual complexity across commits for the sake of my reviewers. =] The rest of the implementation along with lots of wiring will follow somewhat more rapidly now that there is a good path forward. Naturally, this doesn't impact any of the existing optimizer. This code is specific to the new pass manager. A bunch of thanks are deserved for the various folks that have helped with the design of this, especially Nick Lewycky who actually sat with me to go through the fundamentals of the final version here. llvm-svn: 200903	2014-02-06 04:37:03 +00:00
Juergen Ributzka	fa0eba6c8b	[DAG] Don't pull the binary operation though the shift if the operands have opaque constants. During DAGCombine visitShiftByConstant assumes that certain binary operations with only constant operands can always be folded successfully. This is no longer true when the constant is opaque. This commit fixes visitShiftByConstant by not performing the optimization for opaque constants. Otherwise we would end up in an infinite DAGCombine loop. llvm-svn: 200900	2014-02-06 04:09:06 +00:00
Manman Ren	d461244972	Set default of inlinecold-threshold to 225. 225 is the default value of inline-threshold. This change will make sure we have the same inlining behavior as prior to r200886. As Chandler points out, even though we don't have code in our testing suite that uses cold attribute, there are larger applications that do use cold attribute. r200886 + this commit intend to keep the same behavior as prior to r200886. We can later on tune the inlinecold-threshold. The main purpose of r200886 is to help performance of instrumentation based PGO before we actually hook up inliner with analysis passes such as BPI and BFI. For instrumentation based PGO, we try to increase inlining of hot functions and reduce inlining of cold functions by setting inlinecold-threshold. Another option suggested by Chandler is to use a boolean flag that controls if we should use OptSizeThreshold for cold functions. The default value of the boolean flag should not change the current behavior. But it gives us less freedom in controlling inlining of cold functions. llvm-svn: 200898	2014-02-06 01:59:22 +00:00
Kevin Enderby	d6b107136a	Update the X86 assembler for .intel_syntax to accept the << and >> bitwise operators. rdar://15975725 llvm-svn: 200896	2014-02-06 01:21:15 +00:00
Paul Robinson	af4e64d095	Disable most IR-level transform passes on functions marked 'optnone'. Ideally only those transform passes that run at -O0 remain enabled, in reality we get as close as we reasonably can. Passes are responsible for disabling themselves, it's not the job of the pass manager to do it for them. llvm-svn: 200892	2014-02-06 00:07:05 +00:00
Manman Ren	e8781b1a36	Inliner uses a smaller inline threshold for callees with cold attribute. Added command line option inlinecold-threshold to set threshold for inlining functions with cold attribute. Listen to the cold attribute when it would decrease the inline threshold. llvm-svn: 200886	2014-02-05 22:53:44 +00:00
Quentin Colombet	87769713cf	[RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register. The idea is to choose a color for the variable that cannot be allocated and recolor its interferences around. Unlike the current register allocation scheme, it is allowed to change the color of an already assigned (but maybe not splittable or spillable) live interval while propagating this change to its neighbors. In other word, there are two things that may help finding an available color: - Already assigned variables (RS_Done) can be recolored to different color. - The recoloring allows to catch solutions that needs to touch more that just the neighbors of the current allocated variable. E.g., vA can use {R1, R2 } vB can use { R2, R3} vC can use {R1 } Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and they all interfere. vA is assigned R1 vB is assigned R2 vC tries to evict vA but vA is already done. => Regular register allocation heuristic fails. Last chance recoloring kicks in: vC does as if vA was evicted => vC uses R1. vC is marked as fixed. vA needs to find a color. None are available. vA cannot evict vC: vC is a fixed virtual register now. vA does as if vB was evicted => vA uses R2. vB needs to find a color. R3 is available. Recoloring => vC = R1, vA = R2, vB = R3. <rdar://problem/15947839> llvm-svn: 200883	2014-02-05 22:13:59 +00:00
Rafael Espindola	b4eec1daa1	Remove support for not using .loc directives. Clang itself was not using this. The only way to access it was via llc. llvm-svn: 200862	2014-02-05 18:00:21 +00:00
Petar Jovanovic	9725016af3	[mips] Add NaCl target and forbid indexed loads and stores for it This patch adds NaCl target for Mips. It also forbids indexed loads and stores if the target is NaCl. Patch by Sasa Stankovic. Differential Revision: http://llvm-reviews.chandlerc.com/D2690 llvm-svn: 200855	2014-02-05 17:19:30 +00:00
Petar Jovanovic	f387387835	mips: XFAIL non-extern-addend-smallcodemodel test Small code model (and default reloc model) set Reloc::PIC_ in this test, and PIC is not yet supported in MCJIT for MIPS. llvm-svn: 200852	2014-02-05 16:47:59 +00:00
Elena Demikhovsky	0b79be8ab2	AVX-512: optimized icmp -> sext -> icmp pattern llvm-svn: 200849	2014-02-05 16:17:36 +00:00
Logan Chien	d5c48aa3d3	ARM: Resolve thumb_bl fixup in same MCFragment. In Thumb1 mode, bl instruction might be selected for branches between basic blocks in the function if the offset is greater than 2KB. However, this might cause SEGV because the destination symbol is not marked as thumb function and the execution mode will be reset to ARM mode. Since we are sure that these symbols are in the same data fragment, we can simply resolve these local symbols, and don't emit any relocation information for this bl instruction. llvm-svn: 200842	2014-02-05 14:15:16 +00:00
Elena Demikhovsky	a38114c45e	AVX-512: fixed a bug in EVEX encoding (the bug appeared after r200624) llvm-svn: 200837	2014-02-05 13:03:01 +00:00
Michel Danzer	5d26fdfcba	R600/SI: Add pattern for zero-extending i1 to i32 Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830	2014-02-05 09:48:05 +00:00
Kai Nacke	382c140567	ARM: Enable use of relocation type tlsldo in debug info for tls data. This fixes PR18554. Reviewers: Renato Golin, Keith Walker llvm-svn: 200826	2014-02-05 07:23:09 +00:00
Elena Demikhovsky	a30e437659	AVX-512: Added intrinsic for cvtph2ps. Added VPTESTNM instruction. Added a pattern to vselect (lit tests will follow). llvm-svn: 200823	2014-02-05 07:05:03 +00:00
Rafael Espindola	02eac9a270	Add a test for printing absolute symbols in ELF. llvm-svn: 200818	2014-02-05 04:36:47 +00:00
Rafael Espindola	f42c58d2ef	Small fix for llvm-nm handling of weak symbols on ELF (print 'v'). llvm-svn: 200808	2014-02-04 23:53:15 +00:00
Manman Ren	81f136f94a	Update testing case for r200806. llvm-svn: 200807	2014-02-04 23:53:12 +00:00
Rafael Espindola	fb66ef05f7	Add a test for common symbols in coff. llvm-svn: 200803	2014-02-04 23:18:52 +00:00
Benjamin Kramer	34f460ed29	SimplifyLibCalls: Push TLI through the exp2->ldexp transform. For the odd case of platforms with exp2 available but not ldexp. llvm-svn: 200795	2014-02-04 20:27:23 +00:00
Petar Jovanovic	a5da588b2f	[mips] Implement %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions Patch implements %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions for MIPS by creating target expression class MipsMCExpr. Patch by Sasa Stankovic. Differential Revision: http://llvm-reviews.chandlerc.com/D2592 llvm-svn: 200783	2014-02-04 18:41:57 +00:00
David Peixotto	b9b7362cdc	Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly This patch fixes the ldr-pseudo implementation to work when used in inline assembly. The fix is to move arm assembler constant pools from the ARMAsmParser class to the ARMTargetStreamer class. Previously we kept the assembler generated constant pools in the ARMAsmParser object. This does not work for inline assembly because a new parser object is created for each blob of inline assembly. This patch moves the constant pools to the ARMTargetStreamer class so that the constant pool will remain alive for the entire code generation process. An ARMTargetStreamer class is now required for the arm backend. There was no existing implementation for MachO, only Asm and ELF. Instead of creating an empty MachO subclass, we decided to make the ARMTargetStreamer a non-abstract class and provide default (llvm_unreachable) implementations for the non constant-pool related methods. Differential Revision: http://llvm-reviews.chandlerc.com/D2638 llvm-svn: 200777	2014-02-04 17:22:40 +00:00
Tom Stellard	0ec134f3d6	R600/SI: Custom lower i64 ISD::SELECT llvm-svn: 200774	2014-02-04 17:18:40 +00:00
Tom Stellard	bfebd1fc7e	R600: Enable vector fpow. The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773	2014-02-04 17:18:37 +00:00
Tim Northover	103e648d30	OS X: the correct function is __sincospif_stret, not __sincospi_stretf rdar://problem/13729466 llvm-svn: 200771	2014-02-04 16:28:20 +00:00
Tim Northover	fdbdb4b6d5	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Justin Bogner	c6af350698	llvm-cov: Implement the preserve-paths flag Until now, when a path in a gcno file included a directory, we would emit our .gcov file in that directory, whereas gcov always emits the file in the current directory. In doing so, this implements gcov's strange name-mangling -p flag, which is needed to avoid clobbering files when two with the same name exist in different directories. The path mangling is a bit ugly and only handles unix-like paths, but it's simple, and it doesn't make any guesses as to how it should behave outside of what gcov documents. If we decide this should be cross platform later, we can consider the compatibility implications then. llvm-svn: 200754	2014-02-04 10:45:02 +00:00
Tim Northover	e42fb07618	ARM: fix fast-isel assertion failure Missing braces on if meant we inserted both ARM and Thumb load for a litpool entry. This didn't end well. rdar://problem/15959157 llvm-svn: 200752	2014-02-04 10:38:46 +00:00
Michel Danzer	624b02aa67	R600/SI: Fix fneg for 0.0 V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743	2014-02-04 07:12:38 +00:00
Justin Bogner	f69557e670	llvm-cov: Implement the object-directory flag llvm-svn: 200741	2014-02-04 06:41:43 +00:00
Justin Bogner	93d1edbb7d	llvm-cov: Ignore missing .gcda files When gcov is run without gcda data, it acts as if the counts are all zero and labels the file as - to indicate that there was no data. We should do the same. llvm-svn: 200740	2014-02-04 06:41:39 +00:00
Justin Bogner	b5bff69ac6	llvm-cov: Document the llvm-cov tests llvm-svn: 200739	2014-02-04 06:41:33 +00:00
Kai Nacke	ab7ee461c2	Revert: ARM: Enable use of relocation type tlsldo in debug info for tls data. There seems to be a new problem with the debug info in the test case. I'll have to investigate this. llvm-svn: 200737	2014-02-04 06:07:00 +00:00
Kai Nacke	a56bb78021	Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls and remove the ToDo comment. Reviewer: Duncan P.N. Exan Smith llvm-svn: 200736	2014-02-04 05:55:16 +00:00
Kai Nacke	5e8c30f192	ARM: Enable use of relocation type tlsldo in debug info for tls data. This fixes PR18554. Reviewers: Renato Golin, Keith Walker llvm-svn: 200735	2014-02-04 05:43:09 +00:00
David Blaikie	5e390e4df7	DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} A bunch of test cases needed to be cleaned up for this, many my fault - when implementid imported modules I updated test cases by simply duplicating the prior metadata field - which wasn't always the empty metadata entry. llvm-svn: 200731	2014-02-04 01:23:52 +00:00
Reid Kleckner	d47a59a4f8	inalloca: Don't remove dead arguments in the presence of inalloca args It disturbs the layout of the parameters in memory and registers, leading to problems in the backend. The plan for optimizing internal inalloca functions going forward is to essentially SROA the argument memory and demote any captured arguments (things that aren't trivially written by a load or store) to an indirect pointer to a static alloca. llvm-svn: 200717	2014-02-03 20:42:49 +00:00
Tim Northover	24979d8e10	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Hal Finkel	5c968d9440	Expand vector bswap in LegalizeVectorOps ISD::BSWAP was missing from the list of node types that should be expanded element-wise. llvm-svn: 200705	2014-02-03 17:27:25 +00:00
Duncan P. N. Exon Smith	1ff08e389f	Lower llvm.expect intrinsic correctly for i1 LowerExpectIntrinsic previously only understood the idiom of an expect intrinsic followed by a comparison with zero. For llvm.expect.i1, the comparison would be stripped by the early-cse pass. Patch by Daniel Micay. llvm-svn: 200664	2014-02-02 22:43:55 +00:00
Arnold Schwaighofer	17455633c7	LoopVectorizer: Enable unrolling of conditional stores and the load/store unrolling heuristic per default Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed up some benchmarks while not causing any interesting regressions. llvm-svn: 200621	2014-02-02 03:12:34 +00:00
Matt Arsenault	6e63dd27a2	Add some xfailed R600 tests for 64-bit private accesses. llvm-svn: 200620	2014-02-02 00:13:12 +00:00
Matt Arsenault	f5958dded4	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju	52b6473d74	[Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly. llvm-svn: 200617	2014-02-01 18:54:16 +00:00
Arnold Schwaighofer	445f7fb064	ARMTTI: We don't have 16 allocatable scalar registers This caused an regression on libquantum after enabling the new loop vectorizer unroll heuristics. llvm-svn: 200616	2014-02-01 18:00:25 +00:00
David Woodhouse	6c9a6f9b3d	MC: Fix .octa output for APInts with BitWidth > 128 llvm-svn: 200615	2014-02-01 16:52:33 +00:00
David Woodhouse	d6de0d99c5	MC: Add support for .octa This is a minimal implementation which accepts only constants rather than full expressions, but that should be perfectly sufficient for all known users for now. Patch from PaX Team <pageexec@freemail.hu> llvm-svn: 200614	2014-02-01 16:20:59 +00:00
Chandler Carruth	1665152cce	[LPM] Apply a really big hammer to fix PR18688 by recursively reforming LCSSA when we promote to SSA registers inside of LICM. Currently, this is actually necessary. The promotion logic in LICM uses SSAUpdater which doesn't understand how to place LCSSA PHI nodes. Teaching it to do so would be a very significant undertaking. It may be worthwhile and I've left a FIXME about this in the code as well as starting a thread on llvmdev to try to figure out the right long-term solution. For now, the PR needs to be fixed. Short of using the promition SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes, I don't see a cleaner or cheaper way of achieving this. Fortunately, LCSSA is relatively lazy and sparse -- it should only update instructions which need it. We can also skip the recursive variant when we don't promote to SSA values. llvm-svn: 200612	2014-02-01 13:35:14 +00:00
Chandler Carruth	6b4cc8b66a	[inliner] Skip debug intrinsics even earlier in computing the inline cost so that they don't impact the vector bonus. Fundamentally, counting unsimplified instructions is just wrong; it will continue to introduce instability as things which do not generate code bizarrely impact inlining. For example, sufficiently nested inlined functions could turn off the vector bonus with lifetime markers just like the debug intrinsics do. =/ This is a short-term tactical fix. Long term, I think we need to remove the vector bonus entirely. That's a separate patch and discussion though. The patch to fix this provided by Dario Domizioli. I've added some comments about the planned direction and used a heavily pruned form of debug info intrinsics for the test case. While this debug info doesn't work or "do" anything useful, it lets us easily test all manner of interference easily, and I suspect this will not be the last time we want to craft a pattern where debug info interferes with the inliner in a problematic way. llvm-svn: 200609	2014-02-01 10:38:17 +00:00
David Majnemer	5a67e2b1b8	Update a .fill test to use the updated semantics. Something funny happened, this should've been part of r200606. llvm-svn: 200607	2014-02-01 07:36:52 +00:00
David Majnemer	522d3db745	MC: Improve the .fill directive's compatibility with GAS Per the GAS documentation, .fill should permit pattern widths that aren't a power of two. While I was in the neighborhood, I added some sanity checking. This change was motivated by a use of this construct in the Linux Kernel. llvm-svn: 200606	2014-02-01 07:19:38 +00:00
Reid Kleckner	a04504fe97	Revert "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..." This reverts commit r200576. It broke 32-bit self-host builds by vectorizing two calls to @llvm.bswap.i64, which we then fail to expand. llvm-svn: 200602	2014-02-01 01:37:30 +00:00
Josh Magee	24c7f06333	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 llvm-svn: 200601	2014-02-01 01:36:16 +00:00
Reid Kleckner	f5b76518c9	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596	2014-01-31 23:50:57 +00:00
Reid Kleckner	dfbed59cc2	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593	2014-01-31 23:45:12 +00:00
Lang Hames	b0bd489e4a	Split out small-code-model MCJIT testcase in order to xfail for AArch64, where PC-rel relocations aren't yet fully implemented. llvm-svn: 200592	2014-01-31 23:36:25 +00:00
Rafael Espindola	972e71ab5a	Remove another hasRawTextSupport. To remove this one simply move the end of file logic from the asm printer to the target mc streamer. This removes the last call to hasRawTextSupport from lib/Target. llvm-svn: 200590	2014-01-31 23:10:26 +00:00
Reid Kleckner	edb94c70c1	Set -mcpu to make this test pass on atom bots llvm-svn: 200588	2014-01-31 22:58:10 +00:00
Rafael Espindola	e9d1102545	Mark the first dynamic elf symbol as SF_FormatSpecific. llvm-svn: 200578	2014-01-31 21:40:13 +00:00
Lang Hames	5ec150c967	Replace X86 FMA intrinsic pseduo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Chandler Carruth	b3da389e30	[SLPV] Recognize vectorizable intrinsics during SLP vectorization and transform accordingly. Based on similar code from Loop vectorization. Subsequent commits will include vectorization of function calls to vector intrinsics and form function calls to vector library calls. Patch by Raul Silvera! (Much delayed due to my not running dcommit) llvm-svn: 200576	2014-01-31 21:14:40 +00:00
David Blaikie	322d79b4a2	DebugInfo: Flag type unit references as declarations This ensures DWARF consumers don't confuse these references for definitions. I'd argue it might be nice to improve debuggers so we don't need this, but it's just one field in an abbreviation anyway - so it doesn't seem worth the fight. llvm-svn: 200569	2014-01-31 19:52:26 +00:00
Reid Kleckner	1c843228f8	[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret' MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 llvm-svn: 200561	2014-01-31 17:41:22 +00:00
Matheus Almeida	1ace1f1236	[mips][msa] Add insert.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543	2014-01-31 13:31:20 +00:00
Matheus Almeida	8114cf70aa	Update FileCheck prefixes in preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200541	2014-01-31 13:05:56 +00:00
Chandler Carruth	c12224cb93	[vectorizer] Tweak the way we do small loop runtime unrolling in the loop vectorizer to not do so when runtime pointer checks are needed and share code with the new (not yet enabled) load/store saturation runtime unrolling. Also ensure that we only consider the runtime checks when the loop hasn't already been vectorized. If it has, the runtime check cost has already been paid. I've fleshed out a test case to cover the scalar unrolling as well as the vector unrolling and comment clearly why we are or aren't following the pattern. llvm-svn: 200530	2014-01-31 10:51:08 +00:00
Craig Topper	327f13b43b	Move address override handling in X86MCCodeEmitter to a place where it works for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode. llvm-svn: 200516	2014-01-31 05:33:45 +00:00
Manman Ren	413a6cb42b	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506	2014-01-31 01:10:35 +00:00
Manman Ren	4ece7452ba	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Matt Arsenault	ee364ee729	Allow speculating llvm.sqrt, fma and fmuladd This doesn't set errno, so this should be OK. Also update the documentation to explicitly state that errno are not set. llvm-svn: 200501	2014-01-31 00:09:00 +00:00
Timur Iskhodzhanov	f33d8b979b	Add a link to a bug to a couple of FIXMEs llvm-svn: 200500	2014-01-30 23:14:38 +00:00
David Woodhouse	0b6c94909e	[x86] Fix signed relocations for i64i32imm operands These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32. Kill the horrid and incomplete special case and FIXME in EncodeInstruction() and set things up so it can infer the signedness from the ImmType just like it can the size and whether it's PC-relative. llvm-svn: 200495	2014-01-30 22:20:41 +00:00
Chad Rosier	fe5ab2f5ba	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Timur Iskhodzhanov	3e4ac4eeca	Fix PR18381 - print a minimal diagnostic rather than assert on unresolved .secidx target llvm-svn: 200490	2014-01-30 21:13:05 +00:00
Rafael Espindola	196666cf5f	Only ELF has a dynamic symbol table. Remove it from ObjectFile. COFF has only one symbol table. MachO has a LC_DYSYMTAB, but that is not a symbol table, just extra info about the one symbol table (LC_SYMTAB). IR (coming soon) also has only one table. llvm-svn: 200488	2014-01-30 20:45:33 +00:00
Rafael Espindola	41186dd288	This has been fixed. llvm-svn: 200487	2014-01-30 20:29:25 +00:00
Juergen Ributzka	fb4d648295	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Re-applying the patch, but this time without using AsmPrinter methods. Reviewed by Andy llvm-svn: 200481	2014-01-30 18:58:27 +00:00
Timur Iskhodzhanov	4a83bf168f	Explicitly specify the CPU to avoid Atom-specific assembly mismatch llvm-svn: 200473	2014-01-30 17:53:45 +00:00
Evgeniy Stepanov	02bc78b9cd	Reenable ARM EHABI on Android. Broken in r200388. llvm-svn: 200466	2014-01-30 14:18:25 +00:00
Jakob Stoklund Olesen	ef1d59a175	Implement SPARCv9 atomic_swap_64 with a pseudo. The SWAP instruction only exists in a 32-bit variant, but the 64-bit atomic swap can be implemented in terms of CASX, like the other atomic rmw primitives. llvm-svn: 200453	2014-01-30 04:48:46 +00:00
Saleem Abdulrasool	4c4789be5e	ARM IAS: support .object_arch The .object_arch directive indicates an alternative architecture to be specified in the object file. The directive does not effect the enabled feature bits for the object file generation. This is particularly useful when the code performs runtime detection and would like to indicate a lower architecture as the requirements than the actual instructions used. llvm-svn: 200451	2014-01-30 04:46:41 +00:00
Saleem Abdulrasool	15d16d809b	tools: add support for decoding ARM attributes Enhance the ARM specific parsing support in llvm-readobj to support attributes. This allows for simpler tests to validate encoding of the build attributes as specified in the ARM ELF specification. llvm-svn: 200450	2014-01-30 04:46:33 +00:00
Saleem Abdulrasool	5d962d31f1	ARM IAS: support .movsp .movsp is an ARM unwinding directive that indicates to the unwinder that a register contains an offset from the current stack pointer. If the offset is unspecified, it defaults to zero. llvm-svn: 200449	2014-01-30 04:46:24 +00:00
Saleem Abdulrasool	56e06e8640	ARM: suuport .tlsdescseq directive This enhances the ARMAsmParser to handle .tlsdescseq directives. This is a slightly special relocation. We must be able to generate them, but not consume them in assembly. The relocation is meant to assist the linker in generating a TLS descriptor sequence. The ELF target streamer is enhanced to append additional fixups into the current segment and that is used to emit the new R_ARM_TLS_DESCSEQ relocations. llvm-svn: 200448	2014-01-30 04:02:47 +00:00
Saleem Abdulrasool	a3f12bdeec	ARM: support TLS descriptor relocations Add support for tlsdesc relocations which are part of the ABI, marked as experimental. These relocations permit the linker to perform TLS reference optimizations. llvm-svn: 200447	2014-01-30 04:02:38 +00:00
Saleem Abdulrasool	6e00ca887e	ARM: support tlscall relocations This adds support for TLS CALL relocations. TLS CALL relocations are used to indicate to the linker to generate appropriate entries to resolve TLS references via an appropriate function invocation (e.g. __tls_get_addr(PLT)). In order to accomodate the linker relaxation of the TLS access model for the references (GD/LD -> IE, IE -> LE), the relocation addend must be incomplete. This requires that the partial inplace value is also incomplete (i.e. 0). We simply avoid the offset value calculation at the time of the fixup adjustment in the ARM assembler backend. llvm-svn: 200446	2014-01-30 04:02:31 +00:00
Juergen Ributzka	f6f0ce903e	Revert "[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic." This reverts commit r200444 to unbreak buildbots. llvm-svn: 200445	2014-01-30 03:34:02 +00:00
Juergen Ributzka	aece7583a7	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Reviewed by Andy llvm-svn: 200444	2014-01-30 03:06:14 +00:00
Timur Iskhodzhanov	f166f6c8d0	Reland r200340 - 'Add line table debug info to COFF files when using a win32 triple' This incorporates a couple of fixes reviewed at http://llvm-reviews.chandlerc.com/D2651 llvm-svn: 200440	2014-01-30 01:39:17 +00:00
Manman Ren	7407e0e31c	Revert r200431 due to bot failures. llvm-svn: 200434	2014-01-30 00:53:27 +00:00
Rafael Espindola	9fa52155a9	Fix TLS handling in ELF's getAddress and llvm-nm to print 'D' for it. llvm-svn: 200433	2014-01-30 00:42:30 +00:00
Manman Ren	104e0c80cc	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431	2014-01-30 00:24:37 +00:00
Manman Ren	b681918ddd	PGO branch weight: update edge weights in IfConverter. This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428	2014-01-29 23:18:47 +00:00
Eric Christopher	8873adaa60	If we use DW_AT_ranges we need to specify a base address that ranges are relative to in the compile unit. Currently let's just use 0... Thanks to Greg Clayton for the catch! llvm-svn: 200425	2014-01-29 22:22:56 +00:00
Eric Christopher	fb8dd0085e	Turn on CU ranges if we've got multiple compile units in the same module since there's no range guarantee that we could make given output order. This also fixes up the testcases that have multiple CUs to have the correct range offset. llvm-svn: 200422	2014-01-29 22:06:27 +00:00
Justin Bogner	a99a3902a5	llvm-cov: Expect a source file as a positional parameter Currently, llvm-cov isn't command-line compatible with gcov, which accepts a source file name as its first parameter and infers the gcno and gcda file names from that. This change keeps our -gcda and -gcno options available for convenience in overriding this behaviour, but adds the required parameter and inference behaviour as a compatible default. llvm-svn: 200417	2014-01-29 21:31:34 +00:00
David Majnemer	91fc4c2169	MC: Better management of macro arguments The linux kernel makes uses of a GAS `feature' which substitutes nothing for macro arguments which aren't specified. Proper support for these kind of macro arguments necessitated a cleanup of differences between `GAS' and `Darwin' dialect macro processing. Differential Revision: http://llvm-reviews.chandlerc.com/D2634 llvm-svn: 200409	2014-01-29 18:57:46 +00:00
Arnold Schwaighofer	85a26704e9	LoopVectorizer: Add a test case for unrolling of small loops that need a runtime check. llvm-svn: 200408	2014-01-29 18:55:44 +00:00
Lang Hames	2cbbdf41c0	Add support for PC-relative non-extern relocations to RuntimeDyldMachO. Also replaces testcase for r180790 (support for absolute non-externs relocs) with a more robust version. <rdar://problem/15864721> llvm-svn: 200404	2014-01-29 18:31:35 +00:00
Matheus Almeida	ec079d9e1d	[mips][msa] Add fill.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400	2014-01-29 15:12:02 +00:00
Matheus Almeida	4cb577c614	[mips][msa] CHECK-DAG-ize MSA 2r_vector_scalar.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200399	2014-01-29 14:32:03 +00:00
Matheus Almeida	74070327b2	[mips][msa] Add copy_{u,s}.d. These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398	2014-01-29 14:05:28 +00:00
Matheus Almeida	a64f0600f3	[mips][msa] CHECK-DAG-ize MSA elm_copy.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200395	2014-01-29 13:51:34 +00:00
Chandler Carruth	d4be9dc02d	[LPM] Fix PR18643, another scary place where loop transforms failed to preserve loop simplify of enclosing loops. The problem here starts with LoopRotation which ends up cloning code out of the latch into the new preheader it is buidling. This can create a new edge from the preheader into the exit block of the loop which breaks LoopSimplify form. The code tries to fix this by splitting the critical edge between the latch and the exit block to get a new exit block that only the latch dominates. This sadly isn't sufficient. The exit block may be an exit block for multiple nested loops. When we clone an edge from the latch of the inner loop to the new preheader being built in the outer loop, we create an exiting edge from the outer loop to this exit block. Despite breaking the LoopSimplify form for the inner loop, this is fine for the outer loop. However, when we split the edge from the inner loop to the exit block, we create a new block which is in neither the inner nor outer loop as the new exit block. This is a predecessor to the old exit block, and so the split itself takes the outer loop out of LoopSimplify form. We need to split every edge entering the exit block from inside a loop nested more deeply than the exit block in order to preserve all of the loop simplify constraints. Once we try to do that, a problem with splitting critical edges surfaces. Previously, we tried a very brute force to update LoopSimplify form by re-computing it for all exit blocks. We don't need to do this, and doing this much will sometimes but not always overlap with the LoopRotate bug fix. Instead, the code needs to specifically handle the cases which can start to violate LoopSimplify -- they aren't that common. We need to see if the destination of the split edge was a loop exit block in simplified form for the loop of the source of the edge. For this to be true, all the predecessors need to be in the exact same loop as the source of the edge being split. If the dest block was originally in this form, we have to split all of the deges back into this loop to recover it. The old mechanism of doing this was conservatively correct because at least one of the exiting blocks it rewrote was the DestBB and so the DestBB's predecessors were fixed. But this is a much more targeted way of doing it. Making it targeted is important, because ballooning the set of edges touched prevents LoopRotate from being able to split edges it needs to split to preserve loop simplify in a coherent way -- the critical edge splitting would sometimes find the other edges in need of splitting but not others. Many, many thanks for help from Nick reducing these test cases mightily. And helping lots with the analysis here as this one was quite tricky to track down. llvm-svn: 200393	2014-01-29 13:16:53 +00:00
Renato Golin	8cea6e8fc6	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also remove the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and update some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
David Majnemer	ad5c1734e2	MC: Reorganize macro MC test along dialect lines This commit seeks to do two things: - Run the surfeit of tests under the Darwin dialect. This ends up affecting tests which assumed that spaces could deliminate arguments. - The GAS dialect tests should limit their surface area to things that could plausibly work under GAS. For example, Darwin style arguments have no business being in such a test. llvm-svn: 200383	2014-01-29 09:18:43 +00:00
Kostya Serebryany	6e15ec1442	[asan] simplify a test llvm-svn: 200378	2014-01-29 07:35:43 +00:00
Venkatraman Govindaraju	141d0e2221	[Sparc] Use %r_disp32 for pc_rel entries in FDE as well. This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo. llvm-svn: 200376	2014-01-29 06:59:20 +00:00
NAKAMURA Takumi	b366f01f83	Revert r200340, "Add line table debug info to COFF files when using a win32 triple." It was incompatible with --target=i686-win32. llvm-svn: 200375	2014-01-29 06:05:38 +00:00
Venkatraman Govindaraju	fd5c1f9497	[Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame. Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373	2014-01-29 04:51:35 +00:00
Chandler Carruth	66f0b16360	[LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered" because of the inside-out run of LoopSimplify in the LoopPassManager and the fact that LoopSimplify couldn't be "preserved" across two independent LoopPassManagers. Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI node because it thought it was rewriting (via SCEV) the incoming value to a loop invariant value. While it may well be invariant for the current loop, it may be rewritten in terms of an enclosing loop's values. This in and of itself is fine, as the LCSSA PHI node in the enclosing loop for the inner loop value we're rewriting will have its own LCSSA PHI node if used outside of the enclosing loop. With me so far? Well, the current loop and the enclosing loop may share an exiting block and exit block, and when they do they also share LCSSA PHI nodes. In this case, its not valid to RAUW through the LCSSA PHI node. Expected crazy test included. llvm-svn: 200372	2014-01-29 04:40:19 +00:00
Rafael Espindola	7a578026ae	We do use pipefail these days. Update the test. llvm-svn: 200370	2014-01-29 04:08:05 +00:00
Venkatraman Govindaraju	50f32d949b	[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368	2014-01-29 03:35:08 +00:00
Rafael Espindola	310f501ef0	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367	2014-01-29 02:30:38 +00:00
Kevin Qin	92d64d2d56	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
David Woodhouse	21bfc71752	[ARM] Remove superfluous inline asm mode switch test llvm-svn: 200361	2014-01-29 00:49:28 +00:00
David Woodhouse	7db3705f9e	Tests for mode switching 1. test that inlineasm works 2. test that relaxable instructions are re-encoded in the correct mode. llvm-svn: 200351	2014-01-28 23:13:30 +00:00
Timur Iskhodzhanov	7523743075	Disable the COFF tests on non-X86 archs llvm-svn: 200341	2014-01-28 21:47:33 +00:00
Timur Iskhodzhanov	2c659648b3	Add line table debug info to COFF files when using a win32 triple. Reviewed at http://llvm-reviews.chandlerc.com/D2232 llvm-svn: 200340	2014-01-28 21:33:27 +00:00
Matheus Almeida	2e03f24301	[mips] Fix ELF header flags. As opposed to GCC/GAS the default ABI for Mips64 is n64. Compatibility bit should be set if o32 ABI is used when targeting Mips64. llvm-svn: 200332	2014-01-28 19:24:11 +00:00
Gautam Chakrabarti	2c283400f9	[NVPTX] Fix emitting aggregate parameters The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325	2014-01-28 18:35:29 +00:00
Andrea Di Biagio	2ea61f17ad	[X86] Add extra rules for combining vselect dag nodes into movsd. This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) llvm-svn: 200324	2014-01-28 18:14:21 +00:00
Rafael Espindola	ab73c493ea	Fix pr14893. When simplifycfg moves an instruction, it must drop metadata it doesn't know is still valid with the preconditions changes. In particular, it must drop the range and tbaa metadata. The patch implements this with an utility function to drop all metadata not in a white list. llvm-svn: 200322	2014-01-28 16:56:46 +00:00
Andrea Di Biagio	b6d39afbda	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313	2014-01-28 12:53:56 +00:00
Chandler Carruth	b783628560	[vectorizer] Completely disable the block frequency guidance of the loop vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why its not really useful and a couple of ideas about how to better structure these types of heuristics. llvm-svn: 200294	2014-01-28 09:10:41 +00:00
Hal Finkel	4e703bcecd	Handle spilling the PPC GPRC_NOR0 register class GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288	2014-01-28 05:32:58 +00:00
Michel Danzer	bf1a641060	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Jakob Stoklund Olesen	83c677353b	Fix the DWARF EH encodings for Sparc PIC code. Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282	2014-01-28 02:52:26 +00:00

... 3 4 5 6 7 ...

23001 Commits