llvm-project

Commit Graph

Author	SHA1	Message	Date
NAKAMURA Takumi	0607c15435	llvm/test/CodeGen/X86/shift-pcmp.ll: Tweak to appease FileCheck. "CHECK-LABEL" doesn't identify labels magically and CHECK-LABEL behaves free from other contexts. For targeting pecoff, ".def foo" appears before ".short 32". .def foo; ... .LCPI0_0: .short 32 foo: CHECK-LABEL seeks not from ".short 32" but from the top of the input. llvm-svn: 201931	2014-02-22 07:27:04 +00:00
Quentin Colombet	1627a4159e	[CodeGenPrepare] Fix the check of the legality of an instruction. The API expects an ISD opcode, not an IR opcode. Fixes a regression for R600. Related to <rdar://problem/15519855>. llvm-svn: 201923	2014-02-22 01:06:41 +00:00
Quentin Colombet	4db08df18e	[DAGCombiner] PCMP* sets its result to all ones or zeros so we can AND with the shifted mask rather than masking and shifting separately. The patch adds this transformation to the DAGCombiner: (shl (and (setcc:i8v16 ...) N01C) N1C) -> (and (setcc:i8v16 ...) N01C<<N1C) <rdar://problem/16054492> Patch by Adam Nemet <anemet@apple.com> llvm-svn: 201906	2014-02-21 23:42:41 +00:00
Kevin Qin	07334d37de	[AArch64] Add register constraints to avoid generating STLXR and STXR with unpredictable behavior. llvm-svn: 201841	2014-02-21 07:45:48 +00:00
Oliver Stannard	7b2f2fba7f	AArch64: __va_list.__stack must be 8-byte aligned The va_start macro for AArch64 must set va_list.__stack to the address following the last named argument on the stack, rounded up to an alignment of 8 bytes. llvm-svn: 201797	2014-02-20 17:19:26 +00:00
Daniel Sanders	5a1449dab4	[mips] Make it impossible to have UnknownABI in CodeGen and Integrated Assembler. Summary: This removes the need to coerce UnknownABI to the default ABI (O32 for MIPS32, N64 for MIPS64 [*]) in both MipsSubtarget and MipsAsmParser. Clang has been updated to disable both possible default ABI's before enabling the ABI it intends to use. [*] N64 being the default for MIPS64 is not actually correct. However N32 is not fully implemented/tested yet. Depends on: D2830 Reviewers: jacksprat, matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2832 Differential Revision: http://llvm-reviews.chandlerc.com/D2846 llvm-svn: 201792	2014-02-20 14:58:19 +00:00
Daniel Sanders	e70897f3ea	[mips] Make mips64 the default CPU for the mips64 architecture Summary: This is consistent with the integrated assembler. All mips64 codegen tests previously passed -mcpu. Removed -mcpu from blez_bgez.ll and const-mult.ll to cover the default case. Ideally, the two implementations of selectMipsCPU() will be merged but it's proven difficult to find a home for the function that doesn't cause link errors. For now, we'll hoist the common functionality into a function and mark it with FIXME's. Reviewers: jacksprat, matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2830 llvm-svn: 201782	2014-02-20 13:13:33 +00:00
Elena Demikhovsky	2efed98b58	AVX-512: added a lit test for truncate operation llvm-svn: 201763	2014-02-20 07:34:13 +00:00
Roman Divacky	37136c0333	Expand 64bit {SHL,SHR,SRA}_PARTS on sparcv9. llvm-svn: 201718	2014-02-19 21:35:39 +00:00
Rafael Espindola	daeafb4c2a	Add back r201608, r201622, r201624 and r201625 r201608 made llvm correctly handle private globals with MachO. r201622 fixed a bug in it and r201624 and r201625 were changes for using private linkage, assuming that llvm would do the right thing. They all got reverted because r201608 introduced a crash in LTO. This patch includes a fix for that. The issue was that TargetLoweringObjectFile now has to be initialized before we can mangle names of private globals. This is trivially true during the normal codegen pipeline (the asm printer does it), but LTO has to do it manually. llvm-svn: 201700	2014-02-19 17:23:20 +00:00
Daniel Sanders	e15c1bbb69	[mips] Use multiple FileCheck prefixes rather than run the test multiple times llvm-svn: 201695	2014-02-19 16:27:36 +00:00
Venkatraman Govindaraju	55824b176a	[Sparc] Remove spurious checks from a testcase. llvm-svn: 201690	2014-02-19 15:57:49 +00:00
Cameron McInally	7b544f0297	Fix AVX512 vector sqrt assembly strings. llvm-svn: 201681	2014-02-19 15:16:09 +00:00
Daniel Jasper	7e198ad862	Revert r201622 and r201608. This causes the LLVMgold plugin to segfault. More information on the replies to r201608. llvm-svn: 201669	2014-02-19 12:26:01 +00:00
Rafael Espindola	b9ea63c551	Avoid an infinite cycle with private linkage and -f{data|function}-sections. When outputting an object we check its section to find its name, but when looking for the section with -ffunction-section we look for the symbol name. Break the loop by requesting a name with the private prefix when constructing the section name. This matches the behavior before r201608. llvm-svn: 201622	2014-02-19 01:28:30 +00:00
Rafael Espindola	09dcc6a536	Fix PR18743. The IR @foo = private constant i32 42 is valid, but before this patch we would produce an invalid MachO from it. It was invalid because it would use an L label in a section where the linker needs the labels in order to atomize it. One way of fixing it would be to just reject this IR in the backend, but that would not be very front end friendly. What this patch does is use an 'l' prefix in sections that we know the linker requires symbols for atomizing them. This allows frontends to just use private and not worry about which sections they go to or how the linker handles them. One small issue with this strategy is that now a symbol name depends on the section, which is not available before codegen. This is not a problem in practice. The reason is that it only happens with private linkage, which will be ignored by the non codegen users (llvm-nm and llvm-ar). llvm-svn: 201608	2014-02-18 22:24:57 +00:00
Ana Pazos	7c27a265dc	[AArch64] Expanded sin, cos, pow with FP vector types inputs llvm-svn: 201601	2014-02-18 20:31:05 +00:00
Robert Lytton	346e808ec6	XCore target: Handle common linkage llvm-svn: 201563	2014-02-18 11:21:59 +00:00
Robert Lytton	af6c256c34	XCore target: Fix llvm.eh.return and EH info register handling llvm-svn: 201561	2014-02-18 11:21:48 +00:00
Tim Northover	f06df5866f	X86: use vpsllvd (& friends) for 16-bit shifts on Haswell llvm-svn: 201558	2014-02-18 11:15:32 +00:00
Jiangning Liu	742c588edc	Fix a typo about lowering AArch64 va_copy. llvm-svn: 201541	2014-02-18 02:37:42 +00:00
Elena Demikhovsky	750498c77b	AVX-512: implemented zext from i1 to i16 llvm-svn: 201502	2014-02-17 07:29:33 +00:00
Mark Seaborn	be266aa325	Use 16 byte stack alignment for NaCl on ARM NaCl's ARM ABI uses 16 byte stack alignment, so set that in ARMSubtarget.cpp. Using 16 byte alignment exposes an issue in code generation in which a varargs function leaves a 4 byte gap between the values of r1-r3 saved to the stack and the following arguments that were passed on the stack. (Previously, this code only needed to support 4 byte and 8 byte alignment.) With this issue, llc generated: varargs_func: sub sp, sp, #16 push {lr} sub sp, sp, #12 add r0, sp, #16 // Should be 20 stm r0, {r1, r2, r3} ldr r0, .LCPI0_0 // Address of va_list add r1, sp, #16 str r1, [r0] bl external_func Fix the bug by checking for "Align > 4". Also simplify the code by using OffsetToAlignment(), and update comments. Differential Revision: http://llvm-reviews.chandlerc.com/D2677 llvm-svn: 201497	2014-02-16 18:59:48 +00:00
Nico Rieck	a0abeb3548	Fix more broken CHECK lines llvm-svn: 201493	2014-02-16 13:28:39 +00:00
Nico Rieck	5ba5226ab9	Add extra CHECK prefix to tests with explicit prefix These tests mistakenly assume that CHECK is still available even if an explicit prefix is specified. llvm-svn: 201492	2014-02-16 13:28:15 +00:00
Nico Rieck	35a237d4ed	Actually call FileCheck in tests llvm-svn: 201491	2014-02-16 13:27:39 +00:00
Elena Demikhovsky	1fad075974	AVX-512: simplified BUILD_VECTOR for masks; fixed cmp/test sequence llvm-svn: 201487	2014-02-16 11:34:23 +00:00
Nico Rieck	7647178738	Fix broken CHECK lines llvm-svn: 201479	2014-02-16 07:31:05 +00:00
Quentin Colombet	867c550947	[CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if the transformation does not bring any immediate benefits and introduce an illegal operation. llvm-svn: 201439	2014-02-14 22:23:22 +00:00
Tom Stellard	728d4172df	TargetLowering: n * r where n > 2 should be an illegal addressing mode llvm-svn: 201433	2014-02-14 21:10:34 +00:00
Reed Kotler	4cdaa7d778	This patch has two main functions: 1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc. 2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more logical place to handle this and we have known for some time that we needed to move the code later and not implement it using inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list during the original putback of the Mips16HardFloat pass but was not practical for us to do at that time. llvm-svn: 201426	2014-02-14 19:16:39 +00:00
Artyom Skrobov	f6830f47b8	Generate the DWARF stack frame decode operations in the function prologue for ARM/Thumb functions. Patch by Keith Walker! llvm-svn: 201423	2014-02-14 17:19:07 +00:00
Kevin Qin	edc95ee196	[AArch64 NEON] Fix a bug to avoid using floating type as condition type in lowering SELECT_CC. llvm-svn: 201395	2014-02-14 09:41:15 +00:00
Hao Liu	7146ef8542	[AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node. As v1i1 is illegal, the type legalizer tries to scalarize such node. But if the type operands of SETCC is legal, the scalarization algorithm will cause an assertion failure. llvm-svn: 201381	2014-02-14 02:21:56 +00:00
Tom Stellard	967bf5813f	R600/SI: Expand all v8[if]32 operations llvm-svn: 201371	2014-02-13 23:34:15 +00:00
Tom Stellard	f16d38cbb5	R600/SI: Add a pattern for i32 anyext Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370	2014-02-13 23:34:13 +00:00
Tom Stellard	6c7a7e82a7	R600/SI: Completely Disable TypeRewriter on compute llvm-svn: 201369	2014-02-13 23:34:12 +00:00
Tom Stellard	80be9650e3	R600/SI: Split global vector loads with more than 4 elements llvm-svn: 201368	2014-02-13 23:34:10 +00:00
Tom Stellard	60b6153044	R600/SI: Add ShaderType attribute to some tests llvm-svn: 201367	2014-02-13 23:34:07 +00:00
Rafael Espindola	1f3de49f37	Use __literal16. It has been supported by the linker since 2005. llvm-svn: 201365	2014-02-13 23:16:11 +00:00
Rafael Espindola	8459762c88	Add triples to try to fix the windows bots. llvm-svn: 201345	2014-02-13 16:49:47 +00:00
Rafael Espindola	98f0cfdd90	.file is only available on ELF, use a triple instead of -march. llvm-svn: 201337	2014-02-13 15:38:16 +00:00
Rafael Espindola	e6f37b908a	"foo" is not a ppc instruction, don't try to parse it. llvm-svn: 201336	2014-02-13 15:33:35 +00:00
Rafael Espindola	432acf5cef	Specify a triple. MachO AArch64 support is missing. llvm-svn: 201335	2014-02-13 15:30:06 +00:00
Daniel Sanders	753e17629d	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333	2014-02-13 14:44:26 +00:00
NAKAMURA Takumi	6fe94621b9	llvm/test/CodeGen/AArch64/cpus.ll: Tweak to use -mtriple=aarch64-unknown-unknown, or this would crash for targeting pecoff like *-mingw32. llvm-svn: 201315	2014-02-13 11:06:23 +00:00
Tim Northover	914af6273b	ARM: remove floating-point patterns for @llvm.arm.neon.vabs The front-end is now generating the generic @llvm.fabs for this operation now, so the extra patterns are no longer needed. llvm-svn: 201314	2014-02-13 10:44:30 +00:00
Oliver Stannard	5bbb72f37e	Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend llvm-svn: 201305	2014-02-13 09:46:11 +00:00
Hao Liu	7b6dfcf06a	[AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types. As these problems are similar to shl/sra/srl, also add patterns for shift nodes. llvm-svn: 201298	2014-02-13 05:42:33 +00:00
Juergen Ributzka	2b97f9b211	[DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder. This fix checks the original LLVM IR node to identify opaque constants by looking for the bitcast-constant pattern. Originally we looked at the generated SDNode, but this might lead to incorrect results. The SDNode could have been generated by a constant expression that was folded to a constant. This fixes <rdar://problem/16050719> llvm-svn: 201291	2014-02-13 04:19:26 +00:00
Hao Liu	4f345f3c03	[AArch64]Add support for spilling FPR8/FPR16. llvm-svn: 201287	2014-02-13 02:36:58 +00:00
Andrea Di Biagio	386d566395	[X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it. Instead of expanding a packed shift into a sequence of scalar shifts, the backend now tries (when possible) to convert the vector shift into a vector multiply. Before this change, a shift of a MVT::v8i16 vector by a build_vector of constants was always scalarized into a long sequence of "vector extracts + scalar shifts + vector insert". With this change, if there is SSE2 support, we emit a single vector multiply. This change also affects SSE4.1, AVX, AVX2 shifts: - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants is now lowered when possible into a single SSE4.1 vector multiply. - Packed v16i16 shift left by constant build_vector are now expanded when possible into a single AVX2 vpmullw. This change also improves the lowering of AVX512f vector shifts. Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected by this change. llvm-svn: 201271	2014-02-12 23:42:28 +00:00
Akira Hatanaka	a07ffb5b31	Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to preserve branch probability information. <rdar://problem/15893208> llvm-svn: 201245	2014-02-12 18:09:18 +00:00
Daniel Sanders	abe212a3b8	Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241	2014-02-12 15:39:20 +00:00
Daniel Sanders	a7d504cf58	Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237	2014-02-12 14:44:54 +00:00
Evan Cheng	57add3e4ee	Tweak ARM fastcc by adopting these two AAPCS rules: * CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers * When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g. 7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various calling conventions. rdar://16039676 llvm-svn: 201193	2014-02-11 23:49:31 +00:00
David Blaikie	6bd395f3f0	DebugInfo: Remove dependence on file numbering in the line table. These tests were unnecessarily sensitive to the presence and ordering of elements in the line table file_names list which will break on a future change I'm working on. llvm-svn: 201185	2014-02-11 21:46:46 +00:00
Matt Arsenault	71b71d25eb	R600/SI: Fix assertion on infinite loops. This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177	2014-02-11 21:12:38 +00:00
Robert Lougher	7d9084ffa1	Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes, e.g.: (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4)) -> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4) This fixes an issue with AVX, where a sequence was not recognized as a 256-bit vbroadcast due to the concat_vectors. llvm-svn: 201158	2014-02-11 15:42:46 +00:00
Robert Lytton	70b5ba49c3	XCore target: fix const section handling Xcore target ABI requires const data that is externally visible to be handled differently if it has C-language linkage rather than C++ language linkage. Clang now emits ".cp.rodata" section information. All other externally visible constant data will be placed in the DP section. llvm-svn: 201144	2014-02-11 10:36:26 +00:00
Robert Lytton	9b6bb461b1	XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE llvm-svn: 201143	2014-02-11 10:36:18 +00:00
Elena Demikhovsky	1f32c313f1	AVX: fixed a bug in LowerVECTOR_SHUFFLE llvm-svn: 201140	2014-02-11 10:21:53 +00:00
Elena Demikhovsky	2aafc22ed9	AVX-512: Optimized BUILD_VECTOR pattern; fixed encoding of VEXTRACTPS instruction. llvm-svn: 201134	2014-02-11 07:25:59 +00:00
Quentin Colombet	64ca544d95	[CodeGenPrepare] Test case for the promotions that bypass the profitability check due to some other checks in the addressing mode matcher. I.e., test case for commit r201121. <rdar://problem/16020230> llvm-svn: 201132	2014-02-11 06:55:43 +00:00
Tom Stellard	5d7aaaed7d	R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, then we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097	2014-02-10 16:58:30 +00:00
Tim Northover	b0430415e6	ARM: use natural LLVM IR for vshll instructions Similarly to the vshrn instructions, these are simple zext/sext + trunc operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093	2014-02-10 16:20:29 +00:00
Oliver Stannard	8dcaa761a2	ARM: r12 is callee-saved for interrupt handlers For A- and R-class processors, r12 is not normally callee-saved, but is for interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker". llvm-svn: 201089	2014-02-10 14:24:23 +00:00
Tim Northover	170daafe01	ARM: use LLVM IR to represent the vshrn operation vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085	2014-02-10 14:04:07 +00:00
Robert Lougher	48ee75b7e3	Test commit - added a new line to vec_shuf-insert.ll. llvm-svn: 201083	2014-02-10 12:42:13 +00:00
Matheus Almeida	4b27eb588c	[mips][msa] Add DLSA instruction. llvm-svn: 201081	2014-02-10 12:05:17 +00:00
Matheus Almeida	b4133b25e7	[mips][msa] Update FileCheck prefix in preparation for the addition of Mips64 tests. No functional changes. llvm-svn: 201080	2014-02-10 11:30:09 +00:00
Elena Demikhovsky	9f423d6f25	AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors. llvm-svn: 201066	2014-02-10 07:02:39 +00:00
Hao Liu	6e73761dc8	[AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg. llvm-svn: 201061	2014-02-10 03:16:22 +00:00
Rafael Espindola	5054362920	Always create a temporary symbol to use with the cfi frame. This is a small simplification and a small step in fixing pr18743 since private functions on MachO should be using a 'l' prefix. llvm-svn: 200994	2014-02-07 21:23:18 +00:00
Rafael Espindola	e393aab13b	Use FileCheck variables to simplify this test. llvm-svn: 200992	2014-02-07 21:11:33 +00:00
Renato Golin	3dc5ade8bb	Fix Darwin bots from EHABI change llvm-svn: 200990	2014-02-07 20:32:32 +00:00
Matt Arsenault	9ad27e8d76	R600/SI: Add failing test for 3 x i64 vectors. Stores of <4 x i64> do work (although they do expand to 4 stores instead of 2), but 3 x i64 vectors fail to select. llvm-svn: 200989	2014-02-07 20:29:40 +00:00
Renato Golin	78a6eba862	Remove -arm-disable-ehabi option llvm-svn: 200988	2014-02-07 20:12:49 +00:00
Sasa Stankovic	4c80bdae72	[mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl. Differential Revision: http://llvm-reviews.chandlerc.com/D2694 llvm-svn: 200978	2014-02-07 17:16:40 +00:00
Rafael Espindola	61acf5d9b0	Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it. Thanks to John McCall for noticing it. llvm-svn: 200977	2014-02-07 16:21:30 +00:00
Oliver Stannard	1dc1034218	LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack According to the AAPCS, when a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable. I have also modified the rules for allocating non-CPRCs to the stack, to make it more explicit that all GPRs must be made unavailable. I cannot think of a case where the old version would produce incorrect answers, so there is no test for this. llvm-svn: 200970	2014-02-07 11:19:53 +00:00
Venkatraman Govindaraju	fd07500dd1	[Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used. llvm-svn: 200962	2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju	104643d0aa	[Sparc] Emit correct relocations for PIC code when integrated assembler is used. llvm-svn: 200961	2014-02-07 04:24:35 +00:00
Manman Ren	37c9267107	PGO branch weight: fix PR18752. Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors by getting the weights before removing a successor. llvm-svn: 200958	2014-02-07 00:38:56 +00:00
Jim Grosbach	e9008de652	X86: Resolve a long standing FIXME and properly isel pextr[bw]. Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use them to match the relevant pextr store instructions. The test widen_load-2.ll requires a slight change because with the stores gone, the remaining instructions are scheduled in a different order. Add test cases for SSE4 and AVX variants. Resolves rdar://13414672. Patch by Adam Nemet <anemet@apple.com>. llvm-svn: 200957	2014-02-07 00:16:33 +00:00
Rafael Espindola	803fb10874	Convert test to FileCheck. llvm-svn: 200955	2014-02-06 23:35:22 +00:00
Quentin Colombet	3a4bf0405e	[CodeGenPrepare] Move away sign extensions that get in the way of addressing mode. Basically the idea is to transform code like this: %idx = add nsw i32 %a, 1 %sextidx = sext i32 %idx to i64 %gep = gep i8* %myArray, i64 %sextidx load i8* %gep Into: %sexta = sext i32 %a to i64 %idx = add nsw i64 %sexta, 1 %gep = gep i8* %myArray, i64 %idx load i8* %gep That way the computation can be folded into the addressing mode. This transformation is done as part of the addressing mode matcher. If the matching fails (not profitable, addressing mode not legal, etc.), the matcher will revert the related promotions. <rdar://problem/15519855> llvm-svn: 200947	2014-02-06 21:44:56 +00:00
Tom Stellard	e236794578	R600/SI: Add a MUBUF store pattern for Reg+Imm offsets llvm-svn: 200935	2014-02-06 18:36:41 +00:00
Tom Stellard	2937cbc005	R600/SI: Add a MUBUF store pattern for Imm offsets llvm-svn: 200934	2014-02-06 18:36:39 +00:00
Tom Stellard	11624bc577	R600/SI: Add a MUBUF load pattern for Reg+Imm offsets llvm-svn: 200933	2014-02-06 18:36:38 +00:00
Tom Stellard	044e418f15	R600/SI: Use immediates offsets for SMRD instructions whenever possible There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932	2014-02-06 18:36:34 +00:00
Juergen Ributzka	fa0eba6c8b	[DAG] Don't pull the binary operation through the shift if the operands have opaque constants. During DAGCombine visitShiftByConstant assumes that certain binary operations with only constant operands can always be folded successfully. This is no longer true when the constant is opaque. This commit fixes visitShiftByConstant by not performing the optimization for opaque constants. Otherwise we would end up in an infinite DAGCombine loop. llvm-svn: 200900	2014-02-06 04:09:06 +00:00
Quentin Colombet	87769713cf	[RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register. The idea is to choose a color for the variable that cannot be allocated and recolor its interferences around. Unlike the current register allocation scheme, it is allowed to change the color of an already assigned (but maybe not splittable or spillable) live interval while propagating this change to its neighbors. In other words, there are two things that may help finding an available color: - Already assigned variables (RS_Done) can be recolored to a different color. - The recoloring allows to catch solutions that need to touch more than just the neighbors of the current allocated variable. E.g., vA can use {R1, R2 } vB can use { R2, R3} vC can use {R1 } Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and they all interfere. vA is assigned R1 vB is assigned R2 vC tries to evict vA but vA is already done. => Regular register allocation heuristic fails. Last chance recoloring kicks in: vC does as if vA was evicted => vC uses R1. vC is marked as fixed. vA needs to find a color. None are available. vA cannot evict vC: vC is a fixed virtual register now. vA does as if vB was evicted => vA uses R2. vB needs to find a color. R3 is available. Recoloring => vC = R1, vA = R2, vB = R3. <rdar://problem/15947839> llvm-svn: 200883	2014-02-05 22:13:59 +00:00
Rafael Espindola	b4eec1daa1	Remove support for not using .loc directives. Clang itself was not using this. The only way to access it was via llc. llvm-svn: 200862	2014-02-05 18:00:21 +00:00
Petar Jovanovic	9725016af3	[mips] Add NaCl target and forbid indexed loads and stores for it This patch adds NaCl target for Mips. It also forbids indexed loads and stores if the target is NaCl. Patch by Sasa Stankovic. Differential Revision: http://llvm-reviews.chandlerc.com/D2690 llvm-svn: 200855	2014-02-05 17:19:30 +00:00
Elena Demikhovsky	0b79be8ab2	AVX-512: optimized icmp -> sext -> icmp pattern llvm-svn: 200849	2014-02-05 16:17:36 +00:00
Michel Danzer	5d26fdfcba	R600/SI: Add pattern for zero-extending i1 to i32 Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830	2014-02-05 09:48:05 +00:00
Elena Demikhovsky	a30e437659	AVX-512: Added intrinsic for cvtph2ps. Added VPTESTNM instruction. Added a pattern to vselect (lit tests will follow). llvm-svn: 200823	2014-02-05 07:05:03 +00:00
David Peixotto	b9b7362cdc	Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly This patch fixes the ldr-pseudo implementation to work when used in inline assembly. The fix is to move arm assembler constant pools from the ARMAsmParser class to the ARMTargetStreamer class. Previously we kept the assembler generated constant pools in the ARMAsmParser object. This does not work for inline assembly because a new parser object is created for each blob of inline assembly. This patch moves the constant pools to the ARMTargetStreamer class so that the constant pool will remain alive for the entire code generation process. An ARMTargetStreamer class is now required for the arm backend. There was no existing implementation for MachO, only Asm and ELF. Instead of creating an empty MachO subclass, we decided to make the ARMTargetStreamer a non-abstract class and provide default (llvm_unreachable) implementations for the non constant-pool related methods. Differential Revision: http://llvm-reviews.chandlerc.com/D2638 llvm-svn: 200777	2014-02-04 17:22:40 +00:00
Tom Stellard	0ec134f3d6	R600/SI: Custom lower i64 ISD::SELECT llvm-svn: 200774	2014-02-04 17:18:40 +00:00
Tom Stellard	bfebd1fc7e	R600: Enable vector fpow. The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773	2014-02-04 17:18:37 +00:00
Tim Northover	fdbdb4b6d5	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Tim Northover	e42fb07618	ARM: fix fast-isel assertion failure Missing braces on if meant we inserted both ARM and Thumb load for a litpool entry. This didn't end well. rdar://problem/15959157 llvm-svn: 200752	2014-02-04 10:38:46 +00:00
Michel Danzer	624b02aa67	R600/SI: Fix fneg for 0.0 V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743	2014-02-04 07:12:38 +00:00
David Blaikie	5e390e4df7	DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} A bunch of test cases needed to be cleaned up for this, many my fault - when implementing imported modules I updated test cases by simply duplicating the prior metadata field - which wasn't always the empty metadata entry. llvm-svn: 200731	2014-02-04 01:23:52 +00:00
Tim Northover	24979d8e10	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Hal Finkel	5c968d9440	Expand vector bswap in LegalizeVectorOps ISD::BSWAP was missing from the list of node types that should be expanded element-wise. llvm-svn: 200705	2014-02-03 17:27:25 +00:00
Matt Arsenault	6e63dd27a2	Add some xfailed R600 tests for 64-bit private accesses. llvm-svn: 200620	2014-02-02 00:13:12 +00:00
Matt Arsenault	f5958dded4	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju	52b6473d74	[Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly. llvm-svn: 200617	2014-02-01 18:54:16 +00:00
Josh Magee	24c7f06333	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 llvm-svn: 200601	2014-02-01 01:36:16 +00:00
Reid Kleckner	f5b76518c9	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596	2014-01-31 23:50:57 +00:00
Reid Kleckner	dfbed59cc2	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593	2014-01-31 23:45:12 +00:00
Reid Kleckner	edb94c70c1	Set -mcpu to make this test pass on atom bots llvm-svn: 200588	2014-01-31 22:58:10 +00:00
Lang Hames	5ec150c967	Replace X86 FMA intrinsic pseudo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Reid Kleckner	1c843228f8	[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret' MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 llvm-svn: 200561	2014-01-31 17:41:22 +00:00
Matheus Almeida	1ace1f1236	[mips][msa] Add insert.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543	2014-01-31 13:31:20 +00:00
Matheus Almeida	8114cf70aa	Update FileCheck prefixes in preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200541	2014-01-31 13:05:56 +00:00
Manman Ren	413a6cb42b	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506	2014-01-31 01:10:35 +00:00
Manman Ren	4ece7452ba	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Chad Rosier	fe5ab2f5ba	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Juergen Ributzka	fb4d648295	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Re-applying the patch, but this time without using AsmPrinter methods. Reviewed by Andy llvm-svn: 200481	2014-01-30 18:58:27 +00:00
Evgeniy Stepanov	02bc78b9cd	Reenable ARM EHABI on Android. Broken in r200388. llvm-svn: 200466	2014-01-30 14:18:25 +00:00
Jakob Stoklund Olesen	ef1d59a175	Implement SPARCv9 atomic_swap_64 with a pseudo. The SWAP instruction only exists in a 32-bit variant, but the 64-bit atomic swap can be implemented in terms of CASX, like the other atomic rmw primitives. llvm-svn: 200453	2014-01-30 04:48:46 +00:00
Juergen Ributzka	f6f0ce903e	Revert "[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic." This reverts commit r200444 to unbreak buildbots. llvm-svn: 200445	2014-01-30 03:34:02 +00:00
Juergen Ributzka	aece7583a7	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Reviewed by Andy llvm-svn: 200444	2014-01-30 03:06:14 +00:00
Manman Ren	7407e0e31c	Revert r200431 due to bot failures. llvm-svn: 200434	2014-01-30 00:53:27 +00:00
Manman Ren	104e0c80cc	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431	2014-01-30 00:24:37 +00:00
Manman Ren	b681918ddd	PGO branch weight: update edge weights in IfConverter. This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428	2014-01-29 23:18:47 +00:00
Matheus Almeida	ec079d9e1d	[mips][msa] Add fill.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400	2014-01-29 15:12:02 +00:00
Matheus Almeida	4cb577c614	[mips][msa] CHECK-DAG-ize MSA 2r_vector_scalar.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200399	2014-01-29 14:32:03 +00:00
Matheus Almeida	74070327b2	[mips][msa] Add copy_{u,s}.d. These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398	2014-01-29 14:05:28 +00:00
Matheus Almeida	a64f0600f3	[mips][msa] CHECK-DAG-ize MSA elm_copy.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200395	2014-01-29 13:51:34 +00:00
Renato Golin	8cea6e8fc6	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also removes the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and updates some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
Venkatraman Govindaraju	141d0e2221	[Sparc] Use %r_disp32 for pc_rel entries in FDE as well. This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo. llvm-svn: 200376	2014-01-29 06:59:20 +00:00
Venkatraman Govindaraju	fd5c1f9497	[Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame. Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373	2014-01-29 04:51:35 +00:00
Venkatraman Govindaraju	50f32d949b	[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368	2014-01-29 03:35:08 +00:00
Rafael Espindola	310f501ef0	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367	2014-01-29 02:30:38 +00:00
Kevin Qin	92d64d2d56	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
David Woodhouse	21bfc71752	[ARM] Remove superfluous inline asm mode switch test llvm-svn: 200361	2014-01-29 00:49:28 +00:00
David Woodhouse	7db3705f9e	Tests for mode switching 1. test that inlineasm works 2. test that relaxable instructions are re-encoded in the correct mode. llvm-svn: 200351	2014-01-28 23:13:30 +00:00
Gautam Chakrabarti	2c283400f9	[NVPTX] Fix emitting aggregate parameters The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325	2014-01-28 18:35:29 +00:00
Andrea Di Biagio	2ea61f17ad	[X86] Add extra rules for combining vselect dag nodes into movsd. This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) llvm-svn: 200324	2014-01-28 18:14:21 +00:00
Andrea Di Biagio	b6d39afbda	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313	2014-01-28 12:53:56 +00:00
Hal Finkel	4e703bcecd	Handle spilling the PPC GPRC_NOR0 register class GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288	2014-01-28 05:32:58 +00:00
Michel Danzer	bf1a641060	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Jakob Stoklund Olesen	83c677353b	Fix the DWARF EH encodings for Sparc PIC code. Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282	2014-01-28 02:52:26 +00:00
David Peixotto	b76f55f74a	Fix unsupported addressing mode assertion for pld Summary: This commit gives an address mode to the PLD instruction. We were getting an assertion failure in the frame lowering code because we had code that was doing a pld of a stack allocated address. The frame lowering was checking the address mode and then asserting because pld had none defined. This commit fixes pld for arm mode. There was a previous fix for thumb mode in a separate commit. The commit for thumb mode added a test in a separate file because it would otherwise fail for arm. This commit moves the thumb test back into the prefetch.ll file and adds the corresponding arm test. Differential Revision: http://llvm-reviews.chandlerc.com/D2622 llvm-svn: 200248	2014-01-27 21:39:04 +00:00
Andrea Di Biagio	f09a357765	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234	2014-01-27 18:45:30 +00:00
Stepan Dyatkovskiy	55139555c4	Additional fix for 200201: due to dependence on bitwidth test was moved to X86 directory. llvm-svn: 200202	2014-01-27 09:43:10 +00:00
Stepan Dyatkovskiy	157bb42e27	Fix for PR18102. Issue outcomes from DAGCombiner::MergeConsequtiveStores, more precisely from mem-ops sequence sorting. Consider, how MergeConsequtiveStores works for next example: store i8 1, a[0] store i8 2, a[1] store i8 3, a[1] ; a[1] again. return ; DAG starts here 1. Method will collect all the 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is, we can't determine here which 'store' instruction would be the second after sorting ('store 2' or 'store 3'). It happens that 'store 3' would be the second, and 'store 2' would be the third. So after merging we have the next result: store i16 (1 \| 3 << 8), base ; is a[0] but bit-casted to i16 store i8 2, a[1] So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1]. Fix: In sort routine just also take into account mem-op sequence number. llvm-svn: 200201	2014-01-27 09:18:31 +00:00
Michel Danzer	13736221e3	R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196	2014-01-27 07:20:51 +00:00
Michel Danzer	6064f57ae8	R600/SI: Add intrinsic for S_SENDMSG instruction Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195	2014-01-27 07:20:44 +00:00
Kevin Qin	4a183d7094	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. Replace r199791. llvm-svn: 200180	2014-01-27 02:53:54 +00:00
Kevin Qin	9eeedfbaa6	Revert r199791. It's an old version which has some bugs. I'll commit the latest patch soon. llvm-svn: 200179	2014-01-27 02:53:41 +00:00
Jakob Stoklund Olesen	6f39ce4be2	Clean up the Legal/Expand logic for SPARC popc. llvm-svn: 200141	2014-01-26 08:12:34 +00:00
Rafael Espindola	cb1953f6d9	Implement the missing bits corresponding to .mips_hack_elf_flags. These were: * noreorder handling on the target object streamer and asm parser. * setting the initial flag bits based on the enabled features. * setting the elf header flag for micromips It is really depressing I am the one doing this instead of someone at mips actually taking the time to understand the infrastructure. llvm-svn: 200138	2014-01-26 06:57:13 +00:00
Jakob Stoklund Olesen	ead3b3d7a1	Only generate the popc instruction for SPARC CPUs that implement it. The popc instruction is defined in the SPARCv9 instruction set architecture, but it was emulated on CPUs older than Niagara 2. llvm-svn: 200131	2014-01-26 06:09:59 +00:00
Jakob Stoklund Olesen	39f0833f47	Fix swapped CASA operands. Found by SingleSource/UnitTests/AtomicOps.c llvm-svn: 200130	2014-01-26 06:09:54 +00:00
Jiangning Liu	fb3c17b6c9	Improve pattern match from v1i8 to v1i32 for AArch64 Neon. llvm-svn: 200119	2014-01-26 04:55:53 +00:00
Rafael Espindola	e52d556614	Remove -print-hack-directives from a test where we already do the right thing. llvm-svn: 200116	2014-01-26 04:14:50 +00:00
Rafael Espindola	aa93586678	Move tests that just use llc from test/MC/Mips to test/MC/Codegen. This is an expanded version of r200064. llvm-svn: 200115	2014-01-26 04:08:47 +00:00
Jiangning Liu	6398d839c6	Implement pattern match from v1xx to v1xx for AArch64 Neon. llvm-svn: 200113	2014-01-26 03:27:40 +00:00
Kevin Qin	18662f4b7c	[AArch64 NEON] Add patterns for concat_vector on v2i32. llvm-svn: 200111	2014-01-26 02:46:15 +00:00
Kevin Qin	a4068c4243	[AArch64 NEON] Add test case for vector FP_ROUND. llvm-svn: 200110	2014-01-26 02:23:33 +00:00
Hal Finkel	5eb2466243	Add a TBAA CodeGen failure test case I disabled the use of TBAA in CodeGen in r200093. This adds a test case that demonstrates the problems with inttoptr and TBAA in CodeGen (and, specifically, the problem that causes LLVM to miscompile itself in Release mode). This test will currently fail if -use-tbaa-in-sched-mi is enabled. llvm-svn: 200097	2014-01-25 20:16:36 +00:00
Hal Finkel	93d8f59877	XFAIL test/CodeGen/SystemZ/alias-01.ll which requires CodeGen TBAA llvm-svn: 200094	2014-01-25 19:31:44 +00:00
Rafael Espindola	14d02fe5c8	This reverts commit r200064 and r200051. r200064 depends on r200051. r200051 is broken: It tries to replace .mips_hack_elf_flags, which is a good thing, but what it replaces it with is even worse. The new emitMipsELFFlags it adds corresponds to no assembly directive, is not marked as a hack and is not even printed to the .s file. The patch also introduces more uses of hasRawTextSupport. The correct way to remove .mips_hack_elf_flags is to have the mips target streamer handle the default flags (and command line options). That way the same code path is used for asm and obj. The streamer interface should really correspond to what is printed in the .s file. llvm-svn: 200078	2014-01-25 15:06:56 +00:00
Jack Carter	8150e14190	[Mips] Move 2 test cases from MC to CodeGen. No code changes. Just reassignment of test case files. llvm-svn: 200064	2014-01-25 02:14:14 +00:00
Juergen Ributzka	f26beda7c7	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062	2014-01-25 02:02:55 +00:00
Hans Wennborg	4d67a2e85a	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemented in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058	2014-01-25 01:18:18 +00:00
Ana Pazos	cd3b9f763e	[AArch64] Removed unused i8 type from FPR8 register class. The i8 type is not registered with any register class. This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost. The code selects the first type associated with register class FPR8, which happens to be i8. It uses this type (i8) to get the representative class pointer, which is 0. It then uses this pointer to access a field, resulting in segmentation fault. Since i8 type is not being used for printing any neon instruction we can safely remove it. llvm-svn: 200046	2014-01-24 22:36:53 +00:00
Juergen Ributzka	4f3df4ad64	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034	2014-01-24 20:18:00 +00:00
Lang Hames	c63c52e03c	Add a testcase for the changes in r199938. <rdar://problem/15611947> llvm-svn: 200027	2014-01-24 19:00:19 +00:00
Juergen Ributzka	50e7e80d00	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024	2014-01-24 18:40:30 +00:00
Juergen Ributzka	38b67d0caf	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022	2014-01-24 18:23:08 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Rafael Espindola	f8f15bf670	Don't use "llc -filetype=obj" now that the codepath is the same. r200011 remove the special codepaths in MC for inline asm, so we can now test all the logic with just llc + llvm-mc. llvm-svn: 200013	2014-01-24 15:59:50 +00:00
Kevin Qin	21cd2152d3	[AArch64 NEON] Fix a bug in implementing register copy between FPR16. llvm-svn: 199978	2014-01-24 07:53:04 +00:00
Juergen Ributzka	e758ddcd16	[X86] Prevent the creation of redundant ops for sadd and ssub with overflow. This commit teaches the X86 backend to create the same X86 instructions when it lowers an sadd/ssub with overflow intrinsic and a conditional branch that uses that overflow result. This allows SelectionDAG to recognize and remove one of the redundant operations. This fixes <rdar://problem/15874016> and <rdar://problem/15661073>. Reviewed by Nadav llvm-svn: 199976	2014-01-24 06:47:57 +00:00
Jakob Stoklund Olesen	05ae2d6715	Implement atomicrmw operations in 32 and 64 bits for SPARCv9. These all use the compare-and-swap CASA/CASXA instructions. llvm-svn: 199975	2014-01-24 06:23:31 +00:00
Lang Hames	23de211c5d	Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator loops. Writing back to the accumulator (231-type) allows the coalescer to eliminate an extra copy. llvm-svn: 199933	2014-01-23 20:23:36 +00:00
Weiming Zhao	5930ae6cc2	[Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT Originally, BLX was passed as operand #0 in MachineInstr and as operand #2 in MCInst. But now, it's operand #2 in both cases. This patch also removes unnecessary FileCheck in the test case added by r199127. llvm-svn: 199928	2014-01-23 19:55:33 +00:00
Eric Christopher	589d6c4118	Move test to x86 directory. llvm-svn: 199927	2014-01-23 19:32:19 +00:00
Ana Pazos	5d31f6945b	[AArch64] Added vselect patterns with float and double types llvm-svn: 199925	2014-01-23 19:18:57 +00:00
Eric Christopher	4c96056acd	Avoid emitting a DWARF type attribute for an ObjC property of type void. Patch by Scott Talbot. llvm-svn: 199924	2014-01-23 19:16:28 +00:00
Tom Stellard	a2a4b8ee2f	R600: Disable the BFE pattern This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet. llvm-svn: 199918	2014-01-23 18:49:33 +00:00
Tom Stellard	805890b252	R600: Correctly handle vertex fetch clauses that precede ENDIFs The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it. llvm-svn: 199917	2014-01-23 18:49:31 +00:00
Tom Stellard	8cce9bdf17	R600: Unconditionally unroll loops that contain GEPs with alloca pointers Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled. Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow. llvm-svn: 199916	2014-01-23 18:49:28 +00:00
Andrew Trick	3cc534ac6d	Move a unit test into the correct dir. Sorry if it broke Mips-only builds. llvm-svn: 199911	2014-01-23 17:47:57 +00:00
Tom Stellard	348273df97	R600: Recommit 199842: Add work-around for the CF stack entry HW bug The unit test is now disabled on non-asserts builds. The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199905	2014-01-23 16:18:02 +00:00
Elena Demikhovsky	a5d38a39a0	AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions, they give better sequences than VPERMI llvm-svn: 199893	2014-01-23 14:27:26 +00:00
Tim Northover	55c625f222	ARM: use litpools for normal i32 imms when compiling minsize. With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but movw/movt pairs consume 8*N. This means litpools are better than movw/movt even with just one use. Other materialisation strategies can still be better though, so the logic is a little odd. llvm-svn: 199891	2014-01-23 13:43:47 +00:00
Hao Liu	b920682e4a	[AArch64]Add CHECK for two test cases testing scalar_to_vector committed in r199461. llvm-svn: 199861	2014-01-23 02:09:30 +00:00
Owen Anderson	77e4d44411	Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal. This is a horrible bit of code. We're calling a simplification routine in the middle of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated. llvm-svn: 199847	2014-01-22 22:34:17 +00:00
Tom Stellard	31e16388d7	Revert "R600: Add work-around for the CF stack entry HW bug" This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba. The -debug-only flag for llc doesn't appear to be available in all build configurations. llvm-svn: 199845	2014-01-22 22:20:54 +00:00
Tom Stellard	e89373e062	R600: Add work-around for the CF stack entry HW bug The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199842	2014-01-22 21:55:46 +00:00
Tom Stellard	a40f97154b	R600: Refactor stack size calculation reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199840	2014-01-22 21:55:43 +00:00
Quentin Colombet	5a8739e023	Add a testcase for r199430. llvm-svn: 199831	2014-01-22 20:11:50 +00:00
Tom Stellard	476437cbbc	R600: MOVA is vector only llvm-svn: 199827	2014-01-22 19:24:24 +00:00
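
The PR18102 entry in the list above (r200201) describes a store-merging bug caused by sorting stores on their offset alone. The following is a toy Python model of that scenario, not the DAGCombiner code: the `Store` type, its `seq` field, the reversed collection order, and the `merge_and_run` helper are all assumptions made for illustration. It assumes, as in the commit's example, that the leftover store executes after the merged one.

```python
from typing import NamedTuple


class Store(NamedTuple):
    seq: int     # hypothetical program-order sequence number
    offset: int  # byte offset from the base pointer
    value: int   # i8 value being stored


def merge_and_run(stores, key):
    """Sort the stores, merge the first two into one i16 store, then
    replay the leftover stores after it and return the final bytes."""
    ordered = sorted(stores, key=key)
    first, second = ordered[0], ordered[1]
    assert second.offset == first.offset + 1  # consecutive bytes
    merged = first.value | (second.value << 8)
    mem = [0, 0]
    mem[first.offset] = merged & 0xFF
    mem[first.offset + 1] = (merged >> 8) & 0xFF
    for s in ordered[2:]:  # leftover stores run after the merged store
        mem[s.offset] = s.value
    return mem


# store i8 1, a[0]; store i8 2, a[1]; store i8 3, a[1]
program = [Store(0, 0, 1), Store(1, 1, 2), Store(2, 1, 3)]
# Assumption: stores are collected walking upward from the return.
collected = list(reversed(program))

buggy = merge_and_run(collected, key=lambda s: s.offset)
fixed = merge_and_run(collected, key=lambda s: (s.offset, s.seq))
print(buggy, fixed)
```

In this model, sorting on the offset alone leaves 'store 3' ahead of 'store 2', so a[1] ends up holding 2. Adding the sequence number as a tie-breaker keeps the two a[1] stores in program order, and a[1] ends up holding 3.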


9329 Commits