llvm-project

Commit Graph

Author	SHA1	Message	Date
Vasileios Kalintiris	88faf6d697	[mips] Disable code generation through FastISel for MIPS32R6. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14708 llvm-svn: 253225	2015-11-16 17:05:01 +00:00
Petr Pavlu	a770379524	[ARM] Prevent use of a value pointed by end() iterator when placing a jump table Function ARMConstantIslands::doInitialJumpTablePlacement() iterates over all basic blocks in a machine function. It calls `MI = MBB.getLastNonDebugInstr()` to get the last instruction in each block and then uses MI->getOpcode() to decide what to do. If getLastNonDebugInstr() returns MBB.end() (for example, when the block does not contain any instructions) then calling getOpcode() on this value is incorrect. Avoid this problem by checking the result of getLastNonDebugInstr(). Differential Revision: http://reviews.llvm.org/D14694 llvm-svn: 253222	2015-11-16 16:41:13 +00:00
Oliver Stannard	9327a7575b	[ARM,AArch64] Store source location of asm constant pool entries Storing the source location of the expression that created a constant pool entry allows us to emit better error messages if we later discover that the expression cannot be represented by a relocation. Differential Revision: http://reviews.llvm.org/D14646 llvm-svn: 253220	2015-11-16 16:25:47 +00:00
Oliver Stannard	09be060606	[ARM,AArch64] Store source location for values in assembly files The MCValue class can store a SMLoc to allow better error messages to be emitted if an error is detected after parsing. The ARM and AArch64 assembly parsers were not setting this, so error messages did not have source information. Differential Revision: http://reviews.llvm.org/D14645 llvm-svn: 253219	2015-11-16 16:22:47 +00:00
Dan Gohman	1462faad35	[WebAssembly] Prototype passes for register coloring and register stackifying. These passes are not yet enabled by default. llvm-svn: 253217	2015-11-16 16:18:28 +00:00
Artyom Skrobov	f187a65f99	Handle ARMv6KZ naming Summary: * ARMv6KZ is the "canonical" name, given in the ARMARM * ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM * ARMv6ZK is a popular misspelling, which we should support as an alias. The patch corrects the handling of the names. Functional changes: * ARMv6Z no longer treated as an architecture in its own right * ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias * arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K * default ARMv6K CPU changed to arm1176j-s Reviewers: rengolin, logan, compnerd Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14568 llvm-svn: 253206	2015-11-16 14:05:32 +00:00
Bradley Smith	323fee105d	[ARM] Introduce subtarget features per ARM architecture. This allows for accurate architecture targeting as well as removing duplicate information (hardcoded feature strings) from MCTargetDesc. llvm-svn: 253196	2015-11-16 11:10:19 +00:00
James Molloy	2018091e87	Properly check if a CMPZ node is in fact comparing against zero This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless. Caught by Oliver Stannard using csmith; regression test added. llvm-svn: 253195	2015-11-16 10:49:25 +00:00
Oliver Stannard	db9081bf89	[AArch64] ldr= pseudo-instruction silently ignored if register invalid The AArch64 assembler was silently ignoring instructions like this: ldr foo, =bar AArch64AsmParser::parseOperand was returning true as the parse failed, but was not calling AArch64AsmParser::Error to report this to the user, so the instruction was ignored without printing an error message. Differential Revision: http://reviews.llvm.org/D14651 llvm-svn: 253193	2015-11-16 10:25:19 +00:00
Igor Breger	24cab0fa06	AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions. Differential Revision: http://reviews.llvm.org/D14322 llvm-svn: 253185	2015-11-16 07:22:00 +00:00
Dan Gohman	1031d4a8c3	[WebAssembly] Use tabs instead of spaces in assembly output. This seems to be the most popular convention among the other backends. llvm-svn: 253172	2015-11-15 15:34:19 +00:00
Simon Pilgrim	cbba348ae7	[X86][SSE] Tidyup with implicit SDValue bool check. NFC. llvm-svn: 253171	2015-11-15 14:57:07 +00:00
Igor Breger	3ff8ef9eb7	Revert r253160. It introduced a layering violation. Reproducible with BUILD_SHARED_LIBS=ON. llvm-svn: 253163	2015-11-15 12:19:11 +00:00
Igor Breger	aa40ddd3ba	AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions. Differential Revision: http://reviews.llvm.org/D14322 llvm-svn: 253160	2015-11-15 07:23:13 +00:00
Dan Gohman	5219ecf068	[WebAssembly] Minor code simplification. NFC. llvm-svn: 253150	2015-11-14 23:28:15 +00:00
Dan Gohman	8ad045c1d1	[WebAssembly] Support signext, zeroext, and several other function attributes. llvm-svn: 253148	2015-11-14 23:15:41 +00:00
Akira Hatanaka	b11ef0897c	Reduce the size of MCRelaxableFragment. MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127	2015-11-14 06:35:56 +00:00
Akira Hatanaka	bd9fc28444	[MCTargetAsmParser] Move the member variables that reference MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a member function getSTI. This is done in preparation for making changes to shrink the size of MCRelaxableFragment. (see http://reviews.llvm.org/D14346). llvm-svn: 253124	2015-11-14 05:20:05 +00:00
Eric Christopher	57a6e1321f	Add MMX to the 3dnow enum and propagate changes around. This makes it somewhat more consistent with how the feature is used. llvm-svn: 253122	2015-11-14 03:04:00 +00:00
Justin Bogner	fff708db92	AArch64: Default AArch64Subtarget::ReserveX18 to true on darwin Darwin reserves x18, so it's never ABI compliant to generate code that uses it. Set the default value based on the OS part of the triple rather than forcing front-ends to set the +reserve-x18 target feature in order to build correct code for Darwin. This will make r243310 redundant, so I'll revert that shortly. llvm-svn: 253102	2015-11-13 23:05:46 +00:00
Colin LeMahieu	655489433c	[Hexagon] Fixing memory leak during relaxation by allocating MCInst in MCContext. llvm-svn: 253090	2015-11-13 21:45:50 +00:00
Reid Kleckner	75b4be9a11	[WinEH] Fix ESP management with 32-bit __CxxFrameHandler3 The C++ EH personality automatically restores ESP from the C++ EH registration node after a catchret. I mistakenly thought it was like SEH, which does not restore ESP. It makes sense for C++ EH to differ from SEH here because SEH does not use funclets for catches, and does not allow catching inside of finally. C++ EH may need to unwind through multiple catch funclets and eventually catchret to some outer funclet. Therefore, the runtime has to keep track of which ESP to use with catchret, rather than having the compiler reload it manually. llvm-svn: 253084	2015-11-13 21:27:00 +00:00
Dan Gohman	dd0071f440	[WebAssembly] Rename the Const instructions to be upper-case too. llvm-svn: 253072	2015-11-13 20:27:45 +00:00
Dan Gohman	f433324290	[WebAssembly] Rename memory intrinsics to be upper-case, following convention. NFC. llvm-svn: 253070	2015-11-13 20:19:11 +00:00
Cong Hou	ef4074bac2	[X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32. This patch is enabling combining UNPCKL with vector_shuffle that moves the upper half of a vector into the lower half, into a UNPCKH instruction. For example: t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8 t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2 will be combined to: t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1 Differential revision: http://reviews.llvm.org/D14399 llvm-svn: 253067	2015-11-13 19:47:43 +00:00
Reid Kleckner	94b57065c6	[WinEH] Make UnwindHelp a fixed stack object allocated after XMM CSRs Now the offset of UnwindHelp in our EH tables and the offset that we store to in the prologue agree. llvm-svn: 253059	2015-11-13 19:06:01 +00:00
Colin LeMahieu	f0af6e5243	[Hexagon] Factoring bundle creation in to a utility function. llvm-svn: 253056	2015-11-13 17:42:46 +00:00
Tom Stellard	afd6e2f3c3	AMDGPU: Add stony support Patch by: Alex Deucher llvm-svn: 253053	2015-11-13 17:06:32 +00:00
James Molloy	b564098c62	[ARM] Replace ARMISD::RBIT with ISD::BITREVERSE ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering. llvm-svn: 253047	2015-11-13 16:05:22 +00:00
Zlatko Buljan	32fb5c40d2	[mips][microMIPS] Implement SHRA[_R].PH, SHRAV[_R].PH, SHRAV[_R].QB, SHRAV_R.W, SHRA_R.W, SHRL.PH, SHRL.QB, SHRLV.PH and SHRLV.QB instructions Differential Revision: http://reviews.llvm.org/D14010 llvm-svn: 253041	2015-11-13 13:14:25 +00:00
Ulrich Weigand	19d24d2699	[SystemZ] Simplify boolean conditional return statements Use clang-tidy to simplify conditional return statements. Author: LegalizeAdulthood Differential Revision: http://reviews.llvm.org/D9986 llvm-svn: 253038	2015-11-13 13:00:27 +00:00
Colin LeMahieu	b3c97271e3	[Hexagon] Fixing leak in padEndloop by allocating in MCContext. llvm-svn: 253019	2015-11-13 07:58:06 +00:00
Dan Gohman	f19ed56288	[WebAssembly] Inline asm support. llvm-svn: 252997	2015-11-13 01:42:29 +00:00
Colin LeMahieu	8bb168b160	[Hexagon] Adding relaxation functionality to backend and test. llvm-svn: 252989	2015-11-13 01:12:25 +00:00
Dan Gohman	bc58a7bad0	[WebAssembly] Un-mangle the conversion instruction names. This arranges the types in the LLVM instruction names in the same order that they appear in the WebAssembly opcode names, and eliminates double-underscores. llvm-svn: 252988	2015-11-13 00:50:04 +00:00
Dan Gohman	231244c304	[WebAssembly] Rename BR_IF_ to BR_IF With MC-based instruction printing, we no longer need instruction names to mangle in hints about how they should be printed. llvm-svn: 252987	2015-11-13 00:46:31 +00:00
Dan Gohman	c9dd057e3c	[WebAssembly] Remove unneeded TODO items. NFC. llvm-svn: 252985	2015-11-13 00:41:25 +00:00
Dan Gohman	b1daa3aec7	[WebAssembly] Tidy up and update a TODO item. NFC. llvm-svn: 252984	2015-11-13 00:40:37 +00:00
Joseph Tremoulet	149c433bcc	[WinEH] Find root frame correctly in CLR funclets Summary: The value that the CoreCLR personality passes to a funclet for the establisher frame may be the root function's frame or may be the parent funclet's (mostly empty) frame in the case of nested funclets. Each funclet stores a pointer to the root frame in its own (mostly empty) frame, as does the root function itself. All frames allocate this slot at the same offset, measured from the post-prolog stack pointer, so that the same sequence can accept any ancestor as an establisher frame parameter value, and so that a single offset can be reported to the GC, which also looks at this slot. This change allocates the slot when processing function entry, and records its frame index on the WinEHFuncInfo object, then inserts the code to set/copy it during prolog emission. Reviewers: majnemer, AndyAyers, pgavlin, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14614 llvm-svn: 252983	2015-11-13 00:39:23 +00:00
Dan Gohman	058fce5435	[WebAssembly] Introduce a new pseudo-operand for unused expression results. llvm-svn: 252975	2015-11-13 00:21:05 +00:00
Vyacheslav Klochkov	cbc56baae6	X86-FMA3: Implemented commute transformations for FMA_Int instructions. It made it possible to apply the memory folding optimization for the 2nd operand of FMA_Int instructions. Reviewer: Quentin Colombet Differential Revision: http://reviews.llvm.org/D14550 llvm-svn: 252973	2015-11-13 00:07:35 +00:00
Tom Stellard	0967c91e0c	Revert "Remove unnecessary call to getAllocatableRegClass" This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956	2015-11-12 21:43:25 +00:00
Vyacheslav Klochkov	1ff9cbdfc0	My first/test commit. Removed a trailing whitespace. llvm-svn: 252940	2015-11-12 20:11:57 +00:00
Benjamin Kramer	7c576d8bcf	[Hexagon] Allocate MCInst in the MCContext to avoid leaking it. Found by leaksanitizer. llvm-svn: 252931	2015-11-12 19:30:40 +00:00
David Blaikie	b0311c590d	Roll an expression into an assert to fix -Wunused-variable in a -Asserts build llvm-svn: 252925	2015-11-12 19:07:43 +00:00
Dan Gohman	cf4748f180	[WebAssembly] Reapply r252858, with svn add for the new file. Switch to MC for instruction printing. This encompasses several changes which are all interconnected: - Use the MC framework for printing almost all instructions. - AsmStrings are now live. - This introduces an indirection between LLVM vregs and WebAssembly registers, and a new pass, WebAssemblyRegNumbering, for computing a basic mapping. This addresses some basic issues with argument registers and unused registers. - The way ARGUMENT instructions are handled no longer generates redundant get_local+set_local for every argument. This also changes the assembly syntax somewhat; most notably, MC's printing does not use sigils on label names, so those are no longer present, and push/pop now have a sigil to keep them unambiguous. The usage of set_local/get_local/$push/$pop will continue to evolve significantly. This patch is just one step of a larger change. llvm-svn: 252910	2015-11-12 17:04:33 +00:00
Michael Zuckerman	fd3fe9e45a	[x86] translating "fp" (floating point) instructions from {fadd,fdiv,fmul,fsub,fsubr,fdivr} to {faddp,fdivp,fmulp,fsubp,fsubrp,fdivrp} LLVM is missing the following instructions: fadd\fdiv\fmul\fsub\fsubr\fdivr. GAS and MS support these instructions and lower them into faddp\fdivp\fmulp\fsubp\fsubrp\fdivrp instructions. Differential Revision: http://reviews.llvm.org/D14217 llvm-svn: 252908	2015-11-12 16:58:51 +00:00
Artyom Skrobov	2c2f378f8a	Cull non-standard variants of ARM architectures (NFC) Summary: This patch changes ARMV5, ARMV5E, ARMV6SM, ARMV6HL, ARMV7, ARMV7L, ARMV7HL, ARMV7EM to be treated as aliases for the corresponding standard architectures, instead of as actual architectures. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14577 llvm-svn: 252903	2015-11-12 15:51:41 +00:00
Hans Wennborg	7384a2de02	Revert r252858: "[WebAssembly] Switch to MC for instruction printing." It broke the CMake build: "Cannot find source file: WebAssemblyRegNumbering.cpp" llvm-svn: 252897	2015-11-12 14:37:56 +00:00
Vasileios Kalintiris	48e0256ed6	Re-apply "[mips] Use correct frame register for DWARF info when dynamically realigning the stack." r252219 reversed the direction of subprogram -> function edge. Fixed the IR to account for this. llvm-svn: 252895	2015-11-12 14:11:43 +00:00
James Molloy	8e99e97f2a	[ARM] CMOV->BFI combining: handle both senses of CMPZ I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was. If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases. llvm-svn: 252891	2015-11-12 13:49:17 +00:00
Renato Golin	93064025bd	Revert "[ARM] Enable shrink-wrapping by default." This reverts commit r252825, as it broke ASAN on ARM. Investigating... llvm-svn: 252889	2015-11-12 13:34:50 +00:00
Daniel Sanders	9f6ad49740	Implement .reloc (constant offset only) with support for R_MIPS_NONE and R_MIPS_32. Summary: Support for R_MIPS_NONE allows us to parse MIPS16's usage of .reloc. R_MIPS_32 was included to be able to better test the directive. Targets can add their relocations by overriding MCAsmBackend::getFixupKind(). Subscribers: grosbach, rafael, majnemer, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D13659 llvm-svn: 252888	2015-11-12 13:33:00 +00:00
Zlatko Buljan	797c2aec6b	[mips][microMIPS] Implement LWM16, SB16, SH16, SW16, SWSP and SWM16 instructions Differential Revision: http://reviews.llvm.org/D11406 llvm-svn: 252885	2015-11-12 13:21:33 +00:00
Vasileios Kalintiris	d38860610d	Revert "[mips] Use correct frame register for DWARF info when dynamically realigning the stack." This reverts commit r252882. LLParser complains for invalid field 'function' in DISubprogram. llvm-svn: 252884	2015-11-12 13:19:11 +00:00
Vasileios Kalintiris	352eb55baf	[mips] Use correct frame register for DWARF info when dynamically realigning the stack. Summary: This patch overrides TargetFrameLowering::getFrameIndexReference() in order to specify the correct register when the function needs dynamic stack realignment. The values returned from this function are used in order to create DW_AT_locations for DWARF info. These locations would use the wrong registers as it's been reported in PR25028. Reviewers: dsanders Subscribers: dean, llvm-commits Differential Revision: http://reviews.llvm.org/D13511 llvm-svn: 252882	2015-11-12 13:04:16 +00:00
Dylan McKay	c498ba3a3e	Add AVR backend skeleton This adds part of the target info code, and adds modifications to the build scripts so that AVR is recognized as a supported, experimental backend. It does not include any AVR-specific code, just the bare sources required for a backend to exist. From D14039. llvm-svn: 252865	2015-11-12 09:26:44 +00:00
Dan Gohman	9dd55a8065	[WebAssembly] Switch to MC for instruction printing. This encompasses several changes which are all interconnected: - Use the MC framework for printing almost all instructions. - AsmStrings are now live. - This introduces an indirection between LLVM vregs and WebAssembly registers, and a new pass, WebAssemblyRegNumbering, for computing a basic mapping. This addresses some basic issues with argument registers and unused registers. - The way ARGUMENT instructions are handled no longer generates redundant get_local+set_local for every argument. This also changes the assembly syntax somewhat; most notably, MC's printing does not use sigils on label names, so those are no longer present, and push/pop now have a sigil to keep them unambiguous. The usage of set_local/get_local/$push/$pop will continue to evolve significantly. This patch is just one step of a larger change. llvm-svn: 252858	2015-11-12 06:10:03 +00:00
Manman Ren	3f2b9c18e2	[TLS on Darwin] use a different mask for tls calls on x86-64. Calls involved in thread-local variable lookup save more registers than normal calls. rdar://problem/23073171 llvm-svn: 252837	2015-11-12 00:54:04 +00:00
Quentin Colombet	10f9813528	[ARM] Enable shrink-wrapping by default. Differential Revision: http://reviews.llvm.org/D14357 rdar://problem/21942589 llvm-svn: 252825	2015-11-11 23:31:46 +00:00
Joseph Tremoulet	9f467353a5	[WinEH] Only generate UnwindHelp slot for MSVCXX Summary: Other personalities don't use this special frame slot. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14580 llvm-svn: 252778	2015-11-11 19:21:09 +00:00
Sanjay Patel	f740129198	[MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any MIPS32 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: jr $ra clz $2, $4 cttz: addiu $1, $4, -1 not $2, $4 and $1, $2, $1 clz $1, $1 addiu $2, $zero, 32 jr $ra subu $2, $2, $1 Instead of: ctlz: beqz $4, $BB0_2 addiu $2, $zero, 32 clz $2, $4 $BB0_2: jr $ra nop cttz: beqz $4, $BB1_2 addiu $2, $zero, 32 addiu $1, $4, -1 not $2, $4 and $1, $2, $1 clz $1, $1 addiu $2, $zero, 32 subu $2, $2, $1 $BB1_2: jr $ra nop See D14469 for the larger motivation. Differential Revision: http://reviews.llvm.org/D14500 llvm-svn: 252755	2015-11-11 17:24:56 +00:00
Diego Novillo	0767ae5896	Properly fix unused variable in disable-assert builds. I missed the side-effects of ParseBFI in my previous attempt (r252748). Thanks dblaikie for the suggestion of adding a void use of the unused variable instead. llvm-svn: 252751	2015-11-11 16:39:22 +00:00
Diego Novillo	29f88a2460	Remove unused variable in disable-assert builds. NFC. llvm-svn: 252748	2015-11-11 16:14:52 +00:00
Douglas Katzman	a14039764b	Visibly fail if attempting to encode register AH,BH,CH,DH in a REX-prefixed instruction. Differential Revision: http://reviews.llvm.org/D13316 Fixes PR25003 llvm-svn: 252743	2015-11-11 15:51:16 +00:00
James Molloy	ce12c92f66	[ARM] Combine BFIs together If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits. llvm-svn: 252740	2015-11-11 15:40:40 +00:00
Aaron Ballman	107bb0d193	Silencing nine warnings for "enumeral and non-enumeral type in conditional expression"; NFC. llvm-svn: 252728	2015-11-11 13:44:06 +00:00
Michael Kuperstein	12982a816c	[X86] Replace LEAs with INC/DEC when profitable If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively. Patch by: anton.nadolsky@intel.com Differential Revision: http://reviews.llvm.org/D14059 llvm-svn: 252722	2015-11-11 11:44:31 +00:00
Craig Topper	b24a58e28f	[X86] Fix feature flags on some MMX register instructions that really were introduced with SSE or SSE2. llvm-svn: 252709	2015-11-11 07:29:25 +00:00
Craig Topper	700a1a23d7	[X86] Remove redundant MMX isel patterns. llvm-svn: 252708	2015-11-11 07:29:22 +00:00
Dan Gohman	754cd11d90	[WebAssembly] Support non-legal argument and return types. llvm-svn: 252687	2015-11-11 01:33:02 +00:00
Ahmed Bougacha	4a85643907	[MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin. Follow-up to r235963: this matches other assemblers and is less unexpected (e.g. PR23227). llvm-svn: 252681	2015-11-11 00:51:36 +00:00
Matt Arsenault	8246d4aead	AMDGPU: Print more fields in comments llvm-svn: 252677	2015-11-11 00:27:46 +00:00
Matt Arsenault	61cb6fa848	AMDGPU: Remove dead code llvm-svn: 252675	2015-11-11 00:01:36 +00:00
Matt Arsenault	6690d7de39	AMDGPU: Set isAllocatable = 0 on VS_32/VS_64 llvm-svn: 252674	2015-11-11 00:01:32 +00:00
Reid Kleckner	7f84a939ed	[WinEH] Insert the MBB for EH_RESTORE after the catchret Inserting it before the target block could be bad, we might already have a fallthrough edge to it. llvm-svn: 252670	2015-11-10 23:22:20 +00:00
Dan Gohman	16d314d300	[WebAssembly] Remove special cases for things that are no longer special. NFC. llvm-svn: 252656	2015-11-10 21:48:21 +00:00
Bill Schmidt	3c44c6f189	Add PPCMIPeephole.cpp to CMakeLists.txt llvm-svn: 252654	2015-11-10 21:43:45 +00:00
Dan Gohman	b84ae9bb38	[WebAssembly] Support for floating point min and max. llvm-svn: 252653	2015-11-10 21:40:21 +00:00
Bill Schmidt	34af5e1c76	[PowerPC] Add an MI SSA peephole pass. This patch adds a pass for doing PowerPC peephole optimizations at the MI level while the code is still in SSA form. This allows for easy modifications to the instructions while depending on a subsequent pass of DCE. Both passes are very fast due to the characteristics of SSA. At this time, the only peepholes added are for cleaning up various redundancies involving the XXPERMDI instruction. However, I would expect this will be a useful place to add more peepholes for inefficiencies generated during instruction selection. The pass is placed after VSX swap optimization, as it is best to let that pass remove unnecessary swaps before performing any remaining clean-ups. The utility of these clean-ups is demonstrated by changes to four existing test cases, all of which now have tighter expected code generation. I've also added Eric Schweiz's bugpoint-reduced test from PR25157, for which we now generate tight code. One other test started failing for me, and I've fixed it (test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not related to my changes, and I'm not sure why it works before and not after. The problem is that the CHECK-NOT: of "statepoint" from test1 fails because of the "statepoint" in test2, and so forth. Adding a CHECK-LABEL in between keeps the different occurrences of that string properly scoped. llvm-svn: 252651	2015-11-10 21:38:26 +00:00
Sanjay Patel	af1b48bfdc	[ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz r0, r0 bx lr cttz: rbit r0, r0 clz r0, r0 bx lr Instead of: ctlz: cmp r0, #0 moveq r0, #32 clzne r0, r0 bx lr cttz: cmp r0, #0 moveq r0, #32 rbitne r0, r0 clzne r0, r0 bx lr This will help solve a general speculation/despeculation problem noted in PR24818: https://llvm.org/bugs/show_bug.cgi?id=24818 Differential Revision: http://reviews.llvm.org/D14469 llvm-svn: 252639	2015-11-10 19:24:31 +00:00
Sanjay Patel	241c31fb64	[AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any AArch64 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz w0, w0 ret cttz: rbit w8, w0 clz w0, w8 ret Instead of: ctlz: cbz w0, .LBB0_2 clz w0, w0 ret .LBB0_2: orr w0, wzr, #0x20 ret cttz: cbz w0, .LBB1_2 rbit w8, w0 clz w0, w8 ret .LBB1_2: orr w0, wzr, #0x20 ret See D14469 for the larger motivation. Differential Revision: http://reviews.llvm.org/D14505 llvm-svn: 252625	2015-11-10 18:11:37 +00:00
Michael Kuperstein	a01a5ee72f	[X86] Do not try to custom-lower sitofp/fptosi in soft-float mode Differential Revision: http://reviews.llvm.org/D14495 llvm-svn: 252621	2015-11-10 17:37:49 +00:00
James Molloy	9d55f19cfa	Reapply "[ARM] Combine CMOV into BFI where possible" Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh! Original commit message: If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252606	2015-11-10 14:22:05 +00:00
Tilmann Scheller	990a8d88c8	[PowerPC] Remove redundant code. The local variable Hi is never being read. Issue identified by the Clang static analyzer. llvm-svn: 252600	2015-11-10 12:29:37 +00:00
Oliver Stannard	d414c99b9c	[AArch64] Fix halfword load merging for big-endian targets For big-endian targets, when we merge two halfword loads into a word load, the order of the halfwords in the loaded value is reversed compared to little-endian, so the load-store optimiser needs to swap the destination registers. This does not affect merging of two word loads, as we use ldp, which treats the memory as two separate 32-bit words. llvm-svn: 252597	2015-11-10 11:04:18 +00:00
Igor Breger	b6b27af46a	AVX512 : Implemented encoding and DAG lowering for VMOVHPS/PD and VMOVLPS/PD instructions. Differential Revision: http://reviews.llvm.org/D14492 llvm-svn: 252592	2015-11-10 07:09:07 +00:00
David Blaikie	578a31fe0a	Remove another variable unused in -Asserts build llvm-svn: 252582	2015-11-10 04:10:04 +00:00
David Blaikie	e35168f008	Remove some unused variables to clean up the -Werror build llvm-svn: 252580	2015-11-10 03:16:28 +00:00
Colin LeMahieu	3c7ecf9af1	[Hexagon] Adding instruction aliases and tests. llvm-svn: 252579	2015-11-10 01:58:26 +00:00
Andy Ayers	809cbe9ea0	Support for emitting inline stack probes For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578	2015-11-10 01:50:49 +00:00
Colin LeMahieu	13cc3ab785	[Hexagon] Fixing compound register printing and reenabling more tests. llvm-svn: 252574	2015-11-10 00:51:56 +00:00
Tim Northover	339c83e27f	AArch64: add experimental support for address tagging. AArch64 has the ability to use the top 8 bits of an "address" for extra information, with the memory subsystem automatically masking them off for loads and stores. When that's happening, we can sometimes skip masks on memory operations in the compiler. However, this requires the host OS and support stack to preserve those bits so it can't be enabled everywhere. In principle iOS 8.0 and above do take the required precautions, but we'll put it under a flag for now. llvm-svn: 252573	2015-11-10 00:44:23 +00:00
Derek Schuff	ffa143ce81	[WebAssembly] Support 'unreachable' expression Lower LLVM's 'unreachable' terminator to ISD::TRAP, and lower ISD::TRAP to wasm's 'unreachable' expression. WebAssembly type-checks expressions, but a noreturn function with a return type that doesn't match the context will cause a check failure. So we lower LLVM 'unreachable' to ISD::TRAP and then lower that to WebAssembly's 'unreachable' expression, which typechecks in any context and causes a trap if executed. Differential Revision: http://reviews.llvm.org/D14515 llvm-svn: 252566	2015-11-10 00:30:57 +00:00
Colin LeMahieu	b7a5f9fc29	[Hexagon] Fixing store instructions and reenabling a few more tests. llvm-svn: 252561	2015-11-10 00:22:00 +00:00
Akira Hatanaka	3bfc3e2d2a	[ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction. This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where llvm_unreachable was reached because t2ADDri wasn't handled. Test case provided by Tim Northover. rdar://problem/23270609 http://reviews.llvm.org/D14518 llvm-svn: 252557	2015-11-10 00:10:41 +00:00
Colin LeMahieu	8ab7e8e1b5	[Hexagon] Fixing load instruction parsing and reenabling tests. llvm-svn: 252555	2015-11-10 00:02:27 +00:00
Reid Kleckner	420f0542cc	[WinEH] Remove isBarrier from instructions that do not return Fixes machine verification failures with David's latest EH change. llvm-svn: 252541	2015-11-09 23:34:42 +00:00
Sanjay Patel	533c10c651	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539	2015-11-09 23:31:38 +00:00
David Majnemer	2652b75700	[WinEH] Don't emit CATCHRET from visitCatchPad Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528	2015-11-09 23:07:48 +00:00
Sanjay Patel	32538d6811	[x86] try harder to match bitwise 'or' into an LEA The motivation for this patch starts with the epic fail example in PR18007: https://llvm.org/bugs/show_bug.cgi?id=18007 ...unfortunately, this patch makes no difference for that case, but it solves some simpler cases. We'll get there some day. :) The current 'or' matching code was using computeKnownBits() via isBaseWithConstantOffset() -> MaskedValueIsZero(), but that's an unnecessarily limited use. We can do more by copying the logic in ValueTracking's haveNoCommonBitsSet(), so we can treat the 'or' as if it was an 'add'. There's a TODO comment here because we should lift the bit-checking logic into a helper function, so it's not duplicated in DAGCombiner. An example of the better LEA matching: leal (%rdi,%rdi), %eax andl $1, %esi orl %esi, %eax Becomes: andl $1, %esi leal (%rsi,%rdi,2), %eax Differential Revision: http://reviews.llvm.org/D13956 llvm-svn: 252515	2015-11-09 21:16:49 +00:00
Colin LeMahieu	9d851f0435	[Hexagon] Separating statement to match what clang-format would do. llvm-svn: 252513	2015-11-09 21:06:28 +00:00
Reid Kleckner	64b003f05d	[WinEH] Tweak funclet prologue/epilogue insertion to pass verifier For some reason we'd never run MachineVerifier on WinEH code, and you explicitly have to ask for it with llc. I added it to a few test cases to get some coverage. Fixes PR25461. llvm-svn: 252512	2015-11-09 21:04:00 +00:00
Reid Kleckner	390191dacc	[Hexagon] Fix -Wmicrosoft-enum-value warning with explicit enum type llvm-svn: 252505	2015-11-09 19:44:38 +00:00
Sanjay Patel	776e59b0fe	don't repeat function names in comments; NFC llvm-svn: 252502	2015-11-09 19:18:26 +00:00
Charlie Turner	90dafb1b6d	[AArch64] Add UABDL patterns for log2 shuffle. Summary: This matches the sum-of-absdiff patterns emitted by the vectoriser using log2 shuffles. Relies on D14207 to be able to match the `extract_subvector(..., 0)` Reviewers: t.p.northover, jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14208 llvm-svn: 252465	2015-11-09 13:10:52 +00:00
Charlie Turner	7b7b06f737	[AArch64] Handle extract_subvector(..., 0) in ISel. Summary: Lowering this pattern early to an `EXTRACT_SUBREG` was making it impossible to match larger patterns in tblgen that use `extract_subvector(..., 0)` as part of their input pattern. It seems like there will exist somewhere a better way of specifying this pattern over all relevant register value types, but I didn't manage to find it. Reviewers: t.p.northover, jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14207 llvm-svn: 252464	2015-11-09 12:45:11 +00:00
Renato Golin	6d435f12f0	[EABI] Add LLVM support for -meabi flag "GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent '__aeabi_memops'. This conversion violates GCC contract. The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI targets. Without -meabi: use the triple default EABI. With -meabi=default: use the triple default EABI. With -meabi=gnu: use 'memops'. With -meabi=4 or -meabi=5: use '__aeabi_memops'. With -meabi set to an unknown value: same as -meabi=default. Patch by Vinicius Tinti. llvm-svn: 252462	2015-11-09 12:40:30 +00:00
Renato Golin	1d8a2c952f	Revert "[ARM] Combine CMOV into BFI where possible" This reverts commit r252057, as it broke ARM self-hosting buildbots, probably due to a code-gen fault. llvm-svn: 252460	2015-11-09 12:19:10 +00:00
Colin LeMahieu	9ea507edc7	[Hexagon] Adding override to methods. llvm-svn: 252453	2015-11-09 07:10:24 +00:00
Colin LeMahieu	775d7ad677	[Hexagon] Fixing warnings. llvm-svn: 252448	2015-11-09 05:47:56 +00:00
Colin LeMahieu	a1adb51e6b	[Hexagon] Removing extra gen line. llvm-svn: 252447	2015-11-09 05:31:39 +00:00
Colin LeMahieu	892f54f408	[Hexagon] Maybe the makefile? llvm-svn: 252446	2015-11-09 05:16:08 +00:00
Colin LeMahieu	d5537bf219	[Hexagon] Adding LLVMBuild.txt reference to HexagonAsmParser. llvm-svn: 252444	2015-11-09 04:31:02 +00:00
Colin LeMahieu	7cd0892729	[Hexagon] Enabling ASM parsing on Hexagon backend and adding instruction parsing tests. General updating of the code emission. llvm-svn: 252443	2015-11-09 04:07:48 +00:00
Colin LeMahieu	8a0453e23a	[AsmParser] Backends can parameterize ASM tokenization. llvm-svn: 252439	2015-11-09 00:31:07 +00:00
Hal Finkel	f046f72efa	[PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplifications Under most circumstances, if SCEV can simplify X-Y to a constant, then it can also simplify Y-X to a constant. However, there is no guarantee that this is always true, and consensus is not to consider that a correctness bug in SCEV (although it is undesirable). PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and prefetches) into buckets, where in each bucket the relative pointer offsets are constant. We used to keep each bucket as a multimap, where SCEV's subtraction operation was used to define the ordering predicate. Instead, use a fixed SCEV base expression for each bucket, record the constant offsets from that base expression, and adjust it later, if desirable, once all pointers have been collected. Doing it this way should be more compile-time efficient than the previous scheme (in addition to making the implementation less sensitive to SCEV simplification quirks). Fixes PR25170. llvm-svn: 252417	2015-11-08 08:04:40 +00:00
David Majnemer	e35244cf63	[WinEH] Update PHIs of CATCHRET successors The TailDuplication machine pass ran across a malformed CFG: a PHI node referred to its predecessor's predecessor instead of its predecessor. This occurred because we split the edge in X86ISelLowering when we processed the CATCHRET but forgot to do something about the PHI nodes. This fixes PR25444. llvm-svn: 252413	2015-11-08 02:36:00 +00:00
Nico Weber	00406472e8	Try to fix build more -- like r252392 but for WebAssembly. llvm-svn: 252394	2015-11-07 02:47:31 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Duncan P. N. Exon Smith	83c4b68720	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
Ahmed Bougacha	cf49b523a0	[AArch64][FastISel] Don't even try to select vector icmps. We used to try to constant-fold them to i32 immediates. Given that fast-isel doesn't otherwise support vNi1, when selecting the result users, we'd fallback to SDAG anyway. However, if the users were in another block, we'd insert broken cross-class copies (GPR32 to FPR64). Give up, let SDAG agree with itself on a vNi1 legalization strategy. llvm-svn: 252364	2015-11-06 23:16:53 +00:00
Ahmed Bougacha	b49eb3ab4b	[X86] Fold (trunc (i32 (zextload i16))) into vbroadcast. When matching non-LSB-extracting truncating broadcasts, we now insert the necessary SRL. If the scalar resulted from a load, the SRL will be folded into it, creating a narrower, offset, load. However, i16 loads aren't Desirable, so we get i16->i32 zextloads. We already catch i16 aextloads; catch these as well. llvm-svn: 252363	2015-11-06 23:16:48 +00:00
Ahmed Bougacha	05a0514b12	[X86] SRL non-LSB extracts when folding to truncating broadcasts. Now that we recognize this, we can support it instead of bailing out. That is, we can fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc (srl Y, 16))))) llvm-svn: 252362	2015-11-06 23:16:43 +00:00
Ahmed Bougacha	68614a36d1	[X86] Don't fold non-LSB extracts into truncating broadcasts. We used to incorrectly assume that the offset we're extracting from was a multiple of the element size. So, we'd fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc Y)))) whereas we should have extracted the higher bits from X. Instead, bail out if the assumption doesn't hold. llvm-svn: 252361	2015-11-06 23:16:38 +00:00
Tom Stellard	41b7e63040	AMDGPU/SI: Refactor VOP[12C] tablegen definitions Summary: Pass the VOPProfile object all the way through to *_m multiclasses. This will allow us to do more simplifications in the future. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13437 llvm-svn: 252339	2015-11-06 20:56:18 +00:00
Andrew Kaylor	4731bea3e5	Improved the operands commute transformation for X86-FMA3 instructions. All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335	2015-11-06 19:47:25 +00:00
Dan Gohman	4b96d8d1ff	[WebAssembly] Make expression-stack pushing explicit Modelling of the expression stack is evolving. This patch takes another step by making pushes explicit. Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252334	2015-11-06 19:45:01 +00:00
Matt Arsenault	f59e538937	AMDGPU: Cleanup includes llvm-svn: 252328	2015-11-06 18:23:00 +00:00
Matt Arsenault	0c90e9501e	AMDGPU: Create emergency stack slots during frame lowering Test has a bogus verifier error which will be fixed by later commits. llvm-svn: 252327	2015-11-06 18:17:45 +00:00
Matt Arsenault	08f14de244	AMDGPU: Remove unused scratch resource operands The SGPR spill pseudos don't actually use them. llvm-svn: 252324	2015-11-06 18:07:53 +00:00
Matt Arsenault	3931948bb6	AMDGPU: Add pass to detect used kernel features Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323	2015-11-06 18:01:57 +00:00
Matt Arsenault	4dc7a5a5c6	AMDGPU: Fix hardcoded alignment of spill. Instead of forcing 4 alignment when spilled, set register class alignments. llvm-svn: 252322	2015-11-06 17:54:47 +00:00
Matt Arsenault	623e6fd466	AMDGPU: Hack for VS_32 register pressure For some reason VS_32 ends up factoring into the pressure heuristics even though we should never see a virtual register with this class. When SGPRs are reserved for register spilling, this for some reason triggers reg-crit scheduling. Setting isAllocatable = 0 may help with this since that seems to remove it from the default implementation's generated table. llvm-svn: 252321	2015-11-06 17:54:43 +00:00
Reid Kleckner	b8fd162fc5	[WinEH] Mark funclet entries and exits as clobbering all registers Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318	2015-11-06 17:06:38 +00:00
Jun Bum Lim	22fe15ee86	[AArch64]Enable the narrow ld promotion only on profitable microarchitectures The benefit from converting narrow loads into a wider load (r251438) could be micro-architecturally dependent, as it assumes that a single load with two bitfield extracts is cheaper than two narrow loads. Currently, this conversion is enabled only in cortex-a57 on which performance benefits were verified. llvm-svn: 252316	2015-11-06 16:27:47 +00:00
Daniel Sanders	5762a4f9d1	[mips][ias] Range check uimm4 operands and fixed a bug this revealed. Summary: The bug was that the sldi instructions have immediate widths dependent on their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit immediate. All of these were using 4-bit immediates previously. Reviewers: vkalintiris Subscribers: llvm-commits, atanasyan, dsanders Differential Revision: http://reviews.llvm.org/D14018 llvm-svn: 252297	2015-11-06 12:41:43 +00:00
Daniel Sanders	38ce0f629c	[mips][ias] Range check uimm3 operands. Summary: Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14016 llvm-svn: 252296	2015-11-06 12:31:27 +00:00
Daniel Sanders	ea4f653d18	[mips][ias] Range check uimm2 operands and fix a bug this revealed. Summary: The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA (unlike the MSA version) failed to account for the off-by-one encoding of the immediate. The range is actually 1..4 rather than 0..3. Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14015 llvm-svn: 252295	2015-11-06 12:22:31 +00:00
Daniel Sanders	52da7af4d2	[mips][ias] Range check uimmz operands. Reviewers: vkalintiris Subscribers: dsanders, atanasyan, llvm-commits Differential Revision: http://reviews.llvm.org/D14013 llvm-svn: 252294	2015-11-06 12:11:03 +00:00
Vasileios Kalintiris	b04672cade	[mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes. Summary: Without these patterns we would generate a complete LL/SC sequence. This would be problematic for memory regions marked as WRITE-only or READ-only, as the instructions LL/SC would read/write to the protected memory regions correspondingly. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14397 llvm-svn: 252293	2015-11-06 12:07:20 +00:00
Tom Stellard	1e1b05db24	AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291	2015-11-06 11:45:14 +00:00
Reid Kleckner	51460c139e	[WinEH] Split EH_RESTORE out of CATCHRET for 32-bit EH This adds the EH_RESTORE x86 pseudo instr, which is responsible for restoring the stack pointers: EBP and ESP, and ESI if stack realignment is involved. We only need this on 32-bit x86, because on x64 the runtime restores CSRs for us. Previously we had to keep the CATCHRET instruction around during SEH so that we could convince X86FrameLowering to restore our frame pointers. Now we can split these instructions earlier. This was confusing, because we had a return instruction which wasn't really a return and was ultimately going to be removed by X86FrameLowering. This change also simplifies X86FrameLowering, which really shouldn't be building new MBBs. No observable functional change currently, but with the new register mask stuff in D14407, CATCHRET will become a register allocator barrier, and our existing tests rely on us having reasonable register allocation around SEH. llvm-svn: 252266	2015-11-06 01:49:05 +00:00
Tim Northover	775aaeb765	Remove windows line endings introduced by r252177. NFC. llvm-svn: 252217	2015-11-05 21:54:58 +00:00
Reid Kleckner	6ddae31045	[WinEH] Fix funclet prologues with stack realignment We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requiring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210	2015-11-05 21:09:49 +00:00
Dan Gohman	b9ce5a8b6c	[WebAssembly] Fix copypasta. Noticed by dschff in http://reviews.llvm.org/rL252203 llvm-svn: 252208	2015-11-05 20:59:49 +00:00
Dan Gohman	da7f428a4a	[WebAssembly] Rename Immediate instructions to Const. This more closely reflects the naming convention in the spec. llvm-svn: 252204	2015-11-05 20:44:29 +00:00
Dan Gohman	af29bd4fd4	[WebAssembly] Add AsmString strings for most instructions. Mangling type information into MachineInstr opcode names was a temporary measure, and it's starting to get hairy. At the same time, the MC instruction printer wants to use AsmString strings for printing. This patch takes the first step, starting the process of adding AsmStrings for instructions. llvm-svn: 252203	2015-11-05 20:42:30 +00:00
Dan Gohman	d7ffb919c1	[WebAssembly] Update wasm builtin functions to match spec changes. The page_size operator has been removed from the spec, and the resize_memory operator has been changed to grow_memory. llvm-svn: 252202	2015-11-05 20:16:59 +00:00
Sanjay Patel	387e66e79f	replace MachineCombinerPattern namespace and enum with enum class; NFCI Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196	2015-11-05 19:34:57 +00:00
Dan Gohman	e9361d58ff	[WebAssembly] Add WebAssemblyMCInstLower.cpp. This isn't used yet; it's just a start towards eventually using MC to do instruction printing, and eventually binary encoding. llvm-svn: 252194	2015-11-05 19:28:16 +00:00
Oleg Ranevskyy	057c5a6b2b	[DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177	2015-11-05 17:50:17 +00:00
Petar Jovanovic	99fba3c141	Add cfi instr for CFA calculation when movpc is expanded to call and pop This fixes the issue of wrong CFA calculation in the following case: 0x08048400 <+0>: push %ebx 0x08048401 <+1>: sub $0x8,%esp 0x08048404 <+4>: call 0x8048409 <test+9> 0x08048409 <+9>: pop %eax 0x0804840a <+10>: add $0x1bf7,%eax 0x08048410 <+16>: mov %eax,%ebx 0x08048412 <+18>: call 0x80483f0 <bar> 0x08048417 <+23>: add $0x8,%esp 0x0804841a <+26>: pop %ebx 0x0804841b <+27>: ret The highlighted instructions are a product of movpc instruction. The call instruction changes the stack pointer, and pop instruction restores its value. However, the rule for computing CFA is not updated and is wrong on the pop instruction. So, e.g. backtrace in gdb does not work when on the pop instruction. This adds cfi instructions for both call and pop instructions. cfi_adjust_cfa_offset instruction is used with the appropriate offset for setting the rules to calculate CFA correctly. Patch by Violeta Vukobrat. Differential Revision: http://reviews.llvm.org/D14021 llvm-svn: 252176	2015-11-05 17:19:59 +00:00
Derek Schuff	8a76b04a63	[WebAssembly] Rename ior operator to or to match the spec Summary: The spec uses "or" for inclusive-or and "xor" for exclusive-or Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D14362 llvm-svn: 252174	2015-11-05 17:08:11 +00:00
James Molloy	bef6e43107	[ARM] Compute known bits for ARMISD::CMOV We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities. I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case. llvm-svn: 252168	2015-11-05 15:21:58 +00:00
Asaf Badouh	f99c054ebc	revert rev. 252153 due to build failure on ubuntu [X86][AVX512] add comi with Sae llvm-svn: 252154	2015-11-05 08:55:54 +00:00
Asaf Badouh	7fdabf0a35	[X86][AVX512] add comi with Sae add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 252153	2015-11-05 08:45:06 +00:00
Asaf Badouh	a8209d92cc	[X86][AVX512] small bugfix in VPBROADCASTM VPBROADCASTMW2D and VPBROADCASTMB2Q Differential Revision: http://reviews.llvm.org/D14335 llvm-svn: 252151	2015-11-05 08:08:21 +00:00
Matt Arsenault	5b22dfa65d	AMDGPU: Also track whether SGPRs were spilled llvm-svn: 252145	2015-11-05 05:27:10 +00:00
Matt Arsenault	d41c0dbff0	AMDGPU: Print number of user SGPRs This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144	2015-11-05 05:27:07 +00:00
Matt Arsenault	68802d3177	AMDGPU: Disallow s[102:103] on VI in assembler llvm-svn: 252142	2015-11-05 03:11:27 +00:00
Matt Arsenault	a40450cba2	AMDGPU: Fix assert when legalizing atomic operands The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140	2015-11-05 02:46:56 +00:00
Matt Arsenault	bed42a7320	AMDGPU: Make addr64 atomic operand order consistent vaddr comes before srsrc in every other MUBUF instruction, and is the order it is printed. llvm-svn: 252139	2015-11-05 02:46:53 +00:00
Joseph Tremoulet	6afccf6120	[WinEH] Fix establisher param reg in CLR funclets Summary: The CLR's personality routine passes the pointer to the establisher frame in RCX, not RDX. Reviewers: pgavlin, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14343 llvm-svn: 252135	2015-11-05 02:20:07 +00:00
Rafael Espindola	e61a902371	Go back to producing relocations for out of range symbols. This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back. llvm-svn: 252119	2015-11-05 01:10:15 +00:00
Matt Arsenault	6c2e200d38	AMDGPU: Fix typo llvm-svn: 252116	2015-11-05 01:03:08 +00:00
Rafael Espindola	49b8548903	Slightly saner handling of thumb branches. The generic infrastructure already did a lot of work to decide if the fixup value is known or not. It doesn't make sense to reimplement a very basic case: same fragment. llvm-svn: 252090	2015-11-04 23:00:39 +00:00
Quentin Colombet	421723cdd8	[x86] Teach the shrink-wrapping hooks to do the proper thing with Win64. Win64 has some strict requirements for the epilogue. As a result, we disable shrink-wrapping for Win64 unless the block that gets the epilogue is already an exit block. Fixes PR24193. llvm-svn: 252088	2015-11-04 22:37:28 +00:00
Simon Pilgrim	f669d381f9	Warning fix. llvm-svn: 252078	2015-11-04 21:27:22 +00:00
Simon Pilgrim	7e6606f4f1	[X86][SSE] Add general memory folding for (V)INSERTPS instruction This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 llvm-svn: 252074	2015-11-04 20:48:09 +00:00
Sanjoy Das	b11b440f8e	[IR] Add bounds checking to paramHasAttr Summary: This is intended to make a later change simpler. Note: adding this bounds checking required fixing `X86FastISel`. As far as I can tell I've preserved original behavior but a careful review will be appreciated. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14304 llvm-svn: 252073	2015-11-04 20:33:45 +00:00
Andrew Kaylor	e41a8c4182	Created new X86 FMA3 opcodes (FMA_Int) that are used now for lowering of scalar FMA intrinsics. Patch by Slava Klochkov The key difference between FMA and FMA_Int opcodes is that FMA_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky Differential Revision: http://reviews.llvm.org/D13710 llvm-svn: 252060	2015-11-04 18:10:41 +00:00
James Molloy	e7d679cf4c	[ARM] Combine CMOV into BFI where possible If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252057	2015-11-04 16:55:07 +00:00
Michael Kuperstein	a3b79dd783	[ELF] elfiamcu triple should imply e_machine == EM_IAMCU Differential Revision: http://reviews.llvm.org/D14109 llvm-svn: 252043	2015-11-04 11:21:50 +00:00
Michael Kuperstein	b34de72269	[X86] DAGCombine should not introduce FILD in soft-float mode The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode. llvm-svn: 252042	2015-11-04 11:17:53 +00:00
Peter Collingbourne	94d778697a	CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering. A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI. Differential Revision: http://reviews.llvm.org/D14168 llvm-svn: 252014	2015-11-03 23:40:03 +00:00
Matt Arsenault	aac9b49325	AMDGPU: Make flat_scratch name consistent The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one. llvm-svn: 252010	2015-11-03 22:50:34 +00:00
Matt Arsenault	967c2f5dee	AMDGPU: Fix asserts on invalid register ranges If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size. llvm-svn: 252009	2015-11-03 22:50:32 +00:00
Matt Arsenault	3473c72aab	AMDGPU: Fix off by one error in register parsing If trying to use one past the end, this would assert. llvm-svn: 252008	2015-11-03 22:50:27 +00:00
Derek Schuff	b44d4d350e	Align whitespace llvm-svn: 252003	2015-11-03 22:40:43 +00:00
Derek Schuff	6b5c6da760	[WebAssembly] Support wasm select operator Summary: Add support for wasm's select operator, and lower LLVM's select DAG node to it. Reviewers: sunfish Subscribers: dschuff, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D14295 llvm-svn: 252002	2015-11-03 22:40:40 +00:00
Matt Arsenault	e8ed13d946	AMDGPU: s[102:103] is unavailable on VI llvm-svn: 252000	2015-11-03 22:39:52 +00:00
Matt Arsenault	192b282bf3	AMDGPU: Define correct number of SGPRs There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches. llvm-svn: 251999	2015-11-03 22:39:50 +00:00
Matt Arsenault	6c0674112a	AMDGPU: Make findUsedSGPR more readable Add more comments etc. llvm-svn: 251996	2015-11-03 22:30:15 +00:00
Matt Arsenault	782c03bb7e	AMDGPU: Initialize SIFixSGPRCopies so -print-after works llvm-svn: 251995	2015-11-03 22:30:13 +00:00
Matt Arsenault	d9d659aa23	AMDGPU: Alphabetize includes llvm-svn: 251994	2015-11-03 22:30:08 +00:00
Simon Pilgrim	e88dc04c48	[X86][XOP] Add support for the matching of the VPCMOV bit select instruction XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) This patch adds tablegen pattern matching for this instruction. Differential Revision: http://reviews.llvm.org/D8841 llvm-svn: 251975	2015-11-03 20:27:01 +00:00
Michael Kuperstein	73dc85293f	[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904	2015-11-03 08:17:25 +00:00
Igor Breger	4ec5abffae	AVX512: add encoding tests for vmovq/d instructions. llvm-svn: 251903	2015-11-03 07:30:17 +00:00
Matthias Braun	f538e133cc	Fix build problem introduced in r251883 llvm-svn: 251888	2015-11-03 02:19:07 +00:00
Matthias Braun	93563e7032	ScheduleDAGInstrs: Remove IsPostRA flag; NFC ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags parameters have been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883	2015-11-03 01:53:29 +00:00
Colin LeMahieu	160f73e36f	[Hexagon] Fixing mistaken case fallthrough. llvm-svn: 251867	2015-11-03 00:21:19 +00:00
Matt Arsenault	f1aebbf33a	AMDGPU: Stop assuming vreg for build_vector This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860	2015-11-02 23:30:48 +00:00
Derek Schuff	43e96c4feb	[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter llvm-svn: 251859	2015-11-02 23:23:16 +00:00
Matt Arsenault	d48da14269	AMDGPU: Error on graphics shaders with HSA I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858	2015-11-02 23:23:02 +00:00
Matt Arsenault	0de924b76d	AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855	2015-11-02 23:15:42 +00:00
Bill Schmidt	8ed7cec170	[PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval pass Replace some hacky code with the proper way to get at this data. No functional change. llvm-svn: 251848	2015-11-02 22:43:57 +00:00
Tim Northover	155103ec18	WatchOS: update default CPU for triple after t2dsp -> dsp rename llvm-svn: 251814	2015-11-02 18:21:07 +00:00
Nemanja Ivanovic	be5f0c04f1	Fix for bootstrap bug introduced in r244921 This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. It turns out that the new code path taken due to legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a micro optimization to change a load followed by a scalar_to_vector into a load and splat instruction on PPC. llvm-svn: 251798	2015-11-02 14:01:11 +00:00
Igor Breger	fa798a9dbb	AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2 instructions. Differential Revision: http://reviews.llvm.org/D14216 llvm-svn: 251781	2015-11-02 07:39:36 +00:00
Craig Topper	45e83b8ba7	[X86] Remove assertions that check for valid scale values on scatter/gather intrinsics. Nothing upstream prevented illegal values from getting here. llvm-svn: 251780	2015-11-02 07:24:40 +00:00
Craig Topper	e69eb78510	[X86] Fold 'if' followed by just an llvm_unreachable into an assert. llvm-svn: 251778	2015-11-02 07:24:34 +00:00
Craig Topper	aebab7c03f	[X86] Use isa instead of dyn_cast in a bool context. NFC llvm-svn: 251777	2015-11-02 07:24:32 +00:00
Craig Topper	c70af642a2	[X86] Remove some llvm_unreachables after switches that already have an unreachable in their default case. llvm-svn: 251776	2015-11-02 07:24:30 +00:00
Craig Topper	d6a77ca4bb	[X86] Remove a 'break' after an llvm_unreachable. llvm-svn: 251775	2015-11-02 07:24:27 +00:00
Craig Topper	d49a41793c	[X86] Use cast instead of dyn_cast and a null check marked unreachable. llvm-svn: 251774	2015-11-02 07:24:25 +00:00
Craig Topper	95ceb5a60a	[X86] Use MVT instead of EVT when the type is known to be simple. NFC llvm-svn: 251772	2015-11-02 05:24:22 +00:00
NAKAMURA Takumi	50df0c2037	Untabify. llvm-svn: 251769	2015-11-02 01:38:12 +00:00
Elena Demikhovsky	db738d9cc3	AVX-512: Optimized SIMD truncate operations for AVX512F set. Optimized <8 x i32> to <8 x i16>, <4 x i64> to <4 x i32>, and <16 x i16> to <16 x i8>. All these operations now use the AVX512F set (KNL). Before this change they were implemented with the AVX2 set. Differential Revision: http://reviews.llvm.org/D14108 llvm-svn: 251764	2015-11-01 11:45:47 +00:00
Craig Topper	ec2ea4817e	[X86] Replace getScalarType with getVectorElementType when the type is already known to be a vector. This should result in slightly less code. NFC llvm-svn: 251751	2015-10-31 21:44:52 +00:00
Craig Topper	476be8f94a	[X86] Convert to MVT instead of calling EVT functions since we already know the type is simple. NFC llvm-svn: 251745	2015-10-31 18:14:17 +00:00
Craig Topper	0fec4d8ce7	[X86] Call getScalarSizeInBits() instead of getScalarType().getScalarSizeInBits(). NFC llvm-svn: 251744	2015-10-31 18:14:15 +00:00
Craig Topper	0e7680da9f	[X86] Remove two const references to the return value of a constructor and just use normal object creation syntax. NFC llvm-svn: 251743	2015-10-31 17:28:02 +00:00
Craig Topper	7b1d3a8a6c	[X86] Replace EVT with MVT in some more places. NFC llvm-svn: 251742	2015-10-31 17:27:59 +00:00
Craig Topper	63c2925b87	[X86] Fix indentation of case statements in switch. NFC llvm-svn: 251741	2015-10-31 17:27:56 +00:00
Craig Topper	5c8a378f48	[X86] Reduce math for index calculation for inserting and extracting subvectors and elements by exploiting the fact that all supported vector types have a power 2 number of elements. llvm-svn: 251740	2015-10-31 17:27:52 +00:00
JF Bastien	5789a69435	[WebAssembly] Fix import statement Summary: Imports should be generated like (param i32 f32...) not (param i32) (param f32) ... Author: binji Reviewers: jfb Subscribers: jfb, dschuff llvm-svn: 251714	2015-10-30 16:41:21 +00:00
Craig Topper	9377f01f21	[X86] Use is128BitVector/is256BitVector/is512BitVector in place of getSizeInBits == in some places. NFC llvm-svn: 251687	2015-10-30 04:31:18 +00:00
Craig Topper	62c3ed0ae3	[X86] Minor formatting fixes. NFC. llvm-svn: 251686	2015-10-30 04:31:14 +00:00
Craig Topper	9ef327c962	[X86] Use MVT instead of EVT in some places. NFC Prior to this the compiled code probably had extra checks for extended types that won't ever execute. llvm-svn: 251682	2015-10-30 03:19:12 +00:00
Simon Pilgrim	ca56a72af9	[X86][SSE] Shuffle blends with zero This patch generalizes the zeroing of vector elements with the BLEND instructions. Currently a zero vector will only blend if the shuffled elements are correctly inline, this patch recognises when a vector input is zero (or zeroable) and modifies a local copy of the shuffle mask to support a blend. As a zeroable vector input may not be all zeroes, the zeroable vector is regenerated if necessary. Differential Revision: http://reviews.llvm.org/D14050 llvm-svn: 251659	2015-10-29 22:11:28 +00:00
Jonas Paulsson	45d5c673ec	[SystemZ] Make the CCRegs regclass non-allocatable. This was discovered to be necessary while running memchr-01.ll with -verify-machineinstrs, because it is not allowed to have a phys reg live across block boundaries while on SSA form, if the register is allocatable (except in entry block and landing pads). In this test case, stringRRE pseudos are expanded after isel by adding a loop block which produces a live out CC register. To make the test pass, it was also necessary to not say that StringRRELoop pseudo uses R0L, this is only true for the StringRRE opcode. -verify-machineinstrs added to memchr-01.ll test. New test case int-cmp-51.ll to test that MachineCSE can eliminate an identical compare (which it couldn't do before). Reviewed by Ulrich Weigand llvm-svn: 251634	2015-10-29 16:13:55 +00:00
Marek Olsak	6f6d318e16	AMDGPU/SI: handle undef for llvm.SI.packf16 llvm-svn: 251632	2015-10-29 15:29:09 +00:00
Marek Olsak	74d084f466	AMDGPU/SI: use S_OR for fneg (fabs f32) llvm-svn: 251631	2015-10-29 15:29:05 +00:00
Marek Olsak	f924dd6f3c	AMDGPU/SI: use S_AND for i1 trunc llvm-svn: 251630	2015-10-29 15:05:03 +00:00
Zoran Jovanovic	796ed6d937	[mips] wrong opcode for ll/sc instructions on mipsr6 when -integrated-as is used Summary: This commit resolves wrong opcodes for ll and sc instructions for r6 architectures, which were generated in method MipsTargetLowering::emitAtomicBinary. Author: Jelena.Losic Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D13593 llvm-svn: 251629	2015-10-29 14:40:19 +00:00
Artyom Skrobov	0ff1ce4038	Recognize that ARM1176JZ[F]-S support TrustZone Summary: ARMv6KZ cores were set up incorrectly in ARM.td; also, the SMI mnemonic (the old name for SMC, as defined in ARMv6KZ) wasn't supported. Reviewers: jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14154 llvm-svn: 251627	2015-10-29 13:56:19 +00:00
Vasileios Kalintiris	2f412684a9	[mips] Check the register class before replacing materializations of zero with $zero in microMIPS. Summary: The microMIPS register class GPRMM16 does not contain the $zero register. However, MipsSEDAGToDAGISel::replaceUsesWithZeroReg() would replace uses of the $dst register: [d]addiu, $dst, $zero, 0 with the $zero register, without checking for membership in the register class of the target machine operand. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13984 llvm-svn: 251622	2015-10-29 10:17:16 +00:00
JF Bastien	7b452e2c63	[WebAssembly] Update opcode name format for conversions Summary: Conversion opcode name format should be f64.convert_u/i64 not f64_convert_u Author: s3ththompson Reviewers: jfb Subscribers: sunfish, jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D14160 llvm-svn: 251613	2015-10-29 04:10:52 +00:00
Benjamin Kramer	4e4ca38bcf	Remove CRLF line endings. llvm-svn: 251594	2015-10-29 02:33:05 +00:00
Hal Finkel	7d0e34eb33	[PowerPC] Recurse through constants when looking for TLS globals We cannot form ctr-based loops around function calls, including calls to __tls_get_addr used for PIC TLS variables. References to such TLS variables, however, might be buried within constant expressions, and so we need to search the entire constant expression to be sure that no references to such TLS variables exist. Fixes PR25256, reported by Eric Schweitz. This is a slightly-modified version of the patch suggested by Eric in the bug report, and a test case I created. llvm-svn: 251582	2015-10-28 23:43:00 +00:00
Hal Finkel	bdd292ae22	[PowerPC] Don't return unsupported register classes for asm constraints As a follow-up to r251566, do the same for the other optionally-supported register classes (mostly for vector registers). Don't return an unavailable register class (which would cause an assert later), but fail cleanly when provided an unsupported inline asm constraint. llvm-svn: 251575	2015-10-28 23:03:45 +00:00
Tim Northover	f8e47e4868	ARM: add support for WatchOS's compact unwind information. llvm-svn: 251573	2015-10-28 22:56:36 +00:00
Tim Northover	8b40366b54	ARM: teach backend about WatchOS and TvOS libcalls. The most substantial changes are again for watchOS: libcalls are hard-float if needed and sincos has a different calling convention. llvm-svn: 251571	2015-10-28 22:51:16 +00:00
Tim Northover	e0ccdc6de9	ARM: add backend support for the ABI used in WatchOS At the LLVM level this ABI is essentially a minimal modification of AAPCS to support 16-byte alignment for vector types and the stack. llvm-svn: 251570	2015-10-28 22:46:43 +00:00
Tim Northover	2d4d161519	ARM: support .watchos_version_min and .tvos_version_min. These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min. llvm-svn: 251569	2015-10-28 22:36:05 +00:00
Hal Finkel	34d4149452	[PowerPC] Cleanly reject asm crbit constraint with -crbits When crbits are disabled, cleanly reject the constraint (return the register class only to cause an assert later). llvm-svn: 251566	2015-10-28 22:25:52 +00:00
Cong Hou	da4e8aeec6	[X86] A small fix in X86/X86TargetTransformInfo.cpp: check a value type is simple before calling getSimpleVT(). llvm-svn: 251538	2015-10-28 18:15:46 +00:00
Artyom Skrobov	b43981076a	[ARM] Allow SP in rGPR, starting from ARMv8 Summary: This patch handles assembly and disassembly, but not codegen, as of yet. Additionally, it fixes a bug whereby SP and PC as shifted-reg operands were treated as predictable in ARMv7 Thumb; and it enables the tests for invalid and unpredictable instructions to run on both ARMv7 and ARMv8. Reviewers: jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14141 llvm-svn: 251516	2015-10-28 13:58:36 +00:00
Benjamin Kramer	039b10423a	Put global classes into the appropriate namespace. Most of the cases belong into an anonymous namespace. No functionality change intended. llvm-svn: 251515	2015-10-28 13:54:36 +00:00
Hrvoje Varga	18148671ee	[mips][microMIPS] Implement PAUSE, RDHWR, RDPGPR, SDBBP, SSNOP, SYNC, SYNCI and WAIT instructions Differential Revision: http://reviews.llvm.org/D12628 llvm-svn: 251510	2015-10-28 11:04:29 +00:00
Craig Topper	93d4a9e117	[X86] Make some for loops over MVTs more explicit (and shorter) by just mentioning all the relevant types in an initializer list. NFC llvm-svn: 251500	2015-10-28 05:48:32 +00:00
Craig Topper	3a47587c41	Use range-based for loops and use initializer list to remove a small static array. NFC llvm-svn: 251494	2015-10-28 04:53:27 +00:00
Craig Topper	4b27576001	Remove templates from CostTableLookup functions. All instantiations had the same type. This also lets us remove the versions of the functions that took a statically sized array as we can rely on ArrayRef implicit conversion now. llvm-svn: 251490	2015-10-28 04:02:12 +00:00
Hal Finkel	f4052340a4	[PowerPC] Replace cntlz[.] with cntlzw[.] cntlz is the old POWER mnemonic. cntlzw is the PowerPC mnemonic. This change fixes an issue when -no-integrated-as is used: the opcode cntlz is unrecognized by gas. Alias the POWER mnemonic cntlz[.] to the PowerPC mnemonic cntlzw[.]. This is done because the POWER cntlz mnemonic has been used by LLVM for a very long time. We need to make sure that assembly programs that are using cntlz[.] do not break with this change. Change PowerPC tests to reflect the insn change from cntlz to cntlzw. Add assembly test to verify cntlz[.] is encoded correctly. Patch by Tom Rix! llvm-svn: 251489	2015-10-28 03:26:45 +00:00
Jun Bum Lim	c9879ecfbc	[AArch64]Merge halfword loads into a 32-bit load This recommits r250719, which caused a failure in SPEC2000.gcc because of the incorrect insert point for the new wider load. Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example : ldrh w0, [x2] ldrh w1, [x2, #2] becomes ldr w0, [x2] ubfx w1, w0, #16, #16 and w0, w0, #ffff llvm-svn: 251438	2015-10-27 19:16:03 +00:00
Cong Hou	07eeb8001e	Create a new interface addSuccessorWithoutWeight(MBB) in MBB to add successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't mean that optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this, as many uses of addSuccessor() do not check whether optimization is disabled or not. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429	2015-10-27 17:59:36 +00:00
Asaf Badouh	c7cb880669	[X86][AVX512] add convert float to half Convert float to half with mask/maskz for the reg to reg version and mask for the reg to mem version (there is no maskz version for reg to mem). Differential Revision: http://reviews.llvm.org/D14113 llvm-svn: 251409	2015-10-27 15:37:17 +00:00
Charlie Turner	458e79b814	[ARM] Expand ROTL and ROTR of vector value types Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection. Reviewers: rengolin, t.p.northover Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14082 llvm-svn: 251401	2015-10-27 10:25:20 +00:00
Michael Kuperstein	e1194bdb4f	[X86] Make elfiamcu an OS, not an environment. GNU tools require elfiamcu to take up the entire OS field, so, e.g. i?86-*-linux-elfiamcu is not considered a legal triple. Make us compatible. Differential Revision: http://reviews.llvm.org/D14081 llvm-svn: 251390	2015-10-27 07:23:59 +00:00

35188 Commits