llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	3c2b13b258	[ARM64][CollectLOH] Add some comments to explain how the LOHs framework works (for the compiler part), since the design document is not available. llvm-svn: 205379	2014-04-02 01:02:28 +00:00
Hal Finkel	9e0baa6d3a	[PowerPC] Add some missing VSX bitcast patterns llvm-svn: 205352	2014-04-01 19:24:27 +00:00
Yaron Keren	48d68d439a	If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and Environment == Triple::MSVC so it will never be MinGW or Cygwin. llvm-svn: 205349	2014-04-01 18:52:55 +00:00
Hal Finkel	2eed29f3c8	Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup llvm-svn: 205348	2014-04-01 18:50:34 +00:00
Kai Nacke	af47f60f83	[mips] Add Octeon cnMips instructions mtmX and mtpX Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px". Includes tests. Reviews by: Daniel.Sanders@imgtec.com llvm-svn: 205343	2014-04-01 18:35:26 +00:00
Reid Kleckner	101102711d	Support segmented stacks on Win64 Identical to Win32 method except the GS segment register is used for TLS instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14. llvm-svn: 205342	2014-04-01 18:34:21 +00:00
Yaron Keren	136fe7db46	isTargetWindows() renamed to isTargetKnownWindowsMSVC() to reflect its current functionality. Based on Takumi NAKAMURA suggestion. llvm-svn: 205338	2014-04-01 18:15:34 +00:00
Christian Pirker	dc9ff75554	ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE llvm-svn: 205317	2014-04-01 15:19:30 +00:00
Tim Northover	0feb91ef15	ARM: teach LLVM that Cortex-A7 is very similar to A8. llvm-svn: 205314	2014-04-01 14:10:07 +00:00
Aaron Ballman	8bf5a548ea	Attempting to fix r205124, which had failed asserts when built with MSVC. Suggestion from Yaron Keren. llvm-svn: 205313	2014-04-01 13:56:35 +00:00
Tim Northover	1351030801	ARM: add cyclone CPU with ZeroCycleZeroing feature. The Cyclone CPU is similar to swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them. llvm-svn: 205309	2014-04-01 13:22:02 +00:00
Daniel Sanders	21bce30fdc	[mips] Renamed ParseAnyRegisterWithoutDollar to MatchAnyRegisterWithoutDollar This is for consistency with other functions. The Parse* functions consume tokens and the Match* functions don't. No functional change. llvm-svn: 205305	2014-04-01 12:35:23 +00:00
Aaron Ballman	0947bb20d8	Fixing an MSVC warning about widening the result of a 32-bit shift implicitly. No functional change intended. llvm-svn: 205304	2014-04-01 12:24:25 +00:00
Tim Northover	4f1dd58e2e	ARM64: add intrinsic for pmull (p64 x p64 = p128) operations. llvm-svn: 205302	2014-04-01 12:22:37 +00:00
Aaron Ballman	d1726ee8fa	Fixing warnings in the MSVC build. No functional changes intended. llvm-svn: 205301	2014-04-01 12:22:20 +00:00
Daniel Sanders	ffd8436d6c	[mips] Extend ParseJumpTarget to support the full symbol expression syntax. Summary: This should fix the issues the D3222 caused in lld. Testcase is based on the one that failed in the buildbot. Depends on D3233 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3234 llvm-svn: 205298	2014-04-01 10:41:48 +00:00
Daniel Sanders	315386c083	[mips] Use AsmLexer::peekTok() to resolve the conflict between $reg and $sym Summary: Parsing registers no longer consume the $ token before it's confirmed whether it really has a register or not, therefore it's no longer impossible to match symbols if registers were tried first. Depends on D3232 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3233 llvm-svn: 205297	2014-04-01 10:40:14 +00:00
Daniel Sanders	0993457891	[mips] Hoist Parser.Lex() calls out of MatchAnyRegisterNameWithoutDollar() Summary: No functional change Depends on D3222 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3232 llvm-svn: 205295	2014-04-01 10:37:46 +00:00
Tim Northover	ff179ba3d3	ARM64: add patterns for more lane-wise ld1/st1 operations. llvm-svn: 205294	2014-04-01 10:37:09 +00:00
Tim Northover	d8d613b979	ARM64: fix bug in ld3r (1d) SelectionDAG. llvm-svn: 205293	2014-04-01 10:37:03 +00:00
Daniel Sanders	b50ccf8e26	[mips] Rewrite MipsAsmParser and MipsOperand. Summary: Highlights: - Registers are resolved much later (by the render method). Prior to that point, GPR32's/GPR64's are GPR's regardless of register size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register size or FR mode. Numeric registers can be anything. - All registers are parsed the same way everywhere (even when handling symbol aliasing) - One consequence is that all registers can be specified numerically almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing but that can be easily resolved. - Removes the need for the hasConsumedDollar hack - Parenthesis and Bracket suffixes are handled generically - Micromips instructions are parsed directly instead of going through the standard encodings first. - rdhwr accepts all 32 registers, and the following instructions that previously xfailed now work: ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d, c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1 - Diagnostics involving registers point at the correct character (the $) - There's only one kind of immediate in MipsOperand. LSA immediates are handled by the predicate and renderer. Lowlights: - Hardcoded '$zero' in the div patterns is handled with a hack. MipsOperand::isReg() will return true for a k_RegisterIndex token with Index == 0 and getReg() will return ZERO for this case. Note that it doesn't return ZERO_64 on isGP64() targets. - I haven't cleaned up all of the now-unused functions. Some more of the generic parser could be removed too (integers and relocs for example). - insve.df needed a custom decoder to handle the implicit fourth operand that was needed to make it parse correctly. The difficulty was that the matcher expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3222 llvm-svn: 205292	2014-04-01 10:35:28 +00:00
Alexey Volkov	1328b28dc6	[x86] Do not convert to cmp32 for Atom arch by Sergey Okunev Differential Revision: http://llvm-reviews.chandlerc.com/D2824 llvm-svn: 205288	2014-04-01 08:13:07 +00:00
Adam Nemet	10c4ce2584	[X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well Pretty obvious follow-on to r205159 to also handle conversion from double besides float. Fixes <rdar://problem/16373208> llvm-svn: 205253	2014-03-31 21:54:48 +00:00
Matt Arsenault	d6c4326786	R600/SI: Remove leftover pattern splitting 64-bit ors. It's now matched to the scalar 64-bit or and split later if necessary. llvm-svn: 205252	2014-03-31 21:46:46 +00:00
Manman Ren	63efd8e7e6	Register allocator: set CSRFirstUseCost to 5 for ARM64. A value of 5 means if we have a split or spill option that has a really low cost (1 << 14 is the entry frequency), we will choose to spill or split the really cold path before using a callee-saved register. This gives us the performance benefit on SPECInt2k and is also conservative. rdar://16162005 llvm-svn: 205248	2014-03-31 21:06:36 +00:00
Matt Arsenault	f751d6272d	Change shouldSplitVectorElementType to better match the description. Pass the entire vector type, and not just the element. llvm-svn: 205247	2014-03-31 20:54:58 +00:00
Matt Arsenault	d7bdcc46a6	R600/SI: Implement shouldConvertConstantLoadToIntImm llvm-svn: 205244	2014-03-31 19:54:27 +00:00
Matt Arsenault	378bf9c68b	R600: Compute masked bits for min and max llvm-svn: 205242	2014-03-31 19:35:33 +00:00
Rafael Espindola	ee1c342ef9	Don't relocate with sections if there might be a paired relocation. llvm-svn: 205240	2014-03-31 19:00:23 +00:00
Daniel Sanders	e34a120285	Revert: '[mips] Rewrite MipsAsmParser and MipsOperand.' due to buildbot errors in lld tests. It's currently unable to parse 'sym + imm' without surrounding parentheses. llvm-svn: 205237	2014-03-31 18:51:43 +00:00
Matt Arsenault	4c53717787	R600: Add BFE, BFI, and BFM intrinsics to help with writing tests. llvm-svn: 205236	2014-03-31 18:21:18 +00:00
Matt Arsenault	b34583661b	R600: Add target nodes for BFM and BFI llvm-svn: 205235	2014-03-31 18:21:13 +00:00
Saleem Abdulrasool	2070088bef	ARM: fix typo llvm-svn: 205233	2014-03-31 18:09:10 +00:00
Hal Finkel	b4240ca0f4	[PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles If we have two unique values for a v2i64 build vector, this will always result in two vector loads if we expand using shuffles. Only one is necessary. llvm-svn: 205231	2014-03-31 17:48:16 +00:00
Daniel Sanders	0c648ba5be	[mips] Rewrite MipsAsmParser and MipsOperand. Summary: Highlights: - Registers are resolved much later (by the render method). Prior to that point, GPR32's/GPR64's are GPR's regardless of register size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register size or FR mode. Numeric registers can be anything. - All registers are parsed the same way everywhere (even when handling symbol aliasing) - One consequence is that all registers can be specified numerically almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing but that can be easily resolved. - Removes the need for the hasConsumedDollar hack - Parenthesis and Bracket suffixes are handled generically - Micromips instructions are parsed directly instead of going through the standard encodings first. - rdhwr accepts all 32 registers, and the following instructions that previously xfailed now work: ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d, c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1 - Diagnostics involving registers point at the correct character (the $) - There's only one kind of immediate in MipsOperand. LSA immediates are handled by the predicate and renderer. Lowlights: - Hardcoded '$zero' in the div patterns is handled with a hack. MipsOperand::isReg() will return true for a k_RegisterIndex token with Index == 0 and getReg() will return ZERO for this case. Note that it doesn't return ZERO_64 on isGP64() targets. - I haven't cleaned up all of the now-unused functions. Some more of the generic parser could be removed too (integers and relocs for example). - insve.df needed a custom decoder to handle the implicit fourth operand that was needed to make it parse correctly. The difficulty was that the matcher expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3222 llvm-svn: 205229	2014-03-31 17:43:46 +00:00
Hal Finkel	4c8f634f23	[PowerPC] Correct P7 dispatch unit allocation for vector instructions llvm-svn: 205222	2014-03-31 17:02:10 +00:00
Eli Bendersky	6a0ccfb585	PR19099 - revert r203483 Now that r205212 was committed, r203483 is no longer necessary; it was a temporary workaround that only handled a small number of the problematic cases. llvm-svn: 205216	2014-03-31 16:11:57 +00:00
Christian Pirker	6d301b8ff2	ARM: change parameter names of the ELFARMAsmBackend constructor I removed the underscore at the beginning of the parameter name, because of a comment from Tim. llvm-svn: 205215	2014-03-31 16:06:39 +00:00
Robert Khasanov	ed0b2e9733	Test commit. llvm-svn: 205214	2014-03-31 16:01:38 +00:00
Daniel Sanders	d69adeb8e7	[mips] Fix use of uninitialized value reported by the sanitizer-x86_64-linux-bootstrap buildbot llvm-svn: 205213	2014-03-31 15:58:58 +00:00
Eli Bendersky	264cd4672d	Fix for PR19099 - NVPTX produces invalid symbol names. This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. llvm-svn: 205212	2014-03-31 15:56:26 +00:00
Tim Northover	5081cd0f81	ARM64: add extra patterns for scalar shifts llvm-svn: 205209	2014-03-31 15:46:46 +00:00
Tim Northover	e7834c3bbc	ARM64: add extra scalar neg pattern & tests. llvm-svn: 205208	2014-03-31 15:46:42 +00:00
Tim Northover	4468670345	ARM64: add patterns for scalar sqdmlal & sqdmlsl. llvm-svn: 205207	2014-03-31 15:46:38 +00:00
Tim Northover	5731fc75af	ARM64: add more patterns for commuted fmsub operations. llvm-svn: 205206	2014-03-31 15:46:34 +00:00
Tim Northover	290e0698d4	ARM64: shuffle patterns around for fmin/fmax & add tests. llvm-svn: 205205	2014-03-31 15:46:30 +00:00
Tim Northover	903814ccd6	ARM64: add more scalar patterns for usqadd & suqadd. llvm-svn: 205204	2014-03-31 15:46:26 +00:00
Tim Northover	4c9d2c7e3f	ARM64: add more scalar patterns for reciprocal ops. llvm-svn: 205203	2014-03-31 15:46:22 +00:00
Tim Northover	f48103618e	ARM64: add i64 scalar pattern for @llvm.arm64.abs This will be used by the Clang front-end code for vabsd_s64. llvm-svn: 205202	2014-03-31 15:46:17 +00:00
Daniel Sanders	a567da5a36	[mips] Implement missing relocations in the integrated assembler. %got_hi, %got_lo, %call_hi, %call_lo, %higher, and %highest are now recognised by MipsAsmParser::getVariantKind(). To prevent future issues with missing entries in this StringSwitch, I've added an assertion to the default case. llvm-svn: 205200	2014-03-31 15:15:02 +00:00
Daniel Sanders	cefddb2ca6	Revert r205194 - [mips] Removed R_MIPS_GOT. It's identical to R_MIPS_GOT16. There's a couple additional bits I missed. llvm-svn: 205195	2014-03-31 14:34:36 +00:00
Daniel Sanders	a104300dbe	[mips] Removed R_MIPS_GOT. It's identical to R_MIPS_GOT16. llvm-svn: 205194	2014-03-31 14:30:05 +00:00
Rafael Espindola	2378d4c0ce	Capitalize the D in parseDirectiveGpDWord. DWord seems to be the canonical way to camel case dword in llvm. Thanks to Daniel Sanders for noticing. llvm-svn: 205191	2014-03-31 14:15:07 +00:00
Tom Stellard	30f59417cf	R600/SI: Implement SIInstrInfo::isTriviallyRematerializable() llvm-svn: 205188	2014-03-31 14:01:56 +00:00
Tom Stellard	7ea3d6d420	R600/SI: Lower i64 SELECT by bitcasting to a vector type This allows us to replace ISD::EXTRACT_ELEMENT, which is lowered using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op. llvm-svn: 205187	2014-03-31 14:01:55 +00:00
Tom Stellard	7277b008ee	R600/SI: Return the correct index for VGPRs in getHWRegIndex() The register index is stored in the low 8-bits of the encoding. llvm-svn: 205186	2014-03-31 14:01:52 +00:00
Zoran Jovanovic	9b05a31f76	Fixed issue with microMIPS JAL instruction. Differential Revision: http://llvm-reviews.chandlerc.com/D3200 llvm-svn: 205185	2014-03-31 14:00:10 +00:00
Tim Northover	241856e5f8	ARM64: fix a couple of signed/unsigned comparison warnings. llvm-svn: 205174	2014-03-31 10:21:36 +00:00
Alexey Samsonov	23aaf2a182	Try to fix MSan bootstrap bot: make ARM64Disassembler::getInstruction() always initialize Size argument. llvm-svn: 205171	2014-03-31 07:59:33 +00:00
Yaron Keren	070a752d7e	Correct OS conditionals following r204977 and r204978. Previously, MinGW OS was Triple::MinGW and Cygwin was Triple::Cygwin and now it is Triple::Win32 with Environment being GNU or Cygwin. So, TheTriple.getOS() == Triple::Win32 is replaced by TheTriple.isWindowsMSVCEnvironment() and (TheTriple.getOS() == Triple::MinGW32 || TheTriple.getOS() == Triple::Cygwin) is replaced by TheTriple.isOSCygMing() llvm-svn: 205170	2014-03-31 07:59:14 +00:00
Craig Topper	ec82847a64	[C++11] Mark more classes in the X86 target as 'final'. llvm-svn: 205166	2014-03-31 06:53:13 +00:00
Craig Topper	26eec09d84	Mark a couple of the X86 target classes as final. Allows the compiler to de-virtualize some internal calls. llvm-svn: 205165	2014-03-31 06:22:15 +00:00
NAKAMURA Takumi	82ec13e3d5	ARM64CollectLOH.cpp: Tweak \param. [-Wdocumentation] llvm-svn: 205162	2014-03-31 01:10:26 +00:00
Chandler Carruth	d28515af31	[ARM64] Fix materialization of an fp128 zero immediate. There currently is not a pattern to lower this with clever instructions that zero the register, so restrict the zero immediate legality special case to f64 and f32 (the only two sizes which fmov seems to directly support). Fixes backend errors when building code such as libxml. llvm-svn: 205161	2014-03-31 00:02:10 +00:00
Adam Nemet	6dafe97271	[X86] Adjust cost of FP_TO_UINT v8f32->v8i32 There is no direct AVX instruction to convert to unsigned. I have some ideas how we may be able to do this with three vector instructions but the current backend just bails on this to get it scalarized. See the comment why we need to adjust the cost returned by BasicTTI. The test is a bit roundabout (and checks assembly rather than bit code) because I'd like it to work even if at some point we could vectorize this conversion. Fixes <rdar://problem/16371920> llvm-svn: 205159	2014-03-30 18:07:13 +00:00
Stepan Dyatkovskiy	8baf17fc5f	PR18929: According to ARM assembler language hash symbol is optional before immediates. For example, see here for more details: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473j/dom1359731154529.html llvm-svn: 205157	2014-03-30 17:09:54 +00:00
Hal Finkel	5c0d1454d6	[PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node (and similarly for v2i16 and v2i8). Even though there are no sign-extension (or algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by converting to and from v2i64. The small trick necessary here is to shift the i32 elements into the right lanes before the i32 -> f64 step. This is because of the big Endian nature of the system, we need the i32 portion in the high word of the i64 elements. For v2i16 and v2i8 we can do the same, but we first use the default Altivec shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and then apply the above procedure. llvm-svn: 205146	2014-03-30 13:22:59 +00:00
Chandler Carruth	81f7061065	[ARM64] Fix a heap-use-after-free spotted by ASan. StringRef::lower() returns a std::string. Better yet, we can now stop thinking about what it returns and write 'auto'. It does the right thing. =] llvm-svn: 205135	2014-03-30 09:08:07 +00:00
Tim Northover	bf679cec67	ARM64: uncopy/paste helper function It was doing functional but highly suspect operations on bools due to the more limited shifting operands supported by memory instructions. Should fix some MSVC warnings. llvm-svn: 205134	2014-03-30 08:30:28 +00:00
Tim Northover	6b3258f087	ARM64: remove unused variables llvm-svn: 205133	2014-03-30 07:35:48 +00:00
Tim Northover	3e52557212	ARM64: override all the things. Actually, mostly only those in the top-level directory that already had a "virtual" attached. But it's the thought that counts and it's been a long day. llvm-svn: 205131	2014-03-30 07:25:18 +00:00
NAKAMURA Takumi	09717bd1c4	X86Subtarget.h: isTargetWindows() should tell whether he is targeting msvc. FYI, !isWindowsGNUEnvironment() is insufficient. It missed cygwin. FIXME: The name "isTargetWindows" should be fixed. llvm-svn: 205124	2014-03-30 04:35:00 +00:00
Dmitri Gribenko	1fd72104ad	Fix a few -Wdocumentation warnings llvm-svn: 205116	2014-03-29 19:40:32 +00:00
Benjamin Kramer	3ad660a515	Detemplatize LOHDirective. The ARM64 backend uses it only as a container to keep an MCLOHType and Arguments around so give it its own little copy. The other functionality isn't used and we had a crazy method specialization hack in place to keep it working. Unfortunately that was incompatible with MSVC. Also range-ify a couple of loops while at it. llvm-svn: 205114	2014-03-29 19:21:20 +00:00
Benjamin Kramer	61e595be4d	ARM64: Remove unused helper function, make others static. llvm-svn: 205112	2014-03-29 18:00:49 +00:00
Hal Finkel	777c9dd90a	[PowerPC] Handle v2i64 comparisons v2i64 is a legal type under VSX, however we don't have native vector comparisons. We can handle eq/ne by casting it to an Altivec type, but everything else must be expanded. llvm-svn: 205106	2014-03-29 16:04:40 +00:00
Tim Northover	adbd34e045	ARM64: format register strings without creating a local Twine. It was causing horrible failures on some build-bots. llvm-svn: 205105	2014-03-29 15:35:57 +00:00
Hal Finkel	e8fba98735	[PowerPC] VSX instruction latency corrections The vector divide and sqrt instructions have high latencies, and the scalar comparisons are like all of the others. On the P7, permutations take an extra cycle over purely-simple vector ops. llvm-svn: 205096	2014-03-29 13:20:31 +00:00
Stepan Dyatkovskiy	df657cc1d5	Recommitted fix for PR18931, with extended tests set. Issue subject: Crash using integrated assembler with immediate arithmetic Fix description: Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated at the asm parsing stage, since it is impossible to resolve labels at this stage. At the end of the stage we still have an expression (MCExpr). Then, when we want to encode it, we expect it to be an immediate, but it is still an expression. The patch introduces a Fixup (MCFixup instance) that is processed after the main encoding stage. llvm-svn: 205094	2014-03-29 13:12:40 +00:00
Tim Northover	2125374ecf	ARM64: use 64-bit constant even on 32-bit machines Another existing bot failure so no tests. llvm-svn: 205093	2014-03-29 11:51:49 +00:00
Tim Northover	2011df293d	ARM64: change format specifier to work on 32-bit targets Existing tests were failing. llvm-svn: 205092	2014-03-29 11:47:07 +00:00
Chandler Carruth	7b7a67c5c8	[ARM64] Fix 'assert("...")' to be 'assert(0 && "...")'. Otherwise, it is no assert at all. ;] Some of these should probably be switched to llvm_unreachable, but I didn't want to perturb the behavior in this patch. Found by -Wstring-conversion, which I'll try to turn on in CMake builds at least as it is finding useful things. llvm-svn: 205091	2014-03-29 11:07:40 +00:00
Tim Northover	00ed9964c6	ARM64: initial backend import This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. llvm-svn: 205090	2014-03-29 10:18:08 +00:00
Rafael Espindola	5904e12bfa	Completely rewrite ELFObjectWriter::RecordRelocation. I started trying to fix a small issue, but this code has seen a small fix too many. The old code was fairly convoluted. Some of the issues it had: * It failed to check if a symbol difference was in the same section when converting a relocation to pcrel. * It failed to check if the relocation was already pcrel. * The pcrel value computation was wrong in some cases (relocation-pc.s) * It was missing quite a few cases where it should not convert symbol relocations to section relocations, leaving the backends to patch it up. * It would not propagate the fact that it had changed a relocation to pcrel, requiring a quite nasty work around in ARM. * It was missing comments. llvm-svn: 205076	2014-03-29 06:26:49 +00:00
Hal Finkel	19be506a5e	[PowerPC] Add subregister classes for f64 VSX values We had stored both f64 values and v2f64, etc. values in the VSX registers. This worked, but was suboptimal because we would always spill 16-byte values even though we almost always had scalar 8-byte values. This resulted in an increase in stack-size use, extra memory bandwidth, etc. To fix this, I've added 64-bit subregisters of the Altivec registers, and combined those with the existing scalar floating-point registers to form a class of VSX scalar floating-point registers. The ABI code has also been enhanced to use this register class and some other necessary improvements have been made. llvm-svn: 205075	2014-03-29 05:29:01 +00:00
Akira Hatanaka	9afbb8c2b1	[x86] Fix printing of register operands with q modifier. Emit 32-bit register names instead of 64-bit register names if the target does not have 64-bit general purpose registers. <rdar://problem/14653996> llvm-svn: 205067	2014-03-28 23:28:07 +00:00
David Majnemer	02f2188bb9	X86: Disable IsLegalToCallImmediateAddr for Win32 WinCOFF cannot form PC relative relocations to support absolute MCValues. We should reenable this once WinCOFF supports emission of IMAGE_REL_I386_REL32 relocations. This fixes PR19272. llvm-svn: 205058	2014-03-28 21:40:47 +00:00
Hal Finkel	2583b06310	[PowerPC] Fix VSX permutation isel Not only did I invert the indices when I wrote the code, but I also did the same thing when I wrote the regression test. Oops. llvm-svn: 205046	2014-03-28 20:24:55 +00:00
Hal Finkel	7811c6188e	[PowerPC] v2[fi]64 need to be explicitly passed in VSX registers v2[fi]64 values need to be explicitly passed in VSX registers. This is because the code in TRI that finds the minimal register class given a register and a value type will assert if given an Altivec register and a non-Altivec type. llvm-svn: 205041	2014-03-28 19:58:11 +00:00
Rafael Espindola	b59fb7347a	Parse .gpdword and convert another llc -filetype=obj test. llvm-svn: 205028	2014-03-28 18:50:26 +00:00
Rafael Espindola	d7610a5d67	Add const to a method I missed in the previous commit. llvm-svn: 205014	2014-03-28 16:14:12 +00:00
Rafael Espindola	3e3de5e353	Add const. llvm-svn: 205013	2014-03-28 16:06:09 +00:00
Erik Verbruggen	5e1bac3a38	Revert "InstCombine: merge constants in both operands of icmp." This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010	2014-03-28 14:50:57 +00:00
Christian Pirker	2a11160956	Add ARM big endian Target (armeb, thumbeb) Reviewed at http://llvm-reviews.chandlerc.com/D3095 llvm-svn: 205007	2014-03-28 14:35:30 +00:00
Tim Northover	24f46618b2	R600: avoid calling std::next on an iterator that might be end() This was causing my llc to go into an infinite loop on CodeGen/R600/address-space.ll (just triggered recently by some allocator changes). llvm-svn: 205005	2014-03-28 13:52:56 +00:00
Hal Finkel	c6fc9b8960	[PowerPC] Use a small cleanup pass to remove VSX self copies As explained in r204976, because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this: %VSL2<def> = COPY %F2, %VSL2<imp-use,kill> (where %F2 is really a sub-register of %VSL2, and so this copy is a nop) This adds a small cleanup pass to remove these prior to post-RA scheduling. llvm-svn: 204980	2014-03-27 23:12:31 +00:00
Saleem Abdulrasool	edbdd2e5df	Canonicalise Windows target triple spellings Construct a uniform Windows target triple nomenclature which is congruent to the Linux counterpart. The old triples are normalised to the new canonical form. This cleans up the long-standing issue of odd naming for various Windows environments. There are four different environments on Windows: MSVC: The MS ABI, MSVCRT environment as defined by Microsoft GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries Itanium: The MSVCRT environment + libc++ built with Itanium ABI Cygnus: The Cygwin environment which uses custom libraries for everything The following spellings are now written as: i686-pc-win32 => i686-pc-windows-msvc i686-pc-mingw32 => i686-pc-windows-gnu i686-pc-cygwin => i686-pc-windows-cygnus This should be sufficiently flexible to allow us to target other windows environments in the future as necessary. llvm-svn: 204977	2014-03-27 22:50:05 +00:00
Hal Finkel	9dcb3583d5	[PowerPC] Don't remove self VSX copies in PPCInstrInfo::copyPhysReg Because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this: %VSL2<def> = COPY %F2, %VSL2<imp-use,kill> (where %F2 is really a sub-register of %VSL2, and so this copy is a nop) The problem is that ExpandPostRAPseudos always assumes that some instruction has been inserted, and adds implicit defs to it. This is a problem if no copy was inserted because it can cause subtle problems during post-RA scheduling. These self copies will have to be removed some other way. llvm-svn: 204976	2014-03-27 22:46:28 +00:00
Quentin Colombet	85b904d875	[X86][Vector Cost Model] Add a comment to explain the workaround in my previous commit (r204884). <rdar://problem/16381225> llvm-svn: 204972	2014-03-27 22:27:41 +00:00
Hal Finkel	82569b6366	[PowerPC] Fix v2f64 vector extract and related patterns First, v2f64 vector extract had not been declared legal (and so the existing patterns were not being used). Second, the patterns for that, and for scalar_to_vector, should really be a regclass copy, not a subregister operation, because the VSX registers directly hold both the vector and scalar data. llvm-svn: 204971	2014-03-27 22:22:48 +00:00
Hal Finkel	ad801b7459	[PowerPC] Expand v2i64 shifts These operations need to be expanded during legalization so that isel does not crash. In theory, we might be able to custom lower some of these. That, however, would need to be follow-up work. llvm-svn: 204963	2014-03-27 21:26:33 +00:00
Rafael Espindola	c03f44ca8a	Remove another unused argument. llvm-svn: 204961	2014-03-27 20:49:35 +00:00
Rafael Espindola	9ab380122a	Remove unused argument. llvm-svn: 204956	2014-03-27 20:41:17 +00:00
Matt Arsenault	b517c8128e	R600: Implement isZExtFree. This allows 64-bit operations that are truncated to be reduced to 32-bit ones. llvm-svn: 204946	2014-03-27 17:23:31 +00:00
Matt Arsenault	d125d74a73	R600/SI: Fix unreachable with a sext_in_reg to an illegal type. llvm-svn: 204945	2014-03-27 17:23:24 +00:00
Daniel Sanders	5e94e68f7b	[mips] Some uses of isMips64()/hasMips64() are really tests for 64-bit GPR's Summary: No functional change since these predicates are (currently) synonymous. Extracted from a patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3202 llvm-svn: 204943	2014-03-27 16:42:17 +00:00
Logan Chien	30eb9f47c6	[AArch64] Lower SHL_PARTS, SRA_PARTS and SRL_PARTS Lower SHL_PARTS, SRA_PARTS and SRL_PARTS to perform 128-bit integer shift Patch by GuanHong Liu. llvm-svn: 204940	2014-03-27 16:28:09 +00:00
Rafael Espindola	24a669d225	Prevent alias from pointing to weak aliases. This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934	2014-03-27 15:26:56 +00:00
Daniel Sanders	64cf5a4eb2	[mips] Attempting to use register $32 should be an error instead of an assertion. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3201 llvm-svn: 204932	2014-03-27 15:00:44 +00:00
Aaron Ballman	be648a3c16	The forward declare should be a struct instead of a class (to be consistent with the definition, as well as to silence an MSVC C4099 warning). llvm-svn: 204928	2014-03-27 14:10:00 +00:00
Daniel Sanders	5bce5f6245	[mips] Add support for .cpsetup Summary: Patch by Robert N. M. Watson His work was sponsored by: DARPA, AFRL Small corrections by myself. CC: theraven, matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3199 llvm-svn: 204924	2014-03-27 13:52:53 +00:00
Daniel Sanders	bd0e39079a	[mips] The decision between GOT_DISP and GOT16 for global addresses depends on ABI rather than MIPS64 Summary: No functional change (for supported use cases) Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3191 llvm-svn: 204922	2014-03-27 12:49:34 +00:00
Zoran Jovanovic	ada38ef61b	Split the file MipsAsmBackend.cpp into MipsAsmBackend.cpp and MipsAsmBackend.h. Differential Revision: http://llvm-reviews.chandlerc.com/D3134 llvm-svn: 204921	2014-03-27 12:38:40 +00:00
Matheus Almeida	a805e85cb2	[mips] Remove unused private field. llvm-svn: 204919	2014-03-27 12:02:48 +00:00
Matheus Almeida	61218ba798	[mips] NaCl should now use the custom MipsELFStreamer (recently added) instead of MCELFStreamer. This is so that changes to MipsELFStreamer will automatically propagate through its subclasses. No functional changes (MipsELFStreamer has the same functionality as MCELFStreamer at the moment). Differential Revision: http://llvm-reviews.chandlerc.com/D3130 llvm-svn: 204918	2014-03-27 11:52:20 +00:00
Matheus Almeida	dac77fb389	[mips] Implement custom MCELFStreamer. This allows us to insert some hooks before emitting data into an actual object file. For example, we can capture the register usage for a translation unit by overriding the EmitInstruction method. The register usage information is needed to generate .reginfo and .Mips.options ELF sections. No functional changes. Differential Revision: http://llvm-reviews.chandlerc.com/D3129 llvm-svn: 204917	2014-03-27 11:39:03 +00:00
Erik Verbruggen	59a1219846	InstCombine: merge constants in both operands of icmp. Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912	2014-03-27 11:16:05 +00:00
Daniel Sanders	d897b564ca	[mips] Stop caching the result of hasMips64(), isABI_O32(), isABI_N32(), and isABI_N64() from MipsSubTarget in MipsTargetLowering Summary: The short name is quite convenient so provide an accessor for them instead. No functional change Depends on D3177 Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3178 llvm-svn: 204911	2014-03-27 10:46:12 +00:00
Elena Demikhovsky	bb2f6b72d3	AVX-512: Implemented masking for integer arithmetic & logic instructions. By Robert Khasanov rob.khasanov@gmail.com llvm-svn: 204906	2014-03-27 09:45:08 +00:00
Stepan Dyatkovskiy	e8747e30ef	Rejected r204899 and r204900 due to remaining test failures on cmake-llvm-x86_64-linux buildbot. llvm-svn: 204901	2014-03-27 08:38:18 +00:00
Stepan Dyatkovskiy	3530003008	Fix for pr18931: Crash using integrated assembler with immediate arithmetic Fix description: Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated at the asm parsing stage, since it is impossible to resolve labels at this stage. At the end of the stage we still have an expression (MCExpr). Then, when we want to encode it, we expect it to be an immediate, but it is still an expression. The patch introduces a Fixup (MCFixup instance) that is processed after the main encoding stage. llvm-svn: 204899	2014-03-27 07:49:39 +00:00
Jiangning Liu	1d3f2c7c82	ARM: raise error message when complex SO expressions can't really be solved as a constant at compilation time. llvm-svn: 204898	2014-03-27 07:42:58 +00:00
Quentin Colombet	3914bf516b	[X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64 and v4i64->v4f64. The new costs match what we did for SSE2 and reflect the reality of our codegen. <rdar://problem/16381225> llvm-svn: 204884	2014-03-27 00:52:16 +00:00
Jim Grosbach	72fbde84b8	X86: Correct vectorization cost model for v8f32->v8i8. Fix the cost model to reflect the reality of our codegen. rdar://16370633 llvm-svn: 204880	2014-03-27 00:04:11 +00:00
Hal Finkel	df3e34d944	[PowerPC] Generate VSX permutations for v2[fi]64 vectors llvm-svn: 204873	2014-03-26 22:58:37 +00:00
Kevin Enderby	5611398b6b	Fix a problem with the ARM assembler incorrectly matching a vector list parameter that is using all lanes "{d0[], d2[]}" but can match an instruction with a "{d0, d2}" parameter. I’m finishing up a fix for proper checking of the unsupported alignments on vld/vst instructions and ran into this. Thus I don’t have a test case at this time. And adding all code that will demonstrate the bug would obscure the very simple one line fix. So if you would indulge me on not having a test case at this time I’ll instead offer up a detailed explanation of what is going on in this commit message. This instruction: vld2.8 {d0[], d2[]}, [r4:64] is not legal as the alignment can only be 16 when the size is 8. Per this documentation: A8.8.325 VLD2 (single 2-element structure to all lanes) <align> The alignment. It can be one of: 16 2-byte alignment, available only if <size> is 8, encoded as a = 1. 32 4-byte alignment, available only if <size> is 16, encoded as a = 1. 64 8-byte alignment, available only if <size> is 32, encoded as a = 1. omitted Standard alignment, see Unaligned data access on page A3-108. So when code is added to the llvm integrated assembler to not match that instruction because of the alignment it then goes on to try to match other instructions and comes across this: vld2.8 {d0, d2}, [r4:64] and matches it. This is because the method ARMOperand::isVecListDPairSpaced() is missing the check of the Kind. In this case the Kind is k_VectorListAllLanes. While the name of the method may suggest that this is OK it really should check that the Kind is k_VectorList. 
As the method ARMOperand::isDoubleSpacedVectorAllLanes() is what was used to match {d0[], d2[]} and correctly checks the Kind: bool isDoubleSpacedVectorAllLanes() const { return Kind == k_VectorListAllLanes && VectorList.isDoubleSpaced; } where the original ARMOperand::isVecListDPairSpaced() does not check the Kind: bool isVecListDPairSpaced() const { if (isSingleSpacedVectorList()) return false; return (ARMMCRegisterClasses[ARM::DPairSpcRegClassID] .contains(VectorList.RegNum)); } Jim Grosbach has reviewed the change and said: Yep, that sounds right. … And by "right" I mean, "wow, that's a nasty latent bug I'm really, really glad to see fixed." :) rdar://16436683 llvm-svn: 204861	2014-03-26 21:54:11 +00:00
Hal Finkel	6e28e6aaaf	[PowerPC] VSX loads and stores support unaligned access I've not yet updated PPCTTI because I'm not sure what the actual relative cost is compared to the aligned uses. llvm-svn: 204848	2014-03-26 19:39:09 +00:00
Kevin Enderby	8108f38437	Fix the ARM VST4 (single 4-element structure from one lane) size 16 double-spaced registers instruction printing. This: vld4.16 {d17[1], d19[1], d21[1], d23[1]}, [r7]! was being printed as: vld4.16 {d17[1], d18[1], d19[1], d20[1]}, [r7]! rdar://16435096 llvm-svn: 204847	2014-03-26 19:35:40 +00:00
Hal Finkel	7279f4b00d	[PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions llvm-svn: 204843	2014-03-26 19:13:54 +00:00
Matt Arsenault	90b733a3cf	R600: Add a testcase for sext_in_reg I missed. This sext_inreg i32 in i64 case was already handled, but not enabled. llvm-svn: 204840	2014-03-26 18:31:06 +00:00
Hal Finkel	ea76a44584	[PowerPC] Remove some dead VSX v4f32 store patterns These patterns are dead (because v4f32 stores are currently promoted to v4i32 and stored using Altivec instructions), and also are likely not correct (because they'd store the vector elements in the opposite order from that assumed by the rest of the Altivec code). llvm-svn: 204839	2014-03-26 18:26:36 +00:00
Hal Finkel	9281c9a38b	[PowerPC] Use VSX vector load/stores for v2[fi]64 These instructions have access to the complete VSX register file. In addition, they "swap" the order of the elements so that element 0 (the scalar part) comes first in memory and element 1 follows at a higher address. llvm-svn: 204838	2014-03-26 18:26:30 +00:00
Hans Wennborg	d683a22dd2	Revert "X86 memcpy lowering: use "rep movs" even when esi is used as base pointer" (r204174) > For functions where esi is used as base pointer, we would previously fall back > from lowering memcpy with "rep movs" because that clobbers esi. > > With this patch, we just store esi in another physical register, and restore > it afterwards. This adds a little bit of register pressure, but the more > efficient memcpy should be worth it. > > Differential Revision: http://llvm-reviews.chandlerc.com/D2968 This didn't work. I was ending up with code like this: lea edi,[esi+38h] mov ecx,0Fh mov edx,esi mov esi,ebx rep movs dword ptr es:[edi],dword ptr [esi] lea ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx. add ebx,3Ch mov esi,edx I guess if we want to do this we need stronger glue or something, or doing the expansion much later. llvm-svn: 204829	2014-03-26 16:30:54 +00:00
Hal Finkel	a6c8b51212	[PowerPC] Add v2i64 as a legal VSX type v2i64 needs to be a legal VSX type because it is the SetCC result type from v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations. This fixes the lowering for v2f64 VSELECT. llvm-svn: 204828	2014-03-26 16:12:58 +00:00
Matheus Almeida	ea06727f03	[mips] Use TwoOperandAliasConstraint for ArithLogicR instructions. This enables TableGen to generate an additional two operand matcher for our ArithLogicR class of instructions (constituted by 3 register operands). E.g.: and $1, $2 <=> and $1, $1, $2 llvm-svn: 204826	2014-03-26 16:09:43 +00:00
Matheus Almeida	ab5633b70c	[mips] Add support to the '.dword' directive. The '.dword' directive accepts a list of expressions and emits them in 8-byte chunks in successive locations. llvm-svn: 204822	2014-03-26 15:44:18 +00:00
Matheus Almeida	3e2a702aa2	[mips] Rename function in MipsAsmParser. parseDirectiveWord is a generic function that parses an expression, which means there's no need for it to have such a specific name. Renaming it to parseDataDirective so that it can also be used to handle .dword directives[1]. [1]To be added in a follow up commit. No functional changes. llvm-svn: 204818	2014-03-26 15:24:36 +00:00
Matheus Almeida	3b9c63d29b	[mips] Add support to '.set mips64'. The '.set mips64' directive enables the feature Mips:FeatureMips64 from assembly. Note that it doesn't modify the ELF header as opposed to the use of -mips64 from the command-line. The reason for this is that we want to be as compatible as possible with existing assemblers like GAS. llvm-svn: 204817	2014-03-26 15:14:32 +00:00
Matheus Almeida	a2cd009c51	[mips] Add support to '.set mips64r2'. The '.set mips64r2' directive enables the feature Mips:FeatureMips64r2 from assembly. Note that it doesn't modify the ELF header as opposed to the use of -mips64r2 from the command-line. The reason for this is that we want to be as compatible as possible with existing assemblers like GAS. llvm-svn: 204815	2014-03-26 14:52:22 +00:00
Christian Pirker	3aa0e6a1f9	AArch64_BE function argument passing for ARM ABI llvm-svn: 204814	2014-03-26 14:51:22 +00:00
Tim Northover	1ff5f29fb5	ARM: add intrinsics for the v8 ldaex/stlex We've already got versions without the barriers, so this just adds IR-level support for generating the new v8 ones. rdar://problem/16227836 llvm-svn: 204813	2014-03-26 14:39:31 +00:00
Matheus Almeida	fe1e39dcba	[mips] Hoist common functionality into a new function. Given that we support multiple directives that enable a particular feature (e.g. '.set mips16'), it's best to hoist that code into a new function so that we don't repeat the same pattern w.r.t parsing and handling error cases. No functional changes. llvm-svn: 204811	2014-03-26 14:26:27 +00:00
Renato Golin	93010e687f	Change @llvm.clear_cache default to call rt-lib After some discussion on IRC, emitting a call to the library function seems like a better default, since it will move from a compiler internal error to a linker error, that the user can work around until LLVM is fixed. I'm also adding a note on the responsibility of the user to confirm that the cache was cleared on platforms where nothing is done. llvm-svn: 204806	2014-03-26 14:01:32 +00:00
Daniel Sanders	6dd7251599	[mips] The decision to use MO_GOT_PAGE and MO_GOT_OFST depends on the ABI being N32 or N64 not the arch being MIPS64 Summary: No functional change (in supported use cases) Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3177 llvm-svn: 204805	2014-03-26 13:59:42 +00:00
Cameron McInally	4532596b8f	Fix AVX512 Gather and Scatter execution domains. llvm-svn: 204804	2014-03-26 13:50:50 +00:00
Matheus Almeida	f79b281421	[mips] Add support for '.option pic2'. The directive '.option pic2' enables PIC from assembly source. At the moment none of the macros/directives check the PIC bit but that's going to be fixed relatively soon. For example, the expansion of macros like 'la' depend on the relocation model. llvm-svn: 204803	2014-03-26 13:40:29 +00:00
Renato Golin	c0a3c1d66b	Add @llvm.clear_cache builtin Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all in case the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. llvm-svn: 204802	2014-03-26 12:52:28 +00:00
Hal Finkel	732f0f73a7	[PowerPC] Lower VSELECT using xxsel when VSX is available With VSX there is a real vector select instruction, and so we should use it. Note that VSELECT will still scalarize for v2f64 because the corresponding SetCC result type (v2i64) is not currently a legal type. llvm-svn: 204801	2014-03-26 12:49:28 +00:00
Daniel Sanders	a4b0c74765	[mips] The register names depend on the ABI being N32/N64 rather than the arch being mips64 Summary: Added test cases for O32 and N32 on MIPS64. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3175 llvm-svn: 204796	2014-03-26 11:39:07 +00:00
Daniel Sanders	85f482b02f	[mips] $s8 is an alias for $fp in all ABI's, not just N32/N64. llvm-svn: 204793	2014-03-26 11:05:24 +00:00
Rafael Espindola	65481d7b97	Revert "Prevent alias from pointing to weak aliases." This reverts commit r204781. I will follow up with the msan folks to see what they were trying to do with aliases to weak aliases. llvm-svn: 204784	2014-03-26 06:14:40 +00:00
Hal Finkel	bd4de9d478	[PowerPC] Generate logical vector VSX instructions These instructions are essentially the same as their Altivec counterparts, but have access to the larger VSX register file. llvm-svn: 204782	2014-03-26 04:55:40 +00:00
Rafael Espindola	3b712a84a9	Prevent alias from pointing to weak aliases. Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781	2014-03-26 04:48:47 +00:00
Quentin Colombet	6f12ae0d5c	[X86] Add broadcast instructions to the table used by ExeDepsFix pass. Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> llvm-svn: 204770	2014-03-26 00:10:22 +00:00
Hal Finkel	174e590966	[PowerPC] Select between VSX A-type and M-type FMA instructions just before RA The VSX instruction set has two types of FMA instructions: A-type (where the addend is taken from the output register) and M-type (where one of the product operands is taken from the output register). This adds a small pass that runs just after MI scheduling (and, thus, just before register allocation) that mutates A-type instructions (that are created during isel) into M-type instructions when: 1. This will eliminate an otherwise-necessary copy of the addend 2. One of the product operands is killed by the instruction The "right" moment to make this decision is in between scheduling and register allocation, because only there do we know whether or not one of the product operands is killed by any particular instruction. Unfortunately, this also makes the implementation somewhat complicated, because the MIs are not in SSA form and we need to preserve the LiveIntervals analysis. As a simple example, if we have: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 ... %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19, %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19 ... We can eliminate the copy by changing from the A-type to the M-type instruction. This means: %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 is replaced by: %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9, %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9 and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 llvm-svn: 204768	2014-03-25 23:29:21 +00:00
Hal Finkel	6c32ff31d0	[PowerPC] Correct commutable indices for VSX FMA instructions Although the first two operands are the ones that can be swapped, the tied input operand is listed before them, so we need to adjust for that. I have a test case for this, but it goes along with an upcoming commit (so it will come soon). llvm-svn: 204748	2014-03-25 19:26:43 +00:00
Hal Finkel	25e0454f10	[PowerPC] Add a TableGen relation for A-type and M-type VSX FMA instructions TableGen will create a lookup table for the A-type FMA instructions providing their corresponding M-form opcodes. This will be used by upcoming commits. llvm-svn: 204746	2014-03-25 18:55:11 +00:00
Matt Arsenault	0c274feedf	R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp Remove handling of select_cc, since it makes no sense to be there. This now does nothing, but I'll be adding some handling of other target nodes soon. llvm-svn: 204743	2014-03-25 18:18:27 +00:00
Juergen Ributzka	631c4914b2	[X86TTI] Make constant base pointers for getElementPtr opaque. If getElementPtr uses a constant as base pointer, then make the constant opaque. This prevents constant folding it with the offset. The offset can usually be encoded in the load/store instruction itself and the base address doesn't have to be rematerialized several times. llvm-svn: 204739	2014-03-25 18:01:25 +00:00
Juergen Ributzka	5eef98cf7a	[Stackmaps][X86TTI] Fix think-o in getIntImmCost calculation. The cost for the first four stackmap operands was always TCC_Free. This is only true for the first two operands. All other operands are TCC_Free if they are within 64bit. llvm-svn: 204738	2014-03-25 18:01:23 +00:00
Adam Nemet	4beef4c90d	[X86] Generate VPSHUFB for in-place v16i16 shuffles This used to resort to splitting the 256-bit operation into two 128-bit shuffles and then recombining the results. Fixes <rdar://problem/16167303> llvm-svn: 204735	2014-03-25 17:47:06 +00:00
Adam Nemet	ac6d6383a3	[X86] Factor out new helper getPSHUFB I found three implementations of this. This splits it out into a new function and uses it from the three places. My plan is to add a fourth use when lowering a vector_shuffle:v16i16. Compared the assembly output of test/CodeGen/X86 before and after. The only change is due to how the first PSHUFB was generated in LowerVECTOR_SHUFFLEv8i16. If the shuffle mask specified undef (i.e. -1), the old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the control mask. Now we write 0x80. These are of course interchangeable since bit 7 decides if a constant zero is written in the result byte. The other instances of this code use 0x80 consistently. Related to <rdar://problem/16167303> llvm-svn: 204734	2014-03-25 17:47:03 +00:00
Daniel Sanders	71a89d92f6	[mips] '.set at=$0' should be equivalent to '.set noat' Differential Revision: http://llvm-reviews.chandlerc.com/D3171 llvm-svn: 204714	2014-03-25 13:01:06 +00:00
Cameron McInally	45dc489403	Fix AVX2 Gather execution domains. llvm-svn: 204713	2014-03-25 12:36:38 +00:00
Daniel Sanders	b1d7e53a26	[mips] Correct testcase for .set at=$reg and emit the new warnings for numeric registers too. Summary: Remove the XFAIL added in my previous commit and correct the test such that it correctly tests the expansion of the assembler temporary. Also added a test to check that $at is always $1 when written by the user. Corrected the new assembler temporary warnings so that they are emitted for numeric registers too. Differential Revision: http://llvm-reviews.chandlerc.com/D3169 llvm-svn: 204711	2014-03-25 11:16:03 +00:00
Daniel Sanders	e231ae9e3a	[mips] Fix assembler temporary expansion and add associated warnings about the use of $at. Summary: The assembler temporary is normally $at ($1) but can be reassigned using '.set at=$reg'. Regardless of which register is nominated as the assembler temporary, $at remains $1 when written by the user. Adds warnings under the following conditions: * The register nominated as the assembler temporary is used by the user. * '.set noat' is in effect and $at is used by the user. Both of these only work for named registers. I have a follow up commit that makes it work for numeric registers as well. XFAIL set-at-directive.s since it incorrectly tests that $at is redefined by '.set at=$reg'. Testcases will follow in a separate commit. Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3167 llvm-svn: 204710	2014-03-25 10:57:07 +00:00
Kevin Enderby	89299400ac	Fix crashes when assembler directives are used that are not for Mach-O object files by generating an error instead. rdar://16335232 llvm-svn: 204687	2014-03-25 00:05:50 +00:00
Matt Arsenault	db8b1d5b6c	R600: Don't viewCFG() under DEBUG() except on failure. Having these popping up every time you use -debug is really irritating. llvm-svn: 204664	2014-03-24 20:29:02 +00:00
Matt Arsenault	684dc80b6d	R600/SI: Fix extra mov from legalizing 64-bit SALU ops. Check the register class of each operand individually to avoid an extra copy to a vgpr. llvm-svn: 204662	2014-03-24 20:08:13 +00:00
Matt Arsenault	248b7b6ba1	R600/SI: Sub-optimal fix for 64-bit immediates with SALU ops. No longer asserts, but now you get moves loading legal immediates into the split 32-bit operations. llvm-svn: 204661	2014-03-24 20:08:09 +00:00
Matt Arsenault	f35182c783	R600/SI: Fix 64-bit bit ops that require the VALU. Try to match scalar and first like the other instructions. Expand 64-bit ands to a pair of 32-bit ands since that is not available on the VALU. llvm-svn: 204660	2014-03-24 20:08:05 +00:00
Matt Arsenault	a7f1e0c44f	R600: Implement isNarrowingProfitable. llvm-svn: 204658	2014-03-24 19:43:31 +00:00
Matt Arsenault	bd9958038c	R600/SI: Move splitting 64-bit immediates to separate function. llvm-svn: 204651	2014-03-24 18:26:52 +00:00
Ulrich Weigand	cae3a17a21	[PowerPC] Generate little-endian object files As a first step towards real little-endian code generation, this patch changes the PowerPC MC layer to actually generate little-endian object files. This involves passing the little-endian flag through the various layers, including down to createELFObjectWriter so we actually get basic little-endian ELF objects, emitting instructions in little-endian order, and handling fixups and relocations as appropriate for little-endian. The bulk of the patch is to update most test cases in test/MC/PowerPC to verify both big- and little-endian encodings. (The only test cases not updated are those that create actual big-endian ABI code, like the TLS tests.) Note that while the object files are now little-endian, the generated code itself is not yet updated, in particular, it still does not adhere to the ELFv2 ABI. llvm-svn: 204634	2014-03-24 18:16:09 +00:00
Quentin Colombet	2d5c156b96	[X86][ISelDAG] Add missing fallback patterns for avx2 broadcast instructions. Those patterns are used when the load cannot be folded into the related broadcast during the select phase. This happens when the load gets additional uses that were not anticipated during the previous lowering phases (constant vector to constant load, then constant load reused) or when selection DAG is not able to prove that folding the load will not create a cycle in the DAG. <rdar://problem/16074331> llvm-svn: 204631	2014-03-24 17:54:19 +00:00
Matt Arsenault	ad41d7b531	R600/SI: Fix 64-bit private loads. llvm-svn: 204630	2014-03-24 17:50:46 +00:00
Adam Nemet	b47372f555	[X86] Fix non-determinism in LowerVectorAllZeroTest This can be observed with the old testcase of CodeGen/X86/pr12312.ll: 47c47 < vorps %ymm0, %ymm1, %ymm0 --- > vorps %ymm1, %ymm0, %ymm0 97c97 < vorps %ymm1, %ymm0, %ymm0 --- > vorps %ymm0, %ymm1, %ymm0 The vector VecIns is populated with all the values from VecInMap. This is done while iterating VecInMap. VecInMap uses a hash of pointer values so the resulting order can vary depending on the memory layout. The fix is to populate the vector VecIns earlier as VecInMap is populated. This is done in DAG traversal order. Fixes <rdar://problem/16398806> llvm-svn: 204623	2014-03-24 16:52:08 +00:00
Daniel Sanders	d89b13625e	[mips] Add error message when trying to use $at in '.set noat' mode. Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3158 llvm-svn: 204621	2014-03-24 16:48:01 +00:00
Eli Bendersky	6de2087ea7	Removes the NVPTXSplitBBatBar pass. This pass is a historic remnant and actually causes less efficient code to be generated in some cases. llvm-svn: 204620	2014-03-24 16:36:39 +00:00
Tom Stellard	8c12fd9252	R600/SI: Fix warning with gcc 4.8.2 llvm-svn: 204618	2014-03-24 16:12:34 +00:00
Tom Stellard	da99c6eff5	R600/SI: Promote fp64 SELECT to i64 This type promotion is replacing a Tablegen pattern and it is already covered by existing tests. llvm-svn: 204617	2014-03-24 16:07:30 +00:00
Tom Stellard	2c1c9de151	R600: Reorganize tablegen instruction definitions Each GPU family now has its own file. llvm-svn: 204615	2014-03-24 16:07:25 +00:00
Will Schmidt	114777e47f	[PPC64LE] ELFv2 ABI updates for the .opd section The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as such, does not have a ".opd" section. This is keyed off a _CALL_ELF=2 macro check. The CALL_ELF check is not clearly documented at this time. The basis for usage in this patch is from the gcc thread here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html > Adding comment from Uli: Looks good to me. I think the old-style JIT doesn't really work anyway for 64-bit, but at least with this patch LLVM will compile and link again on a ppc64le host ... llvm-svn: 204614	2014-03-24 16:04:15 +00:00
Daniel Sanders	01f9fc06e7	[mips] Allow dsubu to take an immediate as an alias for dsubiu. Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3155 llvm-svn: 204611	2014-03-24 15:38:00 +00:00
Hal Finkel	e01d32107c	[PowerPC] Mark many instructions as commutative I'm under the impression that we used to infer the isCommutable flag from the instruction-associated pattern. Regardless, we don't seem to do this (at least by default) any more. I've gone through all of our instruction definitions, and marked as commutative all of those that should be trivial to commute (by exchanging the first two operands). There has been special code for the RL* instructions, and that's not changed. Before this change, we had the following commutative instructions: RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo XSADDDP XSMULDP XVADDDP XVADDSP XVMULDP XVMULSP After: ADD4 ADD4o ADD8 ADD8o ADDC ADDC8 ADDC8o ADDCo ADDE ADDE8 ADDE8o ADDEo AND AND8 AND8o ANDo CRAND CREQV CRNAND CRNOR CROR CRXOR EQV EQV8 EQV8o EQVo FADD FADDS FADDSo FADDo FMADD FMADDS FMADDSo FMADDo FMSUB FMSUBS FMSUBSo FMSUBo FMUL FMULS FMULSo FMULo FNMADD FNMADDS FNMADDSo FNMADDo FNMSUB FNMSUBS FNMSUBSo FNMSUBo MULHD MULHDU MULHDUo MULHDo MULHW MULHWU MULHWUo MULHWo MULLD MULLDo MULLW MULLWo NAND NAND8 NAND8o NANDo NOR NOR8 NOR8o NORo OR OR8 OR8o ORo RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo VADDCUW VADDFP VADDSBS VADDSHS VADDSWS VADDUBM VADDUBS VADDUHM VADDUHS VADDUWM VADDUWS VAND VAVGSB VAVGSH VAVGSW VAVGUB VAVGUH VAVGUW VMADDFP VMAXFP VMAXSB VMAXSH VMAXSW VMAXUB VMAXUH VMAXUW VMHADDSHS VMHRADDSHS VMINFP VMINSB VMINSH VMINSW VMINUB VMINUH VMINUW VMLADDUHM VMULESB VMULESH VMULEUB VMULEUH VMULOSB VMULOSH VMULOUB VMULOUH VNMSUBFP VOR VXOR XOR XOR8 XOR8o XORo XSADDDP XSMADDADP XSMAXDP XSMINDP XSMSUBADP XSMULDP XSNMADDADP XSNMSUBADP XVADDDP XVADDSP XVMADDADP XVMADDASP XVMAXDP XVMAXSP XVMINDP XVMINSP XVMSUBADP XVMSUBASP XVMULDP XVMULSP XVNMADDADP XVNMADDASP XVNMSUBADP XVNMSUBASP XXLAND XXLNOR XXLOR XXLXOR This is a by-inspection change, and I'm not sure how to write a reliable test case. I would like advice on this, however. llvm-svn: 204609	2014-03-24 15:07:28 +00:00
Daniel Sanders	a771fefb72	[mips] Implement shorthand add / sub forms for MIPS. Summary: - If only two registers are passed to a three-register operation, then the first argument is both source and destination register. - If a non-register is passed as the last argument, generate the immediate version of the instruction. Also mark DADD commutative and add scheduling information (to the generic scheduler), and implement DSUB. Patch by David Chisnall His work was sponsored by: DARPA, AFRL CC: theraven Differential Revision: http://llvm-reviews.chandlerc.com/D3148 llvm-svn: 204605	2014-03-24 14:05:39 +00:00
Justin Holewinski	ba2fa6de4f	[NVPTX] Add isel patterns for addrspacecast llvm-svn: 204600	2014-03-24 11:17:53 +00:00
Hal Finkel	32854b0439	[PowerPC] Don't schedule VSX copy legalization unless VSX is enabled There is no need to schedule this extra pass if it will have nothing to do. llvm-svn: 204594	2014-03-24 09:51:41 +00:00
Hal Finkel	bbad2332e3	[PowerPC] Update comment re: VSX copy-instruction selection I've done some experimentation with this, and it looks like using the lower-latency (but lower throughput) copy instruction is essentially always the right thing to do. My assumption is that, in order to be relatively sure that the higher-latency copy will increase throughput, we'd want to have it unlikely to be in-flight with its use. On the P7, the global completion table (GCT) can hold a maximum of 120 instructions, shared among all active threads (up to 4), giving 30 instructions per thread. So specifically, I'd require at least that many instructions between the copy and the use before the high-latency variant is used. Trying this, however, over the entire test suite resulted in zero cases where the high-latency form would be preferable. This may be a consequence of the fact that the scheduler views copies as free, and so they tend to end up close to their uses. For this experiment I created a function: unsigned chooseVSXCopy(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, unsigned DestReg, unsigned SrcReg, unsigned StartDist = 1, unsigned Depth = 3) const; with an implementation like: if (!Depth) return PPC::XXLOR; const unsigned MaxDist = 30; unsigned Dist = StartDist; for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) { if (J->isTransient() && !J->isCopy()) continue; if (J->isCall() \|\| J->isReturn() \|\| J->readsRegister(DestReg, TRI)) return PPC::XXLOR; ++Dist; } // We've exceeded the required distance for the high-latency form, use it. if (Dist > MaxDist) return PPC::XVCPSGNDP; // If this is only an exit block, use the low-latency form. if (MBB.succ_empty()) return PPC::XXLOR; // We've reached the end of the block, check the successor blocks (up to some // depth), and use the high-latency form if that is okay with all successors. 
for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) { if (chooseVSXCopy(*J, (J)->begin(), DestReg, SrcReg, Dist, --Depth) == PPC::XXLOR) return PPC::XXLOR; } // All of our successor blocks seem okay with the high-latency variant, so // we'll use it. return PPC::XVCPSGNDP; and then changed the copy opcode selection from: Opc = PPC::XXLOR; to: Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg); In conclusion, I'm removing the FIXME from the comment, because I believe that there is, at least absent other examples, nothing to fix. llvm-svn: 204591	2014-03-24 09:36:36 +00:00
Arnaud A. de Grandmaison	1182600f20	ARM: no need to update SplatBits as it is not used llvm-svn: 204575	2014-03-23 21:14:32 +00:00
Nuno Lopes	31617266ea	remove a bunch of unused private methods found with a smarter version of -Wunused-member-function that I'm playing with. Apologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h | 1 include/llvm/IR/DebugInfo.h | 3 lib/CodeGen/MachineSSAUpdater.cpp | 10 -- lib/CodeGen/PostRASchedulerList.cpp | 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 10 -- lib/IR/DebugInfo.cpp | 12 -- lib/MC/MCAsmStreamer.cpp | 2 lib/Support/YAMLParser.cpp | 39 --------- lib/TableGen/TGParser.cpp | 16 --- lib/TableGen/TGParser.h | 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 9 -- lib/Target/ARM/ARMCodeEmitter.cpp | 12 -- lib/Target/ARM/ARMFastISel.cpp | 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp | 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp | 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp | 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h | 2 lib/Target/PowerPC/PPCFastISel.cpp | 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp | 2 lib/Transforms/Instrumentation/BoundsChecking.cpp | 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp | 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp | 8 - lib/Transforms/Scalar/SCCP.cpp | 1 utils/TableGen/CodeEmitterGen.cpp | 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560	2014-03-23 17:09:26 +00:00
Hal Finkel	4a912250fa	[PowerPC] Make use of VSX f64 <-> i64 conversion instructions When VSX is available, these instructions should be used in preference to the older variants that only have access to the scalar floating-point registers. llvm-svn: 204559	2014-03-23 05:35:00 +00:00
Craig Topper	a9253267a9	Prune includes in ARM target. llvm-svn: 204548	2014-03-22 23:51:00 +00:00
Saleem Abdulrasool	44419fc3cd	ARM IAS: properly handle function entries in .thumb When a label is parsed, check if there is type information available for the label. If so, check if the symbol is a function. If the symbol is a function and we are in thumb mode and no explicit thumb_func has been emitted, adjust the symbol data to indicate that the function definition is a thumb function. The application of this inferencing is improved value handling in the object file (the required thumb bit is set on symbols which are thumb functions). It also helps improve compatibility with binutils. The one complication that arises from this handling is the MCAsmStreamer. The default implementation of getOrCreateSymbolData in MCStreamer does not support tracking the symbol data. In order to support the semantics of thumb functions, track symbol data in assembly streamer. Although O(n) in number of labels in the TU, this is already done in various other streamers and as such the memory overhead is not a practical concern in this scenario. llvm-svn: 204544	2014-03-22 19:26:18 +00:00
Hal Finkel	55805eb562	[PowerPC] Fix the VSX v2f64 return register v2f64 values, like other 128-bit values, are returned under VSX in register vs34 (Altivec register v2). llvm-svn: 204543	2014-03-22 18:24:43 +00:00
Chad Rosier	b7747e31ef	[AArch64] Add SchedRW lists to NEON instructions. Previously, only regular AArch64 instructions were annotated with SchedRW lists. This patch does the same for NEON enabling these instructions to be scheduled by the MIScheduler. Additionally, store operations are now modeled and a few SchedRW lists were updated for bug fixes (e.g. multiple def operands). Reviewers: apazos, mcrosier, atrick Patch by Dave Estes <cestes@codeaurora.org>! llvm-svn: 204505	2014-03-21 19:34:41 +00:00
Matt Arsenault	8e2581b11e	R600/SI: Move instruction patterns to scalar versions. Some of them also had the pattern on both, so this removes the duplication. llvm-svn: 204492	2014-03-21 18:01:18 +00:00
Daniel Sanders	f88a29e66a	[mips] Correct lowering of VECTOR_SHUFFLE to VSHF. Summary: VECTOR_SHUFFLE concatenates the vectors in a vectorwise fashion: <0b00, 0b01> + <0b10, 0b11> -> <0b00, 0b01, 0b10, 0b11> VSHF concatenates the vectors in a bitwise fashion: <0b00, 0b01> + <0b10, 0b11> -> 0b0100 + 0b1110 -> 0b01001110 = <0b10, 0b11, 0b00, 0b01> We must therefore swap the operands to get the correct result. The test case that discovered the issue was MultiSource/Benchmarks/nbench. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3142 llvm-svn: 204480	2014-03-21 16:56:51 +00:00
Tom Stellard	1583409e33	R600/SI: Handle MUBUF instructions in SIInstrInfo::moveToVALU() llvm-svn: 204476	2014-03-21 15:51:57 +00:00
Tom Stellard	e038720702	R600/SI: Handle S_MOV_B64 in SIInstrInfo::moveToVALU() llvm-svn: 204475	2014-03-21 15:51:54 +00:00
Tom Stellard	def38c567d	R600/SI: Use SGPR_(32|64) reg classes when lowering SI_ADDR64_RSRC The SReg_(32|64) register classes contain special registers in addition to the numbered SGPRs. This can lead to machine verifier errors when these register classes are used as sub-registers for SReg_128, since SReg_128 only uses the numbered SGPRs. Replacing SReg_(32|64) with SGPR_(32|64) fixes this problem, since the SGPR_(32|64) register classes contain only numbered SGPRs. Test cases for this are coming in a later commit. llvm-svn: 204474	2014-03-21 15:51:53 +00:00
Richard Sandiford	5676585886	[SystemZ] Use "let Predicates =" for blocks of new instructions ...instead of a separate Requires for each one. This style was already used in some places and seems more compact. No behavioral change intended. llvm-svn: 204452	2014-03-21 11:04:54 +00:00
Richard Sandiford	dc6c2c953d	[SystemZ] Add support for z196 float<->unsigned conversions These complement the older float<->signed instructions. llvm-svn: 204451	2014-03-21 10:56:30 +00:00
Matheus Almeida	518b4f9fcf	[mips] Update namespace. We should be using the llvm namespace and not an anonymous namespace in a header file. llvm-svn: 204450	2014-03-21 10:35:14 +00:00
Juergen Ributzka	f0dff49ad0	[Constant Hoisting] Make the constant materialization cost operand dependent Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435	2014-03-21 06:04:45 +00:00
Jiangning Liu	db55b02e1c	This reverts commit r203762, "ARM: support emission of complex SO expressions". The commit r203762 introduced a silent failure for complex SO expressions, which is even worse than a compiler crash. llvm-svn: 204427	2014-03-21 02:51:01 +00:00
Kevin Qin	b2c78b07d6	[AArch64] Remove .data_region directive from AArch64. .data_region is only used in Darwin, so it shouldn't be generated for other OS. Currently AArch64 doesn't support darwin yet, so I removed it from AArch64. When Darwin is supported someday, we can add it back and associate it with Darwin. llvm-svn: 204424	2014-03-21 02:12:48 +00:00
Weiming Zhao	0152485679	Fix PR19136: [ARM] Fix Folding SP Update into vpush/vpop Since MBB->computeRegisterLiveness() returns Dead for sub regs like s0, d0 is used in vpop instead of updating sp, which makes s0 dead before its use. This patch checks the liveness of each subreg to make sure the reg is actually dead. llvm-svn: 204411	2014-03-20 23:28:16 +00:00
Juergen Ributzka	46357931ab	Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass." I will break this up into smaller pieces for review and recommit. llvm-svn: 204393	2014-03-20 20:17:13 +00:00
Juergen Ributzka	6dab520c70	[Constant Hoisting] Extend coverage of the constant hoisting pass. This commit extends the coverage of the constant hoisting pass, adds additional debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> llvm-svn: 204389	2014-03-20 19:55:52 +00:00
Matt Arsenault	99395fa98f	R600: Remove unused method declaration. llvm-svn: 204357	2014-03-20 16:41:06 +00:00
Kai Nacke	93fe5e810d	[MIPS] Add cpu octeon and some instructions The Octeon cpu from Cavium Networks is mips64r2 based and has an extended instruction set. In order to utilize this with LLVM, a new cpu feature "octeon" and a subtarget feature "cnmips" is added. A small set of new instructions (baddu, dmul, pop, dpop, seq, sne) is also added. LLVM generates dmul, pop and dpop instructions with option -mcpu=octeon or -mattr=+cnmips. llvm-svn: 204337	2014-03-20 11:51:58 +00:00
Zoran Jovanovic	a0f5328984	Provide an operand for microMIPS wait instruction. llvm-svn: 204329	2014-03-20 10:41:37 +00:00
Zoran Jovanovic	87d13e5ec1	Implementation of microMIPS 16-bit instructions MOVE and JALR. Differential Revision: http://llvm-reviews.chandlerc.com/D3112 llvm-svn: 204325	2014-03-20 10:18:24 +00:00
Zoran Jovanovic	28221d8bc1	Mark alias symbols as microMIPS if necessary. Differential Revision: http://llvm-reviews.chandlerc.com/D3080 llvm-svn: 204323	2014-03-20 09:44:49 +00:00
Matheus Almeida	9e1450bce9	[mips] Splitting up class definition from implementation. Also removed some unnecessary #includes. No functional changes. llvm-svn: 204320	2014-03-20 09:29:54 +00:00
Alexey Samsonov	94bc422d7a	Add llvm_unreachable after fully-covered switches to appease GCC llvm-svn: 204318	2014-03-20 07:30:40 +00:00
Saleem Abdulrasool	39f773f939	Reapply 'ARM IAS: support .thumb_set' Re-apply the change after it was reverted due to conflicts caused by another change being reverted. llvm-svn: 204306	2014-03-20 06:05:33 +00:00
Craig Topper	38afbfdd76	[X86] Check return value of readSIB in disassembler so errors propagate. In particular this makes a too short instruction with a missing SIB byte fail. llvm-svn: 204305	2014-03-20 05:56:00 +00:00
Hao Liu	40b5ab8e5b	[ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR. llvm-svn: 204304	2014-03-20 05:36:59 +00:00
Rafael Espindola	7fadc0ea7d	Look through variables when computing relocations. Given bar = foo + 4 .long bar MC would eat the 4. GNU as includes it in the relocation. The rule seems to be that a variable that defines a symbol is used in the relocation and one that does not define a symbol is evaluated and the result included in the relocation. Fixing this unfortunately required some other changes: * Since the variable is now evaluated, it would prevent the ELF writer from noticing the weakref marker the elf streamer uses. This patch then replaces that with a VariantKind in MCSymbolRefExpr. * Using VariantKind then requires us to look past other VariantKind to see .weakref bar,foo call bar@PLT doing this also fixes zed = foo +2 call zed@PLT so that is a good thing. * Looking past VariantKind means that the relocation selection has to use the fixup instead of the target. This is a reboot of the previous fixes for MC. I will watch the sanitizer buildbot and wait for a build before adding back the previous fixes. llvm-svn: 204294	2014-03-20 02:12:01 +00:00
Matt Arsenault	dd78b8059b	R600/SI: Add unused LDS 2 form instructions. llvm-svn: 204275	2014-03-19 22:19:56 +00:00
Matt Arsenault	d06ebd93e6	R600/SI: Add support for 64-bit LDS writes llvm-svn: 204274	2014-03-19 22:19:54 +00:00
Matt Arsenault	b943348cb9	R600/SI: Add support for 64-bit LDS loads. v2: -Use correct opcode for DS_READ_64 llvm-svn: 204273	2014-03-19 22:19:52 +00:00
Matt Arsenault	99ed78926b	R600/SI: Match i16 immediate offset of LDS instructions. llvm-svn: 204272	2014-03-19 22:19:49 +00:00
Matt Arsenault	547aff20f5	R600/SI: Don't display the GDS bit. It isn't actually used now, and probably never will be, plus it makes tests less annoying. I also think SC prints GDS instructions as a separate instruction name. llvm-svn: 204270	2014-03-19 22:19:43 +00:00
Matt Arsenault	9cd8c38a32	R600/SI: Merge offset0 and offset1 fields for single address DS instructions v2 Also remove unused data fields from the DS_Load_Helper class. v2: - Merge fields for DS_WRITE llvm-svn: 204269	2014-03-19 22:19:39 +00:00
Matheus Almeida	c11f305082	[mips] 80-column. llvm-svn: 204252	2014-03-19 16:29:06 +00:00
Craig Topper	c6d4efa1e5	Prune includes in X86 target. llvm-svn: 204216	2014-03-19 06:53:25 +00:00
Rafael Espindola	7bbd5c2636	Revert "Add back r203962, r204028 and r204059." This reverts commit r204178. llvm-svn: 204203	2014-03-19 00:13:43 +00:00
Rafael Espindola	574bfa12fa	Add back r203962, r204028 and r204059. This reverts commit r204137. This includes a fix for handling aliases of aliases. llvm-svn: 204178	2014-03-18 20:40:38 +00:00
Hans Wennborg	aec21ce43e	X86 memcpy lowering: use "rep movs" even when esi is used as base pointer For functions where esi is used as base pointer, we would previously fall back from lowering memcpy with "rep movs" because that clobbers esi. With this patch, we just store esi in another physical register, and restore it afterwards. This adds a little bit of register pressure, but the more efficient memcpy should be worth it. Differential Revision: http://llvm-reviews.chandlerc.com/D2968 llvm-svn: 204174	2014-03-18 20:04:34 +00:00
Manuel Jacob	dcb78dbc82	X86: Use enums for memory operand decoding instead of integer literals. Summary: X86BaseInfo.h defines an enum for the offset of each operand in a memory operand sequence. Some code uses it and some does not. This patch replaces (hopefully) all remaining locations where an integer literal was used instead of this enum. No functionality change intended. Reviewers: nadav CC: llvm-commits, t.p.northover Differential Revision: http://llvm-reviews.chandlerc.com/D3108 llvm-svn: 204158	2014-03-18 16:14:11 +00:00
Krzysztof Parzyszek	4d38c82575	Enable CFI on Hexagon. llvm-svn: 204157	2014-03-18 16:02:37 +00:00
Bill Schmidt	ff9622ef0e	Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0. When converting a signed 32-bit integer to double-precision floating point on hardware without a lfiwax instruction, we have to instead use a lfd followed by fcfid. We were erroneously offsetting the address by 4 bytes in preparation for either a lfiwax or lfiwzx when generating the lfd. This fixes that silly error. This was not caught in the test suite since the conversion tests were run with -mcpu=pwr7, which implies availability of lfiwax. I've added another test case for older hardware that checks the code we expect in the absence of lfiwax and other flavors of fcfid. There are fewer tests in this test case because we punt to DAG selection in more cases on older hardware. (We must generate complex fiddly sequences in those cases, and there is marginal benefit in duplicating that logic in fast-isel.) llvm-svn: 204155	2014-03-18 14:32:50 +00:00
Alexander Kornienko	64de613751	Revert r203962 and two revisions depending on it: r204028 and r204059. The revision I'm reverting breaks handling of transitive aliases. This blocks us and breaks sanitizer bootstrap: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2651 (and checked locally by Alexey). This revision is the result of: svn merge -r204059:204058 -r204028:204027 -r203962:203961 . + the regression test added to test/MC/ELF/alias.s Another way to reproduce the regression with clang: $ cat q.c void a1(); void a2() __attribute__((alias("a1"))); void a3() __attribute__((alias("a2"))); void a1() {} $ ~/work/llvm-build/bin/clang-3.5-good -c q.c && mv q.o good.o && \ ~/work/llvm-build/bin/clang-3.5-bad -c q.c && mv q.o bad.o && \ objdump -t good.o bad.o good.o: file format elf64-x86-64 SYMBOL TABLE: 0000000000000000 l df ABS 0000000000000000 q.c 0000000000000000 l d .text 0000000000000000 .text 0000000000000000 l d .data 0000000000000000 .data 0000000000000000 l d .bss 0000000000000000 .bss 0000000000000000 l d .comment 0000000000000000 .comment 0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack 0000000000000000 l d .eh_frame 0000000000000000 .eh_frame 0000000000000000 g F .text 0000000000000006 a1 0000000000000000 g F .text 0000000000000006 a2 0000000000000000 g F .text 0000000000000006 a3 bad.o: file format elf64-x86-64 SYMBOL TABLE: 0000000000000000 l df ABS 0000000000000000 q.c 0000000000000000 l d .text 0000000000000000 .text 0000000000000000 l d .data 0000000000000000 .data 0000000000000000 l d .bss 0000000000000000 .bss 0000000000000000 l d .comment 0000000000000000 .comment 0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack 0000000000000000 l d .eh_frame 0000000000000000 .eh_frame 0000000000000000 g F .text 0000000000000006 a1 0000000000000000 g F .text 0000000000000006 a2 0000000000000000 g .text 0000000000000000 a3 llvm-svn: 204137	2014-03-18 10:36:11 +00:00
Alon Mishne	ad312155a6	[C++11] Change DebugInfoFinder to use range-based loops Also changes the iterators to return actual DI type over MDNode. llvm-svn: 204130	2014-03-18 09:41:07 +00:00
Craig Topper	26696314d5	[C++11] Mark the target fast isel classes as 'final' so that the compiler can de-virtualize some of the internal calls. llvm-svn: 204123	2014-03-18 07:27:13 +00:00
Saleem Abdulrasool	42b233a836	ARM: add an assertion Add an assertion that a valid section is referenced. The potential NULL pointer dereference was identified by the clang static analyzer. llvm-svn: 204114	2014-03-18 05:26:55 +00:00
Matt Arsenault	f45faaf30d	Make methods static llvm-svn: 204085	2014-03-17 22:23:09 +00:00
Matt Arsenault	fae02989b7	R600: Match sign_extend_inreg to BFE instructions llvm-svn: 204072	2014-03-17 18:58:11 +00:00
Adam Nemet	8a130a5f86	[X86] Fix unused variable warning with NDEBUG from r204058 llvm-svn: 204063	2014-03-17 17:32:53 +00:00
Saleem Abdulrasool	11543a9953	ARM IAS: support .thumb_set This performs the equivalent of a .set directive in that it creates a symbol which is an alias for another symbol or value which may possibly be yet undefined. This directive also has the added property in that it marks the aliased symbol as being a thumb function entry point, in the same way that the .thumb_func directive does. The current implementation fails one test due to an unrelated issue. Functions within .thumb sections are not marked as thumb_func. The result is that the aliasee function is not valued correctly. llvm-svn: 204059	2014-03-17 17:13:54 +00:00
Adam Nemet	24381f1cb7	[VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16 Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX. For that to work properly, we also need to teach the legalizer about the specific promotion required here. The default vector promotion uses bitcasting to a vector type of the same total size. We want to promote the vector element type, effectively widening the operation and then truncating the result. This is analogous to the current logic of how int_to_fp is promoted. The change also factors out some code from the int_to_fp promotion code to ValueType::widenIntegerVectorElementType. This is now shared between int_to_fp and fp_to_int. There is no longer need for the custom lowering of fp_to_sint f32->v8i16 in X86. It can now go through the new target-independent fp_to_*int promotion logic. I also checked that no other target uses Promote for these ops yet, so there shouldn't be any unexpected change in behavior. Fixes <rdar://problem/16202247> llvm-svn: 204058	2014-03-17 17:06:14 +00:00
Tom Stellard	d0084464b5	R600/SI: Fix implementation of isInlineConstant() used by the verifier The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204056	2014-03-17 17:03:52 +00:00
Tom Stellard	fbe435de63	R600/SI: Use correct dest register class for V_READFIRSTLANE_B32 This instruction writes to a 32-bit SGPR. This change required adding the 32-bit VCC_LO and VCC_HI registers, because the full VCC register is 64 bits. This fixes verifier errors on several of the indirect addressing piglit tests. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204055	2014-03-17 17:03:51 +00:00
Tom Stellard	ca700e41ef	R600/SI: Add generic checks to SIInstrInfo::verifyInstruction() Added checks for number of operands and operand register classes. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204054	2014-03-17 17:03:49 +00:00
Lang Hames	7c8189c6d3	[X86] New and improved VZeroUpperInserter optimization. - Adds support for inserting vzerouppers before tail-calls. This is enabled implicitly by having MachineInstr::copyImplicitOps preserve regmask operands, which allows VZeroUpperInserter to see where tail-calls use vector registers. - Fixes a bug that caused the previous version of this optimization to miss some vzeroupper insertion points in loops. (Loops-with-vector-code that followed loops-without-vector-code were mistakenly overlooked by the previous version). - New algorithm never revisits instructions. Fixes <rdar://problem/16228798> llvm-svn: 204021	2014-03-17 01:22:54 +00:00
Arnaud A. de Grandmaison	75c9e6dedf	Remove some dead assignments found by scan-build llvm-svn: 204013	2014-03-15 22:13:15 +00:00
Patrik Hagglund	8d09a6c674	Replace ValueTypes.h with MachineValueType.h if possible. Utilize the previous move of MVT to a separate header for all trivial cases (that don't need any further restructuring). Reviewed By: Tim Northover llvm-svn: 204003	2014-03-15 09:11:41 +00:00
Matt Arsenault	ea330fbe49	R600: Remove unnecessary attempt to zext a pointer. Private pointers are now always 32-bits. llvm-svn: 203989	2014-03-15 00:08:26 +00:00
Matt Arsenault	74891cdefe	R600: Code cleanup. Use sign_extend_inreg and getZeroExtendInReg instead of using the bit operations they expand into. llvm-svn: 203988	2014-03-15 00:08:22 +00:00
Duncan P. N. Exon Smith	a824862664	x86: Add missing break to getCallPreservedMask() This change brings getCallPreservedMask()'s logic in line with getCalleeSavedRegs(). While this changes the control flow slightly, the change is not currently observable. is64Bit must be false to get to the accidental fallthrough, but the case that we fall into (coldcc) does nothing unless is64Bit is true. llvm-svn: 203943	2014-03-14 16:29:21 +00:00
Duncan P. N. Exon Smith	fea3c8afd6	x86: NFC: Make getCallPreservedMask() more similar to getCalleeSavedRegs() Changing order of checks in getCallPreservedMask() to match getCalleeSavedRegs() so that the logic is easier to compare. llvm-svn: 203939	2014-03-14 16:09:13 +00:00
Duncan P. N. Exon Smith	8f66a3afe0	x86: getCalleeSavedRegs() would crash on 0 (so don't default to it) The current logic assumes that MF is not 0. Assert that it isn't, and remove the default of 0 from the header. llvm-svn: 203934	2014-03-14 15:38:12 +00:00
Ulrich Weigand	f445399870	[ppc64] Avoid copy relocs in named rodata sections Commit r181723 introduced code to avoid placing initialized variables needing relocations into the .rodata section, which avoids copy relocs that do not work as expected on ppc64 function references. The same treatment is also needed for named .rodata.XXX sections. This patch changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal to modify "Kind" before calling the default SelectSectionForGlobal routine, instead of first calling the default routine and then just checking for the (main) .rodata section afterwards. llvm-svn: 203921	2014-03-14 12:45:22 +00:00
Evgeniy Stepanov	49e2625144	AddressSanitizer instrumentation for MOV and MOVAPS. This is an initial version of *Sanitizer instrumentation of assembly code. Patch by Yuri Gorshenin. llvm-svn: 203908	2014-03-14 08:58:04 +00:00
Rafael Espindola	2fb5bc33a3	Remove the linker_private and linker_private_weak linkages. These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses: * utf-16 strings in clang. * non-unnamed_addr strings produced by the sanitizers. It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work. With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need. The objc uses are currently split in * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need. * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this. * Uses of private name and weak linkage. The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are * the linker will merge these symbols by name. * the linker will hide them in the final DSO. Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce|weak)(_odr)?. For now, just keeping the "\01l" prefix is probably the best for these symbols. 
If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private). llvm-svn: 203866	2014-03-13 23:18:37 +00:00
Owen Anderson	16c6bf49b7	Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865	2014-03-13 23:12:04 +00:00
Rafael Espindola	4269b9eed5	Use printable names to implement directional labels. This changes the implementation of local directional labels to use a dedicated map. With that it can then just use CreateTempSymbol, which is what the rest of MC uses. CreateTempSymbol doesn't do a great job at making sure the names are unique (or being efficient when the names are not needed), but that should probably be fixed in a followup patch. This fixes pr18928. llvm-svn: 203826	2014-03-13 18:09:26 +00:00
Tom Stellard	08ef1233c6	R600: LDS instructions shouldn't implicitly define OQAP LDS instructions are pseudo instructions which model the OQAP defs and uses within a single instruction. This fixes a hang in the opencv MedianFilter tests. llvm-svn: 203818	2014-03-13 17:13:04 +00:00
Hans Wennborg	89050436e6	[ARM] Use symbolic register names in .cfi directives only with IAS (PR19110) This is a follow-up to r203635. Saleem pointed out that since symbolic register names are much easier to read, it would be good if we could turn them off only when we really need to because we're using an external assembler. Differential Revision: http://llvm-reviews.chandlerc.com/D3056 llvm-svn: 203806	2014-03-13 15:56:41 +00:00
Manuel Jacob	a7c48f99ae	CodeGenPrep: sink extends of illegal types into use block. Summary: This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. This is an update of D2973 which was reverted because of a bug reported as PR19084. Reviewers: t.p.northover, chapuni Reviewed By: t.p.northover CC: llvm-commits, alex, chapuni Differential Revision: http://llvm-reviews.chandlerc.com/D3021 llvm-svn: 203797	2014-03-13 13:36:25 +00:00
Elena Demikhovsky	fd05667276	AVX-512: masked load/store + intrinsics for them. llvm-svn: 203790	2014-03-13 12:05:52 +00:00
Tim Northover	38a93aaaa1	AArch64: error when both positional & named operands are used. Only one instruction pair needed changing: SMULH & UMULH. The previous code worked, but MC was doing extra work treating Ra as a valid operand (which then got completely overwritten in MCCodeEmitter). No behaviour change, so no tests. llvm-svn: 203772	2014-03-13 09:00:13 +00:00
Hal Finkel	27774d9274	[PowerPC] Initial support for the VSX instruction set VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure. The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the first 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below). Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However: - CodeGen support causes miscompiles; test-suite runtime failures: MultiSource/Benchmarks/FreeBench/distray/distray MultiSource/Benchmarks/McCat/08-main/main MultiSource/Benchmarks/Olden/voronoi/voronoi MultiSource/Benchmarks/mafft/pairlocalalign MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 SingleSource/Benchmarks/CoyoteBench/almabench SingleSource/Benchmarks/Misc/matmul_f64_4x4 - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be. - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed. - Many more regression tests are needed. 
Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work on this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures. llvm-svn: 203768	2014-03-13 07:58:58 +00:00
Hal Finkel	5457bd08cb	[TableGen] Optionally forbid overlap between named and positional operands There are currently two schemes for mapping instruction operands to instruction-format variables for generating the instruction encoders and decoders for the assembler and disassembler respectively: a) to map by name and b) to map by position. In the long run, we'd like to remove the position-based scheme and use only name-based mapping. Unfortunately, the name-based scheme currently cannot deal with complex operands (those with suboperands), and so we currently must use the position-based scheme for those. On the other hand, the position-based scheme cannot deal with (register) variables that are split into multiple ranges. An upcoming commit to the PowerPC backend (adding VSX support) will require this capability. While we could teach the position-based scheme to handle that, since we'd like to move away from the position-based mapping generally, it seems silly to teach it new tricks now. What makes more sense is to allow for partial transitioning: use the name-based mapping when possible, and only use the position-based scheme when necessary. Now the problem is that mixing the two sensibly was not possible: the position-based mapping would map based on position, but would not skip those variables that were mapped by name. Instead, the two sets of assignments would overlap. However, I cannot currently change the current behavior, because there are some backends that rely on it [I think mistakenly, but I'll send a message to llvmdev about that]. So I've added a new TableGen bit variable: noNamedPositionallyEncodedOperands, that can be used to cause the position-based mapping to skip variables mapped by name. llvm-svn: 203767	2014-03-13 07:57:54 +00:00
Saleem Abdulrasool	aae4dc21ea	ARM: ignore unused variable to fix -Wunused-variable builds llvm-svn: 203765	2014-03-13 07:15:45 +00:00
Saleem Abdulrasool	dadf94ce84	ARM: support emission of complex SO expressions Support to the IAS was added to actually parse and handle the complex SO expressions. However, the object file lowering was not updated to compensate for the fact that the shift operand may be an absolute expression. When trying to assemble to an object file, the lowering would fail while succeeding when emitting purely assembly. Add an appropriate test. The test case is inspired by the test case provided by Jiangning Liu who also brought the issue to light. llvm-svn: 203762	2014-03-13 07:02:41 +00:00
Adam Nemet	d4e56073c7	[X86] Add peephole for masked rotate amount Extend what's currently done for shift because the HW performs this masking implicitly: (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y) I use the newly factored out multiclass that was only supporting shifts so far. For testing I extended my testcase for the new rotation idiom. <rdar://problem/15295856> llvm-svn: 203718	2014-03-12 21:20:55 +00:00
Roman Divacky	a26f9a6a42	Allow exclamation and tilde to be parsed as a part of the ppc asm operand. llvm-svn: 203699	2014-03-12 19:25:57 +00:00
Matt Arsenault	e389dd5d68	R600: Fix trunc store from i64 to i1 llvm-svn: 203695	2014-03-12 18:45:52 +00:00
Adam Nemet	b667c3fc26	[X86] Refactor peepholes for masked shift amount into a multiclass The peephole (shift x, (and y, 31)) -> (shift x, y) is repeated for each integer type and each shift variant. To improve this a new multiclass is added that covers all integer types. The shift patterns are now instantiated from this. I am planning to add new instances for rotates as well. No functional change intended: * test/CodeGen/X86/shift-and.ll provides coverage * Compared the expanded tablegen output and matched up the defs for these Pat<>s before and after llvm-svn: 203685	2014-03-12 18:02:33 +00:00
Quentin Colombet	b5e41ea144	[X86] Set the scheduling resources of some of the FPStack instructions. This is related to <rdar://problem/15607571>. llvm-svn: 203682	2014-03-12 17:33:42 +00:00
Rafael Espindola	3d5d464df8	Try harder to evaluate expressions when printing assembly. When printing assembly we don't have a Layout object, but we can still try to fold some constants. Testcase by Ulrich Weigand. llvm-svn: 203677	2014-03-12 16:55:59 +00:00
Hans Wennborg	6693c673a1	Add comment pointing to the binutils bugzilla entry This is a follow-up to r203635 as suggested by Rafael. llvm-svn: 203670	2014-03-12 16:14:23 +00:00
Will Schmidt	acae468c8e	Update the datalayout string for ppc64LE. llvm-svn: 203664	2014-03-12 14:59:17 +00:00
Daniel Sanders	61c76cc56f	[mips][fp64] Add an implicit def to MTHC1 claiming that it reads the lower 32-bits of 64-bit FPR Summary: This is a white lie to work around a widespread bug in the -mfp64 implementation. The problem is that none of the 32-bit fpu ops mention the fact that they clobber the upper 32-bits of the 64-bit FPR. This allows MTHC1 to be scheduled on the wrong side of most 32-bit FPU ops, particularly MTC1. Fixing that requires a major overhaul of the FPU implementation which can't be done right now due to time constraints. The testcase is SingleSource/Benchmarks/Misc/oourafft.c when given TARGET_CFLAGS='-mips32r2 -mfp64 -mmsa'. Also correct the comment added in r203464 to indicate that two instructions were affected. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3029 llvm-svn: 203659	2014-03-12 13:35:43 +00:00
Daniel Sanders	df22154579	[mips] BSEL's and BINS[RL] operands are reversed compared to the vselect node used in the pattern. Summary: Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes. The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite. During review, we also found that some of the existing CodeGen tests were incorrect and fixed them: * bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'. * vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order. * compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match. The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3028 llvm-svn: 203657	2014-03-12 11:54:00 +00:00
Tim Northover	3cccc45a9f	ARM: correct Dwarf output for non-contiguous VFP saves. When the list of VFP registers to be saved was non-contiguous (so multiple vpush/vpop instructions were needed) these were being ordered oddly, as in: vpush {d8, d9} vpush {d11} This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be broken). This switches the order of vpush/vpop (in both prologue and epilogue, obviously) so that the Dwarf locations are correct again. rdar://problem/16264856 llvm-svn: 203655	2014-03-12 11:29:23 +00:00
Patrik Hagglund	1da3512166	Replace '#include ValueTypes.h' with forward declarations. In some cases the include is pushed "downstream" (or removed if unused). llvm-svn: 203644	2014-03-12 08:00:24 +00:00
Hans Wennborg	14863418ed	[ARM] Use DWARF register numbers for CFI directives in ELF assembly It seems gas can't handle CFI directives with VFP register names ("d12", etc.). This broke us trying to build Chromium for Android after 201423. A gas bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=16694 compnerd suggested making this conditional on whether we're using the integrated assembler or not. I'll look into that in a follow-up patch. Differential Revision: http://llvm-reviews.chandlerc.com/D3049 llvm-svn: 203635	2014-03-12 03:52:34 +00:00
Sasa Stankovic	8600ebc74d	[mips] Implement NaCl sandboxing of function calls: * Add masking instructions before indirect calls (in MC layer). * Align call + branch delay to the bundle end (in MC layer). Differential Revision: http://llvm-reviews.chandlerc.com/D3032 llvm-svn: 203606	2014-03-11 21:23:40 +00:00
Rafael Espindola	a063bdde8d	Simplify a really complicated check for Arch == X86_64. The function hasReliableSymbolDifference had exactly one use in the MachO writer. It is also only true for X86_64. In fact, the comment refers to "Darwin x86_64" and everything else, so this makes the code match the comment. If this is to be abstracted again, it should be a property of TargetObjectWriter, like useAggressiveSymbolFolding. llvm-svn: 203605	2014-03-11 21:22:57 +00:00
Owen Anderson	56112b999b	Range-ify a loop. llvm-svn: 203590	2014-03-11 17:37:48 +00:00
Hans Wennborg	6c37f8b985	X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059) This fixes the bug where we would bitcast the 64-bit floating point result of cmpneqsd to a 64-bit integer even on 32-bit targets. Differential Revision: http://llvm-reviews.chandlerc.com/D3009 llvm-svn: 203581	2014-03-11 15:49:24 +00:00
Saleem Abdulrasool	0d96f3dd6e	ARM: honour -f{no-,}optimize-sibling-calls Use the options in the ARMISelLowering to control whether tail calls are optimised or not. Previously, this option was entirely ignored on the ARM target and only honoured on x86. This option is mostly useful in profiling scenarios. The default remains that tail call optimisations will be applied. llvm-svn: 203577	2014-03-11 15:09:54 +00:00
Saleem Abdulrasool	b720a6bab7	ARM: remove ancient -arm-tail-calls option This option is from 2010, designed to work around a linker issue on Darwin for ARM. According to grosbach this is no longer an issue and this option can safely be removed. llvm-svn: 203576	2014-03-11 15:09:49 +00:00
Saleem Abdulrasool	ec1ec1b416	ARM: enable tail call optimisation on Thumb 2 Tail call optimisation was previously disabled on all targets other than iOS5.0+. This enables the tail call optimisation on all Thumb 2 capable platforms. The test adjustments are to remove the IR hint "tail" to function invocation. The tests were designed assuming that tail call optimisations would not kick in which no longer holds true. llvm-svn: 203575	2014-03-11 15:09:44 +00:00
Tim Northover	445dd58aae	ARM: simplify EmitAtomicBinary64 ATOMIC_STORE operations always get here as a lowered ATOMIC_SWAP, so there's no need for any code to handle them specially. There should be no functionality change so no tests. llvm-svn: 203567	2014-03-11 13:19:55 +00:00
Tim Northover	e94a518a22	IR: add a second ordering operand to cmpxchg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559	2014-03-11 10:48:52 +00:00
Matt Arsenault	0211714ecb	R600: Calculate store mask instead of using switch. llvm-svn: 203527	2014-03-11 01:38:53 +00:00
Jim Grosbach	c94d993adf	X86: Enable ISel of 16-bit MOVBE instructions. When the MOVBE instructions are available, use them for 16-bit endian swapping as well as for 32 and 64 bit. The patterns were already present on the instructions, but weren't being matched because the operation was unconditionally marked to 'Expand.' Change that to be conditional on whether the MOVBE instructions are available. Use 'rolw' to implement the in-register version (32 and 64 bit have the dedicated 'bswap' instruction for that). Patch by Louis Gerbarg <lgg@apple.com>. rdar://15479984 llvm-svn: 203524	2014-03-11 00:44:14 +00:00
Matt Arsenault	faa297e89e	Remove incomplete comment llvm-svn: 203518	2014-03-11 00:01:37 +00:00
Matt Arsenault	6dde30354a	Move trivial getter into header. llvm-svn: 203517	2014-03-11 00:01:34 +00:00
Matt Arsenault	9504d2f269	Use .data() instead of &x[0] llvm-svn: 203516	2014-03-11 00:01:31 +00:00
Matt Arsenault	e1f1da30f4	Fix indentation llvm-svn: 203515	2014-03-11 00:01:27 +00:00
Sasa Stankovic	5fddf61089	[mips] Implement NaCl sandboxing of loads, stores and SP changes: * Add masking instructions before loads and stores (in MC layer). * Add masking instructions after SP changes (in MC layer). * Forbid loads, stores and SP changes in delay slots (in MI layer). Differential Revision: http://llvm-reviews.chandlerc.com/D2904 llvm-svn: 203484	2014-03-10 20:34:23 +00:00
Eli Bendersky	e78ae059b5	Make sure NVPTX doesn't emit symbol names that aren't valid in PTX. NVPTX, like the other backends, relies on generic symbol name sanitizing done by MCSymbol. However, the ptxas assembler is more stringent and disallows some additional characters in symbol names. See PR19099 for more details. llvm-svn: 203483	2014-03-10 20:05:42 +00:00
Reed Kotler	96b7402bac	Fix regression with -O0 for mips. llvm-svn: 203469	2014-03-10 16:31:25 +00:00
Daniel Sanders	059e4b158c	[mips][fp64] Add an implicit def to MFHC1 claiming that it reads the lower 32-bits of 64-bit FPR Summary: This is a white lie to work around a widespread bug in the -mfp64 implementation. The problem is that none of the 32-bit fpu ops mention the fact that they clobber the upper 32-bits of the 64-bit FPR. This allows MFHC1 to be scheduled on the wrong side of most 32-bit FPU ops. Fixing that requires a major overhaul of the FPU implementation which can't be done right now due to time constraints. MFHC1 is one of two affected instructions. These instructions are the only FPU instructions that don't read or write the lower 32-bits. We therefore pretend that it reads the bottom 32-bits to artificially create a dependency and prevent the scheduler changing the behaviour of the code. The other instruction is MTHC1 which will be fixed once I have found a failing test case for it. The testcase is test-suite/SingleSource/UnitTests/Vector/simple.c when given TARGET_CFLAGS="-mips32r2 -mfp64 -mmsa". Reviewers: jacksprat, matheusalmeida Reviewed By: jacksprat Differential Revision: http://llvm-reviews.chandlerc.com/D2966 llvm-svn: 203464	2014-03-10 15:01:57 +00:00
Matheus Almeida	64459d296b	[mips] Assembly parser must invoke the target streamer to handle .set reorder macro. llvm-svn: 203459	2014-03-10 13:21:10 +00:00
Tim Northover	2a661f3f73	AArch64: fix LowerCONCAT_VECTORS for new CodeGen. The function was making too many assumptions about its input: 1. The NEON_VDUP optimisation was far too aggressive, assuming (I think) that the input would always be BUILD_VECTOR. 2. We were treating most unknown concats as legal (by returning Op rather than SDValue()). I think only concats of pairs of vectors are actually legal. http://llvm.org/PR19094 llvm-svn: 203450	2014-03-10 09:34:07 +00:00
Craig Topper	24e685fdb0	[C++11] Remove 'virtual' keyword from methods marked with 'override' keyword. llvm-svn: 203444	2014-03-10 05:29:18 +00:00
Chandler Carruth	e42bafece1	[AArch64] Fix a use of uninitialized memory introduced in r203125, and caught by the MSan bootstrap build bot. This should hopefully get the bot green at long last. llvm-svn: 203441	2014-03-10 03:52:47 +00:00
Craig Topper	d25ff6f917	De-virtualize a method since it doesn't override anything and isn't overridden itself. llvm-svn: 203440	2014-03-10 03:22:59 +00:00
Craig Topper	ca7e3e5c4b	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203439	2014-03-10 03:19:03 +00:00
Chandler Carruth	aee3ca6cfd	[TTI] There is actually no realistic way to pop TTI implementations off the stack of the analysis group because they are all immutable passes. This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing. This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager. llvm-svn: 203437	2014-03-10 02:45:14 +00:00
Craig Topper	6bc27bf359	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203433	2014-03-10 02:09:33 +00:00
Venkatraman Govindaraju	f703132b09	[Sparc] Add support for decoding 'swap' instruction. llvm-svn: 203424	2014-03-09 23:32:07 +00:00
Craig Topper	39012ccee9	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203418	2014-03-09 18:03:14 +00:00
NAKAMURA Takumi	1783e1e984	Revert r203230, "CodeGenPrep: sink extends of illegal types into use block." It choked i686 stage2. llvm-svn: 203386	2014-03-09 11:01:07 +00:00
Craig Topper	f5e3b0b98c	De-virtualize some methods since they don't override anything. llvm-svn: 203379	2014-03-09 07:58:15 +00:00
Craig Topper	2d9361e325	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203378	2014-03-09 07:44:38 +00:00
Chandler Carruth	cdf4788401	[C++11] Add range based accessors for the Use-Def chain of a Value. This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a Use iterator rather than a User iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update all of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lines of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over User*s rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of T*s, but that can be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364	2014-03-09 03:16:01 +00:00
Duncan P. N. Exon Smith	429d2608f9	Change else if => if after return, after r203265 llvm-svn: 203347	2014-03-08 15:15:42 +00:00
Owen Anderson	8c1f17bb98	Range-ify some for loops. llvm-svn: 203306	2014-03-07 22:48:22 +00:00
Eli Bendersky	ab9da5129a	Remove unused method declaration llvm-svn: 203301	2014-03-07 22:19:10 +00:00
Tom Stellard	e28859f8fa	R600/SI: Using SGPRs is illegal for instructions that read carry-out from VCC Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203281	2014-03-07 20:12:39 +00:00
Tom Stellard	1c8788ef5a	R600/SI: Custom lower i1 stores These are sometimes created by the shrink to boolean optimization in the globalopt pass. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203280	2014-03-07 20:12:33 +00:00
Rafael Espindola	24a542fd5c	Don't avoid cfi instructions on the bg/p. The integrated assembler now works for ppc. Since this was the last use of the bg/p predicate and Hal says that it is now dead, drop the predicate too. llvm-svn: 203269	2014-03-07 19:04:12 +00:00
Ted Kremenek	d9e9c72732	Remove dead 'break' (dominated by 'return'). llvm-svn: 203267	2014-03-07 18:54:08 +00:00
Ted Kremenek	0b01471694	Remove dead 'return'. llvm-svn: 203265	2014-03-07 18:51:16 +00:00
Nico Weber	ad15692061	"Mac OS/X" -> "Mac OS X" spelling fixes for llvm. Patch from Sean McBride <sean@rogue-research.com>! llvm-svn: 203258	2014-03-07 18:08:54 +00:00
Duncan P. N. Exon Smith	29db0eb855	ARM: Make .unreq directives case-insensitive Be case-insensitive when processing .unreq directives. Patch by Lin Zuojian! llvm-svn: 203251	2014-03-07 16:16:52 +00:00
Richard Sandiford	95bc5f92ee	[SystemZ] Move sign_extend optimization to PerformDAGCombine The target was marking SIGN_EXTEND as Custom because it wanted to optimize certain sign-extended shifts. In all other respects the extension is Legal, so it'd be better to do the optimization in PerformDAGCombine instead. No functional change intended. llvm-svn: 203234	2014-03-07 11:34:35 +00:00
Tim Northover	ad3d81d320	CodeGenPrep: sink extends of illegal types into use block. This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. Patch by Manuel Jacob. llvm-svn: 203230	2014-03-07 11:04:30 +00:00
Tim Northover	fad2761ca0	InstCombine: form shuffles from wider range of insert/extractelements Sequences of insertelement/extractelements are sometimes used to build vectors; this code tries to put them back together into shuffles, but could only produce completely uniform shuffle types (<N x T> from two <N x T> sources). This should allow shuffles with different numbers of elements on the input and output sides as well. llvm-svn: 203229	2014-03-07 10:24:44 +00:00
Alexey Volkov	1051f04a8d	Enable FeatureFastUAMem for Silvermont processor Differential Revision: http://llvm-reviews.chandlerc.com/D2982 llvm-svn: 203218	2014-03-07 09:03:49 +00:00
Alexey Volkov	bb2f047346	Test commit Removed whitespace llvm-svn: 203216	2014-03-07 08:28:44 +00:00
David Majnemer	7b58305ff6	MC: Remove superfluous section attribute flag definitions Summary: llvm/MC/MCSectionMachO.h and llvm/Support/MachO.h both had the same definitions for the section flags. Instead, grab the definitions out of support. No functionality change. Reviewers: grosbach, Bigcheese, rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2998 llvm-svn: 203211	2014-03-07 07:36:05 +00:00
Rafael Espindola	b1f25f1b93	Replace PROLOG_LABEL with a new CFI_INSTRUCTION. The old system was fairly convoluted: * A temporary label was created. * A single PROLOG_LABEL was created with it. * A few MCCFIInstructions were created with the same label. The semantics were that the cfi instructions were mapped to the PROLOG_LABEL via the temporary label. The output position was that of the PROLOG_LABEL. The temporary label itself was used only for doing the mapping. The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to one by holding an index into the CFI instructions of this function. I did consider removing MMI.getFrameInstructions completely and having CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non-trivial constructors and destructors and are somewhat big, so this setup is probably better. The net result is that we don't create temporary labels that are never used. llvm-svn: 203204	2014-03-07 06:08:31 +00:00
Rafael Espindola	0233bf19fd	Simplify. No functionality change. llvm-svn: 203202	2014-03-07 04:58:32 +00:00
Rafael Espindola	afeb01c0a7	Simplify. No functionality change. llvm-svn: 203199	2014-03-07 04:45:03 +00:00
Saleem Abdulrasool	35476334e9	Support: split object format out of environment This is a preliminary setup change to support a renaming of Windows target triples. Split the object file format information out of the environment into a separate entity. Unfortunately, file format was previously treated as an environment with an unknown OS. This is most obvious in the ARM subtarget where the handling for macho on an arbitrary platform switches to AAPCS rather than APCS (as per Apple's needs). llvm-svn: 203160	2014-03-06 20:47:11 +00:00
Reid Kleckner	94a1c4d3f1	MS asm: The initial dot in struct access is optional Fixes PR18994. Tests, once again, in that other repository. =P llvm-svn: 203146	2014-03-06 19:19:12 +00:00
Matt Arsenault	f9a995d68c	R600: Fix extloads from i8 / i16 to i64. This appears to only be working for global loads. Private and local break for other reasons. llvm-svn: 203135	2014-03-06 17:34:12 +00:00
Matt Arsenault	9fe669c522	R600/SI: Expand selects on vectors. llvm-svn: 203134	2014-03-06 17:34:03 +00:00
Matt Arsenault	e6ed1d796f	Fix missing C++ mode comment llvm-svn: 203133	2014-03-06 17:33:58 +00:00
Richard Osborne	47155af5eb	[XCore] Add support for the "m" inline asm constraint. Summary: This provides support for CP and DP relative global accesses in inline asm. Reviewers: robertlytton Reviewed By: robertlytton Differential Revision: http://llvm-reviews.chandlerc.com/D2943 llvm-svn: 203129	2014-03-06 16:37:48 +00:00
Chad Rosier	86a8f72041	[AArch64] This is a work in progress to provide a machine description for the Cortex-A53 subtarget in the AArch64 backend. This patch lays the groundwork to annotate each AArch64 instruction (no NEON yet) with a list of SchedReadWrite types. The patch also provides the Cortex-A53 processor resources, maps those to the default SchedReadWrites, and provides basic latency. NEON support will be added in a subsequent patch with proper forwarding logic. Verification was done by setting the pre-RA scheduler to linearize to better gauge the effect of the MIScheduler. Even without modeling the forward logic, the results show a modest improvement for Cortex-A53. Reviewers: apazos, mcrosier, atrick Patch by Dave Estes <cestes@codeaurora.org>! llvm-svn: 203125	2014-03-06 16:04:00 +00:00
Richard Sandiford	b4d67b593e	[SystemZ] Remove "virtual" from override methods Also fix a couple of cases where "override" was missing. No behavioural change intended. llvm-svn: 203110	2014-03-06 12:03:36 +00:00
Richard Sandiford	21f5d68a17	[SystemZ] Use "auto" for cast results No functional change intended. llvm-svn: 203106	2014-03-06 11:22:58 +00:00
Richard Sandiford	28c111ec8a	[SystemZ] Use "for (auto" a bit Just the simple cases for now. There were a few knock-on changes of MachineBasicBlock *s to MachineBasicBlock &s. No functional change intended. llvm-svn: 203105	2014-03-06 11:00:15 +00:00
Richard Sandiford	c231269ff9	[SystemZ] Update namespace formatting to match current guidelines No functional change intended. llvm-svn: 203103	2014-03-06 10:38:30 +00:00
Elena Demikhovsky	f7c1b16591	AVX-512: Added rrk, rrkz, rmk, rmkz, rmbk, rmbkz versions of AVX512 FP packed instructions, added encoding tests for them. By Robert Khazanov. llvm-svn: 203098	2014-03-06 08:45:30 +00:00
Elena Demikhovsky	8fae565f08	AVX-512: fixed compressed displacement - by Robert Khazanov llvm-svn: 203096	2014-03-06 08:15:35 +00:00
Yaron Keren	cf96c257f4	Cleaning up two more pre-Visual C++ 2012 build hacks. llvm-svn: 203093	2014-03-06 08:05:43 +00:00
Chandler Carruth	7da14f1ab9	[Layering] Move InstVisitor.h into the IR library as it is pretty obviously coupled to the IR. llvm-svn: 203064	2014-03-06 03:23:41 +00:00
Hal Finkel	6daf2aa140	The PPC global base register cannot be r0 The global base register cannot be r0 because it might end up as the first argument to addi or addis. Fixes PR18316. I don't have a small stable test case. llvm-svn: 203054	2014-03-06 01:28:23 +00:00
Chandler Carruth	9a4c9e597b	[Layering] Move DebugInfo.h into the IR library where its implementation already lives. llvm-svn: 203046	2014-03-06 00:46:21 +00:00
Hal Finkel	7f908e8ef4	Fixup PPC Darwin i1 argument handling Like on other targets, we need to zero_extend/truncate i1 args before copying them to GPRs. llvm-svn: 203045	2014-03-06 00:45:19 +00:00
Hal Finkel	2a9d318e4a	When using CR bit registers on PPC32, handle the i1 vaarg case When copying an i1 value into a GPR for a vaarg call, we need to explicitly zero-extend the i1 value (otherwise an invalid CRBIT -> GPR copy will be generated). llvm-svn: 203041	2014-03-06 00:23:33 +00:00
Hal Finkel	6a56b21729	With PPC CR bit registers, handle int_to_fp on older cores On cores without fpcvt support, we cannot promote int_to_fp i1 operations, because there is nothing to promote them to. The most straightforward implementation of this uses a select to choose between the two possible resulting floating-point values (and that's what is done here). llvm-svn: 203015	2014-03-05 22:14:00 +00:00
Matt Arsenault	ca6dcfcf59	Fix typo llvm-svn: 203013	2014-03-05 21:47:22 +00:00
Cameron McInally	791ae9927c	Lower AVX v4i64->v4i32 truncate to one shuffle. llvm-svn: 202996	2014-03-05 19:41:16 +00:00
David Blaikie	7f4a52eaee	Fix clang -Werror build break due to mismatched sign comparison. Originally committed in r202985. llvm-svn: 202992	2014-03-05 18:53:36 +00:00
Oliver Stannard	d55e115b58	ARM: Correctly align arguments after a byval struct is passed on the stack llvm-svn: 202985	2014-03-05 15:25:27 +00:00
Joerg Sonnenberger	cce644a633	Enable integrated assembler on OpenBSD/PPC32 by default, too. From Brad Smith. llvm-svn: 202967	2014-03-05 11:37:04 +00:00
Vladimir Medic	27c398e38c	This patch implements .set dsp directive and sets appropriate feature bits. This directive is a counterpart of -mattr=dsp command line option with the exception that it does not influence elf header flags. The usage example is given in the test file. llvm-svn: 202966	2014-03-05 11:05:09 +00:00
Evan Cheng	f1f45e754e	Remove a special character in comment that accidentally got committed. llvm-svn: 202905	2014-03-04 22:56:57 +00:00
Reid Kleckner	4e3bd518f2	MS asm: Attempt to parse variables followed by a bracketed displacement This is required to include MSVC's <atomic> header, which we do now in LLVM. Tests forthcoming in Clang, since that's where we test semantic inline asm changes. llvm-svn: 202865	2014-03-04 17:57:01 +00:00
Saleem Abdulrasool	763666ef2f	X86: 80-column llvm-svn: 202863	2014-03-04 17:11:46 +00:00
Will Schmidt	3503018e19	[PowerPC] support powerpc64le as syntax-checking target (pass2) Register the Asm Printer for the ppc64le target. This fills in a spot that was missed in an earlier change (r187179). llvm-svn: 202861	2014-03-04 16:51:52 +00:00
Richard Osborne	1b5fc39710	[XCore] Fix call of absolute address. Previously for: tail call void inttoptr (i64 65536 to void ()*)() nounwind We would emit: bl 65536 The immediate operand of the bl instruction is a relative offset so it is wrong to use the absolute address here. llvm-svn: 202860	2014-03-04 16:50:30 +00:00
Daniel Sanders	d920770add	[mips][msa] Correct the behaviour of the COPY_FW pseudo on lanes 2 and 3. Summary: Previously, attempting to extract lanes 2 and 3 would actually extract lane 1. The MSA CodeGen tests only covered lanes 0 and 1. Differential Revision: http://llvm-reviews.chandlerc.com/D2935 llvm-svn: 202848	2014-03-04 13:54:30 +00:00
Chandler Carruth	64396b069a	[Modules] Move the NoFolder into the IR library as it creates instructions. llvm-svn: 202834	2014-03-04 12:05:47 +00:00
Chandler Carruth	1305dc3351	[Modules] Move CFG.h to the IR library as it defines graph traits over IR types. llvm-svn: 202827	2014-03-04 11:45:46 +00:00
Chandler Carruth	a4ea269f15	[Modules] Move ValueMap to the IR library. While this class does not directly care about the Value class (it is templated so that the key can be any arbitrary Value subclass), it is in fact concretely tied to the Value class through the ValueHandle's CallbackVH interface which relies on the key type being some Value subclass to establish the value handle chain. Ironically, the unittest is already in the right library. llvm-svn: 202824	2014-03-04 11:26:31 +00:00
Chandler Carruth	4220e9c154	[Modules] Move ValueHandle into the IR library where Value itself lives. Move the test for this class into the IR unittests as well. This uncovers that ValueMap too is in the IR library. Ironically, the unittest for ValueMap is useless in the Support library (honestly, so was the ValueHandle test) and so it already lives in the IR unittests. Mmmm, tasty layering. llvm-svn: 202821	2014-03-04 11:17:44 +00:00
Chandler Carruth	219b89b987	[Modules] Move CallSite into the IR library where it belongs. It is abstracting between a CallInst and an InvokeInst, both of which are IR concepts. llvm-svn: 202816	2014-03-04 11:01:28 +00:00
Chandler Carruth	03eb0de93d	[Modules] Move GetElementPtrTypeIterator into the IR library. As its name might indicate, it is an iterator over the types in an instruction in the IR.... You see where this is going. Another step of modularizing the support library. llvm-svn: 202815	2014-03-04 10:40:04 +00:00
Chandler Carruth	8394857f43	[Modules] Move InstIterator out of the Support library, where it had no business. This header includes Function and BasicBlock and directly uses the interfaces of both classes. It has to do with the IR, it even has that in the name. =] Put it in the library it belongs to. This is one step toward making LLVM's Support library survive a C++ modules bootstrap. llvm-svn: 202814	2014-03-04 10:30:26 +00:00
Chandler Carruth	442f784814	[cleanup] Re-sort all the includes with utils/sort_includes.py. llvm-svn: 202811	2014-03-04 10:07:28 +00:00
Vladimir Medic	615b26e1cd	This patch implements .set mips32r2 directive and sets appropriate feature bits. It also introduces helper functions that are used to set and clear feature bits as necessary. This directive is a counterpart of the -mips32r2 command line option with the exception that it does not influence elf header flags. The usage example is given in the test file. llvm-svn: 202807	2014-03-04 09:54:09 +00:00
Yaron Keren	225d550b05	Cleaning up a bunch of pre-Visual C++ 2012 build hacks. llvm-svn: 202806	2014-03-04 09:23:33 +00:00
Kevin Qin	b08c6746c4	[AArch64]Fix improper diagnostics about offset range of load/store instructions. llvm-svn: 202775	2014-03-04 02:05:13 +00:00
Reid Kleckner	d84e70ea1b	MC: Fix Intel assembly parser for [global + offset] We were dropping the displacement on the floor if we also had some immediate offset. Should fix PR19033. llvm-svn: 202774	2014-03-04 00:33:17 +00:00
Chad Rosier	70cb2311ab	Revert "[AArch64] This is a work in progress to provide a machine description" This reverts commit ff717c8fc786a0cfa1602982b91895fa09e514fc. llvm-svn: 202773	2014-03-04 00:32:07 +00:00
Chad Rosier	fe45290566	[AArch64] This is a work in progress to provide a machine description for the Cortex-A53 subtarget in the AArch64 backend. This patch lays the groundwork to annotate each AArch64 instruction (no NEON yet) with a list of SchedReadWrite types. The patch also provides the Cortex-A53 processor resources, maps those to the default SchedReadWrites, and provides basic latency. NEON support will be added in a subsequent patch with proper forwarding logic. Verification was done by setting the pre-RA scheduler to linearize to better gauge the effect of the MIScheduler. Even without modeling the forward logic, the results show a modest improvement for Cortex-A53. Reviewers: apazos, mcrosier, atrick Patch by Dave Estes <cestes@codeaurora.org>! llvm-svn: 202767	2014-03-03 23:32:47 +00:00
Daniel Sanders	fa961d76f0	[mips] Prevent %lo relocation being used on MSA loads and stores. Summary: Parts of the compiler still believed MSA load/stores have a 16-bit offset when it is actually 10-bit. Corrected this, and fixed a closely related issue this uncovered where load/stores with 10-bit and 12-bit offsets (MSA and microMIPS respectively) could not load/store using offsets from the stack/frame pointer. They accepted frameindex+offset, but not frameindex by itself. Reviewers: jacksprat, matheusalmeida Reviewed By: jacksprat Differential Revision: http://llvm-reviews.chandlerc.com/D2888 llvm-svn: 202717	2014-03-03 14:31:21 +00:00
Ed Maste	2a710d0a5b	[mips] support FK_Data_2 and FK_Data_8 to fix big-endian debug data This fixes invalid lengths in .debug_aranges on big-endian mips64 (lengths appear to be left-shifted by 32 bits) and in .debug_loc. Differential Revision: http://llvm-reviews.chandlerc.com/D2517 llvm-svn: 202716	2014-03-03 14:27:49 +00:00
Vladimir Medic	f9f8f4859a	Fixing a build failure reported by certain buildbots. This will disable jalx instruction for micromips target. llvm-svn: 202715	2014-03-03 14:05:14 +00:00
Vladimir Medic	43e978234a	This patch implements the jalx instruction for the Mips architecture. This instruction executes a procedure call within the current 256 MB-aligned region and changes the ISA Mode from MIPS32 to microMIPS32 or MIPS16e. Usage samples for the assembler and disassembler are provided as well. llvm-svn: 202706	2014-03-03 13:12:59 +00:00
Venkatraman Govindaraju	925ec9b11e	[Sparc] Add trap on integer condition codes (Ticc) instructions to Sparc backend. llvm-svn: 202670	2014-03-02 23:39:07 +00:00
Venkatraman Govindaraju	07d3af2821	[Sparc] Add return/rett instruction to Sparc backend. llvm-svn: 202666	2014-03-02 22:55:53 +00:00
Venkatraman Govindaraju	4fa2ab26f5	[Sparc] Add support for decoding jmpl/retl/ret instruction. llvm-svn: 202663	2014-03-02 21:17:44 +00:00
Venkatraman Govindaraju	c3084ad294	[Sparc] Add fcmpe* instructions to Sparc backend. llvm-svn: 202661	2014-03-02 19:56:19 +00:00
Venkatraman Govindaraju	f9a202a9ac	[Sparc] Add VIS instructions to Sparc backend. llvm-svn: 202660	2014-03-02 19:31:21 +00:00
Hal Finkel	6aca2373f2	Add a PPC inline asm constraint type for single CR bits Now that the PowerPC backend can track individual CR bits as first-class registers, we should also have a way of allocating them for inline asm statements. Because these registers are only one bit, if an output variable is implicitly cast to a larger integer size, we'll get an any_extend to that larger type (this is part of the existing target-independent logic). As a result, regardless of the size of the output type, only the first bit is meaningful. The constraint identifier "wc" has been chosen for this purpose. Although gcc does not currently support allocating individual CR bits, this identifier choice has been coordinated with the gcc PowerPC team, and will be marked as reserved for this purpose in the gcc constraints.md file. llvm-svn: 202657	2014-03-02 18:23:39 +00:00
Benjamin Kramer	d6f1f84f51	[C++11] Replace llvm::tie with std::tie. The old implementation is no longer needed in C++11. llvm-svn: 202644	2014-03-02 13:30:33 +00:00
Benjamin Kramer	b6d0bd48bd	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Venkatraman Govindaraju	b745e67a64	[SparcV9] Adds support for branch on integer register instructions (BPr) and conditional moves on integer register (MOVr/FMOVr). llvm-svn: 202628	2014-03-02 09:46:56 +00:00
Elena Demikhovsky	9737e3886b	AVX-512: Fixed extract_vector_elt for v8i1 vector llvm-svn: 202624	2014-03-02 09:19:44 +00:00
Craig Topper	73156025e0	Switch all uses of LLVM_OVERRIDE to just use 'override' directly. llvm-svn: 202621	2014-03-02 09:09:27 +00:00
Craig Topper	77dfe45f81	Switch all uses of LLVM_FINAL to just use 'final', and remove the macro. llvm-svn: 202618	2014-03-02 08:08:51 +00:00
Venkatraman Govindaraju	600f390bb9	[Sparc] Add support for parsing branches and conditional move instructions with %fcc1-%fcc3 conditional registers. llvm-svn: 202616	2014-03-02 06:28:15 +00:00
Venkatraman Govindaraju	293a81c406	[Sparc] Make floating point branch instruction formats accept %fcc0-%fcc1 conditional registers as input. No functionality change. llvm-svn: 202614	2014-03-02 04:43:45 +00:00
Venkatraman Govindaraju	81aae57282	[Sparc] Add support for parsing fcmp with %fcc registers. llvm-svn: 202610	2014-03-02 03:39:39 +00:00
Venkatraman Govindaraju	bac285f588	[Sparc] Add register class for floating point conditional flags (%fcc0 - %fcc3). llvm-svn: 202604	2014-03-02 02:12:33 +00:00

28026 Commits