llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	4a1a9ce5e6	ARM: cortex-m0 doesn't support unaligned memory access. Unlike other v6+ processors, cortex-m0 never supports unaligned accesses. From the v6m ARM ARM: "A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned access occurs." rdar://16491560 llvm-svn: 205452	2014-04-02 19:28:13 +00:00
Jim Grosbach	df1e05bb8a	Make some range based loop types more explicit. No functional change, but more readable code. llvm-svn: 205451	2014-04-02 19:28:08 +00:00
Kai Nacke	13673ac704	[mips] Add more Octeon cnMips instructions Adds the instructions ext/ext32/cins/cins32. It also changes pop/dpop to accept the two operand version and adds a simple pattern to generate baddu. Tests for the two operand versions (including baddu/dmul/dpop/pop) and the code generation pattern for baddu are included. Reviewed by: Daniel.Sanders@imgtec.com llvm-svn: 205449	2014-04-02 18:40:43 +00:00
Jim Grosbach	20b0790df7	[C++11,ARM64] Range based for and explicit 'override' in STP cleanup. No functional change intended. llvm-svn: 205446	2014-04-02 18:00:59 +00:00
Jim Grosbach	05abd709f3	[C++11,ARM64] Range based for loops in constant promotion. No functional change intended. llvm-svn: 205445	2014-04-02 18:00:56 +00:00
Jim Grosbach	7dc9edeaa5	[C++11,ARM64] Range based for loops in load/store pair optimizer. No functional change intended. llvm-svn: 205444	2014-04-02 18:00:53 +00:00
Jim Grosbach	020e657790	[C++11,ARM64] Range based for loops in target lowering. No functional change intended. llvm-svn: 205443	2014-04-02 18:00:51 +00:00
Jim Grosbach	91f1f47751	[C++11,ARM64] Range based for loops in frame lowering. No functional change intended. llvm-svn: 205442	2014-04-02 18:00:49 +00:00
Jim Grosbach	f39d752b03	[C++11,ARM64] Range based for loops in pseudo expansion. No functional change intended. llvm-svn: 205441	2014-04-02 18:00:46 +00:00
Jim Grosbach	673825ebac	[C++11,ARM64] Range based for loops for LOH No functional change intended. llvm-svn: 205440	2014-04-02 18:00:44 +00:00
Jim Grosbach	2539c3d07a	[C++11,ARM64] Range based for loops TLS cleanup. No functional change intended. llvm-svn: 205439	2014-04-02 18:00:41 +00:00
Jim Grosbach	0d0c5a614a	[C++11,ARM64] Range based for loops in branch relaxation. No functional change intended. llvm-svn: 205438	2014-04-02 18:00:39 +00:00
Jim Grosbach	1c762ca9bd	[C++11,ARM64] Range based for loops in address type promotion. No functional change intended. llvm-svn: 205437	2014-04-02 18:00:36 +00:00
Quentin Colombet	7bf9d8cd13	[ARM64][CollectLOH] Remove the link to the radar from the comments. llvm-svn: 205435	2014-04-02 16:40:49 +00:00
Oliver Stannard	b14c625111	ARM: Add support for segmented stacks Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430	2014-04-02 16:10:33 +00:00
Adrian Prantl	a731cf0018	clarify comment llvm-svn: 205429	2014-04-02 15:49:45 +00:00
Tim Northover	6d69168ffd	ARM64: use GOT for weak symbols & PIC. Weak symbols cannot use the small code model's usual ADRP sequences since the instruction simply may not be able to encode a value of 0. This redirects them to use the GOT, which hopefully linkers are able to cope with even in the static relocation model. llvm-svn: 205426	2014-04-02 14:39:11 +00:00
Tim Northover	0d80f70530	ARM64: fix lowering of fp128 fptosi/fptoui We were creating libcall nodes that returned an MVT::f128, when these particular operations actually return an int of some stripe. llvm-svn: 205425	2014-04-02 14:39:07 +00:00
Tim Northover	670df3d937	SLPVectorizer: compare entire intrinsic for SLP compatibility. Some Intrinsics are overloaded to the extent that return type equality (all that's been checked up to now) does not guarantee that the arguments are the same. In these cases SLP vectorizer should not recurse into the operands, which can be achieved by comparing them as "Function *" rather than simply the ID. llvm-svn: 205424	2014-04-02 14:39:02 +00:00
Tim Northover	ebd37ab382	ARM64: make sure first argument to INSERT_SUBVECTOR has right type. Again, coalescing and other optimisations swiftly made the MachineInstrs consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was produced. llvm-svn: 205423	2014-04-02 14:38:58 +00:00
Tim Northover	5e3a484e3b	ARM64: convert fp16 narrowing ISel to pseudo-instruction The previous attempt was fine with optimisations, but was actually rather cavalier with its types. When compiled at -O0, it produced invalid COPY MachineInstrs. llvm-svn: 205422	2014-04-02 14:38:54 +00:00
Job Noorman	f7da105f39	Mark FPB as a reserved register when needed. llvm-svn: 205421	2014-04-02 13:13:56 +00:00
Rafael Espindola	b1b49789d0	Work around gold bug http://sourceware.org/PR16794 . llvm-svn: 205416	2014-04-02 12:15:20 +00:00
Renato Golin	d93295ea56	Remove duplicated DMB instructions ARM specific optimization, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. llvm-svn: 205409	2014-04-02 09:03:43 +00:00
Yaron Keren	2895496852	Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU() and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the four Windows environments in Triple.h. Suggestion by Saleem Abdulrasool! llvm-svn: 205393	2014-04-02 04:27:51 +00:00
Hal Finkel	b0ebdc0f43	[LoopVectorizer] Count dependencies of consecutive pointers as uniforms For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values). For example, the TSVC benchmark function s173 has: ... %3 = add nsw i64 %indvars.iv, 16000 %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3 ... and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently collects uniforms. Fixes PR19296. llvm-svn: 205387	2014-04-02 02:34:49 +00:00
David Blaikie	326e1fa13b	Adjust comments regarding non-relocated abbrev offset in debug_info.dwo I'm not sure the comment in the implementation really adds a lot of value (it's clear that we emit zero when no symbol is provided, but it doesn't explain why we would do that). Happy to iterate. llvm-svn: 205386	2014-04-02 02:04:51 +00:00
David Blaikie	94c1d7f174	Split debug_loc and debug_loc.dwo emission into two separate functions Based on code review feedback from Eric Christopher on r204697 llvm-svn: 205385	2014-04-02 01:50:20 +00:00
David Blaikie	0a456de5a2	DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries and an MC Label to refer to them This removes the magic-number-esque code creating/retrieving the same label for a debug_loc entry from two places and removes the last small piece of reusable logic from emitDebugLoc so that there will be less duplication when refactoring it into two functions (one for debug_loc, the other for debug_loc.dwo). llvm-svn: 205382	2014-04-02 01:43:18 +00:00
Quentin Colombet	3c2b13b258	[ARM64][CollectLOH] Add some comments to explain how the LOHs framework works (for the compiler part), since the design document is not available. llvm-svn: 205379	2014-04-02 01:02:28 +00:00
Adrian Prantl	3c5453cb6e	Add a doxygen comment to DebugLocEntry::Merge. llvm-svn: 205374	2014-04-01 23:34:45 +00:00
David Blaikie	6fa9966ee6	DebugLocEntry: Actually merge the loc entry when returning true. Seems we didn't have any test coverage for merging... awesome. So I added some - but hit an llvm-objdump bug while I was there. I'm choosing not to shave that yak right now. Code review feedback/bug catch by Adrian Prantl in r205360. llvm-svn: 205373	2014-04-01 23:19:23 +00:00
David Blaikie	91567b6700	Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocation No test case (this would invoke UB by examining uninitialized members, etc, at best - and this code is apparently untested anyway - I'm about to fix that) Code review feedback from Adrian Prantl on r205360. llvm-svn: 205367	2014-04-01 22:25:09 +00:00
David Blaikie	c2af77b027	Remove unused function DebugLocEntry::isEmpty llvm-svn: 205365	2014-04-01 22:06:18 +00:00
David Blaikie	d306baf572	Refactor out the comparison of the location/value in a DebugLocEntry llvm-svn: 205364	2014-04-01 22:04:07 +00:00
David Blaikie	1275e4f026	DebugInfo: Split DebugLocEntry into its own file. It seems big enough that it deserves its own file - but it is header only, so there's no need for another cpp file, etc. llvm-svn: 205360	2014-04-01 21:49:04 +00:00
Adrian Prantl	6b444c5c8e	Add a comment about the DIDescriptor class hierarchy. llvm-svn: 205358	2014-04-01 21:04:24 +00:00
Adrian Prantl	75ce62acef	DwarfDebug: Prevent DebugLocEntry merging from coalescing two different constants into only the first one. rdar://14874886. llvm-svn: 205357	2014-04-01 21:04:18 +00:00
Hal Finkel	9e0baa6d3a	[PowerPC] Add some missing VSX bitcast patterns llvm-svn: 205352	2014-04-01 19:24:27 +00:00
Yaron Keren	48d68d439a	If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and Environment == Triple::MSVC so it will never be MinGW or Cygwin. llvm-svn: 205349	2014-04-01 18:52:55 +00:00
Hal Finkel	2eed29f3c8	Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup llvm-svn: 205348	2014-04-01 18:50:34 +00:00
Hal Finkel	6386cb8d4d	Add some additional fields to TTI::UnrollingPreferences In preparation for an upcoming commit implementing unrolling preferences for x86, this adds additional fields to the UnrollingPreferences structure: - PartialThreshold and PartialOptSizeThreshold - Like Threshold and OptSizeThreshold, but used when not fully unrolling. These are necessary because we need different thresholds for full unrolling from those used when partially unrolling (the full unrolling thresholds are generally going to be larger). - MaxCount - A cap on the unrolling factor when partially unrolling. This can be used by a target to prevent the unrolled loop from exceeding some resource limit independent of the loop size (such as number of branches). There should be no functionality change for any in-tree targets. llvm-svn: 205347	2014-04-01 18:50:30 +00:00
Hal Finkel	b4e001cc81	Use TopTTI->getGEPCost from within getUserCost The implementation of getUserCost had duplicated (and hard-coded) the default logic in getGEPCost. Instead, it is better to use getGEPCost directly, which limits the default logic to the implementation of one function, and allows targets to override the behavior. No functionality change intended. llvm-svn: 205346	2014-04-01 18:50:06 +00:00
Kai Nacke	af47f60f83	[mips] Add Octeon cnMips instructions mtmX and mtpX Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px". Includes tests. Reviews by: Daniel.Sanders@imgtec.com llvm-svn: 205343	2014-04-01 18:35:26 +00:00
Reid Kleckner	101102711d	Support segmented stacks on Win64 Identical to Win32 method except the GS segment register is used for TLS instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14. llvm-svn: 205342	2014-04-01 18:34:21 +00:00
Yaron Keren	136fe7db46	isTargetWindows() renamed to isTargetKnownWindowsMSVC() to reflect its current functionality. Based on Takumi NAKAMURA suggestion. llvm-svn: 205338	2014-04-01 18:15:34 +00:00
Matt Arsenault	e407ae9846	Make isSetCCEquivalent respect the TargetBooleanContents llvm-svn: 205336	2014-04-01 18:13:26 +00:00
Matt Arsenault	6310c3f667	Add helpers for checking if a value is a target boolean constant. llvm-svn: 205335	2014-04-01 18:13:22 +00:00
David Blaikie	0e84adc621	DebugInfo: Factor out common functionality for rendering debug_loc and debug_loc.dwo location list entries In preparation for refactoring this function into two, one for debug_loc, one for debug_loc.dwo. llvm-svn: 205324	2014-04-01 16:17:41 +00:00
David Blaikie	7f1f8742ea	Cleanup remaining use of removed variable to fix the build llvm-svn: 205323	2014-04-01 16:13:29 +00:00
David Blaikie	e12ab1276d	Simplify debug_loc.dwo handling slightly. llvm-svn: 205322	2014-04-01 16:09:49 +00:00
Christian Pirker	dc9ff75554	ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE llvm-svn: 205317	2014-04-01 15:19:30 +00:00
Tim Northover	0feb91ef15	ARM: teach LLVM that Cortex-A7 is very similar to A8. llvm-svn: 205314	2014-04-01 14:10:07 +00:00
Aaron Ballman	8bf5a548ea	Attempting to fix r205124, which had failed asserts when built with MSVC. Suggestion from Yaron Keren. llvm-svn: 205313	2014-04-01 13:56:35 +00:00
Tim Northover	1351030801	ARM: add cyclone CPU with ZeroCycleZeroing feature. The Cyclone CPU is similar to swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them. llvm-svn: 205309	2014-04-01 13:22:02 +00:00
Daniel Sanders	21bce30fdc	[mips] Renamed ParseAnyRegisterWithoutDollar to MatchAnyRegisterWithoutDollar This is for consistency with other functions. The Parse* functions consume tokens and the Match* functions don't. No functional change. llvm-svn: 205305	2014-04-01 12:35:23 +00:00
Aaron Ballman	0947bb20d8	Fixing an MSVC warning about widening the result of a 32-bit shift implicitly. No functional change intended. llvm-svn: 205304	2014-04-01 12:24:25 +00:00
Tim Northover	4f1dd58e2e	ARM64: add intrinsic for pmull (p64 x p64 = p128) operations. llvm-svn: 205302	2014-04-01 12:22:37 +00:00
Aaron Ballman	d1726ee8fa	Fixing warnings in the MSVC build. No functional changes intended. llvm-svn: 205301	2014-04-01 12:22:20 +00:00
Daniel Sanders	ffd8436d6c	[mips] Extend ParseJumpTarget to support the full symbol expression syntax. Summary: This should fix the issues the D3222 caused in lld. Testcase is based on the one that failed in the buildbot. Depends on D3233 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3234 llvm-svn: 205298	2014-04-01 10:41:48 +00:00
Daniel Sanders	315386c083	[mips] Use AsmLexer::peekTok() to resolve the conflict between $reg and $sym Summary: Parsing registers no longer consumes the $ token before it's confirmed whether it really has a register or not, therefore it's no longer impossible to match symbols if registers were tried first. Depends on D3232 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3233 llvm-svn: 205297	2014-04-01 10:40:14 +00:00
Daniel Sanders	0993457891	[mips] Hoist Parser.Lex() calls out of MatchAnyRegisterNameWithoutDollar() Summary: No functional change Depends on D3222 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3232 llvm-svn: 205295	2014-04-01 10:37:46 +00:00
Tim Northover	ff179ba3d3	ARM64: add patterns for more lane-wise ld1/st1 operations. llvm-svn: 205294	2014-04-01 10:37:09 +00:00
Tim Northover	d8d613b979	ARM64: fix bug in ld3r (1d) SelectionDAG. llvm-svn: 205293	2014-04-01 10:37:03 +00:00
Daniel Sanders	b50ccf8e26	[mips] Rewrite MipsAsmParser and MipsOperand. Summary: Highlights: - Registers are resolved much later (by the render method). Prior to that point, GPR32's/GPR64's are GPR's regardless of register size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register size or FR mode. Numeric registers can be anything. - All registers are parsed the same way everywhere (even when handling symbol aliasing) - One consequence is that all registers can be specified numerically almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing but that can be easily resolved. - Removes the need for the hasConsumedDollar hack - Parenthesis and Bracket suffixes are handled generically - Micromips instructions are parsed directly instead of going through the standard encodings first. - rdhwr accepts all 32 registers, and the following instructions that previously xfailed now work: ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d, c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1 - Diagnostics involving registers point at the correct character (the $) - There's only one kind of immediate in MipsOperand. LSA immediates are handled by the predicate and renderer. Lowlights: - Hardcoded '$zero' in the div patterns is handled with a hack. MipsOperand::isReg() will return true for a k_RegisterIndex token with Index == 0 and getReg() will return ZERO for this case. Note that it doesn't return ZERO_64 on isGP64() targets. - I haven't cleaned up all of the now-unused functions. Some more of the generic parser could be removed too (integers and relocs for example). - insve.df needed a custom decoder to handle the implicit fourth operand that was needed to make it parse correctly. The difficulty was that the matcher expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3222 llvm-svn: 205292	2014-04-01 10:35:28 +00:00
Alexey Volkov	1328b28dc6	[x86] Do not convert to cmp32 for Atom arch by Sergey Okunev Differential Revision: http://llvm-reviews.chandlerc.com/D2824 llvm-svn: 205288	2014-04-01 08:13:07 +00:00
David Blaikie	3464161070	DebugInfo: Avoid creating unnecessary/empty line tables and remove the special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. llvm-svn: 205287	2014-04-01 08:07:52 +00:00
David Blaikie	8bf66c4f3f	DebugInfo: Emit relocation to debug_line section when emitting asm for asm I don't think this is reachable by any frontend (why would you transform asm to asm+debug info?) but it helps tidy up some of this code, avoid the weird special case of "emit the first CU, store the label, then emit the rest" in MCDwarfLineTable::Emit by instead having the DWARF-for-assembly case use the same codepath as DwarfDebug.cpp, by registering the label of the debug_line section, thus causing it to be emitted. (with a special case in asm output to just emit the label since asm output uses the .loc directives, etc, rather than the debug_loc directly) llvm-svn: 205286	2014-04-01 07:35:52 +00:00
Adrian Prantl	d09ba23faf	LTO type uniquing: store the Decl field of a DIImportedEntity as a DIRef. No other functionality changes, DIBuilder testcase is included in a paired CFE commit. This relaxes the assertion in isScopeRef to also accept subclasses of DIScope. llvm-svn: 205279	2014-04-01 03:41:04 +00:00
Hal Finkel	86b3064f2b	Move partial/runtime unrolling late in the pipeline The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. llvm-svn: 205264	2014-03-31 23:23:51 +00:00
Arnold Schwaighofer	15262e6703	Revert "SLPVectorizer: Ignore users that are insertelements we can reschedule them" This reverts commit r205018. Conflicts: lib/Transforms/Vectorize/SLPVectorizer.cpp test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll This is breaking libclc build. llvm-svn: 205260	2014-03-31 23:05:56 +00:00
Juergen Ributzka	e117992f00	[Stackmaps] Update the stackmap format to use 64-bit relocations for the function address and properly align all entries. This commit updates the stackmap format to version 1 to indicate the reorganization of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding. Fixes <rdar://problem/16005902> llvm-svn: 205254	2014-03-31 22:14:04 +00:00
Adam Nemet	10c4ce2584	[X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well Pretty obvious follow-on to r205159 to also handle conversion from double besides float. Fixes <rdar://problem/16373208> llvm-svn: 205253	2014-03-31 21:54:48 +00:00
Matt Arsenault	d6c4326786	R600/SI: Remove leftover pattern splitting 64-bit ors. It's now matched to the scalar 64-bit or and split later if necessary. llvm-svn: 205252	2014-03-31 21:46:46 +00:00
Manman Ren	63efd8e7e6	Register allocator: set CSRFirstUseCost to 5 for ARM64. A value of 5 means if we have a split or spill option that has a really low cost (1 << 14 is the entry frequency), we will choose to spill or split the really cold path before using a callee-saved register. This gives us the performance benefit on SPECInt2k and is also conservative. rdar://16162005 llvm-svn: 205248	2014-03-31 21:06:36 +00:00
Matt Arsenault	f751d6272d	Change shouldSplitVectorElementType to better match the description. Pass the entire vector type, and not just the element. llvm-svn: 205247	2014-03-31 20:54:58 +00:00
Matt Arsenault	d7bdcc46a6	R600/SI: Implement shouldConvertConstantLoadToIntImm llvm-svn: 205244	2014-03-31 19:54:27 +00:00
Hal Finkel	b811b6d0d1	Add an optional ability to expand larger BUILD_VECTORs with shuffles This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243	2014-03-31 19:42:55 +00:00
Matt Arsenault	378bf9c68b	R600: Compute masked bits for min and max llvm-svn: 205242	2014-03-31 19:35:33 +00:00
Rafael Espindola	ee1c342ef9	Don't relocate with sections if there might be a paired relocation. llvm-svn: 205240	2014-03-31 19:00:23 +00:00
Daniel Sanders	e34a120285	Revert: '[mips] Rewrite MipsAsmParser and MipsOperand.' due to buildbot errors in lld tests. It's currently unable to parse 'sym + imm' without surrounding parentheses. llvm-svn: 205237	2014-03-31 18:51:43 +00:00
Matt Arsenault	4c53717787	R600: Add BFE, BFI, and BFM intrinsics to help with writing tests. llvm-svn: 205236	2014-03-31 18:21:18 +00:00
Matt Arsenault	b34583661b	R600: Add target nodes for BFM and BFI llvm-svn: 205235	2014-03-31 18:21:13 +00:00
Saleem Abdulrasool	2070088bef	ARM: fix typo llvm-svn: 205233	2014-03-31 18:09:10 +00:00
Hal Finkel	b4240ca0f4	[PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles If we have two unique values for a v2i64 build vector, this will always result in two vector loads if we expand using shuffles. Only one is necessary. llvm-svn: 205231	2014-03-31 17:48:16 +00:00
Hal Finkel	1977514287	Add a TLI hook to control when BUILD_VECTOR might be expanded using shuffles There are two general methods for expanding a BUILD_VECTOR node: 1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together. 2. Build the vector on the stack and then load it. Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one). Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimal, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used. This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly. llvm-svn: 205230	2014-03-31 17:48:10 +00:00
Daniel Sanders	0c648ba5be	[mips] Rewrite MipsAsmParser and MipsOperand. Summary: Highlights: - Registers are resolved much later (by the render method). Prior to that point, GPR32's/GPR64's are GPR's regardless of register size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register size or FR mode. Numeric registers can be anything. - All registers are parsed the same way everywhere (even when handling symbol aliasing) - One consequence is that all registers can be specified numerically almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing but that can be easily resolved. - Removes the need for the hasConsumedDollar hack - Parenthesis and Bracket suffixes are handled generically - Micromips instructions are parsed directly instead of going through the standard encodings first. - rdhwr accepts all 32 registers, and the following instructions that previously xfailed now work: ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d, c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1 - Diagnostics involving registers point at the correct character (the $) - There's only one kind of immediate in MipsOperand. LSA immediates are handled by the predicate and renderer. Lowlights: - Hardcoded '$zero' in the div patterns is handled with a hack. MipsOperand::isReg() will return true for a k_RegisterIndex token with Index == 0 and getReg() will return ZERO for this case. Note that it doesn't return ZERO_64 on isGP64() targets. - I haven't cleaned up all of the now-unused functions. Some more of the generic parser could be removed too (integers and relocs for example). - insve.df needed a custom decoder to handle the implicit fourth operand that was needed to make it parse correctly. The difficulty was that the matcher expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3222 llvm-svn: 205229	2014-03-31 17:43:46 +00:00
Paul Robinson	7c99ec5b99	Disable each MachineFunctionPass for 'optnone' functions, unless that pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228	2014-03-31 17:43:35 +00:00
Hal Finkel	4c8f634f23	[PowerPC] Correct P7 dispatch unit allocation for vector instructions llvm-svn: 205222	2014-03-31 17:02:10 +00:00
Tom Roeder	ed0e88c31a	This patch fixes LTO's RecordStreamer so that it records symbols in the MCExpr part of an asm .symver directive as being used. This prevents referenced functions from being internalized and deleted. Without the patch to LTOModule.cpp, the test case will produce the error: LLVM ERROR: A @@ version cannot be undefined. llvm-svn: 205221	2014-03-31 16:59:13 +00:00
Saleem Abdulrasool	28b82bc39e	Support: generalise object type handling for Windows This generalises the object file type parsing to all Windows environments. This is used by cygwin as well as MSVC environments for MCJIT. This also makes the triple more similar to Chandler's suggestion of a separate field for the object file format. llvm-svn: 205219	2014-03-31 16:34:41 +00:00
Eli Bendersky	6a0ccfb585	PR19099 - revert r203483 Now that r205212 was committed, r203483 is no longer necessary; it was a temporary workaround that only handled a small number of the problematic cases. llvm-svn: 205216	2014-03-31 16:11:57 +00:00
Christian Pirker	6d301b8ff2	ARM: change parameter names of the ELFARMAsmBackend constructor I removed the underscore at the beginning of the parameter name, because of a comment from Tim. llvm-svn: 205215	2014-03-31 16:06:39 +00:00
Robert Khasanov	ed0b2e9733	Test commit. llvm-svn: 205214	2014-03-31 16:01:38 +00:00
Daniel Sanders	d69adeb8e7	[mips] Fix use of uninitialized value reported by the sanitizer-x86_64-linux-bootstrap buildbot llvm-svn: 205213	2014-03-31 15:58:58 +00:00
Eli Bendersky	264cd4672d	Fix for PR19099 - NVPTX produces invalid symbol names. This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. llvm-svn: 205212	2014-03-31 15:56:26 +00:00
Tim Northover	5081cd0f81	ARM64: add extra patterns for scalar shifts llvm-svn: 205209	2014-03-31 15:46:46 +00:00
Tim Northover	e7834c3bbc	ARM64: add extra scalar neg pattern & tests. llvm-svn: 205208	2014-03-31 15:46:42 +00:00
Tim Northover	4468670345	ARM64: add patterns for scalar sqdmlal & sqdmlsl. llvm-svn: 205207	2014-03-31 15:46:38 +00:00
Tim Northover	5731fc75af	ARM64: add more patterns for commuted fmsub operations. llvm-svn: 205206	2014-03-31 15:46:34 +00:00
Tim Northover	290e0698d4	ARM64: shuffle patterns around for fmin/fmax & add tests. llvm-svn: 205205	2014-03-31 15:46:30 +00:00
Tim Northover	903814ccd6	ARM64: add more scalar patterns for usqadd & suqadd. llvm-svn: 205204	2014-03-31 15:46:26 +00:00
Tim Northover	4c9d2c7e3f	ARM64: add more scalar patterns for reciprocal ops. llvm-svn: 205203	2014-03-31 15:46:22 +00:00
Tim Northover	f48103618e	ARM64: add i64 scalar pattern for @llvm.arm64.abs This will be used by the Clang front-end code for vabsd_s64. llvm-svn: 205202	2014-03-31 15:46:17 +00:00
Daniel Sanders	a567da5a36	[mips] Implement missing relocations in the integrated assembler. %got_hi, %got_lo, %call_hi, %call_lo, %higher, and %highest are now recognised by MipsAsmParser::getVariantKind(). To prevent future issues with missing entries in this StringSwitch, I've added an assertion to the default case. llvm-svn: 205200	2014-03-31 15:15:02 +00:00
Daniel Sanders	cefddb2ca6	Revert r205194 - [mips] Removed R_MIPS_GOT. It's identical to R_MIPS_GOT16. There's a couple additional bits I missed. llvm-svn: 205195	2014-03-31 14:34:36 +00:00
Daniel Sanders	a104300dbe	[mips] Removed R_MIPS_GOT. It's identical to R_MIPS_GOT16. llvm-svn: 205194	2014-03-31 14:30:05 +00:00
Rafael Espindola	2378d4c0ce	Capitalize the D in parseDirectiveGpDWord. DWord seems to be the canonical way to camel case dword in llvm. Thanks to Daniel Sanders for noticing. llvm-svn: 205191	2014-03-31 14:15:07 +00:00
Tom Stellard	30f59417cf	R600/SI: Implement SIInstrInfo::isTriviallyRematerializable() llvm-svn: 205188	2014-03-31 14:01:56 +00:00
Tom Stellard	7ea3d6d420	R600/SI: Lower i64 SELECT by bitcasting to a vector type This allows us to replace ISD::EXTRACT_ELEMENT, which is lowered using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op. llvm-svn: 205187	2014-03-31 14:01:55 +00:00
Tom Stellard	7277b008ee	R600/SI: Return the correct index for VGPRs in getHWRegIndex() The register index is stored in the low 8-bits of the encoding. llvm-svn: 205186	2014-03-31 14:01:52 +00:00
Zoran Jovanovic	9b05a31f76	Fixed issue with microMIPS JAL instruction. Differential Revision: http://llvm-reviews.chandlerc.com/D3200 llvm-svn: 205185	2014-03-31 14:00:10 +00:00
Hal Finkel	02807595fb	Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this: %b1 = insertelement <2 x i32> undef, i32 %v, i32 0 %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer %i = add <2 x i32> %b2, <i32 2, i32 3> If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector. By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later. llvm-svn: 205179	2014-03-31 11:43:19 +00:00
Tim Northover	241856e5f8	ARM64: fix a couple of signed/unsigned comparison warnings. llvm-svn: 205174	2014-03-31 10:21:36 +00:00
Daniel Sanders	9cf3d3b764	[yaml2obj] Add support for ELF e_flags. Summary: The FileHeader mapping now accepts an optional Flags sequence that accepts the EF_<arch>_<flag> constants. When not given, Flags defaults to zero. Reviewers: atanasyan Reviewed By: atanasyan CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3213 llvm-svn: 205173	2014-03-31 09:44:05 +00:00
Alexey Samsonov	23aaf2a182	Try to fix MSan bootstrap bot: make ARM64Disassembler::getInstruction() always initialize Size argument. llvm-svn: 205171	2014-03-31 07:59:33 +00:00
Yaron Keren	070a752d7e	Correct OS conditionals following r204977 and r204978. Previously, MinGW OS was Triple::MinGW and Cygwin was Triple::Cygwin and now it is Triple::Win32 with Environment being GNU or Cygwin. So, TheTriple.getOS() == Triple::Win32 is replaced by TheTriple.isWindowsMSVCEnvironment() and (TheTriple.getOS() == Triple::MinGW32 || TheTriple.getOS() == Triple::Cygwin) is replaced by TheTriple.isOSCygMing() llvm-svn: 205170	2014-03-31 07:59:14 +00:00
Craig Topper	ec82847a64	[C++11] Mark more classes in the X86 target as 'final'. llvm-svn: 205166	2014-03-31 06:53:13 +00:00
Craig Topper	26eec09d84	Mark a couple of the X86 target classes as final. Allows the compiler to de-virtualize some internal calls. llvm-svn: 205165	2014-03-31 06:22:15 +00:00
NAKAMURA Takumi	82ec13e3d5	ARM64CollectLOH.cpp: Tweak \param. [-Wdocumentation] llvm-svn: 205162	2014-03-31 01:10:26 +00:00
Chandler Carruth	d28515af31	[ARM64] Fix materialization of an fp128 zero immediate. There currently is not a pattern to lower this with clever instructions that zero the register, so restrict the zero immediate legality special case to f64 and f32 (the only two sizes which fmov seems to directly support). Fixes backend errors when building code such as libxml. llvm-svn: 205161	2014-03-31 00:02:10 +00:00
Adam Nemet	6dafe97271	[X86] Adjust cost of FP_TO_UINT v8f32->v8i32 There is no direct AVX instruction to convert to unsigned. I have some ideas how we may be able to do this with three vector instructions but the current backend just bails on this to get it scalarized. See the comment why we need to adjust the cost returned by BasicTTI. The test is a bit roundabout (and checks assembly rather than bit code) because I'd like it to work even if at some point we could vectorize this conversion. Fixes <rdar://problem/16371920> llvm-svn: 205159	2014-03-30 18:07:13 +00:00
Stepan Dyatkovskiy	8baf17fc5f	PR18929: According to the ARM assembler language, the hash symbol is optional before immediates. For example, see here for more details: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473j/dom1359731154529.html llvm-svn: 205157	2014-03-30 17:09:54 +00:00
Hal Finkel	90adf0fe06	Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153	2014-03-30 15:10:18 +00:00
Hal Finkel	5c0d1454d6	[PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node (and similarly for v2i16 and v2i8). Even though there are no sign-extension (or algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by converting to and from v2i64. The small trick necessary here is to shift the i32 elements into the right lanes before the i32 -> f64 step. Because of the big-endian nature of the system, we need the i32 portion in the high word of the i64 elements. For v2i16 and v2i8 we can do the same, but we first use the default Altivec shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and then apply the above procedure. llvm-svn: 205146	2014-03-30 13:22:59 +00:00
Chandler Carruth	9df0fd4018	[Allocator] Lift the slab size and size threshold into template parameters rather than runtime parameters. There is only one user of these parameters and they are compile time for that user. Making these compile time seems to better reflect their intended usage as well. llvm-svn: 205143	2014-03-30 12:07:07 +00:00
Chandler Carruth	a48ecb7639	Don't mark the declarations of the TSan annotation functions as weak. That causes references to them to be weak references which can collapse to null if no definition is provided. We call these functions unconditionally, so a definition must be provided. Make the definitions provided in the .cpp file weak by re-declaring them as weak just prior to defining them. This should keep compilers which cannot attach the weak attribute to the definition happy while actually resolving the symbols correctly during the link. You might ask yourself upon reading this commit log: how did any of this work before? Well, fun story. It turns out we have some code in Support (BumpPtrAllocator) which both uses virtual dispatch and has out-of-line vtables used by that virtual dispatch. If you move the virtual dispatch into its header in just the right way, the optimizer gets to devirtualize, and remove all references to the vtable. Then the sad part: the references to this one vtable were the only strong symbol uses in the support library for llvm-tblgen AFAICT. At least, after doing something just like this, these symbols stopped getting their weak definition and random calls to them would segfault instead. Yay software. llvm-svn: 205137	2014-03-30 11:20:25 +00:00
Chandler Carruth	81f7061065	[ARM64] Fix a heap-use-after-free spotted by ASan. StringRef::lower() returns a std::string. Better yet, we can now stop thinking about what it returns and write 'auto'. It does the right thing. =] llvm-svn: 205135	2014-03-30 09:08:07 +00:00
Tim Northover	bf679cec67	ARM64: uncopy/paste helper function It was doing functional but highly suspect operations on bools due to the more limited shifting operands supported by memory instructions. Should fix some MSVC warnings. llvm-svn: 205134	2014-03-30 08:30:28 +00:00
Tim Northover	6b3258f087	ARM64: remove unused variables llvm-svn: 205133	2014-03-30 07:35:48 +00:00
Tim Northover	af6bfb21cd	ARM64: remove -m32/-m64 mapping with ARM. This is causing the ARM build-bots to fail since they only include the ARM backend and can't create an ARM64 target. llvm-svn: 205132	2014-03-30 07:25:23 +00:00
Tim Northover	3e52557212	ARM64: override all the things. Actually, mostly only those in the top-level directory that already had a "virtual" attached. But it's the thought that counts and it's been a long day. llvm-svn: 205131	2014-03-30 07:25:18 +00:00
Saleem Abdulrasool	f80b49b5d2	Support: correct Windows normalisation If the environment is unknown and no object file is provided, then assume an "MSVC" environment, otherwise, set the environment to the object file format. In the case that we have a known environment but a non-native file format for Windows (COFF) which is used for MCJIT, then append the custom file format to the triple as an additional component. This fixes the MCJIT tests on Windows. llvm-svn: 205130	2014-03-30 07:19:31 +00:00
NAKAMURA Takumi	09717bd1c4	X86Subtarget.h: isTargetWindows() should tell whether it is targeting msvc. FYI, !isWindowsGNUEnvironment() is insufficient. It missed cygwin. FIXME: The name "isTargetWindows" should be fixed. llvm-svn: 205124	2014-03-30 04:35:00 +00:00
Lang Hames	c339840666	[MC] Remove an unused (and broken) variant of the setupForSymbolicDisassembly method in MCDisassembler. llvm-svn: 205123	2014-03-30 04:27:33 +00:00
Rafael Espindola	5e66a7e699	Add a missing break. Patch by Tobias Güntner. I tried to write a test, but the only difference is the Changed value that gets returned. It can be tested with "opt -debug-pass=Executions -functionattrs", but that doesn't seem worth it. llvm-svn: 205121	2014-03-30 03:26:17 +00:00
Saleem Abdulrasool	ceec2cba64	Support: normalize the default triple on Unix This will fix cross-compiling buildbots (e.g. cygwin). This is in the same vein as SVN r205070. Apply this to fix the cross-compiling scenario, even though the preferred solution is to update the build system to normalize the embedded triple rather than perform this at runtime every time. This is meant to tide us over until that approach is fleshed out and applied. llvm-svn: 205120	2014-03-30 03:22:37 +00:00
Dmitri Gribenko	1fd72104ad	Fix a few -Wdocumentation warnings llvm-svn: 205116	2014-03-29 19:40:32 +00:00
Benjamin Kramer	3ad660a515	Detemplatize LOHDirective. The ARM64 backend uses it only as a container to keep an MCLOHType and Arguments around so give it its own little copy. The other functionality isn't used and we had a crazy method specialization hack in place to keep it working. Unfortunately that was incompatible with MSVC. Also range-ify a couple of loops while at it. llvm-svn: 205114	2014-03-29 19:21:20 +00:00
Benjamin Kramer	61e595be4d	ARM64: Remove unused helper function, make others static. llvm-svn: 205112	2014-03-29 18:00:49 +00:00
Benjamin Kramer	48e7e85d29	tblgen: Twinify PrintFatalError. No functionality change. llvm-svn: 205110	2014-03-29 17:17:15 +00:00
Benjamin Kramer	fd719b9551	Avoid storing Twines. While there, move nested ifs into a helper function. No functionality change. llvm-svn: 205108	2014-03-29 16:54:29 +00:00
Hal Finkel	777c9dd90a	[PowerPC] Handle v2i64 comparisons v2i64 is a legal type under VSX, however we don't have native vector comparisons. We can handle eq/ne by casting it to an Altivec type, but everything else must be expanded. llvm-svn: 205106	2014-03-29 16:04:40 +00:00
Tim Northover	adbd34e045	ARM64: format register strings without creating a local Twine. It was causing horrible failures on some build-bots. llvm-svn: 205105	2014-03-29 15:35:57 +00:00
Hal Finkel	e8fba98735	[PowerPC] VSX instruction latency corrections The vector divide and sqrt instructions have high latencies, and the scalar comparisons are like all of the others. On the P7, permutations take an extra cycle over purely-simple vector ops. llvm-svn: 205096	2014-03-29 13:20:31 +00:00
Stepan Dyatkovskiy	df657cc1d5	Recommitted fix for PR18931, with an extended test set. Issue subject: Crash using integrated assembler with immediate arithmetic. Fix description: Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated at the asm parsing stage, since it is impossible to resolve labels at this stage. At the end of the stage we still have an expression (MCExpr). Then, when we want to encode it, we expect it to be an immediate, but it is still an expression. The patch introduces a Fixup (MCFixup instance) that is processed after the main encoding stage. llvm-svn: 205094	2014-03-29 13:12:40 +00:00
Tim Northover	2125374ecf	ARM64: use 64-bit constant even on 32-bit machines Another existing bot failure so no tests. llvm-svn: 205093	2014-03-29 11:51:49 +00:00
Tim Northover	2011df293d	ARM64: change format specifier to work on 32-bit targets Existing tests were failing. llvm-svn: 205092	2014-03-29 11:47:07 +00:00
Chandler Carruth	7b7a67c5c8	[ARM64] Fix 'assert("...")' to be 'assert(0 && "...")'. Otherwise, it is no assert at all. ;] Some of these should probably be switched to llvm_unreachable, but I didn't want to perturb the behavior in this patch. Found by -Wstring-conversion, which I'll try to turn on in CMake builds at least as it is finding useful things. llvm-svn: 205091	2014-03-29 11:07:40 +00:00
Tim Northover	00ed9964c6	ARM64: initial backend import This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. llvm-svn: 205090	2014-03-29 10:18:08 +00:00
Tim Northover	753eca0f78	CodeGen: add sensible defaults for the ISD::FROUND operation Some exotic types didn't know how to handle FROUND, which ARM64 uses. llvm-svn: 205088	2014-03-29 09:03:18 +00:00
Tim Northover	d1c6f51730	MC-exceptions: add support for compact-unwind without .eh_frame ARM64 has compact-unwind information, but doesn't necessarily want to emit .eh_frame directives as well. This teaches MC about such a situation so that it will skip .eh_frame info when compact unwind has been successfully produced. For functions incompatible with compact unwind, the normal information is still written. llvm-svn: 205087	2014-03-29 09:03:13 +00:00
Tim Northover	cea0abb60a	CodeGenPrep: wrangle IR to exploit AArch64 tbz/tbnz inst. Given IR like: %bit = and %val, #imm-with-1-bit-set %tst = icmp %bit, 0 br i1 %tst, label %true, label %false some targets can emit just a single instruction (tbz/tbnz in the AArch64 case). However, with ISel acting at the basic-block level, all three instructions need to be together for this to be possible. This adds another transformation to CodeGenPrep to expose these opportunities, if targets opt in via the hook. llvm-svn: 205086	2014-03-29 08:22:29 +00:00
Tim Northover	0999cbd0b9	MC: add a RefKind field to MCValue This is principally to allow neater mapping of fixups to relocations in ARM64 ELF. Without this, there isn't enough information available to GetRelocType, leading to many more fixup_arm64_... enumerators. llvm-svn: 205085	2014-03-29 08:22:20 +00:00
Tim Northover	53d3251851	MachO: Add linker-optimisation hint framework to MC. Another part of the ARM64 backend (so tests will be following soon). This is currently used by the linker to relax adrp/ldr pairs into nops where possible, though could well be more broadly applicable. llvm-svn: 205084	2014-03-29 07:34:53 +00:00
Tim Northover	5627670e84	MachO: actually set linker-private prefix at MC level. This was accidentally omitted from r205081. llvm-svn: 205083	2014-03-29 07:33:24 +00:00
Tim Northover	c3988b4aa3	MachO: allow each section to have a linker-private symbol The upcoming ARM64 backend doesn't have section-relative relocations, so we give each section its own symbol to provide this functionality. Of course, it doesn't need to appear in the final executable, so linker-private is the best kind for this purpose. llvm-svn: 205081	2014-03-29 07:05:06 +00:00
Tim Northover	4516de3412	Intrinsics: add LLVMHalfElementsVectorType constraint This is like the LLVMMatchType, except the verifier checks that the second argument is a vector with the same base type and half the number of elements. This will be used by the ARM64 backend. llvm-svn: 205079	2014-03-29 07:04:54 +00:00
Rafael Espindola	5904e12bfa	Completely rewrite ELFObjectWriter::RecordRelocation. I started trying to fix a small issue, but this code has seen a small fix too many. The old code was fairly convoluted. Some of the issues it had: * It failed to check if a symbol difference was in the same section when converting a relocation to pcrel. * It failed to check if the relocation was already pcrel. * The pcrel value computation was wrong in some cases (relocation-pc.s) * It was missing quite a few cases where it should not convert symbol relocations to section relocations, leaving the backends to patch it up. * It would not propagate the fact that it had changed a relocation to pcrel, requiring a quite nasty workaround in ARM. * It was missing comments. llvm-svn: 205076	2014-03-29 06:26:49 +00:00
Hal Finkel	19be506a5e	[PowerPC] Add subregister classes for f64 VSX values We had stored both f64 values and v2f64, etc. values in the VSX registers. This worked, but was suboptimal because we would always spill 16-byte values even though we almost always had scalar 8-byte values. This resulted in an increase in stack-size use, extra memory bandwidth, etc. To fix this, I've added 64-bit subregisters of the Altivec registers, and combined those with the existing scalar floating-point registers to form a class of VSX scalar floating-point registers. The ABI code has also been enhanced to use this register class and some other necessary improvements have been made. llvm-svn: 205075	2014-03-29 05:29:01 +00:00
Saleem Abdulrasool	37511ecea8	Windows: canonicalise the default windows triple Canonicalise the default triple that is used on Windows. This should hopefully fix the MSVC buildbots. llvm-svn: 205070	2014-03-29 01:08:53 +00:00
Akira Hatanaka	9afbb8c2b1	[x86] Fix printing of register operands with q modifier. Emit 32-bit register names instead of 64-bit register names if the target does not have 64-bit general purpose registers. <rdar://problem/14653996> llvm-svn: 205067	2014-03-28 23:28:07 +00:00
David Blaikie	dca7c7c5f1	Debug Compression: Avoid compressing debug_frame for now Turns out debug_frame does use multiple fragments, so it doesn't compress correctly with the current approach. Disable compressing it for now while I figure out what's the best solution for it. llvm-svn: 205059	2014-03-28 21:48:31 +00:00
David Majnemer	02f2188bb9	X86: Disable IsLegalToCallImmediateAddr for Win32 WinCOFF cannot form PC relative relocations to support absolute MCValues. We should reenable this once WinCOFF supports emission of IMAGE_REL_I386_REL32 relocations. This fixes PR19272. llvm-svn: 205058	2014-03-28 21:40:47 +00:00
Hal Finkel	2583b06310	[PowerPC] Fix VSX permutation isel Not only did I invert the indices when I wrote the code, but I also did the same thing when I wrote the regression test. Oops. llvm-svn: 205046	2014-03-28 20:24:55 +00:00
Hal Finkel	7811c6188e	[PowerPC] v2[fi]64 need to be explicitly passed in VSX registers v2[fi]64 values need to be explicitly passed in VSX registers. This is because the code in TRI that finds the minimal register class given a register and a value type will assert if given an Altivec register and a non-Altivec type. llvm-svn: 205041	2014-03-28 19:58:11 +00:00
Rafael Espindola	c44c26b4e1	Map ELF flags back to more specific section kinds. With that, convert another llc -filetype=obj test. llvm-svn: 205031	2014-03-28 19:14:08 +00:00
Rafael Espindola	b59fb7347a	Parse .gpdword and convert another llc -filetype=obj test. llvm-svn: 205028	2014-03-28 18:50:26 +00:00
Arnold Schwaighofer	c9d58e8d32	SLPVectorizer: Take credit for free extractelement instructions Extract element instructions that will be removed when vectorizing lower the cost. Patch by Arch D. Robison! llvm-svn: 205020	2014-03-28 17:21:32 +00:00
Arnold Schwaighofer	b0d3bcdd32	SLPVectorizer: Fix typos Patch by Arch D. Robison! llvm-svn: 205019	2014-03-28 17:21:27 +00:00
Arnold Schwaighofer	b190cb30c3	SLPVectorizer: Ignore users that are insertelements we can reschedule them Patch by Arch D. Robison! llvm-svn: 205018	2014-03-28 17:21:22 +00:00
Rafael Espindola	d7610a5d67	Add const to a method I missed in the previous commit. llvm-svn: 205014	2014-03-28 16:14:12 +00:00
Rafael Espindola	3e3de5e353	Add const. llvm-svn: 205013	2014-03-28 16:06:09 +00:00
Erik Verbruggen	5e1bac3a38	Revert "InstCombine: merge constants in both operands of icmp." This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010	2014-03-28 14:50:57 +00:00
Erik Verbruggen	2074ebd8af	Revert "GVN: merge overflow intrinsics with non-overflow instructions." This reverts commit r203553, and follow-up commits r203558 and r203574. I will follow this up on the mailing list to do it in a way that won't cause subtle PRE bugs. llvm-svn: 205009	2014-03-28 14:42:34 +00:00
Christian Pirker	2a11160956	Add ARM big endian Target (armeb, thumbeb) Reviewed at http://llvm-reviews.chandlerc.com/D3095 llvm-svn: 205007	2014-03-28 14:35:30 +00:00
Tim Northover	24f46618b2	R600: avoid calling std::next on an iterator that might be end() This was causing my llc to go into an infinite loop on CodeGen/R600/address-space.ll (just triggered recently by some allocator changes). llvm-svn: 205005	2014-03-28 13:52:56 +00:00
Tim Northover	aa3cf1e691	Intrinsics: expand semantics of LLVMExtendedVectorType (& trunc) These are used in the ARM backends to aid type-checking on patterns involving intrinsics, by making sure one argument is an extended/truncated version of another. However, there's no reason to limit them to just vector types. For example AArch64 has the instruction "uqshrn sD, dN, #imm" which would naturally use an intrinsic taking an i64 and returning an i32. llvm-svn: 205003	2014-03-28 12:31:39 +00:00
Chandler Carruth	2c540f62bf	[Allocator Cleanup] Move generic pointer alignment helper out of an out-of-line private static method and into the collection of inline alignment helpers in MathExtras.h. llvm-svn: 204995	2014-03-28 09:08:14 +00:00
Chandler Carruth	3b56b9cf90	[Allocator Cleanup] Make the growth of the "slab" size of the BumpPtrAllocator significantly less strange by making it a simple function of the number of slabs allocated rather than by making it a recurrence. I think the previous behavior was essentially that the size of the slabs would be doubled after the first 128 were allocated, and then doubled again each time 64 more were allocated, but only if every allocation packed perfectly into the slab size. If not, the wasted space wouldn't be counted toward increasing the size, but allocations over the size threshold would. And since the allocations over the size threshold might be much larger than the slab size, this could have somewhat surprising consequences where we rapidly grow the slab size. This currently requires adding state to the allocator to track the number of slabs currently allocated, but that isn't too bad. I'm planning further changes to the allocator that will make this state fall out even more naturally. It still doesn't fully decouple the growth rate from the allocations which are over the size threshold. That fix is coming later. This specific fix will allow making the entire thing into a more stateless device and lifting the parameters into template parameters rather than runtime parameters. llvm-svn: 204993	2014-03-28 08:53:25 +00:00
Chandler Carruth	ead0f76443	[cleanup] Hoist the initialization and constants for slab sizes to the top of the default jit memory manager. This will allow them to be used as template parameters rather than runtime parameters in a subsequent commit. llvm-svn: 204992	2014-03-28 08:53:08 +00:00
Adrian Prantl	79c8e8f046	C++11: convert verbose loops to range-based loops. llvm-svn: 204981	2014-03-27 23:30:04 +00:00
Hal Finkel	c6fc9b8960	[PowerPC] Use a small cleanup pass to remove VSX self copies As explained in r204976, because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this: %VSL2<def> = COPY %F2, %VSL2<imp-use,kill> (where %F2 is really a sub-register of %VSL2, and so this copy is a nop) This adds a small cleanup pass to remove these prior to post-RA scheduling. llvm-svn: 204980	2014-03-27 23:12:31 +00:00
Manman Ren	ed0de1368d	Provide a target override for the cost of using a callee-saved register for the first time. Thanks Andy for the discussion. rdar://16162005 llvm-svn: 204979	2014-03-27 23:10:04 +00:00
Saleem Abdulrasool	edbdd2e5df	Canonicalise Windows target triple spellings Construct a uniform Windows target triple nomenclature which is congruent to the Linux counterpart. The old triples are normalised to the new canonical form. This cleans up the long-standing issue of odd naming for various Windows environments. There are four different environments on Windows: MSVC: The MS ABI, MSVCRT environment as defined by Microsoft GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries Itanium: The MSVCRT environment + libc++ built with Itanium ABI Cygnus: The Cygwin environment which uses custom libraries for everything The following spellings are now written as: i686-pc-win32 => i686-pc-windows-msvc i686-pc-mingw32 => i686-pc-windows-gnu i686-pc-cygwin => i686-pc-windows-cygnus This should be sufficiently flexible to allow us to target other windows environments in the future as necessary. llvm-svn: 204977	2014-03-27 22:50:05 +00:00
Hal Finkel	9dcb3583d5	[PowerPC] Don't remove self VSX copies in PPCInstrInfo::copyPhysReg Because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this: %VSL2<def> = COPY %F2, %VSL2<imp-use,kill> (where %F2 is really a sub-register of %VSL2, and so this copy is a nop) The problem is that ExpandPostRAPseudos always assumes that some instruction has been inserted, and adds implicit defs to it. This is a problem if no copy was inserted because it can cause subtle problems during post-RA scheduling. These self copies will have to be removed some other way. llvm-svn: 204976	2014-03-27 22:46:28 +00:00
Quentin Colombet	85b904d875	[X86][Vector Cost Model] Add a comment to explain the workaround in my previous commit (r204884). <rdar://problem/16381225> llvm-svn: 204972	2014-03-27 22:27:41 +00:00
Hal Finkel	82569b6366	[PowerPC] Fix v2f64 vector extract and related patterns First, v2f64 vector extract had not been declared legal (and so the existing patterns were not being used). Second, the patterns for that, and for scalar_to_vector, should really be a regclass copy, not a subregister operation, because the VSX registers directly hold both the vector and scalar data. llvm-svn: 204971	2014-03-27 22:22:48 +00:00
Hal Finkel	ad801b7459	[PowerPC] Expand v2i64 shifts These operations need to be expanded during legalization so that isel does not crash. In theory, we might be able to custom lower some of these. That, however, would need to be follow-up work. llvm-svn: 204963	2014-03-27 21:26:33 +00:00
Manman Ren	9dee449ee3	Register Allocator: refactoring and add comments. No functionality change. Thanks Andy for reviewing. rdar://16162005 llvm-svn: 204962	2014-03-27 21:21:57 +00:00
Rafael Espindola	c03f44ca8a	Remove another unused argument. llvm-svn: 204961	2014-03-27 20:49:35 +00:00
David Blaikie	7400a97952	DebugInfo: Support for compressed debug info sections 1) When creating a .debug_* section, instead create a .zdebug_* section. 2) When creating a fragment in a .zdebug_* section, make it a compressed fragment. 3) When computing the size of a compressed section, compress the data and use the size of the compressed data. 4) Emit the compressed bytes. Also, check that if a section has a compressed fragment, it is the only fragment in the section. Assert-fail if the fragment's data is modified after it is compressed. Initial review on llvm-commits by Eric Christopher and Rafael Espindola. llvm-svn: 204958	2014-03-27 20:45:58 +00:00
David Blaikie	70bd1fd22f	DebugInfo: TargetOptions/MCAsmInfo support for compressed debug info sections llvm-svn: 204957	2014-03-27 20:45:41 +00:00
Rafael Espindola	9ab380122a	Remove unused argument. llvm-svn: 204956	2014-03-27 20:41:17 +00:00
Reid Kleckner	3bdf9bc48b	InstCombine: Don't combine constants on unsigned icmps Fixes a miscompile introduced in r204912. It would miscompile code like (unsigned)(a + -49) <= 5U. The transform would turn this into (unsigned)a < 55U, which would return true for values in [0, 49], when it should not. llvm-svn: 204948	2014-03-27 17:49:27 +00:00
Matt Arsenault	b517c8128e	R600: Implement isZExtFree. This allows 64-bit operations that are truncated to be reduced to 32-bit ones. llvm-svn: 204946	2014-03-27 17:23:31 +00:00
Matt Arsenault	d125d74a73	R600/SI: Fix unreachable with a sext_in_reg to an illegal type. llvm-svn: 204945	2014-03-27 17:23:24 +00:00
Daniel Sanders	5e94e68f7b	[mips] Some uses of isMips64()/hasMips64() are really tests for 64-bit GPRs Summary: No functional change since these predicates are (currently) synonymous. Extracted from a patch by David Chisnall. His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3202 llvm-svn: 204943	2014-03-27 16:42:17 +00:00
Logan Chien	30eb9f47c6	[AArch64] Lower SHL_PARTS, SRA_PARTS and SRL_PARTS Lower SHL_PARTS, SRA_PARTS and SRL_PARTS to perform 128-bit integer shift Patch by GuanHong Liu. llvm-svn: 204940	2014-03-27 16:28:09 +00:00
Rafael Espindola	24a669d225	Prevent alias from pointing to weak aliases. This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934	2014-03-27 15:26:56 +00:00
Daniel Sanders	64cf5a4eb2	[mips] Attempting to use register $32 should be an error instead of an assertion. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3201 llvm-svn: 204932	2014-03-27 15:00:44 +00:00
Aaron Ballman	be648a3c16	The forward declare should be a struct instead of a class (to be consistent with the definition, as well as to silence an MSVC C4099 warning). llvm-svn: 204928	2014-03-27 14:10:00 +00:00
Daniel Sanders	5bce5f6245	[mips] Add support for .cpsetup Summary: Patch by Robert N. M. Watson His work was sponsored by: DARPA, AFRL Small corrections by myself. CC: theraven, matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3199 llvm-svn: 204924	2014-03-27 13:52:53 +00:00
Daniel Sanders	bd0e39079a	[mips] The decision between GOT_DISP and GOT16 for global addresses depends on ABI rather than MIPS64 Summary: No functional change (for supported use cases) Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3191 llvm-svn: 204922	2014-03-27 12:49:34 +00:00
Zoran Jovanovic	ada38ef61b	Split the file MipsAsmBackend.cpp into MipsAsmBackend.cpp and MipsAsmBackend.h. Differential Revision: http://llvm-reviews.chandlerc.com/D3134 llvm-svn: 204921	2014-03-27 12:38:40 +00:00
Karthik Bhat	82540e9ef8	All new elements except the last one initialized to NULL. Ideally, once parsing is complete, all elements should be non-NULL. To safe-guard BitcodeReader, this patch adds null check for all access to these list. Patch by Dinesh Dwivedi! llvm-svn: 204920	2014-03-27 12:08:23 +00:00
Matheus Almeida	a805e85cb2	[mips] Remove unused private field. llvm-svn: 204919	2014-03-27 12:02:48 +00:00
Matheus Almeida	61218ba798	[mips] NaCl should now use the custom MipsELFStreamer (recently added) in spite of MCELFStreamer. This is so that changes to MipsELFStreamer will automatically propagate through its subclasses. No functional changes (MipsELFStreamer has the same functionality of MCELFStreamer at the moment). Differential Revision: http://llvm-reviews.chandlerc.com/D3130 llvm-svn: 204918	2014-03-27 11:52:20 +00:00
Matheus Almeida	dac77fb389	[mips] Implement custom MCELFStreamer. This allows us to insert some hooks before emitting data into an actual object file. For example, we can capture the register usage for a translation unit by overriding the EmitInstruction method. The register usage information is needed to generate .reginfo and .Mips.options ELF sections. No functional changes. Differential Revision: http://llvm-reviews.chandlerc.com/D3129 llvm-svn: 204917	2014-03-27 11:39:03 +00:00
Erik Verbruggen	59a1219846	InstCombine: merge constants in both operands of icmp. Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912	2014-03-27 11:16:05 +00:00
Daniel Sanders	d897b564ca	[mips] Stop caching the result of hasMips64(), isABI_O32(), isABI_N32(), and isABI_N64() from MipsSubTarget in MipsTargetLowering Summary: The short name is quite convenient so provide an accessor for them instead. No functional change Depends on D3177 Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3178 llvm-svn: 204911	2014-03-27 10:46:12 +00:00
Elena Demikhovsky	bb2f6b72d3	AVX-512: Implemented masking for integer arithmetic & logic instructions. By Robert Khasanov rob.khasanov@gmail.com llvm-svn: 204906	2014-03-27 09:45:08 +00:00
Stepan Dyatkovskiy	e8747e30ef	Rejected r204899 and r204900 due to remaining test failures on cmake-llvm-x86_64-linux buildbot. llvm-svn: 204901	2014-03-27 08:38:18 +00:00
Stepan Dyatkovskiy	3530003008	Fix for pr18931: Crash using integrated assembler with immediate arithmetic Fix description: Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage, since it is impossible to resolve labels on this stage. In the end of stage we still have expression (MCExpr). Then, when we want to encode it, we expect it to be an immediate, but it still an expression. Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage. llvm-svn: 204899	2014-03-27 07:49:39 +00:00
Jiangning Liu	1d3f2c7c82	ARM: raise error message when complex SO expressions can't really be solved as a constant at compilation time. llvm-svn: 204898	2014-03-27 07:42:58 +00:00
Lang Hames	2768d26a62	Move MCSymbolizer's constructor into header. It's trivial - there's no need for it to be out-of-line. llvm-svn: 204892	2014-03-27 02:42:52 +00:00
Lang Hames	eb37092342	Update MCSymbolizer and its subclasses' constructors to reflect the fact that they take ownership of the RelocationInfo they're constructed with. llvm-svn: 204891	2014-03-27 02:39:01 +00:00
Lang Hames	69247821a2	Remove forward declaration for Target class - Target is already defined here. No functional change. llvm-svn: 204885	2014-03-27 01:05:49 +00:00
Quentin Colombet	3914bf516b	[X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64 and v4i64->v4f64. The new costs match what we did for SSE2 and reflect the reality of our codegen. <rdar://problem/16381225> llvm-svn: 204884	2014-03-27 00:52:16 +00:00
Rafael Espindola	a041ef1bd8	Correctly propagates st_size. This also finally removes a bogus call to AliasedSymbol. llvm-svn: 204883	2014-03-27 00:28:24 +00:00
Jim Grosbach	72fbde84b8	X86: Correct vectorization cost model for v8f32->v8i8. Fix the cost model to reflect the reality of our codegen. rdar://16370633 llvm-svn: 204880	2014-03-27 00:04:11 +00:00
Nick Lewycky	77d5fb40c8	Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch by Björn Steinbrink! llvm-svn: 204876	2014-03-26 23:45:15 +00:00
Hal Finkel	df3e34d944	[PowerPC] Generate VSX permutations for v2[fi]64 vectors llvm-svn: 204873	2014-03-26 22:58:37 +00:00
Reid Kleckner	23798a9731	CloneFunction: Clone all attributes, including the CC Summary: Tested with a unit test because we don't appear to have any transforms that use this other than ASan, I think. Fixes PR17935. Reviewers: nicholas CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3194 llvm-svn: 204866	2014-03-26 22:26:35 +00:00
Ekaterina Romanova	b9aea9383a	This is a fix for PR# 19051. I noticed code gen differences due to code motion when running tests with and without the debug info at O2. The problem is in branch folding. A loop wanted to skip the debug info, but actually it didn't do so. llvm-svn: 204865	2014-03-26 22:15:28 +00:00
Manman Ren	14aa891976	Add comments. Addressing review comments from Evan on r204690. llvm-svn: 204864	2014-03-26 22:14:09 +00:00
Justin Bogner	95e0a70581	llvm-cov: Handle functions with no line number Functions may exist in an instrumented binary but not in the original source when they're inserted by the compiler or the runtime. These functions aren't meaningful to the user, so teach llvm-cov to skip over them instead of crashing. llvm-svn: 204863	2014-03-26 22:03:06 +00:00
Kevin Enderby	5611398b6b	Fix a problem with the ARM assembler incorrectly matching a vector list parameter that is using all lanes "{d0[], d2[]}" but can match an instruction with a "{d0, d2}" parameter. I’m finishing up a fix for proper checking of the unsupported alignments on vld/vst instructions and ran into this. Thus I don’t have a test case at this time. And adding all code that will demonstrate the bug would obscure the very simple one line fix. So if you would indulge me on not having a test case at this time I’ll instead offer up a detailed explanation of what is going on in this commit message. This instruction: vld2.8 {d0[], d2[]}, [r4:64] is not legal as the alignment can only be 16 when the size is 8. Per this documentation: A8.8.325 VLD2 (single 2-element structure to all lanes) <align> The alignment. It can be one of: 16 2-byte alignment, available only if <size> is 8, encoded as a = 1. 32 4-byte alignment, available only if <size> is 16, encoded as a = 1. 64 8-byte alignment, available only if <size> is 32, encoded as a = 1. omitted Standard alignment, see Unaligned data access on page A3-108. So when code is added to the llvm integrated assembler to not match that instruction because of the alignment it then goes on to try to match other instructions and comes across this: vld2.8 {d0, d2}, [r4:64] and matches it. This is because the method ARMOperand::isVecListDPairSpaced() is missing the check of the Kind. In this case the Kind is k_VectorListAllLanes. While the name of the method may suggest that this is OK it really should check that the Kind is k_VectorList.
As the method ARMOperand::isDoubleSpacedVectorAllLanes() is what was used to match {d0[], d2[]} and correctly checks the Kind: bool isDoubleSpacedVectorAllLanes() const { return Kind == k_VectorListAllLanes && VectorList.isDoubleSpaced; } where the original ARMOperand::isVecListDPairSpaced() does not check the Kind: bool isVecListDPairSpaced() const { if (isSingleSpacedVectorList()) return false; return (ARMMCRegisterClasses[ARM::DPairSpcRegClassID] .contains(VectorList.RegNum)); } Jim Grosbach has reviewed the change and said: Yep, that sounds right. … And by "right" I mean, "wow, that's a nasty latent bug I'm really, really glad to see fixed." :) rdar://16436683 llvm-svn: 204861	2014-03-26 21:54:11 +00:00
Arnold Schwaighofer	1a444489e9	PR15967 Fix in basicaa for faulty returning no alias. This commit consist of two parts. The first part fix the PR15967. The wrong conclusion was made when the MaxLookup limit was reached. The fix introduce a out parameter (MaxLookupReached) to DecomposeGEPExpression that the function aliasGEP can act upon. The second part is introducing the constant MaxLookupSearchDepth to make sure that DecomposeGEPExpression and GetUnderlyingObject use the same search depth. This is a small cleanup to clarify the original algorithm. Patch by Karl-Johan Karlsson! llvm-svn: 204859	2014-03-26 21:30:19 +00:00
Hal Finkel	6e28e6aaaf	[PowerPC] VSX loads and stores support unaligned access I've not yet updated PPCTTI because I'm not sure what the actual relative cost is compared to the aligned uses. llvm-svn: 204848	2014-03-26 19:39:09 +00:00
Kevin Enderby	8108f38437	Fix the ARM VST4 (single 4-element structure from one lane) size 16 double-spaced registers instruction printing. This: vld4.16 {d17[1], d19[1], d21[1], d23[1]}, [r7]! was being printed as: vld4.16 {d17[1], d18[1], d19[1], d20[1]}, [r7]! rdar://16435096 llvm-svn: 204847	2014-03-26 19:35:40 +00:00
Hal Finkel	7279f4b00d	[PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions llvm-svn: 204843	2014-03-26 19:13:54 +00:00
Matt Arsenault	90b733a3cf	R600: Add a testcase for sext_in_reg I missed. This sext_inreg i32 in i64 case was already handled, but not enabled. llvm-svn: 204840	2014-03-26 18:31:06 +00:00
Hal Finkel	ea76a44584	[PowerPC] Remove some dead VSX v4f32 store patterns These patterns are dead (because v4f32 stores are currently promoted to v4i32 and stored using Altivec instructions), and also are likely not correct (because they'd store the vector elements in the opposite order from that assumed by the rest of the Altivec code). llvm-svn: 204839	2014-03-26 18:26:36 +00:00
Hal Finkel	9281c9a38b	[PowerPC] Use VSX vector load/stores for v2[fi]64 These instructions have access to the complete VSX register file. In addition, they "swap" the order of the elements so that element 0 (the scalar part) comes first in memory and element 1 follows at a higher address. llvm-svn: 204838	2014-03-26 18:26:30 +00:00
Juergen Ributzka	6ff29a7b2f	[MCJIT] Check if there have been errors during RuntimeDyld execution. llvm-svn: 204837	2014-03-26 18:19:27 +00:00
Jim Grosbach	ed2cd39b81	Fix for incorrect address sinking in the presence of potential overflows. In some cases it is possible for CGP to attempt to reuse a base address from another basic block. In those cases we have to be sure that all the address math was either done at the same bit width, or that none of it overflowed before it was extended. Patch by Louis Gerbarg <lgg@apple.com> rdar://16307442 llvm-svn: 204833	2014-03-26 17:27:01 +00:00
Hans Wennborg	d683a22dd2	Revert "X86 memcpy lowering: use "rep movs" even when esi is used as base pointer" (r204174) > For functions where esi is used as base pointer, we would previously fall ba > from lowering memcpy with "rep movs" because that clobbers esi. > > With this patch, we just store esi in another physical register, and restore > it afterwards. This adds a little bit of register preassure, but the more > efficient memcpy should be worth it. > > Differential Revision: http://llvm-reviews.chandlerc.com/D2968 This didn't work. I was ending up with code like this: lea edi,[esi+38h] mov ecx,0Fh mov edx,esi mov esi,ebx rep movs dword ptr es:[edi],dword ptr [esi] lea ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx. add ebx,3Ch mov esi,edx I guess if we want to do this we need stronger glue or something, or doing the expansion much later. llvm-svn: 204829	2014-03-26 16:30:54 +00:00
Hal Finkel	a6c8b51212	[PowerPC] Add v2i64 as a legal VSX type v2i64 needs to be a legal VSX type because it is the SetCC result type from v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations. This fixes the lowering for v2f64 VSELECT. llvm-svn: 204828	2014-03-26 16:12:58 +00:00
Matheus Almeida	ea06727f03	[mips] Use TwoOperandAliasConstraint for ArithLogicR instructions. This enables TableGen to generate an additional two operand matcher for our ArithLogicR class of instructions (constituted by 3 register operands). E.g.: and $1, $2 <=> and $1, $1, $2 llvm-svn: 204826	2014-03-26 16:09:43 +00:00
Matheus Almeida	ab5633b70c	[mips] Add support to the '.dword' directive. The '.dword' directive accepts a list of expressions and emits them in 8-byte chunks in successive locations. llvm-svn: 204822	2014-03-26 15:44:18 +00:00
Matheus Almeida	3e2a702aa2	[mips] Rename function in MipsAsmParser. parseDirectiveWord is a generic function that parses an expression which means there's no need for it to have such a specific name. Renaming it to parseDataDirective so that it can also be used to handle .dword directives[1]. [1]To be added in a follow up commit. No functional changes. llvm-svn: 204818	2014-03-26 15:24:36 +00:00
Matheus Almeida	3b9c63d29b	[mips] Add support to '.set mips64'. The '.set mips64' directive enables the feature Mips:FeatureMips64 from assembly. Note that it doesn't modify the ELF header as opposed to the use of -mips64 from the command-line. The reason for this is that we want to be as compatible as possible with existing assemblers like GAS. llvm-svn: 204817	2014-03-26 15:14:32 +00:00
Christian Pirker	99974c7242	AArch64_BE Elf support for MC-JIT runtime dynamic linker llvm-svn: 204816	2014-03-26 14:57:32 +00:00
Matheus Almeida	a2cd009c51	[mips] Add support to '.set mips64r2'. The '.set mips64r2' directive enables the feature Mips:FeatureMips64r2 from assembly. Note that it doesn't modify the ELF header as opposed to the use of -mips64r2 from the command-line. The reason for this is that we want to be as compatible as possible with existing assemblers like GAS. llvm-svn: 204815	2014-03-26 14:52:22 +00:00
Christian Pirker	3aa0e6a1f9	AArch64_BE function argument passing for ARM ABI llvm-svn: 204814	2014-03-26 14:51:22 +00:00
Tim Northover	1ff5f29fb5	ARM: add intrinsics for the v8 ldaex/stlex We've already got versions without the barriers, so this just adds IR-level support for generating the new v8 ones. rdar://problem/16227836 llvm-svn: 204813	2014-03-26 14:39:31 +00:00
Matheus Almeida	fe1e39dcba	[mips] Hoist common functionality into a new function. Given that we support multiple directives that enable a particular feature (e.g. '.set mips16'), it's best to hoist that code into a new function so that we don't repeat the same pattern w.r.t parsing and handling error cases. No functional changes. llvm-svn: 204811	2014-03-26 14:26:27 +00:00
Renato Golin	93010e687f	Change @llvm.clear_cache default to call rt-lib After some discussion on IRC, emitting a call to the library function seems like a better default, since it will move from a compiler internal error to a linker error, that the user can work around until LLVM is fixed. I'm also adding a note on the responsibility of the user to confirm that the cache was cleared on platforms where nothing is done. llvm-svn: 204806	2014-03-26 14:01:32 +00:00
Daniel Sanders	6dd7251599	[mips] The decision to use MO_GOT_PAGE and MO_GOT_OFST depends on the ABI being N32 or N64 not the arch being MIPS64 Summary: No functional change (in supported use cases) Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3177 llvm-svn: 204805	2014-03-26 13:59:42 +00:00
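
The miscompile described in the r204948 entry above (Reid Kleckner, InstCombine) can be checked by brute force. The following is only a sketch, not code from the commit: the function names `original`, `transformed`, and `mismatches` are hypothetical, and the addition is modeled in 32-bit unsigned arithmetic so that wraparound is well defined. It compares `(unsigned)(a + -49) <= 5U` with the transformed form `(unsigned)a < 55U` and collects the inputs where the two disagree.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Comparison quoted in the r204948 message: (unsigned)(a + -49) <= 5U.
// Modeled with uint32_t so the wraparound for small 'a' is well defined.
bool original(int32_t a) {
  return static_cast<uint32_t>(a) + static_cast<uint32_t>(-49) <= 5U;
}

// Result of the faulty transform quoted in the message: (unsigned)a < 55U.
bool transformed(int32_t a) {
  return static_cast<uint32_t>(a) < 55U;
}

// Hypothetical helper: every input in [lo, hi] where the two forms disagree.
std::vector<int32_t> mismatches(int32_t lo, int32_t hi) {
  std::vector<int32_t> out;
  for (int32_t a = lo; a <= hi; ++a)
    if (original(a) != transformed(a))
      out.push_back(a);
  return out;
}
```

Under these assumptions, `mismatches(-100, 100)` contains exactly the values 0 through 48: for those inputs the subtraction wraps to a large unsigned value, so the original comparison is false while the transformed one is true.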

68434 Commits