llvm-project

Commit Graph

Author	SHA1	Message	Date
Elena Demikhovsky	3dcfbdfa54	AVX-512: Added fp_to_uint and uint_to_fp patterns. llvm-svn: 205754	2014-04-08 07:24:02 +00:00
Andrew Trick	02066f2a4d	Fix a (legacy) PassManager crash that occurs when a ModulePass indirectly requires a function analysis. This bug was reported by Jason Kim. He included a test case here: http://reviews.llvm.org/D3312 llvm-svn: 205753	2014-04-08 03:40:34 +00:00
David Majnemer	c9d2625586	X86: Split the relocation selection up Before, we would have conditional operators where one side of the operator would be of type RelocationTypeAMD64 and the other is of type RelocationTypeI386. GCC would noisly warn with -Wenum-compare diagnostic. Instead, refactor the code so it is more like the X86 ELF object writer. llvm-svn: 205752	2014-04-08 02:15:13 +00:00
Jim Grosbach	e75c048ab9	Tidy up comments a bit. Punctuation, grammar, formatting, etc.. llvm-svn: 205749	2014-04-07 23:47:23 +00:00
Jim Grosbach	75010e7712	ARM64: Range based for loop in ARM64PromoteConstant pass llvm-svn: 205748	2014-04-07 23:47:21 +00:00
Jim Grosbach	64a28e70c8	ARM64: Clean up file header comment a bit. llvm-svn: 205747	2014-04-07 23:14:38 +00:00
David Majnemer	a1c861d379	obj2yaml: Use the correct relocation type for different machine types The IO normalizer would essentially lump I386 and AMD64 relocations together. Relocation types with the same numeric value would then get mapped in appropriately. For example: IMAGE_REL_AMD64_ADDR64 and IMAGE_REL_I386_DIR16 both have a numeric value of one. We would see IMAGE_REL_I386_DIR16 in obj2yaml conversions of object files with a machine type of IMAGE_FILE_MACHINE_AMD64. llvm-svn: 205746	2014-04-07 23:12:20 +00:00
Reed Kotler	735da8e015	Reverting commit r205628 due to mips64 issues. llvm-svn: 205741	2014-04-07 22:11:40 +00:00
Andrew Trick	8d007bb5d4	Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing up compile time. Fixes PR16365 - Extremely slow compilation in -O1 and -O2. The SD scheduler has a quadratic implementation of load clustering which absolutely blows up compile time for large blocks with constant pool loads. The MI scheduler has a better implementation of load clustering. However, we have not done the work yet to completely eliminate the SD scheduler. Some benchmarks still seem to benefit from early load clustering, although maybe by chance. As an intermediate term fix, I just put a nice limit on the number of DAG users to search before finding a match. With this limit there are no binary differences in the LLVM test suite, and the PR16365 test case does not suffer any compile time impact from this routine. llvm-svn: 205738	2014-04-07 21:29:22 +00:00
Tom Stellard	204e61bbdf	R600/SI: Handle INSERT_SUBREG in SIFixSGPRCopies llvm-svn: 205732	2014-04-07 19:45:45 +00:00
Tom Stellard	50122a5890	R600: Match 24-bit arithmetic patterns in a Target DAGCombine Moving these patterns from TableGen files to PerformDAGCombine() should allow us to generate better code by eliminating unnecessary shifts and extensions earlier. This also fixes a bug where the MAD pattern was calling SimplifyDemandedBits with a 24-bit mask on the first operand even when the full pattern wasn't being matched. This occasionally resulted in some instructions being incorrectly deleted from the program. v2: - Fix bug with 64-bit mul llvm-svn: 205731	2014-04-07 19:45:41 +00:00
Tom Stellard	3cbe014027	R600: Replace dyn_cast + assert with cast llvm-svn: 205730	2014-04-07 19:31:13 +00:00
Matt Arsenault	4be76e99fe	Use std::swap llvm-svn: 205723	2014-04-07 16:44:26 +00:00
Matt Arsenault	7939acd7fa	Use .data() instead of &x[0] llvm-svn: 205722	2014-04-07 16:44:24 +00:00
Eric Christopher	484182779d	Invert the option to enable debug info verification. No functional change outside of the command line to enable it. llvm-svn: 205713	2014-04-07 13:55:21 +00:00
Eric Christopher	beb2cd6b7c	Handle vlas during inline cost computation if they'll be turned into a constant size alloca by inlining. Ran a run over the testsuite, no results out of the noise, fixes the testcase in the PR. PR19115. llvm-svn: 205710	2014-04-07 13:36:21 +00:00
Eric Christopher	fcd222aa47	Add NDEBUG markers around debug only function. llvm-svn: 205706	2014-04-07 12:46:30 +00:00
Eric Christopher	6ab6a0648b	Add debug location information to the vectorizer debug statements. Patch by Zinovy Nis. llvm-svn: 205705	2014-04-07 12:32:17 +00:00
Craig Topper	c10719f55d	[C++11] Make use of 'nullptr' in the Support library. llvm-svn: 205697	2014-04-07 04:17:22 +00:00
Elena Demikhovsky	d8efd89b20	Changes in IntelJITEventListener - By Arch Robinson - take->release: LLVM has moved to C++11. MockWrapper became an instance of unique_ptr. - method symbol_iterator::increment disappeared recently, in this revision: r200442 \| rafael \| 2014-01-29 20:49:50 -0600 (Wed, 29 Jan 2014) \| 9 lines Simplify the handling of iterators in ObjectFile. None of the object file formats reported error on iterator increment. In retrospect, that is not too surprising: no object format stores symbols or sections in a linked list or other structure that requires chasing pointers. As a consequence, all error checking can be done on begin() and end(). This reduces the text segment of bin/llvm-readobj in my machine from 521233 to 518526 bytes. My change mimics the change that the revision made to lib/DebugInfo/DWARFContext.cpp . - const_cast: Shut up a warning from gcc. I ran unittests/ExecutionEngine/JIT/Debug+Asserts/JITTests to make sure it worked. - Arch llvm-svn: 205689	2014-04-06 11:08:33 +00:00
David Blaikie	2a40c14d98	DebugInfo: Support namespace aliases as DW_TAG_imported_declaration instead of DW_TAG_imported_module I really should read the spec more often (and test GCC more often too). I just assumed that namespace aliases would be the same as using directives, except with a name. But apparently that's not how the DWARF standards suggests they be implemented. DWARF4 provides an example and other non-normative text suggesting that namespace aliases be implemented by named imported declarations intsead of named imported modules. So be it. llvm-svn: 205685	2014-04-06 06:29:01 +00:00
Argyrios Kyrtzidis	44ec0a7d76	[Support] Modify LockFileManager::waitForUnlock() to return info about how the lock was released. llvm-svn: 205683	2014-04-06 03:19:31 +00:00
David Blaikie	b38ac1f7ee	Remove unused parameter Also update a few null pointers in this function to be consistent with new null pointers being added. Patch by Robert Matusewicz! Differential Revision: http://reviews.llvm.org/D3123 llvm-svn: 205682	2014-04-05 23:33:25 +00:00
Saleem Abdulrasool	efa31a9831	AsmParser: add a warning for compatibility parsing This adds a warning when linker_private or linker_private_weak is provided and we handle it in a compatible manner. Suggested by Chris Lattner! llvm-svn: 205681	2014-04-05 22:42:53 +00:00
David Blaikie	2f7711242a	MachineInstr: introduce explicit_operands and implicit_operands ranges Makes iteration over implicit and explicit machine operands more explicit (har har). Insipired by code review discussion for r205565. llvm-svn: 205680	2014-04-05 22:42:04 +00:00
Saleem Abdulrasool	dd979e6457	ARM: consolidate MachO checks for ARM asm parser This consolidates the duplicated MachO checks in the directive parsing for various directives that are unsupported for Mach-O. The error message change is unimportant as this restores the behaviour to that prior to the addition of the new directive handling. Furthermore, use a more direct check for MachO targeting rather than an indirect feature check of the assembler. Also simplify the test execution command to avoid temporary files. Further more, perform the check in both object and assembly emission. Whether all non-applicable directives are handled is another question. .fnstart is marked as being unsupported, however, the complementary .fnend is not. The additional unwinding directives are also still honoured. This change does not change that, though, it would be good to validate and mark them as being unsupported if they are unsupported for the MachO emission. llvm-svn: 205678	2014-04-05 22:09:51 +00:00
David Blaikie	857497b9c6	Simplify compression API by compressing into a SmallVector rather than a MemoryBuffer This is the other half of r205676. llvm-svn: 205677	2014-04-05 21:53:04 +00:00
David Blaikie	a505f2479e	Simplify compression API by decompressing into a SmallVector rather than a MemoryBuffer This avoids an extra copy during decompression and avoids the use of MemoryBuffer which is a weirdly esoteric device that includes unrelated concepts like "file name" (its rather generic name is a bit misleading). Similar refactoring of zlib::compress coming up. llvm-svn: 205676	2014-04-05 21:26:44 +00:00
Saleem Abdulrasool	c12813576c	AsmParser: restore LLVM IR compatibility for linker_private{,_weak} This restores the linker_private and linker_private_weak lexemes to permit translation of the deprecated lexmes. The behaviour is identical to the bitcode handling: linker_private and linker_private_weak are handled as if private had been specified. This enables compatibility with IR generated by LLVM 3.4. Reported on IRC by ki9a! llvm-svn: 205675	2014-04-05 20:51:58 +00:00
David Blaikie	6425696818	Fixing typo. Differential Revision: http://reviews.llvm.org/D3154 llvm-svn: 205674	2014-04-05 20:30:31 +00:00
Hal Finkel	41e9b1d559	[PowerPC] Remove unused TM member variable to unbreak build Fix "error: private field 'TM' is not used [-Werror,-Wunused-private-field]" llvm-svn: 205660	2014-04-05 00:16:28 +00:00
Hal Finkel	de0b413ec0	[PowerPC] Adjust load/store costs in PPCTTI This provides more realistic costs for the insert/extractelement instructions (which are load/store pairs), accounts for the cheap unaligned Altivec load sequence, and for unaligned VSX load/stores. Bad news: MultiSource/Applications/sgefa/sgefa - 35% slowdown (this will require more investigation) SingleSource/Benchmarks/McGill/queens - 20% slowdown (we no longer vectorize this, but it was a constant store that was scalarized) MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 - 2% slowdown Good news: SingleSource/Benchmarks/Shootout/ary3 - 54% speedup SingleSource/Benchmarks/Shootout-C++/ary - 40% speedup MultiSource/Benchmarks/Ptrdist/ks/ks - 35% speedup MultiSource/Benchmarks/FreeBench/neural/neural - 30% speedup MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt - 20% speedup Unfortunately, estimating the costs of the stack-based scalarization sequences is hard, and adjusting these costs is like a game of whac-a-mole :( I'll revisit this again after we have better codegen for vector extloads and truncstores and unaligned load/stores. llvm-svn: 205658	2014-04-04 23:51:18 +00:00
Hal Finkel	b1308d525c	[PowerPC] PPCTTI Cleanup Remove the declaration of an unimplemented function. llvm-svn: 205657	2014-04-04 23:51:11 +00:00
Andrew Trick	326c1f6804	Minor change to StackMapLiveness DEBUG output. llvm-svn: 205656	2014-04-04 23:49:35 +00:00
Matt Arsenault	cf6f688a40	Add DAG parameter to ComputeNumSignBitsForTargetNode This way, you can check the number of sign bits in the operands. The depth parameter it already has is pretty useless without this. llvm-svn: 205649	2014-04-04 20:13:13 +00:00
Matt Arsenault	5e1e4316c4	Fix tabs llvm-svn: 205648	2014-04-04 20:13:08 +00:00
Kai Nacke	6da86e8529	[mips] Add Octeon cnMips instructions seqi/snei and v3mulu/vmm0/vmulu. This patch adds the Octeon cnMips instructions seqi/snei and v3mulu/vmm0/vmulu. It is only for the assembler. Test case is included. Reviewed by: Daniel.Sanders@imgtec.com llvm-svn: 205631	2014-04-04 16:21:59 +00:00
Hal Finkel	fbf7e2a1a1	[PowerPC] Add a full condition code register to make the "cc" clobber work gcc inline asm supports specifying "cc" as a clobber of all condition registers. Add just enough modeling of the full register to make this work. Fixed PR19326. llvm-svn: 205630	2014-04-04 15:15:57 +00:00
Daniel Sanders	d4341a0ad7	[mips] abs.[ds], and neg.[ds] should be allowed regardless of -enable-no-nans-fp-math Summary: They behave in accordance with the Has2008 and ABS2008 configuration bits of the processor which are used to select between the 1985 and 2008 versions of IEEE 754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid operation exceptions when given NaN), in 2008 mode they are non-arithmetic (i.e. they are copies). nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because the ISA spec does not explicitly state that they obey Has2008 and ABS2008. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3274 llvm-svn: 205628	2014-04-04 14:52:54 +00:00
Tim Northover	0e5eaae1cb	DAGLegalize: add last-ditch type-legalization for VSELECT. When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still need scalarising because of that v1i1. There was no code to do this though. AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occuring in most such situations, but there are edge cases that they miss. This adds a legalization to cope with that. llvm-svn: 205626	2014-04-04 14:49:30 +00:00
Tim Northover	07a8ff4892	ARM64: handle v1i1 types arising from setcc properly. There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381. Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly. Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately). Should fix PR19335. llvm-svn: 205625	2014-04-04 14:49:21 +00:00
Stepan Dyatkovskiy	3f1fa3d545	Fix for PR18921 (LDRD/STRD part):: Removed "GNU Assembler extension (compatibility)" definitions from ARMInstrInfo.td Fixed ARMAsmParser::ParseInstruction GNU compatability branch, so it also works for thumb mode from now. Added new tests. llvm-svn: 205622	2014-04-04 10:17:56 +00:00
Tim Northover	85d6a16c46	ARM64: use regalloc-friendly COPY_TO_REGCLASS for bitcasts The previous patterns directly inserted FMOV or INS instructions into the DAG for scalar_to_vector & bitconvert patterns. This is horribly inefficient and can generated lots more GPR <-> FPR register traffic than necessary. It's much better to emit instructions the register allocator understands so it can coalesce the copies when appropriate. It led to at least one ISelLowering hack to avoid the problems, which was incorrect for v1i64 (FPR64 has no dsub). It can now be removed entirely. This should also fix PR19331. llvm-svn: 205616	2014-04-04 09:03:09 +00:00
Tim Northover	1e4f2c5e5f	ARM64: add 128-bit MLA operations to the custom selection code. Without this change, the llvm_unreachable kicked in. The code pattern being spotted is rather non-canonical for 128-bit MLAs, but it can happen and there's no point in generating sub-optimal code for it just because it looks odd. Should fix PR19332. llvm-svn: 205615	2014-04-04 09:03:02 +00:00
Stepan Dyatkovskiy	a09bd2379c	Fixed register class in STRD instruction for Thumb2 mode. llvm-svn: 205612	2014-04-04 08:14:13 +00:00
Craig Topper	840beec2d0	Make consistent use of MCPhysReg instead of uint16_t throughout the tree. llvm-svn: 205610	2014-04-04 05:16:06 +00:00
Jim Grosbach	537f3ed838	ARM: Range based for-loop over block predecessors. No functional change. llvm-svn: 205604	2014-04-04 02:11:03 +00:00
Jim Grosbach	f92e8f5a8b	ARM: Use range-based for loops in frame lowering. No functional change. llvm-svn: 205602	2014-04-04 02:10:55 +00:00
Quentin Colombet	96bd2a1490	[RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance recoloring cut-offs are encountered and register allocation failed. This is related to PR18747 Patch by MAYUR PANDEY <mayur.p@samsung.com>. llvm-svn: 205601	2014-04-04 02:05:21 +00:00
Quentin Colombet	9c816f39ad	Revert r205599, the commit was not intended to have so many changes llvm-svn: 205600	2014-04-04 02:02:49 +00:00
Quentin Colombet	7ee4e79dec	[RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance recoloring cut-offs are hit. This is related to PR18747. Patch by MAYUR PANDEY <mayur.p@samsung.com> llvm-svn: 205599	2014-04-04 01:58:57 +00:00
Saleem Abdulrasool	a7a8a3e3ee	MIPS: remove vim swap file llvm-svn: 205595	2014-04-04 01:19:54 +00:00
Rafael Espindola	7247546ba3	Add an assert that this is only used with .o files. I am not sure how to get a relocation in a .dylib, but this function would return the wrong value if passed one. llvm-svn: 205592	2014-04-04 00:31:12 +00:00
Rafael Espindola	7e91bc9e32	Implement getRelocationAddress for MachO and ET_REL elf files. With that, fix the symbolizer to work with any ELF file. llvm-svn: 205588	2014-04-03 23:54:35 +00:00
Rafael Espindola	128b8111d7	Implement macho relocation iterators with section number + relocation number. This will make it possible to implement getRelocationAddress. llvm-svn: 205587	2014-04-03 23:51:28 +00:00
Jim Grosbach	b8bd4a5e2a	Tidy up. Space before ':' in range-based for loops. llvm-svn: 205585	2014-04-03 23:43:26 +00:00
Jim Grosbach	bb1af943bb	Tidy up. 80 columns. llvm-svn: 205584	2014-04-03 23:43:22 +00:00
Jim Grosbach	1a59711505	Tidy up. Trailing whitespace. llvm-svn: 205583	2014-04-03 23:43:18 +00:00
Jim Grosbach	e04eb1dc12	Fix typo. llvm-svn: 205582	2014-04-03 23:43:12 +00:00
Rafael Espindola	0cc9ba116f	Fix llvm-objdump crash. llvm-svn: 205581	2014-04-03 23:20:02 +00:00
Rafael Espindola	77314aa014	Remove section_rel_empty. Just compare begin() and end() instead. llvm-svn: 205577	2014-04-03 22:42:22 +00:00
Eli Bendersky	bbef172f19	Optimize away unnecessary address casts. Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns. Patch by Jingyue Wu. llvm-svn: 205571	2014-04-03 21:18:25 +00:00
Lang Hames	cb74fa696b	[ARM64] Teach the ARM64DeadRegisterDefinition pass to respect implicit-defs. When rematerializing through truncates, the coalescer may produce instructions with dead defs, but live implicit-defs of subregs: E.g. %X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32 These instructions are live, and their definitions should not be rewritten. Fixes <rdar://problem/16492408> llvm-svn: 205565	2014-04-03 20:51:08 +00:00
Tom Stellard	a0150cb6a9	R600: Correct opcode for BFE_INT Acording to AMD documentation, the correct opcode for BFE_INT is 0x5, not 0x4 Fixes Arithm/Absdiff.Mat/3 OpenCV test Patch by: Bruno Jiménez llvm-svn: 205562	2014-04-03 20:19:29 +00:00
Tom Stellard	7ed0b5235a	R600/SI: Lower 64-bit immediates using REG_SEQUENCE llvm-svn: 205561	2014-04-03 20:19:27 +00:00
Eli Bendersky	9966b26dac	Fix PR19270 - type mismatch caused by invalid optimization. Patch by Jingyue Wu. llvm-svn: 205547	2014-04-03 17:51:58 +00:00
Tim Northover	01b4aa9437	ARM: tell LLVM about zext properties of ldrexb/ldrexh Implementing this via ComputeMaskedBits has two advantages: + It actually works. DAGISel doesn't deal with the chains properly in the previous pattern-based solution, so they never trigger. + The information can be used in other DAG combines, as well as the trivial "get rid of truncs". For example if the trunc is in a different basic block. rdar://problem/16227836 llvm-svn: 205540	2014-04-03 15:10:35 +00:00
Daniel Sanders	442f1a12f1	[mips] Implement ehb, ssnop, and pause in assembler Summary: Add negative tests for pause Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3246 llvm-svn: 205537	2014-04-03 13:21:51 +00:00
Tim Northover	70450c59a4	ARM: skip cmpxchg failure barrier if ordering is monotonic. The terminal barrier of a cmpxchg expansion will be either Acquire or SequentiallyConsistent. In either case it can be skipped if the operation has Monotonic requirements on failure. rdar://problem/15996804 llvm-svn: 205535	2014-04-03 13:06:54 +00:00
Zoran Jovanovic	cabf0f41e0	Implementation of 16-bit microMIPS instructions MFHI and MFLO. Differential Revision: http://llvm-reviews.chandlerc.com/D3141 llvm-svn: 205532	2014-04-03 12:47:34 +00:00
Daniel Sanders	f7b32291ad	[mips] Add initial (experimental) MIPS-IV support. Summary: Adds the 'mips4' processor and a simple test of the ELF e_flags. Patch by David Chisnall His work was sponsored by: DARPA, AFRL I made one small change to the testcase so that it uses mips64-unknown-linux instead of mips4-unknown-linux. This patch indirectly adds FeatureCondMov to FeatureMips64. This is ok because it's supposed to be there anyway and it turns out that FeatureCondMov is not a predicate of any instructions at the moment (this is a bug that hasn't been noticed because there are no targets without the conditional move instructions yet). CC: theraven Differential Revision: http://llvm-reviews.chandlerc.com/D3244 llvm-svn: 205530	2014-04-03 12:13:36 +00:00
Eric Christopher	bfb38badc1	Fix for PR 19261: llc doesn't generate nodes for unconditional fall-through branches for targets without FastISel implementation (X86 has it, but can be disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for line 4 in the following testcase 1: void foo(int i){ 2: switch(i){ 3: default: 4: break; 5: } 6: return; 7: } there is no corresponding line in .debug_line section, and a debugger cannot set a breakpoint at line 4. Fix this by always emitting a branch when we're not optimizing and add a testcase to ensure that there's code on every line we'd want to break. Patch by Daniil Fukalov. llvm-svn: 205529	2014-04-03 12:11:51 +00:00
Zoran Jovanovic	842f20ef0b	MicroMIPS specific little endian fixup data byte ordering. Differential Revision: http://llvm-reviews.chandlerc.com/D3245 llvm-svn: 205528	2014-04-03 12:01:01 +00:00
Tim Northover	c882eb0723	ARM: expand atomic ldrex/strex loops in IR The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the new value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). llvm-svn: 205525	2014-04-03 11:44:58 +00:00
Stepan Dyatkovskiy	6207a4dadc	PR19320: The trouble as in ARMAsmParser, in ParseInstruction method. It assumes that ARM::R12 + 1 == ARM::SP. It is wrong, since ARM::<Register> codes are generated by tablegen and actually could be any random numbers. llvm-svn: 205524	2014-04-03 11:29:15 +00:00
Silviu Baranga	a3106e6847	[ARM] When generating a vpaddl node the input lane type is not always the type of the add operation since extract_vector_elt can perform an extend operation. Get the input lane type from the vector on which we're performing the vpaddl operation on and extend or truncate it to the output type of the original add node. llvm-svn: 205523	2014-04-03 10:44:27 +00:00
Sasa Stankovic	06c4780311	[mips] Extend MipsMCExpr class to handle %higher(sym1 - sym2 + const) and %highest(sym1 - sym2 + const) relocations. Remove "ABS_" from VK_Mips_HI and VK_Mips_LO enums in MipsMCExpr, to be consistent with VK_Mips_HIGHER and VK_Mips_HIGHEST. This change also deletes test file test/MC/Mips/higher_highest.ll and moves its CHECK's to the new test file test/MC/Mips/higher-highest-addressing.s. The deleted file tests that R_MIPS_HIGHER and R_MIPS_HIGHEST relocations are emitted in the .o file. Since it uses -force-mips-long-branch option, it was created when MipsLongBranch's implementation was emitting R_MIPS_HIGHER and R_MIPS_HIGHEST relocations in the .o file. It was disabled when MipsLongBranch started to directly calculate offsets. Differential Revision: http://llvm-reviews.chandlerc.com/D3230 llvm-svn: 205522	2014-04-03 10:37:45 +00:00
Tim Northover	2ad88d3aab	ARM64: always use i64 for the RHS of shift operations Switching between i32 and i64 based on the LHS type is a good idea in theory, but pre-legalisation uses i64 regardless of our choice, leading to potential ISel errors. Should fix PR19294. llvm-svn: 205519	2014-04-03 09:26:16 +00:00
Oliver Stannard	92e0fc0484	ARM: Use __STACK_LIMIT symbol for segmented stacks We cannot use STACK_LIMIT, as it is not reserved for the compiler by the C spec. llvm-svn: 205516	2014-04-03 08:45:16 +00:00
Tim Northover	c7c6a93704	ARM64: don't generate __sincos_stret calls unless on MachO This should fix PR19314. llvm-svn: 205514	2014-04-03 07:06:13 +00:00
David Blaikie	12e00fc649	DebugInfo: Use a 64 bit type for the subrange While we were encoding 64 bit values (data8) in the subrange itself, using a 32 bit type for the subrange was still confusing the gdb. Oh, and make it unsigned too. As the comment points out, this could be pushed into the frontend so that it would be 32 or 64 bit as appropriate, etc. llvm-svn: 205512	2014-04-03 06:28:20 +00:00
Lang Hames	3c0dc2a99d	[CodeGen] Fix peephole optimizer bug introduced in r205481. Fixes PR19318. I should have read that comment a little more carefully. ;) Regression test in the works, committing in the mean time to un-break people. llvm-svn: 205511	2014-04-03 05:03:20 +00:00
Rafael Espindola	895ff83234	Implement get getSymbolFileOffset with getSymbolAddress. This has the following advantages: * Less code. * The old ELF implementation was wrong for non-relocatable objects. * The old ELF implementation (and I think MachO) was wrong for thumb. No current testcase since this is only used from MCJIT and it only uses relocatable objects and I don't think it supports thumb yet. llvm-svn: 205508	2014-04-03 03:13:33 +00:00
Rafael Espindola	a2782f6e77	Remove getSymbolValue. All existing users explicitly ask for an address or a file offset. llvm-svn: 205503	2014-04-03 02:32:47 +00:00
Juergen Ributzka	c81000b8e9	Revert "[Constant Hoisting] Lazily compute the idom and cache the result." This code is no longer usefull, because we only compute and use the IDom once. There is no benefit in caching it anymore. llvm-svn: 205498	2014-04-03 01:38:47 +00:00
Hal Finkel	6fd19ab35e	Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), then legalization will scalarize the load (or store) resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this. Between this, and r205487, PowerPC on the P7 with VSX enabled shows: MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup SingleSource/UnitTests/Vectorizer/gcc-loops 28% speedup (some of these are new; some of these, such as PAQ8p, just reverse regressions that VSX support would trigger) llvm-svn: 205495	2014-04-03 00:53:59 +00:00
Rafael Espindola	b4865d698b	Revert "Fix a nomenclature error in llvm-nm." This reverts commit r205479. It turns out that nm does use addresses, it is just that every reasonable relocatable ELF object has sections with address 0. I have no idea if those exist in reality, but it at least it shows that llvm-nm should use the name address. The added test was includes an unusual .o file with non 0 section addresses. I created it by hacking ELFObjectWriter.cpp. Really sorry for the churn. llvm-svn: 205493	2014-04-03 00:19:35 +00:00
Lang Hames	c59a2d0529	[X86] As per suggestion from Craig Topper and Hal Finkel, override TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather than abusing commuteInstruction. Thanks very much for the suggestion guys! llvm-svn: 205489	2014-04-02 23:57:49 +00:00
Hal Finkel	55312debee	Fix multi-register costs in BasicTTI::getCastInstrCost For an cast (extension, etc.), the currently logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type is changing. For example, <8 x i16> being sign extended by <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two output v4i32 values are required compared to the single v8i16 input value, and without custom logic in the target, this conversion will scalarize. llvm-svn: 205487	2014-04-02 23:18:54 +00:00
Lang Hames	5dc14bd54c	[CodeGen] Teach the peephole optimizer to remember (and exploit) all folding opportunities in the current basic block, rather than just the last one seen. <rdar://problem/16478629> llvm-svn: 205481	2014-04-02 22:59:58 +00:00
Rafael Espindola	af9129468e	Fix a nomenclature error in llvm-nm. What llvm-nm prints depends on the file format. On ELF for example, if the file is relocatable, it prints offsets. If it is not, it prints addresses. Since it doesn't really need to care what it is that it is printing, use the generic term value. Fix or implement getSymbolValue to keep llvm-nm working. llvm-svn: 205479	2014-04-02 22:52:46 +00:00
Hal Finkel	f823380a44	[PowerPC] Make PPCTTI::getMemoryOpCost call BasicTTI::getMemoryOpCost PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to calculate the base cost of the memory access, and then adjust on top of that. There is no functionality change from this modification, but it will become important so that PPCTTI can take advantage of scalarization information for which BasicTTI::getMemoryOpCost will account in the near future. llvm-svn: 205476	2014-04-02 22:43:49 +00:00
Juergen Ributzka	fcd2e94ecc	Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737). llvm-svn: 205474	2014-04-02 22:21:01 +00:00
Lang Hames	c2c751312e	[X86] Make the VFMA*231 variants commutable and relax the alignment restrictions on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load from unaligned memory. Testcase to follow, along with related patch. <rdar://problem/16478629> llvm-svn: 205472	2014-04-02 22:06:16 +00:00
Duncan P. N. Exon Smith	4680f40d28	Revert "Reapply "LTO: add API to set strategy for -internalize"" This reverts commit r199244. Conflicts: include/llvm-c/lto.h include/llvm/LTO/LTOCodeGenerator.h lib/LTO/LTOCodeGenerator.cpp llvm-svn: 205471	2014-04-02 22:05:57 +00:00
Juergen Ributzka	27435b3b8a	Add comments and test case for [X86TTI] Make constant base pointers for GetElementPtr opaque (r204739). llvm-svn: 205468	2014-04-02 21:45:36 +00:00
Saleem Abdulrasool	cd1308296e	ARM: update subtarget information for Windows on ARM Update the subtarget information for Windows on ARM. This enables using the MC layer to target Windows on ARM. llvm-svn: 205459	2014-04-02 20:32:05 +00:00
Jim Grosbach	2a2459f365	Make a few more range-based loops use explicit types. No functional change. llvm-svn: 205458	2014-04-02 20:21:22 +00:00
Tom Stellard	36a031870b	TargetLibraryInfo: Disable memcpy and memset on R600 There are no implementations of these for R600. llvm-svn: 205455	2014-04-02 19:53:29 +00:00
Jim Grosbach	36c4953348	Simplify resolveFrameIndex() signature. Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway. llvm-svn: 205453	2014-04-02 19:28:18 +00:00
Jim Grosbach	4a1a9ce5e6	ARM: cortex-m0 doesn't support unaligned memory access. Unlike other v6+ processors, cortex-m0 never supports unaligned accesses. From the v6m ARM ARM: "A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned access occurs." rdar://16491560 llvm-svn: 205452	2014-04-02 19:28:13 +00:00
Jim Grosbach	df1e05bb8a	Make some range based loop types more explicit. No functional change, but more readable code. llvm-svn: 205451	2014-04-02 19:28:08 +00:00
Kai Nacke	13673ac704	[mips] Add more Octeon cnMips instructions Adds the instructions ext/ext32/cins/cins32. It also changes pop/dpop to accept the two operand version and adds a simple pattern to generate baddu. Tests for the two operand versions (including baddu/dmul/dpop/pop) and the code generation pattern for baddu are included. Reviewed by: Daniel.Sanders@imgtec.com llvm-svn: 205449	2014-04-02 18:40:43 +00:00
Jim Grosbach	20b0790df7	[C++11,ARM64] Range based for and explicit 'override' in STP cleanup. No functional change intended. llvm-svn: 205446	2014-04-02 18:00:59 +00:00
Jim Grosbach	05abd709f3	[C++11,ARM64] Range based for loops in constant promotion. No functional change intended. llvm-svn: 205445	2014-04-02 18:00:56 +00:00
Jim Grosbach	7dc9edeaa5	[C++11,ARM64] Range based for loops in load/store pair optimizer. No functional change intended. llvm-svn: 205444	2014-04-02 18:00:53 +00:00
Jim Grosbach	020e657790	[C++11,ARM64] Range based for loops in target lowering. No functional change intended. llvm-svn: 205443	2014-04-02 18:00:51 +00:00
Jim Grosbach	91f1f47751	[C++11,ARM64] Range based for loops in frame lowering. No functional change intended. llvm-svn: 205442	2014-04-02 18:00:49 +00:00
Jim Grosbach	f39d752b03	[C++11,ARM64] Range based for loops in pseudo expansion. No functional change intended. llvm-svn: 205441	2014-04-02 18:00:46 +00:00
Jim Grosbach	673825ebac	[C++11,ARM64] Range based for loops for LOH No functional change intended. llvm-svn: 205440	2014-04-02 18:00:44 +00:00
Jim Grosbach	2539c3d07a	[C++11,ARM64] Range based for loops TLS cleanup. No functional change intended. llvm-svn: 205439	2014-04-02 18:00:41 +00:00
Jim Grosbach	0d0c5a614a	[C++11,ARM64] Range based for loops in branch relaxation. No functional change intended. llvm-svn: 205438	2014-04-02 18:00:39 +00:00
Jim Grosbach	1c762ca9bd	[C++11,ARM64] Range based for loops in address type promotion. No functional change intended. llvm-svn: 205437	2014-04-02 18:00:36 +00:00
Quentin Colombet	7bf9d8cd13	[ARM64][CollectLOH] Remove the link to the radar from the comments. llvm-svn: 205435	2014-04-02 16:40:49 +00:00
Oliver Stannard	b14c625111	ARM: Add support for segmented stacks Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430	2014-04-02 16:10:33 +00:00
Adrian Prantl	a731cf0018	clarify comment llvm-svn: 205429	2014-04-02 15:49:45 +00:00
Tim Northover	6d69168ffd	ARM64: use GOT for weak symbols & PIC. Weak symbols cannot use the small code model's usual ADRP sequences since the instruction simply may not be able to encode a value of 0. This redirects them to use the GOT, which hopefully linkers are able to cope with even in the static relocation model. llvm-svn: 205426	2014-04-02 14:39:11 +00:00
Tim Northover	0d80f70530	ARM64: fix lowering of fp128 fptosi/fptoui We were creating libcall nodes that returned an MVT::f128, when these particular operations actually return an int of some stripe. llvm-svn: 205425	2014-04-02 14:39:07 +00:00
Tim Northover	670df3d937	SLPVectorizer: compare entire intrinsic for SLP compatibility. Some Intrinsics are overloaded to the extent that return type equality (all that's been checked up to now) does not guarantee that the arguments are the same. In these cases SLP vectorizer should not recurse into the operands, which can be achieved by comparing them as "Function *" rather than simply the ID. llvm-svn: 205424	2014-04-02 14:39:02 +00:00
Tim Northover	ebd37ab382	ARM64: make sure first argument to INSERT_SUBVECTOR has right type. Again, coalescing and other optimisations swiftly made the MachineInstrs consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was produced. llvm-svn: 205423	2014-04-02 14:38:58 +00:00
Tim Northover	5e3a484e3b	ARM64: convert fp16 narrowing ISel to pseudo-instruction The previous attempt was fine with optimisations, but was actually rather cavalier with its types. When compiled at -O0, it produced invalid COPY MachineInstrs. llvm-svn: 205422	2014-04-02 14:38:54 +00:00
Job Noorman	f7da105f39	Mark FPB as a reserved register when needed. llvm-svn: 205421	2014-04-02 13:13:56 +00:00
Rafael Espindola	b1b49789d0	Work around gold bug http://sourceware.org/PR16794 . llvm-svn: 205416	2014-04-02 12:15:20 +00:00
Renato Golin	d93295ea56	Remove duplicated DMB instructions ARM specific optimiztion, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. llvm-svn: 205409	2014-04-02 09:03:43 +00:00
Yaron Keren	2895496852	Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU() and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the four Windows environments in Triple.h. Suggestion by Saleem Abdulrasool! llvm-svn: 205393	2014-04-02 04:27:51 +00:00
Hal Finkel	b0ebdc0f43	[LoopVectorizer] Count dependencies of consecutive pointers as uniforms For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values). For example, the TSVC benchmark function s173 has: ... %3 = add nsw i64 %indvars.iv, 16000 %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3 ... and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently detects collects uniforms. Fixes PR19296. llvm-svn: 205387	2014-04-02 02:34:49 +00:00
David Blaikie	326e1fa13b	Adjust comments regarding non-relocated abbrev offset in debug_info.dwo I'm not sure the comment in the implementation really adds a lot of value (it's clear that we emit zero when no symbol is provided, but it doesn't explain why we would do that). Happy to iterate. llvm-svn: 205386	2014-04-02 02:04:51 +00:00
David Blaikie	94c1d7f174	Split debug_loc and debug_loc.dwo emission into two separate functions Based on code review feedback from Eric Christopher on r204697 llvm-svn: 205385	2014-04-02 01:50:20 +00:00
David Blaikie	0a456de5a2	DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries and an MC Label to refer to them This removes the magic-number-esque code creating/retrieving the same label for a debug_loc entry from two places and removes the last small piece of reusable logic from emitDebugLoc so that there will be less duplication when refactoring it into two functions (one for debug_loc, the other for debug_loc.dwo). llvm-svn: 205382	2014-04-02 01:43:18 +00:00
Quentin Colombet	3c2b13b258	[ARM64][CollectLOH] Add some comments to explain how the LOHs framework works (for the compiler part), since the design document is not available. llvm-svn: 205379	2014-04-02 01:02:28 +00:00
Adrian Prantl	3c5453cb6e	Add a doxygen comment to DebugLocEntry::Merge. llvm-svn: 205374	2014-04-01 23:34:45 +00:00
David Blaikie	6fa9966ee6	DebugLocEntry: Actually merge the loc entry when returning true. Seems we didn't have any test coverage for merging... awesome. So I added some - but hit an llvm-objdump bug while I was there. I'm choosing not to shave that yak right now. Code review feedback/bug catch by Adrian Prantl in r205360. llvm-svn: 205373	2014-04-01 23:19:23 +00:00
David Blaikie	91567b6700	Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocation No test case (this would invoke UB by examining uninitialized members, etc, at best - and this code is apparently untested anyway - I'm about to fix that) Code review feedback from Adrian Prantl on r205360. llvm-svn: 205367	2014-04-01 22:25:09 +00:00
David Blaikie	c2af77b027	Remove unused function DebugLocEntry::isEmpty llvm-svn: 205365	2014-04-01 22:06:18 +00:00
David Blaikie	d306baf572	Refactor out the comparison of the location/value in a DebugLocEntry llvm-svn: 205364	2014-04-01 22:04:07 +00:00
David Blaikie	1275e4f026	DebugInfo: Split DebugLocEntry into its own file. It seems big enough that it deserves its own file - but it is header only, so there's no need for another cpp file, etc. llvm-svn: 205360	2014-04-01 21:49:04 +00:00
Adrian Prantl	6b444c5c8e	Add a comment about the DIDescriptor class hierarchy. llvm-svn: 205358	2014-04-01 21:04:24 +00:00
Adrian Prantl	75ce62acef	DwarfDebug: Prevent DebugLocEntry merging from coalescing two different constants into only the first one. rdar://14874886. llvm-svn: 205357	2014-04-01 21:04:18 +00:00
Hal Finkel	9e0baa6d3a	[PowerPC] Add some missing VSX bitcast patterns llvm-svn: 205352	2014-04-01 19:24:27 +00:00
Yaron Keren	48d68d439a	If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and Environment == Triple::MSVC so it will never be MinGW or Cygwin. llvm-svn: 205349	2014-04-01 18:52:55 +00:00
Hal Finkel	2eed29f3c8	Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup llvm-svn: 205348	2014-04-01 18:50:34 +00:00
Hal Finkel	6386cb8d4d	Add some additional fields to TTI::UnrollingPreferences In preparation for an upcoming commit implementing unrolling preferences for x86, this adds additional fields to the UnrollingPreferences structure: - PartialThreshold and PartialOptSizeThreshold - Like Threshold and OptSizeThreshold, but used when not fully unrolling. These are necessary because we need different thresholds for full unrolling from those used when partially unrolling (the full unrolling thresholds are generally going to be larger). - MaxCount - A cap on the unrolling factor when partially unrolling. This can be used by a target to prevent the unrolled loop from exceeding some resource limit independent of the loop size (such as number of branches). There should be no functionality change for any in-tree targets. llvm-svn: 205347	2014-04-01 18:50:30 +00:00
Hal Finkel	b4e001cc81	Use TopTTI->getGEPCost from within getUserCost The implementation of getUserCost had duplicated (and hard-coded) the default logic in getGEPCost. Instead, it is better to use getGEPCost directly, which limits the default logic to the implementation of one function, and allows targets to override the behavior. No functionality change intended. llvm-svn: 205346	2014-04-01 18:50:06 +00:00
Kai Nacke	af47f60f83	[mips] Add Octeon cnMips instructions mtmX and mtpX Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px". Includes tests. Reviews by: Daniel.Sanders@imgtec.com llvm-svn: 205343	2014-04-01 18:35:26 +00:00
Reid Kleckner	101102711d	Support segmented stacks on Win64 Identical to Win32 method except the GS segment register is used for TLS instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14. llvm-svn: 205342	2014-04-01 18:34:21 +00:00
Yaron Keren	136fe7db46	isTargetWindows() renamed to isTargetKnownWindowsMSVC() to reflect its current functionality. Based on Takumi NAKAMURA suggestion. llvm-svn: 205338	2014-04-01 18:15:34 +00:00
Matt Arsenault	e407ae9846	Make isSetCCEquivalent respect the TargetBooleanContents llvm-svn: 205336	2014-04-01 18:13:26 +00:00
Matt Arsenault	6310c3f667	Add helpers for checking if a value is a target boolean constant. llvm-svn: 205335	2014-04-01 18:13:22 +00:00
David Blaikie	0e84adc621	DebugInfo: Factor out common functionality for rendering debug_loc and debug_loc.dwo location list entries In preparation for refactoring this function into two, one for debug_loc, one for debug_loc.dwo. llvm-svn: 205324	2014-04-01 16:17:41 +00:00
David Blaikie	7f1f8742ea	Cleanup remaining use of removed variable to fix the build llvm-svn: 205323	2014-04-01 16:13:29 +00:00

1 2 3 4 5 ...

68434 Commits