llvm-project

Commit Graph

Author	SHA1	Message	Date
Venkatraman Govindaraju	6f2e08c8e1	[Sparc] Add support for parsing directives in SparcAsmParser. llvm-svn: 202564	2014-03-01 02:18:04 +00:00
Venkatraman Govindaraju	f7eecf80c4	[Sparc] Emit 'restore' instead of 'restore %g0, %g0, %g0'. This improves the readability of the generated code. llvm-svn: 202563	2014-03-01 01:04:26 +00:00
Manman Ren	709c951b42	SpillPlacement: fix a bug in iterate. Inside iterate, we scan backwards then scan forwards in a loop. When iteration is not zero, the last node was just updated so we can skip it. But when iteration is zero, we can't skip the last node. For the testing case, fixing this will save a spill and move register copies from hot path to cold path. llvm-svn: 202557	2014-02-28 23:05:31 +00:00
Reid Kleckner	e6ff5c51e6	Reflow isProfitableToMakeFastCC llvm-svn: 202555	2014-02-28 22:50:08 +00:00
Lang Hames	c083578a14	Jumped the gun with r202551 and broke some bots that weren't yet C++11ified. Reverting until the C++11 switch is complete. llvm-svn: 202554	2014-02-28 22:44:44 +00:00
Lang Hames	525a212379	New PBQP solver, and updates to the PBQP graph. The previous PBQP solver was very robust but consumed a lot of memory, performed a lot of redundant computation, and contained some unnecessarily tight coupling that prevented experimentation with novel solution techniques. This new solver is an attempt to address these shortcomings. Important/interesting changes: 1) The domain-independent PBQP solver class, HeuristicSolverImpl, is gone. It is replaced by a register allocation specific solver, PBQP::RegAlloc::Solver (see RegAllocSolver.h). The optimal reduction rules and the backpropagation algorithm have been extracted into stand-alone functions (see ReductionRules.h), which can be used to build domain specific PBQP solvers. This provides many more opportunities for domain-specific knowledge to inform the PBQP solvers' decisions. In theory this should allow us to generate better solutions. In practice, we can at least test out ideas now. As a side benefit, I believe the new solver is more readable than the old one. 2) The solver type is now a template parameter of the PBQP graph. This allows the graph to notify the solver of any modifications made (e.g. by domain independent rules) without the overhead of a virtual call. It also allows the solver to supply policy information to the graph (see below). 3) Significantly reduced memory overhead. Memory management policy is now an explicit property of the PBQP graph (via the CostAllocator typedef on the graph's solver template argument). Because PBQP graphs for register allocation tend to contain many redundant instances of single values (E.g. the value representing an interference constraint between GPRs), the new RASolver class uses a uniquing scheme. This massively reduces memory consumption for large register allocation problems. For example, looking at the largest interference graph in each of the SPEC2006 benchmarks (the largest graph will always set the memory consumption high-water mark for PBQP), the average memory reduction for the PBQP costs was 400x. That's times, not percent. The highest was 1400x. Yikes. So - this is fixed. "PBQP: No longer feasting upon every last byte of your RAM". Minor details: - Fully C++11'd. Never copy-construct another vector/matrix! - Cute tricks with cost metadata: Metadata that is derived solely from cost matrices/vectors is attached directly to the cost instances themselves. That way if you unique the costs you never have to recompute the metadata. 400x less memory means 400x less cost metadata (re)computation. Special thanks to Arnaud de Grandmaison, who has been the source of much encouragement, and of many very useful test cases. This new solver forms the basis for future work, of which there's plenty to do. I will be adding TODO notes shortly. - Lang. llvm-svn: 202551	2014-02-28 22:25:24 +00:00
Eric Christopher	e587c0853b	Fix >> to be > > for non-c++11. llvm-svn: 202545	2014-02-28 21:37:28 +00:00
Tom Stellard	9b9e926481	R600: Verify all instructions in the AsmPrinter on debug builds Make a call to R600's implementation of verifyInstruction() to check that instructions are only using legal operands. llvm-svn: 202544	2014-02-28 21:36:41 +00:00
Tom Stellard	d61a1c3360	R600/SI: Expand all v16[if]32 operations llvm-svn: 202543	2014-02-28 21:36:37 +00:00
Eric Christopher	961959faec	80-col. llvm-svn: 202541	2014-02-28 21:27:59 +00:00
Eric Christopher	2c3a6dce44	Fix a crasher where when we're attempting to replace a type during the finalization for CGDebugInfo in clang we would RAUW a type and it would result in a corrupted MDNode for an imported declaration. Testcase pending as reducing has been difficult. llvm-svn: 202540	2014-02-28 21:27:57 +00:00
Justin Bogner	02b958422c	CommandLine: Exit successfully for -version and -help Tools that use the CommandLine library currently exit with an error when invoked with -version or -help. This is unusual and non-standard, so we'll fix them to exit successfully instead. I don't expect that anyone relies on the current behaviour, so this should be a fairly safe change. llvm-svn: 202530	2014-02-28 19:08:01 +00:00
Zoran Jovanovic	285cc289e8	Fixed operand of SC microMIPS instruction. llvm-svn: 202526	2014-02-28 18:22:56 +00:00
Zoran Jovanovic	7c6c36d92d	Fixed encoding of SYSCALL microMIPS instruction. llvm-svn: 202523	2014-02-28 18:17:08 +00:00
Zoran Jovanovic	d0a289003d	Revert revision 202518 because of wrong commit message. llvm-svn: 202521	2014-02-28 18:14:16 +00:00
Zoran Jovanovic	9874a2b1ef	Fix operand of SC instruction. llvm-svn: 202518	2014-02-28 18:02:17 +00:00
Evgeniy Stepanov	e3804d4840	X86Operand is extracted into individual header. X86Operand is extracted into its own header because this allows creating an arbitrary memory operand and appending it to an MCInst. It'll be reused in X86 inline assembly instrumentation. Patch by Yuri Gorshenin. llvm-svn: 202496	2014-02-28 12:28:07 +00:00
NAKAMURA Takumi	cdb9fafa71	Reorder Mips/MCTargetDesc/CMakeLists.txt. llvm-svn: 202483	2014-02-28 10:18:21 +00:00
Sasa Stankovic	441880f700	[mips] Add MipsNaClELFStreamer.cpp to CMakeLists.txt. llvm-svn: 202482	2014-02-28 10:14:12 +00:00
Sasa Stankovic	8c5736b921	[mips] Implement NaCl sandboxing of indirect jumps: * Align targets of indirect jumps to instruction bundle boundaries (in MI layer). * Add masking instructions before indirect jumps (in MC layer). Differential Revision: http://llvm-reviews.chandlerc.com/D2847 llvm-svn: 202479	2014-02-28 10:00:38 +00:00
Tobias Grosser	e8d4c9a2c7	Add 'remark' diagnostic type in LLVM A 'remark' is information that is not an error or a warning, but rather some additional information provided to the user. In contrast to a 'note' a 'remark' is an independent diagnostic, whereas a 'note' always depends on another diagnostic. A typical use case for remark nodes is information provided to the user, e.g. information provided by the vectorizer about loops that have been vectorized. llvm-svn: 202474	2014-02-28 09:08:45 +00:00
Hal Finkel	b998915ee1	Swap PPC isel operands to allow for 0-folding The PPC isel instruction can fold 0 into the first operand (thus eliminating the need to materialize a zero-containing register when the 'true' result of the isel is 0). When the isel is fed by a bit register operation that we can invert, do so as part of the bit-register-operation peephole routine. llvm-svn: 202469	2014-02-28 06:11:16 +00:00
Rafael Espindola	a51f0f8367	Now that it is possible, use the mangler in IRObjectFile. A really simple patch marks the end of a lot of yak shaving :-) llvm-svn: 202463	2014-02-28 02:17:23 +00:00
Hal Finkel	5cae2168c7	Trying to unbreak the darwin11 builder The CR bit tracking code broke PPC/Darwin; trying to get it working again... (the darwin11 builder, which defaults to the darwin ABI when running PPC tests, asserted when running test/CodeGen/PowerPC/inverted-bool-compares.ll) llvm-svn: 202459	2014-02-28 01:17:25 +00:00
Hal Finkel	b39a0475c0	Try to unbreak the C++11 build Cannot use negative numbers in case statements without running afoul of -Wc++11-narrowing. llvm-svn: 202455	2014-02-28 00:45:27 +00:00
Hal Finkel	940ab934d4	Add CR-bit tracking to the PowerPC backend for i1 values This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451	2014-02-28 00:27:01 +00:00
Hal Finkel	ab51ecd4fc	Fix visitTRUNCATE for legal i1 values This extract-and-trunc vector optimization cannot work for i1 values as currently implemented, and so I'm disabling this for now for i1 values. In the future, this can be fixed properly. Soon I'll commit support for i1 CR bit tracking in the PowerPC backend, and this will be covered by one of the existing regression tests. llvm-svn: 202449	2014-02-28 00:26:45 +00:00
Andrew Trick	b1531e582f	Provide a target override for the latest regalloc heuristic. This is a temporary workaround for native arm linux builds: PR18996: Changing regalloc order breaks "lencod" on native arm linux builds. llvm-svn: 202433	2014-02-27 21:37:33 +00:00
Roman Divacky	7a9c6549ba	Lower FNEG just like FABS to fneg[ds] and fmov[ds], thus avoiding expensive libcall. Also, Qp_neg is not implemented on at least FreeBSD. This is also what gcc is doing. llvm-svn: 202422	2014-02-27 19:26:29 +00:00
Eric Christopher	8bdab43964	Revert r201751 and solve the const problem a different way - by making the cache mutable. llvm-svn: 202417	2014-02-27 18:36:10 +00:00
Adrian Prantl	7072073cc9	Debug info: Remove ARMAsmPrinter::EmitDwarfRegOp(). AsmPrinter can now scan the register file for sub- and super-registers. No functionality change intended. (Tests are updated because the comments in the assembler output are different.) llvm-svn: 202416	2014-02-27 17:56:08 +00:00
Richard Osborne	521bdf211d	[XCore] Support functions returning more than 4 words. If a function returns a large struct by value return the first 4 words in registers and the rest on the stack in a location reserved by the caller. This is needed to support the xC language which supports functions returning an arbitrary number of return values. This is r202397 reapplied with a fix to avoid an uninitialized read of a member. llvm-svn: 202414	2014-02-27 17:47:54 +00:00
Richard Osborne	f474087f98	[XCore] Make LowerCallResult a static function. No functionality change. This is r202396 reapplied with no changes. llvm-svn: 202413	2014-02-27 17:47:48 +00:00
Rafael Espindola	8837995b52	Remove MCPureStreamer. We moved MCJIT to use native object formats a long time ago and R600 now uses ELF, so it was dead. llvm-svn: 202408	2014-02-27 16:17:34 +00:00
Alexander Kornienko	52a07b819e	Re-apply r200853, which should not crash after Clang plugins were converted to loadable modules in r201256. llvm-svn: 202404	2014-02-27 14:47:37 +00:00
Richard Osborne	527aa5052d	Revert r202396, r202397. These are causing test failures, revert for now. llvm-svn: 202398	2014-02-27 14:24:13 +00:00
Richard Osborne	e82bf0988e	[XCore] Support functions returning more than 4 words. Summary: If a function returns a large struct by value return the first 4 words in registers and the rest on the stack in a location reserved by the caller. This is needed to support the xC language which supports functions returning an arbitrary number of return values. Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2889 llvm-svn: 202397	2014-02-27 14:00:40 +00:00
Richard Osborne	ed7e2ad090	[XCore] Make LowerCallResult a static function. No functionality change. llvm-svn: 202396	2014-02-27 14:00:34 +00:00
Richard Osborne	a283d24ad9	[XCore] Target optimized library function __memcpy_4() Summary: If the src, dst and size of a memcpy are known to be 4 byte aligned we can call __memcpy_4() instead of memcpy(). Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2871 llvm-svn: 202395	2014-02-27 13:39:07 +00:00
Richard Osborne	d6e85018c5	[XCore] Add dag combines for instructions that ignore some input bits. These instructions ignore the high bits of one of their input operands - try and use this to simplify the code. llvm-svn: 202394	2014-02-27 13:20:11 +00:00
Richard Osborne	2d3a2bee41	[XCore] Provide information about known zero bits of resource instructions. llvm-svn: 202393	2014-02-27 13:20:06 +00:00
Kostya Serebryany	cb3f6e164b	[asan] fix a pair of silly typos llvm-svn: 202391	2014-02-27 13:13:59 +00:00
Kostya Serebryany	ec34665de9	[asan] disable asan-detect-invalid-pointer-pair (was enabled by mistake) llvm-svn: 202390	2014-02-27 12:56:20 +00:00
Kostya Serebryany	796f6557bf	[asan] experimental implementation of invalid-pointer-pair detector (finds when two unrelated pointers are compared or subtracted). This implementation has both false positives and false negatives and is not tuned for performance. A bug report for a proper implementation will follow. llvm-svn: 202389	2014-02-27 12:45:36 +00:00
Eric Christopher	a9a1d27677	Don't emit anything into the debug_ranges section if we aren't emitting any ranges - this includes CU ranges where we were previously emitting an end list marker even if we didn't have a list. Testcase includes a test for line table only code emission as the problem was noticed while writing this test. llvm-svn: 202357	2014-02-27 07:44:45 +00:00
Craig Topper	5346e75966	[X86] Fix Uses/Defs lists for INS, OUTS, SCAS, CMPS, LODS llvm-svn: 202348	2014-02-27 05:08:25 +00:00
Craig Topper	40dd6211d5	[X86] Add RAX/EAX/AX Uses/Defs to XCHG RAX/EAX/AX instructions. llvm-svn: 202347	2014-02-27 04:27:00 +00:00
Craig Topper	08301dee46	[X86] Add RAX/EAX/AX/AL Uses/Defs to the absolute memory location move instructions. Patch by Florian Lukas with some additional instructions fixed by me. Fixes PR18975. llvm-svn: 202345	2014-02-27 04:07:57 +00:00
Craig Topper	64d94320f3	Fix odd indentation. llvm-svn: 202342	2014-02-27 03:11:13 +00:00
Ben Langmuir	27a58bf770	Revert "Use StringRef in raw_fd_ostream constructor" This reverts commit r202225, which may cause a performance regression. llvm-svn: 202338	2014-02-27 02:09:10 +00:00
Michel Danzer	9e61c4b6cd	R600/SI: Optimize SI_KILL for constant operands If the SI_KILL operand is constant, we can either clear the exec mask if the operand is negative, or do nothing otherwise. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202337	2014-02-27 01:47:09 +00:00
Michel Danzer	6f273c57db	R600/SI: Allow SI_KILL for geometry shaders Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202336	2014-02-27 01:47:02 +00:00
Eric Christopher	740a833a3b	If we're only emitting line tables for a particular CU then don't add any ranges to the list of ranges for the CU as we don't want to emit them anyway. This ensures that we will still emit ranges if we have a compile unit compiled with only line tables and one compiled with full debug info requested (we'll emit for the one with full debug info). Update testcase metadata accordingly to continue emitting ranges. llvm-svn: 202333	2014-02-27 01:25:00 +00:00
Eric Christopher	75d49db19b	Add a debug info code generation level to the compile unit metadata and update everything accordingly. This can be used to conditionalize the amount of output in the backend based on the amount of debug requested/metadata emission scheme by a front end (e.g. clang). Paired with a commit to clang. llvm-svn: 202332	2014-02-27 01:24:56 +00:00
Adrian Prantl	e31563c4aa	Fix a type error that crept into r202313. llvm-svn: 202317	2014-02-26 23:46:39 +00:00
Eric Christopher	a13839f5ca	Remove unnecessary llvm:: qualification. llvm-svn: 202316	2014-02-26 23:27:16 +00:00
Adrian Prantl	918b9a77ce	Debug info: Refactor AsmPrinter::EmitDwarfRegOp to make the control flow more obvious. llvm-svn: 202313	2014-02-26 23:03:37 +00:00
Matt Arsenault	530dde4386	R600: Remove unnecessary build_vector pattern. It is already fully handled in AMDGPUISelDAGToDAG. llvm-svn: 202312	2014-02-26 23:00:58 +00:00
Andrew Trick	52a00936b4	Add a limit to the heuristic that register allocates instructions in local order. This handles pathological cases in which we see 2x increase in spill code for large blocks (~50k instructions). I don't have a unit test for this behavior. Fixes rdar://16072279. llvm-svn: 202304	2014-02-26 22:07:26 +00:00
Quentin Colombet	85c9e16291	Lower unsigned vsetcc to psubus in certain cases The current approach to lower a vsetult is to flip the sign bit of the operands, swap the operands and then use a (signed) pcmpgt. psubus (unsigned saturating subtract) can be used to emulate a vsetult more efficiently: + case ISD::SETULT: { + // If the comparison is against a constant we can turn this into a + // setule. With psubus, setule does not require a swap. This is + // beneficial because the constant in the register is no longer + // destructed as the destination so it can be hoisted out of a loop. I also enable lowering via psubus in a few other cases where it's clearly beneficial: setule and setuge if minu/maxu cannot be used. rdar://problem/14338765 Patch by Adam Nemet <anemet@apple.com>. llvm-svn: 202301	2014-02-26 21:39:12 +00:00
Aaron Ballman	3d69c5cae4	Silencing an MSVC signed comparison warning. llvm-svn: 202295	2014-02-26 20:22:20 +00:00
Hal Finkel	121caf6313	Fix the aggressive anti-dep breaker's subregister definition handling The aggressive anti-dependency breaker scans instructions, bottom-up, within the scheduling region in order to find opportunities where register renaming can be used to break anti-dependencies. Unfortunately, the aggressive anti-dep breaker was treating a register definition as defining all of that register's aliases (including super registers). This behavior is incorrect when the super register is live and there are other definitions of subregisters of the super register. For example, given the following sequence: %CR2EQ<def> = CROR %CR3UN, %CR3UN<kill> %CR2GT<def> = IMPLICIT_DEF %X4<def> = MFOCRF8 %CR2 the analysis of the first subregister definition would work as expected: Anti: %CR2GT<def> = IMPLICIT_DEF Def Groups: CR2GT=g194->g0(via CR2) Antidep reg: CR2GT (zero group) Use Groups: but the analysis of the second one would not: Anti: %CR2EQ<def> = CROR %CR3UN, %CR3UN<kill> Def Groups: CR2EQ=g195 Antidep reg: CR2EQ Rename Candidates for Group g195: ... because, when processing the %CR2GT<def>, we'd mark all super registers of %CR2GT (%CR2 in this case) as defined. As a result, when processing %CR2EQ<def>, %CR2 no longer appears to be live, and %CR2EQ<def>'s group is not unioned with the %CR2 group. I don't have an in-tree test case for this yet (and even if I did, I don't have a small one). llvm-svn: 202294	2014-02-26 20:20:30 +00:00
Reid Kleckner	22869378d9	GlobalOpt: Apply fastcc to internal x86_thiscallcc functions We should apply fastcc whenever profitable. We can expand this list, but there are lots of conventions with performance implications that we don't want to change. Differential Revision: http://llvm-reviews.chandlerc.com/D2705 llvm-svn: 202293	2014-02-26 19:57:30 +00:00
Nico Rieck	773a57958c	Relax COFF string table check COFF object files with 0 as string table size are currently rejected. This prevents us from reading object files written by tools like cvtres that violate the PECOFF spec and write 0 instead of 4 for the size of an empty string table. llvm-svn: 202292	2014-02-26 19:51:44 +00:00
Rafael Espindola	e8ae0dba52	Fix typo. Thanks to Roman Divacky for noticing it. llvm-svn: 202277	2014-02-26 17:05:38 +00:00
Rafael Espindola	ae593f1563	Compare DataLayout by Value, not by pointer. This fixes spurious warnings in llvm-link about the datalayout not matching. Thanks to Zalman Stern for reporting the bug! llvm-svn: 202276	2014-02-26 17:02:08 +00:00
Rafael Espindola	667fcb839e	Use a sorted array to store the information about a few address spaces. We don't have any test with more than 6 address spaces, so a DenseMap is probably not the correct answer. An unsorted array would also be OK, but we have to sort it for printing anyway. llvm-svn: 202275	2014-02-26 16:58:35 +00:00
Rafael Espindola	5109fcc0ae	Move these functions out of line. A DenseMap lookup is not a simple operation. llvm-svn: 202274	2014-02-26 16:49:40 +00:00
Andrew Trick	429e9edd08	Fix PR18165: LSR must avoid scaling factors that exceed the limit on truncated use. Patch by Michael Zolotukhin! llvm-svn: 202273	2014-02-26 16:31:56 +00:00
NAKAMURA Takumi	4ca51b9ace	[CMake] BUILD_SHARED_LIBS: Fixup for r202261: Give PUBLIC to system_libs in LLVMSupport. llvm-svn: 202263	2014-02-26 12:18:55 +00:00
Artyom Skrobov	1a6cd1d912	ARMv8 IfConversion must skip narrow instructions that a) define CPSR and b) wouldn't affect CPSR in an IT block llvm-svn: 202257	2014-02-26 11:27:28 +00:00
Daniel Sanders	737285e02d	[mips] Treat -mcpu=generic the same way as an empty CPU string. Summary: This should fix the MCJIT unit tests that were broken by r201792 on the MIPS buildbot. MIPS currently uses the default implementation of sys::getHostCPUName() which always returns "generic". For now, we will accept "generic" and coerce it to "mips32" or "mips64" depending on the target architecture like we do for empty CPU names. Reviewers: jacksprat, matheusalmeida Reviewed By: jacksprat Differential Revision: http://llvm-reviews.chandlerc.com/D2878 llvm-svn: 202253	2014-02-26 10:20:15 +00:00
Chandler Carruth	dfb2efd0da	[SROA] Use the correct index integer size in GEPs through non-default address spaces. This isn't really a correctness issue (the values are truncated) but it's much cleaner. Patch by Matt Arsenault! llvm-svn: 202252	2014-02-26 10:08:16 +00:00
Chandler Carruth	286d87ed38	[SROA] Teach SROA how to handle pointers from address spaces other than the default. Based on the patch by Matt Arsenault, D1764! I switched one place to use the more direct pointer type to compute the desired address space, and I reworked the memcpy rewriting section to reflect significant refactorings that this patch helped inspire. Thanks to several of the folks who helped review and improve the patch as well. llvm-svn: 202247	2014-02-26 08:25:02 +00:00
Chandler Carruth	aa72b93ae7	[SROA] Split the alignment computation complete for the memcpy rewriting to work independently for the slice side and the other side. This allows us to only compute the minimum of the two when we actually rewrite to a memcpy that needs to take the minimum, and preserve higher alignment for one side or the other when rewriting to loads and stores. This fix was inspired by seeing the result of some refactoring that makes addrspace handling better. llvm-svn: 202242	2014-02-26 07:29:54 +00:00
NAKAMURA Takumi	955d27a4ce	[CMake] Use target_link_libraries(INTERFACE|PRIVATE) on CMake-2.8.12 to increase opportunity for parallel build. target_link_libraries(INTERFACE) doesn't bring inter-target dependencies in add_library, although final targets have dependencies on all dependent libraries. This allows most libraries to be built in parallel. target_link_libraries(PRIVATE) is used for shared libraries. Each dependent library is linked to the target.so, and its user will not see its grandchildren. For example, - libclang.so has sufficient libclang*.a(s). - c-index-test requires only libclang.so. FIXME: lld is tweaked minimally. Adding INTERFACE in each library would be a better thing. llvm-svn: 202241	2014-02-26 06:53:16 +00:00
Craig Topper	9df497e568	[x86] Add same itinerary to SYSEXIT64 as SYSEXIT for consistency. llvm-svn: 202240	2014-02-26 06:50:27 +00:00
NAKAMURA Takumi	9698686505	[CMake] Use LINK_LIBS instead of target_link_libraries(). llvm-svn: 202238	2014-02-26 06:41:29 +00:00
Craig Topper	c30b81ea06	[x86] Remove some unused instruction format classes. llvm-svn: 202234	2014-02-26 06:06:38 +00:00
Craig Topper	e413b628f8	[x86] Simplify disassembler code slightly. llvm-svn: 202233	2014-02-26 06:01:21 +00:00
Chandler Carruth	181ed05b6a	[SROA] The original refactoring inspired by the addrspace patch in D1764, which in turn set off the other refactorings to make 'getSliceAlign()' a sensible thing. There are two possible inputs to the required alignment of a memory transfer intrinsic: the alignment constraints of the source and the destination. If we are only introducing a (potentially new) offset onto one side of the transfer, we don't need to consider the alignment constraints of the other side. Use this to simplify the logic feeding into alignment computation for unsplit transfers. Also, hoist the clamp of the magical zero alignment for these intrinsics to the more customary one alignment early. This lets several other conditions melt away. No functionality changed. There is a further improvement this exposes which will change functionality, but that's arriving in a separate patch. llvm-svn: 202232	2014-02-26 05:33:36 +00:00
Chandler Carruth	47954c80ed	[SROA] Yet another slight refactoring that simplifies an API in the rewriting logic: don't pass custom offsets for the adjusted pointer to the new alloca. We always passed NewBeginOffset here. Sometimes we spelled it BeginOffset, but only when they were in fact equal. What's worse, the API is set up so that you can't reasonably call it with anything else -- it assumes that you're passing it an offset relative to the original alloca that happens to fall within the new one. That's the whole point of NewBeginOffset, it's the clamped beginning offset. No functionality changed. llvm-svn: 202231	2014-02-26 05:12:43 +00:00
Chandler Carruth	2659e503c3	[SROA] Simplify the computing of alignment: we only ever need the alignment of the slice being rewritten, not any arbitrary offset. Every caller is really just trying to compute the alignment for the whole slice, never for some arbitrary alignment. They are also just passing a type when they have one to see if we can skip an explicit alignment in the IR by using the type's alignment. This makes for a much simpler interface. Another refactoring inspired by the addrspace patch for SROA, although only loosely related. llvm-svn: 202230	2014-02-26 05:02:19 +00:00
Chandler Carruth	735d5bee48	[SROA] Use NewOffsetBegin in the unsplit case for memset merely for consistency with memcpy rewriting, and fix a latent bug in the alignment management for memset. The alignment issue is that getAdjustedAllocaPtr is computing the relative offset into the new alloca, but the alignment isn't being set to the relative offset, it was using the absolute offset which is into the old alloca. I don't think it's possible to write a test case that actually reaches this code where the resulting alignment would be observably different, but the intent was clearly to use the relative offset within the new alloca. llvm-svn: 202229	2014-02-26 04:45:24 +00:00
Chandler Carruth	ea27cf08d8	[SROA] Use the members for New{Begin,End}Offset in the rewrite helpers rather than passing them as arguments. While I generally prefer actual arguments, in this case the readability loss is substantial. By using members we avoid repeatedly calculating the offsets, and once we're using members it is useful to ensure that those names always refer to the original-alloca-relative new offset for a rewritten slice. No functionality changed. Follow-up refactoring, all toward getting the address space patch merged. llvm-svn: 202228	2014-02-26 04:25:04 +00:00
Chandler Carruth	c46b6eb302	[SROA] Compute the New{Begin,End}Offset values once for each alloca slice being rewritten. We had the same code scattered across most of the visits. Instead, compute the new offsets and the slice size once when we start to visit a particular slice, and use the member variables from then on. This reduces quite a bit of code duplication. No functionality changed. Refactoring inspired to make it easier to apply the address space patch to SROA. llvm-svn: 202227	2014-02-26 04:20:00 +00:00
Ben Langmuir	6a2a14da81	Use StringRef in raw_fd_ostream constructor llvm-svn: 202225	2014-02-26 03:21:00 +00:00
Chandler Carruth	6aedc106ba	[SROA] Fix PR18615 with some long overdue simplifications to the bounds checking in SROA. The primary change is to just rely on uge for checking that the offset is within the allocation size. This removes the explicit checks against isNegative which were terribly error prone (including the reversed logic that led to PR18615) and prevented us from supporting stack allocations larger than half the address space.... Ok, so maybe the latter isn't common but it's a silly restriction to have. Also, we used to try to support a PHI node which loaded from before the start of the allocation if any of the loaded bytes were within the allocation. This doesn't make any sense, we have never really supported loading or storing before the allocation starts. The simplified logic just doesn't care. We continue to allow loading past the end of the allocation in part to support cases where there is a PHI and some loads are larger than others and the larger ones reach past the end of the allocation. We could solve this a different and more conservative way, but I'm still somewhat paranoid about this. llvm-svn: 202224	2014-02-26 03:14:14 +00:00
Nick Lewycky	ea08c7090b	Remove spurious emacs major mode marker, these should only go on .h files. llvm-svn: 202222	2014-02-26 03:10:45 +00:00
Eric Christopher	f9761a294a	80-col. llvm-svn: 202221	2014-02-26 02:53:18 +00:00
Eric Christopher	73ffdb8b3c	Formatting fixups. llvm-svn: 202220	2014-02-26 02:50:56 +00:00
Paul Robinson	0c12b1d23c	Constify the Optnone checks in IR passes. llvm-svn: 202213	2014-02-26 01:23:26 +00:00
Rui Ueyama	5500b07c78	Simplify base64 routine a bit. llvm-svn: 202210	2014-02-25 23:49:11 +00:00
Adrian Prantl	b363c30b5d	Add DIUnspecifiedParameter, so we can pretty-print it. This will be used for testcases in CFE. llvm-svn: 202207	2014-02-25 23:42:11 +00:00
Rafael Espindola	339430f993	Use DataLayout from the module when easily available. Eventually DataLayoutPass should go away, but for now that is the only easy way to get a DataLayout in some APIs. This patch only changes the ones that have easy access to a Module. One interesting issue with sometimes using DataLayoutPass and sometimes fetching it from the Module is that we have to make sure they are equivalent. We can get most of the way there by always constructing the pass with a Module. In fact, the pass could be changed to point to an external DataLayout instead of owning one to make this stricter. Unfortunately, the C api passes a DataLayout, so it has to be up to the caller to make sure the pass and the module are in sync. llvm-svn: 202204	2014-02-25 23:25:17 +00:00
David Blaikie	20474106a1	DwarfDebug: Avoid emitting an empty debug_aranges section when aranges are disabled llvm-svn: 202201	2014-02-25 22:46:44 +00:00
Adrian Prantl	69140d2c0f	Address review comments for r202188. This is refactoring / simplifying code, updating comments and enabling the testcase on non-x86 platforms. No functionality change. llvm-svn: 202199	2014-02-25 22:27:14 +00:00
Rafael Espindola	248ac13975	Fix resetting the DataLayout in a Module. No tool does this currently, but as everything else in a module we should be able to change its DataLayout. Most of the fix is in DataLayout to make sure it can be reset properly. The test uses Module::setDataLayout since the fact that we mutate a DataLayout is an implementation detail. The module could hold a OwningPtr<DataLayout> and the DataLayout itself could be immutable. Thanks to Philip Reames for pushing me in the right direction. llvm-svn: 202198	2014-02-25 22:23:04 +00:00
Chandler Carruth	7b8e112407	[reassociate] Switch two std::sort calls into std::stable_sort calls as their inputs come from std::stable_sort and they are not total orders. I'm not a huge fan of this, but the really bad std::stable_sort is right at the beginning of Reassociate. After we commit to stable-sort based consistent respect of source order, the downstream sorts shouldn't undo that unless they have a total order or they are used in an order-insensitive way. Neither appears to be true for these cases. I don't have particularly good test cases, but this jumped out by inspection when looking for output instability in this pass due to changes in the ordering of std::sort. llvm-svn: 202196	2014-02-25 21:54:50 +00:00
Tom Stellard	fd0d86c322	R600: Don't unconditionally unroll loops with private memory accesses This causes the size of the scrypt kernel to explode and eats all the memory on some systems. llvm-svn: 202195	2014-02-25 21:36:21 +00:00


67401 Commits