llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	e8d4c9a2c7	Add 'remark' diagnostic type in LLVM A 'remark' is information that is not an error or a warning, but rather some additional information provided to the user. In contrast to a 'note' a 'remark' is an independent diagnostic, whereas a 'note' always depends on another diagnostic. A typical use case for remark nodes is information provided to the user, e.g. information provided by the vectorizer about loops that have been vectorized. llvm-svn: 202474	2014-02-28 09:08:45 +00:00
Hal Finkel	b998915ee1	Swap PPC isel operands to allow for 0-folding The PPC isel instruction can fold 0 into the first operand (thus eliminating the need to materialize a zero-containing register when the 'true' result of the isel is 0). When the isel is fed by a bit register operation that we can invert, do so as part of the bit-register-operation peephole routine. llvm-svn: 202469	2014-02-28 06:11:16 +00:00
Rafael Espindola	a51f0f8367	Now that it is possible, use the mangler in IRObjectFile. A really simple patch marks the end of a lot of yak shaving :-) llvm-svn: 202463	2014-02-28 02:17:23 +00:00
Hal Finkel	5cae2168c7	Trying to unbreak the darwin11 builder The CR bit tracking code broke PPC/Darwin; trying to get it working again... (the darwin11 builder, which defaults to the darwin ABI when running PPC tests, asserted when running test/CodeGen/PowerPC/inverted-bool-compares.ll) llvm-svn: 202459	2014-02-28 01:17:25 +00:00
Hal Finkel	b39a0475c0	Try to unbreak the C++11 build Cannot use negative numbers in case statements without running afoul of -Wc++11-narrowing. llvm-svn: 202455	2014-02-28 00:45:27 +00:00
Hal Finkel	940ab934d4	Add CR-bit tracking to the PowerPC backend for i1 values This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451	2014-02-28 00:27:01 +00:00
Hal Finkel	ab51ecd4fc	Fix visitTRUNCATE for legal i1 values This extract-and-trunc vector optimization cannot work for i1 values as currently implemented, and so I'm disabling this for now for i1 values. In the future, this can be fixed properly. Soon I'll commit support for i1 CR bit tracking in the PowerPC backend, and this will be covered by one of the existing regression tests. llvm-svn: 202449	2014-02-28 00:26:45 +00:00
Andrew Trick	b1531e582f	Provide a target override for the latest regalloc heuristic. This is a temporary workaround for native arm linux builds: PR18996: Changing regalloc order breaks "lencod" on native arm linux builds. llvm-svn: 202433	2014-02-27 21:37:33 +00:00
Roman Divacky	7a9c6549ba	Lower FNEG just like FABS to fneg[ds] and fmov[ds], thus avoiding expensive libcall. Also, Qp_neg is not implemented on at least FreeBSD. This is also what gcc is doing. llvm-svn: 202422	2014-02-27 19:26:29 +00:00
Eric Christopher	8bdab43964	Revert r201751 and solve the const problem a different way - by making the cache mutable. llvm-svn: 202417	2014-02-27 18:36:10 +00:00
Adrian Prantl	7072073cc9	Debug info: Remove ARMAsmPrinter::EmitDwarfRegOp(). AsmPrinter can now scan the register file for sub- and super-registers. No functionality change intended. (Tests are updated because the comments in the assembler output are different.) llvm-svn: 202416	2014-02-27 17:56:08 +00:00
Richard Osborne	521bdf211d	[XCore] Support functions returning more than 4 words. If a function returns a large struct by value return the first 4 words in registers and the rest on the stack in a location reserved by the caller. This is needed to support the xC language which supports functions returning an arbitrary number of return values. This is r202397 reapplied with a fix to avoid an uninitialized read of a member. llvm-svn: 202414	2014-02-27 17:47:54 +00:00
Richard Osborne	f474087f98	[XCore] Make LowerCallResult a static function. No functionality change. This is r202396 reapplied with no changes. llvm-svn: 202413	2014-02-27 17:47:48 +00:00
Rafael Espindola	8837995b52	Remove MCPureStreamer. We moved MCJIT to use native object formats a long time ago and R600 now uses ELF, so it was dead. llvm-svn: 202408	2014-02-27 16:17:34 +00:00
Alexander Kornienko	52a07b819e	Re-apply r200853, which should not crash after Clang plugins were converted to loadable modules in r201256. llvm-svn: 202404	2014-02-27 14:47:37 +00:00
Richard Osborne	527aa5052d	Revert r202396, r202397. These are causing test failures, revert for now. llvm-svn: 202398	2014-02-27 14:24:13 +00:00
Richard Osborne	e82bf0988e	[XCore] Support functions returning more than 4 words. Summary: If a function returns a large struct by value return the first 4 words in registers and the rest on the stack in a location reserved by the caller. This is needed to support the xC language which supports functions returning an arbitrary number of return values. Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2889 llvm-svn: 202397	2014-02-27 14:00:40 +00:00
Richard Osborne	ed7e2ad090	[XCore] Make LowerCallResult a static function. No functionality change. llvm-svn: 202396	2014-02-27 14:00:34 +00:00
Richard Osborne	a283d24ad9	[XCore] Target optimized library function __memcpy_4() Summary: If the src, dst and size of a memcpy are known to be 4 byte aligned we can call __memcpy_4() instead of memcpy(). Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2871 llvm-svn: 202395	2014-02-27 13:39:07 +00:00
Richard Osborne	d6e85018c5	[XCore] Add dag combines for instructions that ignore some input bits. These instructions ignore the high bits of one of their input operands - try and use this to simplify the code. llvm-svn: 202394	2014-02-27 13:20:11 +00:00
Richard Osborne	2d3a2bee41	[XCore] Provide information about known zero bits of resource instructions. llvm-svn: 202393	2014-02-27 13:20:06 +00:00
Kostya Serebryany	cb3f6e164b	[asan] fix a pair of silly typos llvm-svn: 202391	2014-02-27 13:13:59 +00:00
Kostya Serebryany	ec34665de9	[asan] disable asan-detect-invalid-pointer-pair (was enabled by mistake) llvm-svn: 202390	2014-02-27 12:56:20 +00:00
Kostya Serebryany	796f6557bf	[asan] experimental implementation of invalid-pointer-pair detector (finds when two unrelated pointers are compared or subtracted). This implementation has both false positives and false negatives and is not tuned for performance. A bug report for a proper implementation will follow. llvm-svn: 202389	2014-02-27 12:45:36 +00:00
Eric Christopher	a9a1d27677	Don't emit anything into the debug_ranges section if we aren't emitting any ranges - this includes CU ranges where we were previously emitting an end list marker even if we didn't have a list. Testcase includes a test for line table only code emission as the problem was noticed while writing this test. llvm-svn: 202357	2014-02-27 07:44:45 +00:00
Craig Topper	5346e75966	[X86] Fix Uses/Defs lists for INS, OUTS, SCAS, CMPS, LODS llvm-svn: 202348	2014-02-27 05:08:25 +00:00
Craig Topper	40dd6211d5	[X86] Add RAX/EAX/AX Uses/Defs to XCHG RAX/EAX/AX instructions. llvm-svn: 202347	2014-02-27 04:27:00 +00:00
Craig Topper	08301dee46	[X86] Add RAX/EAX/AX/AL Uses/Defs to the absolute memory location move instructions. Patch by Florian Lukas with some additional instructions fixed by me. Fixes PR18975. llvm-svn: 202345	2014-02-27 04:07:57 +00:00
Craig Topper	64d94320f3	Fix odd indentation. llvm-svn: 202342	2014-02-27 03:11:13 +00:00
Ben Langmuir	27a58bf770	Revert "Use StringRef in raw_fd_ostream constructor" This reverts commit r202225, which may cause a performance regression. llvm-svn: 202338	2014-02-27 02:09:10 +00:00
Michel Danzer	9e61c4b6cd	R600/SI: Optimize SI_KILL for constant operands If the SI_KILL operand is constant, we can either clear the exec mask if the operand is negative, or do nothing otherwise. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202337	2014-02-27 01:47:09 +00:00
Michel Danzer	6f273c57db	R600/SI: Allow SI_KILL for geometry shaders Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202336	2014-02-27 01:47:02 +00:00
Eric Christopher	740a833a3b	If we're only emitting line tables for a particular CU then don't add any ranges to the list of ranges for the CU as we don't want to emit them anyway. This ensures that we will still emit ranges if we have a compile unit compiled with only line tables and one compiled with full debug info requested (we'll emit for the one with full debug info). Update testcase metadata accordingly to continue emitting ranges. llvm-svn: 202333	2014-02-27 01:25:00 +00:00
Eric Christopher	75d49db19b	Add a debug info code generation level to the compile unit metadata and update everything accordingly. This can be used to conditionalize the amount of output in the backend based on the amount of debug requested/metadata emission scheme by a front end (e.g. clang). Paired with a commit to clang. llvm-svn: 202332	2014-02-27 01:24:56 +00:00
Adrian Prantl	e31563c4aa	Fix a type error that crept into r202313. llvm-svn: 202317	2014-02-26 23:46:39 +00:00
Eric Christopher	a13839f5ca	Remove unnecessary llvm:: qualification. llvm-svn: 202316	2014-02-26 23:27:16 +00:00
Adrian Prantl	918b9a77ce	Debug info: Refactor AsmPrinter::EmitDwarfRegOp to make the control flow more obvious. llvm-svn: 202313	2014-02-26 23:03:37 +00:00
Matt Arsenault	530dde4386	R600: Remove unnecessary build_vector pattern. It is already fully handled in AMDGPUISelDAGToDAG. llvm-svn: 202312	2014-02-26 23:00:58 +00:00
Andrew Trick	52a00936b4	Add a limit to the heuristic that register allocates instructions in local order. This handles pathological cases in which we see 2x increase in spill code for large blocks (~50k instructions). I don't have a unit test for this behavior. Fixes rdar://16072279. llvm-svn: 202304	2014-02-26 22:07:26 +00:00
Quentin Colombet	85c9e16291	Lower unsigned vsetcc to psubus in certain cases The current approach to lower a vsetult is to flip the sign bit of the operands, swap the operands and then use a (signed) pcmpgt. psubus (unsigned saturating subtract) can be used to emulate a vsetult more efficiently: + case ISD::SETULT: { + // If the comparison is against a constant we can turn this into a + // setule. With psubus, setule does not require a swap. This is + // beneficial because the constant in the register is no longer + // destructed as the destination so it can be hoisted out of a loop. I also enable lowering via psubus in a few other cases where it's clearly beneficial: setule and setuge if minu/maxu cannot be used. rdar://problem/14338765 Patch by Adam Nemet <anemet@apple.com>. llvm-svn: 202301	2014-02-26 21:39:12 +00:00
Aaron Ballman	3d69c5cae4	Silencing an MSVC signed comparison warning. llvm-svn: 202295	2014-02-26 20:22:20 +00:00
Hal Finkel	121caf6313	Fix the aggressive anti-dep breaker's subregister definition handling The aggressive anti-dependency breaker scans instructions, bottom-up, within the scheduling region in order to find opportunities where register renaming can be used to break anti-dependencies. Unfortunately, the aggressive anti-dep breaker was treating a register definition as defining all of that register's aliases (including super registers). This behavior is incorrect when the super register is live and there are other definitions of subregisters of the super register. For example, given the following sequence: %CR2EQ<def> = CROR %CR3UN, %CR3UN<kill> %CR2GT<def> = IMPLICIT_DEF %X4<def> = MFOCRF8 %CR2 the analysis of the first subregister definition would work as expected: Anti: %CR2GT<def> = IMPLICIT_DEF Def Groups: CR2GT=g194->g0(via CR2) Antidep reg: CR2GT (zero group) Use Groups: but the analysis of the second one would not: Anti: %CR2EQ<def> = CROR %CR3UN, %CR3UN<kill> Def Groups: CR2EQ=g195 Antidep reg: CR2EQ Rename Candidates for Group g195: ... because, when processing the %CR2GT<def>, we'd mark all super registers of %CR2GT (%CR2 in this case) as defined. As a result, when processing %CR2EQ<def>, %CR2 no longer appears to be live, and %CR2EQ<def>'s group is not unioned with the %CR2 group. I don't have an in-tree test case for this yet (and even if I did, I don't have a small one). llvm-svn: 202294	2014-02-26 20:20:30 +00:00
Reid Kleckner	22869378d9	GlobalOpt: Apply fastcc to internal x86_thiscallcc functions We should apply fastcc whenever profitable. We can expand this list, but there are lots of conventions with performance implications that we don't want to change. Differential Revision: http://llvm-reviews.chandlerc.com/D2705 llvm-svn: 202293	2014-02-26 19:57:30 +00:00
Nico Rieck	773a57958c	Relax COFF string table check COFF object files with 0 as string table size are currently rejected. This prevents us from reading object files written by tools like cvtres that violate the PECOFF spec and write 0 instead of 4 for the size of an empty string table. llvm-svn: 202292	2014-02-26 19:51:44 +00:00
Rafael Espindola	e8ae0dba52	Fix typo. Thanks to Roman Divacky for noticing it. llvm-svn: 202277	2014-02-26 17:05:38 +00:00
Rafael Espindola	ae593f1563	Compare DataLayout by Value, not by pointer. This fixes spurious warnings in llvm-link about the datalayout not matching. Thanks to Zalman Stern for reporting the bug! llvm-svn: 202276	2014-02-26 17:02:08 +00:00
Rafael Espindola	667fcb839e	Use a sorted array to store the information about a few address spaces. We don't have any test with more than 6 address spaces, so a DenseMap is probably not the correct answer. An unsorted array would also be OK, but we have to sort it for printing anyway. llvm-svn: 202275	2014-02-26 16:58:35 +00:00
Rafael Espindola	5109fcc0ae	Move these functions out of line. A DenseMap lookup is not a simple operation. llvm-svn: 202274	2014-02-26 16:49:40 +00:00
Andrew Trick	429e9edd08	Fix PR18165: LSR must avoid scaling factors that exceed the limit on truncated use. Patch by Michael Zolotukhin! llvm-svn: 202273	2014-02-26 16:31:56 +00:00
NAKAMURA Takumi	4ca51b9ace	[CMake] BUILD_SHARED_LIBS: Fixup for r202261: Give PUBLIC to system_libs in LLVMSupport. llvm-svn: 202263	2014-02-26 12:18:55 +00:00
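
Note on 85c9e16291 (psubus lowering): the message relies on the fact that an unsigned saturating subtract yields zero exactly when the first operand is less than or equal to the second, and that a setult against a constant can be rewritten as a setule. The scalar sketch below models a single 8-bit lane to illustrate that identity only. It is not the code from the patch; the function names are hypothetical, the real lowering works on whole vectors in the X86 backend, and the use of `c - 1` as the adjusted constant is this sketch's assumption about how the setult-to-setule rewrite would look.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of an unsigned saturating subtract:
// the result clamps at 0 instead of wrapping around.
inline uint8_t subus8(uint8_t a, uint8_t b) {
  return a > b ? static_cast<uint8_t>(a - b) : uint8_t{0};
}

// a <=u b holds exactly when the saturating difference a - b is zero.
inline bool uleViaSubus(uint8_t a, uint8_t b) {
  return subus8(a, b) == 0;
}

// For a nonzero constant c, a <u c is the same as a <=u (c - 1),
// so the "less than" form reduces to the "less or equal" form.
inline bool ultConstViaSubus(uint8_t a, uint8_t c) {
  assert(c != 0 && "a <u 0 is always false, so there is no c - 1 to use");
  return uleViaSubus(a, static_cast<uint8_t>(c - 1));
}

// Exhaustively compare both helpers against the built-in comparisons
// for every pair of 8-bit values.
inline bool agreesWithBuiltinCompare() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const uint8_t x = static_cast<uint8_t>(a);
      const uint8_t y = static_cast<uint8_t>(b);
      if (uleViaSubus(x, y) != (x <= y))
        return false;
      if (y != 0 && ultConstViaSubus(x, y) != (x < y))
        return false;
    }
  }
  return true;
}
```

Calling `agreesWithBuiltinCompare()` checks the identity over all 65536 operand pairs. Unlike the approach the message describes as current (flip the sign bits, swap the operands, then use a signed pcmpgt), this form needs neither the sign flip nor the swap.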

67331 Commits