llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	11aa80cc4a	R600/SI: Handle VCC in SIRegisterInfo::getPhysRegSubReg() This fixes a crash in an ocl conformance test. The test requires register spilling and is too big to include. llvm-svn: 216216	2014-08-21 20:40:50 +00:00
Rafael Espindola	33466a745e	Rewrite the gold plugin to fix pr19901. There is a fundamental difference between how the gold API and lib/LTO view the LTO process. The gold API talks about a particular symbol in a particular file. The lib/LTO API talks about a symbol in the merged module. The merged module is then defined in terms of the IR semantics. In particular, a linkonce_odr GV is only copied if it is used, since it is valid to drop unused linkonce_odr GVs. In the testcase in pr19901 both properties collide. What happens is that gold asks us to keep a particular linkonce_odr symbol, but the IR linker doesn't copy it to the merged module and we never have a chance to ask lib/LTO to keep it. This patch fixes it by having a more direct implementation of the gold API. If it asks us to keep a symbol, we change the linkage so it is not linkonce. If it says we can drop a symbol, we do so. All of this before we even send the module to lib/Linker. Since now we don't have to produce LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN, during symbol resolution we can use a temporary LLVMContext and do lazy module loading. This allows us to keep the minimum possible amount of allocated memory around. This should also allow as much parallelism as we want, since there is no shared context. llvm-svn: 216215	2014-08-21 20:28:55 +00:00
Jonathan Roelofs	f00e7e143e	Satiate the sanitizer build bot This fixes a missing initializer from r216182 llvm-svn: 216212	2014-08-21 20:09:15 +00:00
Rafael Espindola	7cebf36a95	Move some logic to populateLTOPassManager. This will avoid code duplication in the next commit which calls it directly from the gold plugin. llvm-svn: 216211	2014-08-21 20:03:44 +00:00
Adam Nemet	5ed17dad95	[AVX512] Add class to group common template arguments related to vector type We discussed the issue of generality vs. readability of the AVX512 classes recently. I proposed this approach to try to hide and centralize the mappings we commonly perform based on the vector type. A new class X86VectorVTInfo captures these. The idea is to pass an instance of this class to classes/multiclasses instead of the corresponding ValueType. Then the class/multiclass can use its field for things that derive from the type rather than passing all those as separate arguments. I modified avx512_valign to demonstrate this new approach. As you can see instead of 7 related template parameters we now have one. The downside is that we have to refer to fields for the derived values. I named the argument '_' in order to make this as invisible as possible. Please let me know if you absolutely hate this. (Also once we allow local initializations in multiclasses we can recover the original version by assigning the fields to local variables.) Another possible use-case for this class is to directly map things, e.g.: RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC llvm-svn: 216209	2014-08-21 19:50:07 +00:00
Alex Lorenz	936b99c942	Coverage Mapping: add function's hash to coverage function records. The profile data format was recently updated and the new indexing api requires the code coverage tool to know the function's hash as well as the function's name to get the execution counts for a function. Differential Revision: http://reviews.llvm.org/D4994 llvm-svn: 216207	2014-08-21 19:23:25 +00:00
Rafael Espindola	40bfd6db57	llvm-gcc is dead. llvm-svn: 216206	2014-08-21 19:22:24 +00:00
Eric Fiselier	5b0e0e9436	[LIT] Remove documentation for method since it does not exist llvm-svn: 216204	2014-08-21 18:52:58 +00:00
Rafael Espindola	216e0c0617	Respect LibraryInfo in populateLTOPassManager and use it. NFC. llvm-svn: 216203	2014-08-21 18:49:52 +00:00
Rafael Espindola	df1836f750	Remove dead code. NFC. llvm-svn: 216201	2014-08-21 18:11:21 +00:00
Quentin Colombet	0c740d4b9a	[AArch64] Run a peephole pass right after AdvSIMD pass. The AdvSIMD pass may produce copies that are not coalescer-friendly. The peephole optimizer knows how to fix that as demonstrated in the test case. <rdar://problem/12702965> llvm-svn: 216200	2014-08-21 18:10:07 +00:00
Juergen Ributzka	c83265a6c5	[FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI. llvm-svn: 216199	2014-08-21 18:02:25 +00:00
Moritz Roth	dfdda0d41c	Thumb1 load/store optimizer: Improve code to materialize new base register. There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only the latter supports using different source and destination registers, so whenever we materialize a new base register (at a certain offset) we'd do so by moving the base register value to the new register and then adding in place. This patch changes the code to use a single tADDi3 if the offset is small enough to fit in 3 bits. Differential Revision: http://reviews.llvm.org/D5006 llvm-svn: 216193	2014-08-21 17:11:03 +00:00
Hans Wennborg	f4cb573268	Use returns_nonnull in BumpPtrAllocator and MallocAllocator to avoid null-check in placement new In both Clang and LLVM, this is a common pattern: Size = sizeof(DeclRefExpr) + SomeExtraStuff; void *Mem = Context.Allocate(Size, llvm::alignOf<DeclRefExpr>()); return new (Mem) DeclRefExpr(...); The annoying thing is that because the default placement-new operator has a nothrow specification, the compiler will insert a null check of Mem before calling the DeclRefExpr constructor. This null check is redundant for us, because we expect the allocation functions to never return null. By annotating the allocator functions with returns_nonnull, we can optimize away these checks. Compiling clang with a recent version of Clang and measuring with: $ perf stat -r20 bin/clang.patch -fsyntax-only -w gcc.c && perf stat -r20 bin/clang.orig -fsyntax-only -w gcc.c Shows a 2.4% speed-up (+- 0.8%). The pattern occurs in LLVM too. Measuring with -O3 (and now using bzip2.c instead, because it's smaller): $ perf stat -r20 bin/clang.patch -O3 -w bzip2.c && perf stat -r20 bin/clang.orig -O3 -w bzip2.c Shows a 4.4% speed-up (+- 1%). If anyone knows of a similar attribute we can use for MSVC, or some other technique to get rid of the null check there, please let me know. Differential Revision: http://reviews.llvm.org/D4989 llvm-svn: 216192	2014-08-21 17:10:00 +00:00
Juergen Ributzka	95c0f153e4	[FastISel][AArch64] Remove redundant test. These tests and many more are already covered by fast-isel-addressing-modes.ll. llvm-svn: 216186	2014-08-21 16:40:05 +00:00
Jonathan Roelofs	5e98ff967b	Add a thread-model knob for lowering atomics on baremetal & single threaded systems http://reviews.llvm.org/D4984 llvm-svn: 216182	2014-08-21 14:35:47 +00:00
Rafael Espindola	e07caad9e7	Handle inlining in populateLTOPassManager like in populateModulePassManager. No functionality change. llvm-svn: 216178	2014-08-21 13:35:30 +00:00
Zinovy Nis	33406da5f4	[CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing this out. llvm-svn: 216176	2014-08-21 13:30:05 +00:00
Benjamin Kramer	ff8b883772	DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments. PR20677 llvm-svn: 216175	2014-08-21 13:28:02 +00:00
Rafael Espindola	208bc533cd	Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder. llvm-svn: 216174	2014-08-21 13:13:17 +00:00
Josh Klontz	fbe17d6a32	X86AsmPrinter MCJIT MSVC bug fix. Summary: This bug was introduced in r213006 which makes an assumption that MCSection is COFF for Windows MSVC. This assumption is broken for MCJIT users where ELF is used instead [1]. The fix is to change the MCSection cast to a dyn_cast. [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068407.html. Reviewers: majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4872 llvm-svn: 216173	2014-08-21 12:55:27 +00:00
Oliver Stannard	51b1d460cb	[ARM] Enable DP copy, load and store instructions for FPv4-SP The FPv4-SP floating-point unit is generally referred to as single-precision only, but it does have double-precision registers and load, store and GPR<->DPR move instructions which operate on them. This patch enables the use of these registers, the main advantage of which is that we now comply with the AAPCS-VFP calling convention. This partially reverts r209650, which added some AAPCS-VFP support, but did not handle return values or alignment of double arguments in registers. This patch also adds tests for Thumb2 code generation for floating-point instructions and intrinsics, which previously only existed for ARM. llvm-svn: 216172	2014-08-21 12:50:31 +00:00
Rafael Espindola	18b2a258c3	Sort declarations. llvm-svn: 216171	2014-08-21 12:39:07 +00:00
Benjamin Kramer	002a1ced06	Make format_object_base's destructor protected and non-virtual. It's not meant to be used with operator delete and this avoids emitting virtual dtors for every derived format object. llvm-svn: 216170	2014-08-21 11:22:05 +00:00
Erik Verbruggen	2b98bd2a80	Reassociate x + -0.1234 * y into x - 0.1234 * y This does not require -ffast-math, and it gives CSE/GVN more options to eliminate duplicate expressions in, e.g.: return ((x + 0.1234 * y) * (x - 0.1234 * y)); Differential Revision: http://reviews.llvm.org/D4904 llvm-svn: 216169	2014-08-21 10:45:30 +00:00
Benjamin Kramer	b791ef21d2	X86: Turn redundant if into an assertion. While there remove noop casts. llvm-svn: 216168	2014-08-21 10:31:37 +00:00
Robert Khasanov	46409eae8e	[x86] Added _addcarry_ and _subborrow_ intrinsics llvm-svn: 216164	2014-08-21 09:43:43 +00:00
Robert Khasanov	86ca6aaf40	[x86] SMAP: added HasSMAP attribute for CLAC/STAC, corrected attributes llvm-svn: 216163	2014-08-21 09:34:12 +00:00
Robert Khasanov	7c5a843646	[x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM. llvm-svn: 216162	2014-08-21 09:27:00 +00:00
Robert Khasanov	98441b6e7f	[x86] Enable Broadwell target. Added FeatureSMAP. Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP llvm-svn: 216161	2014-08-21 09:16:12 +00:00
Zinovy Nis	0a36cba29d	[INDVARS] Extend the use of widening of induction variables to the cases of "sub nsw" and "mul nsw" instructions. Currently only "add nsw" instructions are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like: int N = 100; float **A; void foo(int x0, int x1) { float * A_cur = &A[0][0]; float * A_next = &A[1][0]; for(int x = x0; x < x1; ++x) { // Currently only the [x+N] case is widened. The other 2 cases lead to sext. // This patch fixes it, so all 3 cases do not need sext. const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N]; A_next[x] = div; } } ... > clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o - Differential Revision: http://reviews.llvm.org/D4695 llvm-svn: 216160	2014-08-21 08:25:45 +00:00
Elena Demikhovsky	08f8596cc0	IntelJITEventListener updates to fix breaks by recent changes to EngineBuilder and DIContext. By Arch Robison. llvm-svn: 216159	2014-08-21 07:01:55 +00:00
Craig Topper	71b7b68b74	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
David Majnemer	5d1aeba2ea	InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1 Adapted from a patch by Richard Smith, test-case written by me. llvm-svn: 216157	2014-08-21 05:14:48 +00:00
Craig Topper	3ced27c835	Remove custom implementations of max/min in StringRef that were originally added to work around an old gcc bug. I believe it's been fixed by now. llvm-svn: 216156	2014-08-21 04:31:10 +00:00
Eric Fiselier	a4e211edad	add self to credits llvm-svn: 216155	2014-08-21 04:27:11 +00:00
Jiangning Liu	950844fadb	Fix a bug around truncating vector in const prop. In constant folding stage, "TRUNC" can't handle vector data type. llvm-svn: 216149	2014-08-21 02:12:35 +00:00
Jiangning Liu	deb4b5fc37	Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type". llvm-svn: 216147	2014-08-21 01:59:30 +00:00
Quentin Colombet	689623009b	[PeepholeOptimizer] Take advantage of the isInsertSubreg property in the advanced copy optimization. This is the final step patch toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov.32 d16[0], r0 vmov.32 d16[1], r1 and is able to rewrite the following sequence: vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 into simple generic GPR copies that the coalescer managed to remove. <rdar://problem/12702965> llvm-svn: 216144	2014-08-21 00:19:16 +00:00
Quentin Colombet	84f15bd1b0	[ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related target hook. This patch teaches the compiler that: dX = VSETLNi32 dY, rZ, imm is the same as: dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm) <rdar://problem/12702965> llvm-svn: 216143	2014-08-21 00:10:52 +00:00
James Molloy	a88896b5c0	[LoopVectorize] Up the maximum unroll factor to 4 for AArch64 Only for Cortex-A57 and Cyclone for now, where it has shown wins. llvm-svn: 216141	2014-08-21 00:02:51 +00:00
James Molloy	82c995d450	[LoopVectorizer] Limit unroll factor in the presence of nested reductions. If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit, by default to 2, so the critical path only gets increased by one reduction operation. llvm-svn: 216140	2014-08-20 23:53:52 +00:00
Quentin Colombet	7e3da6677a	Add isInsertSubreg property. This patch adds a new property: isInsertSubreg and the related target hooks: TargetInstrInfo::getInsertSubregInputs and TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific instruction is a (kind of) INSERT_SUBREG. The approach is similar to r215394. <rdar://problem/12702965> llvm-svn: 216139	2014-08-20 23:49:36 +00:00
Jonathan Roelofs	44937d98a3	Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to be avoided. This patch trades simplicity for implementation time at the expense of performance... As they say: correctness first, then performance. See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few ideas on how to make this better. llvm-svn: 216138	2014-08-20 23:38:50 +00:00
Quentin Colombet	a56749064a	Mention the right target hook in the comment on isExtractSubreg property. llvm-svn: 216137	2014-08-20 23:25:28 +00:00
Quentin Colombet	67639df146	[PeepholeOptimizer] Take advantage of the isExtractSubreg property in the advanced copy optimization. This patch is a step toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov r0, r1, d16 but it does not understand yet vmov.32 d16[0], r0 vmov.32 d16[1], r1 Coming patches will fix that and update the related test case. <rdar://problem/12702965> llvm-svn: 216136	2014-08-20 23:13:02 +00:00
Yi Jiang	1a4e73d7bf	New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain conditions llvm-svn: 216135	2014-08-20 22:55:40 +00:00
Alexey Samsonov	e5864c69a8	Don't allow MCStreamer::EmitIntValue to output 0-byte integers. It makes no sense and can hide bugs. In particular, it led to a left shift by 64 bits, which is undefined behavior, properly reported by UBSan. llvm-svn: 216134	2014-08-20 22:46:38 +00:00
Quentin Colombet	deb82eab3e	[ARM] Mark VMOVRRD with the ExtractSubreg property and implement the related target hook. This patch teaches the compiler that: rX, rY = VMOVRRD dZ is the same as: rX = EXTRACT_SUBREG dZ, ssub_0 rY = EXTRACT_SUBREG dZ, ssub_1 <rdar://problem/12702965> llvm-svn: 216132	2014-08-20 22:16:19 +00:00
Alexey Samsonov	fffd56ecdf	Fix undefined behavior (left shift of negative value) in SystemZ backend. This bug is reported by UBSan. llvm-svn: 216131	2014-08-20 21:56:43 +00:00
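
Commit f4cb573268 in the list above describes annotating allocator functions with returns_nonnull so the compiler can drop the null check it would otherwise emit before a placement-new constructor call. The following is a minimal sketch of that pattern, not the code from the commit: allocateOrDie, Node, makeNode, and the SKETCH_RETURNS_NONNULL macro are hypothetical names invented here, and the attribute spelling assumes a GCC- or Clang-compatible compiler.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Assumption: GCC/Clang attribute syntax. On other compilers the macro
// expands to nothing and the null check simply stays in place.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_RETURNS_NONNULL __attribute__((returns_nonnull))
#else
#define SKETCH_RETURNS_NONNULL
#endif

// Hypothetical allocator that never returns null: it aborts on failure,
// which is what makes the returns_nonnull promise truthful.
SKETCH_RETURNS_NONNULL void *allocateOrDie(std::size_t Size) {
  void *Mem = std::malloc(Size);
  if (!Mem)
    std::abort();
  return Mem;
}

// Hypothetical object type standing in for something like DeclRefExpr.
struct Node {
  int Kind;
  int Extra;
  Node(int K, int E) : Kind(K), Extra(E) {}
};

// Allocate raw storage, then construct in place. Because the allocator is
// annotated, the optimizer may assume Mem is non-null at the placement new.
Node *makeNode(int Kind, int Extra) {
  void *Mem = allocateOrDie(sizeof(Node));
  return new (Mem) Node(Kind, Extra);
}
```

A caller would use makeNode(3, 7) and later destroy the object with an explicit destructor call followed by std::free. Whether the check is actually removed depends on the compiler and optimization level.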

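Commit 2b98bd2a80 in the list above rewrites x + -0.1234 * y into x - 0.1234 * y without requiring -ffast-math. A sketch of why that is plausible, assuming IEEE 754 doubles in the default round-to-nearest mode: negating a value is exact, so both forms should round to the same result. The function names below are hypothetical and only illustrate the two shapes of the expression; this is not the Reassociate pass itself.

```cpp
#include <cstring>

// Shape before the rewrite: add a product with a negative constant.
double addNegMul(double X, double Y) { return X + -0.1234 * Y; }

// Shape after the rewrite: subtract a product with a positive constant.
double subMul(double X, double Y) { return X - 0.1234 * Y; }

// The expression from the commit message. Once both factors share the
// subexpression 0.1234 * Y, CSE/GVN has a duplicate it can eliminate.
double sumTimesDiff(double X, double Y) {
  return (X + 0.1234 * Y) * (X - 0.1234 * Y);
}

// Bitwise comparison, so that a difference in the sign of zero would
// also be detected.
bool sameBits(double A, double B) {
  return std::memcmp(&A, &B, sizeof(double)) == 0;
}
```

Comparing addNegMul and subMul with sameBits on a handful of inputs is a quick sanity check, not a proof.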

106952 Commits