llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Wilkins	a256889fb2	Go binding: Add methods for missing PassManagerBuilder C APIs Patch by Ryuichi Hayashida! Differential Revision: http://reviews.llvm.org/D30042 llvm-svn: 295420	2017-02-17 05:41:05 +00:00
Sanjoy Das	8b859c26ec	[JumpThreading] Re-enable JumpThreading for guards Summary: JumpThreading for guards feature has been reverted at https://reviews.llvm.org/rL295200 due to the following problem: the feature used the following algorithm for detection of diamond patterns: 1. Find a block with 2 predecessors; 2. Check that these blocks have a common single parent; 3. Check that the parent's terminator is a branch instruction. The problem is that these checks are insufficient. They may pass for a non-diamond construction if those two predecessors are actually the same block. This may happen if parent's terminator is a br (either conditional or unconditional) to a block that ends with "switch" instruction with exactly two branches going to one block. This patch re-enables the JumpThreading for guards and fixes this issue by adding the check that those found predecessors are actually different blocks. This guarantees that parent's terminator is a conditional branch with exactly 2 different successors, which is now ensured by assertions. It also adds two more tests for this situation (with parent's terminator being a conditional and an unconditional branch). Patch by Max Kazantsev! Reviewers: anna, sanjoy, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30036 llvm-svn: 295410	2017-02-17 04:21:14 +00:00
Rafael Espindola	6eab4044b9	Revert "[Hexagon] Start using regmasks on calls" This reverts commit r295371. It broke windows bots: http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/11402/steps/test-llvm/logs/stdio llvm-svn: 295402	2017-02-17 02:08:58 +00:00
Dean Michael Berris	4f83c4d1a6	[XRAY] [x86_64] Adding a Flight Data filetype reader to the llvm-xray Trace implementation. Summary: The file type packs function trace data onto disk from potentially multiple threads that are aggregated and flushed during the course of an instrumented program's runtime. It is named FDR mode or Flight Data recorder as an analogy to plane blackboxes, which instrument a running system without access to IO. The writer code is defined in compiler-rt in xray_fdr_logging.h/cc Reviewers: rSerge, kcc, dberris Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29697 llvm-svn: 295397	2017-02-17 01:47:16 +00:00
Teresa Johnson	95ed51dcfe	Move test to X86 subdirectory for bot failures Second attempt at fixing bot failures from r295384. llvm-svn: 295395	2017-02-17 01:23:28 +00:00
Chandler Carruth	96d86a7f9c	[x86] Give this test a triple so that we don't have to cope with two different asm comment syntaxes. llvm-svn: 295394	2017-02-17 01:18:38 +00:00
Chris Bieneman	e43bffa718	[CMake] Add variable IOS to iOS toolchain This is useful for some edge cases where detecting things gets tricky. Specifically LLDB needs this to support iOS because CMake doesn't support running tests using obj-c code. llvm-svn: 295392	2017-02-17 01:11:41 +00:00
Teresa Johnson	1ede03d5d2	Attempt to fix bot failures by adding -mtriple to llc invocation Failures on hexagon from test added with r295384, e.g.: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/3793 llvm-svn: 295389	2017-02-17 00:52:09 +00:00
Matt Arsenault	c18b67745b	Bug 31948: Fix assertion when bitcasting constantexpr pointers llvm-svn: 295387	2017-02-17 00:32:19 +00:00
Chandler Carruth	8960686927	FileCheck-ize some tests in test/CodeGen/X86/ Patch by Jorge Gorbe! Differential Revision: https://reviews.llvm.org/D29807 llvm-svn: 295386	2017-02-17 00:29:59 +00:00
Teresa Johnson	76b5b7493c	Handle link of NoDebug CU with a CU that has debug emission enabled Summary: This is an issue both with regular and Thin LTO. When we link together a DICompileUnit that is marked NoDebug (e.g. when compiling with -g0 but applying an AutoFDO profile, which requires location tracking in the compiler) and a DICompileUnit with debug emission enabled, we can have failures during dwarf debug generation. Specifically, when we have inlined from the NoDebug compile unit into the debug compile unit, we can fail during construction of the abstract and inlined scope DIEs. This is because the SPMap does not include NoDebug CUs (they are skipped in the debug_compile_units_iterator). This patch fixes the failures by skipping locations from NoDebug CUs when extracting lexical scopes. Reviewers: dblaikie, aprantl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29765 llvm-svn: 295384	2017-02-17 00:21:19 +00:00
Eugene Zelenko	deaf695138	[IR] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295383	2017-02-17 00:00:09 +00:00
Zachary Turner	7b327d051b	[pdb] Add the ability to resolve TypeServer PDBs. Some PDBs or object files can contain references to other PDBs where the real type information lives. When this happens, all type indices in the original PDB are meaningless because their records are not there. With this patch we add the ability to pull type info from those secondary PDBs. Differential Revision: https://reviews.llvm.org/D29973 llvm-svn: 295382	2017-02-16 23:35:45 +00:00
Wei Mi	493fb266ed	[LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loops In rL294814, we allow formula with SCEVAddRecExpr type of Reg from loops other than current loop. This is good for the case when induction variable of outerloop is used in expr in innerloop. But it is very bad to allow such Reg from sibling loop because we may need to add lsr.iv in other sibling loops when scev expanding those SCEVAddRecExpr type exprs. For the testcase below, one loop can be inserted with a bunch of lsr.iv because of LSR for other loops. // The induction variable j from a loop in the middle will have initial // value generated from previous sibling loop and exit value used by its // next sibling loop. void goo(long i, long j); long cond; void foo(long N) { long i = 0; long j = 0; i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); } The fix is to only allow formula with SCEVAddRecExpr type of Reg from current loop or its parents. Differential Revision: https://reviews.llvm.org/D30021 llvm-svn: 295378	2017-02-16 21:27:31 +00:00
David Blaikie	fc4857f80b	Fix -Wunused-lambda-capture by removing some unused lambda captures llvm-svn: 295373	2017-02-16 20:55:48 +00:00
Benjamin Kramer	3f6260cab4	[MachinePipeliner] Remove redundant destructor. NFC. llvm-svn: 295372	2017-02-16 20:26:51 +00:00
Krzysztof Parzyszek	fb9503c080	[Hexagon] Start using regmasks on calls All the cool targets are doing it... llvm-svn: 295371	2017-02-16 20:25:23 +00:00
Erich Keane	c4c31e2020	Change default TimerGroup singleton to use magic statics TimerGroup was showing up on a leak in valgrind, and used some pretty complex code to implement a singleton. This patch replaces the implementation with a vastly simpler one. Differential Revision: https://reviews.llvm.org/D28367 llvm-svn: 295370	2017-02-16 20:19:49 +00:00
Krzysztof Parzyszek	cac10f9768	[RDF] Aggregate shadow phi uses into one cluster when propagating live info llvm-svn: 295366	2017-02-16 19:28:06 +00:00
Simon Pilgrim	e5215751ff	[X86][SSE] Add PR31309 test case (load-extend i32 to i128). llvm-svn: 295363	2017-02-16 19:17:36 +00:00
Matt Arsenault	b95ddd7cea	AMDGPU: Remove llvm.AMDGPU.cube intrinsic llvm-svn: 295359	2017-02-16 19:09:04 +00:00
Matt Arsenault	eb65cda986	AMDGPU: Remove llvm.AMDGPU.rsq intrinsic llvm-svn: 295358	2017-02-16 19:08:58 +00:00
Hans Wennborg	35905d6a67	Re-apply r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)" The original commit was reverted in r283329 due to a miscompile in Chromium. That turned out to be the same issue as PR31257, which was fixed in r295262. llvm-svn: 295357	2017-02-16 19:04:42 +00:00
Krzysztof Parzyszek	84cd4ea301	[RDF] Differentiate between defining and clobbering nodes Defining nodes should not alias with one another, while clobbering nodes can. When pushing defs on stacks, push clobbers first, link non-clobbering defs, then push the defs. The data flow in a statement is now: uses -> clobbers -> defs. llvm-svn: 295356	2017-02-16 18:53:04 +00:00
David Blaikie	b2fbb4b276	Refactor DebugHandlerBase a bit to common non-debug-having-function filtering llvm-svn: 295354	2017-02-16 18:48:33 +00:00
Matt Arsenault	920576042d	InstCombine: Canonicalize fast fmuladd to fmul + fadd llvm-svn: 295353	2017-02-16 18:46:24 +00:00
Krzysztof Parzyszek	5226ba8daa	[RDF] Move normalize(RegisterRef) to PhysicalRegisterInfo Remove the duplicate from DFG and make some members of PRI private. llvm-svn: 295351	2017-02-16 18:45:23 +00:00
Andrea Di Biagio	42f7712e23	x86 interrupt calling convention: only save xmm registers if the target supports SSE The existing code always saves the xmm registers for 64-bit targets even if the target doesn't support SSE (which is common for kernels). Thus, the compiler inserts movaps instructions which lead to CPU exceptions when an interrupt handler is invoked. This commit fixes this bug by returning a register set without xmm registers from getCalleeSavedRegs and getCallPreservedMask for such targets. Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D29959 llvm-svn: 295347	2017-02-16 18:25:37 +00:00
Sanjay Patel	8e55b685c2	[x86] add more tests of select of constants; NFC llvm-svn: 295346	2017-02-16 18:15:16 +00:00
Artur Pilipenko	85d758299e	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336	2017-02-16 17:07:27 +00:00
Sjoerd Meijer	cb2d950214	[AArch64] AArch64AsmParser clean up of isImmediate functions. NFC Regression test neon-diagnostics.s needed changing because it now produces a more specific diagnostic about the immediate ranges. One change in the expected error message is not obvious, but there are multiple candidates and it happens to pick the immediate diagnostic. Differential Revision: https://reviews.llvm.org/D29939 llvm-svn: 295331	2017-02-16 15:52:22 +00:00
Dan Gohman	4a5496902c	[WebAssembly] Add a cast to void to fix an unused private member warning, for now. llvm-svn: 295327	2017-02-16 15:21:37 +00:00
Simon Pilgrim	2fe568c95e	[X86] Remove local areOnlyUsersOf helper and use SDNode::areOnlyUsersOf instead. llvm-svn: 295326	2017-02-16 15:11:49 +00:00
Marshall Clow	e9110d71dd	Remove uses of deprecated std::random_shuffle in the LLVM code base. Reviewed as https://reviews.llvm.org/D29780 . llvm-svn: 295325	2017-02-16 14:37:03 +00:00
Diana Picus	1540b06ef8	[ARM] GlobalISel: Select floating point loads llvm-svn: 295321	2017-02-16 14:10:50 +00:00
Artur Pilipenko	a1b384c4ce	Revert -r295314 "[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316	2017-02-16 13:04:46 +00:00
Artur Pilipenko	daaa0c0f7d	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314	2017-02-16 12:53:26 +00:00
Diana Picus	b1701e0b05	[ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT Since they're only used for passing around double precision floating point values into the general purpose registers, we'll lower them to VMOVDRR and VMOVRRD. llvm-svn: 295310	2017-02-16 12:19:57 +00:00
Diana Picus	6beef3c087	[ARM] GlobalISel: Select double G_FADD and copies Just use VADDD if available, bail out if not. llvm-svn: 295309	2017-02-16 12:19:52 +00:00
Diana Picus	9b32faa821	[ARM] GlobalISel: Assert that we don't use the FPR bank if we don't have VFP llvm-svn: 295308	2017-02-16 11:25:09 +00:00
Diana Picus	a93803b9fe	[ARM] GlobalISel: Add reg bank mappings for G_SEQUENCE and G_EXTRACT Support G_SEQUENCE and G_EXTRACT as needed for passing double precision floating point values in the soft-fp float mode. llvm-svn: 295306	2017-02-16 11:00:31 +00:00
Diana Picus	7f82c87022	[ARM] GlobalISel: Make the FPR bank 64-bit wide Also add mappings for single and double precision FP, and use them for G_FADD and G_LOAD. llvm-svn: 295302	2017-02-16 10:12:49 +00:00
Diana Picus	21c3d8e0fc	[ARM] GlobalISel: Legalize 64-bit G_FADD and G_LOAD For now we just mark them as legal all the time and let the other passes bail out if they can't handle it. In the future, we'll want to move more of the brains into the legalizer. llvm-svn: 295300	2017-02-16 09:09:49 +00:00
NAKAMURA Takumi	14246c937d	RWMutex.h: Use llvm-config.h instead of config.h in installed headers. llvm-svn: 295297	2017-02-16 08:22:08 +00:00
Diana Picus	ca6a890d7f	[ARM] GlobalISel: Lower double precision FP args For the hard float calling convention, we just use the D registers. For the soft-fp calling convention, we use the R registers and move values to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we make sure to honor the endianness of the target, since the CCAssignFn doesn't do that for us. For pure soft float targets, we still bail out because we don't support the libcalls yet. llvm-svn: 295295	2017-02-16 07:53:07 +00:00
Craig Topper	3731f4d173	[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit. llvm-svn: 295294	2017-02-16 07:35:23 +00:00
Craig Topper	715873ead3	[AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics. The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO. llvm-svn: 295290	2017-02-16 06:31:54 +00:00
Rui Ueyama	26ca0bddf0	Split WinCOFFObjectWriter::writeSection. llvm-svn: 295276	2017-02-16 02:56:06 +00:00
Rui Ueyama	af20f10d81	Split WinCOFFObjectWriter::writeObject function. llvm-svn: 295273	2017-02-16 02:35:48 +00:00
Matt Arsenault	d3e5cb77e4	AMDGPU: Remove llvm.SI.sendmsg llvm-svn: 295270	2017-02-16 02:01:17 +00:00
Matt Arsenault	d2c8a337aa	AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics Update test uses with expansion in terms of new intrinsics. llvm-svn: 295269	2017-02-16 02:01:13 +00:00
Rui Ueyama	1473e5429e	Remove useless local variable. llvm-svn: 295268	2017-02-16 01:41:04 +00:00
Rui Ueyama	6237678b14	Rename variables to match the LLVM style. llvm-svn: 295265	2017-02-16 01:06:45 +00:00
Hans Wennborg	a468601e0e	[X86] Re-enable conditional tail calls and fix PR31257. This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262	2017-02-16 00:04:05 +00:00
Peter Collingbourne	08eb081ac3	PMB: Add an importing WPD pass to the start of the ThinLTO backend pipeline. Differential Revision: https://reviews.llvm.org/D30008 llvm-svn: 295260	2017-02-15 23:48:38 +00:00
Teresa Johnson	3963ba3e48	Collapse my two entries in CODE_OWNERS.txt llvm-svn: 295259	2017-02-15 23:45:21 +00:00
Tim Northover	9136617a3f	GlobalISel: legalize va_arg on AArch64. Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255	2017-02-15 23:22:50 +00:00
Tim Northover	4a652227dd	GlobalISel: support translating va_arg Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. llvm-svn: 295254	2017-02-15 23:22:33 +00:00
Daniel Berlin	3c1432fecf	Implement intrinsic mangling for literal struct types. Fixes PR 31921 Summary: Predicateinfo requires an ugly workaround to try to avoid literal struct types due to the intrinsic mangling not being implemented. This workaround actually does not work in all cases (you can hit the assert by bootstrapping with -print-predicateinfo), and can't be made to work without DFS'ing the type (IE copying getMangledStr and using a version that detects if it would crash). Rather than do that, I just implemented the mangling. It seems simple, since they are unified structurally. Looking at the overloaded-mangling testcase we have, it actually turns out the gc intrinsics will also crash if you try to use a literal struct. Thus, the testcase added fails before this patch, and works after, without needing to resort to predicateinfo. Reviewers: chandlerc, davide Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29925 llvm-svn: 295253	2017-02-15 23:16:20 +00:00
Matt Arsenault	824de226a1	AMDGPU: Remove dead node definitions llvm-svn: 295247	2017-02-15 22:23:04 +00:00
Matt Arsenault	900b21c350	Fix typos llvm-svn: 295246	2017-02-15 22:19:06 +00:00
Matt Arsenault	a78ca62c64	AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests llvm-svn: 295244	2017-02-15 22:17:09 +00:00
Eugene Zelenko	454d0cea6a	[Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295243	2017-02-15 22:17:02 +00:00
Matt Arsenault	5de8dc9cf5	DAG: Do not scalarize fsub if fneg is legal Tests will be included with future commit. llvm-svn: 295242	2017-02-15 22:02:42 +00:00
Peter Collingbourne	50cbd7cc90	Re-apply r295110 and r295144 with a fix for the ASan issue. llvm-svn: 295241	2017-02-15 21:56:51 +00:00
Matt Arsenault	d122abead4	AMDGPU: Replace assert with report_fatal_error Also use a more refined condition. llvm-svn: 295239	2017-02-15 21:50:34 +00:00
Keno Fischer	5e1e59180e	[GlobalObject] Fix setSection("") Summary: In rL291613, the section name was interned in LLVMContext. However, this broke the ability to remove the section from a GlobalObject, because it tried to intern empty strings, which is not allowed. Fix that and add an appropriate regression test. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D29795 llvm-svn: 295238	2017-02-15 21:42:42 +00:00
Sanjay Patel	845ea963aa	[InstCombine] improve formatting; NFC llvm-svn: 295237	2017-02-15 21:31:34 +00:00
Peter Collingbourne	9421c2dc54	AssumptionCache: Disable the verifier by default, move it behind a hidden cl::opt and verify from releaseMemory(). This is a short term solution to the problem that many passes currently fail to update the assumption cache. In the long term the verifier should not be controllable with a flag. We should either fix all passes to correctly update the assumption cache and enable the verifier unconditionally or somehow arrange for the assumption list to be updated automatically by passes. Differential Revision: https://reviews.llvm.org/D30003 llvm-svn: 295236	2017-02-15 21:10:09 +00:00
Simon Pilgrim	5b4c30fb32	[X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing. Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow. llvm-svn: 295235	2017-02-15 21:09:00 +00:00
Arnold Schwaighofer	8d61e0030a	AddressSanitizer: don't track swifterror memory addresses They are register promoted by ISel and so it makes no sense to treat them as memory. Inserting calls to the thread sanitizer would also generate invalid IR. You would hit: "swifterror value can only be loaded and stored from, or as a swifterror argument!" llvm-svn: 295230	2017-02-15 20:43:43 +00:00
Ahmed Bougacha	f8acf568f1	[AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish. am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT operand type. It made sense for branch targets, as those are represented as MVT::Other in SDAG. But loads operate on pointers. This shouldn't have an observable effect on any in-tree code, but helps make the patterns consistent for external users. llvm-svn: 295229	2017-02-15 20:38:31 +00:00
Ahmed Bougacha	360260066e	[OptDiag] Pass const Values/Types to Argument. NFC. llvm-svn: 295228	2017-02-15 20:38:28 +00:00
Ahmed Bougacha	f9e5a1dd88	[IR] Accept 'const Type &' in the Type operator<<. NFC. Type::print is const; there's no reason for the operator not to be. llvm-svn: 295227	2017-02-15 20:38:22 +00:00
Tobias Edler von Koch	f454b9eadf	[LTO] Add ability to emit assembly to new LTO API Summary: Add a field to LTO::Config, CGFileType, to select the file type to emit (object or assembly). This is useful for testing and to implement -save-temps. Reviewers: tejohnson, mehdi_amini, pcc Reviewed By: mehdi_amini Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D29475 llvm-svn: 295226	2017-02-15 20:36:36 +00:00
Kyle Butt	7fbec9bdf1	Codegen: Make chains from trellis-shaped CFGs Lay out trellis-shaped CFGs optimally. A trellis of the shape below: A B \|\ /\| \| \ / \| \| X \| \| / \ \| \|/ \\| C D would be laid out A; B->C ; D by the current layout algorithm. Now we identify trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an increasing number of predecessors. A trellis is a group of 2 or more predecessor blocks that all have the same successors. Because of this we can tail duplicate to extend existing trellises. As an example consider the following CFG: B D F H / \ / \ / \ / \ A---C---E---G---Ret Where A,C,E,G are all small (Currently 2 instructions). The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret. The current code will copy C into B, E into D and G into F and yield the layout A,C,B(C),E,D(E),F(G),G,H,ret define void @straight_test(i32 %tag) { entry: br label %test1 test1: ; A %tagbit1 = and i32 %tag, 1 %tagbit1eq0 = icmp eq i32 %tagbit1, 0 br i1 %tagbit1eq0, label %test2, label %optional1 optional1: ; B call void @a() br label %test2 test2: ; C %tagbit2 = and i32 %tag, 2 %tagbit2eq0 = icmp eq i32 %tagbit2, 0 br i1 %tagbit2eq0, label %test3, label %optional2 optional2: ; D call void @b() br label %test3 test3: ; E %tagbit3 = and i32 %tag, 4 %tagbit3eq0 = icmp eq i32 %tagbit3, 0 br i1 %tagbit3eq0, label %test4, label %optional3 optional3: ; F call void @c() br label %test4 test4: ; G %tagbit4 = and i32 %tag, 8 %tagbit4eq0 = icmp eq i32 %tagbit4, 0 br i1 %tagbit4eq0, label %exit, label %optional4 optional4: ; H call void @d() br label %exit exit: ret void } here is the layout after D27742: straight_test: # @straight_test ; ... Prologue elided ; BB#0: # %entry ; A (merged with test1) ; ... More prologue elided mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_2 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_3 b .LBB0_4 .LBB0_2: # %optional1 ; B (copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_4 .LBB0_3: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_5 b .LBB0_6 .LBB0_4: # %optional2 ; D (copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_5: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 b .LBB0_7 .LBB0_6: # %optional3 ; F (copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit ; Ret ld 30, 96(1) # 8-byte Folded Reload addi 1, 1, 112 ld 0, 16(1) mtlr 0 blr The tail-duplication has produced some benefit, but it has also produced a trellis which is not laid out optimally. With this patch, we improve the layouts of such trellises, and decrease the cost calculation for tail-duplication accordingly. This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have back edges, which is a negative, but it has a bigger compensating positive, which is that it handles the case where there are long strings of skipped blocks much better than the original layout. Both layouts handle runs of executed blocks equally well. Branch prediction also improves if there is any correlation between subsequent optional blocks. Here is the resulting concrete layout: straight_test: # @straight_test ; BB#0: # %entry ; A (merged with test1) mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_4 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_5 .LBB0_2: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_3: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 bne 0, .LBB0_7 b .LBB0_8 .LBB0_4: # %optional1 ; B (Copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_2 .LBB0_5: # %optional2 ; D (Copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_3 .LBB0_6: # %optional3 ; F (Copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit Differential Revision: https://reviews.llvm.org/D28522 llvm-svn: 295223	2017-02-15 19:49:14 +00:00
Xinliang David Li	538d666814	include function name in dot filename Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220	2017-02-15 19:21:04 +00:00
Arnold Schwaighofer	8eb1a48540	ThreadSanitizer: don't track swifterror memory addresses They are register promoted by ISel and so it makes no sense to treat them as memory. Inserting calls to the thread sanitizer would also generate invalid IR. You would hit: "swifterror value can only be loaded and stored from, or as a swifterror argument!" llvm-svn: 295215	2017-02-15 18:57:06 +00:00
Michael Kuperstein	ba80db39d7	[DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source We currently can't legalize those, but we should really not be creating them in the first place, since legalization would probably look similar to the way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD. This fixes PR311956. Differential Revision: https://reviews.llvm.org/D29961 llvm-svn: 295213	2017-02-15 18:37:26 +00:00
Dehao Chen	726da628e8	Expose getBaseDiscriminatorFromDiscriminator, getDuplicationFactorFromDiscriminator and getCopyIdentifierFromDiscriminator API so that downstream tools can use them to get the correct encoding. Summary: Discriminators are now encoded with rich information. This patch exposes the encoding API to downstream tools. Reviewers: davidxl, hfinkel Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29852 llvm-svn: 295210	2017-02-15 17:54:39 +00:00
Sanjay Patel	056218644b	[Inline] add tests to show attribute information loss; NFC llvm-svn: 295209	2017-02-15 17:42:58 +00:00
Simon Pilgrim	da25d5c7b6	[X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining Only do this for integer types currently - float types (in particular insertps) load folding often fails with this. llvm-svn: 295208	2017-02-15 17:41:33 +00:00
Stanislav Mekhanoshin	582a5237f9	[AMDGPU] Revert failed scheduling This patch reverts region's scheduling to the original untouched state if we have decreased occupancy. In addition it switches to use TargetRegisterInfo occupancy callback for pressure limits instead of gradually increasing limits which were just passed by. We are going to stay with the best schedule so we do not need to tolerate worsened scheduling anymore. Differential Revision: https://reviews.llvm.org/D29971 llvm-svn: 295206	2017-02-15 17:19:50 +00:00
Anna Thomas	94c8d4976c	Revert "[JumpThreading] Thread through guards" This reverts commit r294617. We fail on an assert while trying to get a condition from an unconditional branch. llvm-svn: 295200	2017-02-15 17:08:29 +00:00
Simon Pilgrim	d811bdd61a	[X86] Regenerate scalar stack reload test llvm-svn: 295195	2017-02-15 16:48:45 +00:00
David Bozier	4b21d022b7	Fix unittest for buildbot with mips host (32bit big endian) from r295174 llvm-svn: 295188	2017-02-15 16:03:22 +00:00
Sanjay Patel	288f075f8e	[InlineFunction] use getFunction(); NFC llvm-svn: 295185	2017-02-15 15:22:18 +00:00
Simon Pilgrim	1746e2152c	Fix spelling mistake - paramater -> parameter. NFCI. llvm-svn: 295182	2017-02-15 15:11:36 +00:00
Sanjay Patel	32d753cae3	[InlineFunction] use getCaller(); NFCI llvm-svn: 295181	2017-02-15 15:08:38 +00:00
Sanjay Patel	ada717e25b	[InlineFunction] use range-for loop; NFCI llvm-svn: 295179	2017-02-15 14:56:11 +00:00
Simon Pilgrim	a0e56d2d68	[X86] Regenerate i64 ext-load on 32-bit target tests llvm-svn: 295177	2017-02-15 14:06:17 +00:00
David Bozier	5c8e5f3722	Attempt to fix buildbots after commit of r295173. Unit tests needed to check on the endianness of the host platform. (Test was failing for big endian hosts). llvm-svn: 295174	2017-02-15 13:40:05 +00:00
David Bozier	4ab9a06f6a	Fix incorrect formatting of DataRefImpl members in operator<< function Changed format specifiers to use format macro constant for pointer type. Moved width part of format specifier in the correct place for formatting members a and b. Added a unit test to confirm the output. Differential Revision: https://reviews.llvm.org/D28957 llvm-svn: 295173	2017-02-15 12:58:41 +00:00
Simon Pilgrim	0f0e5bd3c6	[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs Add support for specifying an UNPCK input as ZERO, particularly improves ZEXT cases with non-zero offsets llvm-svn: 295169	2017-02-15 11:46:15 +00:00
Sagar Thakur	ec65792910	[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit. Reviewed by sdardis, dberris Differential: D27697 llvm-svn: 295164	2017-02-15 10:48:11 +00:00
Daniel Jasper	eef9b03395	Revert r295110 and r295144. This fails under ASAN: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/798/steps/check-llvm%20asan/logs/stdio llvm-svn: 295162	2017-02-15 09:56:08 +00:00
Ayman Musa	b8a4f255dd	[X86][AVX] Remove REX_W from AVX instructions. There is no meaning for REX_W in VEX encoded AVX instruction. Differential Revision: https://reviews.llvm.org/D29894 llvm-svn: 295157	2017-02-15 08:12:16 +00:00
Craig Topper	fbc7805e25	[X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types Summary: We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs. As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast. I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable. This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused. Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0. Reviewers: delena, RKSimon, zvi Reviewed By: zvi Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D28747 llvm-svn: 295155	2017-02-15 06:58:47 +00:00
Craig Topper	ec5df5f4aa	[AVX-512] Add PACKSS/PACKUS instructions to load folding tables. llvm-svn: 295154	2017-02-15 06:51:39 +00:00
Craig Topper	96ec7a23e3	[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in an extract and creates a properly aligned start index for the extract. This range finding is unnecessary: we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152	2017-02-15 05:57:16 +00:00
Lang Hames	56b3d6b151	[Orc][RPC] Add an AsyncHandlerTraits specialization for non-value-type response handler args. The specialization just inherits from the std::decay'd response handler type. This allows member functions (via MemberFunctionWrapper) to be used as async handlers. llvm-svn: 295151	2017-02-15 05:39:35 +00:00
Peter Collingbourne	96e36a67ed	AssumptionCache: Update documentation comment. The comment was somewhat misleading in that it implied that passes were not responsible for adding new assumptions to the assumption cache. This new wording now explicitly mentions that they are required to do so. Differential Revision: https://reviews.llvm.org/D29977 llvm-svn: 295148	2017-02-15 03:50:01 +00:00
Peter Collingbourne	0609acc10d	SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge. Differential Revision: https://reviews.llvm.org/D29976 llvm-svn: 295145	2017-02-15 03:01:11 +00:00
Peter Collingbourne	e2367415b6	WholeProgramDevirt: Separate the code that applies optzns from the code that decides whether to apply them. NFCI. The idea is that the apply* functions will also be called when importing devirt optimizations. Differential Revision: https://reviews.llvm.org/D29745 llvm-svn: 295144	2017-02-15 02:13:08 +00:00
Rui Ueyama	4b58f577cd	Revert r295138: Instead of a series of string operations, use snprintf(). This broke buildbots. llvm-svn: 295142	2017-02-15 01:48:33 +00:00
Rui Ueyama	aae04a9aa0	Instead of a series of string operations, use snprintf(). llvm-svn: 295138	2017-02-15 01:09:40 +00:00
Rui Ueyama	a39d148aa4	Return early. NFC. llvm-svn: 295137	2017-02-15 01:09:20 +00:00
Rui Ueyama	789c422014	Use LLVM-style naming scheme. llvm-svn: 295136	2017-02-15 01:09:01 +00:00
Stanislav Mekhanoshin	19f98c6a09	[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups This patch corrects the maximum workgroups per CU if we have big workgroups (more than 128). This calculation contributes to the occupancy calculation with respect to LDS size. Differential Revision: https://reviews.llvm.org/D29974 llvm-svn: 295134	2017-02-15 01:03:59 +00:00
Rui Ueyama	09786c4c3f	Use LLVM-style naming scheme. llvm-svn: 295132	2017-02-15 00:28:48 +00:00
Rui Ueyama	143b52c566	Remove useless local variable. llvm-svn: 295131	2017-02-15 00:28:26 +00:00
Rui Ueyama	24e27b474c	Split WinCOFFObjectWriter::defineSection. NFC. llvm-svn: 295128	2017-02-15 00:15:54 +00:00
Rui Ueyama	dfc8aa8e1b	Simplify WinCOFFObjectWriter by removing a template member function. llvm-svn: 295126	2017-02-14 23:58:19 +00:00
Rui Ueyama	0fcdb48c6e	Do not lookup a DenseMap twice using the same key. llvm-svn: 295124	2017-02-14 23:47:34 +00:00
Rui Ueyama	86e3ef92f3	Use endian::write32le instead of endian::write. llvm-svn: 295120	2017-02-14 23:28:19 +00:00
Rui Ueyama	cbb4e7c1fb	Use zero-initialization instead of memset. llvm-svn: 295119	2017-02-14 23:28:01 +00:00
Kostya Serebryany	32c5004cf5	[libFuzzer] increase the size of FixedWord from 27 to 64, see PR31950 llvm-svn: 295117	2017-02-14 23:02:37 +00:00
Dimitry Andric	9afed0377e	Disable wrapping llvm-xray YAML output Summary: The YAML output produced by llvm-xray is supposed to be wrapped at the arbitrary default of 70 columns set by `yaml:Output`. Unfortunately, the wrapping is rather unpredictable, and can easily go past the set number of columns, depending on the execution environment. To make the YAML output environment-independent, disable wrapping instead. Reviewers: dberris Reviewed By: dberris Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D29962 llvm-svn: 295116	2017-02-14 22:49:49 +00:00
Easwaran Raman	5a12f236c6	Fix a bug in caller's BFI update code after inlining. Multiple blocks in the callee can be mapped to a single cloned block since we prune the callee as we clone it. The existing code iterates over the value map and clones the block frequency (and eventually scales the frequencies of the cloned blocks). Value map's iteration is not deterministic and so the cloned block might get the frequency of any of the original blocks. The fix is to set the max of the original frequencies to the cloned block. The first block in the sequence must have this max frequency and, in the call context, subsequent blocks must have its frequency. Differential Revision: https://reviews.llvm.org/D29696 llvm-svn: 295115	2017-02-14 22:49:28 +00:00
Kostya Serebryany	ae579a79c0	Use "%zd" format specifier for printing number of testcases executed. Summary: This helps to avoid signed integer overflow after running a fast fuzz target for several hours, e.g.: <...> Done -1097903291 runs in 54001 second(s) Reviewers: kcc Reviewed By: kcc Differential Revision: https://reviews.llvm.org/D29941 llvm-svn: 295112	2017-02-14 22:14:36 +00:00
Michael Kuperstein	569162fefe	[LV] Rename Induction to PrimaryInduction. NFC. llvm-svn: 295111	2017-02-14 22:14:01 +00:00
Peter Collingbourne	534c0175b6	WholeProgramDevirt: Change internal vcall data structures to match summary. Group calls into constant and non-constant arguments up front, and use uint64_t instead of ConstantInt to represent constant arguments. The goal is to allow the information from the summary to fit naturally into this data structure in a future change (specifically, it will be added to CallSiteInfo). This has two side effects: - We disallow VCP for constant integer arguments of width >64 bits. - We remove the restriction that the bitwidth of a vcall's argument and return types must match those of the vfunc definitions. I don't expect either of these to matter in practice. The first case is uncommon, and the second one will lead to UB (so we can do anything we like). Differential Revision: https://reviews.llvm.org/D29744 llvm-svn: 295110	2017-02-14 22:12:23 +00:00
Simon Dardis	454f2e7840	[mips] Correct mips16 return instructions definitions Correct the definition of MIPS16 instructions that act as return instructions so that isReturn = 1 as expected. llvm-svn: 295109	2017-02-14 21:53:23 +00:00
Taewook Oh	2e945ebb13	[BasicBlockUtils] Use getFirstNonPHIOrDbg to set debugloc for instructions created in SplitBlockPredecessors Summary: When setting debugloc for instructions created in SplitBlockPredecessors, current implementation copies debugloc from the first-non-phi instruction of the original basic block. However, if the first-non-phi instruction is a call for @llvm.dbg.value, the debugloc of the instruction may point the location outside of the block itself. For the example code of ``` 1 typedef struct _node_t { 2 struct _node_t next; 3 } node_t; 4 5 extern node_t root; 6 7 int foo() { 8 node_t node, tmp; 9 int ret = 0; 10 11 node = tmp = root->next; 12 while (node != root) { 13 while (node) { 14 tmp = node; 15 node = node->next; 16 ret++; 17 } 18 } 19 20 return ret; 21 } ``` , below is the basicblock corresponding to line 12 after Reassociate expressions pass: ``` while.cond: ; preds = %while.cond2, %entry %node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ] %ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ] tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21 tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31 %cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33 br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35 ``` As you can see, the first-non-phi instruction is a call for @llvm.dbg.value, and the debugloc is ``` !21 = !DILocation(line: 9, column: 7, scope: !6) ``` , which is a definition of 'ret' variable and outside of the scope of the basicblock itself. However, current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses this problem by picking up debugloc from the first-non-phi-non-dbg instruction. 
Reviewers: dblaikie, samsonov, eugenis Reviewed By: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29867 llvm-svn: 295106	2017-02-14 21:10:40 +00:00
Reid Kleckner	a622fc9bdf	[BranchFolding] Tail common all identical unreachable blocks Summary: Blocks ending in unreachable are typically cold because they end the program or throw an exception, so merging them with other identical blocks is usually profitable because it reduces the size of cold code. MachineBlockPlacement generally does not arrange to fall through to such blocks, so commoning these blocks will not introduce additional unconditional branches. Reviewers: hans, iteratee, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29153 llvm-svn: 295105	2017-02-14 21:02:24 +00:00
Tim Northover	398c5f57f9	GlobalISel: deal with new G_PTR_MASK instruction on AArch64. It's just an AND-immediate instruction for us, surprisingly simple to select. llvm-svn: 295104	2017-02-14 20:56:29 +00:00
Tim Northover	c2f8956313	GlobalISel: introduce G_PTR_MASK to simplify alloca handling. This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. llvm-svn: 295103	2017-02-14 20:56:18 +00:00
Vedant Kumar	55891fc71e	Re-apply "[profiling] Remove dead profile name vars after emitting name data" This reverts 295092 (re-applies 295084), with a fix for dangling references from the array of coverage names passed down from frontends. I missed this in my initial testing because I only checked test/Profile, and not test/CoverageMapping as well. Original commit message: The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported. Fix the issue by throwing away the name constants as soon as we're done with them. Differential Revision: https://reviews.llvm.org/D29921 llvm-svn: 295099	2017-02-14 20:03:48 +00:00
Eric Christopher	14303d1815	Reformat slightly. llvm-svn: 295096	2017-02-14 19:43:50 +00:00
Wolfgang Pieb	399dcfaa2a	Reapply r294532, reverted in r294787. Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093	2017-02-14 19:08:45 +00:00
Vedant Kumar	27ebdf4bcb	Revert "[profiling] Remove dead profile name vars after emitting name data" This reverts commit r295084. There is a test failure on: http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/2620/ llvm-svn: 295092	2017-02-14 19:08:39 +00:00
Bob Wilson	4074b6b686	allow migrating away from cmake option for LLVM_DISABLE_ABI_BREAKING_CHECKS_ENFORCING In r288754, Mehdi added a cmake option to disable enforcement of the ABI breaking checks in the "abi-breaking.h" header. We used that when building Swift and it works, but I think it will be better to control this with a preprocessor macro instead of a cmake option. That will let us opt out of the enforcement more selectively. This change allows skipping the cmake setting if the existing preprocessor macro is already defined. My intention here is to make this change and get Swift to use it, and then after a few weeks, we can remove the cmake option. I want to stage it like that to be less disruptive. I'm not aware of anyone else using that cmake option. Mehdi had some initial concern about the impact of using a preprocessor macro when building with modules enabled. I don't think that will be a problem if we set the macro on the command line with a -D option in those contexts where we need to disable the enforcement of the checks. https://reviews.llvm.org/D29919 llvm-svn: 295090	2017-02-14 19:06:43 +00:00
Zachary Turner	8bd42a1a98	[Support] Add StringRef::getAsDouble. Differential Revision: https://reviews.llvm.org/D29918 llvm-svn: 295089	2017-02-14 19:06:37 +00:00
Vedant Kumar	bb10484662	[profiling] Remove dead profile name vars after emitting name data The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported. Fix the issue by throwing away the name constants as soon as we're done with them. Differential Revision: https://reviews.llvm.org/D29921 llvm-svn: 295084	2017-02-14 18:48:48 +00:00
Aditya Nandakumar	bb0483bc8e	[Tablegen] Instrumenting table gen DAGGenISelDAG To assist in debugging ISEL or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc - one which contains the patterns that are used during selection and the other containing the include source location of the patterns. Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV llvm-svn: 295081	2017-02-14 18:32:41 +00:00
Adam Nemet	4f6decadc5	[opt-viewer] For single-process, fall back on map instead of Pool.map This allows for nicer backtrace and debugging when -j1 is passed: $ opt-viewer.py CMakeFiles/LLVMScalarOpts.dir/LoopVersioningLICM.cpp.opt.yaml html Traceback (most recent call last): File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 405, in <module> generate_report(pmap, all_remarks, file_remarks, args.source_dir, args.output_dir) File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 362, in generate_report pmap(_render_file_bound, file_remarks.items()) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map return self.map_async(func, iterable, chunksize).get() File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get raise self._value Exception: blah $ opt-viewer.py -j 1 CMakeFiles/LLVMScalarOpts.dir/LoopVersioningLICM.cpp.opt.yaml html Traceback (most recent call last): File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 405, in <module> generate_report(pmap, all_remarks, file_remarks, args.source_dir, args.output_dir) File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 362, in generate_report pmap(_render_file_bound, file_remarks.items()) File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 317, in _render_file SourceFileRenderer(source_dir, output_dir, filename).render(remarks) File "/org/llvm/utils/opt-viewer/opt-viewer.py", line 168, in __init__ raise Exception("blah") Exception: blah llvm-svn: 295080	2017-02-14 18:18:58 +00:00
Krzysztof Parzyszek	d3b5641586	[Hexagon] Remove leftover debugging code llvm-svn: 295078	2017-02-14 17:37:44 +00:00
Taewook Oh	f22fa72e4a	Do not apply redundant LastCallToStaticBonus Summary: As written in the comments above, LastCallToStaticBonus is already applied to the cost if Caller has only one user, so it is redundant to reapply the bonus here. If the only user is not a caller, TotalSecondaryCost will not be adjusted anyway because callerWillBeRemoved is false. If there's no caller at all, we don't need to care about TotalSecondaryCost because inliningPreventsSomeOuterInline is false. Reviewers: chandlerc, eraman Reviewed By: eraman Subscribers: haicheng, davidxl, davide, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29169 llvm-svn: 295075	2017-02-14 17:30:05 +00:00
Adam Nemet	4c98023724	[LazyBFI] Fix typos llvm-svn: 295073	2017-02-14 17:21:12 +00:00
Adam Nemet	bbb141c734	Add new pass LazyMachineBlockFrequencyInfo And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072	2017-02-14 17:21:09 +00:00
Adam Nemet	24984e1238	[LazyBFI] Split out and templatize LazyBlockFrequencyInfo, NFC This will be used by the LazyMachineBFI pass. Differential Revision: https://reviews.llvm.org/D29834 llvm-svn: 295071	2017-02-14 17:21:04 +00:00
Sanjay Patel	a109dd1398	fix documentation comments for Argument; NFC llvm-svn: 295068	2017-02-14 16:43:49 +00:00
Brian Cain	6dedf65cc9	Correct a typo, s/hosting/hoisting/ llvm-svn: 295066	2017-02-14 16:41:10 +00:00
Diego Novillo	8adfc8ef3a	Remove unused variable. llvm-svn: 295065	2017-02-14 16:39:54 +00:00
Pavel Labath	41ec64999d	[Support] Add formatv support for StringLiteral Summary: This is achieved by generalizing the expression selecting the StringRef format_provider. Now, anything that can be converted to a StringRef will use its formatter. Reviewers: zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29898 llvm-svn: 295064	2017-02-14 16:35:56 +00:00
Matthew Simpson	f09d13e5cc	Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps" This reapplies commit r294967 with a fix for the execution time regressions caught by the clang-cmake-aarch64-quick bot. We now extend the truncate optimization to non-primary induction variables only if the truncate isn't already free. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 295063	2017-02-14 16:28:32 +00:00
Simon Pilgrim	6f732e026d	[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs Add support for specifying an UNPCK input as UNDEF llvm-svn: 295061	2017-02-14 16:22:04 +00:00
Igor Laevsky	c11c1ed909	[SCEV] Cache results during GetMinTrailingZeros query Differential Revision: https://reviews.llvm.org/D29759 llvm-svn: 295060	2017-02-14 15:53:12 +00:00
Simon Pilgrim	5b281d9a5c	[X86][SSE] Add shuffle combine tests showing missed opportunities to use UNPCK Not correctly using UNDEF or ZERO inputs to combine to UNPCK shuffles llvm-svn: 295059	2017-02-14 15:49:37 +00:00
Simon Pilgrim	8351cf1b6e	[X86][SSE] Regenerate intrinsic upgrade tests Remove excess semicolons llvm-svn: 295058	2017-02-14 15:29:50 +00:00
Alexey Bataev	2a2f35d59c	[SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put back into a vector Previously the cost of the existing ExtractElement/ExtractValue instructions was considered as a dead cost only if it was detected that they have only one use. But these instructions may be considered dead also if users of the instructions are also going to be vectorized, like: ``` %x0 = extractelement <2 x float> %x, i32 0 %x1 = extractelement <2 x float> %x, i32 1 %x0x0 = fmul float %x0, %x0 %x1x1 = fmul float %x1, %x1 %add = fadd float %x0x0, %x1x1 ``` This can be transformed to ``` %1 = fmul <2 x float> %x, %x %2 = extractelement <2 x float> %1, i32 0 %3 = extractelement <2 x float> %1, i32 1 %add = fadd float %2, %3 ``` because, although `%x0` and `%x1` have 2 users each, these users are part of the vectorized tree and we can consider these `extractelement` instructions as dead. Differential Revision: https://reviews.llvm.org/D29900 llvm-svn: 295056	2017-02-14 15:20:48 +00:00
Artyom Skrobov	dc66a82dc7	Removing a redundant assignment llvm-svn: 295055	2017-02-14 14:44:01 +00:00
Alexander Timofeev	9f61feac4a	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track" This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054	2017-02-14 14:29:05 +00:00
Simon Pilgrim	a0878dea9e	[X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK. llvm-svn: 295053	2017-02-14 13:47:17 +00:00
Simon Pilgrim	3efdffcb27	[X86][SSE] Tidy up matchVectorShuffleWithUNPCK helper function call. Don't bother setting the V1/V2 operands again for unary shuffles. Don't bother legalizing the value type unless the match succeeds. llvm-svn: 295051	2017-02-14 12:54:39 +00:00
Alexey Bataev	4ed47342ff	[SLP] Additional tests for extractelement cost fix. llvm-svn: 295050	2017-02-14 12:52:05 +00:00
Simon Pilgrim	75dda50ebe	[X86][SSE] Test case showing missed PSHUFB target shuffle constant fold opportunity. It also shows an unnecessary pshufb/broadcast being used - the original pshufb mask only requested the lowest byte. llvm-svn: 295046	2017-02-14 11:20:11 +00:00
Karl-Johan Karlsson	ec21b769ec	Revert "[LoopVectorize] Added address space check when analysing interleaved accesses" This reverts r295038. The buildbot clang-with-thin-lto-ubuntu failed. I'm reverting to investigate. llvm-svn: 295042	2017-02-14 10:06:16 +00:00
Karl-Johan Karlsson	2ec409cca2	[LoopVectorize] Added address space check when analysing interleaved accesses Prevent memory objects of different address spaces from being part of the same load/store groups when analysing interleaved accesses. This fixes PR31900. Reviewers: HaoLiu, mssimpso, mkuper Reviewed By: mssimpso, mkuper Subscribers: llvm-commits, efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D29717 llvm-svn: 295038	2017-02-14 08:14:06 +00:00
Karl-Johan Karlsson	38cbf5869d	Test commit permission Removing whitespace. llvm-svn: 295037	2017-02-14 07:31:36 +00:00
Daniel Jasper	29b46f74f7	Add initializer that was missed in r295009. llvm-svn: 295036	2017-02-14 07:10:03 +00:00
Craig Topper	d2d50cba2a	[AVX-512] Add PAVGB/PAVGW to load folding tables. llvm-svn: 295035	2017-02-14 06:54:57 +00:00
Mikael Holmen	ece84cd10c	[LSR] Pointers with different address spaces are considered incompatible. Summary: Function isCompatibleIVType is already used as a guard before the call to SE.getMinusSCEV(OperExpr, PrevExpr); in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions to be of the same type, so we now consider two pointers with different address spaces to be incompatible, since it is possible that the pointers in fact have different sizes. Reviewers: qcolombet, eli.friedman Reviewed By: qcolombet Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29885 llvm-svn: 295033	2017-02-14 06:37:42 +00:00
Lang Hames	f401077c29	[Orc][RPC] Remove launch policies in favor of async handlers. Launch policies provided a mechanism for running RPC handlers on a background thread (unblocking the main RPC receiver thread). Async handlers generalize this by passing the responder function (the function that sends the RPC return value) as an argument to the handler. The handler can optionally do its work on a background thread (the same way launch policies do), but can also (a) inspect the call arguments before deciding to run the work on a different thread, or (b) use the responder in a subsequent RPC call (e.g. in the handler of a callAsync), allowing the handler to call back to the originator (or to a 3rd party) without blocking the listener thread, and without launching a new thread. llvm-svn: 295030	2017-02-14 05:40:01 +00:00
Alex Bradbury	e4f731b813	[RISCV] Fix RV32 datalayout string and ensure initAsmInfo is called llvm-svn: 295028	2017-02-14 05:20:20 +00:00
Alex Bradbury	6be16fbfb8	[RISCV] Pseudo instructions are isCodeGenOnly, have blank asmstr llvm-svn: 295027	2017-02-14 05:17:23 +00:00
Alex Bradbury	d36e04cb6c	[RISCV] Fix unused variable in RISCVMCTargetDesc. NFC Also, for better uniformity use TargetRegistry::RegisterMCAsmInfo rather than RegisterMCAsmInfoFn. Again, no functional change. llvm-svn: 295026	2017-02-14 05:15:24 +00:00
Peter Collingbourne	002c2d5380	ThinLTOBitcodeWriter: Write available_externally copies of VCP eligible functions to merged module. Differential Revision: https://reviews.llvm.org/D29701 llvm-svn: 295021	2017-02-14 03:42:38 +00:00
Mehdi Amini	a0ddb1ed46	[ThinLTO] Make a copy of buffer identifier in ThinLTOCodeGenerator We can't assume that the `const char *` provided through libLTO has a lifetime that extends beyond the codegenerator itself. llvm-svn: 295018	2017-02-14 02:20:51 +00:00
Philip Reames	b2bca7e309	[LICM] Make store promotion work in the face of unordered atomics Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled. Most of the change is straightforward; the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct. It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following: while(b) { load i128 ... if (can lower i128 atomic) store atomic i128 ... else store i128 } It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower an i128 atomically. It's not clear we need to be this conservative - arguably the program above is broken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result. Differential Revision: https://reviews.llvm.org/D15592 llvm-svn: 295015	2017-02-14 01:38:31 +00:00
Reid Kleckner	e2fa5492b2	Undef MemoryFence, which is defined to _mm_mfence by winnt.h llvm-svn: 295014	2017-02-14 01:38:14 +00:00
Reid Kleckner	a7661c842f	Use std::call_once on Windows Previously we could not use it because std::once_flag's default constructor was not constexpr. Today, all supported versions of VS correctly mark it constexpr. I confirmed that MSVC 2015 does not emit any problematic racy dynamic initialization code, so we should be safe to use this now. llvm-svn: 295013	2017-02-14 01:21:39 +00:00
Eugene Zelenko	d96089b248	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009	2017-02-14 00:33:36 +00:00
Peter Collingbourne	c45f7f3eb4	FunctionAttrs: Factor out a function for querying memory access of a specific copy of a function. NFC. This will later be used by ThinLTOBitcodeWriter to add copies of readnone functions to the regular LTO module. Differential Revision: https://reviews.llvm.org/D29695 llvm-svn: 295008	2017-02-14 00:28:13 +00:00
Michael Kuperstein	47a8b6829c	Silence redundant semicolon warnings. NFC. llvm-svn: 295005	2017-02-13 23:42:27 +00:00
Andrew Kaylor	709f1c2a9b	[X86] Add MXCSR register This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions. Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now. Differential Revision: https://reviews.llvm.org/D29903 llvm-svn: 295004	2017-02-13 23:38:52 +00:00
Sanjoy Das	5be2e8415c	[LangRef] Explicitly allow readnone and readonly functions to unwind Summary: This change edits the language reference to explicitly allow the existence of readnone and readonly functions that can throw. Full discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-January/108637.html Reviewers: dberlin, chandlerc, hfinkel, majnemer Reviewed By: majnemer Subscribers: majnemer, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D28740 llvm-svn: 295000	2017-02-13 23:19:07 +00:00
Sanjoy Das	a3ff994268	[LangRef] Update the TBAA section Summary: Update the TBAA section to mention the struct path TBAA that LLVM implements today. This is not a proposal or change in semantics -- it is intended only to document what LLVM already does today. This is related to https://reviews.llvm.org/D26438 where I've tried to implement some of the constraints as verifier checks. Reviewers: anna, reames, rsmith, chandlerc, hfinkel, rjmccall, mehdi_amini, dexonsmith, manmanren Reviewed By: manmanren Subscribers: dberlin, dberris, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26831 llvm-svn: 294999	2017-02-13 23:14:03 +00:00
Sanjay Patel	4f74216da0	[FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent function As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html ...we should be able to propagate 'nonnull' info from a callsite back to its parent. The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem. The transform is currently disabled by default while we wait for clang to work-around potential security problems: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html Differential Revision: https://reviews.llvm.org/D27855 llvm-svn: 294998	2017-02-13 23:10:51 +00:00
Amaury Sechet	3422b539c8	Revert autogenerated check result for test/CodeGen/X86/atomic-minmax-i6432.ll as they don't regenerate cleanly. llvm-svn: 294996	2017-02-13 23:00:23 +00:00
Tim Northover	48dfa1a6ed	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Tim Northover	b73e309071	MIR: parse & print the atomic parts of a MachineMemOperand. We're going to need them very soon for GlobalISel. llvm-svn: 294992	2017-02-13 22:14:08 +00:00
Reid Kleckner	b74485dfaa	[CodeGen] Use bitfields instead of manual masks in ArgFlagsTy, NFC This revealed that we actually have 8 more unused flag bits, and byval size doesn't need to be a bitfield at all. This came up during code review here: https://reviews.llvm.org/D29668#inline-258469 llvm-svn: 294989	2017-02-13 21:33:26 +00:00
Taewook Oh	4d35f9e10e	Address post-commit comments for https://reviews.llvm.org/D29596 . NFCI. llvm-svn: 294985	2017-02-13 21:12:27 +00:00
Arnold Schwaighofer	8f3df731dc	swiftcc: Don't emit tail calls from callers with swifterror parameters Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982	2017-02-13 19:58:28 +00:00
Peter Collingbourne	2b33f65317	IR: Type ID summary extensions for WPD; thread summary into WPD pass. Make the whole thing testable by adding YAML I/O support for the WPD summary information and adding some negative tests that exercise the YAML support. Differential Revision: https://reviews.llvm.org/D29782 llvm-svn: 294981	2017-02-13 19:26:18 +00:00
Alexey Bataev	7bed48e7a3	[SLP] Test for extractelement cost fix. llvm-svn: 294980	2017-02-13 19:08:19 +00:00
Taewook Oh	06a2128cfa	Make MachineBasicBlock::updateTerminator to update DebugLoc as well Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example: ``` 1 extern int bar(int x); 2 3 int foo(int begin, int end) { 4 int i; 5 int ret = 0; 6 for ( 7 i = begin ; 8 i != end ; 9 i++) 10 { 11 ret += bar(i); 12 } 13 return ret; 14 } ``` Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3: ``` define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 { entry: %cmp6 = icmp eq i32* %begin, %end, !dbg !9 br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12 for.body.preheader: ; preds = %entry br label %for.body, !dbg !13 for.body: ; preds = %for.body.preheader, %for.body %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ] %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15 %call = tail call i32 @bar(i32 %0), !dbg !19 %add = add nsw i32 %call, %ret.08, !dbg !20 %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21 %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9 br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22 for.end.loopexit: ; preds = %for.body br label %for.end, !dbg !24 for.end: ; preds = %for.end.loopexit, %entry %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ] ret i32 %ret.0.lcssa, !dbg !24 } ``` where ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` . As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. 
However, after the execution of 'MachineBasicBlock::updateTerminator' function, which is triggered by MachineSinking pass, the DebugLoc info is dropped as below (see there's no debug-location for JNE_1): ``` bb.0.entry: successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000) liveins: %rdi, %rsi %6 = COPY %rsi %5 = COPY %rdi %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9 JNE_1 %bb.1.for.body.preheader, implicit %eflags ``` This patch addresses this issue and makes newly created branch instructions keep debug-location info. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D29596 llvm-svn: 294976	2017-02-13 18:15:31 +00:00
Matthew Simpson	659f92e2aa	Revert "[LV] Extend trunc optimization to all IVs with constant integer steps" This reverts commit r294967. This patch caused execution time slowdowns in a few LLVM test-suite tests, as reported by the clang-cmake-aarch64-quick bot. I'm reverting to investigate. llvm-svn: 294973	2017-02-13 18:02:35 +00:00
Quentin Colombet	fbae5fcb96	[FastISel] Add a diagnostic to warn on fallback. This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970	2017-02-13 17:38:59 +00:00
James Molloy	0ae2202235	[ARM] Fix crash caused by r294945 I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968	2017-02-13 17:18:00 +00:00
Matthew Simpson	7b7f40297f	[LV] Extend trunc optimization to all IVs with constant integer steps This patch extends the optimization of truncations whose operand is an induction variable with a constant integer step. Previously we were only applying this optimization to the primary induction variable. However, the cost model assumes the optimization is applied to the truncation of all integer induction variables (even regardless of step type). The transformation is now applied to the other induction variables, and I've updated the cost model to ensure it is better in sync with the transformation we actually perform. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 294967	2017-02-13 16:48:00 +00:00
Simon Dardis	d9858dfdee	[mips] Fix failing test. llvm-svn: 294966	2017-02-13 16:42:35 +00:00
Sanjay Patel	a62b8ce323	fix documentation comments; NFC llvm-svn: 294964	2017-02-13 16:17:29 +00:00
Davide Italiano	cd6d7b1c5b	[llvm-lto2] Fix typo spotted by Teresa (r294885 post-commit review). llvm-svn: 294962	2017-02-13 16:08:36 +00:00
Simon Dardis	509da1a46d	[mips] divide macro instruction cleanup. Clean up the implementation of divide macro expansion by getting rid of a FIXME regarding magic numbers and branch instructions. Match GAS' behaviour for expansion of ddiv / div in the two and three operand cases. Add the two operand alias for MIPSR6. Finally, optimize macro expansion cases where the divisor is the $zero register. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29887 llvm-svn: 294960	2017-02-13 16:06:48 +00:00
Simon Pilgrim	fd6a84fbaa	Fix indentation. NFCI. llvm-svn: 294959	2017-02-13 15:31:08 +00:00
Davide Italiano	513dfaa0a3	[PM] Hook up the instrumented PGO machinery in the new PM. Differential Revision: https://reviews.llvm.org/D29308 llvm-svn: 294955	2017-02-13 15:26:22 +00:00
Davide Italiano	20a895c4be	[LTO] Make sure we flush buffers to work around linker shenanigans. lld, at least, doesn't call global destructors by default (unless --full-shutdown is passed) because it's, allegedly, expensive. llvm-svn: 294953	2017-02-13 14:39:51 +00:00
Simon Pilgrim	ce2cb2d968	[X86][SSE] Add v4f32 and v2f64 extract to store tests llvm-svn: 294952	2017-02-13 14:20:13 +00:00
Sanne Wouda	490d4a6da6	[CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base Summary: The attached test case fails with "fatal error: error in backend: misaligned pc-relative fixup value" as the jump table is misaligned. The EmitAlignment existed already for ARM and Thumb-1 code, but was missing for Thumb-2. The test checks that the fatal error disappears when generating an obj file, as well as checking the align directive is there when producing an asm file. Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D29650 llvm-svn: 294950	2017-02-13 14:07:45 +00:00
James Molloy	92497542e7	[Thumb-1] TBB generation: spot redefinitions of index register We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that a particular register in that sequence is killed (so it can be clobbered by the pseudo). We weren't noticing if an errant MOV or other instruction had infiltrated the sequence we were walking. If it had, and it defined the register we've already identified as killed, it makes it live across the tBR_JT and thus unclobberable. Notice this case and bail out. llvm-svn: 294949	2017-02-13 14:07:39 +00:00
James Molloy	9b3b899669	[ARM] Register ConstantIslands with the pass manager This allows us to use -stop-before/-stop-after/-run-pass - we can now write .mir tests. llvm-svn: 294948	2017-02-13 14:07:25 +00:00
Sanne Wouda	91eadad3bd	[Assembler] Improve diagnostics for inline assembly. Summary: Keep a vector of LocInfos around; one for each call to EmitInlineAsm. Since each call to EmitInlineAsm creates a new buffer in the inline asm SourceMgr, we can use the buffer number to map to the right LocInfo. Reviewers: rengolin, grosbach, rnk, echristo Reviewed By: rnk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29769 llvm-svn: 294947	2017-02-13 13:58:00 +00:00
Simon Pilgrim	0de807f878	[X86][SSE] Add more thorough extract to store tests Added v4i32 and v2i64 tests and test on i686 as well as x86_64. llvm-svn: 294946	2017-02-13 13:40:12 +00:00
James Molloy	d508789668	[ARM] Use VCMP, not VCMPE, for floating point equality comparisons When generating a floating point comparison we currently unconditionally generate VCMPE. This has the side effect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of the relationships - less, greater, and equal - is true. Relational operators may raise the "invalid" floating-point exception when argument values are NaNs. The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945	2017-02-13 12:32:47 +00:00
Simon Pilgrim	828dee1f70	[X86][SSE] Create matchVectorShuffleWithUNPCK helper function. Currently only used by target shuffle combining - will use it for lowering as well in a future patch. llvm-svn: 294943	2017-02-13 11:52:58 +00:00
Pierre Gousseau	796e0d6df1	[X86] Improve readability of test/CodeGen/X86/lzcnt-zext-cmp.ll by adding a common check prefix ALL. NFC. llvm-svn: 294938	2017-02-13 09:57:17 +00:00
Ayman Musa	f77219e035	[X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction. Differential Revision: https://reviews.llvm.org/D29873 llvm-svn: 294937	2017-02-13 09:55:48 +00:00
Andrew V. Tischenko	8da96914f9	Compile time decreasing in the case we're dealing with Machine Combiner. Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see below). Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz DAGCombiner - trunk time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.685s DAGCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.655s MachineCombiner w/o Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m21.614s MachineCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.593s The test spill_fdiv.ll is attached to D29627 D29627 should be closed. llvm-svn: 294936	2017-02-13 09:43:37 +00:00
Alexey Bataev	e8b1536e21	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like: ``` float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } ``` Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D29727 llvm-svn: 294934	2017-02-13 08:01:26 +00:00
Craig Topper	3668bde371	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of an undef vector can just use the original input to the extract. llvm-svn: 294932	2017-02-13 04:53:33 +00:00
Craig Topper	680c73e7ab	[X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors. llvm-svn: 294931	2017-02-13 04:53:29 +00:00
Craig Topper	aa46204ed9	[DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930	2017-02-12 23:49:49 +00:00
Craig Topper	53eafa8ea4	[X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR. This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist. llvm-svn: 294929	2017-02-12 23:49:46 +00:00
Daniel Berlin	1bcd504a88	NewGVN: Update a number of xfailed tests to either be correct or note why they fail. llvm-svn: 294928	2017-02-12 23:28:06 +00:00
Daniel Berlin	2ef385d019	NewGVN: We really pass TBAA if we enable DCE and fix the test. Note that GVN eliminates no-use readonly/readnone calls, even if they are not marked nounwind. NewGVN only eliminates them if they are marked nounwind, and thus, trivially dead. llvm-svn: 294927	2017-02-12 23:24:47 +00:00
Daniel Berlin	4d54796f87	NewGVN: Reverse order of congruence class elimination to maximize trivial deadness llvm-svn: 294926	2017-02-12 23:24:45 +00:00
Daniel Berlin	508a1dec94	NewGVN: Use shouldSwapOperands in one more place llvm-svn: 294925	2017-02-12 23:24:42 +00:00
Sanjay Patel	0557a44287	[TargetLowering] fix SETCC SETLT folding with FP types The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924	2017-02-12 23:07:52 +00:00
Daniel Berlin	31e1b8fe48	Revert accidental commit titled "testing" This reverts commit r294919 llvm-svn: 294923	2017-02-12 22:40:10 +00:00
Daniel Berlin	86eab15f2b	NewGVN: Apply the fast math flags fix in r267113 to NewGVN as well. llvm-svn: 294922	2017-02-12 22:25:20 +00:00
Daniel Berlin	dbe8264c93	PredicateInfo: Handle critical edges Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list. Depends on D29519 Reviewers: davide, nlopes Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29606 llvm-svn: 294921	2017-02-12 22:12:20 +00:00
Daniel Berlin	eccb8740d1	NewGVN: Fix missed call that should be to shouldSwapOperands llvm-svn: 294920	2017-02-12 22:02:47 +00:00
Daniel Berlin	3fecad0d3e	testing llvm-svn: 294919	2017-02-12 22:02:20 +00:00
Simon Pilgrim	cc9242bd1c	[X86] Fix typo in function name. NFCI. convertBitVectorToUnsiged - convertBitVectorToUnsigned llvm-svn: 294914	2017-02-12 20:53:44 +00:00
Saleem Abdulrasool	4b08913de1	llvm-readobj: process FreeBSD core notes core files on FreeBSD have additional notes to capture state. Process those notes when dumping the notes. llvm-svn: 294909	2017-02-12 18:55:33 +00:00
Craig Topper	cfe8ce3a58	[AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide. llvm-svn: 294908	2017-02-12 18:47:46 +00:00
Craig Topper	5971b5488e	[AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same instruction from AVX/SSE. I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables. llvm-svn: 294907	2017-02-12 18:47:44 +00:00
Craig Topper	ec26801483	[X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate they are stores. AVX-512 version was already named with 'mr'. llvm-svn: 294906	2017-02-12 18:47:40 +00:00
Craig Topper	6eca3170a8	[AVX-512] Add VPEXTRD/Q to load folding tables. llvm-svn: 294905	2017-02-12 18:47:37 +00:00
Simon Pilgrim	04ec0f2b2a	[X86][SSE] Update argument names to match function name. NFCI. The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently. llvm-svn: 294900	2017-02-12 16:46:41 +00:00
Sanjay Patel	45b7e69fef	[InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) I found one special case of this transform for 'slt 0', so I removed that and added the general transform. Alive code to check correctness: Name: slt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp slt %a, C1 => %b = icmp slt %x, C1 - C2 Name: sgt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp sgt %a, C1 => %b = icmp sgt %x, C1 - C2 http://rise4fun.com/Alive/MH Differential Revision: https://reviews.llvm.org/D29774 llvm-svn: 294898	2017-02-12 16:40:30 +00:00
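The fold described in the entry above can be sanity-checked by brute force on a narrow integer width. The following is a minimal sketch, not the Alive proof from the commit: the names `wrap` and `check_fold` and the 4-bit default width are assumptions made for illustration. Inputs where the `add nsw` would overflow are skipped, on the assumption that the result is then poison and any outcome is acceptable.

```python
from itertools import product


def wrap(value, bits):
    """Interpret value modulo 2**bits as a signed two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def check_fold(bits=4):
    """Exhaustively compare icmp slt/sgt (add nsw x, c2), c1 with icmp slt/sgt x, (c1 - c2).

    Returns (checked, bad): the number of cases verified under the
    no-signed-overflow precondition on c1 - c2, and the number of
    mismatches found when that precondition is ignored.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    checked = bad = 0
    for x, c1, c2 in product(range(lo, hi + 1), repeat=3):
        if not lo <= x + c2 <= hi:
            continue  # add nsw overflows: treated as poison, nothing to compare
        folded = wrap(c1 - c2, bits)  # constant the fold would materialize
        same = ((x + c2 < c1) == (x < folded)) and ((x + c2 > c1) == (x > folded))
        if lo <= c1 - c2 <= hi:
            assert same, (x, c1, c2)  # precondition holds: fold must agree
            checked += 1
        elif not same:
            bad += 1  # precondition violated: fold may give a wrong answer
    return checked, bad
```

Calling `check_fold()` raises if the fold ever disagrees while the precondition holds; the second returned count shows why the `WillNotOverflowSignedSub(C1, C2)` precondition is needed.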
Sanjay Patel	97e4b98749	[ValueTracking] use nonnull argument attribute to eliminate null checks Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that. This is part of solving: https://llvm.org/bugs/show_bug.cgi?id=28430 Differential Revision: https://reviews.llvm.org/D28204 llvm-svn: 294897	2017-02-12 15:35:34 +00:00
Simon Pilgrim	4cd841757a	[X86][AVX2] Add support for combining target shuffles to VPMOVZX Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch. llvm-svn: 294896	2017-02-12 14:31:23 +00:00
NAKAMURA Takumi	022c6e4f33	AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. This function returned true or undef. llvm-svn: 294895	2017-02-12 13:15:31 +00:00
Dorit Nuzman	eac89d736c	[LV/LoopAccess] Check statically if an unknown dependence distance can be proven larger than the loop-count This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which is not always possible). For vectorization it is sufficient to prove that the dependence distance is >= VF; But in some cases we can prune unknown dependence distances early, and even before selecting the VF, and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test. Reviewers: mkuper, anemet, sanjoy Differential Revision: https://reviews.llvm.org/D28044 llvm-svn: 294892	2017-02-12 09:32:53 +00:00
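The reasoning in the entry above (distance >= LoopCount implies distance >= VF, so the dependence cannot be violated) can be illustrated with a toy model. This is a sketch under stated assumptions, not LLVM code: `scalar_loop` and `vectorized_loop` are hypothetical helpers, and the "vector" version simply performs all loads of a VF-wide chunk before any of its stores.

```python
def scalar_loop(a, n, d):
    """Reference semantics: for i in [0, n): a[i + d] = a[i] + 1."""
    a = list(a)
    for i in range(n):
        a[i + d] = a[i] + 1
    return a


def vectorized_loop(a, n, d, vf):
    """Same loop, processed in chunks of vf with loads hoisted above stores."""
    a = list(a)
    i = 0
    while i + vf <= n:
        loaded = [a[i + k] for k in range(vf)]  # wide load
        for k in range(vf):                     # wide store
            a[i + d + k] = loaded[k] + 1
        i += vf
    for j in range(i, n):                       # scalar remainder
        a[j + d] = a[j] + 1
    return a
```

With `a = [0] * 32`, `n = 8`, and `vf = 4`, a distance of `d = 8` (equal to the trip count) gives the same result from both versions, while `d = 1` does not, because the second chunk reads values the scalar loop would already have updated.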
Elena Demikhovsky	5d91ab46c0	AVX-512: Fixed DWARF register numbers for XMM16-31 The reference is here: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf llvm-svn: 294890	2017-02-12 07:56:50 +00:00
Davide Italiano	77d42eac64	[LTO] Remove useless redirection from test. NFCI. llvm-svn: 294889	2017-02-12 05:43:25 +00:00
Chandler Carruth	719ffe1a66	[PM] Add devirtualization-based iteration utility into the new PM's default pipeline. A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems. Differential Revision: https://reviews.llvm.org/D29853 llvm-svn: 294888	2017-02-12 05:38:04 +00:00
Chandler Carruth	e87fc8cb71	[PM] Enable GlobalsAA in the new PM's pipeline by default. All the invalidation issues and bugs in this seem to be fixed, it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used. Differential Revision: https://reviews.llvm.org/D29815 llvm-svn: 294887	2017-02-12 05:34:04 +00:00
Davide Italiano	6cb6f997d8	[lib/LTO] Add support for hotness optremarks in the new API. llvm-svn: 294885	2017-02-12 05:05:35 +00:00
Davide Italiano	1e30b3d7be	[LTO] Simplify this test quite a bit, @func2 is unused/unneeded. llvm-svn: 294884	2017-02-12 03:47:54 +00:00
Davide Italiano	fb6ed9114f	[llvm-lto2] Fix typo in error message. llvm-svn: 294883	2017-02-12 03:42:09 +00:00
Davide Italiano	ebd471974a	[lib/LTO] Initial support for optimization remarks in the new API. llvm-svn: 294882	2017-02-12 03:31:30 +00:00
NAKAMURA Takumi	4918901f9e	Kaleidoscope-Ch7: Add TransformUtils for llvm::createPromoteMemoryToRegisterPass() added in r294870. llvm-svn: 294881	2017-02-12 01:18:32 +00:00
Craig Topper	04840ab752	[X86] Update test case I missed in r294876. llvm-svn: 294878	2017-02-11 23:23:11 +00:00
Craig Topper	1c37e991e6	[X86] Move code for using blendi for insert_subvector out to an isel pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend. llvm-svn: 294876	2017-02-11 22:57:12 +00:00
Craig Topper	b633adedc7	[DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more generic to support larger concats. llvm-svn: 294875	2017-02-11 22:57:09 +00:00
Simon Pilgrim	755d9127f5	[X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG Preparatory step for PR31712 llvm-svn: 294874	2017-02-11 22:47:06 +00:00
Simon Pilgrim	437d64c49e	[X86][SSE] Improve VSEXT/VZEXT constant folding. Generalize VSEXT/VZEXT constant folding to work with any target constant bits source not just BUILD_VECTOR . llvm-svn: 294873	2017-02-11 21:55:24 +00:00
Mehdi Amini	bb6805d263	Update Kaleidoscope tutorial and improve Windows support Many quoted code blocks were not in sync with the actual toy.cpp files. Improve tutorial text slightly in several places. Added some step descriptions crucial to avoid crashes (like InitializeNativeTarget* calls). Solve/workaround problems with Windows (JIT'ed method not found, using custom and standard library functions from host process). Patch by: Moritz Kroll <moritz.kroll@gmx.de> Differential Revision: https://reviews.llvm.org/D29864 llvm-svn: 294870	2017-02-11 21:26:52 +00:00
Amaury Sechet	cafc256fd4	Fix atomic-minmax-i6432.ll . llvm-svn: 294867	2017-02-11 19:34:11 +00:00
Amaury Sechet	42fb927438	Regen expected tests result. NFC llvm-svn: 294866	2017-02-11 19:27:15 +00:00
Aaron Ballman	b802b8d75b	Correcting several sphinx errors; should fix the LLVM documentation build. llvm-svn: 294865	2017-02-11 18:45:24 +00:00
Simon Pilgrim	4ef9672f0f	[X86][SSE] Add early-out when trying to match blend shuffle. NFCI. llvm-svn: 294864	2017-02-11 18:06:24 +00:00
Sanjay Patel	63499b61c9	[TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863	2017-02-11 18:01:55 +00:00
Amaury Sechet	9df26d330f	Fix typo in test filename. NFC llvm-svn: 294860	2017-02-11 17:48:49 +00:00
Amaury Sechet	58ce15aba1	Fix indentation in X86ISelLowering. NFC llvm-svn: 294859	2017-02-11 17:48:48 +00:00
Craig Topper	255343483d	[AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables. llvm-svn: 294858	2017-02-11 17:35:28 +00:00
Craig Topper	b2fa216dd5	[X86] Improve alphabetizing of load folding tables. NFC llvm-svn: 294857	2017-02-11 17:35:25 +00:00
Simon Pilgrim	0e6945e48a	[X86][SSE] Convert getTargetShuffleMaskIndices to use getTargetConstantBitsFromNode. Removes duplicate constant extraction code in getTargetShuffleMaskIndices. getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fail if the caller doesn't support undef bits. llvm-svn: 294856	2017-02-11 17:27:21 +00:00
Simon Pilgrim	d59fa0e38a	[X86] Merge repeated getScalarValueSizeInBits calls. NFCI. llvm-svn: 294852	2017-02-11 16:42:07 +00:00
Daniel Berlin	22a4a01ffa	NewGVN: Reverse sense of this test to make it clearer llvm-svn: 294851	2017-02-11 15:20:15 +00:00
Daniel Berlin	1529bb93c9	NewGVN: Add missing initialization of NumFuncArgs lost due to bad merge. llvm-svn: 294850	2017-02-11 15:13:49 +00:00
Daniel Berlin	1c08767f88	NewGVN: Rank and order commutative operands consistently. llvm-svn: 294849	2017-02-11 15:07:01 +00:00
Simon Pilgrim	86a95c1ff7	[X86][3DNow!] Add tests to ensure PFMAX/PFMIN are not commuted. llvm-svn: 294848	2017-02-11 14:01:37 +00:00
Simon Pilgrim	6411a0ebed	[X86][3DNow!] Enable PFSUB<->PFSUBR commutation llvm-svn: 294847	2017-02-11 13:51:14 +00:00
Simon Pilgrim	4ead1d4aa9	[X86][3DNow!] Enable commutation for PFADD/PFMUL/PFCMPEQ/PAVGUSB/PMULHRW All commutations confirmed to give identical results - note PFMAX/PFMIN do not. PFSUB<->PFSUBR should be commutable as well. llvm-svn: 294846	2017-02-11 13:32:55 +00:00
Simon Pilgrim	6b4a5134af	[X86][3DNow!] Add tests showing missed commutation opportunities. llvm-svn: 294845	2017-02-11 13:00:32 +00:00
Daniel Berlin	b79f53669a	NewGVN: Clean up how we handle the INITIAL class so that everything in it is dead or unreachable, as it should be. This also makes the leader of INITIAL undef, enabling us to handle irreducibility properly. Summary: This lets us verify, more than we do now, that we didn't screw up value numbering. Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D29842 llvm-svn: 294844	2017-02-11 12:48:50 +00:00
Vitaly Buka	bcb6622c95	Fix "left shift of negative value -1" introduced by r294805 llvm-svn: 294843	2017-02-11 12:44:03 +00:00
Simon Pilgrim	8158816efe	[X86][XOP] Regenerate XOP commutation tests. Added 32-bit tests as well. llvm-svn: 294841	2017-02-11 12:30:59 +00:00
Simon Pilgrim	008ba63e04	[X86][SSE] Regenerate float comparison commutation tests. llvm-svn: 294840	2017-02-11 12:29:56 +00:00
Simon Pilgrim	0d8632f089	[X86] Regenerate CLMUL commutation tests. llvm-svn: 294839	2017-02-11 12:23:22 +00:00
Benjamin Kramer	efcf06f5f2	Move symbols from the global namespace into (anonymous) namespaces. NFC. llvm-svn: 294837	2017-02-11 11:06:55 +00:00
Craig Topper	1f6153bab4	[AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables. llvm-svn: 294830	2017-02-11 07:01:40 +00:00
Craig Topper	a9818aadab	[AVX-512] Fix apparent typo in instruction name VMOVSSDrr_REV->VMOVSDZrr_REV. llvm-svn: 294829	2017-02-11 07:01:38 +00:00
Craig Topper	3afa777f10	[AVX-512] Add VPSADBW instructions to load folding tables. llvm-svn: 294827	2017-02-11 06:24:03 +00:00
Evgeny Stupachenko	5f3d9b6c09	The patch fixes r294821 Summary: Update register match for windows testing From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294825	2017-02-11 05:39:00 +00:00
Craig Topper	464b8cb244	[X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available. Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available. This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp. Overall I think this produces better results in the modified test cases. llvm-svn: 294824	2017-02-11 05:32:57 +00:00
Peter Collingbourne	fa3175f2f6	Address Mehdi's post-commit review comments on r294795. llvm-svn: 294822	2017-02-11 03:19:22 +00:00
Evgeny Stupachenko	fe6f548d2d	Fix PR23384 (under "-lsr-insns-cost" option) Summary: The patch adds the number of instructions generated by a solution to the LSR cost under "-lsr-insns-cost" option. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D28307 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294821	2017-02-11 02:57:43 +00:00
Ahmed Bougacha	8425f453ef	[ARM] Make f16 interleaved accesses expensive. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Teach the cost model to consider f16 interleaved operations as expensive. Otherwise, we are all but guaranteed to end up with a large block of scalarized vector code. llvm-svn: 294819	2017-02-11 01:53:04 +00:00
Ahmed Bougacha	fc979dc9dd	[ARM] Don't lower f16 interleaved accesses. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818	2017-02-11 01:53:00 +00:00
Ahmed Bougacha	f37fb89edc	[ARM] Unique some redundant CHECK lines. NFC. llvm-svn: 294817	2017-02-11 01:52:57 +00:00
Wei Mi	8f20e63a20	[LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop. The recommit includes some changes of testcases. No functional change to the patch. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handles inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 294814	2017-02-11 00:50:23 +00:00
Eugene Zelenko	d3a6c897ba	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 294813	2017-02-11 00:27:28 +00:00
Matthias Braun	59797eccea	config-ix.cmake: Search for CMAKE_XCRUN before using it. This was previously searched in CMakeLists.txt unconditionally but as of r294371 it is only searched in some circumstances. Repeating the search in config-ix.cmake to make this robust and hopefully fix the macOS Asan+Ubsan jenkins build. llvm-svn: 294811	2017-02-11 00:14:01 +00:00
Chandler Carruth	027340f3b9	[PM] Fix a bug in how I ported LoopDeletion to the new PM. This was marking the loop for deletion after the loop was deleted. This almost works, except that when we do any kind of debug logging it starts reading the name of the loop from deleted memory or otherwise blowing up. This can fail in a bunch of ways. I recently added a test that always does this, and it started failing on the sanitizer bots. The fix is to mark the loop as deleted in the loop PM infrastructure before we remove the loop. We can do this by passing the updater into the routine. That also lets us simplify a bunch of other interface components here for a net win. llvm-svn: 294810	2017-02-11 00:09:30 +00:00
Dan Gohman	dfe6ce7abd	[WebAssembly] Remove old experimental disassembler code. Remove support for disassembling an old experimental wasm binary format, which is no longer in use anywhere. llvm-svn: 294809	2017-02-11 00:02:23 +00:00
Saleem Abdulrasool	769b98d327	vim: add `returned` keyword The `returned` keyword was added in SVN r179925. Update the vim syntax rules. llvm-svn: 294808	2017-02-10 23:57:11 +00:00
Davide Italiano	690ed9dec7	[LTO] Share the optimization remarks setup between Thin/Full LTO. llvm-svn: 294807	2017-02-10 23:49:38 +00:00
Krzysztof Parzyszek	f9015e62fd	[Hexagon] Introduce Hexagon V62 llvm-svn: 294805	2017-02-10 23:46:45 +00:00
Davide Italiano	95a8707de8	[tests] Be explicit about the files we want to remove. Hopefully Windows will stop whining after this change. llvm-svn: 294801	2017-02-10 22:55:37 +00:00
Peter Collingbourne	be9ffaacfa	IR: Function summary extensions for whole-program devirtualization pass. The summary information includes all uses of llvm.type.test and llvm.type.checked.load intrinsics that can be used to devirtualize calls, including any constant arguments for virtual constant propagation. Differential Revision: https://reviews.llvm.org/D29734 llvm-svn: 294795	2017-02-10 22:29:38 +00:00
Benjamin Kramer	03ab8a366e	[InstCombine] Move class into anonymous namespace. NFC. This is necessary to avoid warnings from GCC. InstCombineLoadStoreAlloca.cpp:238:7: error: 'PointerReplacer' declared with greater visibility than the type of its field 'PointerReplacer::IC' llvm-svn: 294794	2017-02-10 22:26:35 +00:00
Davide Italiano	46d72b1b7f	[lib/LTO] Rework optimization remarkers setup. This makes this code much more similar to what ThinLTO is using (also API wise), so now we can probably use a single code path instead of copying stuff around. llvm-svn: 294792	2017-02-10 22:16:17 +00:00
Benjamin Kramer	aa5adfa360	[PPC] Silence warning in Release builds. llvm-svn: 294791	2017-02-10 22:13:34 +00:00
Davide Italiano	62092aeb42	[LTO] Make these tests robust across multiple iterations. Same as r294784, but for regular LTO. llvm-svn: 294789	2017-02-10 22:11:06 +00:00

145226 Commits