llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	4b5fc093d0	AMDGPU: Remove custom getSubReg This was kind of confusing, the subregister class shouldn't really be necessary. llvm-svn: 278362	2016-08-11 17:15:32 +00:00
Matt Arsenault	69fd2c1179	AMDGPU: Remove unused tracking of flat instructions llvm-svn: 278361	2016-08-11 17:15:28 +00:00
Duncan P. N. Exon Smith	ec30cc2171	Hexagon: Avoid dereferencing end() in HexagonCopyToCombine::findPairable Check for end() before skipping through debug values. This avoids dereferencing end() when the instruction is the final one in the basic block. (It still assumes that a debug value will not be the final instruction in the basic block. No tests seemed to violate that.) Many Hexagon tests trigger this, but they happen to magically pass right now. I found this because WIP patches for PR26753 convert them into crashes. llvm-svn: 278355	2016-08-11 16:40:03 +00:00
Wei Ding	34e1753585	AMDGPU : Add LLVM intrinsics for SAD related instructions. Differential Revision: http://reviews.llvm.org/D23133 llvm-svn: 278354	2016-08-11 16:33:53 +00:00
Tim Northover	0d51044b69	GlobalISel: clear vreg mapping after translating each function Otherwise we only materialize (shared) constants in the first function they appear in. This doesn't go well. llvm-svn: 278351	2016-08-11 16:21:29 +00:00
Reid Kleckner	26f9e9ebc3	Remove FIXME about asserting on the end iterator After machine block placement, MBBs may not have terminators, and it is appropriate to check for the end iterator here. We can fold the check into the next if, as well. This look is really just looking for BBs that end in CATCHRET. llvm-svn: 278350	2016-08-11 16:00:43 +00:00
Lang Hames	30526070ab	[MCJIT] Improve documentation and error handling for MCJIT::runFunction. ExecutionEngine::runFunction is supposed to allow execution of arbitrary function types, but MCJIT can only reasonably support a limited subset of main-linke function types. This patch documents this limitation, and fixes MCJIT::runFunction to abort with a meaningful error at runtime if called with an unsupported function type. llvm-svn: 278348	2016-08-11 15:56:23 +00:00
Duncan P. N. Exon Smith	62e351f5a4	X86: Use operator lookup for operator==, NFC Avoid relying on the MachineInstrBundleIterator operator== being implemented as a member function. llvm-svn: 278347	2016-08-11 15:51:29 +00:00
Duncan P. N. Exon Smith	43724649c3	IR: Don't cast the end iterator to Instruction* End iterators are usually sentinels, not actually Instruction* at all. Stop casting to it just to get an iterator back. There is likely no observable functionality change here right now (although this is relying on UB, I doubt it was triggering anything), but I'll be removing the cast soon. llvm-svn: 278346	2016-08-11 15:45:04 +00:00
Duncan P. N. Exon Smith	2e7af979b9	CodeGen: Check for a terminator in llvm::getFuncletMembership Check for an end iterator from MachineBasicBlock::getFirstTerminator in llvm::getFuncletMembership. If this is turned into an assertion, it fires in 48 X86 testcases (for example, CodeGen/X86/regalloc-spill-at-ehpad.ll). Since this is likely a latent bug (shouldn't all basic blocks end with a terminator?) I've filed PR28938. llvm-svn: 278344	2016-08-11 15:29:02 +00:00
Matthew Simpson	3f69195b9e	[SLP] Make RecursionMaxDepth a command line option (NFC) llvm-svn: 278343	2016-08-11 15:28:45 +00:00
Sanjay Patel	38ae83de38	fix comment; NFC llvm-svn: 278342	2016-08-11 15:23:56 +00:00
Sanjay Patel	e3c335cbed	use auto* with dyn_cast ; NFC llvm-svn: 278340	2016-08-11 15:21:21 +00:00
Sanjay Patel	5a470950b9	getParent()->getParent() == getFunction() ; NFC llvm-svn: 278339	2016-08-11 15:16:06 +00:00
Teresa Johnson	9ba95f99f3	Restore "Resolution-based LTO API." This restores commit r278330, with fixes for a few bot failures: - Fix a late change I had made to the save temps output file that I missed due to existing files sitting on my disk - Fix a bunch of Windows bot failures with "ambiguous call to overloaded function" due to confusion between llvm::make_unique vs std::make_unique (preface the new make_unique calls with "llvm::") - Attempt to fix a modules bot failure by adding a missing include to LTO/Config.h. Original change: Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 llvm-svn: 278338	2016-08-11 14:58:12 +00:00
Ehsan Amiri	3818f1b38a	revert 278334 llvm-svn: 278337	2016-08-11 14:51:14 +00:00
Valery Pykhtin	82c73bee2b	Revert "[AMDGPU] fix failure on printing of non-existing instruction operands." This reverts revision 278333, newly added test failed. llvm-svn: 278336	2016-08-11 14:22:05 +00:00
Ehsan Amiri	b9fcc2b171	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence. https://reviews.llvm.org/D23075 llvm-svn: 278334	2016-08-11 13:51:20 +00:00
Valery Pykhtin	3048ff6ec3	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278333	2016-08-11 13:49:46 +00:00
Teresa Johnson	cbf684e6c6	Revert "Resolution-based LTO API." This reverts commit r278330. I made a change to the save temps output that is causing issues with the bots. Didn't realize this because I had older output files sitting on disk in my test output directory. llvm-svn: 278331	2016-08-11 13:03:56 +00:00
Teresa Johnson	f99573b3ee	Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 Address review comments llvm-svn: 278330	2016-08-11 12:56:40 +00:00
Simon Pilgrim	5c91764af5	Fixed VS2015 (Update 3) warning - differing const/volatile qualifiers for overridden function Dropped the const qualifier to match llvm::CallLowering::lowerCall llvm-svn: 278329	2016-08-11 12:19:43 +00:00
Igor Breger	a77b14d02c	[AVX512] Fix extractelement i1 lowering. The previous implementation (not custom) doesn't enforce zeroing off upper bits. The assumption is that i1 PRODUCER (truncate and extractelement) must zero all upper bits, so i1 CONSUMER instructions ( test, zext, save, etc) can be done without additional zeroing. Make extractelement i1 lowering custom for all vector i1. Differential Revision: http://reviews.llvm.org/D23246 llvm-svn: 278328	2016-08-11 12:13:46 +00:00
Marina Yatsina	88f0c31f13	Avoid false dependencies of undef machine operands This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref. Pseudo example: loop: xmm0 = ... xmm1 = vcvtsi2sdl eax, xmm0<undef> ... = inst xmm0 jmp loop In this example, selecting xmm0 as the undef register creates false dependency between loop iterations. This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction. Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem. Differential Revision: https://reviews.llvm.org/D22466 llvm-svn: 278321	2016-08-11 07:32:08 +00:00
Craig Topper	a78b768ed4	[AVX-512] Promote 512-bit integer loads to v8i64 similar to what is done for 128/256-bit vectors for overall consistency. llvm-svn: 278318	2016-08-11 06:04:07 +00:00
Craig Topper	14aa2665d3	[AVX-512] Add patterns to allow EVEX encoded stores of v16i16/v8i16/v16i8/v32i8 even when BWI is not supported. llvm-svn: 278317	2016-08-11 06:04:04 +00:00
Craig Topper	3563d0f622	[AVX-512] Fix the 128-bit and 256-bit nontemporal load patterns with elements type other than i64. These loads have all been promoted to v2i64/v4i64 loads so we need bitcasts or we end up selecting VMOVDQA32/VMOVDQU32 instead. llvm-svn: 278316	2016-08-11 06:04:00 +00:00
Xinliang David Li	76a0108be4	[Profile] improve warning control option Change --no-pgo-warn-missing to -pgo-warn-missing-function and negate the default. /NFC Add more test to make sure the warning is off by default llvm-svn: 278314	2016-08-11 05:09:30 +00:00
Dominic Chen	4173fffa08	[WebAssembly] Cleanup trailing whitespace Summary: Test for commit access. Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D23392 llvm-svn: 278313	2016-08-11 04:10:56 +00:00
Easwaran Raman	0d58fcac99	Make more fields of InlineParams Optional. Differential revision: https://reviews.llvm.org/D23386 llvm-svn: 278312	2016-08-11 03:58:05 +00:00
Sanjoy Das	25fb5bda0f	[Statepoints] Minor cosmetic change; NFC The verification failure message was missing a space. llvm-svn: 278309	2016-08-11 00:56:46 +00:00
Chris Bieneman	ca5de9d9e3	[MachOYAML] Don't output empty ExportTrie The YAML representation was always outputting the root node of an export trie even if the trie was empty. While this doesn't really have any functional impact, it does add visual clutter to the yaml file. llvm-svn: 278307	2016-08-11 00:20:03 +00:00
Tim Northover	357f1be2ca	GlobalISel: support same ConstantExprs as Instructions. It's more than just inttoptr, but the others can't be tested until we have support for non-trivial constants (they currently get unavoidably folded to a ConstantInt). llvm-svn: 278303	2016-08-10 23:02:41 +00:00
Tim Northover	406024a108	GlobalISel: implement simple function calls on AArch64. We're still limited in the arguments we support, but this at least handles the basic cases. llvm-svn: 278293	2016-08-10 21:44:01 +00:00
Changpeng Fang	fb9c3818dd	AMDGPU/SI: Implement amdgcn image intrinsics with sampler Summary: This patch define and implement amdgcn image intrinsics with sampler. 1. define vdata type to be llvm_anyfloat_ty, address type to be llvm_anyfloat_ty, and rsrc type to be llvm_anyint_ty. As a result, we expect the intrinsics name to have three suffixes to overload each of these three types; 2. D128 as well as two other flags are implied in the three types, for example, if you use v8i32 as resource type, then r128 is 0! 3. don't expose TFE flag, and other flags are exposed in the instruction order: unrm, glc, slc, lwe and da. Differential Revision: http://reviews.llvm.org/D22838 Reviewed by: arsenm and tstellarAMD llvm-svn: 278291	2016-08-10 21:15:30 +00:00
Piotr Padlewski	d89875ca39	Changed sign of LastCallToStaticBouns Summary: I think it is much better this way. When I firstly saw line: Cost += InlineConstants::LastCallToStaticBonus; I though that this is a bug, because everywhere where the cost is being reduced it is usuing -=. Reviewers: eraman, tejohnson, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23222 llvm-svn: 278290	2016-08-10 21:15:22 +00:00
Kyle Butt	81d32846b0	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278288	2016-08-10 21:03:27 +00:00
Kyle Butt	e1c931b171	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 278287	2016-08-10 20:45:56 +00:00
Jonathan Roelofs	851b79dc4d	Fix UB in APInt::ashr i64 -1, whose sign bit is the 0th one, can't be left shifted without invoking UB. https://reviews.llvm.org/D23362 llvm-svn: 278280	2016-08-10 19:50:14 +00:00
Matt Arsenault	61f8ba8b79	AMDGPU: s_setpc_b64 should be an indirect branch llvm-svn: 278278	2016-08-10 19:20:02 +00:00
Matt Arsenault	c6b1350039	AMDGPU: Set sizes on control flow pseudos llvm-svn: 278276	2016-08-10 19:11:51 +00:00
Matt Arsenault	f4af802381	AMDGPU: Remove empty file comment llvm-svn: 278275	2016-08-10 19:11:48 +00:00
Matt Arsenault	11587d97be	AMDGPU: Remove unnecessary cast llvm-svn: 278274	2016-08-10 19:11:45 +00:00
Matt Arsenault	57431c9680	AMDGPU: Change insertion point of si_mask_branch Insert before the skip branch if one is created. This is a somewhat more natural placement relative to the skip branches, and makes it possible to implement analyzeBranch for skip blocks. The test changes are mostly due to a quirk where the block label is not emitted if there is a terminator that is not also a branch. llvm-svn: 278273	2016-08-10 19:11:42 +00:00
Matt Arsenault	b920e9987d	AMDGPU: Use CreateStackObject instead of CreateSpillStackObject I'm not sure what the difference is, but no other target uses this for emergency spill slots. llvm-svn: 278272	2016-08-10 19:11:36 +00:00
Sanjay Patel	5ccc85fe83	[x86, AVX] allow FP vector select folding to bitwise logic ops (PR28895) This handles the case in: https://llvm.org/bugs/show_bug.cgi?id=28895 ...but we are not getting all of the possibilities yet. Eg, we use 'X86::FANDN' for scalar FP select combines. That enhancement is filed as: https://llvm.org/bugs/show_bug.cgi?id=28925 Differential Revision: https://reviews.llvm.org/D23337 llvm-svn: 278270	2016-08-10 19:00:11 +00:00
Andrew Kaylor	498d3113c3	[IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18867 llvm-svn: 278269	2016-08-10 18:56:35 +00:00
Nicolai Haehnle	02d784172c	LiveIntervalAnalysis: fix a crash in repairOldRegInRange Summary: See the new test case for one that was (non-deterministically) crashing on trunk and deterministically hit the assertion that I added in D23302. Basically, the machine function contains a sequence DS_WRITE_B32 %vreg4, %vreg14:sub0, ... DS_WRITE_B32 %vreg4, %vreg14:sub0, ... %vreg14:sub1<def> = COPY %vreg14:sub0 and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32 instructions into one before calling repairIntervalsInRange. Now repairIntervalsInRange wants to repair %vreg14, in particular, and ends up trying to repair %vreg14:sub1 as well, but that only becomes active _after_ the range that is to be repaired, hence the crash due to LR.find(...) == LR.begin() at the start of repairOldRegInRange. I believe that just skipping those subrange is fine, but again, not too familiar with that code. Reviewers: MatzeB, kparzysz, tstellarAMD Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23303 llvm-svn: 278268	2016-08-10 18:51:14 +00:00
Andrew Kaylor	b10f6876cd	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 278267	2016-08-10 18:47:19 +00:00
Krzysztof Parzyszek	c9c2bba621	[Hexagon] Remove unused variants of LO/HI instructions llvm-svn: 278266	2016-08-10 18:40:36 +00:00

1 2 3 4 5 ...

93702 Commits