llvm-project

Commit Graph

Author	SHA1	Message	Date
Colin LeMahieu	0143146514	[MCParser] Accept uppercase radix variants 0X and 0B Differential Revision: http://reviews.llvm.org/D14781 llvm-svn: 263802	2016-03-18 18:22:07 +00:00
Mike Aizatsky	4f7994c8cb	[sancov] specifying sanitizer coverage dependencies. Summary: These dependencies would be used in the future to reduce the number of instrumented blocks(http://reviews.llvm.org/rL262103) This is submitted as a separate CL because of previous problems with ARM. Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D18227 llvm-svn: 263797	2016-03-18 17:33:21 +00:00
Nicolai Haehnle	95e8ffd398	AMDGPU: Overload return type of llvm.amdgcn.buffer.load.format Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792	2016-03-18 16:24:40 +00:00
Nicolai Haehnle	ad63638f6d	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791	2016-03-18 16:24:31 +00:00
Nicolai Haehnle	3003ba00a3	AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.format Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790	2016-03-18 16:24:20 +00:00
Sam Kolton	a74cd526e9	[AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3 Review: http://reviews.llvm.org/D18267 llvm-svn: 263789	2016-03-18 15:35:51 +00:00
Benjamin Kramer	d96b0c14fb	[Fuzzer] Guard no_sanitize_memory attributes behind __has_feature. Otherwise GCC fails to build it because it doesn't know the attribute. llvm-svn: 263787	2016-03-18 14:19:19 +00:00
Ehsan Amiri	631ed04af0	adding another optimization opportunity to readme file llvm-svn: 263775	2016-03-18 04:02:25 +00:00
Kostya Serebryany	c43b584c1c	[libFuzzer] read corpus dirs recursively llvm-svn: 263773	2016-03-18 01:36:00 +00:00
Adam Nemet	709e3046ee	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772	2016-03-18 00:27:43 +00:00
Adam Nemet	6d8beeca53	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher work up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771	2016-03-18 00:27:38 +00:00
Adam Nemet	53e758fc55	[Aarch64] Add pass LoopDataPrefetch for Cyclone Summary: This wires up the pass for Cyclone but keeps it off for now because we need a few more TTIs. The getPrefetchMinStride value is not very well tuned right now but it works well with CFP2006/433.milc which motivated this. Tests will be added as part of the upcoming large-stride prefetching patch. Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, hfinkel, rengolin Differential Revision: http://reviews.llvm.org/D17943 llvm-svn: 263770	2016-03-18 00:27:29 +00:00
Kostya Serebryany	945761b8c2	[libFuzzer] improve -merge functionality llvm-svn: 263769	2016-03-18 00:23:29 +00:00
Peter Collingbourne	a1f8625662	DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions. A virtual index of -1u indicates that the subprogram's virtual index is unrepresentable (for example, when using the relative vtable ABI), so do not emit a DW_AT_vtable_elem_location attribute for it. Differential Revision: http://reviews.llvm.org/D18236 llvm-svn: 263765	2016-03-17 23:58:03 +00:00
Tim Shen	5cdf75084a	[PPC, FastISel] Fix ordered/unordered fcmp For fcmp, major concern about the following 6 cases is NaN result. The comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered), only one of which will be set. The result is generated by fcmpu instruction. However, bc instruction only inspects one of the first 3 bits, so when un is set, bc instruction may jump to an undesired place. More specifically, if we expect an unordered comparison and un is set, we expect to always go to true branch; in such case UEQ, UGT and ULT still give false, which are undesired; but UNE, UGE, ULE happen to give true, since they are tested by inspecting !eq, !lt, !gt, respectively. Similarly, for ordered comparison, when un is set, we always expect the result to be false. In such case OGT, OLT and OEQ are good, since they are actually testing GT, LT, and EQ respectively, which are false. OGE, OLE and ONE are tested through !lt, !gt and !eq, and these are true. llvm-svn: 263753	2016-03-17 22:27:58 +00:00
Adam Nemet	b0c4eae073	[LoopVectorize] Annotate versioned loop with noalias metadata Summary: Use the new LoopVersioning facility (D16712) to add noalias metadata in the vector loop if we versioned with memchecks. This can enable some optimization opportunities further down the pipeline (see the included test or the benchmark improvement quoted in D16712). The test also covers the bug I had in the initial version in D16712. The vectorizer did not previously use LoopVersioning. The reason is that the vectorizer performs its transformations in single shot. It creates an empty single-block vector loop that it then populates with the widened, if-converted instructions. Thus creating an intermediate versioned scalar loop seems wasteful. So this patch (rather than bringing in LoopVersioning fully) adds a special interface to LoopVersioning to allow the vectorizer to add no-alias annotation while still performing its own versioning. As the vectorizer propagates metadata from the instructions in the original loop to the vector instructions we also check the pointer in the original instruction and see if LoopVersioning can add no-alias metadata based on the issued memchecks. Reviewers: hfinkel, nadav, mzolotukhin Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17191 llvm-svn: 263744	2016-03-17 20:32:37 +00:00
Adam Nemet	5eccf07df3	[LoopVersioning] Annotate versioned loop with noalias metadata Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743	2016-03-17 20:32:32 +00:00
Justin Bogner	ae341c6e9b	Bitcode: Error out instead of crashing on corrupt metadata I hit a crash in the bitcode reader on some corrupt input where an MDString had somehow been attached to an instruction instead of an MDNode. This input is pretty bogus, but we shouldn't be crashing on bad input here. This change adds error handling in all of the places where we currently have unchecked casts from Metadata to MDNode, which means we'll error out instead of crashing for that sort of input. Unfortunately, I don't have tests. Hitting this requires flipping bits in the input bitcode, and committing corrupt binary files to catch these cases is a bit too opaque and unmaintainable. llvm-svn: 263742	2016-03-17 20:12:06 +00:00
Tim Northover	498c56c240	ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering. llvm-svn: 263741	2016-03-17 20:10:28 +00:00
Kostya Serebryany	c5575aabd6	[libFuzzer] deprecate several flags llvm-svn: 263739	2016-03-17 19:59:39 +00:00
Kostya Serebryany	23dbc390af	[libFuzzer] add __attribute__((no_sanitize_memory)) to two functions that may be called from signal handler(s) or from msan. This will hopefully avoid msan false reports which I can't reproduce llvm-svn: 263737	2016-03-17 19:42:35 +00:00
Guozhi Wei	7b390ec4cd	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734	2016-03-17 18:47:20 +00:00
Sanjoy Das	c9058ca9e0	[Statepoints] Export a magic constant into a header; NFC llvm-svn: 263733	2016-03-17 18:42:17 +00:00
Petar Jovanovic	0b44f24033	[PowerPC] Disable CTR loops optimization for soft float operations This patch prevents CTR loops optimization when using soft float operations inside loop body. Soft float operations use function calls, but function calls are not allowed inside CTR optimized loops. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D17600 llvm-svn: 263727	2016-03-17 17:11:33 +00:00
Derek Schuff	d4207ba0f6	[WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback Summary: MRI::eliminateFrameIndex can emit several instructions to do address calculations; these can usually be stackified. Because instructions with FI operands can have subsequent operands which may be expression trees, find the top of the leftmost tree and insert the code before it, to keep the LIFO property. Also use stackified registers when writing back the SP value to memory in the epilog; it's unnecessary because SP will not be used after the epilog, and it results in better code. Differential Revision: http://reviews.llvm.org/D18234 llvm-svn: 263725	2016-03-17 17:00:29 +00:00
David Majnemer	511391feaa	[COFF] Refactor section alignment calculation Section alignment isn't completely trivial, let it live in one place so that we may reuse it in LLVM. llvm-svn: 263722	2016-03-17 16:55:18 +00:00
David Majnemer	62fed0c354	Forgot to commit this with r263692 llvm-svn: 263721	2016-03-17 16:55:11 +00:00
Changpeng Fang	234fcb81d3	AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute Summary: ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed. Reviewers: arsenm, tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18197 llvm-svn: 263720	2016-03-17 16:43:50 +00:00
Nicolai Haehnle	79cad857a0	AMDGPU: mark atomic instructions as sources of divergence Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 llvm-svn: 263719	2016-03-17 16:21:59 +00:00
Simon Pilgrim	0f37fbac51	[X86][SSE] Simplified blend-with-zero combining We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in an endless loop of contrasting combines. This patch stops the combine if we already have a blend in place (means we miss some domain corrections) llvm-svn: 263717	2016-03-17 15:59:36 +00:00
Sanjay Patel	9e23fedaf0	propagate 'unpredictable' metadata on select instructions This is similar to D18133 where we allowed profile weights on select instructions. This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects. A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at: http://reviews.llvm.org/rL263648 Differential Revision: http://reviews.llvm.org/D18220 llvm-svn: 263716	2016-03-17 15:30:52 +00:00
Saleem Abdulrasool	071a099102	ARM: Revert SVN r253865, 254158, fix windows division The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714	2016-03-17 14:10:49 +00:00
Simon Atanasyan	58ee875296	[mips] Use `formatImm` call to print immediate value in the `MipsInstPrinter` That allows, for example, to print hex-formatted immediates using llvm-objdump --print-imm-hex command line option. Differential Revision: http://reviews.llvm.org/D18195 llvm-svn: 263704	2016-03-17 10:43:36 +00:00
Scott Egerton	d65377da78	[mips] Eliminate instances of "potentially uninitialised local variable" warnings, NFC Summary: This should eliminate all occurrences of this within LLVMMipsAsmParser. This patch is in response to http://reviews.llvm.org/D17983. I was unable to reproduce the warnings on my machine so please advise if this fixes the warnings. Reviewers: ariccio, vkalintiris, dsanders Subscribers: dblaikie, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18087 llvm-svn: 263703	2016-03-17 10:37:51 +00:00
Sanjoy Das	312038872d	[Statepoints] Separate out logic for statepoint directives; NFC This splits out the logic that maps the `"statepoint-id"` attribute into the actual statepoint ID, and the `"statepoint-num-patch-bytes"` attribute into the number of patchable bytes the statepoint is lowered into. The new home of this logic is in IR/Statepoint.cpp, and this refactoring will support similar functionality when lowering calls with deopt operand bundles in the future. llvm-svn: 263685	2016-03-17 01:56:10 +00:00
Sanjoy Das	d6fc46ea03	[Statepoints] Minor NFC cleanups Mostly code simplifications, and bringing IR/Statepoints.cpp up to LLVM coding style. llvm-svn: 263683	2016-03-17 00:47:18 +00:00
Sanjoy Das	3a02019fbc	[SelectionDAG] Remove visitStatepoint; NFC This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682	2016-03-17 00:47:14 +00:00
Chris Bieneman	671d0dda7d	Upgrade TBAA before upgrading intrinsics Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects. Reviewers: reames, apilipenko, manmanren Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18229 llvm-svn: 263673	2016-03-16 23:17:54 +00:00
Sanjoy Das	43e33d61c6	Fix indentation; NFC llvm-svn: 263672	2016-03-16 23:11:21 +00:00
Sanjoy Das	70697ff74d	Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today). This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation. FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints, right now it is fairly tightly coupled with an IR level `gc.statepoint`. Reviewers: reames, pgavlin, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18106 llvm-svn: 263671	2016-03-16 23:08:00 +00:00
Xinliang David Li	897d2923a2	Variable name cleanup /NFC llvm-svn: 263666	2016-03-16 22:13:41 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjoy Das	19c6159833	[SelectionDAG] Extract out populateCallLoweringInfo; NFC SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663	2016-03-16 20:49:31 +00:00
Vedant Kumar	aa0cae6208	[ProfileData] Make a utility method public, NFC The swift frontend needs to be able to look up PGO function name variables based on the original raw function name. That's because it's not possible to create PGO function name variables while emitting swift IR. Instead, we have to create the name variables while lowering swift IR to llvm IR, at which point we fix up all calls to the increment intrinsic to point to the right name variable. llvm-svn: 263662	2016-03-16 20:49:26 +00:00
Nicolai Haehnle	ef160de3e5	AMDGPU: Prevent uniform loops from becoming infinite Summary: Uniform loops where the branch leaving the loop is predicated on VCCNZ must be skipped if EXEC = 0, otherwise they will be infinite. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18137 llvm-svn: 263658	2016-03-16 20:14:33 +00:00
Colin LeMahieu	bb0cdfb9f7	[Hexagon] Adding missing break in switch statement. Extra operands would have been appended to the end. llvm-svn: 263657	2016-03-16 20:00:38 +00:00
Chad Rosier	fea398188c	[SLP] Make DataLayout a member variable. llvm-svn: 263656	2016-03-16 19:48:42 +00:00
Geoff Berry	56fabf9b55	Revert "[LSR] Create fewer redundant instructions." This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655	2016-03-16 19:21:47 +00:00
Simon Pilgrim	b5a20f0fec	Removed trailing whitespace llvm-svn: 263650	2016-03-16 18:37:44 +00:00
Sanjay Patel	be37e62e0c	fix function names; NFC llvm-svn: 263646	2016-03-16 18:00:09 +00:00

88146 Commits