By using the widest possible type for PACKSS truncation we have a better chance of being able to peek through bitcasts, and we improve other combines driven by ComputeNumSignBits.
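For intuition, here is a hedged sketch of the legality test such a combine relies on; canTruncateWithPACKSS is an illustrative helper name, not the exact code from this patch:

  #include "llvm/CodeGen/SelectionDAG.h"

  using namespace llvm;

  // PACKSS performs a signed-saturating pack, so it only acts as a plain
  // per-element truncate from SrcBits to DstBits when saturation can never
  // trigger, i.e. when the top (SrcBits - DstBits + 1) bits are all sign bits.
  static bool canTruncateWithPACKSS(SelectionDAG &DAG, SDValue In,
                                    unsigned SrcBits, unsigned DstBits) {
    return DAG.ComputeNumSignBits(In) > SrcBits - DstBits;
  }

Packing at the widest legal element type makes this sign-bit count more likely to survive intervening bitcasts of the same underlying value.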
llvm-svn: 316448
These tests checked for the line number without a leading ":", so, for example,
a missed diagnostic on line 123 could match one on line 1123, 2123, etc.,
desynchronising the test for hundreds of lines.
This couldn't cause a test to incorrectly pass or fail, but it made failures
hard to track down.
Differential revision: https://reviews.llvm.org/D39238
llvm-svn: 316442
Report a diagnostic when we fail to parse a shift in a memory operand because
the shift type is not an identifier. Without this, we were silently ignoring
the whole instruction.
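A minimal sketch of the shape of such a check in a target AsmParser; the surrounding helper is illustrative, not the exact code from this patch:

  #include "llvm/MC/MCParser/MCAsmLexer.h"
  #include "llvm/MC/MCParser/MCAsmParser.h"

  // Returns true on error, following the MC parser convention.
  static bool parseShiftType(llvm::MCAsmParser &Parser) {
    const llvm::AsmToken &Tok = Parser.getTok();
    if (Tok.isNot(llvm::AsmToken::Identifier))
      return Parser.Error(Tok.getLoc(),
                          "expected shift type in memory operand");
    // ... match the identifier (lsl, lsr, asr, ...) as before ...
    return false;
  }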
Differential revision: https://reviews.llvm.org/D39237
llvm-svn: 316441
Summary:
r264440 added or/and patterns for storing -1 or 0 (e.g. "and mem, 0" and "or mem, -1") with the intention of decreasing code size. However,
X86CallFrameOptimization does not recognize these memory accesses, so it will not replace them with pushes when profitable.
This patch fixes the problem by teaching X86CallFrameOptimization these store 0/-1 idioms.
An alternative fix would be to prevent the store 0/-1 idiom patterns from firing when accessing the stack. This would save
the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire, we may end up
with cases where neither X86CallFrameOptimization nor the store 0/-1 idiom patterns fire.
Fixes PR34863.
Reviewers: DavidKreitzer, guyblank, aymanmus
Reviewed By: aymanmus
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38738
llvm-svn: 316431
Summary:
Got asserts in llvm::CastInst::getCastOpcode saying:
`DestBits == SrcBits && "Illegal cast to vector (wrong type or size)"' failed.
The problem seemed to be that llvm::ConstantFoldCastInstruction did
not correctly handle a ptrtoint cast of a getelementptr returning a
vector. I assume such situations are quite rare, since the
GEP needs to be a constant value (base pointer
being null).
The solution used here is to simply avoid the constant fold
of ptrtoint when the value is a vector. That case is not supported,
and by bailing out we avoid failing assertions later on.
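A hedged sketch of the bail-out; the helper below is illustrative and only mirrors the shape of the change in the constant folder:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Type.h"

  using namespace llvm;

  static Constant *tryFoldPtrToInt(Constant *V, Type *DestTy) {
    // A constant ptrtoint of a getelementptr that yields a vector of
    // pointers is not supported; bail out and keep the original cast
    // rather than hit the getCastOpcode assertion downstream.
    if (V->getType()->isVectorTy())
      return nullptr;
    // ... the scalar folding path continues as before ...
    return nullptr; // folding elided in this sketch
  }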
Reviewers: craig.topper, majnemer, davide, filcab, efriedma
Reviewed By: efriedma
Subscribers: efriedma, filcab, llvm-commits
Differential Revision: https://reviews.llvm.org/D38546
llvm-svn: 316430
This change fixes the values in the test so that it passes
-verify without errors, and it also adds comments.
The test was introduced in D39119 with the intention of checking
that the tool is able to dump a few
DW_*GNU_call_site* tags and attributes, so this
change is an NFC cleanup.
llvm-svn: 316428
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.
Also allow kill in all shader stages.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D38544
llvm-svn: 316427
This alias caused a crash when trying to print the "cps #0" instruction in a
diagnostic for thumbv6 (which doesn't have that instruction).
The comment was incorrect: this instruction is UNPREDICTABLE if no flag bits
are set, so I don't think it's worth keeping.
Differential Revision: https://reviews.llvm.org/D39191
llvm-svn: 316420
SelectionDAG inserts a copy of ESP into a virtual register.
X86CallFrameOptimization assumed that the COPY, if present, is always
right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a
wrong assumption, as the COPY can be located anywhere between the call-frame
setup instruction and its first use. If the COPY happened to be located
somewhere other than where X86CallFrameOptimization assumed, visiting it while
processing the call chain would lead to a conservative bail-out.
The fix is quite straightforward: scan ahead for the stack-pointer copy and
make note of it so it can be ignored while processing the call chain.
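A hedged sketch of the scan-ahead; findSPCopy is an illustrative helper name, not the exact code from this patch:

  #include <iterator>

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"

  // Walk forward from the call-frame setup instruction and remember the
  // stack-pointer COPY, wherever it is, so the main processing loop can
  // skip it instead of conservatively bailing out.
  static llvm::MachineInstr *
  findSPCopy(llvm::MachineBasicBlock &MBB,
             llvm::MachineBasicBlock::iterator FrameSetup, unsigned StackPtr) {
    for (auto I = std::next(FrameSetup); I != MBB.end(); ++I) {
      if (I->isCall())
        break; // reached the call without seeing the copy
      if (I->isCopy() && I->getOperand(1).getReg() == StackPtr)
        return &*I;
    }
    return nullptr;
  }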
Fixes PR34903.
Differential Revision: https://reviews.llvm.org/D38730
llvm-svn: 316416
Infrastructure designed for padding code with NOP instructions in key places,
such that a performance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the
assembler, after the layout is done and all IPs and alignments are known.
This patch by itself is an NFC. Future patches will make use of this
infrastructure to implement the required policies for code padding.
Reviewers: aaboud, zvi, craig.topper, gadi.haber
Differential revision: https://reviews.llvm.org/D34393
Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
Summary:
The motivation of this change is to enable .mir testing for this pass.
Added one test case to cover the functionality; this same case will be
improved by a future patch.
Reviewers: igorb, guyblank, DavidKreitzer
Reviewed By: guyblank, DavidKreitzer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38729
llvm-svn: 316412
The `BasicBlock::getFirstInsertionPt` call may return the end iterator for the
BB. Dereferencing the end iterator results in an assertion failure
"(!NodePtr->isKnownSentinel()), function operator*". Ensure that the
returned iterator is valid before dereferencing it. If the end iterator is
returned, move one position backward to get a valid insertion point.
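A minimal sketch of the guard, assuming the block is non-empty whenever the end iterator comes back:

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"

  static llvm::Instruction *getValidInsertionPt(llvm::BasicBlock &BB) {
    llvm::BasicBlock::iterator IP = BB.getFirstInsertionPt();
    if (IP == BB.end())
      --IP; // step back rather than dereference the end iterator
    return &*IP;
  }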
llvm-svn: 316401
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.
Fixes PR34222
llvm-svn: 316398
This commit adds optimisation remarks for outlining which fire when a function
is successfully outlined.
To do this, OutlinedFunctions must now contain references to their Candidates.
Since the Candidates must still be sorted and worked on separately, this is
done by working on everything in terms of shared_ptrs to Candidates. This is
good; it means that we can easily move everything to outlining in terms of
the OutlinedFunctions rather than the individual Candidates. This is far more
intuitive than what's currently there!
(Remarks are output when a function is created for some group of Candidates.
In a later commit, all of the outlining logic should be rewritten so that we
loop over OutlinedFunctions rather than over Candidates.)
llvm-svn: 316396
This is in preparation for a verifier check that makes sure
copies are of the same size (when generic virtual registers are involved).
llvm-svn: 316388
This is in preparation for a verifier check that makes sure copies are
of the same size (when generic virtual registers are involved).
llvm-svn: 316387
This patch adds the pgo-memop-opt pass to the new pass manager.
It is in the old pass manager but was somehow left out of the new pass manager.
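With that in place, a hedged usage sketch (assuming the PGOMemOPSizeOpt class name; the pipeline setup here is illustrative):

  #include "llvm/IR/PassManager.h"
  #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"

  using namespace llvm;

  void addMemOPOpt(FunctionPassManager &FPM) {
    // pgo-memop-opt: optimize memory intrinsic calls based on
    // profiled size information.
    FPM.addPass(PGOMemOPSizeOpt());
  }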
Differential Revision: http://reviews.llvm.org/D39145
llvm-svn: 316384
This fixes a bug where we'd crash given code like the test-case from
https://bugs.llvm.org/show_bug.cgi?id=30792 . Instead, we let the
offending clobber silently slide through.
This doesn't fully fix said bug, since the assembler will still complain
the moment it sees a crypto/fp/vector op, and we still don't diagnose
calls that require vector regs.
Differential Revision: https://reviews.llvm.org/D39030
llvm-svn: 316374
In HexagonISelLowering, there is code to handle the case when
a function returns an i1 type. In this case, we need to generate
extra nodes to copy the result from R0 to a predicate register.
The code was returning the wrong value for the chain edge which
caused an assert "Wrong topological sorting" when converting the
instructions to MIs.
This patch fixes the problem by returning the chain for the final
copy.
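A hedged sketch of the pattern (not the actual Hexagon code); the point is that the chain handed back comes from the final copy:

  #include "llvm/CodeGen/SelectionDAG.h"

  using namespace llvm;

  // Copy the returned value out of the return register and thread the
  // copy's own chain onward; returning the stale incoming chain is what
  // broke the topological ordering.
  static SDValue copyOutResult(SelectionDAG &DAG, const SDLoc &DL,
                               SDValue Chain, unsigned RetReg, SDValue Glue,
                               SDValue &Result) {
    SDValue Copy = DAG.getCopyFromReg(Chain, DL, RetReg, MVT::i32, Glue);
    Result = Copy;
    return Copy.getValue(1); // the chain of the final copy
  }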
Patch by Brendon Cahoon.
llvm-svn: 316367
If we have the situation where a Swap feeds a Splat, we can sometimes change
the index on the Splat and then remove the Swap instruction. For example,
splatting doubleword 0 of a doubleword-swapped vector is the same as splatting
doubleword 1 of the original vector.
Differential Revision: https://reviews.llvm.org/D39009
llvm-svn: 316366
This patch enables the import of stores. Unfortunately, doing so by itself
loses an optimization where storing 0 to memory makes use of WZR/XZR.
To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to apply only to stores.
Applying it to the GPR32/GPR64 register classes in general will be done after
review (see https://reviews.llvm.org/D39150).
llvm-svn: 316360
The range should be assumed to be the hardware maximum
if a workitem intrinsic is used in a callable function
which does not know the restricted limit of the calling
kernel.
llvm-svn: 316346
combineShuffleOfScalars is very conservative about shuffled BUILD_VECTORs that can be combined together.
This patch adds one additional case: if both BUILD_VECTORs represent splats of the same scalar value but with different UNDEF elements, then we should create a single splat BUILD_VECTOR, keeping UNDEF only in the elements left undefined by the shuffle mask.
Differential Revision: https://reviews.llvm.org/D38696
llvm-svn: 316331
Before, loop unrolling was only enabled for loops with a single
block. This restriction has been removed and replaced by:
- allow a maximum of two exiting blocks,
- a four basic block limit for cores with a branch predictor.
Differential Revision: https://reviews.llvm.org/D38952
llvm-svn: 316313
Summary:
It's unclear if this is the only thing we can do, but at least this is
consistent with the check of address-space agreement in `isBitCastable`.
The code is used at least in both instcombine and jumpthreading, though
I could only find a way to trigger the invalid cast in instcombine.
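A hedged sketch mirroring the address-space test in `isBitCastable`; the helper name is illustrative:

  #include "llvm/IR/Type.h"

  static bool addrSpacesAgree(llvm::Type *SrcTy, llvm::Type *DstTy) {
    if (SrcTy->isPtrOrPtrVectorTy() && DstTy->isPtrOrPtrVectorTy())
      return SrcTy->getPointerAddressSpace() ==
             DstTy->getPointerAddressSpace();
    return true; // nothing to compare for non-pointer types
  }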
Reviewers: loladiro, sanjoy, majnemer
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34335
llvm-svn: 316302
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early.
In fact, it's questionable whether this transform belongs in SimplifyCFG
at all. I'll look at moving this to codegen as a follow-up step.
llvm-svn: 316298
This fixes Bugzilla 26810 (https://bugs.llvm.org/show_bug.cgi?id=26810).
This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4-byte Reload
Such sequences are created in two scenarios:

Scenario #1:
1. vreg0 is evicted from physreg0 by vreg1.
2. Evictee vreg0 is intended for region splitting with split candidate physreg0 (the register vreg0 was evicted from).
3. Region splitting creates a local interval because of interference with the evictor vreg1. (Normally region splitting creates two intervals, the "by reg" and "by stack" intervals; a local interval is created when interference occurs.)
4. One of the split intervals ends up evicting vreg2 from physreg1.
5. Evictee vreg2 is intended for region splitting with split candidate physreg1.
6. One of the split intervals ends up evicting vreg3 from physreg2, and so on, until someone spills.

Scenario #2:
1. vreg0 is evicted from physreg0 by vreg1.
2. vreg2 is evicted from physreg2 by vreg3, and so on.
3. Evictee vreg0 is intended for region splitting with split candidate physreg1.
4. Region splitting creates a local interval because of interference with the evictor vreg1.
5. One of the split intervals ends up evicting the original evictor vreg1 from physreg0 (the register vreg0 was evicted from).
6. Another evictee vreg2 is intended for region splitting with split candidate physreg1.
7. One of the split intervals ends up evicting vreg3 from physreg2, and so on, until someone spills.

As compile time was a concern, I've added a flag to control whether we do cost calculations for local intervals we expect to be created (it's on by default for the X86 target, off for the rest).
Differential Revision: https://reviews.llvm.org/D35816
Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295