llvm-project

Commit Graph

Author	SHA1	Message	Date
Max Kazantsev	b6d40067af	[SCEV] Enhance SCEVFindUnsafe for division This patch allows the SCEVFindUnsafe algorithm to treat division by any non-zero value as safe. Previously, it could only recognize non-zero constants. Differential Revision: https://reviews.llvm.org/D39228 llvm-svn: 316568	2017-10-25 11:07:43 +00:00
Clement Courbet	0c7cd071f7	Re-land "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)" Compute the actual decomposition only after deciding whether to expand or not. Else, it's easy to make the compiler OOM with: `memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if someone mistakenly passes a negative value. Add a test. This reverts commit f8fc02fbd4ab33383c010d33675acf9763d0bd44. llvm-svn: 316567	2017-10-25 11:02:09 +00:00
George Rimar	0be860f695	[llvm-dwarfdump] - Fix array out of bounds access crash. This fixes possible out of bound access in DWARFDie::getFirstChild() which might happen when .debug_info section is corrupted, like shown in testcase. Differential revision: https://reviews.llvm.org/D39185 llvm-svn: 316566	2017-10-25 10:23:49 +00:00
Sam Parker	ccb209bb97	[ARM] Swap cmp operands for automatic shifts Swap the compare operands if the lhs is a shift and the rhs isn't, as in arm and T2 the shift can be performed by the compare for its second operand. Differential Revision: https://reviews.llvm.org/D39004 llvm-svn: 316562	2017-10-25 08:33:06 +00:00
Martin Storsjo	373c8efa1e	[AArch64] Add support for dllimport of values and functions Previously, the dllimport attribute did the right thing in terms of treating it as a pointer to a value, but this makes sure the names get mangled properly, and calls to such functions load the function from the __imp_ pointer. This is based on SVN r212431 and r212430 where the same was implemented for ARM. Differential Revision: https://reviews.llvm.org/D38530 llvm-svn: 316555	2017-10-25 07:25:18 +00:00
Matt Arsenault	8a752b77a2	DAG: Fix creating select with wrong condition type This code added in r297930 assumed that it could create a select with a condition type that is just an integer bitcast of the selected type. For AMDGPU any vselect is going to be scalarized (although the vector types are legal), and all select conditions must be i1 (the same as getSetCCResultType). This logic doesn't really make sense to me, but there's never really been a consistent policy in what the select condition mask type is supposed to be. Try to extend the logic for skipping the transform for condition types that aren't setccs. It doesn't seem quite right to me though, but checking conditions that seem more sensible (like whether the vselect is going to be expanded) doesn't work since this seems to depend on that also. llvm-svn: 316554	2017-10-25 07:14:07 +00:00
Max Kazantsev	9ac7021a25	[IRCE] Fix intersection between signed and unsigned ranges IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating example contained an unsigned latch condition and a signed range check. One of the safe iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative value in the `IntersectRange` function; this led to a miscompile in which we deleted a range check without inserting a postloop where it was needed. This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if: 1. The range check is also unsigned, or 2. Safe iteration range of the range check lies within `[0, SINT_MAX]`. The same is done for signed latch. Values from `[0, SINT_MAX]` are unambiguous: these values are non-negative under any interpretation, and all values of a range intersected with such a range are also non-negative. We also use signed/unsigned min/max functions for range intersection depending on the type of the latch condition. Differential Revision: https://reviews.llvm.org/D38581 llvm-svn: 316552	2017-10-25 06:47:39 +00:00
Mikael Holmen	279790b674	[MemDep] DBG intrinsics don't impact abort limit for call site dependence analysis Summary: Memory dependence analysis no longer counts DbgInfoIntrinsics towards the limit where to abort the analysis. Before, a bunch of calls to dbg.value could affect the generated code, meaning that with -g we could generate different code than without. Reviewers: chandlerc, Prazek, davide, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39181 llvm-svn: 316551	2017-10-25 06:15:32 +00:00
Max Kazantsev	4332a943bc	[IRCE] Smarter detection of empty ranges using SCEV This patch replaces the naive emptiness check for SCEV ranges, which looks like `Begin == End`, with a SCEV check. The range is guaranteed to be empty if `Begin >= End`. We should filter such ranges out and not try to perform IRCE for them. For example, we can get such a range when intersecting ranges `[A, B)` and `[C, D)` where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`. This range is empty, but its `Begin` does not match its `End`. Performing IRCE for an empty range is basically safe but unprofitable because we never actually get into the main loop where the range checks are supposed to be eliminated. This patch uses SCEV mechanisms to treat loops with proved `Begin >= End` as empty. Differential Revision: https://reviews.llvm.org/D39082 llvm-svn: 316550	2017-10-25 06:10:02 +00:00
Peter Collingbourne	e0900acbd4	Assembly tests require x86 target. llvm-svn: 316546	2017-10-25 04:24:20 +00:00
Teresa Johnson	f99c84d548	[ThinLTO] Make test for promoted names more specific With r314527, promoted values get a suffix that is a decimal value of the module hash instead of hex. Change the regex to match only decimal suffix values. llvm-svn: 316544	2017-10-25 03:41:31 +00:00
Peter Collingbourne	689e6c052e	llvm-readobj: Add support for reading relocations in the Android packed format. This is in preparation for testing lld's upcoming relocation packing feature (D39152). I have verified that this implementation correctly unpacks the relocations from a Chromium DSO built with gold and the Android relocation packer for ARM32 and ARM64. Differential Revision: https://reviews.llvm.org/D39272 llvm-svn: 316543	2017-10-25 03:37:12 +00:00
Adrian Prantl	2eb7cbf987	Implement salvageDebugInfo functionality for SelectionDAG. Similar to how llvm::salvageDebugInfo hooks into InstCombine, this adds a hook that can be invoked before an SDNode that is associated with an SDDbgValue is erased to capture the effect of the deleted node in a DIExpression. The motivating example is an SDDbgValue attached to an ADD operation that gets folded into a LOAD+OFFSET operation. rdar://problem/32121503 llvm-svn: 316525	2017-10-24 22:55:12 +00:00
Artem Belevich	cb8f6328dc	[NVPTX] allow address space inference for volatile loads/stores. If particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS. Differential Revision: https://reviews.llvm.org/D39026 llvm-svn: 316495	2017-10-24 20:31:44 +00:00
Gadi Haber	323f2e1715	[X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU. Adding the scheduling information for the Broadwell (BDW) CPU target. This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target. We used the scheduling information retrieved from the Broadwell architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction. The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175. Performance fluctuations may be expected due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D39054 Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01 llvm-svn: 316492	2017-10-24 20:19:47 +00:00
Justin Bogner	6c452834a1	MIR: Print the register class or bank in vreg defs This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479	2017-10-24 18:04:54 +00:00
Stefan Pintilie	8f0c783095	[PowerPC] Try to simplify a Swap if it feeds a Splat If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Fixed the test case that was failing and recommit after pulling the original commit. Original revision is here: https://reviews.llvm.org/D39009 llvm-svn: 316478	2017-10-24 17:44:27 +00:00
Simon Pilgrim	5e8c3f328f	[X86][AVX] ComputeNumSignBitsForTargetNode - add support for X86ISD::VTRUNC llvm-svn: 316462	2017-10-24 17:04:57 +00:00
Simon Pilgrim	1bc62f03a5	[SelectionDAG] Add VSELECT support to ComputeNumSignBits llvm-svn: 316457	2017-10-24 16:38:38 +00:00
Saleem Abdulrasool	fb490a0bcc	PowerPC: support the separator character in the IAS PowerPC uses ; as a comment leader and the @ as a separator character. Support this properly. llvm-svn: 316454	2017-10-24 16:19:56 +00:00
Simon Pilgrim	0a12c239b6	[X86] truncateVectorCompareWithPACKSS - use PACKSSDW/PACKSSWB instead of just PACKSSWB. By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts and improves other combines driven by ComputeNumSignBits. llvm-svn: 316448	2017-10-24 15:38:16 +00:00
Sanjay Patel	f762c7b32f	[x86] add more vector ISA variants for memcmp expansion; NFC ...because every swiss cheese has different holes. llvm-svn: 316446	2017-10-24 15:27:47 +00:00
Oliver Stannard	103cca1af7	[ARM] Tighten up CHECK lines in a test These tests checked for the line number without a leading ":", so for example, a missed diagnostic on line 123 could match one on line 1123, 2123, etc, desynchronising the test for hundreds of lines. This couldn't cause it to incorrectly pass or fail, but made it hard to track down test failures. Differential revision: https://reviews.llvm.org/D39238 llvm-svn: 316442	2017-10-24 14:20:13 +00:00
Oliver Stannard	03ded27bbc	[ARM] Error for invalid shift in memory operand Report a diagnostic when we fail to parse a shift in a memory operand because the shift type is not an identifier. Without this, we were silently ignoring the whole instruction. Differential revision: https://reviews.llvm.org/D39237 llvm-svn: 316441	2017-10-24 14:19:08 +00:00
Andrew V. Tischenko	f4fbe4a51b	Update f16c instruction scheduling on btver2. Differential Revision: https://reviews.llvm.org/D39051 llvm-svn: 316435	2017-10-24 13:38:30 +00:00
Zvi Rackover	31b101a186	X86CallFrameOptimization: Recognize 'store 0/-1 using and/or' idioms Summary: r264440 added or/and patterns for storing -1 or 0 with the intention of decreasing code size. However, X86CallFrameOptimization does not recognize these memory accesses so it will not replace them with push's when profitable. This patch fixes this problem by teaching X86CallFrameOptimization these store 0/-1 idioms. An alternative fix would be to prevent the 'store 0/-1 idioms' patterns from firing when accessing the stack. This would save the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire, we may end up with cases where neither X86CallFrameOptimization nor the patterns for 'store 0/-1 idioms' fire. Fixes pr34863 Reviewers: DavidKreitzer, guyblank, aymanmus Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38738 llvm-svn: 316431	2017-10-24 12:13:05 +00:00
Bjorn Pettersson	1c043a9f28	[ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP Summary: Got asserts in llvm::CastInst::getCastOpcode saying: `DestBits == SrcBits && "Illegal cast to vector (wrong type or size)"' failed. Problem seemed to be that llvm::ConstantFoldCastInstruction did not handle ptrtoint cast of a getelementptr returning a vector correctly. I assume such situations are quite rare, since the GEP needs to be considered as a constant value (base pointer being null). The solution used here is to simply avoid the constant fold of ptrtoint when the value is a vector. It is not supported, and by bailing out we do not fail on assertions later on. Reviewers: craig.topper, majnemer, davide, filcab, efriedma Reviewed By: efriedma Subscribers: efriedma, filcab, llvm-commits Differential Revision: https://reviews.llvm.org/D38546 llvm-svn: 316430	2017-10-24 12:08:11 +00:00
George Rimar	a17480d602	[llvm-dwarfdump] - Cleanup of gnu_call_site.s. NFC. This change fixes values of test so that it passes -verify without errors and also adds comments. Test was introduced in D39119 and intention was to check that tool is able to dump few DW_GNU_call_site tags and attributes, so that change is NFC cleanup. llvm-svn: 316428	2017-10-24 11:44:19 +00:00
Marek Olsak	ce76ea0394	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427	2017-10-24 10:27:13 +00:00
Marek Olsak	2114fc3bcb	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426	2017-10-24 10:26:59 +00:00
Oliver Stannard	c507b370a1	[ARM] Remove tCPS alias which just crashed This alias caused a crash when trying to print the "cps #0" instruction in a diagnostic for thumbv6 (which doesn't have that instruction). The comment was incorrect, this instruction is UNPREDICTABLE if no flag bits are set, so I don't think it's worth keeping. Differential Revision: https://reviews.llvm.org/D39191 llvm-svn: 316420	2017-10-24 08:55:36 +00:00
Zvi Rackover	3c0d385598	X86: Fix X86CallFrameOptimization to search for the COPY StackPointer SelectionDAG inserts a copy of ESP into a virtual register. X86CallFrameOptimization assumed that the COPY, if present, is always right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a wrong assumption as the COPY can be located anywhere between the call-frame setup instruction and its first use. If the COPY happened to be located in a different location than what X86CallFrameOptimization assumed, visiting it while processing the call chain would lead to a conservative bail-out. The fix is quite straightforward: scan ahead for the stack-pointer copy and make note of it so it can be ignored while processing the call chain. Fixes pr34903 Differential Revision: https://reviews.llvm.org/D38730 llvm-svn: 316416	2017-10-24 07:38:29 +00:00
Omer Paparo Bivas	2251c79aba	[MC] Adding code padding for performance stability - infrastructure. NFC. Infrastructure designed for padding code with nop instructions in key places such that performance improvement will be achieved. The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known. This patch by itself is NFC. Future patches will make use of this infrastructure to implement required policies for code padding. Reviewers: aaboud zvi craig.topper gadi.haber Differential revision: https://reviews.llvm.org/D34393 Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e llvm-svn: 316413	2017-10-24 06:16:03 +00:00
Zvi Rackover	c6d0b6c103	X86: Register the X86CallFrameOptimization pass Summary: The motivation of this change is to enable .mir testing for this pass. Added one test case to cover the functionality, this same case will be improved by a future patch. Reviewers: igorb, guyblank, DavidKreitzer Reviewed By: guyblank, DavidKreitzer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38729 llvm-svn: 316412	2017-10-24 05:47:07 +00:00
Saleem Abdulrasool	619b3269fd	ObjCARC: do not increment past the end of the BB The `BasicBlock::getFirstInsertionPt` call may return `std::end` for the BB. Dereferencing the end iterator results in an assertion failure "(!NodePtr->isKnownSentinel()), function operator*". Ensure that the returned iterator is valid before dereferencing it. If the end is returned, move one position backward to get a valid insertion point. llvm-svn: 316401	2017-10-24 00:09:10 +00:00
Reid Kleckner	0e88118dd7	[codeview] Add support for inlinee lists This adds type index discovery and dumper support for symbol record kind 0x1168, which is a list of inlined function ids. This symbol kind is undocumented, but S_INLINEES is consistent with the existing nomenclature. Fixes PR34222 llvm-svn: 316398	2017-10-23 23:43:40 +00:00
Jessica Paquette	9df7fde269	[MachineOutliner] Add optimisation remarks for successful outlining This commit adds optimisation remarks for outlining which fire when a function is successfully outlined. To do this, OutlinedFunctions must now contain references to their Candidates. Since the Candidates must still be sorted and worked on separately, this is done by working on everything in terms of shared_ptrs to Candidates. This is good; it means that we can easily move everything to outlining in terms of the OutlinedFunctions rather than the individual Candidates. This is far more intuitive than what's currently there! (Remarks are output when a function is created for some group of Candidates. In a later commit, all of the outlining logic should be rewritten so that we loop over OutlinedFunctions rather than over Candidates.) llvm-svn: 316396	2017-10-23 23:36:46 +00:00
Aditya Nandakumar	921f24cef1	[GISel][ARM]: Fix illegal Generic copies in tests This is in preparation for a verifier check that makes sure copies are of the same size (when generic virtual registers are involved). llvm-svn: 316388	2017-10-23 22:53:08 +00:00
Aditya Nandakumar	4dfd2590dc	[GISel][AArch64]: Fix illegal Generic copies in tests This is in preparation for a verifier check that makes sure copies are of the same size (when generic virtual registers are involved). llvm-svn: 316387	2017-10-23 22:53:04 +00:00
Rong Xu	e1f4245f8d	[PM] Add pgo-memop-opt pass to the new pass manager This patch adds the pgo-memop-opt pass to the new pass manager. It is in the old pass manager but was somehow left out of the new pass manager. Differential Revision: http://reviews.llvm.org/D39145 llvm-svn: 316384	2017-10-23 22:21:29 +00:00
Simon Pilgrim	321e54f72d	[X86][SSE] combineBitcastvxi1 - use PACKSSWB directly to pack v8i16 to v16i8 Avoid difficulties determining the number of sign bits later on in shuffle lowering to lower to PACKSS llvm-svn: 316383	2017-10-23 22:05:02 +00:00
George Burgess IV	8a0e4bc972	Don't crash when we see unallocatable registers in clobbers This fixes a bug where we'd crash given code like the test-case from https://bugs.llvm.org/show_bug.cgi?id=30792 . Instead, we let the offending clobber silently slide through. This doesn't fully fix said bug, since the assembler will still complain the moment it sees a crypto/fp/vector op, and we still don't diagnose calls that require vector regs. Differential Revision: https://reviews.llvm.org/D39030 llvm-svn: 316374	2017-10-23 20:46:36 +00:00
Stefan Pintilie	52bbd587ac	Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat" Revert commit r316366. Previous commit causes p8-scalar_vector_conversions.ll to fail. This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92. llvm-svn: 316371	2017-10-23 20:22:23 +00:00
Krzysztof Parzyszek	6f06b6edff	[Hexagon] Return the correct chain edge for i1 function calls In HexagonISelLowering, there is code to handle the case when a function returns an i1 type. In this case, we need to generate extra nodes to copy the result from R0 to a predicate register. The code was returning the wrong value for the chain edge which caused an assert "Wrong topological sorting" when converting the instructions to MIs. This patch fixes the problem by returning the chain for the final copy. Patch by Brendon Cahoon. llvm-svn: 316367	2017-10-23 19:35:25 +00:00
Stefan Pintilie	feafa1d7f0	[PowerPC] Try to simplify a Swap if it feeds a Splat If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Differential Revision: https://reviews.llvm.org/D39009 llvm-svn: 316366	2017-10-23 19:33:31 +00:00
Krzysztof Parzyszek	273678823b	[Hexagon] Add extra pattern for S4_addaddi One combination was missing: add(add(x,y),c). llvm-svn: 316363	2017-10-23 19:07:50 +00:00
Daniel Sanders	d66e0901ae	[globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero. This patch enables the import of stores. Unfortunately, doing so by itself, loses an optimization where storing 0 to memory makes use of WZR/XZR. To mitigate this, this patch also introduces a new feature that allows register operands to nominate a zero register. When this is done, GlobalISel will substitute (G_CONSTANT 0) with the nominated register automatically. This is currently configured to only apply to the stores. Applying it to GPR32/GPR64 register classes in general will be done after review see (https://reviews.llvm.org/D39150). llvm-svn: 316360	2017-10-23 18:19:24 +00:00
Vedant Kumar	35b50a83ab	[wasm] readSection: Avoid reading past eof (fixes oss-fuzz #3219 ) A wasm file crafted with a bogus section size can trigger an ASan issue in the DWARFObjInMemory constructor. Nip the problem in the bud when we read the wasm section. Found by OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3219 Differential Revision: https://reviews.llvm.org/D38777 llvm-svn: 316357	2017-10-23 18:04:34 +00:00
Simon Pilgrim	6bda996cab	[X86][SSE] Regenerate PACKSS tests on 32 + 64-bit targets llvm-svn: 316354	2017-10-23 17:50:40 +00:00
Sanjay Patel	bdc0bf7ba9	[PassManager] add test to show the new PM uses -latesimplifycfg early; NFC llvm-svn: 316351	2017-10-23 17:30:17 +00:00
Matt Arsenault	b791802aef	AMDGPU: Fix default range in non-kernel functions The range should be assumed to be the hardware maximum if a workitem intrinsic is used in a callable function which does not know the restricted limit of the calling kernel. llvm-svn: 316346	2017-10-23 17:09:35 +00:00
Andrew V. Tischenko	777308b548	Update DPPD/DPPS instruction scheduling on btver2. Differential Revision: https://reviews.llvm.org/D39046 llvm-svn: 316334	2017-10-23 15:53:30 +00:00
Craig Topper	8f182fdd8b	[X86] Add PTWRITE instruction for assembler and disassembler. llvm-svn: 316333	2017-10-23 15:53:21 +00:00
Craig Topper	5f0339d2f3	[X86] Add RDPID instruction for assembler and disassembler. llvm-svn: 316332	2017-10-23 15:53:16 +00:00
Simon Pilgrim	32da2f9245	[DAGCombine] Permit combining of shuffles of equivalent splat BUILD_VECTORs combineShuffleOfScalars is very conservative about shuffled BUILD_VECTORs that can be combined together. This patch adds one additional case - if both BUILD_VECTORs represent splats of the same scalar value but with different UNDEF elements, then we should create a single splat BUILD_VECTOR, sharing only the UNDEF elements defined by the shuffle mask. Differential Revision: https://reviews.llvm.org/D38696 llvm-svn: 316331	2017-10-23 15:48:08 +00:00
Simon Pilgrim	03c8753924	[X86][SSE] Regenerate bitcast-and-setcc tests Avoid the retl/retq changes in an upcoming patch llvm-svn: 316328	2017-10-23 14:47:49 +00:00
Simon Pilgrim	e131cb0bd5	[X86][AVX2] Regenerate AVX2 intrinsics tests on 32 + 64-bit targets llvm-svn: 316326	2017-10-23 14:19:46 +00:00
Simon Pilgrim	c680c4742b	[X86][AVX] Regenerate AVX intrinsics tests on 32 + 64-bit targets llvm-svn: 316325	2017-10-23 14:17:59 +00:00
Simon Pilgrim	eae6e9dbc5	[X86][F16C] Regenerate F16C schedule tests llvm-svn: 316324	2017-10-23 14:15:24 +00:00
Artur Gainullin	610df9c890	Test commit. llvm-svn: 316322	2017-10-23 13:25:49 +00:00
George Rimar	7fc298afe4	[llvm-dwarfdump] - Teach tool about few GNU call_sites constants. This teaches the tool about the following constants: DW_TAG_GNU_call_site, DW_TAG_GNU_call_site_parameter, DW_AT_GNU_call_site_value, DW_AT_GNU_all_call_sites. Constants documented here: https://sourceware.org/elfutils/DwarfExtensions Differential revision: https://reviews.llvm.org/D39119 llvm-svn: 316321	2017-10-23 11:24:14 +00:00
Ayman Musa	4b2bd5ff5e	[X86] Add test for opportunity to use bzhi X86 instruction instead of load+and instructions. Transformation uploaded for CR in https://reviews.llvm.org/D34141. llvm-svn: 316320	2017-10-23 10:24:19 +00:00
Andrew V. Tischenko	eff4fc0d41	Fix for Bug 30718 - Failure to disassemble certain MOV with rex.R. The issue was in illegal segment register index. Differential Revision: https://reviews.llvm.org/D38786 llvm-svn: 316319	2017-10-23 09:36:33 +00:00
Sam Parker	487ab86942	[ARM] Allow unrolling of multi-block loops. Before, loop unrolling was only enabled for loops with a single block. This restriction has been removed and replaced by: - allow a maximum of two exiting blocks, - a four basic block limit for cores with a branch predictor. Differential Revision: https://reviews.llvm.org/D38952 llvm-svn: 316313	2017-10-23 08:05:14 +00:00
Craig Topper	326008c615	[X86] Fix disassembly of EVEX rounding control and SAE instructions. Fixes PR31955. llvm-svn: 316308	2017-10-23 02:26:24 +00:00
Yichao Yu	92c11ee352	Fix invalid ptrtoint in InstCombine Summary: It's unclear if this is the only thing we can do but at least this is consistent with the check of address space agreement in `isBitCastable`. The code is used at least in both instcombine and jumpthreading though I could only find a way to trigger the invalid cast in instcombine. Reviewers: loladiro, sanjoy, majnemer Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34335 llvm-svn: 316302	2017-10-22 20:28:17 +00:00
Sanjay Patel	b80daf0b48	[SimplifyCFG] delay switch condition forwarding to -latesimplifycfg As discussed in D39011: https://reviews.llvm.org/D39011 ...replacing constants with a variable is inverting the transform done by other IR passes, so we definitely don't want to do this early. In fact, it's questionable whether this transform belongs in SimplifyCFG at all. I'll look at moving this to codegen as a follow-up step. llvm-svn: 316298	2017-10-22 19:10:07 +00:00
Marina Yatsina	f9371d821f	Add logic to greedy reg alloc to avoid bad eviction chains This fixes bugzilla 26810 https://bugs.llvm.org/show_bug.cgi?id=26810 This is intended to prevent sequences like: movl %ebp, 8(%esp) # 4-byte Spill movl %ecx, %ebp movl %ebx, %ecx movl %edi, %ebx movl %edx, %edi cltd idivl %esi movl %edi, %edx movl %ebx, %edi movl %ecx, %ebx movl %ebp, %ecx movl 16(%esp), %ebp # 4-byte Reload Such sequences are created in 2 scenarios: Scenario #1: vreg0 is evicted from physreg0 by vreg1 Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from) Region splitting creates a local interval because of interference with the evictor vreg1 (normally region splitting creates 2 intervals, the "by reg" and "by stack" intervals. Local interval created when interference occurs.) one of the split intervals ends up evicting vreg2 from physreg1 Evictee vreg2 is intended for region splitting with split candidate physreg1 one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills Scenario #2 vreg0 is evicted from physreg0 by vreg1 vreg2 is evicted from physreg2 by vreg3 etc Evictee vreg0 is intended for region splitting with split candidate physreg1 Region splitting creates a local interval because of interference with the evictor vreg1 one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from) Another evictee vreg2 is intended for region splitting with split candidate physreg1 one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills As compile time was a concern, I've added a flag to control whether we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest). Differential Revision: https://reviews.llvm.org/D35816 Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39 llvm-svn: 316295	2017-10-22 17:59:38 +00:00
Sanjay Patel	24226504a7	[SimplifyCFG] try harder to forward switch condition to phi (PR34471) The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen: int switcher(int x) { switch(x) { case 17: return 17; case 19: return 19; case 42: return 42; default: break; } return 0; } int comparator(int x) { if (x == 17) return 17; if (x == 19) return 19; if (x == 42) return 42; return 0; } For the first example, we use a bit-test optimization to avoid a series of compare-and-branch: https://godbolt.org/g/BivDsw Differential Revision: https://reviews.llvm.org/D39011 llvm-svn: 316293	2017-10-22 16:51:03 +00:00
Momchil Velikov	d6a4ab3d49	[ARM] Dynamic stack alignment for 16-bit Thumb This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When targeting processors, which support only the 16-bit Thumb instruction set the compiler ignores the alignment attributes of automatic variables and may silently generate incorrect code. Differential revision: https://reviews.llvm.org/D38143 llvm-svn: 316289	2017-10-22 11:56:35 +00:00
Guy Blank	92d5ce3bd4	[X86] Add a pass to convert instruction chains between domains. The pass scans the function to find instruction chains that define registers in the same domain (closures). It then calculates the cost of converting the closure to another domain. If found profitable, the instructions are converted to instructions in the other domain and the register classes are changed accordingly. This commit adds the pass infrastructure and a simple conversion from the GPR domain to the Mask domain. Differential Revision: https://reviews.llvm.org/D37251 Change-Id: Ic2cf1d76598110401168326d411128ae2580a604 llvm-svn: 316288	2017-10-22 11:43:08 +00:00
Nitesh Jain	757f74c2d3	[mips] Adds support for R_MIPS_26, HIGHER, HIGHEST relocations in RuntimeDyld. Reviewers: sdardis Subscribers: jaydeep, bhushan, llvm-commits Differential Revision: https://reviews.llvm.org/D38314 llvm-svn: 316287	2017-10-22 09:47:41 +00:00
Craig Topper	e975127db6	[X86] Teach the disassembler that some instructions use VEX.W==0 without a corresponding VEX.W==1 instruction and we shouldn't treat them as if VEX.W is ignored. Fixes PR11304. llvm-svn: 316285	2017-10-22 06:18:26 +00:00
Craig Topper	158bc6474a	[X86] Don't allow gather/scatter to disassemble if memory operand does not use a SIB byte. Fixes PR34998. llvm-svn: 316282	2017-10-22 04:32:30 +00:00
Aaron Ballman	fc02869c96	Reverting r316270 due to failing build bots. http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899 http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951 llvm-svn: 316276	2017-10-21 20:38:15 +00:00
Simon Pilgrim	3cb024490a	[X86][SSE] Add extractps/pextrd equivalence to domain tables Differential Revision: https://reviews.llvm.org/D39135 llvm-svn: 316274	2017-10-21 20:19:48 +00:00
Craig Topper	ca2382d809	[X86] Fix disassembling of EVEX instructions to stop accidentally decoding the SIB index register as an XMM/YMM/ZMM register. This introduces a new operand type to encode whether the index register should be XMM/YMM/ZMM, and new code to fix up the results created by readSIB. This has the nice effect of removing a bunch of code that hard coded the name of every GATHER and SCATTER instruction to map the index type. This fixes PR32807. llvm-svn: 316273	2017-10-21 20:03:20 +00:00
Fangrui Song	c7b749bd06	[PPC CodeGen] Fix the bitreverse.i64 intrinsic. Summary: The two 32-bit words were swapped. Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D38705 llvm-svn: 316270	2017-10-21 16:59:40 +00:00
Simon Pilgrim	7025b07828	[X86][SSE] Add missing extractps scheduling test llvm-svn: 316262	2017-10-21 14:35:09 +00:00
David Green	907b60fbba	[LoopInterchange] Fix phi node ordering miscompile. The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis. Differential Revision: https://reviews.llvm.org/D38682 llvm-svn: 316261	2017-10-21 13:58:37 +00:00
Craig Topper	fcf27188d7	[X86] Do not generate __multi3 for mul i128 on X86 Summary: __multi3 is not available on x86 (32-bit). Setting lib call name for MULI_128 to nullptr forces DAGTypeLegalizer::ExpandIntRes_MUL to generate instructions for 128-bit multiply instead of a call to an undefined function. This fixes PR20871 though it may be worth looking at why licm and indvars combine to generate 65-bit multiplies in that test. Patch by Riyaz V Puthiyapurayil Reviewers: craig.topper, schweitz Reviewed By: craig.topper, schweitz Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D38668 llvm-svn: 316254	2017-10-21 02:26:00 +00:00
Krzysztof Parzyszek	9d19c8cac9	[Packetizer] Add function to check for aliasing between instructions llvm-svn: 316243	2017-10-20 22:08:40 +00:00
Sam Clegg	12fd3da9d1	[WebAssembly] MC: Fix crash when -g specified. At this point we don't output any debug sections or their relocations. Differential Revision: https://reviews.llvm.org/D39076 llvm-svn: 316240	2017-10-20 21:28:38 +00:00
Daniel Sanders	1e4569fdc1	[globalisel][tablegen] Fix small spelling nits. NFC ComplexRendererFn -> ComplexRendererFns Corrected a couple lingering references to tied operands that were missed. llvm-svn: 316237	2017-10-20 20:55:29 +00:00
Krzysztof Parzyszek	022922b31a	[Hexagon] Report error instead of crashing on wrong inline-asm constraints llvm-svn: 316236	2017-10-20 20:24:44 +00:00
Krzysztof Parzyszek	64e5d7d3ae	[Hexagon] Reorganize and update instruction patterns llvm-svn: 316228	2017-10-20 19:33:12 +00:00
Simon Pilgrim	1311ff1340	[X86][SSE] Add missing _mm_extract_ps fast-isel test llvm-svn: 316226	2017-10-20 19:29:01 +00:00
Sanjay Patel	bb94161fb7	[x86] avoid FileCheck assert duplication with retl/retq regex; NFC This was suggested in PR35003: https://bugs.llvm.org/show_bug.cgi?id=35003 32-bit checks may be identical to 64-bit (if we avoid those pesky scalar params!). I'll check in the script change shortly assuming this doesn't anger any bots. llvm-svn: 316223	2017-10-20 18:35:32 +00:00
Dave Lee	f9b72327b0	Make x86 __ehhandler comdat if parent function is Summary: This change comes from using lld for i686-windows-msvc. Before this change, lld emits an error of: error: relocation against symbol in discarded section: .xdata It's possible that this could be addressed in lld, but I think this change is reasonable on its own. At a high level, this is being generated: A (.text comdat) -> B (.text) -> C (.xdata comdat) Where A is a C++ inline function, which references B, an exception handler thunk, which references C, the exception handling info. With this structure, lld will error when applying relocations to B if the C it references has been discarded (some other C has been selected). This change checks if A is comdat, and if so places the exception registration thunk (B) in the comdat group of A (and B). It appears that MSVC makes the __ehhandler function comdat. Is it possible that duplicate thunks are being emitted into the final binary with other linkers, or are they stripping the unused thunks? Reviewers: rnk, majnemer, compnerd, smeenai Reviewed By: rnk, compnerd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38940 llvm-svn: 316219	2017-10-20 17:04:43 +00:00
Krzysztof Parzyszek	3818aeaeb9	[Hexagon] Allow redefinition with immediates for hw loop conversion Normally, if the registers holding the induction variable's bounds are redefined inside of the loop's body, the loop cannot be converted to a hardware loop. However, if the redefining instruction is actually loading an immediate value into the register, this conversion is both possible and legal (since the immediate itself will be used in the loop setup in the preheader). llvm-svn: 316218	2017-10-20 16:56:33 +00:00
Simon Pilgrim	b6b617b7d8	[X86] Check all CPU target names. We ignore the 32-bit/64-bit triple but I've tried to use i686 triples for CPUs that don't support x86_64 llvm-svn: 316217	2017-10-20 16:55:51 +00:00
Zvi Rackover	e95709d54a	X86 Tests: Add tests for vector permutes with variable indices. NFC. Basic tests which are the equivalent of single-source shufflevector with variable mask. llvm-svn: 316216	2017-10-20 15:32:14 +00:00
Aleksandar Beserminji	143572984d	Revert "[mips] Reordering callseq* nodes to be linear" This reverts commit r314507, because the original patch is causing test failures. llvm-svn: 316215	2017-10-20 14:35:41 +00:00
Eugene Leviant	27b226fb65	[ARM] Use post-RA MI scheduler when +use-misched is set Differential revision: https://reviews.llvm.org/D39100 llvm-svn: 316214	2017-10-20 14:29:17 +00:00
Simon Pilgrim	46b791921f	[X86][AVX512] Regenerate regcall tests. As part of tracking down machine verifier issues (PR27481) llvm-svn: 316213	2017-10-20 14:13:02 +00:00
Nikolai Bozhenov	fa8c5514c5	[ValueTracking] Enabling ValueTracking patch by default (recommit #2 after checking for timeout issue). The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. Reviewers: reames, hfinkel Differential Revision: https://reviews.llvm.org/D34101 Patch by: Olga Chupina <olga.chupina@intel.com> llvm-svn: 316208	2017-10-20 10:08:47 +00:00
Max Kazantsev	1c839629aa	Add test case for LoopSink pass This test checks that load from constant memory will be sunk regardless of aliasing stores in the loop. Patch by Daniil Suchkov! Differential Revision: https://reviews.llvm.org/D39113 llvm-svn: 316207	2017-10-20 06:40:48 +00:00
Dylan McKay	6670e42402	[AVR] Fix the select-mbb-placement-bug.ll llvm-svn: 316205	2017-10-20 04:17:14 +00:00
Lang Hames	716a142940	[ExecutionEngine] Temporarily remove the ExecutionEngine tls tests. Will re-enable once I figure out why the necessary runtime functions are missing on some bots. llvm-svn: 316203	2017-10-20 01:18:00 +00:00
Lang Hames	8eec91e96d	[ExecutionEngine] After a heroic dev-meeting hack session, the JIT supports TLS. Turns on EmulatedTLS support by default in EngineBuilder. ;) llvm-svn: 316200	2017-10-20 00:53:16 +00:00
Nemanja Ivanovic	0026c06e11	Disabling the transformation introduced in r315888 The commit at https://reviews.llvm.org/rL315888 is causing some failures with internal testing. Disabling this code until we can resolve the issues. llvm-svn: 316199	2017-10-20 00:36:46 +00:00
Alex Bradbury	8971842f43	[RISCV] Initial codegen support for ALU operations This adds the minimum necessary to support codegen for simple ALU operations on RV32. Prolog and epilog insertion, support for memory operations etc etc follow in future patches. Leave guessInstructionProperties=1 until https://reviews.llvm.org/D37065 is reviewed and lands. Differential Revision: https://reviews.llvm.org/D29933 llvm-svn: 316188	2017-10-19 21:37:38 +00:00
Simon Pilgrim	e8e2c4c0cf	[X86][AES] Test AES intrinsics on 32/64-bit targets with/without VEX encoding Don't just test on 32-bit llvm-svn: 316176	2017-10-19 19:05:04 +00:00
Graham Yiu	488782efa3	The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost. Committing on behalf of Brad Nemanich (brad.nemanich@ibm.com) Differential Revision: https://reviews.llvm.org/D38961 llvm-svn: 316174	2017-10-19 18:16:31 +00:00
Krzysztof Parzyszek	e4d0e199bf	[Hexagon] Fix store conversion from rr to io in optimize addressing modes llvm-svn: 316170	2017-10-19 16:59:22 +00:00
Saleem Abdulrasool	1261151912	ExecutionEngine: adjust COFF i386 tautological asserts Modify static_casts to not be tautological in some COFF i386 relocations. Patch by Alex Langford! llvm-svn: 316169	2017-10-19 16:57:40 +00:00
Alex Bradbury	3c941e7ed9	[RISCV] RISCVAsmParser: early exit if RISCVOperand isn't immediate as expected This is necessary to avoid an assertion in the included test case and similar assembler inputs. llvm-svn: 316168	2017-10-19 16:22:51 +00:00
Nikolai Bozhenov	8dcab54cb4	Revert r315992 because of a found miscompilation failure llvm-svn: 316164	2017-10-19 15:36:18 +00:00
Simon Pilgrim	fdd63d1535	[X86] Replace custom scalar integer absolute matching with ISD::ABS lowering. x86 has its own copy of integer absolute pattern matching to combine directly to a SUB+CMOV. This patch removes the x86 combine and adds custom lowering support for ISD::ABS instead, allowing us to use the DAGCombiner version. Additional test cases are already covered by iabs.ll (rL315706 and rL315711). Differential Revision: https://reviews.llvm.org/D38895 llvm-svn: 316162	2017-10-19 15:02:24 +00:00
Simon Pilgrim	d0649f978f	[X86] Add scalar (abs (abs x)) -> (abs x) combine test. Before landing D38895 llvm-svn: 316160	2017-10-19 14:59:26 +00:00
Diana Picus	7bf71008aa	[ARM GlobalISel] Fix liveins in test. NFC llvm-svn: 316155	2017-10-19 09:28:19 +00:00
Diana Picus	a993859335	[ARM GlobalISel] Remove redundant tests These test cases don't really add anything that isn't covered by other tests as well, so we can safely remove them. llvm-svn: 316154	2017-10-19 08:50:28 +00:00
Rafael Espindola	55680d0add	Fix buffer overflow. We were reading past the end of the buffer. llvm-svn: 316143	2017-10-19 01:25:48 +00:00
Justin Bogner	876ad287d1	GISel: Canonicalize select tests using update_mir_test_checks This runs `update_mir_test_checks --add-vreg-checks` on the tests that are already more or less in the format that it generates, so that there will be less churn in some upcoming changes. llvm-svn: 316139	2017-10-18 23:33:31 +00:00
Justin Bogner	f8dc015bd1	AArch64/GISel: Modernize the localizer test llvm-svn: 316138	2017-10-18 23:26:24 +00:00
Justin Bogner	d45849f703	Canonicalize a large number of mir tests using update_mir_test_checks This converts a large and somewhat arbitrary set of tests to use update_mir_test_checks. I ran the script on all of the tests I expect to need to modify for an upcoming mir syntax change and kept the ones that obviously didn't change the tests in ways that might make it harder to understand. llvm-svn: 316137	2017-10-18 23:18:12 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Sam Clegg	799f55cb32	Fix lit.site.cfg.py.in after rL316123 llvm-svn: 316126	2017-10-18 20:46:05 +00:00
Dylan McKay	443695f80a	[AVR] Fix the select_mbb_placement_bug.ll test llvm-svn: 316124	2017-10-18 20:04:57 +00:00
Sam Clegg	0be459e066	Don't set static-libs test feature when using LLVM_LINK_LLVM_DYLIB This was causing execname-options.ll to fail on the wasm waterfall. Differential Revision: https://reviews.llvm.org/D39022 llvm-svn: 316123	2017-10-18 19:37:30 +00:00
Vedant Kumar	9cbd33fec9	[llvm-cov] Suppress sub-line highlights in simple cases llvm-cov tends to highlight too many regions because its policy is to highlight all region entry segments. This can look confusing to users: not all region entry segments are interesting and deserve highlighting. Emitting these highlights only when the region count differs from the line count is a more user-friendly policy. llvm-svn: 316109	2017-10-18 18:52:29 +00:00
Vedant Kumar	988faf87f8	[llvm-cov] Highlight gaps in consecutive uncovered regions llvm-cov typically doesn't highlight gap segments, but it should if the gap occurs after an uncovered region in order to preserve continuity. llvm-svn: 316107	2017-10-18 18:52:27 +00:00
Sumanth Gundapaneni	e1983bcf55	[Hexagon] New HVX target features. This patch lets the llvm tools handle the new HVX target features that are added by frontend (clang). The target-features are of the form "hvx-length64b" for 64 Byte HVX mode, "hvx-length128b" for 128 Byte mode HVX. "hvx-double" is an alias to "hvx-length128b" and will soon be deprecated. The hvx version target feature is updated from "+hvx" to "+hvxv{version_number}". Eg: "+hvxv62" For the correct HVX code generation, the user must use the following target features. For 64B mode: "+hvxv62" "+hvx-length64b" For 128B mode: "+hvxv62" "+hvx-length128b" Clang picks a default length if none is specified. If for some reason, no hvx-length is specified to llvm, the compilation will bail out. There is a corresponding clang patch. Differential Revision: https://reviews.llvm.org/D38851 llvm-svn: 316101	2017-10-18 18:07:07 +00:00
Konstantin Zhuravlyov	8d5e9e110c	AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency Differential Revision: https://reviews.llvm.org/D38957 llvm-svn: 316097	2017-10-18 17:31:09 +00:00
Alex Bradbury	13ce95b77f	[RISCV] Bugfix createRISCVELFObjectWriter r315275 set the IsLittleEndian parameter incorrectly. This patch corrects this, and adds a test to ensure such mistakes will be caught in the future. llvm-svn: 316091	2017-10-18 16:11:31 +00:00
Justin Bogner	2ac32cc9ce	AArch64/GISel: Fix a couple of tests that were testing the wrong thing Fix a couple of tests that were extending the wrong vreg, and regenerate their checks with update_mir_test_checks. This looks like it was a copy-paste or test update error. llvm-svn: 316087	2017-10-18 15:34:33 +00:00
Andre Vieira	d4a25707f0	[ARM] Fix disassembly for conditional VMRS and VMSR instructions in ARM mode Differential Revision: https://reviews.llvm.org/D38347 llvm-svn: 316085	2017-10-18 14:47:37 +00:00
Simon Dardis	03c2c65b2d	[mips] Fix analyzeBranch to handle debug data In the case where there was a conditional branch followed by an unconditional branch with a debug instruction separating them, MipsInstrInfo::analyzeBranch would not skip past the debug instruction when searching for the second branch, which gives erroneous results about the control flow of the block. This could lead the branch folder to merge the non-fall through case into its predecessor, leaving the conditional branch with a dangling basic block operand. This resolves PR34975. Thanks to Alexander Richardson for reporting the issue! Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39003 llvm-svn: 316084	2017-10-18 14:35:29 +00:00
Simon Dardis	77bf0fd59c	[mips] Move test to correct directory. NFCI llvm-svn: 316081	2017-10-18 13:59:48 +00:00
Michael Zuckerman	7ba046c784	Adding new test for bug fix 316067 https://bugs.llvm.org/show_bug.cgi?id=34978 This test checks that the x86-interleaved ends without any assertion. Change-Id: I1e970482a4d0404516cbc85517fc091bb21c35a8 llvm-svn: 316080	2017-10-18 13:51:31 +00:00
Michael Zuckerman	49293264cc	[AVX512][AVX2]Cost calculation for interleave load/store patterns {v8i8,v16i8,v32i8,v64i8} This patch adds accurate instructions cost. The formula presents two cases(stride 3 and stride 4) and calculates the cost according to the VF and stride. Reviewers: 1. delena 2. Farhana 3. zvi 4. dorit 5. Ayal Differential Revision: https://reviews.llvm.org/D38762 Change-Id: If4cfbd4ac0e63694e8144cb78c7fa34850647ff7 llvm-svn: 316072	2017-10-18 11:41:55 +00:00
Hiroshi Inoue	5388e66d3a	[PowerPC] Use helper functions to check sign-/zero-extended value Helper functions to identify sign- and zero-extending machine instruction is introduced in rL315888. This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also makes possible more optimizations since the helper can do more analysis than the original check code; I observed about 5000 more compare instructions are eliminated while building LLVM. Also, this patch fixes a bug in helpers on ANDIo instruction handling due to the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr. Differential Revision: https://reviews.llvm.org/D38988 llvm-svn: 316071	2017-10-18 10:31:19 +00:00
Nikolai Bozhenov	74c047eabb	Improve lookThroughCast function. Summary: When we have the following case: %cond = cmp iN %x, CmpConst %tr = trunc iN %x to iK %narrowsel = select i1 %cond, iK %t, iK C We could possibly match only min/max pattern after looking through cast. So it is more profitable if the widened C constant is equal to CmpConst. That is why we just set the widened C constant equal to CmpConst, because there is a further check in this function that trunc CmpConst == C. Also a description for the lookThroughCast function was added. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38536 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 316070	2017-10-18 09:28:09 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Adrian Prantl	5a82f0a470	Verifier: Ignore CUs pulled in by ODR-uniqued types. When more than one Module is imported into the same context, such as during an LTO build before linking the modules, ODR type uniquing may cause types to point to a different CU. This check does not make sense in this case. This fixes the error reported in PR34944. https://bugs.llvm.org/show_bug.cgi?id=34944 rdar://problem/34940685 This reapplies a cleaner implementation of r316049. llvm-svn: 316052	2017-10-18 01:11:01 +00:00
Adrian Prantl	fe8226fd94	Revert "Verifier: Ignore CUs pulled in by ODR-uniqued types." This reverts commit r316049. llvm-svn: 316050	2017-10-18 00:54:31 +00:00
Adrian Prantl	f9a1cf6dcc	Verifier: Ignore CUs pulled in by ODR-uniqued types. When more than one Module is imported into the same context, such as during an LTO build before linking the modules, ODR type uniquing may cause types to point to a different CU. This check does not make sense in this case. This fixes the error reported in PR34944. https://bugs.llvm.org/show_bug.cgi?id=34944 rdar://problem/34940685 llvm-svn: 316049	2017-10-18 00:49:31 +00:00
Wei Ding	7ab1f7a421	AMDGPU : Fix an error for the llvm.cttz implementation. Differential Revision: http://reviews.llvm.org/D39014 llvm-svn: 316037	2017-10-17 21:49:52 +00:00
Tim Northover	350a87eaf1	AArch64: account for possible frame index operand in compares. If the address of a local is used in a comparison, AArch64 can fold the address-calculation into the comparison via "adds". Unfortunately, a couple of places (both hit in this one test) are not ready to deal with that yet and just assume the first source operand is a register. llvm-svn: 316035	2017-10-17 21:43:52 +00:00
Simon Pilgrim	7cd4e2c96f	[X86][SSE] Tests packuswb/truncation codegen from PR34773 llvm-svn: 316033	2017-10-17 21:14:53 +00:00
Konstantin Zhuravlyov	7dabe9ced7	AMDGPU: Start generating metadata for MaxFlatWorkGroupSize Differential Revision: https://reviews.llvm.org/D38958 llvm-svn: 316024	2017-10-17 20:03:21 +00:00
Sanjay Patel	94c0eb031c	[ARM, AArch64] adjust tests trying to maintain their objective; NFC A smarter compiler will see that these might be better without a jump table if we're just using the constant values of the switch. llvm-svn: 316012	2017-10-17 16:54:56 +00:00
Sanjay Patel	6d172f2d72	[SimplifyCFG] add test for part of PR34471 (switch squashing); NFC llvm-svn: 316008	2017-10-17 15:56:42 +00:00
Sanjay Patel	6ed5c91422	[SimplifyCFG] update test to use auto-generated FileCheck asserts; NFC llvm-svn: 316006	2017-10-17 15:50:47 +00:00
Gadi Haber	85d99b4310	[X86][Broadwell] Added the broadwell cpu to the scheduling regression tests.<NFC> NFC. Added the Broadwell cpu and the BROADWELL prefix to all the scheduling regression tests, as part of preparation for a larger commit of adding all Broadwell scheduling. Reviewers: RKSimon, zvi, aaboud Differential Revision: https://reviews.llvm.org/D38994 Change-Id: I54bc9065168844c107b1729fcdc1d311ce3ea0a9 llvm-svn: 315998	2017-10-17 13:45:39 +00:00
Nikolai Bozhenov	346f4329c4	Improve clamp recognition in ValueTracking. Summary: ValueTracking was recognizing not all variations of clamp. Swapping of true value and false value of select was added to fix this problem. This change breaks the canonical form of cmp inside the matchMinMax function, that is why additional checks for compare predicates is needed. Added corresponding test cases. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38531 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 315992	2017-10-17 11:50:48 +00:00
Yichao Yu	a18b0b1817	Fix implicit null check with negative offset Summary: It seems that negative offset was accidentally allowed in D17967. AFAICT small negative offset should be valid (always raise segfault) on all archs that I'm aware of (especially x86, which is the only one with this optimization enabled) and such case can be useful when loading hidden metadata from an object. However, like the positive side, it should only be done within a certain limit. For now, use the same limit on the positive side for the negative side. A separate option can be added if needs appear. Reviewers: mcrosier, skatkov Reviewed By: skatkov Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38925 llvm-svn: 315991	2017-10-17 11:47:36 +00:00
Gadi Haber	3020490aac	[X86][Skylake] fixed/updated regression test mmx-schedule.ll which failed after r315978. Change-Id: I60cd7e03ea6c3d9a3dc661a882458e83feca66e3 llvm-svn: 315985	2017-10-17 10:00:08 +00:00
Andrew V. Tischenko	5bfbfb48b5	More tests with x86 prefixes which work after rL315899 commit llvm-svn: 315983	2017-10-17 08:49:47 +00:00
Gadi Haber	1e0f1f476a	[X86][SKL] Updated scheduling information for the SkylakeClient target Updated the scheduling information for the SkylakeClient target with the following changes: 1. regrouped the instructions after adding load and store latencies. 2. regrouped the instructions after adding identified missing ports in several groups. The changes were made after revisiting the latencies impact of all the load and store uOps. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D38727 Change-Id: I778a308cc11e490e8fa5e27e2047412a1dca029f llvm-svn: 315978	2017-10-17 06:47:04 +00:00
Max Kazantsev	4faa509bb1	Remove a test after revert of rL315440 llvm-svn: 315977	2017-10-17 06:43:31 +00:00
Max Kazantsev	20fc63351d	[NFC] Add test from bug 34937 llvm-svn: 315976	2017-10-17 06:37:58 +00:00
Philip Reames	6a7bbfb2e2	Revert 315440 on behalf of mkazantsev This patch reverts rL315440 because of the bug described at https://bugs.llvm.org/show_bug.cgi?id=34937 The fix for the bug is on review as D38944, but not yet ready. Given this is a regression reverting until a fix is ready is called for. Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason. I did the build and test to ensure the revert worked as expected on his behalf. llvm-svn: 315974	2017-10-17 06:21:07 +00:00
Daniel Sanders	3229217620	[globalisel][tablegen] Add a GIM_CheckIsSameOperand test where OtherInsnID and OtherOpIdx differ llvm-svn: 315972	2017-10-17 05:24:44 +00:00
Craig Topper	341f2ab444	[X86] Add masked palignr tests to vector-shuffle-masked.ll llvm-svn: 315971	2017-10-17 04:17:56 +00:00
Craig Topper	19f2f49ef1	[X86] Add AVX512BW to the vector-shuffle-masked test to prepare for an upcoming commit. llvm-svn: 315970	2017-10-17 04:17:55 +00:00
Shoaib Meenai	a42f60e7f7	[ExecutionEngine] Correct the size of a write in a COFF i386 relocation We want to be writing a 32bit value, so we should be writing 4 bytes instead of 2. Patch by Alex Langford <apl@fb.com>. Differential Revision: https://reviews.llvm.org/D38872 llvm-svn: 315964	2017-10-17 01:41:14 +00:00
Vedant Kumar	4d1969f22b	[llvm-cov] Add one correction to r315960 (PR34962) In r315960, I accidentally assumed that the first line segment is guaranteed to be the non-gap region entry segment (given that one is present). It can actually be any segment on the line, and the test I checked in demonstrates that. llvm-svn: 315963	2017-10-17 01:34:41 +00:00
Reid Kleckner	57b7d4fad7	Try to make crlf portable to other printf implementations llvm-svn: 315961	2017-10-17 00:27:31 +00:00
Vedant Kumar	58548c30da	[llvm-cov] Remove workaround in line execution count calculation (PR34962) Gap areas make it possible to correctly determine when to use counts from deferred regions. Before gap areas were introduced, llvm-cov needed to use a heuristic to do this: it ignored counts from segments that start, but do not end, on a line. This heuristic breaks down on a simple example (see PR34962). This patch removes the heuristic and picks counts from any region entry segment which isn't a gap area. llvm-svn: 315960	2017-10-16 23:47:10 +00:00
Mark Searles	4e3d6160db	Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats). Differential Revision: https://reviews.llvm.org/D38466 llvm-svn: 315957	2017-10-16 23:38:53 +00:00
Simon Pilgrim	a590c74549	[X86][AVX] Add v4x64 vector shuffle test for <0,2,1,3> mask llvm-svn: 315955	2017-10-16 23:20:16 +00:00
Quentin Colombet	0bd2825517	Re-apply [AArch64][RegisterBankInfo] Use the statically computed mappings for COPY This reverts commit r315823, thus re-applying r315781. Also make sure we don't use G_BITCAST mapping for non-generic registers. Non-generic registers don't have a type but do have a reg bank. Something the COPY mapping knows how to deal with but the G_BITCAST mapping doesn't. -- Original Commit Message -- We used to resort to the generic implementation to get the mappings for COPYs. The generic implementation resorts to table lookup and dynamically allocated objects to get the valid mappings. Given we already know how to map G_BITCAST and have the static mappings for them, use that code path for COPY as well. This is much more efficient. Improve the compile time of RegBankSelect by up to 20%. Note: When we eventually generate all the mappings via TableGen, we wouldn't have to do that dance to shave compile time. The intent of this change was to make sure that moving to static structure really pays off. NFC. llvm-svn: 315947	2017-10-16 22:28:40 +00:00
Quentin Colombet	9f20af6135	[AArch64][RegisterBankInfo] Add mapping support for G_BITCAST of s128 Anything bigger than 64-bit just map to FPR. llvm-svn: 315946	2017-10-16 22:28:38 +00:00
Quentin Colombet	7c114d3d70	[AArch64][LegalizerInfo] Mark s128 G_BITCAST legal We used to mark all G_BITCAST of 128-bit legal but only for vector types. Scalars of this size are just fine as well. llvm-svn: 315945	2017-10-16 22:28:27 +00:00
Matthew Simpson	36bbc8ce98	Add !callees metadata This patch adds a new kind of metadata that indicates the possible callees of indirect calls. Differential Revision: https://reviews.llvm.org/D37354 llvm-svn: 315944	2017-10-16 22:22:11 +00:00
Reid Kleckner	b0c9e0d647	[MC] Lex CRLF as one token This will prevent doubling of line endings when parsing assembly and emitting assembly. Otherwise we'd parse the directive, consume the end of statement, hit the next end of statement, and emit a fresh newline. llvm-svn: 315943	2017-10-16 22:20:03 +00:00
Simon Pilgrim	03c89a840a	[X86][3DNow] Add scheduling latency/throughput tests for 3DNow! instructions llvm-svn: 315942	2017-10-16 21:55:09 +00:00
Simon Pilgrim	608e1b57cf	[X86][MMX] Add scheduling latency/throughput tests for MMX instructions llvm-svn: 315939	2017-10-16 21:29:29 +00:00
Tony Tye	d288430c3e	Add base relative relocation record that can be used for the following case (OpenCL example): static __global int Var = 0; __global int* Ptr[] = {&Var}; ... In this case Var is a non-preemptable symbol and so its address can be used as the value of Ptr, with a base relative relocation that will add the delta between the ELF address and the actual load address. Such relocations do not require a symbol. Differential Revision: https://reviews.llvm.org/D38909 llvm-svn: 315935	2017-10-16 20:44:29 +00:00
Alexander Timofeev	9dff31c769	[AMDGPU] : revert r315908 llvm-svn: 315916	2017-10-16 16:57:37 +00:00
Akira Hatanaka	e8c1a54c07	[ObjCARC] Do not move a release that has the clang.imprecise_release tag above PHI instructions. ARC optimizer has an optimization that moves a call to an ObjC runtime function above a phi instruction when the phi has a null operand and is an argument passed to the function call. This optimization should not kick in when the runtime function is an objc_release that releases an object with precise lifetime semantics. rdar://problem/34959669 llvm-svn: 315914	2017-10-16 16:46:59 +00:00
Sanjay Patel	a4b89ed0b7	[x86] add minmax tests with more predicate coverage; NFC llvm-svn: 315913	2017-10-16 15:20:00 +00:00
Alexander Timofeev	3828242c7e	[AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one Differential revision: https://reviews.llvm.org/D38754 llvm-svn: 315908	2017-10-16 14:35:29 +00:00
Simon Pilgrim	259b190f0d	Fix test name typo. llvm-svn: 315907	2017-10-16 14:33:51 +00:00
Simon Pilgrim	664f2f697a	[X86][SSE] Added additional PACKUS shuffle tests Mainly inspired by PR34773 llvm-svn: 315906	2017-10-16 14:32:41 +00:00
Simon Dardis	0d378a9eed	[mips][micromips] Fix (dis)assembly of bc1(t\|f) Previously these instructions were marked codegen only and had an under-specified instruction description that did not record the fcc register. Reviewers: atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D38847 llvm-svn: 315905	2017-10-16 14:20:22 +00:00
Stefan Maksimovic	ee6b5a79dc	[mips] Provide alternate predicates for constant synthesis Ordering of patterns should not be of importance anymore since the predicates used are mutually exclusive now. llvm-svn: 315901	2017-10-16 13:18:21 +00:00
Andrew V. Tischenko	bfc9061593	This patch is a result of D37262: The issues with X86 prefixes. It closes PR7709, PR17697, PR19251, PR32809 and PR21640. There could be other bugs closed by this patch. llvm-svn: 315899	2017-10-16 11:14:29 +00:00
George Rimar	68b285f69e	[llvm-dwarfdump] - Teach tool to parse DW_CFA_GNU_args_size. Currently llvm-dwarfdump runs into llvm_unreachable when it encounters DW_CFA_GNU_args_size. This patch implements support for it. Differential revision: https://reviews.llvm.org/D38879 llvm-svn: 315897	2017-10-16 10:26:17 +00:00
NAKAMURA Takumi	414151a47e	Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)" llvm-svn: 315896	2017-10-16 09:50:01 +00:00
Nikolai Bozhenov	0e7ebbccc7	Move folding of icmp with zero after checking for min/max idioms. Summary: The following transformation for cmp instruction: icmp smin(x, PositiveValue), 0 -> icmp x, 0 should only be done after checking for min/max to prevent infinite looping caused by a reverse canonicalization. That is why this transformation was moved to place after the mentioned check. Reviewers: spatel, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38934 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 315895	2017-10-16 09:19:21 +00:00
NAKAMURA Takumi	4543affa98	SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586) llvm-svn: 315894	2017-10-16 09:15:23 +00:00
Yonghong Song	6621cf67cf	bpf: fix bug on silently truncating 64-bit immediate We came across an llvm bug when compiling some testcases that 64-bit immediates are silently truncated into 32-bit and then packed into BPF_JMP \| BPF_K encoding. This caused comparison with wrong value. This bug looks to be introduced by r308080. The Select_Ri pattern is supposed to be lowered into J_Ri while the latter only support 32-bit immediate encoding, therefore Select_Ri should have similar immediate predicate check as what J_Ri are doing. Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 315889	2017-10-16 04:14:53 +00:00
Hiroshi Inoue	e3a3e3c9e9	[PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass. If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated. One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated. void int_func(int); void ii_test(int a) { if (a & 1) return int_func(a); } Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG. Differential Revision: https://reviews.llvm.org/D31319 llvm-svn: 315888	2017-10-16 04:12:57 +00:00
Daniel Sanders	ea8711b88e	Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed* Summary: iPTR is a pointer of subtarget-specific size to any address space. Therefore type checks on this size derive the SizeInBits from a subtarget hook. At this point, we can import the simplest G_LOAD rules and select load instructions using them. Further patches will add support for the predicates to enable additional loads as well as the stores. The previous commit failed on MSVC due to a failure to convert an initializer_list to a std::vector. Hopefully, MSVC will accept this version. Depends on D37457 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37458 llvm-svn: 315887	2017-10-16 03:36:29 +00:00
Daniel Sanders	ce72d611af	Revert r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed* MSVC doesn't like one of the constructors. llvm-svn: 315886	2017-10-16 02:15:39 +00:00
Daniel Sanders	6735ea86cd	[globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed* Summary: iPTR is a pointer of subtarget-specific size to any address space. Therefore type checks on this size derive the SizeInBits from a subtarget hook. At this point, we can import the simplest G_LOAD rules and select load instructions using them. Further patches will add support for the predicates to enable additional loads as well as the stores. Depends on D37457 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37458 llvm-svn: 315885	2017-10-16 01:16:35 +00:00
Daniel Sanders	a71f454765	[globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks Summary: This includes some context-sensitivity in the MVT to LLT conversion so that pointer types are tested correctly. FIXME: I'm not happy with the way this is done since everything is a special-case. I've yet to find a reasonable way to implement it. select-load.mir fails because <1 x s64> loads in tablegen get priority over s64 loads. This is fixed in the next patch and as such they should be committed together, I've posted them separately to help with the review. Depends on D37456 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37457 llvm-svn: 315884	2017-10-16 00:56:30 +00:00
Daniel Sanders	df39cbae2f	Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator Summary: It's possible for a ComplexPattern to be used as an operator in a match pattern. This is used by the load/store patterns in AArch64 to name the suboperands returned by ComplexPattern predicate so that they can be broken apart and referenced independently in the result pattern. This patch adds support for this in order to enable the import of load/store patterns. Depends on D37445 Hopefully fixed the ambiguous constructor that a large number of bots reported. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37456 llvm-svn: 315869	2017-10-15 18:22:54 +00:00
Daniel Sanders	bb082a36d3	Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator A large number of bots are failing on an ambiguous constructor call. llvm-svn: 315866	2017-10-15 17:51:07 +00:00
Daniel Sanders	b95b867dd8	[globalisel][tablegen] Import ComplexPattern when used as an operator Summary: It's possible for a ComplexPattern to be used as an operator in a match pattern. This is used by the load/store patterns in AArch64 to name the suboperands returned by ComplexPattern predicate so that they can be broken apart and referenced independently in the result pattern. This patch adds support for this in order to enable the import of load/store patterns. Depends on D37445 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37456 llvm-svn: 315863	2017-10-15 17:03:36 +00:00
Craig Topper	a5af4a64d0	[AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering. Summary: This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes. There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8. Reviewers: RKSimon, zvi, delena Reviewed By: delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38714 llvm-svn: 315860	2017-10-15 16:41:17 +00:00
Sanjay Patel	934738a3da	revert r314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) Recommitting r314698. The bug exposed by this change should be fixed with: https://reviews.llvm.org/rL315579 llvm-svn: 315857	2017-10-15 15:39:15 +00:00
whitequark	ae12efab20	[MergeFunctions] Merge small functions if possible without a thunk. This can result in significant code size savings in some cases, e.g. an interrupt table all filled with the same assembly stub in a certain Cortex-M BSP results in code blowup by a factor of 2.5. Differential Revision: https://reviews.llvm.org/D34806 llvm-svn: 315853	2017-10-15 12:29:09 +00:00
whitequark	b2ce9ffede	[MergeFunctions] Replace all uses of unnamed_addr functions. This reduces code size for constructs like vtables or interrupt tables that refer to functions in global initializers. Differential Revision: https://reviews.llvm.org/D34805 llvm-svn: 315852	2017-10-15 12:29:01 +00:00
Amjad Aboud	c8d67979c0	[X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565 Differential Revision: https://reviews.llvm.org/D38359 llvm-svn: 315851	2017-10-15 11:00:56 +00:00
Craig Topper	a9cd59fb5d	[X86] Lower vselect with constant condition to vector_shuffle even with AVX512 instructions. Summary: It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register. It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day. Reviewers: RKSimon, delena, zvi Reviewed By: delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38932 llvm-svn: 315849	2017-10-15 06:39:07 +00:00
Craig Topper	f02e97859b	[X86] Don't use constant condition for select instruction when testing masking ops. We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects. llvm-svn: 315848	2017-10-15 06:05:50 +00:00
Konstantin Zhuravlyov	263f7f6676	AMDGPU: Temporarily disable pal metadata check line in llvm-readobj test It fails on mips llvm-svn: 315837	2017-10-14 23:42:11 +00:00
Craig Topper	dfb443e88c	[X86] Remove a bunch of dead FileCheck lines with the wrong prefix. llvm-svn: 315828	2017-10-14 21:46:55 +00:00
Simon Pilgrim	36fe00ee17	[X86][SSE] Don't attempt to reduce the imul vector width of odd sized vectors (PR34947) llvm-svn: 315825	2017-10-14 19:57:19 +00:00
Simon Pilgrim	3f49b988e0	[X86][SSE] Test vector imul reduction on 32 and 64-bit targets llvm-svn: 315824	2017-10-14 19:46:08 +00:00
Konstantin Zhuravlyov	a01d8b0b63	AMDGPU: Bring HSA metadata on par with the specification Differential Revision: https://reviews.llvm.org/D38753 llvm-svn: 315821	2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov	b3c605d680	llvm-readobj: Print AMDGPU note contents Differential Revision: https://reviews.llvm.org/D38752 llvm-svn: 315819	2017-10-14 18:21:42 +00:00
Simon Pilgrim	5bd4431aec	Cleanup update_llc_test_checks.py notes. llvm-svn: 315817	2017-10-14 17:37:03 +00:00
Konstantin Zhuravlyov	7b4be1ed89	AMDGPU: Cleanup elf-notes.ll test llvm-svn: 315816	2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov	716af741e9	llvm-readobj: Print AMDGPU note type names Differential Revision: https://reviews.llvm.org/D38751 llvm-svn: 315813	2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov	219066bab8	AMDGPU: Improve note directive verification in assembler - Do not allow amd_amdgpu_isa directives on non-amdgcn architectures - Do not allow amd_amdgpu_hsa_metadata on non-amdhsa OSes - Do not allow amd_amdgpu_pal_metadata on non-amdpal OSes Differential Revision: https://reviews.llvm.org/D38750 llvm-svn: 315812	2017-10-14 16:15:28 +00:00
Konstantin Zhuravlyov	eda425edd4	AMDGPU: Do not emit deprecated notes for code object v3 Differential Revision: https://reviews.llvm.org/D38749 llvm-svn: 315810	2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov	9c05b2bc3b	AMDGPU: Add support for isa version note - Emit NT_AMD_AMDGPU_ISA - Add assembler parsing for isa version directive - If isa version directive does not match command line arguments, then return error Differential Revision: https://reviews.llvm.org/D38748 llvm-svn: 315808	2017-10-14 15:40:33 +00:00
Simon Pilgrim	f367c27d2d	[X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X)) If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle. Fixes the last issue with PR22415 llvm-svn: 315807	2017-10-14 15:01:36 +00:00
Craig Topper	f7e777763d	[X86] Add patterns for vzmovl+cvtpd2dq/cvttpd2dq with a load. llvm-svn: 315802	2017-10-14 07:04:48 +00:00
Craig Topper	61010a85b8	[X86] Add AVX512 versions of VCVTPD2PS to load folding tables. llvm-svn: 315801	2017-10-14 05:55:43 +00:00
Craig Topper	ee277e190c	[X86] Add patterns for vzmovl+cvtpd2ps with a load. llvm-svn: 315800	2017-10-14 05:55:42 +00:00
Craig Topper	134241e4af	[X86] Add AVX512 flavors of VCVTDQ2PD plus VCVTUDQ2PD to the load folding tables. llvm-svn: 315796	2017-10-14 04:18:08 +00:00
Yaxun Liu	adde4e4c01	Fix assembler for alloca of multiple elements in non-zero addr space Currently llvm assembler emits parsing error for valid IR assembly alloca i32, i32 9, addrspace(5) when alloca addr space is 5. This patch fixes that. Differential Revision: https://reviews.llvm.org/D38713 llvm-svn: 315791	2017-10-14 03:23:18 +00:00
Daniel Sanders	bfa9e2cae7	[globalisel][tablegen] Simplify named operand/operator lookups and fix a wrong-code bug this revealed. Summary: Operand variable lookups are now performed by the RuleMatcher rather than searching the whole matcher hierarchy for a match. This revealed a wrong-code bug that currently affects ARM and X86 where patterns that use a variable more than once in the match pattern will be imported but won't check that the operands are identical. This can cause the tablegen-erated matcher to accept matches that should be rejected. Depends on D36569 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: aemerson, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36618 llvm-svn: 315780	2017-10-14 00:31:58 +00:00
Craig Topper	f6c69564e7	[X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations. For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP. We may be able to use this for AVX1 as well which would allow us to remove more isel patterns. I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP. Differential Revision: https://reviews.llvm.org/D38836 llvm-svn: 315768	2017-10-13 21:56:48 +00:00
Craig Topper	526b70a089	[X86] Use fsub in the movddup scheduling tests to prevent a future patch from folding movddup as a broadcast load. llvm-svn: 315767	2017-10-13 21:56:45 +00:00
Daniel Sanders	11300cead8	[globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf. Summary: There's only a tablegen testcase for IntImmLeaf and not a CodeGen one because the relevant rules are rejected for other reasons at the moment. On AArch64, it's because there's an SDNodeXForm attached to the operand. On X86, it's because the rule either emits multiple instructions or has another predicate using PatFrag which cannot easily be supported at the same time. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36569 llvm-svn: 315761	2017-10-13 21:28:03 +00:00
Matt Arsenault	e11d8aca77	AMDGPU: Implement hasBitPreservingFPLogic llvm-svn: 315754	2017-10-13 21:10:22 +00:00
Peter Collingbourne	868783e855	LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias. It is possible for both a base and a derived class to be satisfied with a unique vtable. If a program contains casts of the same pointer to both of those types, the CFI checks will be lowered to this (with ThinLTO): if (p != &__typeid_base_global_addr) trap(); if (p != &__typeid_derived_global_addr) trap(); The optimizer may then use the first condition combined with the assumption that __typeid_base_global_addr and __typeid_derived_global_addr may not alias to optimize away the second comparison, resulting in an unconditional trap. This patch fixes the bug by giving imported globals the type [0 x i8]*, which prevents the optimizer from assuming that they do not alias. Differential Revision: https://reviews.llvm.org/D38873 llvm-svn: 315753	2017-10-13 21:02:16 +00:00
Sanjay Patel	505e071dc7	[Reassociate] auto-generate better checks; NFC These would fail if the created variable names changed. llvm-svn: 315752	2017-10-13 20:56:35 +00:00
Matt Arsenault	550c66d10f	AMDGPU: Look for src mods before fp_extend When selecting modifiers for mad_mix instructions, look at fneg/fabs that occur before the conversion. llvm-svn: 315748	2017-10-13 20:45:49 +00:00
Sanjay Patel	f0242de143	[InstCombine] move code to remove repeated constant check; NFCI Also, consolidate tests for this fold in one place. llvm-svn: 315745	2017-10-13 20:29:11 +00:00
Matt Arsenault	4d70754e3c	AMDGPU: Implement isFPExtFoldable This helps match v_mad_mix* in some cases. llvm-svn: 315744	2017-10-13 20:18:59 +00:00
Krzysztof Parzyszek	7c9c05888c	[Hexagon] Minimize number of repeated constant extenders Each constant extender requires an extra instruction, which adds to the code size and also reduces the number of available slots in an instruction packet. In most cases, the value of a repeated constant extender could be loaded into a register, and the instructions using the extender could be replaced with their counterparts that use that register instead. This patch adds a pass that tries to reduce the number of constant extenders, including extenders which differ only in an immediate offset known at compile time, e.g. @global and @global+12. llvm-svn: 315735	2017-10-13 19:02:59 +00:00
Craig Topper	5d692917f4	[X86] Add initial skeleton support for knm cpu This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now it's an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it. Differential Revision: https://reviews.llvm.org/D38811 llvm-svn: 315722	2017-10-13 18:10:17 +00:00
Sanjay Patel	c419c9f640	[InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions llvm-svn: 315718	2017-10-13 17:47:25 +00:00
Sanjay Patel	399fcbea37	[InstCombine] add tests for add (zext (add nuw X, C2)), C --> zext (add nuw X, C2 + C); NFC llvm-svn: 315717	2017-10-13 17:42:12 +00:00
Max Moroz	43df793f5c	[llvm-cov] Reland sources-specified.test with addition of "-path-equivalence". Summary: This version of tests should be working properly. Reviewers: vsk Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D38889 llvm-svn: 315714	2017-10-13 17:27:39 +00:00
Simon Pilgrim	c4977fa9a1	[X86] Test scalar integer absolutes on 32-bit targets with/without CMOV llvm-svn: 315711	2017-10-13 17:09:20 +00:00
Reid Kleckner	be3724b5e1	Not all buildbots seem to dump the nuw flag in SDAG llvm-svn: 315710	2017-10-13 17:00:49 +00:00
Simon Pilgrim	df9611e178	[X86] Updated scalar integer absolute tests to cover i8/i16/i32/i64 llvm-svn: 315706	2017-10-13 16:53:07 +00:00
Sanjay Patel	2150651ac3	[InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types The backend should be prepared for this transform after: https://reviews.llvm.org/rL311731 llvm-svn: 315701	2017-10-13 16:29:38 +00:00
Reid Kleckner	c687a34870	Update test to expect nuw flag in SDAG dump, fixes test after r315690 llvm-svn: 315698	2017-10-13 16:13:23 +00:00
Daniel Neilson	fa14ebd138	[RS4GC] Look through vector bitcasts when looking for base pointer Summary: In RS4GC it is possible that a base pointer is contained in a vector that has undergone a bitcast from one element-pointertype to another. We teach RS4GC how to look through bitcasts of vector types when looking for a base pointer. Reviewers: anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38849 llvm-svn: 315694	2017-10-13 15:59:13 +00:00
Max Moroz	8ff311b54a	[llvm-cov] Temporarily delete sources-specified.test, it is failing on some bots. Summary: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/5950/steps/test-stage1-compiler/logs/stdio Reviewers: vsk, Dor1s Reviewed By: Dor1s Subscribers: mehdi_amini Differential Revision: https://reviews.llvm.org/D38888 llvm-svn: 315693	2017-10-13 15:58:58 +00:00
Krzysztof Parzyszek	a0f2f7c413	[Hexagon] Add patterns for cmpb/cmph with immediate arguments Patch by Sumanth Gundapaneni. llvm-svn: 315692	2017-10-13 15:43:12 +00:00
Max Moroz	8bc53fd031	[llvm-cov] Fix sources-specified.test so it ignores the order of files printed. Summary: https://reviews.llvm.org/D38884#896964 Reviewers: vsk, Dor1s Reviewed By: Dor1s Differential Revision: https://reviews.llvm.org/D38887 llvm-svn: 315691	2017-10-13 15:41:51 +00:00
Max Moroz	c5834e5e88	[llvm-cov] An attempt to fix sources_specified.test failing on some buildbots. Summary: https://reviews.llvm.org/rL315685#115380 Reviewers: vsk, Dor1s Reviewed By: Dor1s Differential Revision: https://reviews.llvm.org/D38884 llvm-svn: 315687	2017-10-13 15:30:24 +00:00
Max Moroz	4a4bfa4e27	[llvm-cov] Generate "report" for given source paths if sources are specified. Summary: Documentation says that user can specify sources for both "show" and "report" commands. "Show" command respects specified sources, but "report" does not. It is useful to have both "show" and "report" generated for specified sources. Also added tests for both commands with sources specified. Reviewers: vsk, kcc Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D38860 llvm-svn: 315685	2017-10-13 14:44:51 +00:00
Jonas Devlieghere	614fab4bd8	Re-land "[dsymutil] Timestamp verification for __swift_ast" This patch adds timestamp verification for swiftmodule files. A new flag is provided that allows us to disable this check in order to allow testing of this feature. Differential revision: https://reviews.llvm.org/D38686 llvm-svn: 315684	2017-10-13 14:41:23 +00:00
Anna Thomas	a2ca902033	[SCEV] Teach SCEV to find maxBECount when loop endbound is variant Summary: This patch teaches SCEV to calculate the maxBECount when the end bound of the loop can vary. Note that we cannot calculate the exactBECount. This will only be done when both conditions are satisfied: 1. the loop termination condition is strictly LT. 2. the IV is proven to not overflow. This provides more information to users of SCEV and can be used to improve identification of finite loops. Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38825 llvm-svn: 315683	2017-10-13 14:30:43 +00:00
Sanjay Patel	45d5568010	[InstCombine] add tests for boolean extend + add; NFC llvm-svn: 315681	2017-10-13 14:09:45 +00:00
Daniel Jasper	3344a21236	Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode" Significantly reduces performance (~30%) of gipfeli (https://github.com/google/gipfeli) I have not yet managed to reproduce this regression with the open-source version of the benchmark on github, but will work with others to get a reproducer to you later today. llvm-svn: 315680	2017-10-13 14:04:21 +00:00
Craig Topper	bf0de9d3b6	[X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. Prefer vbroadcastsd/vpbroadcastq instead. There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table. llvm-svn: 315674	2017-10-13 06:07:10 +00:00
Craig Topper	11655b22dc	[X86] Add the test case for r315613 that I forgot to 'git add'. llvm-svn: 315649	2017-10-13 00:20:47 +00:00
Matt Morehouse	8bc23ab658	[llvm-isel-fuzzer] Use "--" as separator rather than '='. Summary: OSS-Fuzz doesn't support '=' in filenames. Reviewers: bogner, kcc Reviewed By: kcc Subscribers: javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D38866 llvm-svn: 315647	2017-10-13 00:18:32 +00:00
Justin Bogner	9c03fd5f64	llvm-isel-fuzzer: Use the right REQUIRES line for r315599 I'd mixed up ENABLE_SHARED and BUILD_SHARED_LIBS before, so these tests were being disabled in too many places. llvm-svn: 315646	2017-10-13 00:17:54 +00:00
Anna Thomas	61aec18d46	[CVP] Process binary operations even when def is local Summary: This patch adds processing of binary operations when the def of operands are in the same block (i.e. local processing). Earlier we bailed out in such cases (the bail out was introduced in rL252032) because LVI at that time was more precise about context at the end of basic blocks, which implied local def and use analysis didn't benefit CVP. Since then we've added support for LVI in presence of assumes and guards. The test cases added show how local def processing in CVP helps adding more information to the ashr, sdiv, srem and add operators. Note: processCmp which suffers from the same problem will be handled in a later patch. Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38766 llvm-svn: 315634	2017-10-12 22:39:52 +00:00
Artur Pilipenko	ead69ee4bd	[LoopPredication] Check whether the loop is already guarded by the first iteration check condition llvm-svn: 315623	2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes	993d2e67d8	Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."" This reverts commit r315593: still affect two bots: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308 http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/ llvm-svn: 315618	2017-10-12 20:52:34 +00:00
Artur Pilipenko	b4527e1ce2	[LoopPredication] Support ule, sle latch predicates This is a follow up for the loop predication change 313981 to support ule, sle latch predicates. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D38177 llvm-svn: 315616	2017-10-12 20:40:27 +00:00
Wei Ding	5676acad9e	Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ. Differential Revision: http://reviews.llvm.org/D37348 llvm-svn: 315610	2017-10-12 19:37:14 +00:00
Craig Topper	3dc37cc592	[X86] Add a bunch of -mcpu strings to the cpus.ll test. We were missing most of the "core" aliases as well as skylake, cannonlake, and knights landing. llvm-svn: 315606	2017-10-12 18:55:57 +00:00
Artem Belevich	3bafc2f0d9	[NVPTX] Implemented wmma intrinsics and instructions. WMMA = "Warp Level Matrix Multiply-Accumulate". These are the new instructions introduced in PTX6.0 and available on sm_70 GPUs. Differential Revision: https://reviews.llvm.org/D38645 llvm-svn: 315601	2017-10-12 18:27:55 +00:00
Reid Kleckner	1a7e387849	[codeview] Don't emit FPO data in funclet prologues Attempt 3 to work around bugs in FPO data with funclets. llvm-svn: 315600	2017-10-12 18:20:35 +00:00
Justin Bogner	754a1a8a6f	llvm-isel-fuzzer: Work around BUILD_SHARED_LIBS testing issues Building with BUILD_SHARED_LIBS makes it tricky to copy around executables at will, since they won't be able to find the LLVM libraries any more. This makes testing a feature that's based on the executable name problematic, so we'll just disable these two tests in that configuration. We could potentially fix this by symlinking the lib directory into the test directory, but that wouldn't work on windows, and losing testing on windows would be far worse than losing testing on a configuration that's barely even supported. llvm-svn: 315599	2017-10-12 18:10:22 +00:00
Artem Belevich	786ca6a166	[TableGen] Allow intrinsics to have up to 8 return values. Differential Revision: https://reviews.llvm.org/D38633 llvm-svn: 315598	2017-10-12 17:40:00 +00:00
Sanjay Patel	e272be7c9a	[ValueTracking] return zero when there's conflict in known bits of a shift (PR34838) Poison allows us to return a better result than undef. llvm-svn: 315595	2017-10-12 17:31:46 +00:00
Bruno Cardoso Lopes	326fdcbff8	Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593	2017-10-12 16:54:11 +00:00
Lei Huang	0724fea2da	[PowerPC] Add profitability check for conversion to mtctr loops Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592	2017-10-12 16:43:33 +00:00
Tim Renouf	c8ffffe462	[AMDGPU] For amdpal, widen interpolation mode workaround Summary: The interpolation mode workaround ensures that at least one interpolation mode is enabled in PSInputAddr. It does not also check PSInputEna on the basis that the user might enable bits in that depending on run-time state. However, for amdpal os type, the user does not enable some bits after compilation based on run-time states; the register values being generated here are the final ones set in the hardware. Therefore, apply the workaround to PSInputAddr and PSInputEnable together. (The case where a bit is set in PSInputAddr but not in PSInputEnable is where the frontend set up an input arg for a particular interpolation mode, but nothing uses that input arg. Really we should have an earlier pass that removes such an arg.) Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37758 llvm-svn: 315591	2017-10-12 16:16:41 +00:00
Mikael Holmen	a079ef68e3	[RegisterCoalescer] Don't set read-undef in pruneValues, only clear Summary: The comments in the code said // Remove <def,read-undef> flags. This def is now a partial redef. but the code didn't just remove read-undef, it could introduce new ones which could cause errors. E.g. if we have something like %vreg1<def> = IMPLICIT_DEF %vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4 %vreg2:subreg2<def> = op %vreg6, %vreg7 and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg def, which the old code did. Now we solve this by actually do what the code comment says. We remove read-undef flags rather than remove or introduce them. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38616 llvm-svn: 315564	2017-10-12 06:21:28 +00:00
Justin Bogner	9ea7fbd1e8	Re-commit "llvm-isel-fuzzer: Handle a subset of backend flags in the exec name" Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer=aarch64-gisel for a fuzzer that fuzzes AArch64 with GlobalISel enabled, or fuzzer=x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. This re-applies 315545 using "=" instead of ":" as a separator for arguments. llvm-svn: 315557	2017-10-12 04:35:32 +00:00
Hans Wennborg	022829d84c	Revert r315545 "llvm-isel-fuzzer: Handle a subset of backend flags in the executable name" It broke some tests on Windows: Failing Tests (4): LLVM :: tools/llvm-isel-fuzzer/execname-options.ll LLVM :: tools/llvm-isel-fuzzer/missing-triple.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty-bc.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty.ll > llvm-isel-fuzzer: Handle a subset of backend flags in the executable name > > Here we add a secondary option parser to llvm-isel-fuzzer (and provide > it for use with other fuzzers). With this, you can copy the fuzzer to > a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer > AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no > flags required. This should be useful for running these in OSS-Fuzz. > > Note that this handrolls a subset of cl::opts to recognize, rather > than embedding a complete command parser for argv[0]. If we find we > really need the flexibility of handling arbitrary options at some > point we can rethink this. llvm-svn: 315554	2017-10-12 03:32:09 +00:00
Hongbin Zheng	d36f2030e2	[SimplifyIndVar] Replace IVUsers with loop invariant whenever possible Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551	2017-10-12 02:54:11 +00:00
Justin Bogner	a5969ce15f	llvm-isel-fuzzer: Handle a subset of backend flags in the executable name Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzes AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. llvm-svn: 315545	2017-10-12 01:57:49 +00:00
Wei Mi	1736efd16a	Revert r307036 because of PR34919. llvm-svn: 315540	2017-10-12 00:24:52 +00:00
Konstantin Zhuravlyov	c3beb6a075	AMDGPU/NFC: Minor clean ups in PAL metadata - Move PAL metadata definitions to AMDGPUMetadata - Make naming consistent with HSA metadata Differential Revision: https://reviews.llvm.org/D38745 llvm-svn: 315523	2017-10-11 22:41:09 +00:00
Konstantin Zhuravlyov	a63b0f9d20	AMDGPU/NFC: Rename code object metadata as HSA metadata - Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change) - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer - Introduce HSAMD namespace - Other minor name changes in function and test names llvm-svn: 315522	2017-10-11 22:18:53 +00:00
Reid Kleckner	ddf413f3e1	Really fix llvm-rc include-paths.test llvm-svn: 315515	2017-10-11 21:27:54 +00:00
Reid Kleckner	ade90cbd79	Attempt to fix failing llvm-rc include-paths.test llvm-svn: 315514	2017-10-11 21:25:03 +00:00
Reid Kleckner	9cdd4df81a	[codeview] Implement FPO data assembler directives Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513	2017-10-11 21:24:33 +00:00
Krzysztof Parzyszek	c4a9a8d8e0	[Hexagon] Make sure that new-value jump is packetized with producer llvm-svn: 315510	2017-10-11 21:20:43 +00:00
Florian Hahn	e52abba277	[MachineCombiner] Fix initialisation of LastUpdate for incremental update. Summary: Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled. Patch by Paul Walker. Reviewers: fhahn, Gerolf, efriedma, MatzeB Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38734 llvm-svn: 315502	2017-10-11 20:25:58 +00:00
Lei Huang	263dc4ef3a	[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch addresses both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision: https://reviews.llvm.org/D38758 llvm-svn: 315500	2017-10-11 20:20:58 +00:00
Zachary Turner	fa0ca6cbd0	[llvm-rc] Use proper search algorithm for finding resources. Previously we would only look in the current directory for a resource, which might not be the same as the directory of the rc file. Furthermore, MSVC rc supports a /I option, and can also look in the system environment. This patch adds support for this search algorithm. Differential Revision: https://reviews.llvm.org/D38740 llvm-svn: 315499	2017-10-11 20:12:09 +00:00
Sanjay Patel	6c0aef77aa	[x86] avoid infinite loop from SoftenFloatOperand (PR34866) Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485	2017-10-11 18:24:21 +00:00
Jake Ehrlich	f03384dce7	Reland "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data" ubsan caught an issue I made where I was converting a null pointer to a reference. elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315484	2017-10-11 18:09:18 +00:00
Lei Huang	f9c7f7fed4	[NFC] update test case so checks are not order dependent when not needed llvm-svn: 315482	2017-10-11 18:04:41 +00:00
Rafael Espindola	1a0e5a1933	Convert an ErrorOr to Expected. getRelocationAddend should never be called on non SHT_RELA sections, but changing that requires changing RelocVisitor.h. llvm-svn: 315473	2017-10-11 16:56:33 +00:00
Krzysztof Parzyszek	8f174dde92	[Pipeliner] Improve serialization order for post-increments The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466	2017-10-11 15:51:44 +00:00
Sanjay Patel	8d565a233d	[InstCombine] add baseline tests for D38531; NFC llvm-svn: 315461	2017-10-11 14:29:17 +00:00
Sanjay Patel	34fd5eaaf0	[DAGCombiner] convert insertelement of bitcasted vector into shuffle Eg: insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7} This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. We may want to abandon that one if we can't find value in squashing the more specific pattern sooner. We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles. There may be room for improvement in the shuffle lowering here, but that would be follow-up work. Differential Revision: https://reviews.llvm.org/D38388 llvm-svn: 315460	2017-10-11 14:12:16 +00:00
Jonas Devlieghere	ec053332cf	Revert "[dsymutil] Timestamp verification for __swift_ast" This reverts commit r315456. llvm-svn: 315458	2017-10-11 13:51:30 +00:00
Jonas Devlieghere	8acb2e3ac4	[dsymutil] Timestamp verification for __swift_ast This patch adds timestamp verification for swiftmodule files. - A new flag is provided to allow us to continue testing of the code for embedding the __swift_ast. (git doesn't maintain timestamps) - Adds a new test for fat (arm) binaries. Differential revision: https://reviews.llvm.org/D38686 llvm-svn: 315456	2017-10-11 13:34:52 +00:00
Simon Dardis	442ee63468	[mips] Add missing tests from rL315451 llvm-svn: 315454	2017-10-11 11:45:06 +00:00
Uriel Korach	782f28bf2f	[X86] Added tests for TESTM and TESTNM (NFC) Adding these test files now so that a later commit, which will add a new pattern for the TESTM and TESTNM instructions, will show the improvements that have been made. Change-Id: If3908b7f91897d764053312365a2bc1de78b291d llvm-svn: 315443	2017-10-11 08:39:25 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded work creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds a test for this situation and also slightly modifies one of the existing tests where it used to happen. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Jake Ehrlich	d9a283463a	Revert "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data" This reverts commit rL315412 llvm-svn: 315417	2017-10-11 02:42:29 +00:00
Jake Ehrlich	b5152447ba	[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315412	2017-10-11 01:59:06 +00:00
Craig Topper	6ce20bd184	[X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding. llvm-svn: 315395	2017-10-11 00:11:53 +00:00
Jake Ehrlich	fcc05627d4	[llvm-objcopy] Add ability to remove multiple sections by name This change adds the ability to use the "-R"/"-remove-section" option multiple times. Differential Revision: https://reviews.llvm.org/D38332 llvm-svn: 315385	2017-10-10 23:02:43 +00:00
Craig Topper	bb0e316dc7	[X86] Add broadcast patterns that allow a scalar_to_vector between the broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382	2017-10-10 22:40:31 +00:00
Rafael Espindola	8f1f7b1442	Make the ELFFile constructor private. With this all clients have to use the new create method which returns an Expected. Fixes a crash on invalid input. llvm-svn: 315376	2017-10-10 22:17:49 +00:00
Rafael Espindola	ef421f9c18	Make the ELFObjectFile constructor private. This forces every user to use the new create method that returns an Expected. This in turn propagates better error messages. llvm-svn: 315371	2017-10-10 21:21:16 +00:00
Dehao Chen	3f56a05ae5	Use the first instruction's count to estimate the function's entry frequency. Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369	2017-10-10 21:13:50 +00:00
Sanjay Patel	b74063d21f	[x86] fix prefix typos for CHECK lines; NFC llvm-svn: 315368	2017-10-10 21:12:47 +00:00
Simon Dardis	b994128d14	[mips] Correct the instruction predicates for microMIPSr3 Rather than using the AdditionalPredicates mechanism to guard the microMIPS instructions, use the existing predicates to properly guard those instructions. This also resolves a case where an instruction pattern was incorrectly available for microMIPS32R6, which caused a register allocation failure as the registers specified in the pattern were not available. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38451 llvm-svn: 315362	2017-10-10 20:52:53 +00:00
Matt Arsenault	f42074b699	AMDGPU: Fix missing skipFunction calls llvm-svn: 315361	2017-10-10 20:48:36 +00:00
Matt Arsenault	d674e0ac0d	AMDGPU: Fix failure to select branch with optnone opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360	2017-10-10 20:34:49 +00:00
Rafael Espindola	12db383e20	Convert two uses of ErrorOr to Expected. llvm-svn: 315354	2017-10-10 20:00:07 +00:00
Yaxun Liu	de4b88d9a1	[AMDGPU] Lower enqueued blocks and generate runtime metadata This patch adds a post-linking pass which replaces the function pointer of enqueued block kernel with a global variable (runtime handle) and adds runtime-handle attribute to the enqueued block kernel. In LLVM CodeGen the runtime-handle metadata will be translated to RuntimeHandle metadata in code object. Runtime allocates a global buffer for each kernel with RuntimeHandle metadata and saves the kernel address required for the AQL packet into the buffer. __enqueue_kernel function in device library knows that the invoke function pointer in the block literal is actually runtime handle and loads the kernel address from it and puts it into AQL packet for dispatching. This cannot be done in FE since FE cannot create a unique global variable with external linkage across LLVM modules. The global variable with internal linkage does not work since optimization passes will try to replace loads of the global variable with its initialization value. Differential Revision: https://reviews.llvm.org/D38610 llvm-svn: 315352	2017-10-10 19:39:48 +00:00
Jake Ehrlich	36a2eb34ed	[llvm-objcopy] Add support for removing sections This change adds support for removing sections using the -R field (as GNU objcopy does as well). This change should let us add many helpful tests and is a proper stepping stone for adding more general kinds of stripping. Differential Revision: https://reviews.llvm.org/D38260 llvm-svn: 315346	2017-10-10 18:47:09 +00:00
Jake Ehrlich	c5ff72708d	Revert "temporary" I forgot to add a proper commit message. I'm reverting this to fix that. This reverts commit r315344. llvm-svn: 315345	2017-10-10 18:32:22 +00:00
Jake Ehrlich	77ec1ffe5c	temporary llvm-svn: 315344	2017-10-10 18:28:15 +00:00
Adrian Prantl	16b8b47152	Debug Info: Fix the SDLoc propagation for a DAGCombiner rule This patch ensures that the rule: fold (zext (load x)) -> (zext (truncate (zextload x))) propagates the SDLoc of the load to the zextload. <rdar://problem/33755881> llvm-svn: 315340	2017-10-10 18:08:32 +00:00
Francis Ricci	5776f26fa1	[llvm-objdump] Disable leak checking on an llvm-objdump test Summary: This leak doesn't reproduce locally on macOS 10.12, but is causing buildbot failures. Disable leak checking until it can be fixed. Reviewers: sqlbyme, qcolombet, enderby, bruno Reviewed By: bruno Subscribers: bruno, llvm-commits Differential Revision: https://reviews.llvm.org/D38699 llvm-svn: 315337	2017-10-10 17:50:57 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Jacob Gravelle	37af00e7d0	[WebAssembly] Narrow the scope of WebAssemblyFixFunctionBitcasts Summary: The pass to fix function bitcasts generates thunks for functions that are called directly with a mismatching signature. It was also generating thunks in cases where the function was address-taken, causing aliasing problems in otherwise valid cases. This patch tightens the restrictions for when the pass runs. Reviewers: sunfish, dschuff Subscribers: jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D38640 llvm-svn: 315326	2017-10-10 16:20:18 +00:00
Simon Pilgrim	053a299a9b	[X86][AVX512] Regenerate element insertion/extraction tests llvm-svn: 315322	2017-10-10 15:58:54 +00:00
Simon Dardis	96d35fe06a	[mips] Duplicate the reciprocal instruction definitions for FP32 Add instruction definitions for FP32 mode for recip.d and rsqrt.d. Previously these instructions were only defined when targeting the full 64-bit FPU model but were not guarded properly. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38400 llvm-svn: 315318	2017-10-10 14:41:11 +00:00
Jonas Devlieghere	aa6be823a4	Re-land "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Recommit after being reverted in r315299. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315316	2017-10-10 14:15:25 +00:00
Sanjay Patel	7d52c7ca74	[x86] add tests for insertelement; NFC llvm-svn: 315312	2017-10-10 13:45:25 +00:00
Simon Dardis	a17a7b619a	[mips] Partially fix PR34391 Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser which also rendered the operand to the instruction. In some cases the general parser could construct an MCExpr which was not a MCConstantExpr which MipsAsmParser was expecting. Address this by altering the special handling to cope with unexpected inputs and fine-tune the handling of cases where a register name that is not available in the current ABI is regarded as not a match for the custom parser but also not as an outright error. Also enforces the binutils restriction that only constants are accepted. This partially resolves PR34391. Thanks to Ed Maste for reporting the issue! Reviewers: nitesh.jain, arichardson Differential Revision: https://reviews.llvm.org/D37476 llvm-svn: 315310	2017-10-10 13:34:45 +00:00
David Stuttard	51c1b22806	[DAGCombine] Fix for shuffle to vector extend for non power 2 vectors Summary: See https://llvm.org/PR33743 for more details It seems that for non-power of 2 vector sizes, the algorithm can produce non-matching sizes for input and result causing an assert. This usually isn't a problem as the isAnyExtend check will weed these out, but in some cases (most often with lots of undefined values for the mask indices) it can pass this check for non power of 2 vectors. Adding in an extra check that ensures that bit size will match for the result and input (as required) Subscribers: nhaehnle Differential Revision: https://reviews.llvm.org/D35241 llvm-svn: 315307	2017-10-10 12:45:45 +00:00
Oliver Stannard	30b732c942	[ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs Previously, the code that implemented the GNU assembler aliases for the LDRD and STRD instructions (where the second register is omitted) assumed that the input was a valid instruction. This caused assertion failures for every example in ldrd-strd-gnu-bad-inst.s. This improves the code so that, if the instruction is not in the expected format, the check bails out and the asm parser is run on the unmodified instruction. It also relaxes the alias on thumb targets, so that unaligned pairs of registers can be used. The restriction that Rt must be even-numbered only applies to the ARM versions of these instructions. Differential revision: https://reviews.llvm.org/D36732 llvm-svn: 315305	2017-10-10 12:38:22 +00:00
Oliver Stannard	cd3306f62f	[ARM, Asm] Add diagnostics for floating-point register operands This adds diagnostic strings for the ARM floating-point register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, DPR, requires C++ code to select the correct error message, as that class contains different registers depending on the FPU. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36693 llvm-svn: 315304	2017-10-10 12:35:09 +00:00
Oliver Stannard	bbad419e94	[ARM, Asm] Add diagnostics for general-purpose register operands This adds diagnostic strings for the ARM general-purpose register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, rGPR, requires C++ code to select the correct error message, as that class contains different registers in pre-v8 and v8 targets. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36692 llvm-svn: 315303	2017-10-10 12:31:53 +00:00
Nicolai Haehnle	312b64f4d7	AMDGPU: Split MUBUF offset into aligned components Summary: Atomic buffer operations do not work (and trap on gfx9) when the components are unaligned, even if their sum is aligned. Previously, we generated an offset of 4156 without an SGPR by splitting it as 4095 + 61 (immediate + inline constant). The highest offset for which we can do this correctly is 4156 = 4092 + 64. Fixes dEQP-GLES31.functional.ssbo.atomic.* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37850 llvm-svn: 315302	2017-10-10 12:22:23 +00:00
Jonas Devlieghere	5b0f885691	Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This reverts commit r315297. llvm-svn: 315299	2017-10-10 11:49:56 +00:00
Jonas Devlieghere	2eb95c33f6	[llvm-dwarfdump] Print type names in DW_AT_type DIEs This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315297	2017-10-10 11:24:41 +00:00
Oliver Stannard	29ffd3f1d9	[AsmParser] Add DiagnosticString to register classes in tablegen This allows a DiagnosticType and/or DiagnosticString to be associated with a RegisterClass in tablegen, so that we can emit diagnostics in the assembler when a register operand is incorrect. DiagnosticType creates a predictable enum value, which gets returned as the error code when an operand does not match, and can be used by the assembly parser to map to a user-facing diagnostic. DiagnosticString creates an anonymous enum value (currently based on the tablegen class name), and a function to map from enum values to strings will be generated. Both of these work the same way as they do for AsmOperand. This isn't used by any targets yet, but has one (positive) side-effect. It improves the diagnostic codes returned by validateOperandClass - we always want to emit the diagnostic that relates to the expected operand class, but this wasn't always being done when the expected and actual classes were completely different (token/register/custom). This causes a few AArch64 diagnostics to be improved, as Match_InvalidOperand was being returned instead of a specific diagnostic type. Differential revision: https://reviews.llvm.org/D36691 llvm-svn: 315295	2017-10-10 11:00:40 +00:00
Gadi Haber	2b132eb4f8	[X86][SKYLAKE] Update regression test to differentiate between HASWELL and SKYLAKE scheduling.<NFC> NFC. Updated 6 regression tests to differentiate between HASWELL and SKYLAKE scheduling information. The fix is in preparation of a patch to update the information of the Skylake Client scheduling to include the appropriate load and store latencies. Reviewers: zvi, RKSimon Differential Revision: https://reviews.llvm.org/D38685 Change-Id: Ifc6b98d9eaf266913698f24c766fd994fc977555 llvm-svn: 315291	2017-10-10 09:53:18 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Nemanja Ivanovic	7bf866eb10	Fix for PR34888. The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285	2017-10-10 08:46:10 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Craig Topper	a88306e6fb	[AVX512] Add patterns to commute integer comparison instructions during isel. This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass. llvm-svn: 315274	2017-10-10 06:36:46 +00:00
Xinliang David Li	4cdc9dab0a	Re-enable r314928 Eliminate inttype phi with inttoptr/ptrtoint. This version fixed a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272	2017-10-10 05:07:54 +00:00
Reid Kleckner	97a2d5c42f	[MC] Properly diagnose badly scoped .cfi_ directives Removes two report_fatal_errors. Implement this by removing EmitCFICommon, and do the checking in getCurrentDwarfFrameInfo. Have the callers check for null before dereferencing it. llvm-svn: 315264	2017-10-10 01:49:21 +00:00
Reid Kleckner	78eb8b912f	Give a test a triple llvm-svn: 315263	2017-10-10 01:34:31 +00:00
Reid Kleckner	e52d1e6787	[SEH] Use reportError instead of report_fatal_error for bad directives This makes the .seh_ directives slightly more usable from standalone assembly files. This removes a large number of report_fatal_errors and recovers from the error by ignoring the directive. llvm-svn: 315262	2017-10-10 01:26:25 +00:00
Reid Kleckner	ab23dace56	[MC] Suppress .Lcfi labels when emitting textual assembly Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - \| llvm-mc \| llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supersedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, at a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259	2017-10-10 00:57:36 +00:00
Aditya Nandakumar	c3bfc81a1f	[GISel]: Fix generation of illegal COPYs during CallLowering We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240	2017-10-09 20:07:43 +00:00
Zvi Rackover	c1d5955684	[X86] Unsigned saturation subtraction canonicalization [the backend part] Summary: On behalf of julia.koval@intel.com The patch transforms canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)) to special psubus instruction on targets, which support it (8bit and 16bit uints). umax(a,b) - b -> subus(a,b) a - umin(a,b) -> subus(a,b) There is also extra case handled, when right part of sub is 32 bit and can be truncated, using UMIN (this transformation was discussed in https://reviews.llvm.org/D25987). The example of special case code: ``` void foo(unsigned short p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = --p; p = (unsigned short)(m >= max ? m-max : 0); } } ``` Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is a valid transformation, because if max > max_short, result of the expression will be zero. Here is the table of types, I try to support, special case items are bold: \| Size \| 128 \| 256 \| 512 \| ----- \| ----- \| ----- \| ----- \| i8 \| v16i8 \| v32i8 \| v64i8 \| i16 \| v8i16 \| v16i16 \| v32i16 \| i32 \| \| v8i32* \| v16i32 \| i64 \| \| \| v8i64 Reviewers: zvi, spatel, DavidKreitzer, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37534 llvm-svn: 315237	2017-10-09 20:01:10 +00:00
Alexey Bataev	6ab4d075ff	[SLP] Add test for reversed load, NFC. llvm-svn: 315232	2017-10-09 19:08:15 +00:00
Daniel Sanders	4d4e7650dc	[globalisel] Add support for ValueType operands in patterns. It's rare but there are a small number of patterns like this: (set i64:$dst, (add i64:$src1, i64:$src2)) These should be equivalent to register classes except they shouldn't check for a specific register bank. This doesn't occur in AArch64/ARM/X86 but does occasionally come up in other in-tree targets such as BPF. llvm-svn: 315226	2017-10-09 18:14:53 +00:00
Francis Ricci	01ab402463	[dsymutil] Emit valid debug locations when no symbol flags are set Summary: swiftc emits symbols without flags set, which led dsymutil to ignore them when searching for global symbols, causing dwarf location data to be omitted. Xcode's dsymutil handles this case correctly, and emits valid location data. Add this functionality to llvm-dsymutil by allowing parsing of symbols with no flags set. Reviewers: aprantl, friss, JDevlieghere Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38587 llvm-svn: 315218	2017-10-09 17:27:47 +00:00
Alexey Bataev	aadc2331e4	[SLP] Test for wrongly vectorized set of extractelements, NFC. llvm-svn: 315217	2017-10-09 17:14:03 +00:00
Zachary Turner	bd3a9dbabb	[llvm-rc] Have the tokenizer discard single & block comments. This allows rc files to have comments. Eventually we should just use clang's c preprocessor, but that's a bit larger effort for minimal gain, and this is straightforward. Differential Revision: https://reviews.llvm.org/D38651 llvm-svn: 315207	2017-10-09 15:46:13 +00:00
Sanjay Patel	2a61a821a0	[DAG] combine assertsexts around a trunc This was a suggested follow-up to: D37017 / https://reviews.llvm.org/rL313577 llvm-svn: 315206	2017-10-09 15:22:20 +00:00
Amara Emerson	24ca39ce71	[AArch64] Improve codegen for inverted overflow checking intrinsics E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the inverted condition code for the CSEL. rdar://28495949 Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D38160 llvm-svn: 315205	2017-10-09 15:15:09 +00:00
Sanjay Patel	8557e29408	[x86] regenerate test checks; NFC llvm-svn: 315204	2017-10-09 15:01:58 +00:00
Sanjay Patel	be37ab864c	[AArch64] fix typos in test assertions llvm-svn: 315203	2017-10-09 01:29:54 +00:00

48692 Commits