llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	558cc48b44	[SelectionDAG] Remove unused method declaration. The method implementation was removed in r318982. llvm-svn: 319798	2017-12-05 17:37:17 +00:00
Dan Gohman	c2c997718d	[WebAssembly] Implement WASM_STACK_POINTER. Use the .stack_pointer directive to implement WASM_STACK_POINTER for specifying a global variable to be the stack pointer. llvm-svn: 319797	2017-12-05 17:23:43 +00:00
Dan Gohman	f7172f4ab0	[WebAssembly] Don't emit .import_global for the wasm target. .import_global is used by the ELF-based target and not needed by the wasm target. llvm-svn: 319796	2017-12-05 17:21:57 +00:00
Xinliang David Li	cc35bc9efc	[PGO] detect infinite loop and form MST properly Differential Revision: http://reviews.llvm.org/D40702 llvm-svn: 319794	2017-12-05 17:19:41 +00:00
Rafael Espindola	20569e96e9	Delete temp file if rename fails. Without this when lld failed to replace the output file it would leave the temporary behind. The problem is that the existing logic is - cancel the delete flag - rename We have to cancel first to avoid renaming and then crashing and deleting the old version. What is missing then is deleting the temporary file if the rename fails. This can be an issue on both unix and windows, but I am not sure how to cause the rename to fail reliably on unix. I think it can be done on ZFS since it has an ACL system similar to what windows uses, but adding support for checking that in llvm-lit is probably not worth it. llvm-svn: 319786	2017-12-05 16:40:56 +00:00
Simon Pilgrim	d9f1ae3266	[X86][AVX512] Tag VNNIW instruction scheduler classes llvm-svn: 319784	2017-12-05 16:17:21 +00:00
Simon Pilgrim	4a9b1e1273	[X86][AVX512] Drop some default NoItinerary arguments that aren't needed any more llvm-svn: 319782	2017-12-05 16:10:57 +00:00
Jina Nahias	51c1a627c2	[x86][AVX512] Lowering kunpack intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39720 Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1 llvm-svn: 319778	2017-12-05 15:42:56 +00:00
Sam Parker	0a436a9d62	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the load(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D39604 llvm-svn: 319773	2017-12-05 15:13:47 +00:00
Simon Pilgrim	4d08aedba3	[X86][AVX512] Tag VPMADD52/VPSADBW instruction scheduler classes llvm-svn: 319772	2017-12-05 14:59:40 +00:00
Bjorn Pettersson	823b299fbc	[DAGCombine] Handle big endian correctly in CombineConsecutiveLoads Summary: Found out, during code inspection, that there was a fault in DAGCombiner::CombineConsecutiveLoads for big-endian targets. A BUILD_PAIR always has the least significant bits of the composite value in element 0. So when we are doing the checks for consecutive loads, for big endian targets, we should check if the load to elt 1 is at the lower address and the load to elt 0 is at the higher address. Normally this bug only resulted in missed opportunities for doing the load combine. I guess that in some rare situation it could lead to faulty combines, but I've not seen that happen. Note that this patch actually will trigger load combine for some big endian regression tests. One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get t76: i64,ch = load<LD8[FixedStack-9]> instead of t37: i32,ch = load<LD4[FixedStack-10]> t35: i32,ch = load<LD4[FixedStack-9]> t41: i64 = build_pair t37, t35 before legalization. Then the legalization will split the LD8 into two loads, so the end result is the same. That should verify that the transformation is correct now. Reviewers: niravd, hfinkel Reviewed By: niravd Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D40444 llvm-svn: 319771	2017-12-05 14:50:05 +00:00
Simon Pilgrim	71660c61e6	[X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes llvm-svn: 319770	2017-12-05 14:34:42 +00:00
Mikael Holmen	0a3e98062f	Bail out of a SimplifyCFG switch table opt at undef values. Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by JesperAntonsson. Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319768	2017-12-05 14:14:00 +00:00
Simon Pilgrim	b9b46394e3	[X86][AVX512] Cleanup bit logic scheduler classes llvm-svn: 319767	2017-12-05 14:04:23 +00:00
Sam Parker	8b73630c32	[DAGCombine] isLegalNarrowLoad function (NFC) Pull the checks upon the load out from ReduceLoadWidth into their own function. Differential Revision: https://reviews.llvm.org/D40833 llvm-svn: 319766	2017-12-05 14:03:51 +00:00
Simon Pilgrim	fd3a2632e5	[X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes llvm-svn: 319765	2017-12-05 13:49:44 +00:00
Igor Laevsky	cec8f47e77	[InstCombine] Don't crash on out of bounds shifts Differential Revision: https://reviews.llvm.org/D40649 llvm-svn: 319761	2017-12-05 12:18:15 +00:00
Simon Pilgrim	aa91155960	[X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319760	2017-12-05 12:14:36 +00:00
Simon Pilgrim	a2b5862641	[X86][AVX512] Cleanup VPCMP scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319758	2017-12-05 12:02:22 +00:00
Simon Pilgrim	54b8aa2bb2	[X86][AVX512] Tag VFIXUPIMM instructions scheduler classes llvm-svn: 319757	2017-12-05 11:46:57 +00:00
Jonas Paulsson	b5b91cd402	[SystemZ] set 'guessInstructionProperties = 0' and set flags as needed. This has proven a healthy exercise, as many cases of incorrect instruction flags were corrected in the process. As part of this, IntrWriteMem was added to several SystemZ intrinsics. Furthermore, a bug was exposed in TwoAddress with this change (as incorrect hasSideEffects flags were removed and instructions could now be sunk), and the test case for that bugfix (r319646) is included here as test/CodeGen/SystemZ/twoaddr-sink.ll. One temporary test regression (one extra copy) which will hopefully go away in upcoming patches for similar cases: test/CodeGen/SystemZ/vec-trunc-to-i1.ll Review: Ulrich Weigand. https://reviews.llvm.org/D40437 llvm-svn: 319756	2017-12-05 11:24:39 +00:00
Jonas Paulsson	86c40db49d	[Regalloc] Generate and store multiple regalloc hints. MachineRegisterInfo used to allow just one regalloc hint per virtual register. This patch extends this to a vector of regalloc hints, which is filled in by common code with sorted copy hints. Such hints will make for more ID copies that can be removed. NB! This improvement is currently (and hopefully temporarily) disabled by default, except for SystemZ. The only reason for this is the big impact this has on tests, which has unfortunately proven unmanageable. It was a long while since all the tests were updated and just waiting for review (which didn't happen), but now targets have to enable this themselves instead. Several targets could get a head-start by downloading the tests updates from the Phabricator review. Thanks to those who helped, and sorry you now have to do this step yourselves. This should be an improvement generally for any target! The target may still create its own hint, in which case this has highest priority and is stored first in the vector. If it has target-type, it will not be recomputed, as per the previous behaviour. The temporary hook enableMultipleCopyHints() will be removed as soon as all targets return true. Review: Quentin Colombet, Ulrich Weigand. https://reviews.llvm.org/D38128 llvm-svn: 319754	2017-12-05 10:52:24 +00:00
Pavel Labath	2da3397cdf	Re-commit "[cmake] Enable zlib support on windows" This recommits r319533 which broke llvm-config --system-libs output. The reason was that I used find_library for searching for the z library. This returns absolute paths, and when these paths made it into llvm-config, it made it produce nonsensical flags. To fix this, I hand-roll a search for the library in the same way that we search for the terminfo library a couple of lines below. This is a bit less flexible than the find_library option, as it does not allow the user to specify the path to the library at configure time (which is important on windows, as zlib is unlikely to be found in any of the standard places cmake searches), but I was able to guide the build to find it with appropriate values of LIB and INCLUDE environment variables. Reviewers: compnerd, rnk, beanz, rafael Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D40779 llvm-svn: 319751	2017-12-05 10:24:15 +00:00
George Rimar	f91f0b0af7	[Support/TarWriter] - Don't allow TarWriter to add the same file more than once. This is for PR35460. Currently when LLD adds files to TarWriter it may pass the same file multiple times. For example, this happens for a clang reproduce file which specifies archive (.a) files more than once on the command line. The patch makes TarWriter ignore files with the same path, so it will add only the first one to the archive. Differential revision: https://reviews.llvm.org/D40606 llvm-svn: 319750	2017-12-05 10:09:59 +00:00
Guy Blank	f3cefdd350	[X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass When trying to determine the correct Mask register class corresponding to a GPR register class, not all register classes were handled. This caused an assertion to be raised on some scenarios. Differential Revision: https://reviews.llvm.org/D40290 llvm-svn: 319745	2017-12-05 09:08:24 +00:00
Craig Topper	98495291a7	[SelectionDAG] Use WidenTargetBoolean in WidenVecRes_MLOAD and WidenVecOp_MSTORE instead of implementing it manually and incorrectly. The CONCAT_VECTORS operand get its type from getSetCCResultType, but if the mask type and the setcc have different scalar sizes this creates an illegal CONCAT_VECTORS operation. The concat type should be 2x the mask type, and then an extend should be added if needed. llvm-svn: 319744	2017-12-05 08:15:03 +00:00
Craig Topper	a404ce955a	[X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319740	2017-12-05 06:37:21 +00:00
Daniel Sanders	3c1c4c0ee0	Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. Some concerns were raised with the direction. Revert while we discuss it and look into an alternative llvm-svn: 319739	2017-12-05 05:52:07 +00:00
Craig Topper	e1ba2450c2	[X86] Fix a crash if avx512bw and xop are both enabled when the IR contains a v32i8 bitreverse. llvm-svn: 319737	2017-12-05 04:47:12 +00:00
Matt Arsenault	e42b08d96d	AMDGPU: Fix missing subtarget feature initializer llvm-svn: 319733	2017-12-05 03:15:44 +00:00
Matt Arsenault	9a60c3ea36	AMDGPU: Fix crash when scheduling DBG_VALUE This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732	2017-12-05 03:09:23 +00:00
Craig Topper	276c770e57	[X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319728	2017-12-05 01:45:46 +00:00
Craig Topper	913b42b0e1	[X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef. This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction. llvm-svn: 319726	2017-12-05 01:28:06 +00:00
Craig Topper	6302012442	[X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI The getConstant function can take care of creating the APInt internally. getZeroVector will take care of using the correct type for the build vector to avoid re-lowering. The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change. llvm-svn: 319725	2017-12-05 01:28:04 +00:00
Craig Topper	adadaae586	[X86] Rearrange some of the code around AVX512 sign/zero extends. NFCI Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry. Move all of the i1 input type code in Lower_AVX512ZeroExend together. llvm-svn: 319724	2017-12-05 01:28:00 +00:00
Matthias Braun	7afbfd0f24	MachineFrameInfo: Cleanup some parameter naming inconsistencies; NFC Consistently use the same parameter names as the names of the affected fields. This avoids some unintuitive abbreviations like `isSS`. llvm-svn: 319722	2017-12-05 01:18:15 +00:00
Matthias Braun	62378bb5ab	TwoAddressInstructionPass: Trigger -O0 behavior on optnone While we cannot skip the whole TwoAddressInstructionPass even for -O0 there are some parts of the pass that are currently skipped at -O0 but not for optnone. Changing this as there is no reason to have those two hit different code paths here. llvm-svn: 319721	2017-12-05 00:56:14 +00:00
Jan Vesely	39aeab4f30	AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712	2017-12-04 23:07:28 +00:00
Jan Vesely	d1c9b61e2b	AMDGPU: Disable fp64 support on pre GCN asics It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709	2017-12-04 22:57:29 +00:00
Evgeniy Stepanov	4a8d151986	[msan] Add a fixme note for a minor deficiency. llvm-svn: 319708	2017-12-04 22:50:39 +00:00
Hans Wennborg	361d4392cf	Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack" This broke the Chromium build (crbug.com/791714). Reverting while investigating. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 > > git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 319706	2017-12-04 22:21:15 +00:00
Matt Arsenault	68f0505263	AMDGPU: Fix creating invalid copy when adjusting dmask Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705	2017-12-04 22:18:27 +00:00
Matt Arsenault	e6667ded4d	AMDGPU: Use return value of MorphNodeTo llvm-svn: 319704	2017-12-04 22:18:22 +00:00
Paul Robinson	68ba772cc0	Re-submit r289925 (Update .debug_line section version to match DWARF version) Set the .debug_line version to match the requested DWARF version, except with a maximum of v4 because we don't support v5 yet. Previously Chromium had issues with this patch; see PR31407. Chromium tool issues have been addressed, so hopefully this will go through this time. Patch by Katya Romanova! Differential Revision: https://reviews.llvm.org/D38002 llvm-svn: 319699	2017-12-04 21:27:46 +00:00
Hans Wennborg	e117129ef7	DAG: Follow-up to r319692: check the truncates' inputs have the same type MatchRotate assumes the types of LHS and RHS are equal, which is always the case when they come from an OR node, but here we're getting them from two different TRUNC nodes, so we have to check the types. llvm-svn: 319695	2017-12-04 20:48:50 +00:00
Hans Wennborg	7e61f24962	DAG: Match truncated rotation (PR35487) If the truncation has been pushed past the or-node, look through it and truncate afterwards. Differential revision: https://reviews.llvm.org/D40792 llvm-svn: 319692	2017-12-04 20:39:57 +00:00
Daniel Sanders	04e4f47e93	[globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. This patch splits atomics out of the generic G_LOAD/G_STORE and into their own G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a necessary one. Atomic load/store has little implementation in common with non-atomic load/store. They tend to be handled very differently throughout the backend. It also has the nice side-effect of slightly improving the common-case performance at ISel since there's no longer a need for an atomicity check in the matcher table. All targets have been updated to remove the atomic load/store check from the G_LOAD/G_STORE path. AArch64 has also been updated to mark G_ATOMIC_LOAD/G_ATOMIC_STORE legal. There is one issue with this patch though which also affects the extending loads and truncating stores. The rules only match when an appropriate G_ANYEXT is present in the MIR. For example, (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X)))) will match but: (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X)) will not. This shouldn't be a problem at the moment, but as we get better at eliminating extends/truncates we'll likely start failing to match in some cases. The current plan is to fix this in a patch that changes the representation of extending-load/truncating-store to allow the MMO to describe a different type to the operation. llvm-svn: 319691	2017-12-04 20:39:32 +00:00
Hiroshi Yamauchi	9364fa3434	Move splitIndirectCriticalEdges() to BasicBlockUtils.h. Summary: Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so that it can be called from other places. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40750 llvm-svn: 319689	2017-12-04 20:36:01 +00:00
Haicheng Wu	234eabaf07	[ConstantFold] Support vector index when factoring out GEP index into preceding dimensions Follow-up of r316824. This patch supports the vector type for both current and previous index when factoring out the current one into the previous one. Differential Revision: https://reviews.llvm.org/D39556 llvm-svn: 319683	2017-12-04 19:56:33 +00:00
Sanjoy Das	adf3751730	[SCEV] Use a "Discovered" set instead of a "Visited" set; NFC Suggested by Max Kazantsev in https://reviews.llvm.org/D39361 llvm-svn: 319679	2017-12-04 19:22:01 +00:00
Sanjoy Das	7e36337935	[SCEV] A different fix for PR33494 Summary: I don't think rL309080 is the right fix for PR33494 -- caching ExitLimit only hides the problem[0]. The real issue is that because of how we forget SCEV expressions ScalarEvolution::getBackedgeTakenInfo, in the test case for PR33494 computing the backedge for any loop invalidates the trip count for every other loop. This effectively makes the SCEV cache useless. I've instead made the SCEV expression invalidation in ScalarEvolution::getBackedgeTakenInfo less aggressive to fix this issue. [0]: One way to think about this is that rL309080 essentially augmented the backedge-taken-count cache with another equivalent exit-limit cache. The bug went away because we were explicitly not clearing the exit-limit cache in getBackedgeTakenInfo. But instead of doing all of that, we can just avoid clearing the backedge-taken-count cache. Reviewers: mkazantsev, mzolotukhin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D39361 llvm-svn: 319678	2017-12-04 19:22:00 +00:00
Sanjoy Das	aa92cae14e	[BypassSlowDivision] Improve our handling of divisions by constants (This reapplies r314253. r314253 was reverted in r314482 because of a correctness regression on P100, but that regression was identified to be something else.) Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow. This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 319677	2017-12-04 19:21:58 +00:00
Matthias Braun	7eae251bae	MachineVerifier: undef phi arg doesn't need to be live-out from predecessor Differential Revision: https://reviews.llvm.org/D40756 llvm-svn: 319674	2017-12-04 18:57:48 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Pablo Barrio	2b4385846c	Fix function pointer tail calls in armv8-M.base Summary: The compiler fails with the following error message: fatal error: error in backend: ran out of registers during register allocation Tail call optimization for Armv8-M.base fails to meet all the required constraints when handling calls to function pointers where the arguments take up r0-r3. This is because the pointer to the function to be called can only be stored in r0-r3, but these are all occupied by arguments. This patch makes sure that tail call optimization does not try to handle this type of calls. Reviewers: chill, MatzeB, olista01, rengolin, efriedma Reviewed By: olista01, efriedma Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40706 llvm-svn: 319664	2017-12-04 16:55:49 +00:00
Pavel Labath	f2fdc183b7	Revert "[cmake] Enable zlib support on windows" This reverts commit r319533 as it broke llvm-config --system-libs output and everything that depends on it (which is mostly out of tree or downstream folks, but includes a couple of llvm buildbots as well). I think I have a fix for this in D40779, but I want someone to review it first. In the meantime, I am reverting this change, as it seems to break a lot of people. llvm-svn: 319663	2017-12-04 16:46:20 +00:00
Sam Kolton	5f7f32c382	[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole. Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 llvm-svn: 319662	2017-12-04 16:22:32 +00:00
Anna Thomas	7b360434ff	[Loop Predication] Teach LP about reverse loops Summary: Currently, we only support predication for forward loops with step of 1. This patch enables loop predication for reverse or countdown loops, which satisfy the following conditions: 1. The step of the IV is -1. 2. The loop has a single latch as B(X) = X <pred> latchLimit with pred as s> or u> 3. The IV of the guard is the decrement IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit). This patch was downstream for a while and is the last in the series of patches from our downstream LP implementation. Reviewers: apilipenko, mkazantsev, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40353 llvm-svn: 319659	2017-12-04 15:11:48 +00:00
Jonas Hahnfeld	5db24d7c22	[NVPTX] Assign valid global names PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The existing pass already ensured this for globals and this patch adds the cleanup for functions with local linkage. However, there was a different problem in the case of collisions of the adjusted name: The ValueSymbolTable then automatically appended ".N" with increasing Ns to get a unique name while helping the ABI demangling. Special case this behavior to omit the dots and append N directly. This will always give us legal names according to the PTX requirements. Differential Revision: https://reviews.llvm.org/D40573 llvm-svn: 319657	2017-12-04 14:19:33 +00:00
Oliver Stannard	7ab60605f8	Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands This is causing a failure in the llvm-clang-x86_64-expensive-checks-win buildbot, and I can't reproduce it locally, so reverting until I can work out what is wrong. llvm-svn: 319654	2017-12-04 13:42:22 +00:00
Sam McCall	d0d43e6f14	Revert "[ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI" This reverts commit r319624, which seems to cause a miscompile (breaks the multistage PPC buildbots) llvm-svn: 319652	2017-12-04 12:51:49 +00:00
Tim Corringham	6c6d5e24cd	AMDGPU: fix missing s_waitcnt Summary: The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possibility of a required s_waitcnt not being inserted. The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written. There is no test case provided for this fix because: 1) the only example that reproduces this is large and resistant to being reduced 2) the change is trivial Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40544 llvm-svn: 319651	2017-12-04 12:30:49 +00:00
Oliver Stannard	7cd4db94f8	[Asm, ARM] Add fallback diag for multiple invalid operands This adds an "invalid operands for instruction" diagnostic for instructions where there is an instruction encoding with the correct mnemonic and which is available for this target, but where multiple operands do not match those which were provided. This makes it clear that there is some combination of operands that is valid for the current target, which the default diagnostic of "invalid instruction" does not. Since this is a very general error, we only emit it if we don't have a more specific error. Differential revision: https://reviews.llvm.org/D36747 llvm-svn: 319649	2017-12-04 12:02:32 +00:00
Jonas Paulsson	e86327f290	[TwoAddressInstructionPass] Bugfix in handling of sunk instructions. An instruction returned by TII->convertToThreeAddress() may contain a %noreg (undef) operand, which is not expected by tryInstructionTransform(). So if this MI is sunk to a lower point in MBB, it must be skipped when later encountered. A new set SunkInstrs is used for this purpose. Note: there is no test supplied here, as this was triggered on SystemZ while working on a review of instruction flags. A test case for this bugfix will be included in the upcoming SystemZ commit. Review: Quentin Colombet https://reviews.llvm.org/D40711 llvm-svn: 319646	2017-12-04 10:03:14 +00:00
Sam Parker	1e26d986aa	[DAGCombine] Remove isAndLoadExtLoad arguments Both LoadedVT and NarrowLoad are passed as references and neither of them are used by any of its callers. Differential Revision: https://reviews.llvm.org/D40713 llvm-svn: 319645	2017-12-04 09:48:26 +00:00
Martin Storsjo	eca862de07	[AArch64] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Set the right Data*bitsDirective for windows to match the existing tests for other platforms. Make parts of the existing tests a regex, to allow matching .section .rdata for windows, to avoid having to duplicate the rest of the tests for windows. Differential Revision: https://reviews.llvm.org/D40770 llvm-svn: 319644	2017-12-04 09:09:04 +00:00
Martin Storsjo	c85cc41801	[ARM] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Differential Revision: https://reviews.llvm.org/D40769 llvm-svn: 319643	2017-12-04 09:08:55 +00:00
Craig Topper	4520d4f8ad	[X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled. These instructions can be used by widening to 512-bits and extracting back to 128/256. We do similar to several other instructions already. llvm-svn: 319641	2017-12-04 07:21:01 +00:00
Craig Topper	1151facf76	[X86] Don't turn UINT_TO_FP into SINT_TO_FP during lowering. We already do this as a DAG combine. The version during lowering can only trigger if known bits changes something that improves known bits analysis. But this means we should be improving known bits analysis to work on the unlowered form instead. llvm-svn: 319640	2017-12-04 05:38:44 +00:00
Craig Topper	67217d7eb4	[SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a non-splat constant shift amount. If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result. There's probably a lot more that we can do here, but this fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero. llvm-svn: 319639	2017-12-04 05:38:42 +00:00
Simon Pilgrim	569e53b0f6	[X86][AVX512] Tag PH2PS/PS2PH conversion instructions scheduler classes llvm-svn: 319637	2017-12-03 21:43:54 +00:00
Simon Pilgrim	465a88bb92	[X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class llvm-svn: 319636	2017-12-03 21:16:12 +00:00
Simon Pilgrim	bc8d0223fb	[X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries llvm-svn: 319634	2017-12-03 20:57:04 +00:00
Yaxun Liu	30e4608cca	CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is not true for amdgcn---amdgiz target. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40255 llvm-svn: 319630	2017-12-03 03:31:45 +00:00
Craig Topper	f3470e1ed4	[SelectionDAG] Use the inlined APInt shift methods since we've already bounds checked the shift. The version that takes APInt is out of line. The 'unsigned' version optimizes for the common case of single word APInts. llvm-svn: 319628	2017-12-03 03:07:09 +00:00
Sam Clegg	a2b35dac03	Reland "[WebAssembly] Add visibility flag to Wasm symbol flags" Original change was rL319488. This was reverted rL319602 due to a gcc 7.1 warning. Differential Revision: https://reviews.llvm.org/D40772 llvm-svn: 319626	2017-12-03 01:19:23 +00:00
Craig Topper	199acd88e3	[ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI llvm-svn: 319624	2017-12-02 23:42:17 +00:00
Yaxun Liu	494770403a	CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT Two issues found when doing codegen for splitting vector with non-zero alloca addr space: DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to infer the correct pointer info, which ends up with a dummy pointer info for the target to lower store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to represent MachinePointerInfo which is known in alloca address space but without other information. TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for multiplication of index and then add it to the pointer. However the pointer may be in an addr space which has different size than addr space 0. The fix is to use the pointer value type for index multiplication. Differential Revision: https://reviews.llvm.org/D39758 llvm-svn: 319622	2017-12-02 22:13:22 +00:00
Simon Pilgrim	299a54c5b9	[X86][SSE] Cleanup float/int conversion scheduler itinerary classes Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well. llvm-svn: 319614	2017-12-02 12:27:44 +00:00
Craig Topper	7d9a3b82c6	[X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15. llvm-svn: 319612	2017-12-02 08:27:46 +00:00
Craig Topper	3e846ecb5b	[X86] Support %dr8-%dr15 in the assembler. Apparently I failed to make this work when I fixed it in the disassembler way back in r224862. llvm-svn: 319611	2017-12-02 08:27:45 +00:00
Tatyana Krasnukha	f665f6a279	[ARC] Add instruction subset for the ARC backend. Reviewers: petecoup, kparzysz Reviewed By: petecoup Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37983 llvm-svn: 319609	2017-12-02 05:25:17 +00:00
Nirav Dave	839ff79a8d	[DAG][AArch64] Disable post-legalization store Disable post-legalization store for AArch64 backend which is causing errors out-of-tree. llvm-svn: 319607	2017-12-02 04:01:26 +00:00
Heejin Ahn	e74a864cec	[WebAssembly] Revert r319488 "Add visibility flag to Wasm symbol flags" This patch reportedly broke one of LLVM bots (ubuntu-gcc7.1-werror). See http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3369 for details. llvm-svn: 319602	2017-12-02 02:05:06 +00:00
Matt Morehouse	9e658c974b	Revert "[X86] Improvement in CodeGen instruction selection for LEAs." This reverts r319543, due to ASan bot breakage. llvm-svn: 319591	2017-12-01 22:20:26 +00:00
Jessica Paquette	52df8015c5	[MachineOutliner] NFC: Throw out self-intersections on candidates early Currently, the outliner considers candidates that intersect with themselves in the candidate pruning step. That is, candidates of the form "AA" in ranges like "AAAAAA". In that range, it looks like there are 5 instances of "AA" that could possibly be outlined, and that's considered in the benefit calculation. However, only at most 3 instances of "AA" could ever be outlined in "AAAAAA". Thus, it's possible to pass through "AA" to the candidate selection step even though it's never the case that "AA" could be outlined. This makes it so that when we find candidates, we consider only non-overlapping occurrences of that candidate. llvm-svn: 319588	2017-12-01 21:56:56 +00:00
Nirav Dave	3e76e1e89e	[DAG][ARM] Revert "Reenable post-legalize store merge" due to failures in AArch and ARM code gen. llvm-svn: 319587	2017-12-01 21:55:47 +00:00
Jake Ehrlich	3da7982cca	[MC] Handle unknown literal register numbers in .cfi_* directives r230670 introduced a step to map EH register numbers to standard DWARF register numbers. This failed to consider the case when a user .cfi_* directive uses an integer literal rather than a register name, to specify a DWARF register number that has no corresponding LLVM register number (e.g. a special register that the compiler and assembler have no name for). Fixes PR34028. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D36493 llvm-svn: 319586	2017-12-01 21:44:27 +00:00
Philip Reames	6260cf71d3	[IndVars] Fix a bug introduced in r317012 Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant. In this case, there is no loop invariant value contributing and we'd fail an assert. The test case was found by a java fuzzer and reduced. It's a real cornercase. You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate. To my knowledge, only the fuzzer has hit this case. llvm-svn: 319583	2017-12-01 20:57:19 +00:00
Adam Nemet	9303f62255	[opt-remarks] If hotness threshold is set, ignore remarks without hotness These are blocks that have not been executed during training. For large projects this could make a significant difference. For the project I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 Re-commit after fixing the failing testcase in rL319576, rL319577 and rL319578. llvm-svn: 319581	2017-12-01 20:41:38 +00:00
Eli Friedman	b34a8198a9	[DAGCombine] Simplify ISD::AND handling in ReduceLoadWidth Followup to D39595. Removes a bunch of redundant checks. Differential Revision: https://reviews.llvm.org/D40667 llvm-svn: 319573	2017-12-01 19:33:56 +00:00
Simon Pilgrim	031d8b71b3	[X86][AVX512] Tag subvector extract/insert instructions scheduler classes llvm-svn: 319568	2017-12-01 18:40:32 +00:00
Benjamin Kramer	094ac65d72	[IR] Avoid dangling else warning. NFC. llvm-svn: 319567	2017-12-01 18:39:58 +00:00
Fedor Sergeev	3b459c3847	IR printing improvement for loop passes - handle -print-module-scope Summary: Adding support for -print-module-scope similar to how it is being done for function passes. This option causes loop-pass printer to emit a whole-module IR instead of just a loop itself. Reviewers: sanjoy, silvas, weimingz Reviewed By: sanjoy Subscribers: apilipenko, skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D40247 llvm-svn: 319566	2017-12-01 18:33:58 +00:00
Paul Robinson	ab69b477a9	[DebugInfo] Bail out if making no progress dumping line tables. llvm-svn: 319564	2017-12-01 18:25:30 +00:00
Adam Nemet	57783730fd	Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness" This reverts commit r319556. Something is not working with this when used with sample-based profiling. Investigating... llvm-svn: 319562	2017-12-01 18:12:29 +00:00
Fedor Sergeev	94dca7c7ea	IR printing improvement for function passes - introducing -print-module-scope Summary: When debugging function passes it happens to be rather useful to dump the whole module before the transformation and then use this dump to analyze this single transformation by running it separately on that particular module state. Introducing -print-module-scope debugging option that forces all the function-level IR dumps to become whole-module dumps. This option builds on top of normal dumping controls like -print-before/after -filter-print-funcs The plan is to eventually extend this option to cover other local passes (at least loop passes) but that should go as a separate change. Reviewers: sanjoy, weimingz, silvas, fedor.sergeev Reviewed By: weimingz Subscribers: apilipenko, skatkov, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D40245 llvm-svn: 319561	2017-12-01 17:42:46 +00:00
Simon Pilgrim	8d5e469c32	Fix line endings. NFCI. llvm-svn: 319559	2017-12-01 17:24:15 +00:00
Simon Pilgrim	fb01cb1b0c	[X86][AVX512] Tag VPERM2I/VPERM2T instructions scheduler class llvm-svn: 319558	2017-12-01 17:23:06 +00:00
Adam Nemet	8d1fc2b65b	[opt-remarks] If hotness threshold is set, ignore remarks without hotness These are blocks that have not been executed during training. For large projects this could make a significant difference. For the project I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 llvm-svn: 319556	2017-12-01 17:02:04 +00:00
Simon Pilgrim	54c6083fb1	[X86][AVX512] Tag VFPCLASS instructions scheduler class llvm-svn: 319554	2017-12-01 16:51:48 +00:00
Simon Pilgrim	07b4c5917e	[X86][AVX512] Tag VPSHUFBITQMB instructions scheduler class llvm-svn: 319553	2017-12-01 16:35:57 +00:00
Simon Pilgrim	904d1a895c	[X86][AVX512] Tag VPCOMPRESS/VPEXPAND instructions scheduler classes llvm-svn: 319551	2017-12-01 16:20:03 +00:00
Hans Wennborg	e2470b95da	Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." It causes builds to fail with "Instruction does not dominate all uses" (PR35497). > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > *dst++ = *src++; > *dst++ = *src++ + 1; > *dst++ = *src++ + 2; > *dst++ = *src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319550	2017-12-01 16:17:24 +00:00
Nirav Dave	eb2b24fded	[ARM][DAG] Reenable post-legalize store merge Summary: Reenable post-legalize stores with constant merging computation and corresponding test case. Reviewers: eastig, efriedma Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40701 llvm-svn: 319547	2017-12-01 14:49:26 +00:00
Jatin Bhateja	328199ec26	[X86] Improvement in CodeGen instruction selection for LEAs. Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. 4/ Simplify LEA converts (lea (BASE,1,INDEX,0) --> add (BASE, INDEX)) which offers better throughput. PR32755 will be taken care of by this patch. Previous patch revisions : r313343 , r314886 Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja Reviewed By: lsaba, RKSimon, jbhateja Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 319543	2017-12-01 14:07:38 +00:00
Simon Pilgrim	2dc4ff1cde	[X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classes llvm-svn: 319540	2017-12-01 13:25:54 +00:00
Mikael Holmen	9c13c8b6ec	Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values. Broke build bots so reverting. llvm-svn: 319539	2017-12-01 13:11:39 +00:00
Florian Hahn	30932a3c16	[InstSimplify] More fcmp cases when comparing against negative constants. Summary: For known positive non-zero value X: fcmp uge X, -C => true fcmp ugt X, -C => true fcmp une X, -C => true fcmp oeq X, -C => false fcmp ole X, -C => false fcmp olt X, -C => false Patch by Paul Walker. Reviewers: majnemer, t.p.northover, spatel, RKSimon Reviewed By: spatel Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D40012 llvm-svn: 319538	2017-12-01 12:34:16 +00:00
Mikael Holmen	9f047795fb	Bail out of a SimplifyCFG switch table opt at undef values. Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by: JesperAntonsson Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319537	2017-12-01 12:30:49 +00:00
Nemanja Ivanovic	4364513cb2	Follow-up to r319434 to turn the pass on by default Now that the patch has gone through the buildbot cycle, turn it on by default. llvm-svn: 319535	2017-12-01 12:02:59 +00:00
Alexander Timofeev	c1425c9d6b	[AMDGPU] SiFixSGPRCopies should not modify non-divergent PHI Differential revision: https://reviews.llvm.org/D40556 llvm-svn: 319534	2017-12-01 11:56:34 +00:00
Pavel Labath	11ce6e6a83	[cmake] Enable zlib support on windows Summary: zlib support was hard-wired to off for (non-cygwin) windows targets. This disables some features, such as reading debug info from compressed dwarf sections. This has been this way since zlib support was added in 2013 (r180083), but there is no obvious reason for that. Zlib is perfectly capable of being compiled for windows (it even has a cmake file that works out of the box). This enables one to turn on zlib support on windows, if one has zlib available. Reviewers: rnk, beanz Subscribers: mgorny, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D40655 llvm-svn: 319533	2017-12-01 11:41:07 +00:00
Dinar Temirbulatov	29e86584c6	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319531	2017-12-01 11:10:47 +00:00
Volkan Keles	a32ff00b00	GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES. Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls Reviewed By: dsanders Subscribers: rovka, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D39823 llvm-svn: 319524	2017-12-01 08:19:10 +00:00
Hiroshi Inoue	48e4c7aae6	Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores Recommitting the once-reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots. llvm-svn: 319522	2017-12-01 06:05:05 +00:00
Craig Topper	f8470a6399	[X86] Custom legalize v2i32 gathers via widening rather than promoting. The default legalization for v2i32 is promotion to v2i64. This results in a gather that reads 64-bit elements rather than 32. If one of the elements is near a page boundary this can cause an illegal access that can fault. We also miscalculate the scale for the gather which is an even worse problem, but we probably could have found a separate way to fix that. llvm-svn: 319521	2017-12-01 06:02:02 +00:00
Craig Topper	c261213abc	[X86][SelectionDAG] Make sure we explicitly sign extend the index when type promoting the index of scatter and gather. Type promotion makes no guarantee about the contents of the promoted bits. Since the gather/scatter instruction will use the bits to calculate addresses, we need to ensure they aren't garbage. llvm-svn: 319520	2017-12-01 06:02:00 +00:00
Craig Topper	11f733df9b	[X86] Add a DAG combine to simplify masks for AVX2 gather instructions. AVX2 gathers only use the upper bit of the mask allowing us to simplify sign_extend_inreg to a shift left. llvm-svn: 319514	2017-12-01 02:49:07 +00:00
Jake Ehrlich	1a468481c0	Add flag to ArchiveWriter to test GNU64 format more efficiently Even with the sparse file optimizations the SYM64 test can still be painfully slow. This unnecessarily slows down devs. It's critical that we test that the switch to the SYM64 format occurs at 4GB but there isn't any better of a way to fake the size of the file than sparse files. This change introduces a flag that allows the cutoff to be arbitrarily set to whatever power of two is desired. The flag is hidden as it really isn't meant to be used outside this one test. This is unfortunate but appears necessary, at least until the average hard drive is much faster. The changes to the test require some explanation. Prior to this change we knew that the SYM64 format was being used because the file was simply too large to have validly handled this case if the SYM64 format were not used. To ensure that the SYM64 format is still being used I am grepping the file for "SYM64". Without changing the filename however this would be pointless because "SYM64" would occur in the file either way. So the filename of the test is also changed in order to avoid this issue. Differential Revision: https://reviews.llvm.org/D40632 llvm-svn: 319507	2017-12-01 00:54:28 +00:00
Zachary Turner	8065f0b975	Mark all library options as hidden. These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on. Differential Revision: https://reviews.llvm.org/D40674 llvm-svn: 319505	2017-12-01 00:53:10 +00:00
Matt Arsenault	686d5c728f	AMDGPU: Use carry-less adds in FI elimination llvm-svn: 319501	2017-11-30 23:42:30 +00:00
Peter Collingbourne	1f03422610	ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module. If the thin module has no references to an internal global in the merged module, we need to make sure to preserve that property if the global is a member of a comdat group, as otherwise promotion can end up adding global symbols to the comdat, which is not allowed. This situation can arise if the external global in the thin module has dead constant users, which would cause use_empty() to return false and would cause us to try to promote it. To prevent this from happening, discard the dead constant users before asking whether a global is empty. Differential Revision: https://reviews.llvm.org/D40593 llvm-svn: 319494	2017-11-30 23:05:52 +00:00
Zachary Turner	f0e4c6a819	Simplify the DenseSet used for hashing CodeView records. This was storing the hash alongside the key so that the hash doesn't need to be re-computed every time, but in doing so it was allocating a structure to keep the key size small in the DenseMap. This is a noble goal, but it also leads to a pointer indirection on every probe, and this cost of this pointer indirection ends up being higher than the cost of having a slightly larger entry in the hash table. Removing this not only simplifies the code, but yields a small but noticeable performance improvement in the type merging algorithm. llvm-svn: 319493	2017-11-30 23:00:30 +00:00
Matt Arsenault	84445dd13c	AMDGPU: Use gfx9 carry-less add/sub instructions llvm-svn: 319491	2017-11-30 22:51:26 +00:00
Reid Kleckner	ba4014e9dc	XOR the frame pointer with the stack cookie when protecting the stack Summary: This strengthens the guard and matches MSVC. Reviewers: hans, etienneb Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits Differential Revision: https://reviews.llvm.org/D40622 llvm-svn: 319490	2017-11-30 22:41:21 +00:00
Sam Clegg	9138b7b005	Add visibility flag to Wasm symbol flags The LLVM "hidden" flag needs to be passed through the Wasm intermediate objects in order for the linker to apply it to the final Wasm object. The corresponding change in LLD is here: https://github.com/WebAssembly/lld/pull/14 Patch by Nicholas Wilson Differential Revision: https://reviews.llvm.org/D40442 llvm-svn: 319488	2017-11-30 22:34:58 +00:00
Dan Gohman	59e4c0b938	[memcpyopt] Teach memcpyopt to optimize across basic blocks This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. Fixes PR28958. Differential Revision: https://reviews.llvm.org/D38374 llvm-svn: 319482	2017-11-30 22:10:53 +00:00
Davide Italiano	9d939c8f19	[InlineCost] Prefer getFunction() to two calls to getParent(). Improves clarity, also slightly cheaper. NFCI. llvm-svn: 319481	2017-11-30 22:10:35 +00:00
Krzysztof Parzyszek	d76814200b	[Hexagon] Implement HexagonSubtarget::useAA() llvm-svn: 319477	2017-11-30 21:25:28 +00:00
Daniel Sanders	0c43b3a023	[globalisel][tablegen] Add support for relative AtomicOrderings No test yet because the relevant rules are blocked on the atomic_load, and atomic_store nodes. llvm-svn: 319475	2017-11-30 21:05:59 +00:00
Krzysztof Parzyszek	44555225a6	[Hexagon] Solo instructions cannot be used with new value jumps llvm-svn: 319470	2017-11-30 20:32:54 +00:00
Craig Topper	d4257565cf	[X86] Promote i8 CTPOP to i32 instead of i16 when we have the POPCNT instruction. The 32-bit version is shorter to encode and the zext we emit for the promotion is likely going to be a 32-bit zero extend anyway. llvm-svn: 319468	2017-11-30 20:15:31 +00:00
Daniel Sanders	aef1dfc690	[aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND. G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an external comparison. Note that IRTranslator doesn't generate these instructions yet. llvm-svn: 319466	2017-11-30 20:11:42 +00:00
Amara Emerson	d78d65c2a4	[GlobalISel][IRTranslator] Fix crash during translation of zero sized loads/stores/args/returns. This fixes PR35358. rdar://35619533 Differential Revision: https://reviews.llvm.org/D40604 llvm-svn: 319465	2017-11-30 20:06:02 +00:00
Xinliang David Li	c23d2c6883	[PGO] Skip counter promotion for infinite loops Differential Revision: http://reviews.llvm.org/D40662 llvm-svn: 319462	2017-11-30 19:16:25 +00:00
Zachary Turner	ca6dbf1440	Split TypeTableBuilder into two classes. llvm-svn: 319456	2017-11-30 18:39:50 +00:00
Dan Gohman	78c19d60a9	[WebAssembly] Revert r319186 "Support bitcasted function addresses with varargs." The patch broke Emscripten's EM_ASM macros, which utilize unprototyped functions. See https://bugs.llvm.org/show_bug.cgi?id=35385 for details. llvm-svn: 319452	2017-11-30 18:16:49 +00:00
Francis Visoiu Mistrih	c71cced0aa	[CodeGen] Always use `printReg` to print registers in both MIR and debug output As part of the unification of the debug format and the MIR format, always use `printReg` to print all kinds of registers. Updated the tests using '_' instead of '%noreg' until we decide which one we want to be the default one. Differential Revision: https://reviews.llvm.org/D40421 llvm-svn: 319445	2017-11-30 16:12:24 +00:00
Igor Laevsky	0cdf7fdc48	[FuzzMutate] Bailout from injecting into empty basic blocks. In rare cases we can receive a request to inject into a completely empty basic block. In the normal case all basic blocks contain at least a terminator instruction, but it is possible that the only instruction is a catchpad instruction which is not part of the instruction iterator. This case seems rare enough to not care about it. Submitting without review, since it seems almost NFC. I couldn't come up with any reasonable way to test this. llvm-svn: 319444	2017-11-30 15:41:58 +00:00
Igor Laevsky	33031926b6	[FuzzMutate] Correctly handle vector types in the insertvalue operation Differential Revision: https://reviews.llvm.org/D40397 llvm-svn: 319442	2017-11-30 15:31:13 +00:00
Igor Laevsky	65902db279	[FuzzMutate] Don't use index operands as sinks Differential Revision: https://reviews.llvm.org/D40396 llvm-svn: 319441	2017-11-30 15:29:16 +00:00
Igor Laevsky	48147d012b	[FuzzMutate] Pick correct index for the insertvalue instruction Differential Revision: https://reviews.llvm.org/D40395 llvm-svn: 319440	2017-11-30 15:26:48 +00:00
Igor Laevsky	faacdf8d54	[FuzzMutate] Don't create load as a new source if it doesn't match with the descriptor Differential Revision: https://reviews.llvm.org/D40394 llvm-svn: 319439	2017-11-30 15:24:41 +00:00
Igor Laevsky	444afc82c0	[FuzzMutate] Don't crash when we can't remove instruction from empty function Differential Revision: https://reviews.llvm.org/D40393 llvm-svn: 319438	2017-11-30 15:07:38 +00:00
Nemanja Ivanovic	db7e77047c	[PowerPC] Recommit r314244 with refactoring and off by default This re-commits everything that was pulled in r314244. The transformation is off by default (patch to enable it to follow). The code is refactored to have a single entry-point and provide fine-grained control over patterns that it selects. This patch also fixes the bugs in the original code. Everything that failed with the original patch has been re-tested with this patch (with the transformation turned on). So the patch to turn this on is soon to follow. Differential Revision: https://reviews.llvm.org/D38575 llvm-svn: 319434	2017-11-30 13:39:10 +00:00
Simon Pilgrim	bb791b3dbd	[X86][AVX512] Tag fcmp/ptest/ternlog instructions scheduler classes llvm-svn: 319433	2017-11-30 13:18:06 +00:00
Sean Eveson	a6bcd53d52	[MC] Function stack size section. Re applying after fixing issues in the diff, sorry for any painful conflicts/merges! Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128). The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary. There is a follow up change to add an option to clang. Thanks. Reviewers: hfinkel, MatzeB Reviewed By: MatzeB Subscribers: thegameg, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D39788 llvm-svn: 319430	2017-11-30 13:05:14 +00:00
Sean Eveson	661e4fbf83	Revert r319423: [MC] Function stack size section. I messed up the diff. llvm-svn: 319429	2017-11-30 12:43:25 +00:00
Diana Picus	f003d9ff95	[ARM GlobalISel] Bail out for byval Fallback if we have a byval parameter or argument since we don't support them yet. llvm-svn: 319428	2017-11-30 12:23:44 +00:00
Francis Visoiu Mistrih	93ef145862	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427	2017-11-30 12:12:19 +00:00
Simon Pilgrim	d1a7d0c3f1	[X86][AVX512] Tag binop/rounding/sae instructions scheduler classes llvm-svn: 319424	2017-11-30 12:01:52 +00:00
Sean Eveson	f77b4d2f38	[MC] Function stack size section. Summary: Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html I wasn't sure who to put as reviewers, so please add/remove people as appropriate. This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128). The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary. There is a follow up change to add an option to clang. Thanks. Reviewers: hfinkel, MatzeB Reviewed By: MatzeB Subscribers: thegameg, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D39788 llvm-svn: 319423	2017-11-30 12:01:16 +00:00
Sam Parker	4bd776e001	[DAGCombine] Refactor ReduceLoadWidth visitAND attempts to narrow the width of extending loads that are then masked off. ReduceLoadWidth already exists for a similar purpose and handles shifts, so I've moved the code to handle AND nodes there. Differential Revision: https://reviews.llvm.org/D39595 llvm-svn: 319421	2017-11-30 11:49:11 +00:00
Serge Guelton	24386867b8	Support generic lowering of vector bswap llvm-svn: 319419	2017-11-30 11:06:22 +00:00
Simon Pilgrim	3e5987cf8d	[X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes llvm-svn: 319418	2017-11-30 10:48:47 +00:00
Hiroshi Inoue	21e8ded4d2	Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores This reverts commit rL319407 due to failures in some buildbot. llvm-svn: 319410	2017-11-30 08:29:51 +00:00
Jonas Paulsson	b9a2467501	[SystemZ] Bugfix in adjustSubwordCmp. Csmith generated a program where a store after load to the same address did not get chained after the new load created during DAG legalizing, and so performed an illegal overwrite of the expected value. When the new zero-extending load is created, the chain users of the original load must be updated, which was not done previously. A similar case was also found and handled in lowerBITCAST. Review: Ulrich Weigand https://reviews.llvm.org/D40542 llvm-svn: 319409	2017-11-30 08:18:50 +00:00
Hiroshi Inoue	422e80aee2	[SROA] enable splitting for non-whole-alloca loads and stores Currently, SROA splits loads and stores only when they are accessing the whole alloca. This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is. The whole-alloca loads and stores meet this new condition and so they are still splittable. Here is a simplified motivating example. struct record { long long a; int b; int c; }; int func(struct record r) { for (int i = 0; i < r.c; i++) r.b++; return r.b; } When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument. With this patch, the above example is compiled into only few instructions without loop. Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers. Differential Revision: https://reviews.llvm.org/D32998 llvm-svn: 319407	2017-11-30 07:44:46 +00:00
Craig Topper	a495744d2c	[X86] Optimize avx2 vgatherqps for v2f32 with v2i64 index type. Normal type legalization will widen everything. This requires forcing 0s into the mask register. We can instead choose the form that only reads 2 elements without zeroing the mask. llvm-svn: 319406	2017-11-30 07:01:40 +00:00
Craig Topper	321a8b9b63	[X86] Make sure we don't remove sign extends of masks with AVX2 masked gathers. We don't use k-registers and instead use the MSB so we need to make sure we sign extend the mask to the msb. llvm-svn: 319405	2017-11-30 06:31:31 +00:00
Graham Yiu	70293fa27a	- Removed unused lambda (IsReturnBlock) causing build bots to fail for r319398 - Added lit testcases that were supposed to be part of r319398 llvm-svn: 319399	2017-11-30 03:36:57 +00:00
Graham Yiu	8b1882c186	With PGO information, we can do more aggressive outlining of cold regions in the inline candidate function. This contrasts with the scheme of keeping only the 'early return' portion of the inline candidate and outlining the rest of the function as a single function call. Support for outlining multiple regions of each function is added, as well as some basic heuristics to determine which regions are good to outline. Outline candidates limited to regions that are single-entry & single-exit. We also avoid outlining regions that produce live-exit variables, which may inhibit some forms of code motion (like commoning). Fallback to the regular partial inlining scheme is retained when either i) no regions are identified for outlining in the function, or ii) the outlined function could not be inlined in any of its callers. Differential Revision: https://reviews.llvm.org/D38190 llvm-svn: 319398	2017-11-30 02:41:36 +00:00
Matt Arsenault	caf0ed4d74	AMDGPU: Allow negative MUBUF vaddr for gfx9 GFX9 does not enable bounds checking for the resource descriptors used for private access, so it should be OK to use vaddr with a potentially negative value. llvm-svn: 319393	2017-11-30 00:52:40 +00:00
Vedant Kumar	80fbb85555	[Coverage] Use the most-recent completed region count (PR35437) This is a fix for the coverage segment builder. If multiple regions must be popped off the active stack at once, and more than one of them end at the same location, emit a segment using the count from the most-recent completed region. Fixes PR35437, rdar://35760630 Testing: invoked llvm-cov on a stage2 build of clang, additional unit tests, check-profile llvm-svn: 319391	2017-11-30 00:28:23 +00:00
Peter Collingbourne	9e3175bb6b	LowerTypeTests: Deduplicate code. NFC. llvm-svn: 319390	2017-11-30 00:27:08 +00:00
Peter Collingbourne	943aca3c27	LowerTypeTests: Remove unnecessary cast. NFC. llvm-svn: 319387	2017-11-30 00:02:55 +00:00
Craig Topper	56a41d4b3a	[X86] Remove some questionable looking code that seems to be looking through a VZEXT to create a larger VSEXT. If the input to the vzext was signed this would do the wrong thing. Not sure how to test this. llvm-svn: 319382	2017-11-29 23:08:25 +00:00
Joerg Sonnenberger	4b1acff9b3	First step towards more human-friendly PPC assembler output: - add -ppc-reg-with-percent-prefix option to use %r3 etc as register names - split off logic for Darwinish verbose conditional codes into a helper function - be explicit about Darwin vs AIX vs GNUish assembler flavors Based on the patch from Alexandre Yukio Yamashita Differential Revision: https://reviews.llvm.org/D39016 llvm-svn: 319381	2017-11-29 23:05:56 +00:00
Sam Clegg	da8d83f911	[WebAssembly] Update test expectations for gcc torture tests I believe these were recently fixed by: https://reviews.llvm.org/rL319186 Differential Revision: https://reviews.llvm.org/D40619 llvm-svn: 319380	2017-11-29 23:05:50 +00:00
Zachary Turner	52d036e693	[CodeView] Factor some code out of TypeTableBuilder. This class had some code that would automatically remap type indices before hashing and serializing. The only caller of this method was the TypeStreamMerger anyway, and the method doesn't make general sense, and prevents making certain future improvements to the class. So, factoring this up one level into the TypeStreamMerger where it belongs. llvm-svn: 319377	2017-11-29 22:41:56 +00:00
Craig Topper	cf461a0a32	[SelectionDAG][X86] Teach promotion legalization for fp_to_sint/fp_to_uint to insert an assertsext/assertzext based on the original type If we put in an assertsext/zext here, we're able to generate better truncate code using pack on pre-avx512 targets. Similar is already done during type legalization. This is the equivalent for op legalization Differential Revision: https://reviews.llvm.org/D40591 llvm-svn: 319368	2017-11-29 22:15:43 +00:00
Dan Gohman	580c102ab8	[WebAssembly] Fix fptoui lowering bounds To fully avoid trapping on wasm, fptoui needs a second check to ensure that the operand isn't below the supported range. llvm-svn: 319354	2017-11-29 20:20:11 +00:00
Krzysztof Parzyszek	f4dcc42e7b	[Hexagon] Remove HexagonISD::PACKHL llvm-svn: 319352	2017-11-29 19:59:29 +00:00
Krzysztof Parzyszek	6a8e5f4b0f	[Hexagon] Create helpers extractVector and insertVector in lowering llvm-svn: 319351	2017-11-29 19:58:10 +00:00
Simon Pilgrim	4d2c703492	[X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes (REVERSION) Accidental commit of incomplete patch llvm-svn: 319346	2017-11-29 19:37:38 +00:00
Zachary Turner	3e3936da93	Make TypeTableBuilder inherit from TypeCollection. A couple of places in LLD were passing references to TypeTableCollections around, which makes it hard to change the implementation at runtime. However, these cases only needed to iterate over the types in the collection, and TypeCollection already provides a handy abstract interface for this purpose. By implementing this interface, we can get rid of the need to pass TypeTableBuilder references around, which should allow us to swap the implementation at runtime in subsequent patches. llvm-svn: 319345	2017-11-29 19:35:21 +00:00
Simon Pilgrim	87034cb498	[X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes llvm-svn: 319338	2017-11-29 19:19:59 +00:00
Simon Pilgrim	36be852cee	[X86][AVX512] Tag 3OP (shuffles, double-shifts and GFNI) instructions scheduler classes llvm-svn: 319337	2017-11-29 18:52:20 +00:00
Nirav Dave	bafaa53c4d	[ARM][DAG] Revert Disable post-legalization store merge for ARM Partially reverting enabling of post-legalization store merge (r319036) for just ARM backend as it is causing incorrect code in some Thumb2 cases. llvm-svn: 319331	2017-11-29 18:06:13 +00:00
Simon Pilgrim	6a00970ade	[X86][AVX512] Add itinerary argument to all AVX512_maskable_* wrappers. NFCI All default to NoItinerary llvm-svn: 319326	2017-11-29 17:21:15 +00:00
Sander de Smalen	6a3bf1f84a	Reverted r319315 because of unused functions (due to PPR not yet being used by any instructions). llvm-svn: 319321	2017-11-29 15:14:39 +00:00
Simon Pilgrim	1401a75341	[X86][AVX512] Tag VPERMILV instruction scheduler class llvm-svn: 319316	2017-11-29 14:58:34 +00:00
Sander de Smalen	2b6338b2bc	[AArch64][SVE] Asm: Add SVE predicate register definitions and parsing support Summary: Patch [1/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. Reviewers: rengolin, kristof.beyls, fhahn, mcrosier, evandro, echristo, efriedma Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, tschuett Differential Revision: https://reviews.llvm.org/D40360 llvm-svn: 319315	2017-11-29 14:34:18 +00:00
Diana Picus	863b5b05f1	[ARM GlobalISel] Fix selecting G_BRCOND When lowering a G_BRCOND, we generate a TSTri of the condition against 1, which sets the flags, and then a Bcc which branches based on the value of the flags. Unfortunately, we were using the wrong condition code to check whether we need to branch (EQ instead of NE), which caused all our branches to do the opposite of what they were intended to do. This patch fixes the issue by using the correct condition code. llvm-svn: 319313	2017-11-29 14:20:06 +00:00
Simon Pilgrim	756348c1c9	[X86][AVX512] Setup unary (PABS/VPLZCNT/VPOPCNT/VPCONFLICT/VMOV*DUP) instruction scheduler classes llvm-svn: 319312	2017-11-29 13:49:51 +00:00
Dmitry Preobrazhensky	1ac7177abb	[AMDGPU][MC][GFX9] Corrected mapping of GFX9 v_add/sub/subrev_u32 When translating pseudo to MC, v_add/sub/subrev_u32 shall be mapped via a separate table as GFX8 has opcodes with the same names. These instructions shall also be labelled as renamed for pseudoToMCOpcode to handle them correctly. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D40550 llvm-svn: 319311	2017-11-29 13:33:40 +00:00
Simon Pilgrim	e3291de2b8	[X86][SSE] Merged sse2_unpack and sse2_unpack PUNPCK instruction templates. NFCI. llvm-svn: 319310	2017-11-29 12:12:27 +00:00
Simon Pilgrim	da95772230	[X86][SSE] Merged sse2_pack and sse2_pack_y PACKSS/PACKUS instruction templates. NFCI. llvm-svn: 319308	2017-11-29 11:35:45 +00:00
Max Kazantsev	9545a408b6	[SCEV][NFC] Break from loop after we found first non-Phi in getAddRecExprPHILiterally llvm-svn: 319306	2017-11-29 10:54:16 +00:00
Oliver Stannard	9ea2eaeb50	[ARM] Add support for armv7e-m to the .arch directive This will allow compilation of assembly files targeting armv7e-m without having to specify the Tag_CPU_arch attribute as a workaround. Differential revision: https://reviews.llvm.org/D40370 Patch by Ian Tessier! llvm-svn: 319303	2017-11-29 10:12:15 +00:00
Serguei Katkov	d4df744434	[CGP] Enable complex addr mode Enable complex addr modes after two critical fixes: rL319109 and rL319292 llvm-svn: 319302	2017-11-29 09:48:50 +00:00
Craig Topper	e3515001b9	[X86] Remove setOperationAction Promote for ISD::SINT_TO_FP MVT::v8i16/v16i8/v16i16. A DAG combine ensures these ops are always promoted to vXi32. llvm-svn: 319298	2017-11-29 08:19:36 +00:00
Max Kazantsev	1c3b622820	[SCEV][NFC] Remove condition that can never happen due to check few lines above llvm-svn: 319293	2017-11-29 06:10:36 +00:00
Serguei Katkov	5036459ae3	[CGP] Fix common type handling in optimizeMemoryInst If the common type is different we should bail out, because we will not be able to create a select or Phi of these values. Basically this is done in ExtAddrMode::compare; however, it does not work if we handle the null first and then two values of different types, so add a check in initializeMap as well. The check in ExtAddrMode::compare is used as an earlier bail-out. Reviewers: reames, john.brawn Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40479 llvm-svn: 319292	2017-11-29 05:51:26 +00:00
Sean Fertile	aab3ef76d9	[PowerPC] Relax the checking on AND/AND8 in isSignOrZeroExtended. Separate the handling of AND/AND8 out from PHI/OR/ISEL checking. The reasoning is the others need all their operands to be sign/zero extended for their output to also be sign/zero extended. This is true for AND and sign-extension, but for zero-extension we only need at least one of the input operands to be zero extended for the result to also be zero extended. Differential Revision: https://reviews.llvm.org/D39078 llvm-svn: 319289	2017-11-29 04:09:29 +00:00
Matt Arsenault	b655fa9ce2	DAG: Add nuw when splitting loads and stores The object can't straddle the address space wrap around, so I think it's OK to assume any offsets added to the base object pointer can't overflow. Similar logic already appears to be applied in SelectionDAGBuilder when lowering aggregate returns. llvm-svn: 319272	2017-11-29 01:25:12 +00:00
Adrian Prantl	5da51f435a	llvm-dwarfdump: honor the --show-children option when dumping a specific DIE. llvm-svn: 319271	2017-11-29 01:12:22 +00:00
Matt Arsenault	3f71c0e3ee	AMDGPU: Select DS insts without m0 initialization GFX9 stopped using m0 for most DS instructions. Select a different instruction without the use. I think this will be less error prone than trying to manually maintain m0 uses as needed. llvm-svn: 319270	2017-11-29 00:55:57 +00:00
Craig Topper	fbf7b3bf3e	[X86] Promote fp_to_sint v16f32->v16i16/v16i8 to avoid scalarization. llvm-svn: 319266	2017-11-29 00:32:09 +00:00
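
For reference, the motivating example quoted inline in the SROA commit above (422e80aee2, reverted by 21e8ded4d2) can be laid out as ordinary C. The struct and function are taken from the commit message; the comments are added here for illustration and only describe the source pattern, not the behavior of the SROA pass itself.

```c
/* Struct from the commit message. Per that message, on platforms such
   as x86_64 and ppc64 the fields b and c are packed into one 64-bit
   GPR when the struct is passed by value. */
struct record {
    long long a;
    int b;
    int c;
};

/* Function from the commit message: r.b is incremented once per loop
   iteration, so for r.c > 0 the result is the initial r.b plus r.c,
   and for r.c <= 0 it is the initial r.b unchanged. */
int func(struct record r) {
    for (int i = 0; i < r.c; i++)
        r.b++;
    return r.b;
}
```

As a usage sketch with made-up values, calling `func` on a record with `a = 0`, `b = 3`, `c = 4` returns 7. According to the commit message, this is the kind of loop that the patch was meant to let later optimizers eliminate.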


108717 Commits