llvm-project

Author	SHA1	Message	Date
Sanjay Patel	7227276d41	[InstCombine] canonicalize icmp predicate feeding select This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here. The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together. The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html ...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409 Differential Revision: https://reviews.llvm.org/D34242 llvm-svn: 306435	2017-06-27 17:53:22 +00:00
Dehao Chen	66131665c4	Enable ICP for AutoFDO. Summary: AutoFDO should have ICP enabled. Reviewers: davidxl Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D34662 llvm-svn: 306429	2017-06-27 17:23:33 +00:00
Xinliang David Li	d6f10a62c6	[ProfData] Make the method threadsafe llvm-svn: 306428	2017-06-27 17:21:51 +00:00
Coby Tayree	41a5b55f50	[X86][AsmParser][MS-compatability] Binary/Unary operators enhancements Introducing MOD binary operator https://msdn.microsoft.com/en-us/library/hha180wt.aspx Enhancing unary operators NEG and NOT, to support more complex patterns Differential Revision: https://reviews.llvm.org/D33876 llvm-svn: 306425	2017-06-27 16:58:27 +00:00
Paul Robinson	d66ee0f9a7	[DWARF] NFC: Make string-offset handling more like address-table handling; do the indirection and relocation all in the same method. llvm-svn: 306418	2017-06-27 15:40:18 +00:00
Gadi Haber	13759a7ed6	Updated and extended the information about each instruction in HSW and SNB to include the following data: • static latency • number of uOps of which the instruction consists • all ports used by the instruction Reviewers: RKSimon zvi aymanmus m_zuckerman Differential Revision: https://reviews.llvm.org/D33897 llvm-svn: 306414	2017-06-27 15:05:13 +00:00
Sam Kolton	a179d25b99	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allows an omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if the SDWA pseudo instruction has an OMod operand and then copies it. 2. There were several problems with support of VOPC instructions in the SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 llvm-svn: 306413	2017-06-27 15:02:23 +00:00
Matthew Simpson	0bd79f416a	[AArch64] Update successor probabilities after ccmp-conversion This patch modifies the conditional compares pass so that it keeps successor probabilities up-to-date after the conversion. Previously, successor probabilities were being normalized to a uniform distribution, even though they may have been heavily biased prior to the conversion (e.g., if one of the edges was the back edge of a loop). This loss of information affected passes later in the pipeline. Differential Revision: https://reviews.llvm.org/D34109 llvm-svn: 306412	2017-06-27 15:00:22 +00:00
Anna Thomas	dc935a6eb6	[LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410	2017-06-27 14:14:35 +00:00
Simon Dardis	4155c8f1f3	[mips] Add instruction aliases for ds(r|l)l. Add the instruction aliases for ds(r|l)l for the two operand alias of ds(r|l)lv and the aliases ds(r|l)l with the three register operands. llvm-svn: 306405	2017-06-27 13:35:17 +00:00
Hiroshi Inoue	84aafee4fb	[SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix assertion failure When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set the dereferenceable flag for a created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, and it may also miss optimization opportunities. This patch sets the dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable. Differential Revision: https://reviews.llvm.org/D34679 llvm-svn: 306404	2017-06-27 12:43:08 +00:00
Ayman Musa	721d97f7b8	Recommitting rL305465 after fixing bug in TableGen in rL306251 & rL306371 [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions). AVX512 compare instructions return v*i1 types. In cases where the number of elements in the returned value is less than 8, clang adds zeroes to get a mask of v8i1 type. Later on it's replaced with CONCAT_VECTORS, which then is lowered to many DAG nodes including insert/extract element and shift right/left nodes. The fact that AVX512 compare instructions put the result in a k register and zero all its upper bits allows us to remove the extra nodes simply by copying the result to the required register class. When lowering, identify these cases and transform them into an INSERT_SUBVECTOR node (marked legal), then catch this pattern in the instruction selection phase and transform it into one avx512 cmp instruction. Differential Revision: https://reviews.llvm.org/D33188 llvm-svn: 306402	2017-06-27 12:08:37 +00:00
Hiroshi Inoue	6a391bbf40	fix trivial typos, NFC succesor -> successor llvm-svn: 306393	2017-06-27 10:35:37 +00:00
Diana Picus	0e74a134f8	[ARM] GlobalISel: Support G_SELECT for pointers All we need to do is mark it as legal, otherwise it's just like s32. llvm-svn: 306390	2017-06-27 10:29:50 +00:00
Daniel Sanders	cc36dbf55d	[globalisel][tablegen] Add support for EXTRACT_SUBREG. Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388	2017-06-27 10:11:39 +00:00
Simon Dardis	3e0d39e403	[mips] Refine the condition for when to use CALL16 vs a GOT displacement. Borrow from the logic for 'jal' in MipsAsmParser::processInstruction and add the extra condition of bypassing CALL16 if the destination symbol is an ELF symbol with STB_LOCAL binding. Patch by: John Baldwin Reviewers: sdardis Differential Revision: https://reviews.llvm.org/D33999 llvm-svn: 306387	2017-06-27 10:11:11 +00:00
Diana Picus	7145d22f81	[ARM] GlobalISel: Support G_SELECT for i32 * Mark as legal for (s32, i1, s32, s32) * Map everything into GPRs * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set llvm-svn: 306382	2017-06-27 09:19:51 +00:00
Ayal Zaks	fc1e210d44	Recommitting 306331. Undoing revert 306338 after fixed bug: add metadata to the load instead of the reverse shuffle added to it, retaining the original ValueMap implementation. llvm-svn: 306381	2017-06-27 08:41:19 +00:00
Chandler Carruth	3f81d8024c	[SROA] Fix PR32902 by more carefully propagating !nonnull metadata. This is based heavily on the work done in D34285. I mostly wanted to do test cleanup for the author to save them some time, but I had a really hard time understanding why it was so hard to write better test cases for these issues. The problem is that because SROA does a second rewrite of the loads and because we don't propagate !nonnull for non-pointer loads, we first introduced invalid !nonnull metadata and then stripped it back off just in time to avoid most ways of this PR manifesting. Moving to the more careful utility only fixes this by changing the predicate to look at the new load's type rather than the target type. However, that does fix the bug, and the utility is much nicer including adding range metadata to model the nonnull property after a conversion to an integer. However, we have bigger problems because we don't actually propagate range metadata, and the utility to do this extracted from instcombine isn't really in good shape to do this currently. It only handles the case of copying range metadata from an integer load to a pointer load. It doesn't even handle the trivial cases of propagating from one integer load to another when they are the same width! This utility will need to be beefed up prior to using in this location to get the metadata to fully survive. And even then, we need to go and teach things to turn the range metadata into an assume the way we do with nonnull so that when we promote an integer we don't lose the information. All of this will require a new test case that looks kind-of like `preserve-nonnull.ll` does here but focuses on range metadata. It will also likely require more testing because it needs to correctly handle changes to the integer width, especially as SROA actively tries to change the integer width! Last but not least, I'm a little worried about hooking the range metadata up here because the instcombine logic for converting from a range metadata to a nonnull metadata node seems broken in the face of non-zero address spaces where null is not mapped to the integer `0`. So that probably needs to get fixed with test cases both in SROA and in instcombine to cover it. But this does extract the core PR fix from D34285 of preventing the !nonnull metadata from being propagated in a broken state just long enough to feed into promotion and crash value tracking. On D34285 there is some discussion of zero-extend handling because it isn't necessary. First, the new load size covers all of the non-undef (i.e., possibly initialized) bits. This may even extend past the original alloca if loading those bits could produce valid data. The only way it's valid for us to zero-extend an integer load in SROA is if the original code had a zero extend or those bits were undef. And we get to assume things like undef never satisfies nonnull, so non undef bits can participate here. No need to special case the zero-extend handling, it just falls out correctly. The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to save a few rounds of trivial edits fixing style issues and test case formulation. Differential Revision: D34285 llvm-svn: 306379	2017-06-27 08:32:03 +00:00
Nicolai Haehnle	43cc6c4e0f	AMDGPU: M0 operands to spill/restore opcodes are dead Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 llvm-svn: 306375	2017-06-27 08:04:13 +00:00
Galina Kistanova	06a0e0e6a9	Fixed the warning introduced by r306289 to make ubuntu-gcc7.1-werror bot green. llvm-svn: 306369	2017-06-27 06:58:57 +00:00
Mikael Holmen	37b5120a9a	[Reassociate] Make sure EraseInst sets MadeChange Summary: EraseInst didn't report that it made IR changes through MadeChange. It is essential that changes to the IR are reported correctly, since for example ReassociatePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). Reviewers: craig.topper, rnk, davide Reviewed By: rnk, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34616 llvm-svn: 306368	2017-06-27 05:32:13 +00:00
Hiroshi Inoue	5102028f63	[PowerPC] set optimization level in SelectionDAGISel PowerPC backend does not pass the current optimization level to SelectionDAGISel and so SelectionDAGISel works with the default optimization level regardless of the current optimization level. This patch makes the PowerPC backend set the optimization level correctly. Differential Revision: https://reviews.llvm.org/D34615 llvm-svn: 306367	2017-06-27 04:52:17 +00:00
Leslie Zhai	c9d9d7976a	[AVR] Migrate to new MCAsmBackend applyFixup and processFixupValue Reviewers: rafael, dylanmckay, jroelofs, meadori Reviewed By: rafael, meadori Subscribers: meadori, llvm-commits Differential Revision: https://reviews.llvm.org/D34551 llvm-svn: 306359	2017-06-27 03:29:27 +00:00
Davide Italiano	31d4c1bbbc	[CFLAA] Move a common function to the header to reduce duplication. Differential Revision: https://reviews.llvm.org/D34660 llvm-svn: 306354	2017-06-27 02:25:06 +00:00
Matthias Braun	e2ae001982	ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags. Remove invalid shortcut in fixupKills(): A register needs to be marked live even when we are not adding a kill flag. This is because a partially live register must not get a kill flag, but it still needs to be fully marked live when walking backwards. llvm-svn: 306352	2017-06-27 00:58:48 +00:00
Davide Italiano	604c003f5f	[CFLAA] Use raw pointers instead of Optional<Pointer>. NFC. Using Optional<> here doesn't seem to be terribly valuable, but this is not the main point of this change. The change enables us to merge the (now) two identical copies of parentFunctionOfValue() that Steensgaard's and Andersens' provide. llvm-svn: 306351	2017-06-27 00:33:37 +00:00
Davide Italiano	e34a806431	[CFLAA] Change FunctionHandle to be common to Steensgaard's and Andersens' Differential Revision: https://reviews.llvm.org/D34638 llvm-svn: 306348	2017-06-26 23:59:14 +00:00
Wolfgang Pieb	9f65858235	DAGCombine: Make sure we only eliminate trunc/extend when the scales of truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345	2017-06-26 23:05:51 +00:00
Dehao Chen	8b7effb344	revert r306336 for breaking ppc test. llvm-svn: 306344	2017-06-26 23:05:35 +00:00
Eugene Zelenko	76bf48d932	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 306341	2017-06-26 22:44:03 +00:00
Vedant Kumar	71b3d721fd	[Coverage] Improve readability by using a struct. NFC. llvm-svn: 306340	2017-06-26 22:33:06 +00:00
Ayal Zaks	3923c0c46b	reverting 306331. Causes TBAA metadata to be generated on reverse shuffles, investigating. llvm-svn: 306338	2017-06-26 22:26:54 +00:00
Dehao Chen	79655792cc	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306336	2017-06-26 21:41:09 +00:00
Dehao Chen	38f1bc7834	Fix the bug when handling shufflevector for aarch64. Summary: This Fixes https://bugs.llvm.org/show_bug.cgi?id=33600 Reviewers: mssimpso, davidxl, Carrot Reviewed By: mssimpso Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34641 llvm-svn: 306334	2017-06-26 21:33:51 +00:00
Matt Arsenault	53fae0772a	RenameIndependentSubregs: Fix iterator problem Fixes bug 33597. Use of substituteRegister in the tied operand case messes up the register use iterator, causing some uses to be left unprocessed. llvm-svn: 306333	2017-06-26 21:33:36 +00:00
Ayal Zaks	e7e15d186b	[LV] Changing the interface of ValueMap, NFC. Instead of providing access to the internal MapStorage holding all Values associated with a given Key, used for setting or resetting them all together, ValueMap keeps its MapStorage internal; its new interface allows getting, setting or resetting a single Value, per part or per part-and-lane. Follows the discussion in https://reviews.llvm.org/D32871. Differential Revision: https://reviews.llvm.org/D34473 llvm-svn: 306331	2017-06-26 21:03:51 +00:00
Tim Northover	c2d5e6d637	AArch64: legalize G_EXTRACT operations. This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328	2017-06-26 20:34:13 +00:00
Paul Robinson	36e85a867b	[DWARF] NFC: Give DwarfFormat a 1-byte base type. In particular this reduces DWARFFormParams from 64 to 32 bits; pass it around by value. llvm-svn: 306324	2017-06-26 19:52:32 +00:00
Tim Northover	9ac3e42211	AArch64: remove all kill flags when extending register liveness. When we forward a stored value to a load and eliminate it entirely we need to make sure the liveness of the register is maintained all the way to its use. Previously we only cleared liveness on the store doing the forwarding, but there could be other killing uses in between. We already do the right thing when the load has to be converted into something else, it was just this one path that skipped it. llvm-svn: 306318	2017-06-26 18:49:25 +00:00
Paul Robinson	75c068c50b	[DWARF] NFC: Collect info used by DWARFFormValue into a helper. Some forms have sizes that depend on the DWARF version, DWARF format (32/64-bit), or the size of an address. Collect these into a struct to simplify passing them around. Require callers to provide one when they query a form's size. Differential Revision: http://reviews.llvm.org/D34570 llvm-svn: 306315	2017-06-26 18:43:01 +00:00
Wei Mi	71f06420e4	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes three bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of the loop in the last iteration cannot be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. llvm-svn: 306313	2017-06-26 18:16:10 +00:00
Matt Arsenault	f28683cf51	AMDGPU: Setup SP/FP in callee function prolog/epilog llvm-svn: 306312	2017-06-26 17:53:59 +00:00
Eric Beckmann	2a81089116	Replace trivial use of external rc.exe by writing our own .res file. This patch removes the dependency on the external rc.exe tool by writing a simple .res file using our own library. In this patch I also added an explicit definition for the .res file magic. Furthermore, I added a unittest for embedded manifests and fixed a bug exposed by the test. llvm-svn: 306311	2017-06-26 17:43:30 +00:00
Ulrich Weigand	af98b748f6	[SystemZ] Fix missing emergency spill slot corner case We sometimes need emergency spill slots for the register scavenger. This may be the case when code needs to access a stack slot that has an offset of 4096 or more relative to the stack pointer. To make that determination, processFunctionBeforeFrameFinalized currently simply checks the total stack frame size of the current function. But this is not enough, since code may need to access stack slots in the caller's stack frame as well, in particular incoming arguments stored on the stack. This commit fixes the problem by taking argument slots into account. llvm-svn: 306305	2017-06-26 16:50:32 +00:00
Marina Yatsina	f58dcb85d2	[inline asm] dot operator while using imm generates wrong ir + asm - llvm part Inline asm dot operator while using imm generates wrong ir and asm This also fixes bugzilla 32987: https://bugs.llvm.org//show_bug.cgi?id=32987 The clang part of the review that contains the test can be found here: https://reviews.llvm.org/D33040 commit on behalf of zizhar Differential Revision: https://reviews.llvm.org/D33039 llvm-svn: 306300	2017-06-26 16:03:42 +00:00
Ahmed Bougacha	58a197414e	[X86][AVX-512] Don't raise inexact in ceil, floor, round, trunc. The non-AVX-512 behavior was changed in r248266 to match N1778 (C bindings for IEEE-754 (2008)), which defined the four functions to not raise the inexact exception ("rint" is still defined as raising it). Update the AVX-512 lowering of these functions to match that: it should not be different. llvm-svn: 306299	2017-06-26 16:00:24 +00:00
Tom Stellard	eb8f1e27d9	AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 llvm-svn: 306298	2017-06-26 15:56:52 +00:00
Sanjay Patel	15748d239e	[x86] transform vector inc/dec to use -1 constant (PR33483) Convert vector increment or decrement to sub/add with an all-ones constant: add X, <1, 1...> --> sub X, <-1, -1...> sub X, <1, 1...> --> add X, <-1, -1...> The all-ones vector constant can be materialized using a pcmpeq instruction that is commonly recognized as an idiom (has no register dependency), so that's better than loading a splat 1 constant. AVX512 uses 'vpternlogd' for 512-bit vectors because there is apparently no better way to produce 512 one-bits. The general advantages of this lowering are: 1. pcmpeq has lower latency than a memop on every uarch I looked at in Agner's tables, so in theory, this could be better for perf, but... 2. That seems unlikely to affect any OOO implementation, and I can't measure any real perf difference from this transform on Haswell or Jaguar, but... 3. It doesn't look like it from the diffs, but this is an overall size win because we eliminate 16 - 64 constant bytes in the case of a vector load. If we're broadcasting a scalar load (which might itself be a bug), then we're replacing a scalar constant load + broadcast with a single cheap op, so that should always be smaller/better too. 4. This makes the DAG/isel output more consistent - we use pcmpeq already for padd x, -1 and psub x, -1, so we should use that form for +1 too because we can. If there's some reason to favor a constant load on some CPU, let's make the reverse transform for all of these cases (either here in the DAG or in a later machine pass). This should fix: https://bugs.llvm.org/show_bug.cgi?id=33483 Differential Revision: https://reviews.llvm.org/D34336 llvm-svn: 306289	2017-06-26 14:19:26 +00:00
Krzysztof Parzyszek	918e6d70bd	[Hexagon] Handle cases when the aligned stack pointer is missing llvm-svn: 306288	2017-06-26 14:17:58 +00:00
Jonas Paulsson	8c33647ba1	[SystemZ] Add a check against zero before calling getTestUnderMaskCond() Csmith discovered that this function can be called with a zero argument, in which case an assert for this triggered. This patch also adds a guard before the other call to this function since it was missing, although the test only covers the case where it was discovered. Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll. Review: Ulrich Weigand llvm-svn: 306287	2017-06-26 13:38:27 +00:00
Mikael Holmen	45bd32f9ad	[IfConversion] Hoist removeBranch calls out of if/else clauses [NFC] Summary: Also added a comment. Pulled out of https://reviews.llvm.org/D34099. Reviewers: iteratee Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34388 llvm-svn: 306279	2017-06-26 09:33:04 +00:00
Craig Topper	700892fd89	[IR] Rename BinaryOperator::init to AssertOK and remove argument. Replace default case in switch with llvm_unreachable since all valid opcodes are covered. This method doesn't do any initializing. It just contains asserts. So renaming to AssertOK makes it consistent with similar instructions in other Instruction classes. llvm-svn: 306277	2017-06-26 07:15:59 +00:00
Serguei Katkov	0e70206c8f	This reverts commit r306272. Revert "[MBP] do not rotate loop if it creates extra branch" It breaks the sanitizer build bots. Need to fix this. llvm-svn: 306276	2017-06-26 06:51:45 +00:00
Serguei Katkov	b01fff06ed	[MBP] do not rotate loop if it creates extra branch This is the last fix for the corner case of PR32214. Actually, this is not really a corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C, where H is the header with viable fallthrough from the pre-header and an exit from the loop; M is some middle block; B is a backedge to the header but also with an exit from the loop; C is some cold block of the loop. Let H be determined as the best exit. If we do a loop rotation M, B, C, H we can introduce an extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header; -1 branch from header to exit; +1 branch from header to middle block, if there is one; -1 branch from cold block to header, if there is one. So if C is not a predecessor of H then we introduce an extra branch. This change actually prohibits rotation of the loop if both are true: 1) Best Exit has the next element in the chain as a successor. 2) The last element in the chain is not a predecessor of the first element of the chain. Reviewers: iteratee, xur Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34271 llvm-svn: 306272	2017-06-26 05:27:27 +00:00
Davide Italiano	9a02494230	[CFL-AA] Remove unneeded function declaration. NFCI. llvm-svn: 306268	2017-06-26 03:55:41 +00:00
Chandler Carruth	2abb65ae11	[InstCombine] Factor the logic for propagating !nonnull and !range metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267	2017-06-26 03:31:31 +00:00
Matt Arsenault	8bcf2f20a7	AMDGPU: Whitespace fixes llvm-svn: 306265	2017-06-26 03:01:36 +00:00
Matt Arsenault	10fc062b2b	AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and require the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects. llvm-svn: 306264	2017-06-26 03:01:31 +00:00
Chandler Carruth	4a000883c7	[LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr. This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case. The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc. The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty. If you have code that looks just right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist. I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes. I've also added a test case that does manage to catch this while also giving some nice positive coverage in the face of indirectbr. The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug. It was hard to reduce the original test case because in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors which in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order which is better than anything bugpoint could do... llvm-svn: 306257	2017-06-25 22:45:31 +00:00
Davide Italiano	f15fb368a3	[MemDep] Cleanup return after else & use `auto`. NFC. llvm-svn: 306255	2017-06-25 22:12:59 +00:00
Anna Thomas	e7cb633d29	[LoopDeletion] NFC: Move phi node value setting into prepass Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator, which caused loop deletion tests to hang due to infinite loop. Had reverted it in rL306162. rL306157 commit message: Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306254	2017-06-25 21:13:58 +00:00
Daniel Jasper	4c6cd4ccb7	Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility." This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252	2017-06-25 17:58:25 +00:00
Craig Topper	d1fbb38475	[IR] Use isIntOrIntVectorTy instead of writing it out the long way. NFC llvm-svn: 306250	2017-06-25 17:33:48 +00:00
Simon Pilgrim	c338ba48fc	[X86][SSE] Remove unused memopfsf32_128/memopfsf64_128 scalar memops The 'scalar' simd bitops were dropped a while ago llvm-svn: 306248	2017-06-25 17:04:58 +00:00
Simon Pilgrim	bed1fa1ac1	Strip trailing whitespace. NFCI. llvm-svn: 306247	2017-06-25 16:57:46 +00:00
Sanjay Patel	2f3ead7adc	[InstCombine] add (sext i1 X), 1 --> zext (not X) http://rise4fun.com/Alive/i8Q A narrow bitwise logic op is obviously better than math for value tracking, and zext is better than sext. Typically, the 'not' will be folded into an icmp predicate. The IR difference would even survive through codegen for x86, so we would see worse code: https://godbolt.org/g/C14HMF one_or_zero(int, int): # @one_or_zero(int, int) xorl %eax, %eax cmpl %esi, %edi setle %al retq one_or_zero_alt(int, int): # @one_or_zero_alt(int, int) xorl %ecx, %ecx cmpl %esi, %edi setg %cl movl $1, %eax subl %ecx, %eax retq llvm-svn: 306243	2017-06-25 14:15:28 +00:00
Elena Demikhovsky	72f991cded	AVX-512: Fixed a crash during legalization of <3 x i8> type The compiler fails with an assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated to <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>. Differential Revision: https://reviews.llvm.org/D34503 llvm-svn: 306242	2017-06-25 13:36:20 +00:00
Xin Tong	70f7512add	[AST] Fix a bug in aliasesUnknownInst. Make sure we are comparing the unknown instructions in the alias set and the instruction interested in. Summary: Make sure we are comparing the unknown instructions in the alias set and the instruction interested in. I believe this is clearly a bug (missed opportunity). I can also add some test cases if desired. Reviewers: hfinkel, davide, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34597 llvm-svn: 306241	2017-06-25 12:55:11 +00:00
Igor Breger	f5035d6ee5	[GlobalISel][X86] Support vector type G_EXTRACT selection. Summary: Support vector type G_EXTRACT selection. For now G_EXTRACT marked as legal for any type, so nothing to do in legalizer. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33957 llvm-svn: 306240	2017-06-25 11:42:17 +00:00
Dorit Nuzman	e0e0f1ddb0	[AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2 The cost of an interleaved access was only implemented for AVX512. For other X86 targets an overly conservative Base cost was returned, resulting in avoiding vectorization where it is actually profitable to vectorize. This patch starts to add costs for AVX2 for most prominent cases of interleaved accesses (stride 3,4 chars, for now). Note1: Improvements of up to ~4x were observed in some of EEMBC's rgb workloads; There is also a known issue of 15-30% degradations on some of these workloads, associated with an interleaved access followed by type promotion/widening; the resulting shuffle sequence is currently inefficient and will be improved by a series of patches that extend the X86InterleavedAccess pass (such as D34601 and more to follow). Note 2: The costs in this patch do not reflect port pressure penalties which can be very dominant in the case of interleaved accesses since most of the shuffle operations are restricted to a single port. Further tuning, that may incorporate these considerations, will be done on top of the upcoming improved shuffle sequences (that is, along with the abovementioned work to extend X86InterleavedAccess pass). Differential Revision: https://reviews.llvm.org/D34023 llvm-svn: 306238	2017-06-25 08:26:25 +00:00
Ed Schouten	3370e19725	Add support for Ananas platform Ananas is a home-brew operating system, mainly for amd64 machines. After using GCC for quite some time, it has switched to clang and never looked back - yet, having to manually patch things is annoying, so it'd be much nicer if this was in the official tree. More information: https://github.com/zhmu/ananas/ https://rink.nu/projects/ananas.html Submitted by: Rink Springer Differential Revision: https://reviews.llvm.org/D32937 llvm-svn: 306237	2017-06-25 08:19:37 +00:00
Zachary Turner	1affd805fc	[pdb] Fix reading of llvm-generated PDBs by cvdump. If you dump a pdb to yaml, and then round-trip it back to a pdb, and run cvdump -l <file> on the new pdb, cvdump will generate output such as this. * LINES Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj" Error: Line number corrupted: invalid file id 0 <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3 5 00000010 6 00000013 7 00000018 Note the error message about the corrupted line number. It turns out that the problem is that cvdump cannot find the /names stream (e.g. the global string table), and the reason it can't find the /names stream is because it doesn't understand the NameMap that we serialize which tells pdb consumers which stream has the string table. Some experimentation shows that if we add items to the hash table in a specific order before serializing it, cvdump can read it. This suggests that either we're using the wrong hash function, or we're serializing something incorrectly, but it will take some deeper investigation to figure out how / why. For now, this at least allows cvdump to read our line information (and incidentally, produces an identical byte sequence to what Microsoft tools produce when writing the named stream map). Differential Revision: https://reviews.llvm.org/D34491 llvm-svn: 306233	2017-06-25 03:51:42 +00:00
Xinliang David Li	b67530e9b9	[PGO] Implement profile counter register promotion Differential Revision: http://reviews.llvm.org/D34085 llvm-svn: 306231	2017-06-25 00:26:43 +00:00
Craig Topper	010203964d	[SCEV] Avoid copying ConstantRange just to get the min/max value Summary: This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap caches. We then take advantage of that to add new helper methods that can return min/max value of a signed or unsigned ConstantRange using that reference without first copying the ConstantRange. getRangeRef calls itself recursively and I believe the reference return is fine for those calls. I've left getSignedRange and getUnsignedRange returning a ConstantRange object so they will make a copy now. This is to ensure safety since the reference will be invalidated if the DenseMap changes. I'm sure there are still more places that can take advantage of the reference and I'll submit future patches as I find them. Reviewers: sanjoy, davide Reviewed By: sanjoy Subscribers: zzheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D32978 llvm-svn: 306229	2017-06-24 23:34:50 +00:00
Hiroshi Inoue	a85d24b73d	fix trivial typos in comment, NFC llvm-svn: 306211	2017-06-24 16:00:26 +00:00
Hiroshi Inoue	b300824ee7	fix trivial typos in comment, NFC dereferencable -> dereferenceable llvm-svn: 306210	2017-06-24 15:43:33 +00:00
Hiroshi Inoue	95f24dca98	[SelectionDAG] set dereferenceable flag when expanding memcpy/memmove When SelectionDAG expands a memcpy (or memmove) call into a sequence of load and store instructions, it disregards the dereferenceable flag even if the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG set the dereferenceable flag for the load instructions properly to avoid the assertion failure. Differential Revision: https://reviews.llvm.org/D34467 llvm-svn: 306209	2017-06-24 15:17:38 +00:00
Craig Topper	8bec6a4e1c	[IR][AssumptionCache] Add m_Shift and m_BitwiseLogic matchers to replace a couple m_CombineOr Summary: m_CombineOr isn't very efficient. The code using it is also quite verbose. This patch adds m_Shift and m_BitwiseLogic matchers to make the using code more concise and improve the match efficiency. Reviewers: spatel, davide Reviewed By: davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D34593 llvm-svn: 306206	2017-06-24 06:27:14 +00:00
Craig Topper	7b66ffe875	[ValueTracking][InstCombine] Use m_Shr instead of m_CombineOr(m_LShr, m_AShr). NFC llvm-svn: 306205	2017-06-24 06:24:04 +00:00
Craig Topper	72ee6945af	[Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC llvm-svn: 306204	2017-06-24 06:24:01 +00:00
Rafael Espindola	6418856127	Simplify the processFixupValue interface. NFC. llvm-svn: 306202	2017-06-24 05:22:28 +00:00
Rafael Espindola	daaee7151b	Remove a processFixupValue hack. The intention of processFixupValue is not to redefine the semantics of MCExpr. It is odd enough that an expression lowers to a PCRel MCExpr or not depending on what it looks like. At least it is a local hack now. I left a fix for anyone trying to figure out what producers should be producing a different expression. llvm-svn: 306200	2017-06-24 05:12:29 +00:00
Vitaly Buka	df19ad456e	[InstCombine] Don't replace allocas with smaller globals Summary: InstCombine replaces large allocas with small global constants, causing buffer overflows on valid code, see PR33372. This fix permits this optimization only if the global is dereferenceable for the alloca size. Fixes PR33372 Reviewers: eugenis, majnemer, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34311 llvm-svn: 306194	2017-06-24 01:35:19 +00:00
Vitaly Buka	9c2a036276	Make visible isDereferenceableAndAlignedPointer(..., const APInt &Size, ...) Summary: Used by D34311 and D34467 Reviewers: hfinkel, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34585 llvm-svn: 306193	2017-06-24 01:35:13 +00:00
Derek Schuff	d2c9ec7bc7	[WebAssembly] Fix build after r306177 llvm-svn: 306190	2017-06-24 01:00:43 +00:00
Rafael Espindola	f351292141	Remove redundant argument. llvm-svn: 306189	2017-06-24 00:26:57 +00:00
Lang Hames	cd9d49b605	[ORC] Re-apply r306166 and r306168 with fix for regression test. llvm-svn: 306182	2017-06-23 23:25:28 +00:00
Zachary Turner	fa33282774	[llvm-pdbutil] Dump raw bytes of module symbols and debug chunks. llvm-svn: 306179	2017-06-23 23:08:57 +00:00
Rafael Espindola	86c664f9d7	Move Value adjustment to applyFixup. NFC. llvm-svn: 306178	2017-06-23 23:05:15 +00:00
Rafael Espindola	801b42de31	ARM: move some logic from processFixupValue to applyFixup. processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks. While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped because of thumb2. We now do it again, but use the thumb2 range. llvm-svn: 306177	2017-06-23 22:52:36 +00:00
Rafael Espindola	f6242c3e90	This reverts commit r306166 and r306168. Revert "[ORC] Remove redundant semicolons from DEFINE_SIMPLE_CONVERSION_FUNCTIONS uses." Revert "[ORC] Move ORC IR layer interface from addModuleSet to addModule and fix the module type as std::shared_ptr<Module>." They broke ExecutionEngine/OrcMCJIT/test-global-ctors.ll on linux. llvm-svn: 306176	2017-06-23 22:50:24 +00:00
Petar Jovanovic	53dbfb3798	Reland r306095: [mips] Fix reg positions in the aui/daui instructions After fixing a failing test in the lld test suite (r306173), reland r306095. Original commit message: [mips] Fix register positions in the aui/daui instructions Swapped the position of the rt and rs register in the aui/daui instructions for mips32r6 and mips64r6. With this change, the format of the generated instructions complies with specifications and GCC. Patch by Milos Stojanovic. llvm-svn: 306174	2017-06-23 22:37:19 +00:00
Geoff Berry	dd239718bd	[AArch64][Falkor] Remove some non-existent opcodes from sched detail regexes. NFC. llvm-svn: 306170	2017-06-23 21:59:09 +00:00
Eugene Zelenko	2db0cfa617	[DebugInfo] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 306169	2017-06-23 21:57:40 +00:00
Lang Hames	eabde9306d	[ORC] Remove redundant semicolons from DEFINE_SIMPLE_CONVERSION_FUNCTIONS uses. llvm-svn: 306168	2017-06-23 21:56:09 +00:00
Zachary Turner	c2f5b4bfd9	[llvm-pdbutil] Dump raw bytes of type and id records. llvm-svn: 306167	2017-06-23 21:50:54 +00:00
Lang Hames	2c19c1be56	[ORC] Move ORC IR layer interface from addModuleSet to addModule and fix the module type as std::shared_ptr<Module>. llvm-svn: 306166	2017-06-23 21:45:29 +00:00
Anna Thomas	77a2e6b198	Revert "[LoopDeletion] NFC: Move phi node value setting into prepass" This reverts commit r306157. It caused some timeouts in clang tests. Perhaps unreachable loops have far too many phi nodes. Reverting and investigating. llvm-svn: 306162	2017-06-23 21:30:48 +00:00
Zachary Turner	dd73968256	[llvm-pdbutil] Dump raw bytes of various DBI stream subsections. llvm-svn: 306160	2017-06-23 21:11:54 +00:00

104110 Commits