llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	55913ead3d	[X86] Create some wrapper multiclasses to create AVX and SSE shift instructions with less repeated code. NFC llvm-svn: 276085	2016-07-20 05:05:44 +00:00
David Majnemer	a75736087d	Forgot to add a test for r276008. llvm-svn: 276082	2016-07-20 04:13:05 +00:00
David Majnemer	5d26127752	Revert "Disable this-return argument forwarding on ARM/AArch64" Inference of the 'returned' attribute was fixed in r276008, lets try turning the backend support back on. This reverts commit r275677. llvm-svn: 276081	2016-07-20 04:13:01 +00:00
Adam Nemet	67c8929a2c	[LV] Add hotness attribute to missed-optimization remarks The new OptimizationRemarkEmitter analysis pass is hooked up to both new and old PM passes. llvm-svn: 276080	2016-07-20 04:03:43 +00:00
Michael Zolotukhin	6bc56d552a	Revert "Revert r275883 and r275891. They seem to cause PR28608." This reverts commit r276064, and thus reapplies r275891 and r275883 with a fix for PR28608. llvm-svn: 276077	2016-07-20 01:55:27 +00:00
Saleem Abdulrasool	6d9ca182fe	llvm-readobj: add some more aliases Alias -d and -t from readelf in llvm-readobj which effectively replaces the tool. llvm-svn: 276075	2016-07-20 01:16:28 +00:00
Justin Lebar	6114b37838	[LSV] Don't assume that loads/stores appear in address order in the BB. Summary: getVectorizablePrefix previously didn't work properly in the face of aliasing loads/stores. It unwittingly assumed that the loads/stores appeared in the BB in address order. If they didn't, it would do the wrong thing. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22535 llvm-svn: 276072	2016-07-20 00:55:12 +00:00
Yunzhong Gao	1a01287e5e	Fixing a few places in this doc which look like obvious typos. llvm-svn: 276070	2016-07-20 00:40:54 +00:00
Matthias Braun	5b9722d6c7	Revert "RegScavenging: Add scavengeRegisterBackwards()" Reverting this commit for now as it seems to be causing failures on test-suite tests on the clang-ppc64le-linux-lnt bot. This reverts commit r276044. llvm-svn: 276068	2016-07-20 00:21:32 +00:00
Kyle Butt	d2b886e569	Codegen: Tail Duplication: Only duplicate into layout pred if it is a CFG Pred. Add a check that the layout predecessor of a block is an actual CFG predecessor of the block as well. No current code fails this check, but upcoming patches can trigger this, and it makes sense to separate it out. llvm-svn: 276066	2016-07-20 00:01:51 +00:00
Sean Silva	554efb28d2	Revert r275883 and r275891. They seem to cause PR28608. Revert "[LoopSimplify] Update LCSSA after separating nested loops." This reverts commit r275891. Revert "[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form." This reverts commit r275883. llvm-svn: 276064	2016-07-19 23:54:29 +00:00
Sean Silva	e3c18a5ae8	[PM] Port LoopUnroll. We just set PreserveLCSSA to always true since we don't have an analogous method `mustPreserveAnalysisID(LCSSA)`. Also port LoopInfo verifier pass to test LoopUnrollPass. llvm-svn: 276063	2016-07-19 23:54:23 +00:00
Kyle Butt	9e52c064c2	Codegen: Factor out canTailDuplicate canTailDuplicate accepts two blocks and returns true if the first can be duplicated into the second successfully. Use this function to encapsulate the heuristic. llvm-svn: 276062	2016-07-19 23:54:21 +00:00
Aaron Ballman	b930c84368	This code block breaks the docs build (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11925/steps/docs-llvm-html/logs/stdio ). Setting the code highlighting to none instead of llvm. llvm-svn: 276060	2016-07-19 23:50:11 +00:00
Justin Lebar	7ab570ec3a	[ADT] Warn on unused results from ArrayRef and StringRef functions that read like they might mutate. Summary: Functions like "slice" and "drop_front" sound like they might mutate the underlying object, but they don't. Warning on unused results would have saved me an hour yesterday, and I'm sure I'm not the only one. LLVM and Clang are clean wrt this warning after D22540. Reviewers: majnemer Subscribers: sanjoy, chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D22541 llvm-svn: 276058	2016-07-19 23:19:25 +00:00
Justin Lebar	ea9598b1f4	Get rid of call to StringRef::substr that's never used. Summary: substr doesn't modify the string, so this line has no effect. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22540 llvm-svn: 276057	2016-07-19 23:19:22 +00:00
Justin Lebar	8778c62629	[LSV] Insert stores at the right point. Summary: Previously, the insertion point for stores was the last instruction in Chain before calling getVectorizablePrefixEndIdx. Thus if getVectorizablePrefixEndIdx didn't return Chain.size(), we still would insert at the last instruction in Chain. This patch changes our internal API a bit in an attempt to make it less prone to this sort of error. As a result, we end up recalculating the Chain's boundary instructions, but I think worrying about the speed hit of this is a premature optimization right now. Reviewers: asbirlea, tstellarAMD Subscribers: mzolotukhin, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D22534 llvm-svn: 276056	2016-07-19 23:19:20 +00:00
Justin Lebar	2cf2c22870	[LSV] Use make_range, and reformat a DEBUG message. NFC Summary: The DEBUG message was hard to read because two Values were being printed on the same line with only the delimiter "aliases". This change makes us print each Value on its own line. Reviewers: asbirlea Subscribers: llvm-commits, arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D22533 llvm-svn: 276055	2016-07-19 23:19:18 +00:00
Justin Lebar	4ee8a2d024	[LSV] Nix two global (ish) variables in the LoadStoreVectorizer. NFC Reviewers: asbirlea Subscribers: mzolotukhin, llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22532 llvm-svn: 276054	2016-07-19 23:19:16 +00:00
Justin Lebar	d9446d3770	[LSV] Add detail to correct-order.ll test. Summary: This helps keep us honest -- there were a number of ways we could screw up and still have passed this test. Reviewers: asbirlea Subscribers: llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22531 llvm-svn: 276053	2016-07-19 23:18:59 +00:00
Kostya Serebryany	0ccf06f467	[libFuzzer] extend the messages printed by afl_driver llvm-svn: 276052	2016-07-19 23:18:28 +00:00
Matt Arsenault	a1fe17c9ad	AMDGPU: Change fdiv lowering based on !fpmath metadata If 2.5 ulp is acceptable, denormals are not required, and isn't a reciprocal which will already be handled, replace with a faster fdiv. Simplify the lowering tests by using per function subtarget features. llvm-svn: 276051	2016-07-19 23:16:53 +00:00
Daniel Berlin	1986030b62	Fix unused variable llvm-svn: 276050	2016-07-19 23:08:08 +00:00
Paul Robinson	2d23c029f7	Make GVN Hoisting obey optnone/bisect. Differential Revision: http://reviews.llvm.org/D22545 llvm-svn: 276048	2016-07-19 22:57:14 +00:00
Daniel Berlin	5c46b943db	Make MemorySSA::dominates/locallydominates constant time Summary: Make MemorySSA::dominates/locallydominates constant time Reviewers: george.burgess.iv, gberry Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22527 llvm-svn: 276046	2016-07-19 22:49:43 +00:00
Chandler Carruth	2aff750cb8	Add AIX support to Path.inc, Host.h, and CMake. Patch by Andrew Paprocki! Differential Revision: https://reviews.llvm.org/D18359 llvm-svn: 276045	2016-07-19 22:46:39 +00:00
Matthias Braun	84fd4bee6c	RegScavenging: Add scavengeRegisterBackwards() This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 276044	2016-07-19 22:37:09 +00:00
Matthias Braun	4cb68e1048	RegisterScavenger: Introduce backward() mode. This adds two pieces: - RegisterScavenger:::enterBasicBlockEnd() which behaves similar to enterBasicBlock() but starts tracking at the end of the basic block. - A RegisterScavenger::backward() method. It is subtly different from the existing unprocess() method which only considers uses with the kill flag set: If a value is dead at the end of a basic block with a last use inside the basic block, unprocess() will fail to mark it as live. However we cannot change/fix this behaviour because unprocess() needs to perform the exact reverse operation of forward(). Differential Revision: http://reviews.llvm.org/D21873 llvm-svn: 276043	2016-07-19 22:37:02 +00:00
Sanjay Patel	d4ea94eb94	regenerate checks llvm-svn: 276042	2016-07-19 22:32:15 +00:00
Evandro Menezes	238fa76574	[AArch64] Properly validate the reciprocal estimation. Add check for legal data types when expanding into a Newton series. Differential Revision: https://reviews.llvm.org/D22267 llvm-svn: 276041	2016-07-19 22:31:11 +00:00
Sanjay Patel	2d477e59e8	[InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type The pattern may look more obviously like a sext if written as: define i32 @g(i16 %x) { %zext = zext i16 %x to i32 %xor = xor i32 %zext, 32768 %add = add i32 %xor, -32768 ret i32 %add } We already have that fold in visitAdd(). Differential Revision: https://reviews.llvm.org/D22477 llvm-svn: 276035	2016-07-19 22:09:34 +00:00
George Burgess IV	22a0f1a0b9	Attempt to appease MSVC buildbots. Broken by r276026. llvm-svn: 276032	2016-07-19 21:35:47 +00:00
Davide Italiano	63e5968033	[AMDGPU] Remove spurious line (should've been removed in r276029). llvm-svn: 276030	2016-07-19 21:16:30 +00:00
Davide Italiano	1576e38598	[AMDGPU] Remove dead code. LGTM'd by Matt Arsenault. llvm-svn: 276029	2016-07-19 21:10:49 +00:00
George Burgess IV	8b85321bae	[CFLAA] Make a test tell the truth. NFC. Dishonesty noted by Jia Chen. llvm-svn: 276028	2016-07-19 20:56:41 +00:00
George Burgess IV	3b059841ff	[CFLAA] Add some interproc. analysis to CFLAnders. This patch adds function summary support to CFLAnders. It also comes with a lot of tests! Woohoo! Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22450 llvm-svn: 276026	2016-07-19 20:47:15 +00:00
Kevin Enderby	6524bd8c00	Next step along the way to getting good error messages for bad archives. This step builds on Lang Hames work to change Archive::child_iterator for better interoperation with Error/Expected. Building on that it is now possible to return an error message when the size field of an archive contains non-decimal characters. llvm-svn: 276025	2016-07-19 20:47:07 +00:00
Sanjay Patel	47c04f9543	add even more missing tests for simplifySelectBitTest() llvm-svn: 276024	2016-07-19 20:47:00 +00:00
George Burgess IV	c01b42faa5	[CFLAA] Teach CFLAnders to distinguish reads from writes. This patch adds more specific edges to CFLAndersAliasAnalysis. The goal of these edges is to give us more information about how two values that MayAlias alias. With this, we can now tell cases like a = b; // ergo, a may alias b apart from a = c; b = c; // so, a may alias b, but only because they were both assigned to c. ...And others. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22429 llvm-svn: 276023	2016-07-19 20:38:21 +00:00
Aaron Ballman	a0c1f40815	This code block breaks the docs build (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11921/steps/docs-llvm-html/logs/stdio ). Setting the code highlighting to none instead of llvm to hopefully get the bot stumbling back towards green. llvm-svn: 276018	2016-07-19 20:20:03 +00:00
Rafael Espindola	3816c53f04	Use posix_fallocate instead of ftruncate. This makes sure that space is actually available. With this change running lld on a full file system causes it to exit with failed to open foo: No space left on device instead of crashing with a sigbus. llvm-svn: 276017	2016-07-19 20:19:56 +00:00
Vedant Kumar	57faf2d208	[tsan] Don't instrument __llvm_gcov_global_state_pred or __llvm_gcda* r274801 did not go far enough to allow gcov+tsan to cooperate. With this commit it's possible to run the following code without false positives: std::thread T1(fib), T2(fib); T1.join(); T2.join(); llvm-svn: 276015	2016-07-19 20:16:08 +00:00
Tim Northover	554fbd05e8	ARM: move feature for Thumb2 pkhbt/pkhtb onto architectures. There's not much functional change, but it really is an architectural feature (on v6T2, v7A, v7R and v7EM) rather than something each CPU implements individually. The main functional change is the default behaviour you get when specifying only "-triple". llvm-svn: 276013	2016-07-19 19:49:13 +00:00
Ahmed Bougacha	5a59b24bdd	[GlobalISel] Mark newly-created gvregs as having a bank. Also verify that we never try to set the size of a vreg associated to a register class. Report an error when we encounter that in MIR. Fix a testcase that hit that error and had a size for no reason. llvm-svn: 276012	2016-07-19 19:48:36 +00:00
Ahmed Bougacha	0313a08a1a	[GlobalISel] Simplify more RegClassOrRegBank is+get. NFC. llvm-svn: 276011	2016-07-19 19:47:06 +00:00
David Majnemer	5246e0b2c2	[FunctionAttrs] Correct the safety analysis for inference of 'returned' We skipped over ReturnInsts which didn't return an argument which would lead us to incorrectly conclude that an argument returned by another ReturnInst was 'returned'. This reverts commit r275756. This fixes PR28610. llvm-svn: 276008	2016-07-19 18:50:26 +00:00
Davide Italiano	63266b6be5	[SCCP] Improve assert messages. NFCI. I've been hitting those already while working on SCCP and I think it's be useful to provide a more explanatory diagnostic. llvm-svn: 276007	2016-07-19 18:31:07 +00:00
Kostya Serebryany	6b08be9279	[libFuzzer] properly intercept memmem llvm-svn: 276006	2016-07-19 18:29:06 +00:00
Chad Rosier	8b5fa7a2f2	[DSE] Add additional debug output. NFC. llvm-svn: 276005	2016-07-19 18:11:11 +00:00
David Majnemer	07ea344222	Add a testcase for r275581 llvm-svn: 276002	2016-07-19 17:52:41 +00:00
David Majnemer	938a6c7ce0	[RegionInfo] Some cleanups - Use unique_ptr instead of managing a container of new'd pointers. - Use range based for loops. No functional change is intended. llvm-svn: 276001	2016-07-19 17:50:30 +00:00
David Majnemer	f29b7bafb6	[RegionPass] Some minor cleanups No functional change is intended. llvm-svn: 276000	2016-07-19 17:50:27 +00:00
David Majnemer	1a4576e79d	[LoopPass] Some minor cleanups No functional change is intended. llvm-svn: 275999	2016-07-19 17:50:24 +00:00
Aaron Ballman	887ad0e9db	This code block breaks the docs build (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11920/steps/docs-llvm-html/logs/stdio ), but I cannot see anything immediately wrong with it and cannot reproduce the diagnostic locally. Setting the code highlighting to none instead of nasm to hopefully get the bot stumbling back towards green. llvm-svn: 275998	2016-07-19 17:46:55 +00:00
Sanjay Patel	8b76ebe5b8	add tests related to PR28466 llvm-svn: 275995	2016-07-19 17:07:35 +00:00
Simon Pilgrim	5366d0e0bc	[X86][AVX512] Added AVX512 subvector broadcast tests llvm-svn: 275994	2016-07-19 17:04:28 +00:00
Simon Pilgrim	f2d02cb0f6	[X86][AVX] Fixed typo in test names llvm-svn: 275992	2016-07-19 16:52:05 +00:00
Chad Rosier	667b1ca0e6	[DSE] Add additional debug output. NFC. llvm-svn: 275991	2016-07-19 16:50:57 +00:00
Sanjay Patel	d2ff6d727f	add missing test for simplifySelectBitTest() llvm-svn: 275990	2016-07-19 16:49:55 +00:00
Tobias Grosser	1c38262279	[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) Summary: Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one: > Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp. Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check: `if ((!isa<ICmpInst>(Cast0Src) \|\| !isa<ICmpInst>(Cast1Src)) && ...` This check seems to sort out more cases than necessary because: - the reverse transformation is obviously done for `or` instructions only - and also not every `zext icmp` pair is necessarily the result of this reverse transformation Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. 
That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`). As an example, consider the following IR snippet ``` %1 = icmp sgt i64 %a, %b %2 = zext i1 %1 to i8 %3 = icmp slt i64 %a, %c %4 = zext i1 %3 to i8 %5 = and i8 %2, %4 ``` which would now be transformed to ``` %1 = icmp sgt i64 %a, %b %2 = icmp slt i64 %a, %c %3 = and i1 %1, %2 %4 = zext i1 %3 to i8 ``` This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code. Reviewers: grosser, vtjnash, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22511 Contributed-by: Matthias Reisinger llvm-svn: 275989	2016-07-19 16:39:17 +00:00
Matt Arsenault	03006fd3c4	AMDGPU: Only use legal inline immediates with kill pseudo Only if the value is negative or positive is what matters, so use a constant that doesn't require an instruction to materialize. These should really just emit the write exec directly, but for stick with the kill pseudo-terminator. llvm-svn: 275988	2016-07-19 16:27:56 +00:00
Simon Pilgrim	0ea8d275cc	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105 Differential Revision: https://reviews.llvm.org/D22106 llvm-svn: 275981	2016-07-19 15:07:43 +00:00
Sam Parker	6ca4bbb00d	[ARM] Refactor Thumb2 Mul and Mla instr descs Recommitting after r274347 was reverted. This patch introduces some classes to refactor the 3 and 4 register Thumb2 multiplication instruction descriptions, plus improved tests for some of those instructions. Differential Revision: https://reviews.llvm.org/D21929 llvm-svn: 275979	2016-07-19 14:44:05 +00:00
Pankaj Gode	1bfca191da	[AArch64] PredictableSelectIsExpensive for Vulcan. Adding PredictableSelectIsExpensive for Vulcan Differential Revision: https://reviews.llvm.org/D22448 llvm-svn: 275978	2016-07-19 14:30:21 +00:00
Peter Smith	cbcecca538	Add support for tlsldm assembler operator to ARM target The standard local dynamic model for TLS on ARM systems needs two relocations: - R_ARM_TLS_LDM32 (module idx) - R_ARM_TLS_LDO32 (offset of object from origin of module TLS block) In GNU style assembler we use symbol(tlsldm) and symbol(tlsldo) to produce these relocations. llvm-mc for ARM supports symbol(tlsldo) but does not support symbol(tlsldm). This patch wires up the existing symbol(tlsldm) to R_ARM_TLS_LDM32. TLS for ARM is defined in Addenda to, and Errata in, the ABI for the ARM Architecture Differential Revision: https://reviews.llvm.org/D22461 llvm-svn: 275977	2016-07-19 14:15:33 +00:00
Simon Pilgrim	b87a21f1c3	[AARCH64] Fix linu triple typo As promised in D22191 llvm-svn: 275976	2016-07-19 14:12:45 +00:00
Simon Pilgrim	fc4d4b251d	[AARCH64] Enable AARCH64 lit tests on windows dev machines As discussed on PR27654, this patch fixes the triples of a lot of aarch64 tests and enables lit tests on windows This will hopefully help stop cases where windows developers break the aarch64 target Differential Revision: https://reviews.llvm.org/D22191 llvm-svn: 275973	2016-07-19 13:35:11 +00:00
Simon Pilgrim	766345e331	Get rid of VS2015 operator precedence warning. NFCI. llvm-svn: 275971	2016-07-19 12:26:51 +00:00
Daniel Sanders	3878412875	[mips][ias] R_MIPS_GOT_(PAGE\|OFST) do not need symbols Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D22458 llvm-svn: 275968	2016-07-19 10:58:06 +00:00
Daniel Sanders	6a73883c48	[mips] Correct label prefixes for N32 and N64. Summary: N32 and N64 follow the standard ELF conventions (.L) whereas O32 uses its own ($). This fixes the majority of object differences between -fintegrated-as and -fno-integrated-as. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D22412 llvm-svn: 275967	2016-07-19 10:49:03 +00:00
Daniel Sanders	2cb55d7dfd	[mips] Recognise the triple used by Debian stretch for mips64el. Summary: The triple used for this distribution is mips64el-linux-gnuabi64. Reviewers: sdardis Subscribers: sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D22406 llvm-svn: 275966	2016-07-19 10:22:19 +00:00
Tobias Grosser	8ef834c712	[InstCombine] Minor cleanup of cast simplification code [NFC] Summary: This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following: - Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter. - Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there. - Simplify interface of `shouldOptimizeCast()`. - Minor code style adaptions in `shouldOptimizeCast()`. - Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant. - Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format. - Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability. - Simplify the interface of `isEliminableCastPair()`. - Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration. - Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer. - Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant. 
Reviewers: vtjnash, grosser Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22449 Contributed-by: Matthias Reisinger llvm-svn: 275964	2016-07-19 09:06:08 +00:00
Tobias Grosser	3a49a8e13c	Style: drop some unnecessary ';' [NFC] llvm-svn: 275963	2016-07-19 09:01:46 +00:00
Elena Demikhovsky	2c0780b8e5	AVX-512: Fixed BT instruction selection. The following condition expression ( a >> n) & 1 is converted to "bt a, n" instruction. It works on all intel targets. But on AVX-512 it was broken because the expression is modified to (truncate (a >>n) to i1). I added the new sequence (truncate (a >>n) to i1) to the BT pattern. Differential Revision: https://reviews.llvm.org/D22354 llvm-svn: 275950	2016-07-19 07:14:21 +00:00
Craig Topper	d6ca1dc45e	[AVX512] Give priority to EVEX encoded PSHUFB over the VEX versions. llvm-svn: 275942	2016-07-19 02:00:38 +00:00
Craig Topper	592dc30708	[X86] Remove superfluous parameter from a multiclass. All instantiations passed the same value. llvm-svn: 275941	2016-07-19 02:00:35 +00:00
George Burgess IV	5f30897b7b	[MemorySSA] Update to the new shiny walker. This patch updates MemorySSA's use-optimizing walker to be more accurate and, in some cases, faster. Essentially, this changed our core walking algorithm from a cache-as-you-go DFS to an iteratively expanded DFS, with all of the caching happening at the end. Said expansion happens when we hit a Phi, P; we'll try to do the smallest amount of work possible to see if optimizing above that Phi is legal in the first place. If so, we'll expand the search to see if we can optimize to the next phi, etc. An iteratively expanded DFS lets us potentially quit earlier (because we don't assume that we can optimize above all phis) than our old walker. Additionally, because we don't cache as we go, we can now optimize above loops. As an added bonus, this patch adds a ton of verification (if EXPENSIVE_CHECKS are enabled), so finding bugs is easier. Differential Revision: https://reviews.llvm.org/D21777 llvm-svn: 275940	2016-07-19 01:29:15 +00:00
Craig Topper	6189d3ecd4	[X86] Rename VINSERTzrr to use a capital Z to match other instructions. NFC llvm-svn: 275939	2016-07-19 01:26:19 +00:00
Vedant Kumar	e3a0bf5048	Retry: [llvm-profdata] Speed up merging by using a thread pool Add a "-j" option to llvm-profdata to control the number of threads used. Auto-detect NumThreads when it isn't specified, and avoid spawning threads when they wouldn't be beneficial. I tested this patch using a raw profile produced by clang (147MB). Here is the time taken to merge 4 copies together on my laptop: No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total Changes since the initial commit: - When handling odd-length inputs, call ThreadPool::wait() before merging the last profile. Should fix a race/off-by-one (see r275937). Differential Revision: https://reviews.llvm.org/D22438 llvm-svn: 275938	2016-07-19 01:17:20 +00:00
Vedant Kumar	21ab20e005	Revert "[llvm-profdata] Speed up merging by using a thread pool" This reverts commit r275921. It broke the ppc64be bot: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537 I'm not sure why it broke, but based on the output, it looks like an off-by-one (one profile left un-merged). llvm-svn: 275937	2016-07-19 00:57:09 +00:00
Wei Mi	79997a24d7	Recommit the patch "Use uniforms set to populate VecValuesToIgnore". For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 The recommit fixed the testcase global_alias.ll. llvm-svn: 275936	2016-07-19 00:50:43 +00:00
Matt Arsenault	fe358066ea	AMDGPU/SI: Fix SI scheduler refcount issue Without this fix, releaseSuccessors when InOrOutBlock is false could release SUs outside the schedule BasicBlock. Patch by Axel Davy llvm-svn: 275935	2016-07-19 00:35:22 +00:00
Matt Arsenault	cb540bc03c	AMDGPU: Expand register indexing pseudos in custom inserter This is to help moveSILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. llvm-svn: 275934	2016-07-19 00:35:03 +00:00
Lang Hames	0de9b91a71	[Kaleidoscope][BuildingAJIT] More work on the text for Chapter 3. Add an overview of stubs and compile callbacks before the discussion of the source changes. -- This line, and those below, will be ignored-- M docs/tutorial/BuildingAJIT3.rst llvm-svn: 275933	2016-07-19 00:25:52 +00:00
Sanjoy Das	ab73c9d88e	[LoopReroll] Reroll loops with unordered atomic memory accesses Reviewers: hfinkel, jfb, reames Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22385 llvm-svn: 275932	2016-07-19 00:23:54 +00:00
Matt Arsenault	4cb438b93c	TableGen: Allow custom register operand decoder method This is for a situation where the encoding for a register may be different depending on the specific operand. For some instructions, we want to apply additional restrictions beyond the encoding's constraints. In AMDGPU some operands are VSrc_32, using the VS_32 pseudo register class which accept VGPRs, SGPRs, or immediates in the encoding. Some specific instructions with the same encoding operand do not want to allow immediates or SGPRs, but the encoding format is different in this case than a regular VGPR_32 operand. This allows specifying the encoding should be treated the same without introducing yet another dummy register class. llvm-svn: 275929	2016-07-18 23:20:46 +00:00
Matt Arsenault	50b76399ed	AMDGPU: Fix test name and broken CHECK-LABEL llvm-svn: 275928	2016-07-18 23:09:51 +00:00
Vedant Kumar	05ee94f1b5	[utils] Generate html reports with the code coverage utility script Instead of extracting raw coverage mappings into an artifact directory, actually generate useful html reports for a given list of binaries with symbol demangling turned on. No tests, but this is actively being used to drive the (still nascent) coverage bot. llvm-svn: 275927	2016-07-18 22:50:10 +00:00
Matt Arsenault	4ced16dd2e	Fix -Wreturn-type with gcc 4.8 and libc++ llvm-svn: 275922	2016-07-18 22:12:46 +00:00
Vedant Kumar	0bd9907581	[llvm-profdata] Speed up merging by using a thread pool Add a "-j" option to llvm-profdata to control the number of threads used. Auto-detect NumThreads when it isn't specified, and avoid spawning threads when they wouldn't be beneficial. I tested this patch using a raw profile produced by clang (147MB). Here is the time taken to merge 4 copies together on my laptop: No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total Differential Revision: https://reviews.llvm.org/D22438 llvm-svn: 275921	2016-07-18 22:02:39 +00:00
Artem Belevich	9f97dcb018	[NVPTX] Make sure we adjust alignment at all call sites .. including calls from kernel functions that were ignored by mistake before. llvm-svn: 275920	2016-07-18 21:58:48 +00:00
Dehao Chen	6132ee8502	[PM] Convert Loop Strength Reduce pass to new PM Summary: Convert Loop Strength Reduce pass to new PM Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22468 llvm-svn: 275919	2016-07-18 21:41:50 +00:00
Mehdi Amini	4d74631ea4	Update doxygen description for `WriteBitcodeToFile()` API (NFC) llvm-svn: 275917	2016-07-18 21:29:24 +00:00
Teresa Johnson	2124157102	[PM] Port FunctionImport Pass to new PM Summary: Port FunctionImport Pass to new PM. Reviewers: mehdi_amini, davide Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D22475 llvm-svn: 275916	2016-07-18 21:22:24 +00:00
Wei Mi	f9afff71a2	Revert rL275912. llvm-svn: 275915	2016-07-18 21:14:43 +00:00
Wei Mi	1fd25726af	Use uniforms set to populate VecValuesToIgnore. For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 llvm-svn: 275912	2016-07-18 20:59:53 +00:00
Sanjay Patel	5f5eb58eb5	refactor SimplifySelectInst; NFCI llvm-svn: 275911	2016-07-18 20:56:53 +00:00
Justin Lebar	4133584504	Write isUInt using template specializations to work around an incorrect MSVC warning. Summary: Per D22441, MSVC warns on our old implementation of isUInt<64>. It sees uint64_t(1) << 64 and doesn't realize that it's not going to be executed. Writing as a template specialization is ugly, but prevents the warning. Reviewers: RKSimon Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22472 llvm-svn: 275909	2016-07-18 20:40:35 +00:00
Sanjay Patel	dbf44f5016	add tests for missed sext transform llvm-svn: 275908	2016-07-18 20:37:51 +00:00
Hans Wennborg	4ba35d1f4f	build_llvm_package.bat: update version to 4.0.0 llvm-svn: 275903	2016-07-18 20:26:46 +00:00
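
Note on 2d477e59e8 (r276035): the fold add(zext(xor X, C), C) --> sext X, with C being INT_MIN in the source type, can be checked by brute force for the i16 to i32 case used in that commit message. The sketch below is only an illustration of the arithmetic, not LLVM code; the helper names (sext16_to_32, folded_pattern, check_all) are hypothetical, and fixed-width integers are modeled with explicit masks.

```python
# Illustrative check of the identity behind r276035 for i16 -> i32.
# All names here are hypothetical; nothing below is an LLVM API.

MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
SIGN16 = 0x8000  # bit pattern of INT_MIN for i16


def sext16_to_32(x):
    """Sign-extend a 16-bit pattern to a 32-bit pattern."""
    x &= MASK16
    return (x - 0x10000) & MASK32 if x & SIGN16 else x


def folded_pattern(x):
    """Compute add(zext(xor X, C), C) with C = INT_MIN of i16."""
    narrow = (x & MASK16) ^ SIGN16   # xor i16 %x, -32768
    wide = narrow                    # zext i16 ... to i32 (high bits are zero)
    return (wide - SIGN16) & MASK32  # add i32 ..., -32768 (wraps modulo 2**32)


def check_all():
    """Return True if the pattern equals sext for every 16-bit input."""
    return all(folded_pattern(x) == sext16_to_32(x) for x in range(1 << 16))


if __name__ == "__main__":
    print("fold holds for all i16 inputs:", check_all())
```

The xor flips the sign bit and the add subtracts it back out in the wider type, so inputs with the sign bit set end up with all upper bits set, which is exactly sign extension.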


135277 Commits