llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	6b4a5134af	[X86][3DNow!] Add tests showing missed commutation opportunities. llvm-svn: 294845	2017-02-11 13:00:32 +00:00
Daniel Berlin	b79f53669a	NewGVN: Clean up how we handle the INITIAL class so that everything in it is dead or unreachable, as it should be. This also makes the leader of INITIAL undef, enabling us to handle irreducibility properly. Summary: This lets us verify, more than we do now, that we didn't screw up value numbering. Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D29842 llvm-svn: 294844	2017-02-11 12:48:50 +00:00
Vitaly Buka	bcb6622c95	Fix "left shift of negative value -1" introduced by r294805 llvm-svn: 294843	2017-02-11 12:44:03 +00:00
Simon Pilgrim	8158816efe	[X86][XOP] Regenerate XOP commutation tests. Added 32-bit tests as well. llvm-svn: 294841	2017-02-11 12:30:59 +00:00
Simon Pilgrim	008ba63e04	[X86][SSE] Regenerate float comparison commutation tests. llvm-svn: 294840	2017-02-11 12:29:56 +00:00
Simon Pilgrim	0d8632f089	[X86] Regenerate CLMUL commutation tests. llvm-svn: 294839	2017-02-11 12:23:22 +00:00
Benjamin Kramer	efcf06f5f2	Move symbols from the global namespace into (anonymous) namespaces. NFC. llvm-svn: 294837	2017-02-11 11:06:55 +00:00
Craig Topper	1f6153bab4	[AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables. llvm-svn: 294830	2017-02-11 07:01:40 +00:00
Craig Topper	a9818aadab	[AVX-512] Fix apparent typo in instruction name VMOVSSDrr_REV->VMOVSDZrr_REV. llvm-svn: 294829	2017-02-11 07:01:38 +00:00
Craig Topper	3afa777f10	[AVX-512] Add VPSADBW instructions to load folding tables. llvm-svn: 294827	2017-02-11 06:24:03 +00:00
Evgeny Stupachenko	5f3d9b6c09	The patch fixes r294821 Summary: Update register match for windows testing From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294825	2017-02-11 05:39:00 +00:00
Craig Topper	464b8cb244	[X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available. Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available. This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp. Overall I think this produces better results in the modified test cases. llvm-svn: 294824	2017-02-11 05:32:57 +00:00
Peter Collingbourne	fa3175f2f6	Address Mehdi's post-commit review comments on r294795. llvm-svn: 294822	2017-02-11 03:19:22 +00:00
Evgeny Stupachenko	fe6f548d2d	Fix PR23384 (under "-lsr-insns-cost" option) Summary: The patch adds instructions number generated by a solution to LSR cost under "-lsr-insns-cost" option. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D28307 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294821	2017-02-11 02:57:43 +00:00
Ahmed Bougacha	8425f453ef	[ARM] Make f16 interleaved accesses expensive. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Teach the cost model to consider f16 interleaved operations as expensive. Otherwise, we are all but guaranteed to end up with a large block of scalarized vector code. llvm-svn: 294819	2017-02-11 01:53:04 +00:00
Ahmed Bougacha	fc979dc9dd	[ARM] Don't lower f16 interleaved accesses. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818	2017-02-11 01:53:00 +00:00
Ahmed Bougacha	f37fb89edc	[ARM] Unique some redundant CHECK lines. NFC. llvm-svn: 294817	2017-02-11 01:52:57 +00:00
Wei Mi	8f20e63a20	[LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop. The recommit includes some changes of testcases. No functional change to the patch. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handles the inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 294814	2017-02-11 00:50:23 +00:00
Eugene Zelenko	d3a6c897ba	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 294813	2017-02-11 00:27:28 +00:00
Matthias Braun	59797eccea	config-ix.cmake: Search for CMAKE_XCRUN before using it. This was previously searched in CMakeLists.txt unconditionally but as of r294371 it is only searched in some circumstances. Repeating the search in config-ix.cmake to make this robust and hopefully fix the macOS Asan+Ubsan jenkins build. llvm-svn: 294811	2017-02-11 00:14:01 +00:00
Chandler Carruth	027340f3b9	[PM] Fix a bug in how I ported LoopDeletion to the new PM. This was marking the loop for deletion after the loop was deleted. This almost works, except that when we do any kind of debug logging it starts reading the name of the loop from deleted memory or otherwise blowing up. This can fail in a bunch of ways. I recently added a test that always does this, and it started failing on the sanitizer bots. The fix is to mark the loop as deleted in the loop PM infrastructure before we remove the loop. We can do this by passing the updater into the routine. That also lets us simplify a bunch of other interface components here for a net win. llvm-svn: 294810	2017-02-11 00:09:30 +00:00
Dan Gohman	dfe6ce7abd	[WebAssembly] Remove old experimental disassembler code. Remove support for disassembling an old experimental wasm binary format, which is no longer in use anywhere. llvm-svn: 294809	2017-02-11 00:02:23 +00:00
Saleem Abdulrasool	769b98d327	vim: add `returned` keyword The `returned` keyword was added in SVN r179925. Update the vim syntax rules. llvm-svn: 294808	2017-02-10 23:57:11 +00:00
Davide Italiano	690ed9dec7	[LTO] Share the optimization remarks setup between Thin/Full LTO. llvm-svn: 294807	2017-02-10 23:49:38 +00:00
Krzysztof Parzyszek	f9015e62fd	[Hexagon] Introduce Hexagon V62 llvm-svn: 294805	2017-02-10 23:46:45 +00:00
Davide Italiano	95a8707de8	[tests] Be explicit about the files we want to remove. Hopefully Windows will stop whining after this change. llvm-svn: 294801	2017-02-10 22:55:37 +00:00
Peter Collingbourne	be9ffaacfa	IR: Function summary extensions for whole-program devirtualization pass. The summary information includes all uses of llvm.type.test and llvm.type.checked.load intrinsics that can be used to devirtualize calls, including any constant arguments for virtual constant propagation. Differential Revision: https://reviews.llvm.org/D29734 llvm-svn: 294795	2017-02-10 22:29:38 +00:00
Benjamin Kramer	03ab8a366e	[InstCombine] Move class into anonymous namespace. NFC. This is necessary to avoid warnings from GCC. InstCombineLoadStoreAlloca.cpp:238:7: error: 'PointerReplacer' declared with greater visibility than the type of its field 'PointerReplacer::IC' llvm-svn: 294794	2017-02-10 22:26:35 +00:00
Davide Italiano	46d72b1b7f	[lib/LTO] Rework optimization remarkers setup. This makes this code much more similar to what ThinLTO is using (also API wise), so now we can probably use a single code path instead of copying stuff around. llvm-svn: 294792	2017-02-10 22:16:17 +00:00
Benjamin Kramer	aa5adfa360	[PPC] Silence warning in Release builds. llvm-svn: 294791	2017-02-10 22:13:34 +00:00
Davide Italiano	62092aeb42	[LTO] Make these tests robust across multiple iterations. Same as r294784, but for regular LTO. llvm-svn: 294789	2017-02-10 22:11:06 +00:00
Benjamin Kramer	684c87be4f	[InstCombine] Silence unused variable warning in Release builds. llvm-svn: 294788	2017-02-10 22:04:17 +00:00
Nico Weber	ee0b0ec935	Revert r294532, it caused PR31935 llvm-svn: 294787	2017-02-10 21:57:30 +00:00
Yaxun Liu	ba01ed00fe	Fix invalid addrspacecast due to combining alloca with global var For function-scope variables with large initialisation list, FE usually generates a global variable to hold the initializer, then generates memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst identifies such allocas which are accessed only by reading and replaces them with the global variable. This is done by casting the global variable to the type of the alloca and replacing all references. However, when the global variable is in a different address space which is disjoint with addr space 0 (e.g. for IR generated from OpenCL, global variable cannot be in private addr space i.e. addr space 0), casting the global variable to addr space 0 results in invalid IR for certain targets (e.g. amdgpu). To fix this issue, when the global variable is not in addr space 0, instead of casting it to addr space 0, this patch chases down the uses of alloca until reaching the load instructions, then replaces load from alloca with load from the global variable. If during the chasing bitcast and GEP are encountered, new bitcast and GEP based on the global variable are generated and used in the load instructions. Differential Revision: https://reviews.llvm.org/D27283 llvm-svn: 294786	2017-02-10 21:46:07 +00:00
Davide Italiano	d6979b8c38	[ThinLTO] Make this test more robust across multiple runs. The yaml emitter files are left around otherwise. llvm-svn: 294784	2017-02-10 21:35:31 +00:00
Tim Shen	21a960b6a6	Fix a silly syntax error. llvm-svn: 294783	2017-02-10 21:17:35 +00:00
Dehao Chen	fb02f7140a	Encode duplication factor from loop vectorization and loop unrolling to discriminator. Summary: This patch starts the implementation as discussed in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimizations that fall into this category are loop vectorization and loop unrolling. This patch records the duplication factor for these 2 optimizations. The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default. Reviewers: probinson, aprantl, davidxl, hfinkel, echristo Reviewed By: hfinkel Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26420 llvm-svn: 294782	2017-02-10 21:09:07 +00:00
Tim Shen	918ed871df	[XRay] Implement powerpc64le xray. Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc. Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how. Reviewers: dberris, echristo, iteratee, kbarton, hfinkel Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29742 llvm-svn: 294781	2017-02-10 21:03:24 +00:00
Krzysztof Parzyszek	fe58ca3218	[Hexagon] Remove unused .td files llvm-svn: 294775	2017-02-10 19:54:00 +00:00
Ahmed Bougacha	2e275e272f	[X86] Bitcast subvector before broadcasting it. Since r274013, we've been looking through bitcasts on broadcast inputs. In the scalar-folding case (from a load, build_vector, or sc2vec), the input type didn't matter, as we'd simply bitcast the resulting scalar back. However, when broadcasting a 128-bit-lane-aligned element, we create an EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector of the original input type. llvm-svn: 294774	2017-02-10 19:51:47 +00:00
Kevin Enderby	dc412ccc41	Yet another fix to llvm-objdump so it picks a good CPU for Mach-O files, in this case for CPU_SUBTYPE_ARM64_ALL. For this cpusubtype it should default to a cyclone CPU to give proper disassembly without a -mcpu= flag. rdar://27767188 llvm-svn: 294771	2017-02-10 19:27:10 +00:00
Tim Northover	0e01170c79	GlobalISel: drop lifetime intrinsics during translation. We don't use them yet and they just cause problems. llvm-svn: 294770	2017-02-10 19:10:38 +00:00
Marcos Pividori	e81f9cc63d	[libFuzzer] Use stoull instead of stol to ensure 64 bits. Differential revision: https://reviews.llvm.org/D29831 llvm-svn: 294769	2017-02-10 18:44:14 +00:00
Simon Pilgrim	39f8da3823	[X86][AVX512] Add vector rotate tests for AVX512 targets AVX512 does have vector rotate instructions, but we don't lower to them yet llvm-svn: 294766	2017-02-10 18:06:11 +00:00
Amaury Sechet	280ad2cebb	Autogenerate results for test/CodeGen/X86/peep-test-4.ll . NFC llvm-svn: 294765	2017-02-10 17:57:48 +00:00
Amaury Sechet	f6308cfe87	Autogenerate results for test/CodeGen/X86/pr14314.ll . NFC llvm-svn: 294764	2017-02-10 17:57:46 +00:00
John Brawn	e60f4e4b8d	[ARM] Fix incorrect mask bits in MSR encoding for write_register intrinsic In the encoding of system registers in the M-class MSR instruction the mask bits should be 2 for registers that don't take a _<bits> qualifier (the instruction is unpredictable otherwise), and should also be 2 if the register takes a _<bits> qualifier but it's not present as no _<bits> is an alias for _nzcvq. Differential Revision: https://reviews.llvm.org/D29828 llvm-svn: 294762	2017-02-10 17:41:08 +00:00
Amaury Sechet	c8587e4257	Use autogenerate check in CodeGen/X86/pr16031.ll . NFC llvm-svn: 294761	2017-02-10 17:26:21 +00:00
Mehdi Amini	f1423e893d	Fix doc for `-opt-bisect-limit`: the LTO option prefix for lld is -mllvm Thanks to Davide for catching it in my previous patch. llvm-svn: 294759	2017-02-10 17:16:00 +00:00
Alexander Kornienko	beda0f1923	Add a virtual destructor for LegalizerInfo. lib/Target/X86/X86TargetMachine.cpp has code that deletes an instance of a LegalizerInfo descendant via a pointer to base. llvm-svn: 294757	2017-02-10 17:00:27 +00:00
Amaury Sechet	3b87944433	Check full codegen in CodeGen/X86/i256-add.ll NFC llvm-svn: 294756	2017-02-10 16:34:17 +00:00
Matthew Simpson	df124a7569	[LV] Remove type restriction for vector phi creation We previously only created a vector phi node for an induction variable if its type matched the type of the canonical induction variable. Differential Revision: https://reviews.llvm.org/D29776 llvm-svn: 294755	2017-02-10 16:15:26 +00:00
Krzysztof Parzyszek	a72fad980c	[Hexagon] Replace instruction definitions with auto-generated ones llvm-svn: 294753	2017-02-10 15:33:13 +00:00
Rafael Espindola	be99157127	Move some error handling down to MCStreamer. This makes sure we get the same redefinition rules regardless of who is printing (asm parser, codegen) and to what (asm, obj). This fixes an unintentional regression in r293936. llvm-svn: 294752	2017-02-10 15:13:12 +00:00
Simon Pilgrim	a3362a1c9e	[X86][SSE] Added chained FDIV test cases for D26855 Tests to demonstrate throughput-latency decision between div and rcp on faster hardware such as Haswell llvm-svn: 294750	2017-02-10 14:56:12 +00:00
Simon Pilgrim	bfb1747806	[DAGCombine] Allow vector constant folding of any value type before type legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749	2017-02-10 14:37:25 +00:00
Simon Pilgrim	8c8b10389d	[X86][SSE] Use SDValue::getConstantOperandVal helper. NFCI. Also reordered an if statement to test low cost comparisons first llvm-svn: 294748	2017-02-10 14:27:59 +00:00
Simon Pilgrim	c371159aac	[X86][SSE] Add support for extracting target constants from BUILD_VECTOR In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode llvm-svn: 294746	2017-02-10 14:04:11 +00:00
Simon Pilgrim	1140281413	[X86][SSE] Add missing comment describing combining to SHUFPS. NFCI llvm-svn: 294745	2017-02-10 13:16:01 +00:00
Chandler Carruth	7bc6028d7d	[PM] Relax the patterns used in the new test I added because some compilers don't print the typedef name. llvm-svn: 294729	2017-02-10 08:48:50 +00:00
Chandler Carruth	f425292721	[PM] Fix a bug in the new loop PM when handling functions with no loops. Without any loops, we don't even bother to build the standard analyses used by loop passes. Without these, we can't run loop analyses or invalidate them properly. Unfortunately, we did these things in the wrong order which would allow a loop analysis manager's proxy to be built but then not have the standard analyses built. When we went to do the invalidation in the proxy, things would fall apart. In the test case provided, it would actually crash. The fix is to carefully check for loops first, and to in fact build the standard analyses before building the proxy. This allows it to correctly trigger invalidation for those standard analyses. An alternative might seem to be to look at whether there are any loops when doing invalidation, but this doesn't work when during the loop pipeline run we delete the last loop. I've even included that as a test case. It is both simpler and more robust to defer building the proxy until there are definitely the standard set of analyses and indeed loops. This bug was uncovered by enabling GlobalsAA in the pipeline. llvm-svn: 294728	2017-02-10 08:26:58 +00:00
Igor Breger	6677999e17	add #ifdef, fix compilation error in case LLVM_BUILD_GLOBAL_ISEL=OFF llvm-svn: 294726	2017-02-10 07:33:14 +00:00
Mehdi Amini	a826244bb1	Fix doc for `-opt-bisect-limit`: the LTO option is linker specific llvm-svn: 294725	2017-02-10 07:21:06 +00:00
Igor Breger	b4442f34cd	[X86][GlobalISel] Add general-purpose Register Bank Summary: [X86][GlobalISel] Add general-purpose Register Bank. Add trivial handling of G_ADD legalization. Add Register Bank selection for COPY and G_ADD instructions Reviewers: rovka, zvi, ab, t.p.northover, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, mgorny, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29771 llvm-svn: 294723	2017-02-10 07:05:56 +00:00
Dean Michael Berris	b5600b58a0	[XRay][graph] Disambiguate name of type from member name Follow-up to D29005. Differential Revision: https://reviews.llvm.org/D29005 llvm-svn: 294722	2017-02-10 06:59:25 +00:00
Dean Michael Berris	6c97b3acda	[XRay] A graph Class for the llvm-xray graph Summary: In preparation for graph comparison and filtering, this is a library for representing graphs in LLVM. This will enable easier encapsulation and reuse of graphs in llvm-xray. Depends on D28999, D28225 Reviewers: dblaikie, dberris Reviewed By: dberris Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29005 llvm-svn: 294717	2017-02-10 06:36:08 +00:00
Philip Reames	578dafbd8b	[LoopUnswitch] Remove BFI usage (dead code) Chandler mentioned at the last social that the need for BFI in the new pass manager was causing a slight hiccup for this pass. Given this code has been checked in, but off for over a year, it makes sense to just remove it for now. Note that there's nothing wrong with the general idea - it's actually a quite good one - and once we have the infrastructure in place to implement this without the full recomputation on every loop, we absolutely should. llvm-svn: 294715	2017-02-10 06:12:06 +00:00
Dean Michael Berris	79f5746f41	Revert "[XRay] A graph Class for the llvm-xray graph" Broke tests, reverting. llvm-svn: 294714	2017-02-10 06:05:46 +00:00
Dean Michael Berris	2957c25a5e	[XRay] A graph Class for the llvm-xray graph Summary: In preparation for graph comparison and filtering, this is a library for representing graphs in LLVM. This will enable easier encapsulation and reuse of graphs in llvm-xray. Depends on D28999, D28225 Reviewers: dblaikie, dberris Reviewed By: dberris Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29005 llvm-svn: 294713	2017-02-10 05:40:37 +00:00
Craig Topper	a9f1121896	[SelectionDAG] Dump the DAG after legalizing vector ops and after the second type legalization Summary: With -debug, we aren't dumping the DAG after legalizing vector ops. In particular, on X86 with AVX1 only, we don't dump the DAG after we split 256-bit integer ops into pairs of 128-bit ADDs since this occurs during vector legalization. I'm only dumping if the legalize vector ops changes something since we don't print anything during legalize vector ops. So this dump shows up right after the first type-legalization dump happens. So if nothing changed this second dump is unnecessary. Having said that though, I think we should probably fix legalize vector ops to log what it's doing. Reviewers: RKSimon, eli.friedman, spatel, arsenm, chandlerc Reviewed By: RKSimon Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D29554 llvm-svn: 294711	2017-02-10 05:05:57 +00:00
Adam Nemet	386cd3dd6b	opt-viewer: fix HtmlFormatter encoding Summary: Small fix to HtmlFormatter, defaults to ascii encoding, so utf-8 output may get `UnicodeEncodeError: 'ascii' codec can't encode character ... ordinal not in range(128)` during write. Patch by Brian Cain! Reviewers: anemet, fhahn Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29802 llvm-svn: 294710	2017-02-10 04:50:18 +00:00
Eric Christopher	0824096cc0	Temporarily revert "For X86-64 linux and PPC64 linux align int128 to 16 bytes." until we can get better TargetMachine::isCompatibleDataLayout to compare - otherwise we can't code generate existing bitcode without a string equality data layout. This reverts commit r294702. llvm-svn: 294709	2017-02-10 04:35:32 +00:00
Ahmed Bougacha	982c5eb396	[GlobalISel] Return an Expected<RuleMatcher> for each SDAG pattern. NFC. Instead of emitting the matcher code directly, return the rule matcher and the skip reason as an Expected<RuleMatcher>. This will let us record all matchers and process them before emission. It's a somewhat unconventional use of Error, but it's nicer than, say, std::pair, because of the bool conversions. Differential Revision: https://reviews.llvm.org/D29743 llvm-svn: 294706	2017-02-10 04:00:17 +00:00
Matthias Braun	ef21cb2d95	SubtargetFeature: Increase MAX_SUBTARGET_FEATURES The ARM target is getting really close to the current limit of 128 subtarget features already breaking out of tree enhancements. Increase the size once more to 196. I filed http://llvm.org/PR31926 to request a proper solution. llvm-svn: 294704	2017-02-10 03:48:50 +00:00
Eric Christopher	42b9248803	For X86-64 linux and PPC64 linux align int128 to 16 bytes. For other platforms we should find out what they need and likely make the same change, however, a smaller additional change is easier for platforms we know have it specified in the ABI. As part of this rewrite some of the handling in the backends for data layout and update a bunch of testcases. Based on a patch by Simonas Kazlauskas! llvm-svn: 294702	2017-02-10 03:32:21 +00:00
Quentin Colombet	21136c0273	[TableGen][AsmWriterEmitter] Use a deterministic order to sort InstrAliases Inside an alias group, when ordering instruction aliases, we rely on the priority field to sort them. When the priority is not set or more generally when there is a tie between two aliases, we used to rely on the lexicographic order. However, this order can change for the anonymous records when more instructions, intrinsics, etc. are inserted. For instance, given two anonymous records r1 and r2 with respective names A_999 and A_1000, their lexicographic order will be r2 then r1. Now, if an instruction is added before them, their names will become respectively A_1000 and A_1001, thus the lexicographic order will be r1 then r2, i.e., it changed. If that happens in an alias group, the assembly output would prefer a different alias for no apparent good reason. A way to fix that is to use proper priority for all aliases, but we can also make the tie breaker comparison smarter and use a deterministic ordering. This is what this patch does. llvm-svn: 294695	2017-02-10 02:43:09 +00:00
Matt Arsenault	b4493e909f	AMDGPU: Fix trailing whitespace llvm-svn: 294694	2017-02-10 02:42:31 +00:00
Wei Ding	205bfdb3e9	AMDGPU : Add trap handler support. Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692	2017-02-10 02:15:29 +00:00
Stanislav Mekhanoshin	6dec24316b	[AMDGPU] Override PSet for M0 This change returns empty PSet list for M0 register. Otherwise its PSet as defined by tablegen is SReg_32. This results in incorrect register pressure calculation every time an instruction uses M0. Such uses count as SReg_32 PSet and inadequately increase pressure on SGPRs. Differential Revision: https://reviews.llvm.org/D29798 llvm-svn: 294691	2017-02-10 02:07:58 +00:00
Eric Fiselier	87c87f4c30	[CMake] Fix pthread handling for out-of-tree builds LLVM defines `PTHREAD_LIB` which is used by AddLLVM.cmake and various projects to correctly link the threading library when needed. Unfortunately `PTHREAD_LIB` is defined by LLVM's `config-ix.cmake` file which isn't installed and therefore can't be used when configuring out-of-tree builds. This causes such builds to fail since `pthread` isn't being correctly linked. This patch attempts to fix that problem by renaming and exporting `LLVM_PTHREAD_LIB` as part of `LLVMConfig.cmake`. I renamed `PTHREAD_LIB` because it seemed likely to cause collisions with downstream users of `LLVMConfig.cmake`. llvm-svn: 294690	2017-02-10 01:59:20 +00:00
Marcos Pividori	a0b23b8e63	[libFuzzer] Export external functions on tests. We need to export external functions so they are found when calling GetProcAddress() on Windows. But we can't use `__declspec(dllexport)` because we want the targets to be completely independent from the fuzz engines and don't depend on other header files. Also, we don't want to include platform specific code managed with conditional macros. So, the solution is to add the exported symbols with linker flags in cmake. Differential revision: https://reviews.llvm.org/D29752 llvm-svn: 294688	2017-02-10 01:40:28 +00:00
Marcos Pividori	0ae27e80b0	[libFuzzer] Use dynamic loading for External Functions on Windows. Replace weak aliases with dynamic loading. Weak aliases were generating some problems when linking for MT on Windows. For MT, compiler-rt's libraries are statically linked to the main executable the same way as libFuzzer, so if we use weak aliases, we are providing two different default implementations for the same weak function and the linker fails. In this diff I reimplement ExternalFunctions() using dynamic loading, so it works in both cases (MD and MT). Also, dynamic loading is simpler, since we are not defining any auxiliary external function, and we don't need to deal with weak aliases. This is equivalent to the implementation using dlsym(RTLD_DEFAULT, FnName) for Posix. Differential revision: https://reviews.llvm.org/D29751 llvm-svn: 294687	2017-02-10 01:35:46 +00:00
David L. Jones	e072cf51da	Update test/CodeGen/X86/sse-align-10.ll to use FileCheck instead of grep Patch by Jorge Gorbe (lethalantidote). Differential Revision: https://reviews.llvm.org/D29797 llvm-svn: 294686	2017-02-10 01:35:31 +00:00
Eugene Zelenko	4b6ff6b86e	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 294685	2017-02-10 01:33:54 +00:00
Michael J. Spencer	788b10ecbc	[LoadCombine] Change test to not use instcombine. llvm-svn: 294682	2017-02-10 00:44:08 +00:00
Dan Gohman	df4f4d45f0	[WebAssembly] Pass an MCContext to WebAssemblyMCCodeEmitter. NFC. llvm-svn: 294679	2017-02-10 00:14:42 +00:00
Matthias Braun	2bef2a08c0	Fix syntax error llvm-svn: 294678	2017-02-10 00:09:20 +00:00
Matthias Braun	62e1e8531b	ARMSubtarget.h: Change to one line per enum element; NFC Change syntax to have enum elements sorted alphabetically and one per line as that is more merge/cherry pick friendly. llvm-svn: 294677	2017-02-10 00:06:44 +00:00
Dan Gohman	23a543971e	[Support] Extend SLEB128 encoding support. Add support for padded SLEB128 values, and support for writing SLEB128 values to buffers rather than to ostreams, similar to the existing ULEB128 support. llvm-svn: 294675	2017-02-10 00:02:58 +00:00
Eric Christopher	e4b10f5d37	Add an additional set of braces to deal with subobject initialization. llvm-svn: 294674	2017-02-10 00:02:09 +00:00
Matthias Braun	f0cb2fdd74	docs/conf.py: Suppress sphinx highlighting failure warnings The pygments syntax highlighting package used by sphinx fails to parse newer LLVM constructs or valid (at least to me) gas constructs like `.secrel32 _function_name + 0`. Disable this particular warning so the build doesn't abort as fixing pygments doesn't seem a workable option here. Differential Revision: https://reviews.llvm.org/D29794 llvm-svn: 294672	2017-02-10 00:00:22 +00:00
Chandler Carruth	0ede22e1c0	[PM] Add Argument Promotion to the pass pipeline. This needs explicit requires of the optimization remark emission before loop pass pipelines containing LICM as we no longer get it from the inliner -- Argument Promotion may invalidate it. Technically the inliner could also have broken this, but it never came up in testing. Differential Revision: https://reviews.llvm.org/D29595 llvm-svn: 294670	2017-02-09 23:54:57 +00:00
Davide Italiano	fc0d442cf1	[NewGVN] Fix test so that it doesn't rely on InstCombine anymore. llvm-svn: 294668	2017-02-09 23:48:10 +00:00
Chandler Carruth	addcda483e	[PM] Port ArgumentPromotion to the new pass manager. Now that the call graph supports efficient replacement of a function and spurious reference edges, we can port ArgumentPromotion to the new pass manager very easily. The old PM-specific bits are sunk into callbacks that the new PM simply doesn't use. Unlike the old PM, the new PM simply does argument promotion and afterward does the update to LCG reflecting the promoted function. Differential Revision: https://reviews.llvm.org/D29580 llvm-svn: 294667	2017-02-09 23:46:27 +00:00
Peter Collingbourne	17febdbb25	WholeProgramDevirt: Check that VCP candidate functions are defined before evaluating them. This was crashing before. llvm-svn: 294666	2017-02-09 23:46:26 +00:00
Matthias Braun	d0d8daa37c	LowerMemIntrinsics: Fix include guard I hope this fixes the clang-stage2-cmake-modules jenkins build. llvm-svn: 294665	2017-02-09 23:43:28 +00:00
Chandler Carruth	1f8fcfeac5	[PM/LCG] Teach LCG to support spurious reference edges. Somewhat amazingly, this only requires teaching it to clean them up when deleting a dead function from the graph. And we already have exactly the necessary data structures to do that in the parent RefSCCs. This allows ArgPromote to work in a much simpler way by merely letting reference edges linger in the graph after the causing IR is deleted. We will clean up these edges when we run any function pass over the IR, but don't remove them eagerly. This avoids all of the quadratic update issues both in the current pass manager and in my previous attempt with the new pass manager. Differential Revision: https://reviews.llvm.org/D29579 llvm-svn: 294663	2017-02-09 23:30:14 +00:00
George Burgess IV	ccf11c2f9f	[ARM] Add support for armv7ve triple in llvm (PR31358). Gcc supports target armv7ve which is armv7-a with virtualization extensions. This change adds support for this in llvm for gcc compatibility. Also remove redundant FeatureHWDiv, FeatureHWDivARM for a few models as this is specified automatically by FeatureVirtualization. Patch by Manoj Gupta. Differential Revision: https://reviews.llvm.org/D29472 llvm-svn: 294661	2017-02-09 23:29:14 +00:00
Chandler Carruth	aaad9f84be	[PM/LCG] Teach the LazyCallGraph how to replace a function without disturbing the graph or having to update edges. This is motivated by porting argument promotion to the new pass manager. Because of how LLVM IR Function objects work, in order to change their signature a new object needs to be created. This is efficient and straightforward in the IR but previously was very hard to implement in LCG. We could easily replace the function a node in the graph represents. The challenging part is how to handle updating the edges in the graph. LCG previously used an edge to a raw function to represent a node that had not yet been scanned for calls and references. This was the core of its laziness. However, that model causes this kind of update to be very hard: 1) The keys to look up an edge need to be `Function`s that would all need to be updated when we update the node. 2) There will be some unknown number of edges that haven't transitioned from `Function` edges to `Node` edges. All of this complexity isn't necessary. Instead, we can always build a node around any function, always pointing edges at it and always using it as the key to look up an edge. To maintain the laziness, we need to sink the edges of a node into a secondary object and explicitly model transitioning a node from empty to populated by scanning the function. This design seems much cleaner in a number of ways, but importantly there is now exactly one place where the `Function` has to be updated! Some other cleanups that fall out of this include having something to model the entry edges more accurately. Rather than hand rolling parts of the node in the graph itself, we have an explicit `EdgeSequence` object that gives us exactly the functionality needed. We also have a consistent place to define the edge iterators and can use them for both the entry edges and the internal edges of the graph.
The API used to model the separation between a node and its edges is intentionally very thin as most clients are expected to deal with nodes that have populated edges. We model this exactly as an optional does with an additional method to populate the edges when that is a reasonable thing for a client to do. This is based on API design suggestions from Richard Smith and David Blaikie, credit goes to them for helping pick how to model this without it being either too explicit or too implicit. The patch is somewhat noisy due to shifting around iterator types and new syntax for walking the edges of a node, but most of the functionality change is in the `Edge`, `EdgeSequence`, and `Node` types. Differential Revision: https://reviews.llvm.org/D29577 llvm-svn: 294653	2017-02-09 23:24:13 +00:00
Dan Gohman	b6afd2070a	[WebAssembly] Refactor void return peephole using MaybeRewriteToFallthrough. NFC. llvm-svn: 294652	2017-02-09 23:19:03 +00:00
Sanjay Patel	f38bab73aa	[InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors This fold already existed for vectors but only when 'C1' was a splat constant (but 'C2' could be any constant). There were no tests for any vector constants, so I'm adding a test that shows non-splat constants for both operands. llvm-svn: 294650	2017-02-09 23:13:04 +00:00
Peter Collingbourne	cea1e4e79a	De-duplicate some code for creating an AARGetter suitable for the legacy PM. I'm about to use this in a couple more places. Differential Revision: https://reviews.llvm.org/D29793 llvm-svn: 294648	2017-02-09 23:11:52 +00:00
Hans Wennborg	f1e773cab5	Don't try to link to the 4.0 release notes llvm-svn: 294647	2017-02-09 23:03:34 +00:00
Matthias Braun	6717a0ba03	lit.rst: Fix sphinx complaint about multiple option definitions llvm-svn: 294646	2017-02-09 23:03:22 +00:00
Jonathan Roelofs	ebba0507da	[docs] Fix typo llvm-svn: 294645	2017-02-09 23:02:37 +00:00
Adrian McCarthy	d6e091dcc5	Fix build break from r294633. llvm-svn: 294642	2017-02-09 22:49:35 +00:00
Simon Pilgrim	7f0d7e08b2	[X86] Remove duplicate call to getValueType. NFCI. llvm-svn: 294640	2017-02-09 22:35:59 +00:00
Peter Collingbourne	ef089bdb4b	X86: Introduce relocImm-based patterns for cmp. Differential Revision: https://reviews.llvm.org/D28690 llvm-svn: 294636	2017-02-09 22:02:28 +00:00
Matt Arsenault	0699ef39ce	AMDGPU: Add pass to expand memcpy/memmove/memset llvm-svn: 294635	2017-02-09 22:00:42 +00:00
Peter Collingbourne	d7dd65ad7c	X86: Teach X86InstrInfo::analyzeCompare to recognize compares of symbols. This requires that we communicate to X86InstrInfo::optimizeCompareInstr that the second operand is neither a register nor an immediate. The way we do that is by setting CmpMask to zero. Note that there were already instructions where the second operand was not a register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero for those instructions. This seems like a latent bug, but I was unable to trigger it. Differential Revision: https://reviews.llvm.org/D28621 llvm-svn: 294634	2017-02-09 21:58:24 +00:00
Adrian McCarthy	0beb3323c5	Introduce NativeRawSymbol for PDB reading. This is a stub for a new concrete implementation of IPDBRawSymbol. Nothing uses this implementation yet. My plan is to locally switch lldb-pdbdump from the DIA reader to the Native one and flesh out the implementations of these method stubs in the order they're needed. llvm-svn: 294633	2017-02-09 21:51:19 +00:00
Michael J. Spencer	714d9d22ad	[LoadCombine] Fix combining of loads which span an aliasing store. Fixes PR31517 Differential Revision: https://reviews.llvm.org/D28922 llvm-svn: 294632	2017-02-09 21:46:49 +00:00
Peter Collingbourne	857aba4410	Rename LowerTypeTestsSummaryAction to PassSummaryAction. NFCI. I intend to use the same type with the same semantics in the WholeProgramDevirt pass. Differential Revision: https://reviews.llvm.org/D29746 llvm-svn: 294629	2017-02-09 21:45:01 +00:00
Sanjay Patel	ae3b43e488	[InstCombine] use m_APInt to allow demanded bits analysis on splat constants llvm-svn: 294628	2017-02-09 21:43:06 +00:00
Konstantin Zhuravlyov	fd87137710	[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement Differential Revision: https://reviews.llvm.org/D29741 llvm-svn: 294627	2017-02-09 21:33:23 +00:00
Sanjay Patel	5bcb2d97f0	[InstCombine] add test for demanded bits with splat vector constants; NFC llvm-svn: 294625	2017-02-09 21:33:19 +00:00
Tom Stellard	34fc95bb6f	CODE_OWNERS: Update email address Also clean up description. llvm-svn: 294624	2017-02-09 21:29:12 +00:00
Daniel Berlin	73ad5cb9b1	Drop graph_ prefix llvm-svn: 294621	2017-02-09 20:37:46 +00:00
Daniel Berlin	58a6e57394	GraphTraits: Add range versions of graph traits functions (graph_nodes, graph_children, inverse_graph_nodes, inverse_graph_children). Summary: Convert all obvious node_begin/node_end and child_begin/child_end pairs to range based for. Sending for review in case someone has a good idea how to make graph_children able to be inferred. It looks like it would require changing GraphTraits to be two argument or something. I presume inference does not happen because it would have to check every GraphTraits in the world to see if the noderef types matched. Note: This change was 3-staged with clang as well, which uses Dominators/etc from LLVM. Reviewers: chandlerc, tstellarAMD, dblaikie, rsmith Subscribers: arsenm, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D29767 llvm-svn: 294620	2017-02-09 20:37:24 +00:00
Saleem Abdulrasool	864bd176a6	test: adjust the test for the BSD format The padding for ld64 changes the header to include the padding. Adjust the test to account for this. llvm-svn: 294619	2017-02-09 20:06:30 +00:00
Frederic Riss	1488766bdf	[dsymutil] Fix handling of empty CUs in LTO links. r288399 introduced the DIEUnit class, and in the process broke the corner case where dsymutil generates an empty CU during an LTO link. This restores the logic and adds a test for the corner case. llvm-svn: 294618	2017-02-09 19:41:55 +00:00
Sanjoy Das	74bda4d591	[JumpThreading] Thread through guards Summary: This patch allows JumpThreading to also thread through guards. Virtually, guard(cond) is equivalent to the following construction: if (cond) { do something } else {deoptimize} Yet it is not explicitly converted into IFs before lowering. This patch enables early threading through guards in simple cases. Currently it covers the following situation: if (cond1) { // code A } else { // code B } // code C guard(cond2) // code D If there is implication cond1 => cond2 or !cond1 => cond2, we can transform this construction into the following: if (cond1) { // code A // code C } else { // code B // code C guard(cond2) } // code D Thus, removing the guard from one of the execution branches. Patch by Max Kazantsev! Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29620 llvm-svn: 294617	2017-02-09 19:40:22 +00:00
Vedant Kumar	1677631546	[utils] coverage: Add help text about the --restrict flag (NFC) Passing the --restrict flag to the coverage prep script before other positional arguments is wrong, because it prevents the argparse module from telling apart arguments to --restrict versus positional arguments. Pointed out by Sean Callanan! llvm-svn: 294616	2017-02-09 19:37:18 +00:00
Saleem Abdulrasool	111cd669e9	Object: pad out BSD archive members to 8-bytes ld64 requires its archive members to be 8-byte aligned for 64-bit content and 4-byte aligned for 32-bit content. Opt for the larger alignment requirement. This ensures that ld64 can consume archives generated by llvm-ar. Thanks to Kevin Enderby for the hint about the ld64/cctools behaviours! Resolves PR28361! llvm-svn: 294615	2017-02-09 19:29:35 +00:00
Simon Pilgrim	e0b5c2acbd	Convert to for-range loop. NFCI. llvm-svn: 294610	2017-02-09 18:52:24 +00:00
Geoff Berry	7e320c2485	[SelectionDAG] Fix bugs in inverted condition splitting code. Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605	2017-02-09 18:28:17 +00:00
Chris Bieneman	84ad1f8514	[CMake] Fix standalone project builds broken in r294514 This patch sets the global property indicating that target registration is complete for standalone sub-project builds. llvm-svn: 294602	2017-02-09 18:14:12 +00:00
Sanjay Patel	b36e1f0223	[InstCombine] add tests for icmp with add nsw; NFC llvm-svn: 294601	2017-02-09 18:12:39 +00:00
Kevin Enderby	5879a48c17	Tweak the implementation of llvm-objdump’s -objc-meta-data option so that it works when the ObjC metadata sections end up in the __DATA_CONST or __DATA_DIRTY segments. rdar://26315238 llvm-svn: 294599	2017-02-09 17:56:26 +00:00
Simon Pilgrim	b25f60210f	[X86][BMI2] Regenerate mulx tests llvm-svn: 294598	2017-02-09 17:54:51 +00:00
Simon Pilgrim	6bf1bd3ed6	[X86][MMX] Remove the (long time) unused MMX_PINSRW ISD opcode. llvm-svn: 294596	2017-02-09 17:08:47 +00:00
Kostya Kortchinsky	3b39934444	[docs] Documentation update for Scudo Summary: Documentation update to reflect the changes that occurred in the allocator: - additional architectures support; - modification of the header; - options default values for 32 & 64-bit. Reviewers: kcc, alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29592 llvm-svn: 294595	2017-02-09 16:07:52 +00:00
Saleem Abdulrasool	d3faeaf8a2	Object: add a comment explaining a divergence Add a note about the reason for the divergence from the specification for ld64. Addresses post-commit review comments from Davide. NFC. llvm-svn: 294594	2017-02-09 15:47:58 +00:00
David Bozier	93e773e9be	Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593	2017-02-09 15:40:14 +00:00
Artur Pilipenko	0e4583b56c	Add DAGCombiner load combine tests for partially available values If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift. Currently we don't support it, so the tests check the current codegen without load combine. This change will make the patch to support this kind of combine a bit more clear. llvm-svn: 294591	2017-02-09 15:13:40 +00:00
David Bozier	6a44b7c2eb	[Stack Protection] Add diagnostic information for why stack protection was applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590	2017-02-09 15:08:40 +00:00
Rafael Espindola	dc1c3011fd	Make it possible to set SHF_LINK_ORDER explicitly. This will make it possible to add support for gcing user metadata (asan for example). llvm-svn: 294589	2017-02-09 14:59:20 +00:00
Pierre Gousseau	6953b32475	[X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math. In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants. Differential Revision: https://reviews.llvm.org/D29756 llvm-svn: 294588	2017-02-09 14:43:58 +00:00
Simon Pilgrim	05ac1f70be	[X86][SSE] Added extra FMA/NO-FMA reciprocal test cases for D26855 Test for expected codegen for nr reciprocal cases with/without FMA llvm-svn: 294587	2017-02-09 14:14:06 +00:00
David Bozier	9126f54285	[docs] cleanup documentation on lit substitutions 1. Added missing substitutions to the documentation in docs/TestingGuide.rst 2. Modified docs/CommandGuide/lit.rst to only document the "base" set of substitutions and to refer the reader to docs/TestingGuide.rst for more detailed info on substitutions. Patch by bd1976llvm Differential Revision: https://reviews.llvm.org/D29281 llvm-svn: 294586	2017-02-09 14:12:30 +00:00
Diana Picus	7232af352f	[ARM] GlobalISel: Lower single precision FP args Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets because we don't support libcalls yet. llvm-svn: 294584	2017-02-09 13:09:59 +00:00
Artur Pilipenko	4a64031954	[DAGCombiner] Support non-zero offset in load combine Enable folding patterns which load the value from non-zero offset: i8 *a = ... i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24) => i32 val = *((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582	2017-02-09 12:06:01 +00:00
Simon Pilgrim	563e23e66e	[X86][SSE] Attempt to break register dependencies during lowerBuildVector LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into a UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register. This patch attempts to break the register dependency by either always zeroing the vector before hand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))) which lowers to (V)MOVD and performs a similar function. Additionally (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD. On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros to avoid the vector zeroing at the cost of a scalar zero extension, which can probably be brought over to the other cases in a future patch in some cases (load folding etc.) Differential Revision: https://reviews.llvm.org/D29720 llvm-svn: 294581	2017-02-09 11:50:19 +00:00
Vitaly Buka	9987d98370	LVI: Fix use-of-uninitialized-value after r294463 BlockValueStack can be reallocated making reference e invalid. llvm-svn: 294572	2017-02-09 09:28:05 +00:00
Igor Breger	ed43f15637	Add new tests for EXTRACT_VECTOR_ELT (vector of packed i8/16/i32/i64/ps/pd data) llvm-svn: 294565	2017-02-09 07:39:19 +00:00
Craig Topper	3cac763532	[X86] Remove the HLE feature flag. We only implemented it for one of the 3 HLE instructions and that instruction is also under the RTM flag. Clang only implements the RTM flag from its command line. llvm-svn: 294562	2017-02-09 06:51:02 +00:00
Craig Topper	86576bd921	[X86] Remove INVPCID and SMAP feature flags. They aren't currently used by any instructions and not tested. If we implement intrinsics for their instructions in the future, the feature flags can be added back with proper testing. llvm-svn: 294561	2017-02-09 06:50:59 +00:00
Craig Topper	50f3d1452c	[X86] Clzero intrinsic and its addition under znver1 This patch does the following. 1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero 2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1) 3. Adds the clzero feature under znver1 architecture. 4. The custom inserter is added in Lowering. 5. A testcase is added to check the intrinsic. 6. The clzero instruction is added to assembler test. Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me. Differential revision: https://reviews.llvm.org/D29385 llvm-svn: 294558	2017-02-09 04:27:34 +00:00
Saleem Abdulrasool	b4a162be21	Object: pad BSD ar string table to 4-bytes cctools would pad the string table to a sizeof(int32_t) (explicitly printed out by cctools rather than 4). This adjusts the string table to make it more compatible with cctools, but is insufficient to make ld64 happy. llvm-svn: 294557	2017-02-09 04:26:21 +00:00
Ahmed Bougacha	6a1ac5a380	[GlobalISel] Simplify StringRef parameters. NFC. 'const' on StringRef parameters adds no guarantees. Remove it. llvm-svn: 294555	2017-02-09 02:50:01 +00:00
Arnold Schwaighofer	26f016f143	SwiftCC: swifterror register cannot be as the base register Functions that have a dynamic alloca require a base register which is defined to be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be the same register. Use a different callee save register for swifterror instead: X21 on AArch64 R8 on ARM rdar://30433803 llvm-svn: 294551	2017-02-09 01:52:17 +00:00
Peter Collingbourne	58c90c0c80	LowerTypeTests: Change a few vtable globals in tests to constants. It turns out that some of our negative tests were not in fact providing the test coverage we expected: they were passing because the vtables were failing an early check that they were constant. Fix this by changing the globals in these tests to constants. llvm-svn: 294550	2017-02-09 01:48:24 +00:00
Eugene Zelenko	44d951226e	[MC] Fix some Clang-tidy modernize and Include What You Use warnings in SubtargetFeature; other minor fixes (NFC). Same changes in files affected by reduced SubtargetFeature.h dependencies. llvm-svn: 294548	2017-02-09 01:09:54 +00:00
Wolfgang Pieb	458b4e7c46	Reapply r294356 ("Keep track of spilled variables in LiveDebugValues"). Was reverted with r294447 due to undefined behavior with negative offsets in DBG_VALUE instructions. llvm-svn: 294532	2017-02-08 23:46:59 +00:00
Tim Northover	e041841811	GlobalISel: legalize G_FPOW to a libcall on AArch64. There's no instruction to implement it. llvm-svn: 294531	2017-02-08 23:23:39 +00:00
Tim Northover	b38b4e2464	GlobalISel: translate @llvm.pow intrinsic to G_FPOW. It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it so we'd just as well enable that flexibility from the start. llvm-svn: 294530	2017-02-08 23:23:32 +00:00
Mike Aizatsky	4705ae936d	[sancov] using comdat only when it is enabled Differential Revision: https://reviews.llvm.org/D29733 llvm-svn: 294529	2017-02-08 23:12:46 +00:00
Arnold Schwaighofer	db7bbcbe78	[ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register than the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implicit def of x8 rdar://30425845 llvm-svn: 294527	2017-02-08 22:30:47 +00:00
Eugene Zelenko	3d8b0ebb68	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 294526	2017-02-08 22:23:19 +00:00
Eugene Zelenko	41e7e34c49	[ARM] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MC headers dependencies. llvm-svn: 294525	2017-02-08 22:19:56 +00:00
Sanjay Patel	a62bc44f67	[InstCombine] add tests to show information-losing add nsw/nuw transforms; NFC llvm-svn: 294524	2017-02-08 22:14:11 +00:00
Amara Emerson	c3a4b282bb	Revert r294437 as it broke an asan buildbot. llvm-svn: 294523	2017-02-08 21:41:16 +00:00
Tim Northover	9dd78f8a6d	GlobalISel: select G_[SU]MULH on AArch64. Hopefully this'll be nuked by tablegen pretty soon, but until then it's reasonably important for supporting C++ operator new[]. llvm-svn: 294520	2017-02-08 21:22:25 +00:00
Tim Northover	0a9b27933a	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. llvm-svn: 294519	2017-02-08 21:22:15 +00:00
Stanislav Mekhanoshin	4a24705dd6	[AMDGPU] Implement register pressure callbacks Implement getRegPressureLimit and getRegPressureSetLimit callbacks in SIRegisterInfo. This makes the standard converge scheduler behave almost the same as GCNScheduler, sometimes slightly better, sometimes a bit worse. In general it is also possible to switch GCNScheduler to use these callbacks instead of getMaxWaves(), which also makes GCNScheduler slightly better on some tests and slightly worse on others. A big win is the behavior with the converge scheduler. Note, these are used not only by scheduling, but in places like MachineLICM. Differential Revision: https://reviews.llvm.org/D29700 llvm-svn: 294518	2017-02-08 21:22:03 +00:00
Mike Aizatsky	401d369328	[sancov] specifying comdat for sancov constructors Differential Revision: https://reviews.llvm.org/D29662 llvm-svn: 294517	2017-02-08 21:20:33 +00:00
Peter Collingbourne	40540a43b2	Take code ownership of LLVM bitcode. llvm-svn: 294516	2017-02-08 21:16:27 +00:00
Chris Bieneman	e7a982040b	[CMake] Fix `is_llvm_target_library` and support out-of-order components Summary: This patch is required by D28855, and enables us to rely on CMake's ability to handle out of order target dependencies. Reviewers: mgorny, chapuni, bryant Subscribers: llvm-commits, jgosnell Differential Revision: https://reviews.llvm.org/D28869 llvm-svn: 294514	2017-02-08 20:58:37 +00:00
Hans Wennborg	cb1aab6ed9	build_llvm_package.bat: Build the clang-format plugin separately In r293373 we switched the build to linking dynamically against the Universal CRT and include the redistributables in the installer. However, clang-format.exe is copied into the vsix and needs to be statically linked. This commit makes us build the plugin in a separate step that uses static linking. llvm-svn: 294513	2017-02-08 20:58:33 +00:00
Peter Collingbourne	28ffd3261f	ThinLTOBitcodeWriter: Strip debug info from merged module. This module will contain nothing but vtable definitions and (soon) available_externally function definitions, so there is no point in keeping debug info in the module. Differential Revision: https://reviews.llvm.org/D28913 llvm-svn: 294511	2017-02-08 20:44:00 +00:00
Alexey Bataev	0674fe39e5	[SLP] Additional test to check correct work of horizontal reductions, NFC. llvm-svn: 294505	2017-02-08 19:52:46 +00:00
Elena Demikhovsky	5267edd3e3	[Loop Vectorizer] Cost-based decision for vectorization form of memory instruction. Make the cost model select between Interleave, GatherScatter or Scalar vectorization form of memory instruction. The right decision should be made for non-consecutive memory access instructions that may have more than one vectorization solution. This patch includes the following changes: - Cost Model calculates the cost of Load/Store vector form and chooses the better option between Widening, Interleave, GatherScatter and Scalarization. Cost Model keeps the widening decision. - Arrays of Uniform and Scalar values are moved from Legality to Cost Model. - Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loads/Stores vectorization form. - Vectorization of memory instruction is performed according to the CM decision. Differential Revision: https://reviews.llvm.org/D27919 llvm-svn: 294503	2017-02-08 19:25:23 +00:00
Simon Dardis	2e8cdbd795	[DebugInfo] Rename EmitDebugValue to EmitDebugThreadLocal (NFC) As pointed out by David Blaikie in the post commit review of r292624, EmitDebugValue should be called EmitDebugThreadLocal. llvm-svn: 294500	2017-02-08 19:03:46 +00:00
Simon Pilgrim	696e27e1ec	[X86][SSE] Regenerate scalar integer conversions to float tests llvm-svn: 294499	2017-02-08 19:01:27 +00:00
Reid Kleckner	e332a5b670	Fix inline-asm-diags.ll on Windows, give it a triple to avoid WoA thumb confusion llvm-svn: 294496	2017-02-08 18:17:21 +00:00
Saleem Abdulrasool	dea14b2902	llvm-objdump: make NoLeadingAddr work on more than just MachO Support printing the disassembly without the address on all formats rather than making it MachO specific. Patch by Jeff Muizelaar! llvm-svn: 294495	2017-02-08 18:11:31 +00:00
Artur Pilipenko	045ab08252	[DAGCombiner] NFC. Mark ByteProvider accessors as const llvm-svn: 294494	2017-02-08 17:59:34 +00:00
Tim Northover	e9600d861c	GlobalISel: select G_VASTART on iOS AArch64. The AAPCS ABI is substantially more complicated so that's coming in a separate patch. For now we can generate correct code for iOS though. llvm-svn: 294493	2017-02-08 17:57:27 +00:00
Tim Northover	f19d467ff6	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. llvm-svn: 294492	2017-02-08 17:57:20 +00:00
Matt Arsenault	560665250f	NVPTX: Extract mem intrinsic expansions into utilities llvm-svn: 294490	2017-02-08 17:49:52 +00:00
Chad Rosier	e22c992ba9	[Reassociate] Remove an unused argument. NFC. llvm-svn: 294489	2017-02-08 17:45:27 +00:00
Adrian Prantl	a5bf2d7003	Fix bitcode upgrade for DIGlobalVariables with a var: field. This is a follow-up to https://reviews.llvm.org/D29349. It turns out that NeedUpgradeToDIGlobalVariableExpression is always necessary when we encountered a version==0 record because it may always be referenced via a list of globals in a DICompileUnit. My tests weren't good enough to catch this though. To trigger this case, we need much older bitcode produced by LLVM around version 3.7. <rdar://problem/30404262> Differential Revision: https://reviews.llvm.org/D29693 llvm-svn: 294488	2017-02-08 17:44:43 +00:00
Sanjay Patel	d11a03b263	[InstCombine] add test for missed vector icmp fold; NFC Also, move the related existing scalar test to a renamed file where I'm planning to add more icmp-add tests. llvm-svn: 294487	2017-02-08 17:37:17 +00:00
Sanne Wouda	fc674bcb12	Move inline asm diags tests to an ARM directory. The assembler syntaxes (and parsers) differ too much to expect this test to pass for all of them. llvm-svn: 294475	2017-02-08 16:48:35 +00:00
Krzysztof Parzyszek	1df58fc2f9	[Hexagon] Fix decoding conflict between A2_zxtb and A4_ext llvm-svn: 294472	2017-02-08 16:31:00 +00:00
Simon Dardis	3c82a64636	[mips] MUL macro variations Adds support for MUL macro variations. Patch by: Srdjan Obucina Reviewers: zoran.jovanovic, vkalintiris, dsanders, sdardis, obucina, seanbruno Differential Revision: https://reviews.llvm.org/D16807 llvm-svn: 294471	2017-02-08 16:25:05 +00:00
Sanjay Patel	6dd2eae76a	[InstCombine] add local name for repeated calls; NFC llvm-svn: 294470	2017-02-08 16:19:36 +00:00
Sanne Wouda	7e101936b6	Fix inline asm diagnostics test. Don't depend on X86 everywhere. Fix the original problem with a reg-exp for the column number. llvm-svn: 294468	2017-02-08 16:14:01 +00:00
Daniel Berlin	9c92a469b4	LVI: Add a per-value worklist limit to LazyValueInfo. Summary: LVI is now depth first, which is optimal for iteration strategy in terms of work per call. However, the way the results get cached means it can still go very badly N^2 or worse right now. The overdefined cache is per-block, because LVI wants to try to get different results for the same name in different blocks (IE solve the problem PredicateInfo solves). This means even if we discover a value is overdefined after going very deep, it doesn't cache this information, causing it to end up trying to rediscover it again and again. The same is true for values along the way. In practice, overdefined anywhere should mean overdefined everywhere (this is how, for example, SCCP works). Until we get around to reworking the overdefined cache, we need to limit the worklist size we process. Note that permanently reverting the DFS strategy exploration seems the wrong strategy (temporarily seems fine if we really want). BFS is clearly the wrong approach, it just gets luckier on some testcases. It's also very hard to design an effective throttle for BFS. For DFS, the throttle is directly related to the depth of the CFG. So really deep CFGs will get cut off, smaller ones will not. As the CFG simplifies, you get better results. In BFS, the limit is related to the fan-out times average block size, which is harder to reason about or make good choices for. Bug being filed about the overdefined cache, but it will require major surgery to fix it (plumbing predicateinfo through CVP or LVI). Note: I did not make this number configurable because i'm not sure anyone really needs to tweak this knob. We run CVP 3 times. On the testcases i have the slow ones happen in the middle, where CVP is doing cleanup work other things are effective at. Over the course of 3 runs, we don't seem to have any real loss of performance.
I haven't gotten a minimized testcase yet, but just imagine in your head a testcase where, going up the CFG, you have branches, one of which leads 50000 blocks deep, and the other, to something where the answer is overdefined immediately. BFS would discover the overdefined faster than DFS, but do more work to do so. In practice, the right answer is "once DFS discovers overdefined for a value, stop trying to get more info about that value" (and so, DFS would normally cache the overdefined results for every value it passed through in those 50k blocks, and never do that work again. But it don't, because of the naming problem) Reviewers: chandlerc, djasper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29715 llvm-svn: 294463	2017-02-08 15:22:52 +00:00
Sanjay Patel	28ef27e3dc	[x86] add AVX512vl target for more coverage; NFC llvm-svn: 294462	2017-02-08 15:22:52 +00:00
Igor Laevsky	a9b6872908	[InstCombineCalls] Fix buildbot failures after r294453. Some targets don't support uint64_t options. Change type to unsigned. Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294461	2017-02-08 15:21:48 +00:00
Sanne Wouda	2933875cc2	[Assembler] Enable nicer diagnostics for inline assembly. Fixed test. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr::AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294458	2017-02-08 14:48:05 +00:00
Simon Pilgrim	dcd10344a3	[X86][SSE] Tidyup LowerBuildVectorv16i8 and LowerBuildVectorv8i16. NFCI. Run clang-format and standardized variable names between functions. llvm-svn: 294456	2017-02-08 14:44:45 +00:00
Amaury Sechet	887117fb3d	Add test case for pr31890. NFC llvm-svn: 294455	2017-02-08 14:35:48 +00:00
Konstantin Zhuravlyov	b5acb8ec47	[AMDGPU][NFC] Assign IsaInfo to reference variable in order to shorten long lines llvm-svn: 294454	2017-02-08 14:34:10 +00:00
Igor Laevsky	900ffa34c8	[InstCombineCalls] Unfold element atomic memcpy instruction Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294453	2017-02-08 14:32:04 +00:00
Igor Laevsky	4b317fa24e	[InstCombineCalls] Remove zero length atomic memcpy intrinsics Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294452	2017-02-08 14:23:47 +00:00
Diana Picus	e79e5ee244	Fix test to work on swift/cyclone too. I forgot to remove the neonfp target feature from the test, which means we'd have trouble selecting VADDS on targets that have neonfp enabled by default. llvm-svn: 294451	2017-02-08 14:23:30 +00:00
Konstantin Zhuravlyov	9f89ede107	[AMDGPU] Add target information that is required by tools to metadata Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449	2017-02-08 14:05:23 +00:00
Diana Picus	79add417b4	Revert "[Assembler] Enable nicer diagnostics for inline assembly." This reverts commit r294433 because it seems it broke the buildbots. llvm-svn: 294448	2017-02-08 14:02:16 +00:00

144857 Commits