llvm-project

Commit Graph

Author	SHA1	Message	Date
Elena Demikhovsky	f58f838495	Changed basic cost of store operation on X86 Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unrolling. This change compensates for the performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks. Differential Revision: https://reviews.llvm.org/D35888 llvm-svn: 311285	2017-08-20 12:34:29 +00:00
Aditya Kumar	a525fffd07	[Loop Vectorize] Added a separate metadata Added a separate metadata to indicate when the loop has already been vectorized instead of setting width and count to 1. Patch written by Divya Shanmughan and Aditya Kumar Differential Revision: https://reviews.llvm.org/D36220 llvm-svn: 311281	2017-08-20 10:32:41 +00:00
Igor Breger	88a3d5c855	[GlobalISel][X86] Support call ABI. Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported. Reviewers: zvi, guyblank, oren_ben_simhon Reviewed By: oren_ben_simhon Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34602 llvm-svn: 311279	2017-08-20 09:25:22 +00:00
Igor Breger	b3a860a5e8	[GlobalISel][X86] Support asymmetric copy from/to GPR physical register. Usually this case is generated by ABI lowering; it requires performing truncate/anyext. llvm-svn: 311278	2017-08-20 07:14:40 +00:00
Alex Bradbury	21c8adf50b	[RISCV] Trivial whitespace fix in RISCVInstPrinter llvm-svn: 311277	2017-08-20 06:58:43 +00:00
Alex Bradbury	e45186d43f	[RISCV] Fix two abuses of llvm_unreachable Replace with report_fatal_error. llvm-svn: 311276	2017-08-20 06:57:27 +00:00
Alex Bradbury	dd83484ab4	[RISCV] Set HasRelocationAddend for RISCVELFObjectWriter llvm-svn: 311275	2017-08-20 06:55:14 +00:00
Sam Elliott	7fe0aaa140	Revert "Emit only A Single Opt Remark When Inlining" Reverting due to clang build failure llvm-svn: 311274	2017-08-20 06:55:10 +00:00
Sam Elliott	785dd75369	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311273	2017-08-20 06:43:34 +00:00
Igor Breger	d88dfd32f8	[GlobalIsel] Fix undefined behavior if Action is not set (release); it also crashes in debug mode. Differential Revision: https://reviews.llvm.org/D34978 llvm-svn: 311272	2017-08-20 06:26:22 +00:00
Sam Elliott	b0c9753691	Keep Optimization Remark Yaml in NewPM Summary: The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds. So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes. Fixes https://bugs.llvm.org/show_bug.cgi?id=33951 Reviewers: anemet, chandlerc Reviewed By: anemet, chandlerc Subscribers: javed.absar, chandlerc, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D36906 llvm-svn: 311271	2017-08-20 01:30:45 +00:00
Chandler Carruth	9ef881efab	[x86] Fix an even stranger corner case where we have multiple levels of cmov self-referencing. Pointed out by Amjad Aboud in code review, test case minorly simplified from the one he posted. llvm-svn: 311267	2017-08-19 23:35:50 +00:00
Craig Topper	4de6f583da	[X86] Merge all of the vecload and alignedload predicates into single predicates. We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size. This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur. llvm-svn: 311266	2017-08-19 23:21:22 +00:00
Craig Topper	afa69eecbb	[X86] Converge alignedstore/alignedstore256/alignedstore512 to a single predicate. We can read the memoryVT and get its store size directly from the SDNode to check its alignment. llvm-svn: 311265	2017-08-19 23:21:21 +00:00
Craig Topper	a0319bb434	[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation. llvm-svn: 311263	2017-08-19 22:02:02 +00:00
Victor Leschuk	ee3292c5e4	Set init value for ScalarEvolution::BackedgeTakenInfo::MaxOrZero Otherwise it can be used uninitialized in move ctor. llvm-svn: 311262	2017-08-19 21:05:08 +00:00
Martin Storsjo	d606d2a643	[ARM] Factorize the calculation of WhichResult in isV*Mask. NFC. Differential Revision: https://reviews.llvm.org/D36930 llvm-svn: 311260	2017-08-19 20:26:51 +00:00
Martin Storsjo	91522ffa12	[ARM] Check the right order for halves of VZIP/VUZP if both parts are used This is the exact same fix as in SVN r247254. In that commit, the fix was applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue is present for VZIP/VUZP as well. This fixes PR33921. Differential Revision: https://reviews.llvm.org/D36899 llvm-svn: 311258	2017-08-19 19:47:48 +00:00
Teresa Johnson	b225ad05af	Fix bot failures by requiring x86 target The tests added in r311254 require a target triple since they are running through code generation. Fix bot failures by requiring an x86 target. llvm-svn: 311257	2017-08-19 19:15:04 +00:00
Konstantin Zhuravlyov	89377c440c	AMDGPU/NFC: Reorder functions in SIMemoryLegalizer: - Move load functions before atomic functions - Move store functions before atomic functions llvm-svn: 311256	2017-08-19 18:44:27 +00:00
Jatin Bhateja	6b4c205685	[DAGCombiner] Extending pattern detection for vector shuffle. Summary: If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Reviewers: zvi, delena, RKSimon, thakis Reviewed By: RKSimon Subscribers: chandlerc, eladcohen, llvm-commits Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 311255	2017-08-19 18:08:59 +00:00
Teresa Johnson	73305f82e9	[ThinLTO] Fix ThinLTO crash Summary: Follow up to fix in r311023, which fixed the case where the combined index is written to disk. The same samplePGO logic exists for the in-memory index when computing imports, so we need to filter out GlobalVariable summaries there too. Reviewers: davidxl Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D36919 llvm-svn: 311254	2017-08-19 18:04:25 +00:00
Craig Topper	6e70f7cd33	[X86] Remove an unnecessary alignment restriction from MOVDDUP pattern. The SSE MOVDDUP instruction only loads 64-bits with no alignment restriction. llvm-svn: 311253	2017-08-19 18:02:28 +00:00
Jatin Bhateja	66f7958e91	Revert rL311247 : To rectify commit message. Summary: This reverts commit rL311247. Differential Revision: https://reviews.llvm.org/D36927 llvm-svn: 311252	2017-08-19 17:59:58 +00:00
Jatin Bhateja	6f0d0d23b0	Merge branch 'arcpatch-D35788' llvm-svn: 311247	2017-08-19 17:00:04 +00:00
Jatin Bhateja	1c56863739	Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase." Summary: This reverts commit rL311242. Differential Revision: https://reviews.llvm.org/D36924 llvm-svn: 311246	2017-08-19 16:40:06 +00:00
Jatin Bhateja	313f97dd84	Extension of shuffle vector pattern detection, updating post rebase. llvm-svn: 311242	2017-08-19 15:58:36 +00:00
Victor Leschuk	ee7d232a41	revert failing test llvm-svn: 311238	2017-08-19 12:24:41 +00:00
Victor Leschuk	ba0954c4e2	Add temporary test to verify that win10 builder hangs on error llvm-svn: 311236	2017-08-19 12:02:39 +00:00
Victor Leschuk	59dc64f3af	Temporarily mark lit :: shtest-format as unsupported on windows When run manually it fails, but when run under buildbot it causes a hang. llvm-svn: 311230	2017-08-19 07:58:07 +00:00
Chandler Carruth	4f3aa29a46	[Inliner] Fix a nasty bug when inlining a non-recursive trace of a function into itself. We tried to fix this before in r306495 but that got reverted as the assert was actually hit. This fixes the original bug (which we seem to have lost track of with the revert) by blocking a second remapping when the function being inlined is also the caller and the remapping could succeed but erroneously. The included test case would actually load from an inlined copy of the alloca before this change, failing to load the stored value and miscompiling. Many thanks to Richard Smith for diagnosing a user miscompile to this bug, and to Kyle for the first attempt and initial analysis and David Li for remembering the issue and how to fix it and suggesting the patch. I'm just stitching it together and landing it. =] llvm-svn: 311229	2017-08-19 06:56:11 +00:00
Chandler Carruth	2a80fddf67	[Inliner] Clean up a test case a bit to make it more clear what is being tested and why. llvm-svn: 311228	2017-08-19 06:06:44 +00:00
Chandler Carruth	1f8212597d	[SLP] Fix an unused variable warning in non-asserts builds. llvm-svn: 311227	2017-08-19 05:06:23 +00:00
Chandler Carruth	93a645525c	[x86] Teach the cmov converter to aggressively convert cmovs with memory operands into control flow. We have seen periodically performance problems with cmov where one operand comes from memory. On modern x86 processors with strong branch predictors and speculative execution, this tends to be much better done with a branch than cmov. We routinely see cmov stalling while the load is completed rather than continuing, and if there are subsequent branches, they cannot be speculated in turn. Also, in many (even simple) cases, macro fusion causes the control flow version to be fewer uops. Consider the IACA output for the initial sequence of code in a very hot function in one of our internal benchmarks that motivates this, and notice the micro-op reduction provided. Before, SNB: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.20 Cycles Throughput Bottleneck: Port1 \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1 \| \| 1.0 \| \| \| \| \| CP \| mov rcx, rdi \| 0* \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| 0.1 \| 0.6 \| 0.5 0.5 \| 0.5 0.5 \| \| 0.4 \| CP \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| \| mov rax, qword ptr [rsi] \| 3 \| 1.8 \| 0.6 \| \| \| \| 0.6 \| CP \| cmovbe rax, rdi \| 2^ \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| 1.0 \| \| cmp byte ptr [rcx+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| jb 0xf Total Num Of Uops: 9 ``` After, SNB: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.00 Cycles Throughput Bottleneck: Port5 \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1 \| 0.5 \| 0.5 \| \| \| \| \| \| mov rax, rdi \| 0* \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| 0.5 \| 0.5 \| 1.0 1.0 \| \| \| \| \| cmp byte 
ptr [rsi+0xf], 0xf \| 1 \| 0.5 \| 0.5 \| \| \| \| \| \| mov ecx, 0x0 \| 1 \| \| \| \| \| \| 1.0 \| CP \| jnbe 0x39 \| 2^ \| \| \| \| 1.0 1.0 \| \| 1.0 \| CP \| cmp byte ptr [rax+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| jnb 0x3c Total Num Of Uops: 7 ``` The difference even manifests in a throughput cycle rate difference on Haswell. Before, HSW: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.00 Cycles Throughput Bottleneck: FrontEnd \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| 6 \| 7 \| \| --------------------------------------------------------------------------------- \| 0* \| \| \| \| \| \| \| \| \| \| mov rcx, rdi \| 0* \| \| \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| 1.0 \| \| \| \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| \| \| \| mov rax, qword ptr [rsi] \| 3 \| 1.0 \| 1.0 \| \| \| \| \| 1.0 \| \| \| cmovbe rax, rdi \| 2^ \| 0.5 \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| 0.5 \| \| \| cmp byte ptr [rcx+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| \| \| jb 0xf Total Num Of Uops: 8 ``` After, HSW: ``` Throughput Analysis Report -------------------------- Block Throughput: 1.50 Cycles Throughput Bottleneck: FrontEnd \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| 6 \| 7 \| \| --------------------------------------------------------------------------------- \| 0* \| \| \| \| \| \| \| \| \| \| mov rax, rdi \| 0* \| \| \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| \| \| 1.0 1.0 \| \| \| 1.0 \| \| \| \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| 1.0 \| \| \| \| \| \| \| \| mov ecx, 0x0 \| 1 \| \| \| \| \| \| \| 1.0 \| \| \| jnbe 0x39 \| 2^ \| 1.0 \| \| \| 1.0 1.0 \| \| \| \| \| \| cmp byte ptr [rax+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| \| \| jnb 0x3c Total Num Of Uops: 6 ``` Note that this cannot be usefully restricted to inner loops. 
Much of the hot code we see hitting this is not in an inner loop or not in a loop at all. The optimization still remains effective and indeed critical for some of our code. I have run a suite of internal benchmarks with this change. I saw a few very significant improvements and a very few minor regressions, but overall this change rarely has a significant effect. However, the improvements were very significant, and in quite important routines responsible for a great deal of our C++ CPU cycles. The gains pretty clearly outweigh the regressions for us. I also ran the test-suite and SPEC2006. Only 11 binaries changed at all and none of them showed any regressions. Amjad Aboud at Intel also ran this over their benchmarks and saw no regressions. Differential Revision: https://reviews.llvm.org/D36858 llvm-svn: 311226	2017-08-19 05:01:19 +00:00
Chandler Carruth	e3b3547e9f	[x86] Refactor the CMOV conversion pass to be more flexible. The primary thing that this accomplishes is to allow future re-use of these routines in more contexts and clarify the behavior w.r.t. loops. For example, if handling outer loops is desirable, doing so in an inside-out order becomes straightforward because it walks the loop nest itself (rather than walking the function's basic blocks) and de-couples the CMOV rewriting from the loop structure as there isn't actually anything loop-specific about this transformation. This patch should be essentially a no-op. It potentially changes the order in which we visit the inner loops, but otherwise should merely set the stage for subsequent changes. Differential Revision: https://reviews.llvm.org/D36783 llvm-svn: 311225	2017-08-19 04:28:20 +00:00
Dinar Temirbulatov	7aff8cfa55	[SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI. llvm-svn: 311223	2017-08-19 03:15:07 +00:00
Dinar Temirbulatov	e3ce1b455e	[SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions. Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36766 llvm-svn: 311221	2017-08-19 02:54:20 +00:00
Matthias Braun	91bd3ad128	ARMRegisterInfo: Define more ssub indexes; NFC This doesn't really change anything as Tablegen would have inferred those indices anyway; defining them gives us shorter names that are easier to read while debugging (i.e. "ssub_4" rather than "dsub2_then_ssub_0") llvm-svn: 311218	2017-08-19 01:21:11 +00:00
Adrian Prantl	2116dd360a	Filter out non-constant DIGlobalVariableExpressions reachable via the CU They won't affect the DWARF output, but they will mess with the sorting of the fragments. This fixes the crash reported in PR34159. https://bugs.llvm.org/show_bug.cgi?id=34159 llvm-svn: 311217	2017-08-19 01:15:06 +00:00
Eric Beckmann	91d8af5386	llvm-mt: Merge manifest namespaces. mt.exe performs a tree merge where certain element nodes are combined into one. This introduces the possibility of xml namespaces conflicting with each other. The original mt.exe has a hierarchy whereby certain namespace names can override others, and nodes that would then end up in ambiguous namespaces have their namespaces explicitly defined. This namespace handles this merging process. llvm-svn: 311215	2017-08-19 00:37:41 +00:00
Eugene Zelenko	be709f2c19	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311212	2017-08-18 23:51:26 +00:00
Xinliang David Li	0d07f9d68a	Fix comment /NFC llvm-svn: 311209	2017-08-18 23:08:50 +00:00
Xinliang David Li	709ffe178e	[Profile] backward propagate profile info in JumpThreading Differential Revision: http://reviews.llvm.org/D36864 llvm-svn: 311208	2017-08-18 23:00:05 +00:00
Amjad Aboud	88ffa3afe2	[InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction. Differential Revision: https://reviews.llvm.org/D36679 llvm-svn: 311206	2017-08-18 22:56:55 +00:00
Max Kazantsev	0aaf8c16ac	[IRCE] Fix buggy behavior in Clamp Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations. In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned predicates, but we did not check this for `S` which can in theory be negative. This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative, and it adds a test where such situation may lead to incorrect conditions calculation. Differential Revision: https://reviews.llvm.org/D36873 llvm-svn: 311205	2017-08-18 22:50:29 +00:00
Justin Bogner	b29bebe47b	IR: Make stripDebugInfo robust against (invalid) empty basic blocks Since stripDebugInfo runs before the verifier when reading IR, we can end up in a situation where we read some invalid IR but don't know it's invalid yet. Before this patch we would crash in stripDebugInfo when given IR with a completely empty basic block, and after we get a nice error from the verifier instead. llvm-svn: 311202	2017-08-18 21:38:03 +00:00
Jonas Devlieghere	a2faf7b60f	[llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode This patch hides the .debug_str offset and DIE reference offsets into the CU when llvm-dwarfdump is invoked with -brief. Differential Revision: https://reviews.llvm.org/D36835 llvm-svn: 311201	2017-08-18 21:35:44 +00:00
Simon Pilgrim	f36cca88fb	[X86][ADX] Regenerate ADX intrinsics tests llvm-svn: 311198	2017-08-18 21:21:14 +00:00
Sanjay Patel	7046cbd691	fix typos in comments; NFC llvm-svn: 311193	2017-08-18 20:27:47 +00:00
Ana Pazos	6210f27dfc	[PGO] Fixed assertion due to mismatched memcpy size type. Summary: Memcpy intrinsics have size argument of any integer type, like i32 or i64. Fixed size type along with its value when cloning the intrinsic. Reviewers: davidxl, xur Reviewed By: davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36844 llvm-svn: 311188	2017-08-18 19:17:08 +00:00
Tim Northover	14302fcb24	ARM: use an external relocation for calls from MachO ARM mode. The internal (__text-relative) relocation risks the offset not being encodable if the destination is Thumb. llvm-svn: 311187	2017-08-18 19:13:56 +00:00
Matt Morehouse	5c7fc76983	[SanitizerCoverage] Add stack depth tracing instrumentation. Summary: Augment SanitizerCoverage to insert maximum stack depth tracing for use by libFuzzer. The new instrumentation is enabled by the flag -fsanitize-coverage=stack-depth and is compatible with the existing trace-pc-guard coverage. The user must also declare the following global variable in their code: thread_local uintptr_t __sancov_lowest_stack https://bugs.llvm.org/show_bug.cgi?id=33857 Reviewers: vitalybuka, kcc Reviewed By: vitalybuka Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D36839 llvm-svn: 311186	2017-08-18 18:43:30 +00:00
Marek Sokolowski	5cd3d5c8d6	Reapply: [llvm-rc] Add basic RC scripts parsing ability. As for now, the parser supports a limited set of statements and resources. This will be extended in the following patches. Thanks to Nico Weber (thakis) for his original work in this area. This patch was originally submitted as r311175 and got reverted in r311177 because of the problems with compilation under gcc. Differential Revision: https://reviews.llvm.org/D36340 llvm-svn: 311184	2017-08-18 18:24:17 +00:00
Jonas Devlieghere	e101b07a1d	[Debug info] Transfer DI to fragment expressions for split integer values. This patch teaches the SDag type legalizer how to split up debug info for integer values that are split into a hi and lo part. (re-commit) Differential Revision: https://reviews.llvm.org/D36805 llvm-svn: 311181	2017-08-18 18:07:00 +00:00
Ben Dunbobbin	7b76de2dcb	[lit] support unsetting env variables (again!) This is an updated version of https://reviews.llvm.org/D22144 by @jlpeyton. The patch was accepted but not landed. This is useful functionality and I would like to use this to enable lit tests for environment variable behaviour. Differential Revision: https://reviews.llvm.org/D36403 llvm-svn: 311180	2017-08-18 17:32:57 +00:00
Konstantin Zhuravlyov	f5d826a294	AMDGPU/NFC: Rename few things in SIMemoryLegalizer: - AtomicInfo -> MemOpInfo - getAtomicLoadInfo -> getLoadInfo - getAtomicStoreInfo -> getStoreInfo - expandAtomicLoad -> expandLoad - expandAtomicStore -> expandStore Differential Revision: https://reviews.llvm.org/D36861 llvm-svn: 311179	2017-08-18 17:30:02 +00:00
Marek Sokolowski	f276f52014	Revert "[llvm-rc] Add basic RC scripts parsing ability." This reverts commit r311175. This failed some buildbots compilation. llvm-svn: 311177	2017-08-18 17:25:55 +00:00
Jakub Kuderski	756c09a58f	[Dominators] Don't print the whole tree when running with -debug As the incremental API is now used in several transforms, printing the whole dominator tree creates a lot of noise when running with the `-debug` flag. This patch fixes that. llvm-svn: 311176	2017-08-18 17:06:37 +00:00
Marek Sokolowski	dbc16476c1	[llvm-rc] Add basic RC scripts parsing ability. As for now, the parser supports a limited set of statements and resources. This will be extended in the following patches. Thanks to Nico Weber (thakis) for his original work in this area. Differential Revision: https://reviews.llvm.org/D36340 llvm-svn: 311175	2017-08-18 17:05:47 +00:00
Ben Dunbobbin	ac6a5aab45	[Support] env vars with empty values on windows An environment variable can be in one of three states: 1. undefined. 2. defined with a non-empty value. 3. defined but with an empty value. The windows implementation did not support case 3 (it was not handling errors). The Linux implementation is already correct. Differential Revision: https://reviews.llvm.org/D36394 llvm-svn: 311174	2017-08-18 16:55:44 +00:00
Simon Pilgrim	879ce046ad	[X86][BMI2] Added scheduling test for RORX/SARX/SHLX/SHRX instructions llvm-svn: 311171	2017-08-18 16:26:39 +00:00
Brian Gesiak	8953a7c544	[Lexicon] Add "GEP" Summary: `getelementptr` is frequently abbreviated as "GEP", often in source files that do not ever reference the full name of the instruction. Add it to the Lexicon, in case readers go to look for what it means there. Test plan: 1. `ninja sphinx` 2. Confirm that the rendered docs HTML contains the new "GEP" entry llvm-svn: 311168	2017-08-18 15:35:53 +00:00
Simon Pilgrim	358aeae7b8	[X86][AES] Add scheduling latency/throughput tests for AES instructions llvm-svn: 311167	2017-08-18 15:26:51 +00:00
Simon Pilgrim	9eb0869e91	[X86][PCLMUL] Add scheduling latency/throughput test for PCLMULQDQ instruction Added it to the SSE42 tests as targets seem to always have both llvm-svn: 311166	2017-08-18 15:08:30 +00:00
Simon Pilgrim	ccaec26175	[X86][SHA] Add scheduling latency/throughput tests for SHA instructions llvm-svn: 311164	2017-08-18 14:55:50 +00:00
Simon Pilgrim	7f506f7d72	[X86][MOVBE] Add scheduling latency/throughput tests for MOVBE instructions llvm-svn: 311163	2017-08-18 14:44:31 +00:00
Sam Parker	04a7db5915	[ARM] Add PostRAScheduler option This patch adds the option to allow also using the PostRA scheduler, which brings the ARM backend in line with AArch64 targets. The SchedModel can also set 'PostRAScheduler', as the R52 does, so also query this property in the overridden function. Differential Revision: https://reviews.llvm.org/D36866 llvm-svn: 311162	2017-08-18 14:27:51 +00:00
Simon Dardis	02c9a3dfc3	[mips] Follow up comments on r310460 Use dblaikie's suggestion of cast<> instead of a separate assert. llvm-svn: 311160	2017-08-18 13:27:02 +00:00
Simon Pilgrim	320f89782a	[X86][BMI2] Added scheduling test for MULX instructions llvm-svn: 311159	2017-08-18 13:22:18 +00:00
Sjoerd Meijer	ec9581e5e0	[AArch64] Do not promote f16 when subtarget HasFullFP16 Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it also supports performing data processing on 16-bit floating-point quantities. All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done in D15014. To take advantage of this, this patch avoids promotion of f16 to f32 types when the subtarget supports FullFP16, which enables instruction selection of these FP16 instructions. Differential Revision: https://reviews.llvm.org/D36396 llvm-svn: 311154	2017-08-18 10:51:14 +00:00
Renato Golin	6fd16d37ae	[Triple] Define OS Check for Haiku This adds the OS check for the Haiku operating system, as it was missing in the Triple class. Tests for x86_64-unknown-haiku and i586-pc-haiku were also added. These patches only affect Haiku and are completely harmless for other platforms. Patch by Calvin Hill <calvin@hakobaito.co.uk> llvm-svn: 311153	2017-08-18 10:35:42 +00:00
Ilya Biryukov	827c8acc21	Addressed some security issues in Dockerfiles. Summary: - Removed --trust-server-cert from `svn checkout` invocations. Installing 'ca-certificates' package on ubuntu adds required CAs to the system and svn can do proper checkout using https. - Added checksum verification when installing cmake from cmake.org. Reviewers: mehdi_amini, klimek Reviewed By: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36673 llvm-svn: 311152	2017-08-18 09:37:23 +00:00
Diana Picus	42ea77d5c2	Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP." This reverts commit e8fd20964798ca6d46d2729dd3a789707a6416da in an attempt to appease the GlobalISel buildbot, which fails in the test-suite with errors like fpcmp: files differ without tolerance allowance llvm-svn: 311151	2017-08-18 09:31:21 +00:00
Sam Parker	25efe769c0	[AArch64] Fix for buildbots, unused function Removing function declaration, my previous commit broke the bots. llvm-svn: 311150	2017-08-18 09:08:05 +00:00
Victor Leschuk	091da14423	Remove useless default case in switch llvm-svn: 311149	2017-08-18 09:02:06 +00:00
Sam Parker	96f8959cfd	[AArch64] Remove DecodeAuthLoadWriteback The BaseAuthLoad instruction class was incorrectly passing an empty constraint string to its parent, so I have corrected this. This makes the DecodeAuthLoadWriteback function redundant, so I've also removed it. Differential Revision: https://reviews.llvm.org/D36741 llvm-svn: 311148	2017-08-18 08:39:54 +00:00
Alex Bradbury	f698a29a51	Refine report_fatal_error guidance after post-commit review Use text suggested by Justin Bogner in post-commit review of r311146 <http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170814/479898.html>, which makes it clear that report_fatal_error shouldn't be used when there is a practicable alternative. Also make this clearer in CodingStandards. llvm-svn: 311147	2017-08-18 06:45:34 +00:00
Alex Bradbury	7182440fbd	Give guidance on report_fatal_error in CodingStandards.rst and ProgrammersManual.rst The current ProgrammersManual.rst document has a lot of well-written documentation on error handling thanks to @lhames. It suggests errors can be split cleanly into "programmatic" and "recoverable" errors. However, the reality in current LLVM seems to be there are a number of cases where a non-programmatic error is not easily recoverable. Therefore, add a note to indicate the existence of report_fatal_error for these cases. I've also added a reminder to CodingStandards.rst in the section on assertions, to indicate that llvm_unreachable and assertions should not be relied upon to report errors triggered by user input. The ProgrammersManual is also silent on the use of LLVMContext::diagnose, which is used in BPF+WebAssembly+AMDGPU to report some errors during instruction selection. I don't address that in this patch, as it's not quite clear how to fit in to the current error handling story Differential Revision: https://reviews.llvm.org/D36826 llvm-svn: 311146	2017-08-18 05:29:21 +00:00
Craig Topper	e3edd9c9be	[DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC llvm-svn: 311144	2017-08-18 04:52:46 +00:00
Jatin Bhateja	e739fc7d11	Test commit access Summary: Adding a blank line. Differential Revision: https://reviews.llvm.org/D36859 llvm-svn: 311143	2017-08-18 02:39:28 +00:00
Geoff Berry	bd47e8a4f7	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2 This reverts commit r311135. sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied. llvm-svn: 311142	2017-08-18 01:43:11 +00:00
Richard Smith	c0541dfa3e	Increase tail dup threshold for -O3 from 3 to 4. We see a modest performance improvement from this slightly higher tail dup threshold. Differential Revision: https://reviews.llvm.org/D36775 llvm-svn: 311139	2017-08-17 23:38:41 +00:00
Craig Topper	1fae3ae6f0	[X86] Remove SSE/AVX patterns for AND/XOR/OR/ANDN that checked for the inputs being bitcasted from floating point types. There's really no reason to do this we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary. It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway. llvm-svn: 311138	2017-08-17 23:20:57 +00:00
Tim Northover	48fff995d6	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311137	2017-08-17 23:14:01 +00:00
Geoff Berry	51f52c4fca	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Two issues identified by buildbots were addressed: - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311135	2017-08-17 23:06:55 +00:00
Zachary Turner	4c432b202f	Fix warning about covered switch default. llvm-svn: 311129	2017-08-17 22:20:15 +00:00
Tom Stellard	a096b12628	AMDGPU: Add R600InstPrinter class Summary: This is a step towards separating the GCN and R600 tablegen'd code. This is a little awkward for now, because the R600 functions won't have the MCSubtargetInfo parameter, so we need to have AMDGPUInstPrinter delegate to R600InstPrinter, but once the tablegen'd code is split, we will be able to drop the delegation and use R600InstPrinter directly. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D36444 llvm-svn: 311128	2017-08-17 22:20:04 +00:00
Jakub Kuderski	e608ef7635	[LoopRotate][Dominators] Use the incremental API to update DomTree Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, davide Subscribers: hiraditya, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D35581 llvm-svn: 311125	2017-08-17 21:48:19 +00:00
Eugene Zelenko	6e07bfd0d9	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311124	2017-08-17 21:26:39 +00:00
Zachary Turner	197bba0028	Remove unused variable. llvm-svn: 311119	2017-08-17 20:18:36 +00:00
Zachary Turner	96bcd6a37a	[llvm-pdbutil] Fix some dumping issues. When dumping, we were treating the S_INLINESITESYM as referring to a type record, when it actually refers to an id record. We had this correct in TypeIndexDiscovery, so our merging algorithm should be fine, but we had it wrong in the dumper, which means it would appear to work most of the time, unless the index was out of bounds in the type stream, when it would fail. Fixed this, and audited a few other cases to make them match the behavior in TypeIndexDiscovery. Also, I've now observed a new symbol record with kind 0x1168 which I have no clue what it is, so to avoid crashing we have to just print "Unknown Symbol Kind". llvm-svn: 311117	2017-08-17 20:04:51 +00:00
Zachary Turner	f401e1102d	Fix a few minor issues when dumping symbols. 1) We weren't handling symbol types that we weren't able to parse, even if we knew what the leaf type was. This was triggering when trying to dump /DEBUG:FASTLINK PDBs, where we expect a certain symbol to show up, but we just don't know how to parse it. 2) We lost the code for dumping record bytes, so this was added back. llvm-svn: 311116	2017-08-17 20:04:31 +00:00
Lang Hames	df1e59b640	[docs] Tweak phrasing of the varargs explanation in the command section of the CMake primer. This moves the introduction of the ARGV/ARGN variables up to immediately follow the introduction of the concept of variable argument functions, and explicitly connects this concept to C varargs functions. llvm-svn: 311113	2017-08-17 18:21:53 +00:00
Lang Hames	3a6a2d26c6	[docs] Fix typo and tweak wording of special variable handling in CMake primer. llvm-svn: 311112	2017-08-17 18:00:28 +00:00
Jonas Devlieghere	30756da212	Revert "[Debug info] Transfer DI to fragment expressions for split integer values." This reverts commit r311102. llvm-svn: 311111	2017-08-17 17:58:33 +00:00
Alexey Bataev	84ad9ae032	[SimplifyCFG] Add a test for preserve store alignment, NFC. llvm-svn: 311106	2017-08-17 17:26:52 +00:00
Sanjay Patel	f2d67f7ecc	[x86] add tests for vector select-of-constants; NFC We've discussed canonicalizing to this form in IR, so the backend should be prepared to lower these in ways better than what we see here in most cases. llvm-svn: 311103	2017-08-17 17:07:37 +00:00
Jonas Devlieghere	622fedc001	[Debug info] Transfer DI to fragment expressions for split integer values. This patch teaches the SDag type legalizer how to split up debug info for integer values that are split into a hi and lo part. Differential Revision: https://reviews.llvm.org/D36805 llvm-svn: 311102	2017-08-17 17:06:48 +00:00
Sanjay Patel	18424e1581	[PowerPC] add tests for vector select-of-constants; NFC We've discussed canonicalizing to this form in IR, so the backend should be prepared to lower these in ways better than what we see here. llvm-svn: 311099	2017-08-17 17:03:11 +00:00
Adrian Prantl	6a57daad81	Improve line debug info when translating a CaseBlock to SDNodes. The SelectionDAGBuilder translates various conditional branches into CaseBlocks which are then translated into SDNodes. If a conditional branch results in multiple CaseBlocks only the first CaseBlock is translated into SDNodes immediately, the rest of the CaseBlocks are put in a queue and processed when all LLVM IR instructions in the basic block have been processed. When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder is queried for the current LLVM IR instruction and the resulting SDNodes are annotated with the debug info of the current instruction (if it exists and has debug metadata). When the deferred CaseBlocks are processed, the SelectionDAGBuilder does not have a current LLVM IR instruction, and the resulting SDNodes will not have any debuginfo. As DwarfDebug::beginInstruction() outputs a .loc directive for the first instruction in a labeled block (typically the case for something coming from a CaseBlock) this tends to produce a line-0 directive. This patch changes the handling of CaseBlocks to store the current instruction's debug info into the CaseBlock when it is created (and the SelectionDAGBuilder knows the current instruction) and to always use the stored debug info when translating a CaseBlock to SDNodes. Patch by Frej Drejhammar! Differential Revision: https://reviews.llvm.org/D36671 llvm-svn: 311097	2017-08-17 16:57:13 +00:00
Jakub Kuderski	e35a449140	[Dominators] Teach LoopUnswitch to use the incremental API Summary: This patch makes LoopUnswitch use new incremental API for updating dominators. It also updates SplitCriticalEdge, as it is called in LoopUnswitch. There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch. Reviewers: dberlin, davide, sanjoy, grosser, chandlerc Reviewed By: davide, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35528 llvm-svn: 311093	2017-08-17 16:45:35 +00:00
Craig Topper	3a622a14f9	[AVX512] Don't switch unmasked subvector insert/extract instructions when AVX512DQI is enabled. There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences. There is however value in enabling the masked version of the instructions with DQI. This required introducing some new multiclasses to enable this splitting. Differential Revision: https://reviews.llvm.org/D36661 llvm-svn: 311091	2017-08-17 15:40:25 +00:00
Craig Topper	5960848060	[X86] Remove memopmmx pattern fragment Summary: Just like the FIXME says, there is no alignment requirement for MMX. Reviewers: RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36815 llvm-svn: 311090	2017-08-17 15:25:05 +00:00
Victor Leschuk	bcfd8e28c8	Mark Verifier/invalid-eh.ll as unsupported on Windows. Mark this unsupported for now as it causes test hangs on the buildbot. Will place it back when the problem is debugged. llvm-svn: 311089	2017-08-17 15:07:03 +00:00
Simon Dardis	b5205c69d2	[dfsan] Add explicit zero extensions for shadow parameters in function wrappers. In the case where dfsan provides a custom wrapper for a function, shadow parameters are added for each parameter of the function. These parameters are i16s. For targets which do not consider this a legal type, the lack of sign extension information would cause LLVM to generate anyexts around their usage with phi variables and calling convention logic. Address this by introducing zero exts for each shadow parameter. Reviewers: pcc, slthakur Differential Revision: https://reviews.llvm.org/D33349 llvm-svn: 311087	2017-08-17 14:14:25 +00:00
Daniel Sanders	032e7f2cad	[globalisel][tablegen] Generate TypeObject table. NFC Summary: Generate the type table from the types used by a target rather than hard-coding the union of types used by all targets. Depends on D36084 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36085 llvm-svn: 311084	2017-08-17 13:18:35 +00:00
Simon Pilgrim	8be9f4af4f	[DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c llvm-svn: 311083	2017-08-17 13:03:34 +00:00
Amjad Aboud	19f15843ab	[X86] Refactoring of X86TargetLowering::EmitLoweredSelect. NFC. Authored by aivchenk Differential Revision: https://reviews.llvm.org/D35685 llvm-svn: 311082	2017-08-17 12:12:30 +00:00
Davide Italiano	903fd3ea4e	[Verifier] Avoid visiting DIGlobalVariables twice. We currently visit them twice. Once, through `visitMDNode()` -> (the code generated by) `../include/llvm/IR/Metadata.def:109` -> `visitDIGlobalVariable()` Then, through `visitMDNode()` -> `visitDIGlobalVariableExpression()` -> `visitDIGlobalVariable()` This results in verification failures printed twice, e.g.: $ ./opt -verify ../../test/DebugInfo/pr34186.ll missing global variable type !4 = distinct !DIGlobalVariable(name: "pat", scope: !0, file: !1, line: 27, isLocal: true, isDefinition: true) missing global variable type !4 = distinct !DIGlobalVariable(name: "pat", scope: !0, file: !1, line: 27, isLocal: true, isDefinition: true) ./opt: ../../test/DebugInfo/pr34186.ll: error: input module is broken! The patch removes one call so we ensure each GV is visited exactly once. Differential Revision: https://reviews.llvm.org/D36797 llvm-svn: 311081	2017-08-17 11:32:21 +00:00
Ayal Zaks	6627883369	[LV] Using VPlan to model the vectorized code and drive its transformation VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This patch introduces the VPlan model into LV and uses it to represent the vectorized code and drive the generation of vectorized IR. In this patch VPlan models the vectorized loop body: the vectorized control-flow is represented using VPlan's Hierarchical CFG, with predication refactored from being a post-vectorization-step into a vectorization planning step modeling if-then VPRegionBlocks, and generating code inline with non-predicated code. The vectorized code within each VPBasicBlock is represented as a sequence of Recipes, each responsible for modelling and generating a sequence of IR instructions. To keep the size of this commit manageable the Recipes in this patch are coarse-grained and capture large chunks of LV's code-generation logic. The constructed VPlans are dumped in dot format under -debug. This commit retains current vectorizer output, except for minor instruction reorderings; see associated modifications to lit tests. For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst and its references. Authors: Gil Rapaport and Ayal Zaks Differential Revision: https://reviews.llvm.org/D32871 llvm-svn: 311077	2017-08-17 09:29:59 +00:00
Daniel Sanders	edd0784be6	Re-commit: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on Windows machines due to a flaw in the sort predicate which allowed both A < B < C and B == C to be satisfied simultaneously. The cause of this was some sloppiness in the priority order of G_CONSTANT instructions compared to other instructions. These had equal priority because it makes no difference, however there were operands that had higher priority than G_CONSTANT but lower priority than any other instruction. As a result, a priority order between G_CONSTANT and other instructions must be enforced to ensure the predicate defines a strict weak order. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 311076	2017-08-17 09:26:14 +00:00
Jonas Paulsson	593d49c0d9	[SystemZ] Also wrap TII with #ifndef NDEBUG in constructor initializer list. TII needs to be wrapped with #ifndef NDEBUG to silence compiler warnings. llvm-svn: 311075	2017-08-17 09:18:02 +00:00
Jonas Paulsson	d346924a0e	[SystemZ] Add a wrapping with #ifndef NDEBUG to silence warning. SystemZHazardRecognizer::TII is only used for debug output, so it needs also to be wrapped with #ifndef NDEBUG. llvm-svn: 311074	2017-08-17 08:56:09 +00:00
Jonas Paulsson	57a705d9d0	[SystemZ, MachineScheduler] Improve post-RA scheduling. The idea of this patch is to continue the scheduler state over an MBB boundary in the case where the successor block has only one predecessor. This means that the scheduler will continue in the successor block (after emitting any branch instructions) with e.g. maintained processor resource counters. Benchmarks have been confirmed to benefit from this. The algorithm in MachineScheduler.cpp that extracts scheduling regions of an MBB has been extended so that the strategy may optionally reverse the order of processing the regions themselves. This is controlled by a new method doMBBSchedRegionsTopDown(), which defaults to false. Handling the top-most region of an MBB first also means that a top-down scheduler can continue the scheduler state across any scheduling boundary between two regions inside MBB. Review: Ulrich Weigand, Matthias Braun, Andy Trick. https://reviews.llvm.org/D35053 llvm-svn: 311072	2017-08-17 08:33:44 +00:00
Elad Cohen	124d32829c	[SelectionDAG] Teach the vector-types operand scalarizer about SETCC When v1i1 is legal (e.g. AVX512) the legalizer can reach a case where a v1i1 SETCC with an illegal vector type operand wasn't scalarized (since v1i1 is legal) but its operands do have to be scalarized. This used to assert because SETCC was missing from the vector operand scalarizer. This patch attempts to teach the legalizer to handle these cases by scalarizing the operands, converting the node into a scalar SETCC node. Differential revision: https://reviews.llvm.org/D36651 llvm-svn: 311071	2017-08-17 08:06:36 +00:00
Martin Storsjo	caff3268a1	[llvm-dlltool] Improve an error message when unable to open files. NFC. Differential Revision: https://reviews.llvm.org/D36818 llvm-svn: 311069	2017-08-17 06:26:42 +00:00
Martin Storsjo	9d8ecb4333	[llvm-dlltool] Don't crash if no def file is provided or it can't be opened Differential Revision: https://reviews.llvm.org/D36780 llvm-svn: 311068	2017-08-17 05:58:27 +00:00
Serguei Katkov	9e5604dbe1	[CGP] Fix the rematerialization of gc.relocates If we want to substitute the relocation of a derived pointer with a gep of the base, then we must ensure that the relocation of the base dominates the relocation of the derived pointer. Currently only a check for the basic block is present. However, it is possible that both relocations are in the same basic block but the relocation of the derived pointer is defined earlier. The patch moves the relocation of the base pointer right before the relocation of the derived pointer in this case. Reviewers: sanjoy,artagnon,igor-laevsky,reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36462 llvm-svn: 311067	2017-08-17 05:48:30 +00:00
Geoff Berry	4e38e02e6f	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062	2017-08-17 04:04:11 +00:00
Saleem Abdulrasool	dd8c16b58e	ARM: mark CPSR as clobbered for Windows VLAs When lowering a VLA, we emit a `__chkstk` call. However, this call can internally clobber CPSR. We did not mark this register as an ImpDef, which could potentially allow a comparison to be hoisted above the call to `__chkstk`. In such a case, the CPSR could be clobbered, and the check invalidated. When the support was initially added, it seemed that the call would take care of preventing CPSR from being clobbered, but this is not the case. Mark the register as clobbered to fix a possible state corruption. llvm-svn: 311061	2017-08-17 02:42:24 +00:00
Craig Topper	2f9743d2ea	[X86] Exchange the memory op predicate for PALIGNR/VPALIGNR. I accidentally swapped them. llvm-svn: 311060	2017-08-17 02:34:35 +00:00
Craig Topper	5357526ce8	[X86] Cleanup multiclasses for SSE/AVX2 PALIGNR. Add missing load patterns. We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences. We were also missing load patterns, though we had them for the AVX-512 version. llvm-svn: 311059	2017-08-17 01:48:03 +00:00
Craig Topper	bbe3e46bb9	[X86] Remove patterns for PALIGNR with non-vXi8 types. llvm-svn: 311058	2017-08-17 01:48:00 +00:00
Jakub Kuderski	fd5c5c9144	Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057	2017-08-17 01:41:49 +00:00
Craig Topper	42a535351e	[X86] Put multiclass closer to its use and simplify slightly. NFC llvm-svn: 311055	2017-08-16 23:38:25 +00:00
Craig Topper	9025579e8a	[X86] Use a static array instead of a SmallVector for a small fixed size array. NFC llvm-svn: 311054	2017-08-16 23:16:43 +00:00
Sanjay Patel	4abc3f6036	[x86] add cmov promotion tests for D36711; NFC This way we can see what the current codegen looks like. I've also explicitly added/removed the cmov attribute from the RUN lines, so we know exactly what we're checking in the runs. llvm-svn: 311052	2017-08-16 22:50:11 +00:00
Amjad Aboud	86111c6696	[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050	2017-08-16 22:42:38 +00:00
Jakub Kuderski	cbcffb173c	Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049	2017-08-16 22:10:53 +00:00
Eugene Zelenko	bb1b2d09cf	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311048	2017-08-16 22:07:40 +00:00
Craig Topper	882f29630b	[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044	2017-08-16 21:52:07 +00:00
Jakub Kuderski	4552e9de9f	[ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039	2017-08-16 20:50:23 +00:00
Geoff Berry	87f8d25150	[MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038	2017-08-16 20:50:01 +00:00
Petr Hosek	ad00bd603e	[CMake][runtimes] Support for building target variants This can be used to build non-sanitized and sanitized versions of runtimes, where sanitized versions use the just built sanitizer which in turn may use the non-sanitized version. Differential Revision: https://reviews.llvm.org/D36348 llvm-svn: 311036	2017-08-16 19:13:45 +00:00
Geoff Berry	40549ad1ac	[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032	2017-08-16 19:03:16 +00:00
Simon Atanasyan	cb833076ac	[mips] Handle R_MIPS_TLS_DTPREL32/64 relocations in the RelocVisitor Debug information for TLS variables on MIPS might have R_MIPS_TLS_DTPREL32 or R_MIPS_TLS_DTPREL64 relocations. This patch adds support for such relocations in the `RelocVisitor`. llvm-svn: 311031	2017-08-16 19:01:22 +00:00
Adrian Prantl	3d523a657a	Add a convenience overload of DWARFDie::dump() for debugging purposes. llvm-svn: 311026	2017-08-16 17:43:01 +00:00
Xinliang David Li	5a57b842cf	Add more comment llvm-svn: 311025	2017-08-16 17:33:43 +00:00
Xinliang David Li	71ecaa19ff	[PGO] Fix ThinLTO crash Differential Revision: http://reviews.llvm.org/D36640 llvm-svn: 311023	2017-08-16 17:18:01 +00:00
Evgeny Mankov	bf9751760a	[AMDGPU] NFC: test commit llvm-svn: 311019	2017-08-16 16:47:29 +00:00
Konstantin Zhuravlyov	d3d89efa3e	AMDGPU/NFC: Sort files in CMakeLists.txt alphabetically llvm-svn: 311017	2017-08-16 16:23:32 +00:00
Simon Pilgrim	38e8a023fa	[X86] Regenerate immediate store merging tests llvm-svn: 311016	2017-08-16 16:22:19 +00:00
Jakub Kuderski	624463a003	[Dominators] Introduce batch updates Summary: This patch introduces a way of informing the (Post)DominatorTree about multiple CFG updates that happened since the last tree update. This makes performing tree updates much easier, as it internally takes care of applying the updates in lockstep with the (virtual) updates to the CFG, which is done by reverse-applying future CFG updates. The batch updater is able to remove redundant updates that cancel each other out. In the future, it should be also possible to reorder updates to reduce the amount of work needed to perform the updates. Reviewers: dberlin, sanjoy, grosser, davide, brzycki Reviewed By: brzycki Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D36167 llvm-svn: 311015	2017-08-16 16:12:52 +00:00
Hal Finkel	9e54b7093a	[BDCE] Don't check demanded bits on unsized types To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014	2017-08-16 16:09:22 +00:00
Davide Italiano	cd21378ff6	[Verifier] Reject globals without a type associated. llvm-svn: 311012	2017-08-16 15:16:33 +00:00
Dmitry Preobrazhensky	b865ef534a	[AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, v_div_fixup_f16 This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36694 llvm-svn: 311011	2017-08-16 15:16:32 +00:00
Sanjay Patel	042a53624c	[DemandedBits] simplify call; NFC llvm-svn: 311009	2017-08-16 14:28:23 +00:00
Balaram Makam	c5698befb6	Revert "MachineInstr: Reason locally about some memory objects before going to AA." r310825 caused the clang-ppc64le-linux-lnt bot to go red (http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712) because of a test-suite failure of SingleSource/UnitTests/2003-07-09-SignedArgs This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996. llvm-svn: 311008	2017-08-16 14:17:43 +00:00
Dmitry Preobrazhensky	ff64aa514b	[AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006	2017-08-16 13:51:56 +00:00
Simon Pilgrim	c63f93a197	[CostModel][X86][XOP] Improve costs for XOP shuffles VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions llvm-svn: 311005	2017-08-16 13:50:20 +00:00
Davide Italiano	75ebc568e2	[DI] Every DIGlobalVariable should have a type. I'll make this a verifier check to catch other violations. This commit fixes the tests already in tree. llvm-svn: 311004	2017-08-16 13:39:07 +00:00
Simon Dardis	9c5d64b901	[mips] Handle variables with an explicit section and interactions with .sdata, .sbss If a variable has an explicit section such as .sdata or .sbss, it is placed in that section and accessed in a gp relative manner. This overrides the global -G setting. Otherwise if a variable has an explicit section attached to it, such as '.rodata' or '.mysection', it is not placed in the small data section. This also overrides the global -G setting. Reviewers: atanasyan, nitesh.jain Differential Revision: https://reviews.llvm.org/D36616 llvm-svn: 311001	2017-08-16 12:18:04 +00:00
Sam Parker	84fd0c3bf2	[ARM] Improve loop unrolling for Cortex-M - Set the default runtime unroll count to 4 and use the newly added UnrollRemainder option. - Create loop cost and force unroll for a cost less than 12. - Disable unrolling on Thumb1 only targets. Differential Revision: https://reviews.llvm.org/D36134 llvm-svn: 310997	2017-08-16 07:42:44 +00:00
Igor Breger	ce5ea38135	[GlobalISel][X86] Fix mir tests, use correct physical register.NFC. llvm-svn: 310996	2017-08-16 07:25:51 +00:00
Martin Storsjo	e1f120bbcb	[COFF] Make the weak aliases optional When creating an import library from lld, the cases with Name != ExtName shouldn't end up as a weak alias, but as a real export of the new name, which is what actually is exported from the DLL. This restores the behaviour of renamed exports to what it was in 4.0. The other half of this commit, including test, goes into lld. Differential Revision: https://reviews.llvm.org/D36633 llvm-svn: 310991	2017-08-16 05:22:49 +00:00
Martin Storsjo	58c9527eaf	[llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386 Hook up the -k option (that in the original GNU dlltool removes the @n suffix from the symbol that the final executable ends up linked to). In llvm-dlltool, make sure that functions end up with the undecorate name type if this option is set and they are decorated. In mingw, when creating import libraries from def files instead of creating an import library as a side effect of linking a DLL, the symbol names in the def contain the stdcall/fastcall decoration (but no leading underscore). By setting the undecorate name type, a linker linking to the import library will omit the decoration from the DLL import entry. With this in place, mingw-w64 for i386 built with llvm-dlltool/clang produces import libraries that actually work. Differential Revision: https://reviews.llvm.org/D36548 llvm-svn: 310990	2017-08-16 05:18:36 +00:00
Martin Storsjo	a238b20e23	[COFF] Add SymbolName as a distinct field in COFFImportFile The previous Name and ExtName aren't enough to convey all the nuances between weak aliases and stdcall decorated function names. A test for this will be added in LLD. Differential Revision: https://reviews.llvm.org/D36544 llvm-svn: 310988	2017-08-16 05:13:16 +00:00
Stanislav Mekhanoshin	a9487d92d7	[AMDGPU] Eliminate no effect instructions before s_endpgm Differential Revision: https://reviews.llvm.org/D36585 llvm-svn: 310987	2017-08-16 04:43:49 +00:00
Dehao Chen	84d412035a	Merge debug info when hoisting then-else code to if. Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached. Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D36778 llvm-svn: 310985	2017-08-16 01:55:26 +00:00
Derek Schuff	8e71359561	[WebAssembly] Remove infinite loop from reg-stackify test r310940 exposed reverse-unreachable code to some optimizers, which caused some of the code in this test to be sunk, changing the input to the pass and breaking the expectations. Since that change is irrelevant to this particular test, this change just adds an exit node to work around the problem; the test should really be more robust (or be an MIR test?) but this preserves the existing test intent. llvm-svn: 310981	2017-08-16 00:49:44 +00:00
Quentin Colombet	647b482c1e	[VirtRegRewriter] Properly model the register liveness on undef subreg definition Undef subreg definition means that the content of the super register doesn't matter at this point. While that's true for virtual registers, this may not hold when replacing them with actual physical registers. Indeed, some part of the physical register may be coalesced with the related virtual register and thus, the values for those parts matter and must be live. The fix consists in checking whether or not subregs of the physical register being assigned to an undef subreg definition are live through that def and inserting an implicit use if they are. Doing so will keep them alive until that point like they should be. E.g., let vreg14 be assigned to R0_R1, then %vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here %vreg14:gsub_1<def> = COPY %R1 Before this change, the rewriter would change the code into: %R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1 ; is believed to come ; from the previous ; instruction Because of this invalid liveness, later passes could make wrong choices and in particular clobber a live register as it happened with the register scavenger in llvm.org/PR34107 Now we would generate: %R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to ; reach this point %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> The bug has been here forever, it got exposed recently because the register scavenger got smarter. Fixes llvm.org/PR34107 llvm-svn: 310979	2017-08-16 00:17:05 +00:00
Kuba Mracek	a1adbe6ca1	Revert archive-* tests from r310953, there were test failures. llvm-svn: 310974	2017-08-15 23:41:34 +00:00
Craig Topper	0a1a276d91	[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970	2017-08-15 22:48:41 +00:00
Quentin Colombet	61d71a138b	Reapply "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310425, thus reapplying r310335 with a fix for link issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. ---- The fix for the link issue consists in adding the GlobalISel library in the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget that needs to know how to destruct the GISel related APIs when being destroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproduce and understand the problem. llvm-svn: 310969	2017-08-15 22:31:51 +00:00
Charles Saternos	55d93e79df	[ThinLTO] Fix ThinLTO crash while destroying context Fix for PR32763 An assert that checks if a Ref was untracked fails during ThinLTO context cleanup. The issue is because lazy loading temporary nodes didn't properly track ValueAsMetadata nodes. This patch ensures that the temporary nodes are properly tracked when they're replaced with the value. llvm-svn: 310967	2017-08-15 22:23:44 +00:00
Kuba Mracek	c0301b2f53	Revert changes in r310953 for llvm-symbolizer.test. The change causes a test failure. llvm-svn: 310956	2017-08-15 21:02:17 +00:00
Tony Tye	46d3576c24	Update AMDGPUUsage.rst documentation: 1. Correct description of the kernel initial state for FLAT_SCRATCH_INIT. 2. Add link to GFX9 architecture documentation. 3. Update product names. 4. Rename note record from NT_AMD_AMDGPU_METADATA to NT_AMD_AMDGPU_HSA_METADATA and move description to the AMDHSA coding convention section. 5. Minor typo corrections. Differential Revision: https://reviews.llvm.org/D36549 llvm-svn: 310954	2017-08-15 20:47:41 +00:00
Kuba Mracek	17ee427ef3	[llvm] Get rid of "%T" expansions The %T lit expansion expands to a common directory shared between all the tests in the same directory, which is unexpected and unintuitive, and more importantly, it's been a source of subtle race conditions and flaky tests. In https://reviews.llvm.org/D35396, it was agreed that it would be best to simply ban %T and only keep %t, which is unique to each test. When a test needs a temporary directory, it can just create one using mkdir %t. This patch removes %T in llvm. Differential Revision: https://reviews.llvm.org/D36495 llvm-svn: 310953	2017-08-15 20:29:24 +00:00
Amjad Aboud	0464c5d958	[InstCombine] Added support for (X >>s C) << C --> X & (-1 << C) Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949	2017-08-15 19:33:14 +00:00
Lang Hames	e815bf3cd8	[ORC][Kaleidoscope] Update Chapter 1 of BuildingAJIT to incorporate recent ORC API changes. llvm-svn: 310947	2017-08-15 19:20:10 +00:00
Sanjay Patel	f69b7d5c93	[InstCombine] sink sext after ashr Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a Name: hoist_ashr_ahead_of_sext_1 %s = sext i8 %x to i32 %r = ashr i32 %s, 3 ; shift value is < than source bit width => %a = ashr i8 %x, 3 %r = sext i8 %a to i32 Name: hoist_ashr_ahead_of_sext_2 %s = sext i8 %x to i32 %r = ashr i32 %s, 8 ; shift value is >= than source bit width => %a = ashr i8 %x, 7 ; so clamp this shift value %r = sext i8 %a to i32 Name: junc_the_trunc %a = sext i16 %v to i32 %s = ashr i32 %a, 18 %t = trunc i32 %s to i16 => %t = ashr i16 %v, 15 llvm-svn: 310942	2017-08-15 18:25:52 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter much, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable regions are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... 
in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Tom Stellard	590a974e10	test-release.sh: Move test-suite setup to beginning of the script Summary: We want to catch failures early before doing the full 3-stage build. The goal here is to avoid running through the whole build process and have it fail at the end (and not create the binary packages), just because some prerequisites failed to install. Reviewers: rovka, hans Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36422 llvm-svn: 310939	2017-08-15 18:11:56 +00:00
Lang Hames	359983bdd3	[ORC] Add case statements for AArch64 to the local stub and callback manager creation functions. This should allow lli to lazily execute code using OrcLazyJIT on AArch64. llvm-svn: 310938	2017-08-15 18:10:19 +00:00
Sanjay Patel	d52997a296	[InstCombine] add tests for sext+ashr; NFC llvm-svn: 310935	2017-08-15 17:41:31 +00:00
Rui Ueyama	4a17955030	Fix -Wunused-lambda-capture for Release build. `I` and `this` are used only in assert or DEBUG, so they are unused in Release build. llvm-svn: 310934	2017-08-15 17:39:35 +00:00
George Rimar	e5269439cd	[llvm-dwarfdump] - Attempt to fix BB after r310915. Now MIPS one is unhappy: http://lab.llvm.org:8011/builders/llvm-mips-linux/builds/2221 llvm-svn: 310928	2017-08-15 16:42:21 +00:00
Steven Wu	86a511e836	[Doc] Update LangRef for new Module Flag Behavior Summary: Add the documentation for the new module flag behavior. The new ModFlagBehavior is added in r303590. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36557 llvm-svn: 310926	2017-08-15 16:16:33 +00:00
George Rimar	e1c30f74f7	[llvm-dwarfdump] - Refactor section name/uniqueness gathering. As was requested in the D36313 thread, with this patch section names and uniqueness are calculated once, and not every time a range is dumped. Differential revision: https://reviews.llvm.org/D36740 llvm-svn: 310923	2017-08-15 15:54:43 +00:00
Daniel Sanders	eb2f5f3256	Revert r310919 - [globalisel][tablegen] Support zero-instruction emission. As expected, this failed on the windows bots but the instrumentation showed something interesting. The ADD8ri and INC8r rules are never directly compared on the windows machines. That implies that the issue lies in transitivity of the Compare predicate. I believe I've already verified that but maybe I missed something. llvm-svn: 310922	2017-08-15 15:10:31 +00:00
Daniel Sanders	16e6dd3cd6	Re-commit with some instrumentation: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on the windows bots and this one is likely to fail on those same bots. However, the added instrumentation should reveal a particular isHigherPriorityThan() evaluation which I'm expecting to expose that these machines are weighing priority of two rules differently from the non-windows machines. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 310919	2017-08-15 13:50:09 +00:00
George Rimar	36f4d8044b	[DebugInfo] - Attempt to fix BB after r310915. Not sure what BB does not like. While building module 'LLVM_DebugInfo_DWARF' imported from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/lib/DebugInfo/DWARF/DWARFAbbreviationDeclaration.cpp:10: In file included from <module-includes>:7: In file included from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFContext.h:29: /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFObject.h:30:17: error: declaration of 'object' must be imported from module 'LLVM_Object.Decompressor' before it is required virtual const object::ObjectFile *getFile() const { return nullptr; } ^ /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/Object/Decompressor.h:18:11: note: previous declaration is here namespace object { http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/10766 llvm-svn: 310918	2017-08-15 13:26:12 +00:00
Alex Bradbury	2fee9ead7e	[RISCV] Add RISCVInstPrinter and basic MC assembler tests With the addition of RISCVInstPrinter, it is now possible to test the basic operation of the RISCV MC layer. Differential Revision: https://reviews.llvm.org/D23564 llvm-svn: 310917	2017-08-15 13:08:29 +00:00
George Rimar	6957ab5b7b	[llvm-dwarfdump] - Print section name and index when dumping .debug_info ranges Teaches llvm-dwarfdump to print section index and name of range when it dumps .debug_info. Differential revision: https://reviews.llvm.org/D36313 llvm-svn: 310915	2017-08-15 12:32:54 +00:00
Alex Bradbury	67820e015f	[RISCV] Recognize new relocation types This patch adds all RISC-V relocation types, as of binutils 2.29. Note that R_RISCV32_PCREL is not currently documented in the RISC-V ELF PSABI. Differential Revision: https://reviews.llvm.org/D36455 Patch by Chih-Mao Chen (@PkmX) llvm-svn: 310914	2017-08-15 12:11:10 +00:00
Ayal Zaks	25e2800e20	[LV] Minor savings to Sink casts to unravel first order recurrence Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it is not needed. Differential Revision: https://reviews.llvm.org/D36408 llvm-svn: 310910	2017-08-15 08:32:59 +00:00
Frederich Munch	7a3da86823	Propagate error in LazyEmittingLayer::removeModule. Summary: Besides being the better thing to do, not doing so will trigger an assert with LLVM_ENABLE_ABI_BREAKING_CHECKS. Reviewers: lhames Reviewed By: lhames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36700 llvm-svn: 310906	2017-08-15 02:25:36 +00:00
Dinar Temirbulatov	9e43d6e7b2	[SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0, replace E->Scalars[0] to VL0, NFCI. llvm-svn: 310904	2017-08-15 00:31:49 +00:00
Petr Hosek	ec20fd7731	[CMake] Add install target for LLVMFuzzer This allows including LLVMFuzzer as distribution component. Differential Revision: https://reviews.llvm.org/D36540 llvm-svn: 310897	2017-08-14 23:37:31 +00:00
Dehao Chen	45847d3612	Add missing dependency in ICP. (NFC) llvm-svn: 310896	2017-08-14 23:25:21 +00:00
Jessica Paquette	95c1107f4c	[MachineOutliner] Only outline candidates of length >= 2 Since we don't factor in instruction lengths into outlining calculations right now, it's never the case that a candidate could have length < 2. Thus, we should quit early when we see such candidates. llvm-svn: 310894	2017-08-14 22:57:41 +00:00
Craig Topper	b1e4b1a070	[InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares This adds support for non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred. Differential Revision: https://reviews.llvm.org/D36646 llvm-svn: 310893	2017-08-14 22:11:43 +00:00
Reid Kleckner	18728822d2	Remove checks for debug info intrinsics in use lists, NFC These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892	2017-08-14 22:10:54 +00:00
John Baldwin	1255b165bf	[MIPS] Implement support for -mstack-alignment. Summary: This is modeled on the implementation for x86 which stores the command line option in a 'StackAlignOverride' field in MipsSubtarget and then uses this to compute a 'stackAlignment' value in MipsSubtarget::initializeSubtargetDependencies. The stackAlignment() method in MipsSubTarget is renamed to getStackAlignment() and returns the computed 'stackAlignment'. Reviewers: sdardis Reviewed By: sdardis Subscribers: llvm-commits, arichardson Differential Revision: https://reviews.llvm.org/D35874 llvm-svn: 310891	2017-08-14 21:49:38 +00:00
Craig Topper	0aa3a19512	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889	2017-08-14 21:39:51 +00:00
Chandler Carruth	bba762a13f	[InlineCost] Refactor the checks for different analyses to be a bit more localized to the code that uses those analyses. Technically, this can change behavior as we no longer require the existence of the ProfileSummaryInfo analysis to use local profile information via BFI. We didn't actually require the PSI to have an interesting profile though, so this only really impacts the behavior in non-default pass pipelines. IMO, this makes it substantially less surprising how everything works -- before an analysis that wasn't actually used had to exist to trigger any profile aware inlining. I think the new organization makes it more obvious where various checks for profile signals happen. Differential Revision: https://reviews.llvm.org/D36710 llvm-svn: 310888	2017-08-14 21:25:00 +00:00
Andrew Kaylor	53a5fbb45f	Add strictfp attribute to prevent unwanted optimizations of libm calls Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885	2017-08-14 21:15:13 +00:00
Kostya Serebryany	e3cb3c519f	[libFuzzer] try to use less RAM while processing the initial corpus llvm-svn: 310881	2017-08-14 20:34:35 +00:00
Kostya Serebryany	47cb4856d4	[libFuzzer] explicitly use -fsanitize-coverage=trace-pc-guard in test/dump_coverage.test; mark print_coverage/dump_coverage as To-be-deprecated llvm-svn: 310877	2017-08-14 19:55:23 +00:00
Matt Arsenault	81da0d45f8	IPRA: Allow target to enable IPRA by default llvm-svn: 310876	2017-08-14 19:54:47 +00:00
Matt Arsenault	f9273c81d6	IPRA: Run RegUsageInfoPropagate much later This was running immediately after isel, before isel pseudos were even expanded which is really unreasonable. Move this to before pre-regalloc passes in case some other pre-regalloc pass wants to use the updated regmask info. Fixes one of the reasons IPRA doesn't do anything on AMDGPU currently. Tests will be included with future patch after a few more are fixed. llvm-svn: 310875	2017-08-14 19:54:45 +00:00
Craig Topper	69fa8e0d99	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873	2017-08-14 19:09:32 +00:00
Craig Topper	9c7b881677	Revert r310870 "[InstCombine][InstSimplify] 'git add' two files that moved in r310869." An extra change crept in here. llvm-svn: 310872	2017-08-14 19:09:28 +00:00
Craig Topper	914c836842	[InstCombine][InstSimplify] 'git add' two files that moved in r310869. llvm-svn: 310870	2017-08-14 19:01:32 +00:00
Craig Topper	2f0b450666	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869	2017-08-14 18:49:42 +00:00
Craig Topper	58def1e1f2	[InstSimplify] Add some test cases for selects with bittests hidden in ugt/ult/uge/ule compares. NFC llvm-svn: 310868	2017-08-14 18:49:39 +00:00
Lei Huang	451ef4adcd	[PowerPC] Add codegen for VSX word extract convert to FP Add codegen for VSX word extract conversion from signed/unsigned to single/double precision. For UINT_TO_FP: Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239. Here we will add the missing extract integer and conversion to double. This utilizes the new P9 instruction xxextractuw to extract an integer element when the result will be converted to double thereby saving 2 direct moves (VSR <-> GPR). For SINT_TO_FP: We will implement the following sequence which will also reduce the number of instructions by saving 2 direct moves. v4i32->f32: xxspltw xvcvsxwsp xscvspdpn v4i32->f64: xxspltw xvcvsxwdp Differential Revision: https://reviews.llvm.org/D35859 llvm-svn: 310866	2017-08-14 18:09:29 +00:00
Aditya Nandakumar	86021a2345	[GISel]: Add some helper constructors to MIRBuilder https://reviews.llvm.org/D36636 llvm-svn: 310860	2017-08-14 17:25:11 +00:00
Hal Finkel	b03dd4be70	[ValueTracking] Don't delete assumes of side-effectful instructions ValueTracking has to strike a balance when attempting to propagate information backwards from assumes, because if the information is trivially propagated backwards, it can appear to LLVM that the assumption is known to be true, and therefore can be removed. This is sound (because an assumption has no semantic effect except for causing UB), but prevents the assume from allowing further optimizations. The isEphemeralValueOf check exists to try and prevent this issue by not removing the source of an assumption. This tries to make it a little bit more general to handle the case of side-effectful instructions, such as in %0 = call i1 @get_val() %1 = xor i1 %0, true call void @llvm.assume(i1 %1) Patch by Ariel Ben-Yehuda, thanks! Differential Revision: https://reviews.llvm.org/D36590 llvm-svn: 310859	2017-08-14 17:11:43 +00:00
Simon Dardis	c3f6b2806f	Revert "Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."" This reverts r310834. It didn't pacify the buildbot, FileCheck is still crashing. llvm-svn: 310854	2017-08-14 16:20:33 +00:00
Sanjay Patel	92653865e6	[x86] fold the mask op on 8- and 16-bit rotates Ref the post-commit thread for r310770: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html The motivating cases as 'C' source examples can look like this: unsigned char rotate_right_8(unsigned char v, int shift) { // shift &= 7; v = ( v >> shift ) | ( v << ( 8 - shift ) ); return v; } https://godbolt.org/g/K6rc1A Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046: https://bugs.llvm.org/show_bug.cgi?id=34046 Differential Revision: https://reviews.llvm.org/D36644 llvm-svn: 310849	2017-08-14 15:55:43 +00:00
Dinar Temirbulatov	7b78f5e52d	[SLPVectorizer] Schedule bundle with different opcodes. This change lets us schedule a bundle with different opcodes in it, for example: [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847	2017-08-14 15:40:16 +00:00
Craig Topper	411f29de69	[X86] Fix a place that was mishandling X86ISD::UMUL. According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags. Differential Revision: https://reviews.llvm.org/D36654 llvm-svn: 310846	2017-08-14 15:32:40 +00:00
Craig Topper	c0471829b1	[X86] Remove flag setting ISD nodes from computeKnownBitsForTargetNode Summary: The flag result is an i32 type. But it's only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything. My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36657 llvm-svn: 310845	2017-08-14 15:28:49 +00:00
Craig Topper	b9e3e111ad	[AVX512] Make the itinerary parameter actually pass through the AVX512_maskable_common multiclass Summary: This looks to have been disconnected about 3 years ago in r219358. Reviewers: gadi.haber, RKSimon, zvi Reviewed By: gadi.haber Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36658 llvm-svn: 310844	2017-08-14 15:28:48 +00:00
Craig Topper	2374de420b	[AVX512] Remove leftover code for when i1 was a legal type from the fast isel load/store code. Summary: I don't think we need this code anymore. It only existed because i1 used to be legal. There's probably more unneeded code in fast isel still. Reviewers: guyblank, zvi Reviewed By: guyblank Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36652 llvm-svn: 310843	2017-08-14 15:28:47 +00:00
Sanjay Patel	a1067d9bae	[BDCE] reduce scope of an assert (PR34179) The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842	2017-08-14 15:13:46 +00:00
Simon Dardis	cbf55deaa1	Reland "[mips][mt][6/7] Add support for mftr, mttr instructions." This adjusts the tests to hopefully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 llvm-svn: 310834	2017-08-14 12:28:00 +00:00
Amaury Sechet	9c529b6be3	[DAGCombine] Do not try to deduplicate commutative operations if both operands are the same. Summary: It is creating useless work as the commuted node is the same as the node we are working on in that case. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33840 llvm-svn: 310832	2017-08-14 11:44:03 +00:00
Elad Cohen	6a9edda356	[SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx)) into vextract(vNiX,Idx) when creating vextract with getNode(). This case appeared in AVX512 after fixing pr33349 in r310552. Differential revision: https://reviews.llvm.org/D36571 llvm-svn: 310828	2017-08-14 10:49:45 +00:00
Sean Eveson	9edfeac9ea	[llvm-cov] Add an option which maps the location of source directories on another machine to your local copies Summary: This patch adds the -path-equivalence option (example: llvm-cov show -path-equivalence=/origin/path,/local/path) which maps the source code path from one machine to another when using `llvm-cov show`. This is similar to the -filename-equivalence option, but doesn't require you to specify all the source files on the command line. This allows you to generate the coverage data on one machine (e.g. in a CI system), and then use llvm-cov on another machine where you have the same code base on a different path. Reviewers: vsk Reviewed By: vsk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36391 llvm-svn: 310827	2017-08-14 10:20:12 +00:00
Balaram Makam	d9f53414de	MachineInstr: Reason locally about some memory objects before going to AA. This addresses a FIXME in MachineInstr::mayAlias. llvm-svn: 310825	2017-08-14 09:41:40 +00:00
Sam Parker	718c8a6a2a	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance losses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00
Sam Parker	647cce82a3	[AArch64] Remove unused MC function An unused function warning was raised in https://bugs.llvm.org/show_bug.cgi?id=34178. The offending function, in AArch64MCCodeEmitter.cpp, was committed by me last week. Differential Revision: https://reviews.llvm.org/D36665 llvm-svn: 310823	2017-08-14 09:16:13 +00:00
Elad Cohen	3a90a0c10d	Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)" This reverts commit r310782. llvm-svn: 310822	2017-08-14 09:06:00 +00:00
Chandler Carruth	37c7b08710	[ValueTracking] Revert r310583 which enabled functionality that still is causing compile time issues. Moreover, the patch deleted the flag in addition to changing the default, and links to a code review that doesn't even discuss the flag and just has an update to a Clang test case. I've followed up on the commit thread to ask for numbers on compile time at this point, leaving the flag in place until things stabilize, and pointing at specific code that seems to exhibit excessive compile time with this patch. Original commit message for r310583: """ [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2. The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. """ llvm-svn: 310816	2017-08-14 07:03:24 +00:00
Craig Topper	508aa97b25	[AVX-512] Add hasSideEffects = 0 to the 8-bit and 16-bit register broadcasts. llvm-svn: 310813	2017-08-14 05:09:34 +00:00
Craig Topper	ca98bb9c94	[X86] Remove unused argument from the vextract_for_size multiclass. NFC llvm-svn: 310812	2017-08-14 05:09:33 +00:00
Craig Topper	2c8cb2fa87	[AVX512] Remove comment I should have removed in r310808. NFC llvm-svn: 310811	2017-08-14 05:09:31 +00:00
Brian Gesiak	60a3185940	[opt-viewer] Listify `dict_items` for Py3 indexing Summary: In Python 2, calling `dict.items()` returns an indexable `list`, whereas on Python 3 it returns a set-like `dict_items` object, which cannot be indexed. Explicitly convert the `dict_items` object so that it can be indexed when using Python 3. In combination with D36622, D36623, and D36624, this change allows `opt-viewer.py` to exit successfully when run with Python 3.4. Test Plan: Run `opt-viewer.py` using Python 3.4 and confirm it does not encounter a runtime error when indexing into `dict.items()`. Reviewers: anemet Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36630 llvm-svn: 310810	2017-08-14 04:16:43 +00:00
Chandler Carruth	fe6b509f83	[PowerPC] Revert r310346 (and followups r310356 & r310424) which introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch author is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809	2017-08-14 03:41:00 +00:00
Craig Topper	aadec7078e	[AVX512] Simplify the instruction definition for VEXTRACT. NFCI The comment about why we couldn't use avx512_maskable appears to have been incorrect. llvm-svn: 310808	2017-08-14 01:53:10 +00:00
Javed Absar	37b2286564	[ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementation Modernise the code with range-loops etc Reviewed by: @fhahn, @rovka Differential Revision: https://reviews.llvm.org/D36502 llvm-svn: 310807	2017-08-14 01:38:01 +00:00
Craig Topper	f720099007	[InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806	2017-08-14 00:04:21 +00:00
Simon Pilgrim	8d9e6e607a	[X86][BMI] Add BEXTR demanded bits test cases (PR34042) llvm-svn: 310802	2017-08-13 20:35:38 +00:00
Craig Topper	5b59176abb	[X86] Fix typo from r310794. Index = 0 should have been Index == 0. llvm-svn: 310801	2017-08-13 20:21:12 +00:00
Craig Topper	2dfc7889fd	[X86] Remove unused pattern fragment that referenced MVT::i1. NFC llvm-svn: 310799	2017-08-13 20:04:05 +00:00
Martin Storsjo	2341319564	[COFF, ARM64] Use '//' as comment character in assembly files in GNU environments This allows using semicolons for bundling up more than one statement per line. This is used within the mingw-w64 project in some assembly files that contain code for multiple architectures. Differential Revision: https://reviews.llvm.org/D36366 llvm-svn: 310797	2017-08-13 19:42:05 +00:00
Alex Bradbury	f1af56c26a	Remove RISCV from LLVM_ALL_TARGETS in CMakeLists.txt It was mistakenly added to that list in D23560 (committed in rL285712). RISCV is an experimental backend and should never have been in that list, I mistakenly interpreted LLVM_ALL_TARGETS as a list of all targets rather than targets to build by default. Unfortunately, because of this the RISCV backend has been building by default when it shouldn't be. This commit adds a description comment, which should help to avoid such mistakes in the future. See my message to llvm-dev for more information and analysis <http://lists.llvm.org/pipermail/llvm-dev/2017-August/116347.html>. Differential Revision: https://reviews.llvm.org/D36538 llvm-svn: 310796	2017-08-13 18:49:33 +00:00
Craig Topper	43e3b788f4	[AVX512] Correct isExtractSubvectorCheap so that it will return the correct answers for extracting 128-bits from a 512-bit vector and for mask registers. Previously it would not return true for extracting either of the upper quarters of a 512-bit registers. For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register. Differential Revision: https://reviews.llvm.org/D36638 llvm-svn: 310794	2017-08-13 17:40:02 +00:00
Craig Topper	2251ef95a3	[X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap Summary: Without the SrcVT it's hard to know what is really being asked for. For example, if your target has 128, 256, and 512 bit vectors, maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not. For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors. Reviewers: RKSimon, zvi, efriedma Reviewed By: RKSimon Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36649 llvm-svn: 310793	2017-08-13 17:29:07 +00:00
Gadi Haber	bed2c50607	[X86][SandyBridge] Additional updates to the SNB instructions scheduling information This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019). In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit. The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080. Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim Differential Revision: https://reviews.llvm.org/D36388 llvm-svn: 310792	2017-08-13 13:59:24 +00:00
Simon Pilgrim	631991fdde	[X86][AVX512] Added additional shuffle+trunc test case. An existing test should have covered this but a typo caused it to fail. I've kept both as the codegen for the typo case needs addressing as well. llvm-svn: 310791	2017-08-13 12:30:36 +00:00
Simon Pilgrim	ad1c5566a7	[X86][TBM] Add tests showing failure to fold RFLAGS result into TBM instructions. And fails to select TBM instructions at all. llvm-svn: 310790	2017-08-13 12:16:00 +00:00
Coby Tayree	799fa2c76e	[X86][AsmParser][AVX512] Error appropriately when K0 is tried as a write-mask K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn) Conforms with gas Differential Revision: https://reviews.llvm.org/D36570 llvm-svn: 310789	2017-08-13 12:03:00 +00:00
Simon Pilgrim	808ce12878	[X86][TBM] Regenerate bextri intrinsics tests. NFCI. llvm-svn: 310788	2017-08-13 11:56:15 +00:00
Guy Blank	de425ae753	[X86][AVX512] Add combine for TESTM Add an X86 combine for TESTM when one of the operands is a BUILD_VECTOR(0,0,...). TESTM op0, BUILD_VECTOR(0,0,...) -> BUILD_VECTOR(0,0,...) TESTM BUILD_VECTOR(0,0,...), op1 -> BUILD_VECTOR(0,0,...) Differential Revision: https://reviews.llvm.org/D36536 llvm-svn: 310787	2017-08-13 08:03:37 +00:00
Craig Topper	77dd140786	[X86] Early out of combineInsertSubvector for mask vectors. The combines here shouldn't be done for mask vectors, but it wasn't clear anything was preventing that. llvm-svn: 310786	2017-08-12 22:33:58 +00:00
Craig Topper	dbca6d47f3	[X86] Fix bad comment. NFC llvm-svn: 310785	2017-08-12 22:33:57 +00:00
Craig Topper	44cb1ffb6a	[X86] When handling addcarry intrinsic, create the flag result with the correct type so we don't crash if we use a memory instruction Summary: Previously we were creating the flag result with MVT::Other which is interpreted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result. Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types. We should probably consider this for 5.0 as well. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36645 llvm-svn: 310784	2017-08-12 20:19:44 +00:00


153375 Commits
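
Note on commit 60a3185940 ([opt-viewer] Listify `dict_items` for Py3 indexing): the Python 3 behavior described in that message can be reproduced with a minimal sketch. The dictionary name and its contents below are hypothetical and are not taken from `opt-viewer.py`; the sketch only illustrates why wrapping `dict.items()` in `list()` makes indexing work.

```python
# Hypothetical data for illustration; not taken from opt-viewer.py.
remarks_per_file = {"a.c": 3, "b.c": 7}

# On Python 3, dict.items() returns a set-like view that cannot be indexed.
items_view = remarks_per_file.items()
try:
    first_from_view = items_view[0]
except TypeError:
    # Python 3 raises TypeError: 'dict_items' object is not subscriptable.
    first_from_view = None

# Converting the view to a list restores index access on Python 3
# (and is harmless on Python 2, where items() already returns a list).
items_list = list(remarks_per_file.items())
first_pair = items_list[0]
```

The `list(...)` wrapper copies the pairs once, so later changes to the dictionary are not reflected in `items_list`.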