llvm-project

Commit Graph

Author	SHA1	Message	Date
Geoff Berry	d46b6e8096	[AArch64] Fold some filled/spilled subreg COPYs Summary: Extend AArch64 foldMemoryOperandImpl() to handle folding spills of subreg COPYs with read-undef defs like: %vreg0:sub_32<def,read-undef> = COPY %WZR; GPR64:%vreg0 by widening the spilled physical source reg and generating: STRXui %XZR <fi#0> as well as folding fills of similar COPYs like: %vreg0:sub_32<def,read-undef> = COPY %vreg1; GPR64:%vreg0, GPR32:%vreg1 by generating: %vreg0:sub_32<def,read-undef> = LDRWui <fi#0> Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27425 llvm-svn: 291180	2017-01-05 21:51:42 +00:00
Xin Tong	8b8a600d92	Fix typo. NFC llvm-svn: 291178	2017-01-05 21:40:08 +00:00
Teresa Johnson	6c475a7595	ThinLTO: add early "dead-stripping" on the Index Summary: Using the linker-supplied list of "preserved" symbols, we can compute the list of "dead" symbols, i.e. the one that are not reachable from a "preserved" symbol transitively on the reference graph. Right now we are using this information to mark these functions as non-eligible for import. The impact is two folds: - Reduction of compile time: we don't import these functions anywhere or import the function these symbols are calling. - The limited number of import/export leads to better internalization. Patch originally by Mehdi Amini. Reviewers: mehdi_amini, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23488 llvm-svn: 291177	2017-01-05 21:34:18 +00:00
Joerg Sonnenberger	83963995c6	PR 31534: When emitting both DWARF unwind tables and debug information, do not use .cfi_sections. This requires checking if any non-declaration function in the module needs an unwind table. llvm-svn: 291172	2017-01-05 20:55:28 +00:00
Michael Kuperstein	c9acad12e9	[LICM] Allow promotion of some stores that are not guaranteed to execute. Promotion is always legal when a store within the loop is guaranteed to execute. However, this is not a necessary condition - for promotion to be memory model semantics-preserving, it is enough to have a store that dominates every exit block. This is because if the store dominates every exit block, the fact the exit block was executed implies the original store was executed as well. Differential Revision: https://reviews.llvm.org/D28147 llvm-svn: 291171	2017-01-05 20:42:06 +00:00
Matthias Braun	1172332203	CodeGen: Assert that liveness is up to date when reading block live-ins. Add an assert that checks whether liveins are up to date before they are used. - Do not print liveins into .mir files anymore in situations where they are out of date anyway. - The assert in the RegisterScavenger is superseded by the new one in livein_begin(). - Skip parts of the liveness updating logic in IfConversion.cpp when liveness isn't tracked anymore (just enough to avoid hitting the new assert()). Differential Revision: https://reviews.llvm.org/D27562 llvm-svn: 291169	2017-01-05 20:01:19 +00:00
Evgeniy Stepanov	e8e11eb726	Revert "Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")" Summary: This reverts commit r291144. It breaks build bots. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/3270, http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/2058 lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:1638:12: error: could not convert ‘(const unsigned int)(& Variants)’ from ‘const unsigned int’ to ‘llvm::ArrayRef<unsigned int>’ return Variants; Reviewers: eugenis, tstellarAMD Patch by Alex Shlyapnikov. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D28372 llvm-svn: 291168	2017-01-05 19:51:13 +00:00
Simon Pilgrim	4c050c2190	[CostModel][X86] Move vXi32 MUL costs into existing tables. NFCI. llvm-svn: 291165	2017-01-05 19:42:43 +00:00
Simon Pilgrim	6f72eba606	Remove trailing whitespace. NFCI. llvm-svn: 291163	2017-01-05 19:24:25 +00:00
Simon Pilgrim	5b06e4d319	[CostModel][X86] Reordered SSE42 arithmetic cost LUT into descending order. NFCI. llvm-svn: 291162	2017-01-05 19:19:39 +00:00
Simon Pilgrim	a8bf97569a	[CostModel][X86] Move vXi64 MUL costs into existing tables. NFCI. Removes need for yet another LUT. llvm-svn: 291158	2017-01-05 19:01:50 +00:00
Andrew Kaylor	7353cf4623	[LICM] Small update to note changes made in hoistRegion Differential Revision: https://reviews.llvm.org/D28363 llvm-svn: 291157	2017-01-05 18:53:24 +00:00
Simon Pilgrim	430d34fc14	[CostModel][X86] Strip unused 256-bit vector shift costs. NFCI. Remove SSE2 256-bit entries - AVX targets will have used the SSE42 costs instead. llvm-svn: 291152	2017-01-05 18:36:48 +00:00
Simon Pilgrim	b01e844241	[CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL Matches other MUL/ADD/SUB 256-bit case on AVX1 llvm-svn: 291149	2017-01-05 18:20:25 +00:00
Joerg Sonnenberger	d7baada5dd	Typo llvm-svn: 291147	2017-01-05 17:59:22 +00:00
Simon Pilgrim	f74700aa8c	[CostModel][X86] Merged SK_PermuteSingleSrc/SK_PermuteTwoSrc into common shuffle cost LUTs. NFCI. llvm-svn: 291146	2017-01-05 17:56:19 +00:00
Matt Arsenault	ec63f62c58	Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector") Arrays are supposed to be static const llvm-svn: 291144	2017-01-05 17:36:11 +00:00
Xin Tong	9efb049fb3	Remove a unnecessary hasLoopInvariantOperands check in loop sink. Summary: Preheader instruction's operands will always be invariant w.r.t. the loop which its the preheader for. Memory aliases are handled in canSinkOrHoistInst. Reviewers: danielcdh, davidxl Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D28270 llvm-svn: 291132	2017-01-05 16:52:37 +00:00
Sanjay Patel	dea5a7bd53	less braces; NFC llvm-svn: 291126	2017-01-05 16:47:32 +00:00
Simon Pilgrim	bca02f9e20	[CostModel][X86] Add support for broadcast shuffle costs Currently only for broadcasts with input and output of the same width. Differential Revision: https://reviews.llvm.org/D27811 llvm-svn: 291122	2017-01-05 15:56:08 +00:00
Zvi Rackover	4b7d724d62	[X86] Optimize vector shifts with variable but uniform shift amounts Summary: For instructions such as PSLLW/PSLLD/PSLLQ a variable shift amount may be passed in an XMM register. The lower 64-bits of the register are evaluated to determine the shift amount. This patch improves the construction of the vector containing the shift amount. Reviewers: craig.topper, delena, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28353 llvm-svn: 291120	2017-01-05 15:11:43 +00:00
Teresa Johnson	2b60384581	[ThinLTO] Add parenthesis as per build warning Fixes a warning about "\|\|" and "&&" due to r291108. llvm-svn: 291119	2017-01-05 15:10:10 +00:00
Tony Jiang	3a2f00b024	[PowerPC] Implement missing ISA 2.06 instructions. Instructions: fctidu[.], fctiwu[.], ftdiv, ftsqrt are not implemented. Implement them and add corresponding test cases in this patch. llvm-svn: 291116	2017-01-05 15:00:45 +00:00
Teresa Johnson	e27b058de3	[ThinLTO] Use DenseSet instead of SmallPtrSet for holding GUIDs Should fix some more bot failures from r291108. This should have been a DenseSet, since GUID is not a pointer type. It caused some bots to fail, but for some reason I wasn't getting a build failure. llvm-svn: 291115	2017-01-05 14:59:56 +00:00
Simon Pilgrim	a62395a4bd	[CostModel][X86] Pulled out common type legalization code llvm-svn: 291109	2017-01-05 14:33:32 +00:00
Teresa Johnson	519465b993	[ThinLTO] Subsume all importing checks into a single flag Summary: This adds a new summary flag NotEligibleToImport that subsumes several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal and IsNotViableToInline). It also subsumes the checking of references on the summary that was being done during the thin link by eligibleForImport() for each candidate. It is much more efficient to do that checking once during the per-module summary build and record it in the summary. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28169 llvm-svn: 291108	2017-01-05 14:32:16 +00:00
Mohammed Agabaria	23599ba794	Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and need some extra cost for address computation handling. This code seems to be target dependent which may not be the same for all targets. Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general. Differential Revision: https://reviews.llvm.org/D27518 llvm-svn: 291106	2017-01-05 14:03:41 +00:00
Kristof Beyls	a983e7c4a4	[GlobalISel] Add support for address-taken basic blocks To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level basic blocks need to be initialized, as the AsmPrinter uses this link to be able to print out labels for the basic blocks that are address-taken. Most of the changes in this commit are about adapting existing tests to include the basic block name that is now printed out in the MIR format, now that the name becomes available as the link to the LLVM-IR basic block is initialized. The relevant test change for the functionality added in this patch are the added "(address-taken)" strings in test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D28123 llvm-svn: 291105	2017-01-05 13:27:52 +00:00
Kristof Beyls	eced071e88	[GlobalISel] Add support for switch statements This commit does this using a trivial chain of conditional branches. In the future, we probably want to reuse the optimized switch lowering used in SelectionDAG. Differential Revision: https://reviews.llvm.org/D28176 llvm-svn: 291099	2017-01-05 11:28:51 +00:00
Kristof Beyls	2252440b81	[GlobalISel] Fix AArch64 ICMP instruction selection Differential Revision: https://reviews.llvm.org/D28175 llvm-svn: 291097	2017-01-05 10:16:08 +00:00
Mohammed Agabaria	189e2d29ba	[Test Commit] fixing some format issue in X86TTI to match clang-format output. llvm-svn: 291095	2017-01-05 09:51:02 +00:00
Elena Demikhovsky	143cbc425b	AVX-512: Optimized pattern for truncate with unsigned saturation. DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512. Differential revision: https://reviews.llvm.org/D28216 llvm-svn: 291092	2017-01-05 08:21:09 +00:00
Craig Topper	33c544bdb0	[X86] Add Intel Kaby Lake model numbers to getHostCPUName aliased to "skylake" since there are no feature differences. Model numbers found here http://www.sandpile.org/x86/cpuid.htm llvm-svn: 291086	2017-01-05 05:57:27 +00:00
Saleem Abdulrasool	6252bd8eac	MC: support passing search paths to the IAS This is needed to support inclusion in inline assembly via the `.include` directive. llvm-svn: 291085	2017-01-05 05:56:39 +00:00
Craig Topper	1ab35fa7a8	[X86] Change getHostCPUName to report Intel model 0x4e as "skylake" instead of "skylake-avx512". Add the proper 0x55 model for "skylake-avx512". Summary: Intel's i5-6300U CPU is reporting to have a model id of 78 (4e). The Host detection assumes that to be Skylake Xeon (with AVX512 support), instead of a normal Skylake machine. Patch by: Valentin Churavy Reviewers: nalimilan, craig.topper Subscribers: hfinkel, tkelman, craig.topper, nalimilan, llvm-commits Differential Revision: https://reviews.llvm.org/D28221 llvm-svn: 291084	2017-01-05 05:47:29 +00:00
Kostya Serebryany	2648243ebd	[libFuzzer] use /tmp (or $TMPDIR, if present) to store temp files during merge llvm-svn: 291078	2017-01-05 04:32:19 +00:00
Peter Collingbourne	b2ce2b6805	IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests. Set up basic YAML I/O support for module summaries, plumb the summary into the pass and add a few command line flags to test YAML I/O support. Bitcode support to come separately, as will the code in LowerTypeTests that actually uses the summary. Also add a couple of tests that pass by virtue of the pass doing nothing with the summary (which happens to be the correct thing to do for those tests). Differential Revision: https://reviews.llvm.org/D28041 llvm-svn: 291069	2017-01-05 03:39:00 +00:00
Richard Smith	d4d575b955	Revert r291025 ("AMDGPU: Remove unneccessary intermediate vector") This caused buildbot failures due to returning ArrayRefs referencing local (temporary) objects. llvm-svn: 291067	2017-01-05 03:13:10 +00:00
Wolfgang Pieb	ce13e716c5	[DWARF] Null out the debug locs of load instructions that have been moved by GVN performing partial redundancy elimination (PRE). Not doing so can cause jumpy line tables and confusing (though correct) source attributions. Differential Revision: https://reviews.llvm.org/D27857 llvm-svn: 291037	2017-01-04 23:58:26 +00:00
Mehdi Amini	19ef4fad91	Use lazy-loading of Metadata in MetadataLoader when importing is enabled (NFC) Summary: This is a relatively simple scheme: we use the index emitted in the bitcode to avoid loading all the global metadata. Instead we load the index with their position in the bitcode so that we can load each of them individually. Materializing the global metadata block in this condition only triggers loading the named metadata, and the ones referenced from there (transitively). When materializing a function, metadata from the global block are loaded lazily as they are referenced. Two main current limitations are: 1) Global values other than functions are not materialized on demand, so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records (and their transitive dependencies). 2) When we load a single metadata, we don't recurse on the operands, instead we use a placeholder or a temporary metadata. Unfortunately temporary nodes are very expensive. This is why we don't have it always enabled and only for importing. These two limitations can be lifted in a subsequent improvement if needed. With this change, the total link time of opt with ThinLTO and Debug Info enabled is going down from 282s to 224s (~20%). Reviewers: pcc, tejohnson, dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28113 llvm-svn: 291027	2017-01-04 22:54:33 +00:00
Mehdi Amini	867aad1359	Change BitstreamCursor::skipRecord to return the record code (NFC) llvm-svn: 291026	2017-01-04 22:54:14 +00:00
Matt Arsenault	6796d7ea8b	AMDGPU: Remove unneccessary intermediate vector llvm-svn: 291025	2017-01-04 22:54:10 +00:00
Matt Arsenault	3bdd75d01e	InstCombine: Fold cos(-x) -> cos(x) Also cos(fabs(x)) -> cos(x) llvm-svn: 291022	2017-01-04 22:49:03 +00:00
David Blaikie	7ad9dc11db	Reapply "Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr"" If this is a problem for anyone (shared_ptr is two pointers in size, whereas IntrusiveRefCntPtr is 1 - and the ref count control block that make_shared adds is probably larger than the one int in RefCountedBase) I'd prefer to address this by adding a lower-overhead version of shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing) to avoid the intrusiveness - this allows memory ownership to remain orthogonal to types and at least to me, seems to make code easier to understand (since no implicit ownership acquisition can happen). This recommits 291006, reverted in r291007. llvm-svn: 291016	2017-01-04 22:36:33 +00:00
Tim Shen	5480eb8445	[Legalizer] Fix fp-to-uint to fp-tosint promotion assertion. Summary: When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero extended. For example, given double 65534.0, without legalization: fp-to-uint16: 65534.0 -> 0xfffe With the legalization: fp-to-sint32: 65534.0 -> 0x0000fffe Without this patch, legalization wrongly emits a signed extend assertion, which is consumed by later icmp instruction, and cause miscompile. Note that the floating point value must be in [0, 65535), otherwise the behavior is undefined. This patch reverts r279223 behavior and adds more tests and documentations. In PR29041's context, James Molloy mentioned that: We don't need to mask because conversion from float->uint8_t is undefined if the integer part of the float value is not representable in uint8_t. Therefore we can assume this doesn't happen! which is totally true and good, because fptoui is documented clearly to have undefined behavior when overflow/underflow happens. We should take the advantage of this behavior so that we can save unnecessary mask instructions. Reviewers: jmolloy, nadav, echristo, kbarton Subscribers: mehdi_amini, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28284 llvm-svn: 291015	2017-01-04 22:11:42 +00:00
Evgeny Stupachenko	c88697dc16	The patch fixes (base, index, offset) match. Summary: Instead of matching: (a + i) + 1 -> (a + i, undef, 1) Now it matches: (a + i) + 1 -> (a, i, 1) Reviewers: rengolin Differential Revision: http://reviews.llvm.org/D26367 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 291012	2017-01-04 21:43:39 +00:00
Chad Rosier	63687e40bc	[AArch64] Update the feature set for Qualcomm's Falkor CPU. llvm-svn: 291010	2017-01-04 21:26:23 +00:00
Nirav Dave	0f9d111f97	[AArch64] Fix over-eager early-exit in load-store combiner Fix early-exit analysis for memory operation pairing when operations are not emitted in ascending order. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D28251 llvm-svn: 291008	2017-01-04 21:21:46 +00:00
David Blaikie	6e2207a134	Revert "Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr" Breaks Clang's use of bitcode. Reverting until I have a fix to go with it there. This reverts commit r291006. llvm-svn: 291007	2017-01-04 21:19:28 +00:00
David Blaikie	daff78cd87	Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr If this is a problem for anyone (shared_ptr is two pointers in size, whereas IntrusiveRefCntPtr is 1 - and the ref count control block that make_shared adds is probably larger than the one int in RefCountedBase) I'd prefer to address this by adding a lower-overhead version of shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing) to avoid the intrusiveness - this allows memory ownership to remain orthogonal to types and at least to me, seems to make code easier to understand (since no implicit ownership acquisition can happen). llvm-svn: 291006	2017-01-04 21:13:35 +00:00

98153 Commits