llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	833788a05c	[X86] Remove VPERM2X128 isel patterns with 32-bit elements. Now that the intrinsics are gone we only need 64-bit elements since that's what shuffle lowering uses. llvm-svn: 313453	2017-09-16 08:15:52 +00:00
Craig Topper	f264fcc704	[X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450	2017-09-16 07:36:14 +00:00
Eric Beckmann	913213c8ae	Revert "Fix Bug 30978 by emitting cv file checksums." This reverts commit 6389e7aa724ea7671d096f4770f016c3d86b0d54. There is a bug in this implementation where the string value of the checksum is outputted, instead of the actual hex bytes. Therefore the checksum is incorrect, and this prevent pdbs from being loaded by visual studio. Revert this until the checksum is emitted correctly. llvm-svn: 313431	2017-09-16 01:14:36 +00:00
Adrian Prantl	057d336c0d	llvm-dwarfdump: Add support for -debug-info=<offset>. This is the first of many commits that enable selectively dumping just one record from the debug info. This reapplies r313412 with some extra qualification to appease GCC and MSVC. llvm-svn: 313419	2017-09-15 23:04:04 +00:00
Adrian Prantl	b5abcc558d	Revert "llvm-dwarfdump: Add support for -debug-info=<offset>." This reverts commit r313412 because of a g++ incompatibility. llvm-svn: 313413	2017-09-15 22:47:16 +00:00
Adrian Prantl	fb5d284e97	llvm-dwarfdump: Add support for -debug-info=<offset>. This is the first of many commits that enable selectively dumping just one record from the debug info. llvm-svn: 313412	2017-09-15 22:37:56 +00:00
Chandler Carruth	beb22b5437	[SLP] Revert r312791 and other necessary commits, except for TTI and CostModel. The original patch added support for horizontal min/max reductions to the SLP vectorizer. This patch causes LLVM to miscompile fairly simple signed min reductions. I have attached a test progrom to http://llvm.org/PR34635 that shows the behavior change after this patch. We found this in a test for the open source Eigen library, but also in other code. Unfortunately, the revert is moderately challenging. It required reverting: r313042: [SLP] Test with multiple uses of conditional op and wrong parent. r312853: [SLP] Fix buildbots, NFC. r312793: [SLP] Fix the warning about paths not returning the value, NFC. r312791: [SLP] Support for horizontal min/max reduction. And even then, I had to completely skip reverting the changes to TTI and CostModel because r312832 rewrote so much of this code. Plus, the cost modeling changes aren implicated in the miscompile, so they should be fine and will just not be used until this gets re-introduced. llvm-svn: 313409	2017-09-15 22:23:27 +00:00
Reid Kleckner	eed0973270	Name the sentinel value used for the location number of the undefined register NFC llvm-svn: 313405	2017-09-15 22:08:50 +00:00
Reid Kleckner	3a66c1cb58	[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs Summary: This comes up in optimized debug info for C++ programs that pass and return objects indirectly by address. In these programs, llvm.dbg.declare survives optimization, which causes us to emit indirect DBG_VALUE instructions. The fast register allocator knows to insert DW_OP_deref when spilling indirect DBG_VALUE instructions, but the LiveDebugVariables did not until this change. This fixes part of PR34513. I need to look into why this doesn't work at -O0 and I'll send follow up patches to handle that. Reviewers: aprantl, dblaikie, probinson Subscribers: qcolombet, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37911 llvm-svn: 313400	2017-09-15 21:54:38 +00:00
Reid Kleckner	9e6c309ef3	[DebugInfo] Add missing DW_OP_deref when an NRVO pointer is spilled Summary: Fixes PR34513. Indirect DBG_VALUEs typically come from dbg.declares of non-trivially copyable C++ objects that must be passed by address. We were already handling the case where the virtual register gets allocated to a physical register and is later spilled. That's what usually happens for normal parameters that aren't NRVO variables: they usually appear in physical register parameters, and are spilled later in the function, which would correctly add deref. NRVO variables are different because the dbg.declare can come much later after earlier instructions cause the incoming virtual register to be spilled. Also, clean up this code. We only need to look at the first operand of a DBG_VALUE, which eliminates the operand loop. Reviewers: aprantl, dblaikie, probinson Subscribers: MatzeB, qcolombet, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37929 llvm-svn: 313399	2017-09-15 21:49:56 +00:00
Steven Wu	ab211df5de	[AutoUpgrade] Fix a compatibility issue with module flag Summary: After r304661, module flag to record objective-c image info section is encoded without whitespaces after comma. The new name is equivalent to the old one, except that when LTO a module built by old compiler and a module built by a new compiler, it will fail with conflicting values. Fix the issue by removing whitespaces in bitcode upgrade path. rdar://problem/34416934 Reviewers: compnerd Reviewed By: compnerd Subscribers: mehdi_amini, hans, llvm-commits Differential Revision: https://reviews.llvm.org/D37909 llvm-svn: 313398	2017-09-15 21:12:14 +00:00
Sam Clegg	759631c77b	[WebAssembly] MC: Create wasm data segments based on MCSections This means that we can honor -fdata-sections rather than always creating a segment for each symbol. It also allows for a followup change to add .init_array and friends. Differential Revision: https://reviews.llvm.org/D37876 llvm-svn: 313395	2017-09-15 20:54:59 +00:00
Davide Italiano	dee018c51f	[ConstantFold] Return the correct type when folding a GEP with vector indices. As Eli pointed out (and I got wrong in the first place), langref says: "The getelementptr returns a vector of pointers, instead of a single address, when one or more of its arguments is a vector. In such cases, all vector arguments should have the same number of elements, and every scalar argument will be effectively broadcast into a vector during address calculation." Costantfold for gep doesn't really take in account this paragraph, returning a pointer instead of a vector of pointer which triggers an assertion in RAUW, as we're trying to replace values with mistmatching types. Differential Revision: https://reviews.llvm.org/D37928 llvm-svn: 313394	2017-09-15 20:53:05 +00:00
Sam Clegg	66a99e41cd	Change encodeU/SLEB128 to pad to certain number of bytes Previously the 'Padding' argument was the number of padding bytes to add. However most callers that use 'Padding' know how many overall bytes they need to write. With the previous code this would mean encoding the LEB once to find out how many bytes it would occupy and then using this to calulate the 'Padding' value. See: https://reviews.llvm.org/D36595 Differential Revision: https://reviews.llvm.org/D37494 llvm-svn: 313393	2017-09-15 20:34:47 +00:00
Vivek Pandya	b5ab895e2a	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313390	2017-09-15 20:10:09 +00:00
Mandeep Singh Grang	1be19e6f5b	[llvm] Fix some typos. NFC. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37922 llvm-svn: 313388	2017-09-15 20:01:43 +00:00
Vivek Pandya	df8598dcc4	This reverts r313381 llvm-svn: 313387	2017-09-15 19:53:54 +00:00
Sam Clegg	457fb0b4a0	[WebAssembly] Pass ArrayRef rather than SmallVector This is more flexible and less verbose. Differential Revision: https://reviews.llvm.org/D37875 llvm-svn: 313384	2017-09-15 19:50:44 +00:00
Vivek Pandya	00d887447b	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313382	2017-09-15 19:30:59 +00:00
Sam Clegg	aff1c4df25	[WebAssembly] MC: Fix crash in getProvitionalValue on weak references - Create helper function for resolving weak references. - Add test that preproduces the crash. Differential Revision: https://reviews.llvm.org/D37916 llvm-svn: 313381	2017-09-15 19:22:01 +00:00
Hans Wennborg	534bfbd3ba	Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." This caused PR34629: asserts firing when building Chromium. It also broke some buildbots building test-suite as reported on the commit thread. > Summary: > 1/ Operand folding during complex pattern matching for LEAs has been > extended, such that it promotes Scale to accommodate similar operand > appearing in the DAG. > e.g. > T1 = A + B > T2 = T1 + 10 > T3 = T2 + A > For above DAG rooted at T3, X86AddressMode will no look like > Base = B , Index = A , Scale = 2 , Disp = 10 > > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs > so that if there is an opportunity then complex LEAs (having 3 operands) > could be factored out. > e.g. > leal 1(%rax,%rcx,1), %rdx > leal 1(%rax,%rcx,2), %rcx > will be factored as following > leal 1(%rax,%rcx,1), %rdx > leal (%rdx,%rcx) , %edx > > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, > thus avoiding creation of any complex LEAs within a loop. > > Reviewers: lsaba, RKSimon, craig.topper, qcolombet > > Reviewed By: lsaba > > Subscribers: spatel, igorb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313376	2017-09-15 18:40:26 +00:00
Eric Beckmann	349746f044	Fix Bug 30978 by emitting cv file checksums. Summary: The checksums had already been placed in the IR, this patch allows MCCodeView to actually write it out to an MCStreamer. Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37157 llvm-svn: 313374	2017-09-15 18:20:28 +00:00
Craig Topper	7a183e2760	[X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with a insertf128 The early out for AVX2 in lowerV2X128VectorShuffle is positioned in a weird spot below some shuffle mask equivalency checks. But I think we want to allow VPERMQ for any unary shuffle. Differential Revision: https://reviews.llvm.org/D37893 llvm-svn: 313373	2017-09-15 18:11:13 +00:00
Adrian Prantl	8416802ea4	llvm-dwarfdump: Factor out the printing of the section header (NFC) llvm-svn: 313370	2017-09-15 17:39:50 +00:00
Craig Topper	f1620b2555	[X86] Use SDNode::ops() instead of makeArrayRef and op_begin(). NFCI llvm-svn: 313367	2017-09-15 17:09:05 +00:00
Craig Topper	e0d724cf51	[X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split. This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues. Fixes PR34605. Differential Revision: https://reviews.llvm.org/D37858 llvm-svn: 313366	2017-09-15 17:09:03 +00:00
Craig Topper	143797eb89	[X86] Add isel pattern infrastructure to begin recognizing when we're inserting 0s into the upper portions of a vector register and the producing instruction as already produced the zeros. Currently if we're inserting 0s into the upper elements of a vector register we insert an explicit move of the smaller register to implicitly zero the upper bits. But if we can prove that they are already zero we can skip that. This is based on a similar idea of what we do to avoid emitting explicit zero extends for GR32->GR64. Unfortunately, this is harder for vector registers because there are several opcodes that don't have VEX equivalent instructions, but can write to XMM registers. Among these are SHA instructions and a MMX->XMM move. Bitcasts can also get in the way. So for now I'm starting with explicitly allowing only VPMADDWD because we emit zeros in combineLoopMAddPattern. So that is placing extra instruction into the reduction loop. I'd like to allow PSADBW as well after D37453, but that's currently blocked by a bitcast. We either need to peek through bitcasts or canonicalize insert_subvectors with zeros to remove bitcasts on the value being inserted. Longer term we should probably have a cleanup pass that removes superfluous zeroing moves even when the producer is in another basic block which is something these isel tricks can't do. See PR32544. Differential Revision: https://reviews.llvm.org/D37653 llvm-svn: 313365	2017-09-15 17:09:00 +00:00
Anna Thomas	f34537dff8	[RuntimeUnroll] Add heuristic for unrolling multi-exit loop Add a profitability heuristic to enable runtime unrolling of multi-exit loop: There can be atmost two unique exit blocks for the loop and the second exit block should be a deoptimizing block. Also, there can be one other exiting block other than the latch exiting block. The reason for the latter is so that we limit the number of branches in the unrolled code to being at most the unroll factor. Deoptimizing blocks are rarely taken so these additional number of branches created due to the unrolling are predictable, since one of their target is the deopt block. Reviewers: apilipenko, reames, evstupac, mkuper Subscribers: llvm-commits Reviewed by: reames Differential Revision: https://reviews.llvm.org/D35380 llvm-svn: 313363	2017-09-15 15:56:05 +00:00
Krzysztof Parzyszek	557729761c	[Hexagon] Switch to parameterized register classes for HVX This removes the duplicate HVX instruction set for the 128-byte mode. Single instruction set now works for both modes (64- and 128-byte). llvm-svn: 313362	2017-09-15 15:46:05 +00:00
Anna Thomas	512dde77ba	[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup During runtime unrolling on loops with multiple exits, we update the exit blocks with the correct phi values from both original and remainder loop. In this process, we lookup the VMap for the mapped incoming phi values, but did not update the VMap if a default entry was generated in the VMap during the lookup. This default value is generated when constants or values outside the current loop are looked up. This patch fixes the assertion failure when null entries are present in the VMap because of this lookup. Added a testcase that showcases the problem. llvm-svn: 313358	2017-09-15 13:29:33 +00:00
Ilya Biryukov	d23faa843e	Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." This reverts commit r313348. Reason: it caused buildbot failures. llvm-svn: 313352	2017-09-15 10:15:00 +00:00
Sjoerd Meijer	0c5ba21cbf	[AArch64] allow v8f16 types when FullFP16 is supported This adds support for allowing v8f16 vector types, thus avoiding conversions from/to single precision for these types. This is a follow up patch of commits r311154 and r312104, which added support for scalars and v4f16 types, respectively. Differential Revision: https://reviews.llvm.org/D37802 llvm-svn: 313351	2017-09-15 09:24:48 +00:00
Jonas Paulsson	6188f326eb	Recommit "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them" This was temporarily reverted, but now that the fix has been commited (r313197) it should be put back in place. https://bugs.llvm.org/show_bug.cgi?id=34502 This reverts commit 9ef93d9dc4c51568e858cf8203cd2c5ce8dca796. llvm-svn: 313349	2017-09-15 07:47:38 +00:00
Dinar Temirbulatov	e2358b53bc	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { dst++ = src++; dst++ = src++ + 1; dst++ = src++ + 2; dst++ = src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 313348	2017-09-15 06:56:39 +00:00
Lang Hames	d4d6a1aa9b	[ORC] Fix a typo. llvm-svn: 313346	2017-09-15 06:50:19 +00:00
Jatin Bhateja	908c8b37c2	[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will no look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet Reviewed By: lsaba Subscribers: spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313343	2017-09-15 05:29:51 +00:00
Dinar Temirbulatov	bb891b864c	[SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI. llvm-svn: 313341	2017-09-15 04:31:54 +00:00
Martin Pelikan	d78db15629	[XRay] fix and clarify comments in the log file decoder Summary: For readers unfamiliar with the XRay code base, reference the compiler-rt implementation even though we're not allowed to share any code and explain our little-endian views more clearly. For code clarity either get rid of obvious comments or explain their intentions, fix typos, correct coding style according to LLVM's standards and manually CSE long expressions to point out it is the same expression. Reviewers: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34339 llvm-svn: 313340	2017-09-15 04:22:16 +00:00
Reid Kleckner	87288b98b6	[codeview] Use a type index of zero for static method "this" types Otherwise VS won't show anything in the autos or watch window of static methods. llvm-svn: 313329	2017-09-15 00:59:07 +00:00
Alina Sbirlea	7ed5856a32	Refactor collectChildrenInLoop to LoopUtils [NFC] Summary: Move to LoopUtils method that collects all children of a node inside a loop. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37870 llvm-svn: 313322	2017-09-15 00:04:16 +00:00
Sam Clegg	7c39594357	[WebAssembly] Use a separate wasm data segment for each global symbol This is stepping stone towards honoring -fdata-sections and letting the assembler decide how many wasm data segments to create. Differential Revision: https://reviews.llvm.org/D37834 llvm-svn: 313313	2017-09-14 23:07:53 +00:00
Eric Beckmann	5c8194d6ba	Fix bug 34608 by moving private header out of public header. WindowsManifestMerger.h should not include llvm/Config/config.h, since it is private. The include has been moved to the source instead. Summary: The checksums had already been placed in the IR, this patch allows MCCodeView to actually write it out to an MCStreamer. Move private config.h header dependency out of public header file. Addresses Bug 34608 Subscribers: javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37863 llvm-svn: 313312	2017-09-14 23:01:13 +00:00
Craig Topper	c2311f476d	[X86] Remove an unnecessary SmallVector from LowerBUILD_VECTOR. I think this may have existed to convert from SDUse to SDValue, but it doesn't look like its needed now. llvm-svn: 313311	2017-09-14 22:47:59 +00:00
Jan Sjodin	1f2f57a7ea	Fix warnings in r313297. llvm-svn: 313302	2017-09-14 21:49:52 +00:00
Matt Arsenault	c317287fde	AMDGPU: Fix violating constant bus restriction You can't use madmk/madmk if it already uses an SGPR input. llvm-svn: 313298	2017-09-14 20:54:29 +00:00
Jan Sjodin	312ccf761c	Add AddresSpace to PseudoSourceValue. Differential Revision: https://reviews.llvm.org/D35089 llvm-svn: 313297	2017-09-14 20:53:51 +00:00
Krzysztof Parzyszek	788e768ffd	Subtarget support for parameterized register class information Implement "checkFeatures" and emitting HW mode check code. Differential Revision: https://reviews.llvm.org/D31959 llvm-svn: 313295	2017-09-14 20:44:20 +00:00
Benjamin Kramer	591aac7cdf	Remove usages of deprecated std::unary_function and std::binary_function. These are removed in C++17. We still have some users of unary_function::argument_type, so just spell that typedef out. No functionality change intended. Note that many of the argument types are actually wrong :) llvm-svn: 313287	2017-09-14 18:33:25 +00:00
Matt Arsenault	37ab4cf8b8	AMDGPU: Fix assert on alloca of array of struct llvm-svn: 313282	2017-09-14 18:02:29 +00:00
Jonas Devlieghere	d585a20394	[test] Fix TestDWARFDieRangeInfoIntersects Fixes heap buffer overflow triggered in DWARF verifier, detected by ASAN. llvm-svn: 313280	2017-09-14 17:46:23 +00:00
Matt Arsenault	defe371771	AMDGPU: Stop modifying SP in call sequences Because the stack growth direction and addressing is done in the same direction, modifying SP at the beginning of the call sequence was incorrect. If we had a stack passed argument, we would end up skipping that number of bytes before pushing arguments, leaving unused/inconsistent space. The callee creates fixed stack objects in its frame, so the space necessary for these is already logically allocated in the callee, so we just let the callee increment SP if it really requires it. llvm-svn: 313279	2017-09-14 17:37:40 +00:00
Dehao Chen	3a81f84d9a	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: vitalybuka, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313277	2017-09-14 17:29:56 +00:00
Simon Dardis	55e446737f	[mips] Implement the 'dext' aliases and it's disassembly alias. The other members of the dext family of instructions (dextm, dextu) are traditionally handled by the assembler selecting the right variant of 'dext' depending on the values of the position and size operands. When these instructions are disassembled, rather than reporting the actual instruction, an equivalent aliased form of 'dext' is generated and is reported. This is to mimic the behaviour of binutils. Reviewers: slthakur, nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D34887 llvm-svn: 313276	2017-09-14 17:27:53 +00:00
Matt Arsenault	6efd082c01	AMDGPU: Make frame register caller preserved Using SplitCSR for the frame register was very broken. Often the copies in the prolog and epilog were optimized out, in addition to them being inserted after the true prolog where the FP was clobbered. I have a hacky solution which works that continues to use split CSR, but for now this is simpler and will get to working programs. llvm-svn: 313274	2017-09-14 17:14:57 +00:00
Adrian Prantl	5fd3d49bc4	llvm-dwarfdump: support dumping static archives. llvm-svn: 313272	2017-09-14 17:01:53 +00:00
Krzysztof Parzyszek	779d98e1c0	TableGen support for parameterized register class information This replaces TableGen's type inference to operate on parameterized types instead of MVTs, and as a consequence, some interfaces have changed: - Uses of MVTs are replaced by ValueTypeByHwMode. - EEVT::TypeSet is replaced by TypeSetByHwMode. This affects the way that types and type sets are printed, and the tests relying on that have been updated. There are certain users of the inferred types outside of TableGen itself, namely FastISel and GlobalISel. For those users, the way that the types are accessed have changed. For typical scenarios, these replacements can be used: - TreePatternNode::getType(ResNo) -> getSimpleType(ResNo) - TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo) - TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false) For more information, please refer to the review page. Differential Revision: https://reviews.llvm.org/D31951 llvm-svn: 313271	2017-09-14 16:56:21 +00:00
Krzysztof Parzyszek	6ca02b25a7	[IfConversion] More simple, correct dead/kill liveness handling Patch by Jesper Antonsson. Differential Revision: https://reviews.llvm.org/D37611 llvm-svn: 313268	2017-09-14 15:53:11 +00:00
Simon Dardis	6f83ae38a3	[mips] Implement the 'dins' aliases. Traditionally GAS has provided automatic selection between dins, dinsm and dinsu. Binutils also disassembles all instructions in that family as 'dins' rather than the actual instruction. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D34877 llvm-svn: 313267	2017-09-14 15:17:50 +00:00
Sanjay Patel	0d4fd5b668	[InstSimplify] fold sdiv/srem based on compare of dividend and divisor This should bring signed div/rem analysis up to the same level as unsigned. We use icmp simplification to determine when the divisor is known greater than the dividend. Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits. There are extra tests for the signed-min-value special cases. Alive proofs: http://rise4fun.com/Alive/WI5 Differential Revision: https://reviews.llvm.org/D37713 llvm-svn: 313264	2017-09-14 14:59:07 +00:00
Aleksandar Beserminji	7d610f4d06	Test commit. llvm-svn: 313262	2017-09-14 14:34:04 +00:00
Sanjay Patel	cca8f7853f	[InstSimplify] clean up div/rem handling; NFCI The idea to make an 'isDivZero' helper was suggested for the signed case in D37713: https://reviews.llvm.org/D37713 This clean-up makes it clear that D37713 is just filling the gap for signed div/rem, removes unnecessary code, and allows us to remove a bit of duplicated code from the planned improvement in D37713. llvm-svn: 313261	2017-09-14 14:09:11 +00:00
Krzysztof Parzyszek	473d02dbac	[Hexagon] Make getMemAccessSize return size in bytes It used to return the actual field value from the instruction descriptor. There is no reason for that, that value is not interesting in any way and the specifics of its encoding in the descriptor should not be exposed. llvm-svn: 313257	2017-09-14 12:06:40 +00:00
Ayman Musa	ab68449c53	[X86] When applying the shuffle-to-zero-extend transformation on floating point, bitcast to integer first. Fix issue described in PR34577. Differential Revision: https://reviews.llvm.org/D37803 llvm-svn: 313256	2017-09-14 12:06:38 +00:00
Jonas Devlieghere	5891060ff8	[dwarfdump] Add DWARF verifiers for address ranges This patch started as an attempt to rebase Greg's differential (D32821). The result is both quite similar and different at the same time. It adds the following checks: - Verify that all address ranges in a DIE are valid. - Verify that no ranges within the DIE overlap. - Verify that no ranges overlap with the ranges of a sibling. - Verify that children are completely contained in its (direct) parent's address range. (unless both are subprograms) Differential revision: https://reviews.llvm.org/D37696 llvm-svn: 313255	2017-09-14 11:33:42 +00:00
Simon Dardis	28365b33ad	[mips] Pick the right variant of DINS upfront and enable target instruction verification This patch complements D16810 "[mips] Make isel select the correct DEXT variant up front.". Now ISel picks the right variant of DINS, so now there is no need to replace DINS with the appropriate variant during MipsMCCodeEmitter::encodeInstruction(). This patch also enables target specific instruction verification for ins, dins, dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these constraints are not checked during instruction selection. Adding machine verification should catch outstanding cases. Finally, correct a bug that instruction verification uncovered, where the position operand of a DINSU generated during lowering was being silently and accidently corrected to the correct value. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D34809 llvm-svn: 313254	2017-09-14 10:58:00 +00:00
Jonas Devlieghere	a9f55bed8a	Revert "[dwarfdump] Add DWARF verifiers for address ranges" This reverts commit r313250. llvm-svn: 313253	2017-09-14 10:49:15 +00:00
Simon Pilgrim	8bd2d8780a	[DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or. Original patch by @tstellar Differential Revision: https://reviews.llvm.org/D19325 llvm-svn: 313251	2017-09-14 10:38:30 +00:00
Jonas Devlieghere	d7201b3a36	[dwarfdump] Add DWARF verifiers for address ranges This patch started as an attempt to rebase Greg's differential (D32821). The result is both quite similar and different at the same time. It adds the following checks: - Verify that all address ranges in a DIE are valid. - Verify that no ranges within the DIE overlap. - Verify that no ranges overlap with the ranges of a sibling. - Verify that children are completely contained in its (direct) parent's address range. (unless both are subprograms) Differential revision: https://reviews.llvm.org/D37696 llvm-svn: 313250	2017-09-14 10:38:18 +00:00
Simon Pilgrim	523483e0bd	[SelectionDAG] ComputeNumSignBits - cleanup ROTL/ROTR wrapping to match DAGCombine etc. Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1) TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines. llvm-svn: 313245	2017-09-14 10:28:01 +00:00
Chandler Carruth	7376ae88eb	[PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. llvm-svn: 313242	2017-09-14 08:33:57 +00:00
Alon Kom	682cfc1d4c	[LV] Fix maximum legal VF calculation This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237	2017-09-14 07:40:02 +00:00
Dean Michael Berris	01fd7c8bd4	[XRay][CodeGen] Use the current function symbol as the associated symbol for the instrumentation map Summary: XRay had been assuming that the previous section is the "text" section of the function when lowering the instrumentation map. Unfortunately this is not a safe assumption, because we may be coming from lowering debug type information for the function being lowered. This fixes an issue with combining -gsplit-dwarf, -generate-type-units, -debug-compile and -fxray-instrument for sole member functions. When the split dwarf section is stripped, we're left with references from the xray_instr_map to the debug section. The change now uses the function's symbol instead of the previous section's start symbol. We found the bug while attempting to strip the split debug sections off an XRay-instrumented object file, which had a peculiar edge-case for single-function classes where the single function is being lowered. Because XRay had assocaited the instrumentation map for a function to the debug types section instead of the function's section, the objcopy call will fail due to the misplaced reference from the xray_instr_map section. Reviewers: pcc, dblaikie, echristo Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D37791 llvm-svn: 313233	2017-09-14 07:08:23 +00:00
Simon Atanasyan	b35dd1c908	[mips] Recognise the triple used by Debian for MIPS n32 ABI Triples like mips64-linux-gnuabin32 are documented in this article: https://wiki.debian.org/Multiarch/Tuples llvm-svn: 313231	2017-09-14 06:50:05 +00:00
Vitaly Buka	48624d327a	Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230	2017-09-14 05:40:33 +00:00
Peter Collingbourne	cfbd089237	Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222. This reland includes a fix for the LowerTypeTests pass so that it looks past aliases when determining which type identifiers are live. Differential Revision: https://reviews.llvm.org/D37842 llvm-svn: 313229	2017-09-14 05:02:59 +00:00
Dinar Temirbulatov	df0b843875	[SLPVectorizer] Prefer auto over explicit type for VL0, NFCI. llvm-svn: 313228	2017-09-14 04:28:35 +00:00
Hans Wennborg	ae050afeb9	Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping." This broke Chromium's CFI build; see crbug.com/765004. > We were previously handling aliases during dead stripping by adding > the aliased global's "original name" GUID to the worklist. This will > lead to incorrect behaviour if the global has local linkage because > the original name GUID will not correspond to the global's GUID in > the summary. > > Because an alias is just another name for the global that it > references, there is no need to mark the referenced global as used, > or to follow references from any other copies of the global. So all > we need to do is to follow references from the aliasee's summary > instead of the alias. > > Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313222	2017-09-14 00:40:14 +00:00
Matt Arsenault	ecb43ef1bc	AMDGPU: Don't spill SP reg like a normal CSR llvm-svn: 313217	2017-09-13 23:47:01 +00:00
Reid Kleckner	cd7bba0264	[codeview] Fold FIXME into comment, there's nothing to do. NFC llvm-svn: 313214	2017-09-13 23:30:01 +00:00
Hans Wennborg	06e2a384c2	Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs." This caused PR34596. > [MachineCombiner] Update instruction depths incrementally for large BBs. > > Summary: > For large basic blocks with lots of combinable instructions, the > MachineTraceMetrics computations in MachineCombiner can dominate the compile > time, as computing the trace information is quadratic in the number of > instructions in a BB and it's relevant successors/predecessors. > > In most cases, knowing the instruction depth should be enough to make > combination decisions. As we already iterate over all instructions in a basic > block, the instruction depth can be computed incrementally. This reduces the > cost of machine-combine drastically in cases where lots of instructions > are combined. The major drawback is that AFAIK, computing the critical path > length cannot be done incrementally. Therefore we only compute > instruction depths incrementally, for basic blocks with more > instructions than inc_threshold. The -machine-combiner-inc-threshold > option can be used to set the threshold and allows for easier > experimenting and checking if using incremental updates for all basic > blocks has any impact on the performance. > > Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn > > Reviewed By: fhahn > > Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits > > Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 313213	2017-09-13 23:23:09 +00:00
Stanislav Mekhanoshin	7fe9a5d9b4	Allow target to decide when to cluster loads/stores in misched MachineScheduler when clustering loads or stores checks if base pointers point to the same memory. This check is done through comparison of base registers of two memory instructions. This works fine when instructions have separate offset operand. If they require a full calculated pointer such instructions can never be clustered according to such logic. Changed shouldClusterMemOps to accept base registers as well and let it decide what to do about it. Differential Revision: https://reviews.llvm.org/D37698 llvm-svn: 313208	2017-09-13 22:20:47 +00:00
Adrian Prantl	3ae35eb56b	llvm-dwarfdump: automatically dump both regular and .dwo variant of sections Since users typically don't really care about the .dwo / non.dwo distinction, this patch makes it so dwarfdump --debug-<info,...> dumps .debug_info and (if available) also .debug_info.dwo. This simplifies the command line interface (I've removed all dwo-specific dump options) and makes the tool friendlier to use. Differential Revision: https://reviews.llvm.org/D37771 llvm-svn: 313207	2017-09-13 22:09:01 +00:00
Matt Arsenault	fb017ae155	AMDGPU: Handle coldcc in more places Missed in r312936 llvm-svn: 313205	2017-09-13 21:55:52 +00:00
Reid Kleckner	89af112cf5	[codeview] VLAs and unsized arrays should use a size of zero Previously we used a size of '1' for VLAs because we weren't sure what MSVC did. However, MSVC does support declaring an array without a size, for which it emits an array type with a size of zero. Clang emits the same DI metadata for VLAs and arrays without bound, so we would describe arrays without bound as having one element. This lead to Microsoft debuggers only printing a single element. Emitting a size of zero appears to cause these debuggers to search the symbol information to find a definition of the variable with accurate array bounds. Fixes http://crbug.com/763580 llvm-svn: 313203	2017-09-13 21:54:20 +00:00
Eli Friedman	bde9fc76dd	[ARM] Add more CPUs to host detection This returns "cortex-a73" for second-generation Kryo; not precisely correct, but close enough. Differential Revision: https://reviews.llvm.org/D37724 llvm-svn: 313200	2017-09-13 21:48:00 +00:00
Eugene Zelenko	8002c504cd	[Transforms] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313198	2017-09-13 21:43:53 +00:00
Wei Mi	c0d066468e	[RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper. This is to fix PR34502. After rL311401, the live range of spilled vreg will be cleared. HoistSpill need to use the live range of the original vreg before splitting to know the moving range of the spills. The patch saves a copy of live interval for the spilled vreg inside of HoistSpillHelper. Differential Revision: https://reviews.llvm.org/D37578 llvm-svn: 313197	2017-09-13 21:41:30 +00:00
Dehao Chen	15c86ef970	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313195	2017-09-13 21:22:55 +00:00
Eugene Zelenko	618c555bbe	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313194	2017-09-13 21:15:20 +00:00
Adrian McCarthy	d91bf3998f	Mark static member functions as static in CodeViewDebug Summary: To improve CodeView quality for static member functions, we need to make the static explicit. In addition to a small change in LLVM's CodeViewDebug to return the appropriate MethodKind, this requires a small change in Clang to note the staticness in the debug info metadata. Subscribers: aprantl, hiraditya Differential Revision: https://reviews.llvm.org/D37715 llvm-svn: 313192	2017-09-13 20:53:55 +00:00
Easwaran Raman	4924bb002d	[Inliner] Add another way to compute full inline cost. Summary: Full inline cost is computed when -inline-cost-full is true or ORE is non-null. This patch adds another way to compute full inline cost by adding a field to InlineParams. This will be used by SampleProfileLoader to check legality of inlining a callee that it wants to inline. Reviewers: danielcdh, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37819 llvm-svn: 313185	2017-09-13 20:16:02 +00:00
Anna Thomas	19529f75b9	[LV] Avoid computing the register usage for default VF. NFC These are changes to reduce redundant computations when calculating a feasible vectorization factor: 1. early return when target has no vector registers 2. don't compute register usage for the default VF. Suggested during review for D37702. llvm-svn: 313176	2017-09-13 19:35:45 +00:00
Michael Zuckerman	80d3649f23	Refactoring the stride 4 code in the X86interleavedaccess NFC llvm-svn: 313166	2017-09-13 18:28:09 +00:00
Adrian Prantl	3dcd122151	llvm-dwarfdump: support dumping UUIDs of Mach-O binaries. This is a feature supported by Darwin dwarfdump. UUIDs are used to associate executables with their .dSYM bundles. llvm-svn: 313165	2017-09-13 18:22:59 +00:00
Hiroshi Yamauchi	a43913cfaf	Add options to dump PGO counts in text. Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 llvm-svn: 313159	2017-09-13 17:20:38 +00:00
Teresa Johnson	cbdc5ff628	[ThinLTO] AliasSummary should not have any references Summary: References should only be on the aliasee. Reviewers: pcc Subscribers: llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D37814 llvm-svn: 313158	2017-09-13 17:10:24 +00:00
Peter Collingbourne	d067c8ed59	ThinLTO: Correctly follow aliasee references when dead stripping. We were previously handling aliases during dead stripping by adding the aliased global's "original name" GUID to the worklist. This will lead to incorrect behaviour if the global has local linkage because the original name GUID will not correspond to the global's GUID in the summary. Because an alias is just another name for the global that it references, there is no need to mark the referenced global as used, or to follow references from any other copies of the global. So all we need to do is to follow references from the aliasee's summary instead of the alias. Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313157	2017-09-13 17:09:20 +00:00
Alexander Kornienko	208eecd57f	Convenience/safety fix for llvm::sys::Execute(And\|No)Wait Summary: Change the type of the Redirects parameter of llvm::sys::ExecuteAndWait, ExecuteNoWait and other APIs that wrap them from `const StringRef **` to `ArrayRef<Optional<StringRef>>`, which is safer and simplifies the use of these APIs (no more local StringRef variables just to get a pointer to). Corresponding clang changes will be posted as a separate patch. Reviewers: bkramer Reviewed By: bkramer Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D37563 llvm-svn: 313155	2017-09-13 17:03:37 +00:00
Teresa Johnson	1958083d35	[ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link Summary: SamplePGO indirect call profiles record the target as the original GUID for statics. The importer had special handling to map to the normal GUID in that case. The dead global analysis needs the same treatment or inconsistencies arise, resulting in linker unsats due to some dead symbols being exported and kept, leaving in references to other dead symbols that are removed. This can happen when a SamplePGO profile collected by one binary is used for a different binary, so the indirect call profiles may not accurately reflect live targets. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37783 llvm-svn: 313151	2017-09-13 15:16:38 +00:00
Petar Jovanovic	50e068158b	[mips] correct operand range for DINSM instruction This patch corrects the definition of the DINSM instruction. Specification for DINSM instruction for Mips64 says that size operand should be 2 <= size <= 64, but it is defined as uimm5_inssize_plus1 which gives range of 1 .. 32. Patch by Aleksandar Beserminji. Differential Revision: https://reviews.llvm.org/D37683 llvm-svn: 313149	2017-09-13 14:09:13 +00:00
Mikael Holmen	4eb2a96e7f	[MachineScheduler] Put SchedRegion in an anonymous namespace. Summary: It pollutes the global namespace otherwise. Patch by: Bevin Hansson Reviewers: jonpa Reviewed By: jonpa Subscribers: MatzeB, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37555 llvm-svn: 313148	2017-09-13 14:07:47 +00:00
Stefan Pintilie	dff606ec3e	[Power9] Add missing instructions: extswsli, popcntb Added the following P9 instructions: extswsli, extswsli., popcntb Differential Revision: https://reviews.llvm.org/D37342 llvm-svn: 313147	2017-09-13 14:05:27 +00:00
Jonas Devlieghere	81f5abe1ad	[MachO] Prevent heap overflow when load command extends past EOF This patch fixes a heap-buffer-overflow when a malformed Mach-O has a load command who's size extends past the end of the binary. Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3225 Differential revision: https://reviews.llvm.org/D37439 llvm-svn: 313145	2017-09-13 13:43:01 +00:00
Jonas Devlieghere	27476ce24b	[dwarfdump] Rename Brief to Verbose in DIDumpOptions This patches renames "brief" to "verbose" in de DIDumpOptions and inverts the logic to match the new behavior where brief is the default. Changing the default value uncovered some bugs related to the DIDumpOptions not being propagated and have been fixed as well. Differential revision: https://reviews.llvm.org/D37745 llvm-svn: 313139	2017-09-13 09:43:05 +00:00
Igor Breger	5c721199dd	[GlobalISel][X86] support G_FPEXT operation. Summary: Support G_FPEXT operation. Selection done via TableGen'erated code. Reviewers: zvi, guyblank, aymanmus, m_zuckerman Reviewed By: zvi Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34816 llvm-svn: 313135	2017-09-13 09:05:23 +00:00
Uriel Korach	5d5da5f531	[X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (llvm) This patch, together with a matching clang patch (https://reviews.llvm.org/D37694), implements the lowering of X86 ABS intrinsics to IR. differential revision: https://reviews.llvm.org/D37693. llvm-svn: 313134	2017-09-13 09:02:36 +00:00
Mohammed Agabaria	e9aebf26af	[X86] Adding X86 Processor Families Adding x86 Processor families to initialize several uArch properties (based on the family) This patch shows how gather cost can be initialized based on the proc. family Differential Revision: https://reviews.llvm.org/D35348 llvm-svn: 313132	2017-09-13 09:00:27 +00:00
Craig Topper	2b6bfda561	[X86] Make sure we emit a SUBREG_TO_REG after the MOV32ri when creating a BEXTR64rr instruction from a shift/and pair. Fixes PR34589. llvm-svn: 313126	2017-09-13 07:53:21 +00:00
Elena Demikhovsky	6cab129464	[X86 CodeGen] Optimization of ZeroExtendLoad for v2i8 vector Load with zero-extend and sign-extend from v2i8 to v2i32 is "Legal" since SSE4.1 and may be performed using PMOVZXBD , PMOVSXBD instructions. llvm-svn: 313121	2017-09-13 06:40:26 +00:00
Ayal Zaks	e2a8c0758f	[LV] Fix PR34523 - avoid generating redundant selects When converting a PHI into a series of 'select' instructions to combine the incoming values together according their edge masks, initialize the first value to the incoming value In0 of the first predecessor, instead of generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter fails when the Cond[0] mask is null, representing a full mask, which can happen only when there's a single incoming value. No functional changes intended nor expected other than surviving null Cond[0]'s. This fix follows D35725, which introduced using null to represent full masks. Differential Revision: https://reviews.llvm.org/D37619 llvm-svn: 313119	2017-09-13 06:28:37 +00:00
Aditya Kumar	dfa8741c96	[GVNHoist] Factor out reachability to search for anticipable instructions quickly Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates. This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way. Relevant test cases are added to show the functionality as well as regression fixes from PR32821. Regression from previous GVNHoist: We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN Differential Revision: https://reviews.llvm.org/D35918 Reviewers: dberlin, sebpop, gberry, llvm-svn: 313116	2017-09-13 05:28:03 +00:00
Craig Topper	0a3bcebcc2	[X86] Use isUInt<32> to simplify some code. NFC llvm-svn: 313112	2017-09-13 02:29:59 +00:00
Leslie Zhai	49277d1fea	[ARC] Prepare the implementation of relocation for LLD Reviewers: ruiu, kparzysz, petecoup, rafael Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37556 llvm-svn: 313109	2017-09-13 01:49:49 +00:00
Reid Kleckner	8a1cd91016	[InstCombine] Add a flag to disable LowerDbgDeclare Summary: This should improve optimized debug info for address-taken variables at the cost of inaccurate debug info in some situations. We patched this into clang and deployed this change to Chromium developers, and this significantly improved debuggability of optimized code. The long-term solution to PR34136 seems more and more like it's going to take a while, so I would like to commit this change under a flag so that it can be used as a stop-gap measure. This flag should really help so for C++ aggregates like std::string and std::vector, which are typically address-taken, even after inlining, and cannot be SROA-ed. Reviewers: aprantl, dblaikie, probinson, dberlin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36596 llvm-svn: 313108	2017-09-13 01:43:25 +00:00
Petr Hosek	c35fe2b70b	[Fuchsia] Magenta -> Zircon Fuchsia's lowest API layer has been renamed from Magenta to Zircon. In LLVM proper, this is only mentioned in comments. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D37763 llvm-svn: 313105	2017-09-13 01:18:06 +00:00
Derek Schuff	a519fe5a37	[WebAssembly] Add sign extend instructions from atomics proposal Select them from ISD::SIGN_EXTEND_INREG Differential Revision: https://reviews.llvm.org/D37603 remove spurious change llvm-svn: 313101	2017-09-13 00:29:06 +00:00
Sanjay Patel	659279450e	[x86] eliminate unnecessary vector compare for AVX masked store The masked store instruction only cares about the sign-bit of each mask element, so the compare s<0 isn't needed. As noted in PR11210: https://bugs.llvm.org/show_bug.cgi?id=11210 ...fixing this should allow us to eliminate x86-specific masked store intrinsics in IR. (Although more testing will be needed to confirm that.) I filed a bug to track improvements for AVX512: https://bugs.llvm.org/show_bug.cgi?id=34584 Differential Revision: https://reviews.llvm.org/D37446 llvm-svn: 313089	2017-09-12 23:24:05 +00:00
Dehao Chen	f3ed14d323	Refactor the code to pass down ACT to SampleProfileLoader correctly. Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37773 llvm-svn: 313080	2017-09-12 21:55:55 +00:00
Peter Collingbourne	876da0294a	Remove -generate-dwarf-pub-sections flag. This flag is unnecessary for testing because we can get the coverage we need by adjusting CU attributes. Differential Revision: https://reviews.llvm.org/D37725 llvm-svn: 313079	2017-09-12 21:50:55 +00:00
Peter Collingbourne	b52e23669c	IR: Represent -ggnu-pubnames with a flag on the DICompileUnit. This allows the flag to be persisted through to LTO. Differential Revision: https://reviews.llvm.org/D37655 llvm-svn: 313078	2017-09-12 21:50:41 +00:00
Petar Jovanovic	e4dacb750d	[mips] handle UImm16_AltRelaxed match type Currently, UImm16_AltRelaxed match type is not handled in MatchAndEmitInstruction() function, which may result in llvm_unreachable() behavior. This patch adds necessary case for this match type. Patch by Aleksandar Beserminji. Differential Revision: https://reviews.llvm.org/D37682 llvm-svn: 313077	2017-09-12 21:43:33 +00:00
Alina Sbirlea	80b806bf30	Make promoteLoopAccessesToScalars independent of AliasSet [NFC] Summary: The current promoteLoopAccessesToScalars method receives an AliasSet, but the information used is in fact a list of Value, known to must alias. Create the list ahead of time to make this method independent of the AliasSet class. While there is no functionality change, this adds overhead for creating a set of Value, when promotion would normally exit earlier. This is meant to be as a first refactoring step in order to start replacing AliasSetTracker with MemorySSA. And while the end goal is to redesign LICM, the first few steps will focus on adding MemorySSA as an alternative to the AliasSetTracker using most of the existing functionality. Reviewers: mkuper, danielcdh, dberlin Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35439 llvm-svn: 313075	2017-09-12 21:18:44 +00:00
Ahmed Bougacha	106dd035a8	[AArch64][GlobalISel] Select all fpexts. Tablegen already can select these: mark them as legal, remove the c++ code, and add tests for all types. llvm-svn: 313074	2017-09-12 21:04:11 +00:00
Ahmed Bougacha	a7aa2a9fb1	[AArch64][GlobalISel] Select all fptruncs. We already support these in tablegen, but we're matching the wrong operator (libm ftrunc). Fix that. While there, drop the c++ code, support COPYs of FPR16, and add tests for the other types. llvm-svn: 313073	2017-09-12 21:04:10 +00:00
Lei Huang	34e6621724	Update branch coalescing to be a PowerPC specific pass Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 313061	2017-09-12 18:39:11 +00:00
Sam Clegg	2176a9f2a3	[WebAssembly] Remove flags from MCSectionWasm Looks like these were copied from the ELF sections but don't apply to Wasm and were not used anywhere. Also remove unused Wasm methods in MCContext. Differential Revision: https://reviews.llvm.org/D37633 llvm-svn: 313058	2017-09-12 18:31:24 +00:00
Robert Lougher	51529eb0c2	Revert "[DWARF] Incorrect prologue end line record." This reverts commit r313047 as it is causing buildbot failure (lldb inline stepping tests). llvm-svn: 313057	2017-09-12 18:23:15 +00:00
Yonghong Song	06ff655e59	bpf: Add BPF AsmParser support in LLVM Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 313055	2017-09-12 17:55:23 +00:00
Craig Topper	958106d0f1	[X86] Move matching of (and (srl/sra, C), (1<<C) - 1) to BEXTR/BEXTRI instruction to custom isel Recognizing this pattern during DAG combine hides information about the 'and' and the shift from other combines. I think it should be recognized at isel so its as late as possible. But it can't be done with table based isel because you need to be able to look at both immediates. This patch moves it to custom isel in X86ISelDAGToDAG.cpp. This does break a couple tests in tbm_patterns because we are now emitting an and_flag node or (cmp and, 0) that we dont' recognize yet. We already had this problem for several other TBM patterns so I think this fine and we can address of them together. I've also fixed a bug where the combine to BEXTR was preventing us from using a trick of zero extending AH to handle extracts of bits 15:8. We might still want to use BEXTR if it enables load folding. But honestly I hope we narrowed the load instead before got to isel. I think we should probably also support matching BEXTR from (srl/srl (and mask << C), C). But that should be a different patch. Differential Revision: https://reviews.llvm.org/D37592 llvm-svn: 313054	2017-09-12 17:40:25 +00:00
Robert Lougher	f696a22d3c	[DWARF] Incorrect prologue end line record. A prologue-end line record is emitted with an incorrect associated address, which causes a debugger to show the beginning of function body to be inside the prologue. Patch written by Carlos Alberto Enciso. Differential Revision: https://reviews.llvm.org/D37625 llvm-svn: 313047	2017-09-12 16:35:25 +00:00
Anna Thomas	9f1be02fa3	[LV] Clamp the VF to the trip count Summary: When the MaxVectorSize > ConstantTripCount, we should just clamp the vectorization factor to be the ConstantTripCount. This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF. Earlier we were finding the maximum vector width, which could be greater than the trip count itself. The Loop vectorizer does all the work for generating a vectorizable loop, but in the end we would always choose the scalar loop (since the VF > trip count). This allows us to choose the VF keeping in mind the trip count if available. This is a fix on top of rL312472. Reviewers: Ayal, zvi, hfinkel, dneilson Reviewed by: Ayal Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37702 llvm-svn: 313046	2017-09-12 16:32:45 +00:00
Hans Wennborg	8c1eb106bd	Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY" This was causing PR34045 to fire again. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 Also revert follow-up r313010: > [ARM] Fix typo when creating ISD::SUB nodes > > In D35192, I accidentally introduced a typo when creating ISD::SUB nodes, > giving them two values instead of one. > > This fails when the merge_values combiner finds one of these nodes. > > This change fixes PR34564. > > Differential Revision: https://reviews.llvm.org/D37690 llvm-svn: 313044	2017-09-12 16:24:17 +00:00
Alexey Bataev	a26d3e834d	[SLP] Fix for PHINode during horizontal reduction scanning, NFC. Reduces number of loops during instructions analysis. llvm-svn: 313035	2017-09-12 15:13:50 +00:00
Jonas Paulsson	fc4f323ac1	[SystemZ] Add the CoveredBySubRegs bit to GPR64, GPR128 and FPR128 registers. This bit is needed in order for the CalleeSavedRegs list to automatically include the super registers if all of their subregs are present. Thanks to Wei Mi for initially indicating this deficiency in the SystemZ backend. Review: Ulrich Weigand. https://bugs.llvm.org/show_bug.cgi?id=34550 llvm-svn: 313023	2017-09-12 12:11:29 +00:00
Sjoerd Meijer	bafde8f3e3	[AArch64] ISel: Add some debug messages to LowerBUILDVECTOR. NFC. Differential Revision: https://reviews.llvm.org/D37676 llvm-svn: 313017	2017-09-12 10:24:12 +00:00
Yael Tsafrir	47668b5e03	[X86] Lower _mm[256\|512]_[mask[z]]_avg_epu[8\|16] intrinsics to native llvm IR Differential Revision: https://reviews.llvm.org/D37560 llvm-svn: 313013	2017-09-12 07:50:35 +00:00
Silviu Baranga	ac920f7716	[LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs Summary: LAA can only emit run-time alias checks for pointers with affine AddRec SCEV expressions. However, non-AddRecExprs can be now be converted to affine AddRecExprs using SCEV predicates. This change tries to add the minimal set of SCEV predicates in order to enable run-time alias checking. Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D17080 llvm-svn: 313012	2017-09-12 07:48:22 +00:00
Roger Ferrer Ibanez	9df2527b0b	[ARM] Fix typo when creating ISD::SUB nodes In D35192, I accidentally introduced a typo when creating ISD::SUB nodes, giving them two values instead of one. This fails when the merge_values combiner finds one of these nodes. This change fixes PR34564. Differential Revision: https://reviews.llvm.org/D37690 llvm-svn: 313010	2017-09-12 07:42:28 +00:00
Roger Ferrer Ibanez	4f92b4162f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515 and also is being recommitted as its first version caused PR34045. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313009	2017-09-12 07:40:09 +00:00
Craig Topper	fd6be2868e	[X86] Fix typo in comment. NFC llvm-svn: 312990	2017-09-12 01:30:09 +00:00
Hans Wennborg	075e5a2e2b	Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY" It caused PR34564. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 312980	2017-09-11 23:52:02 +00:00
Yonghong Song	be9c00347f	bpf: add " ll" in the LD_IMM64 asmstring This partially revert previous fix in commit f5858045aa0b ("bpf: proper print imm64 expression in inst printer"). In that commit, the original suffix "ll" is removed from LD_IMM64 asmstring. In the customer print method, the "ll" suffix is printed if the rhs is an immediate. For example, "r2 = 5ll" => "r2 = 5ll", and "r3 = varll" => "r3 = var". This has an issue though for assembler. Since assembler relies on asmstring to do pattern matching, it will not be able to distiguish between "mov r2, 5" and "ld_imm64 r2, 5" since both asmstring is "r2 = 5". In such cases, the assembler uses 64bit load for all "r = <val>" asm insts. This patch adds back " ll" suffix for ld_imm64 with one additional space for "#reg = #global_var" case. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 312978	2017-09-11 23:43:35 +00:00
Eugene Zelenko	32a4056438	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 312971	2017-09-11 23:00:48 +00:00
Adrian Prantl	7bc1b28291	llvm-dwarfdump: Replace -debug-dump=sect option with individual options. As discussed on llvm-dev in http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html this changes the command line interface of llvm-dwarfdump to match the one used by the dwarfdump utility shipping on macOS. In addition to being shorter to type this format also has the advantage of allowing more than one section to be specified at the same time. In a nutshell, with this change $ llvm-dwarfdump --debug-dump=info $ llvm-dwarfdump --debug-dump=apple-objc becomes $ dwarfdump --debug-info --apple-objc Differential Revision: https://reviews.llvm.org/D37714 llvm-svn: 312970	2017-09-11 22:59:45 +00:00
Peter Collingbourne	b9b6025328	LowerTypeTests: Add import/export support for targets without absolute symbol constants. The rationale is the same as for r312967. Differential Revision: https://reviews.llvm.org/D37408 llvm-svn: 312968	2017-09-11 22:49:10 +00:00
Peter Collingbourne	b15a35e604	WholeProgramDevirt: Add import/export support for targets without absolute symbol constants. Not all targets support the use of absolute symbols to export constants. In particular, ARM has a wide variety of constant encodings that cannot currently be relocated by linkers. So instead of exporting the constants using symbols, export them directly in the summary. The values of the constants are left as zeroes on targets that support symbolic exports. This may result in more cache misses when targeting those architectures as a result of arbitrary changes in constant values, but this seems somewhat unavoidable for now. Differential Revision: https://reviews.llvm.org/D37407 llvm-svn: 312967	2017-09-11 22:34:42 +00:00
Matt Arsenault	537bd3b906	AMDGPU: Allow coldcc calls llvm-svn: 312936	2017-09-11 18:54:20 +00:00
Petar Jovanovic	d4f3723c56	[mips][microMIPS] add lapc instruction Implement LAPC instruction for mips32r6, mips64r6 and micromips32r6. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D35984 llvm-svn: 312934	2017-09-11 18:34:04 +00:00
Hiroshi Yamauchi	9364432cec	Unmerge GEPs to reduce register pressure on IndirectBr edges. Summary: GEP merging can sometimes increase the number of live values and register pressure across control edges and cause performance problems particularly if the increased register pressure results in spills. This change implements GEP unmerging around an IndirectBr in certain cases to mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP merging has happened.) With this patch, the Python interpreter loop runs faster by ~5%. Reviewers: sanjoy, hfinkel Reviewed By: hfinkel Subscribers: eastig, junbuml, llvm-commits Differential Revision: https://reviews.llvm.org/D36772 llvm-svn: 312930	2017-09-11 17:52:08 +00:00
Stanislav Mekhanoshin	710da42b86	[AMDGPU] Produce madak and madmk from the two-address pass These two instructions are normally selected, but when the two address pass converts mac into mad we end up with the mad where we could have one of these. Differential Revision: https://reviews.llvm.org/D37389 llvm-svn: 312928	2017-09-11 17:13:57 +00:00

1 2 3 4 5 ...

106482 Commits