llvm-project

Commit Graph

Author	SHA1	Message	Date
Eugene Leviant	453c976a63	[ThinLTO] Implement summary visualizer Differential revision: https://reviews.llvm.org/D41297 llvm-svn: 323062	2018-01-21 07:27:32 +00:00
Lang Hames	5ff5a30bee	[ORC] Add a lookupFlags method to VSO. lookupFlags returns a SymbolFlagsMap for the requested symbols, along with a set containing the SymbolStringPtr for any symbol not found in the VSO. The JITSymbolFlags for each symbol will have been stripped of its transient JIT-state flags (i.e. NotMaterialized, Materializing). Calling lookupFlags does not trigger symbol materialization. llvm-svn: 323060	2018-01-21 03:20:39 +00:00
Lang Hames	86edf53664	[ORC] More cleanup. NFC. llvm-svn: 323059	2018-01-21 03:20:36 +00:00
Lang Hames	0d7d442699	[ORC] Cleanup. NFC. llvm-svn: 323057	2018-01-21 02:24:45 +00:00
Philip Reames	f57714c3c7	[DSE] Factor out common code [NFC] We already had the pointer being stored to in the MemLoc, reuse that code. In merging cases, it turned out the interface of the getLocForWrite had become inconsitent with other related utilities. Fix that by making sure the input passes hasAnalyzableWrite as well. llvm-svn: 323056	2018-01-21 02:10:54 +00:00
Philip Reames	424e7a1174	[DSE] Minor rename for clarity sake [NFC] llvm-svn: 323055	2018-01-21 01:44:33 +00:00
Craig Topper	7fddf2bfef	[X86] Add an override of targetShrinkDemandedConstant to limit the damage that shrinkdemandedbits can do to zext_in_reg operations Summary: This patch adds an implementation of targetShrinkDemandedConstant that tries to keep shrinkdemandedbits from removing bits that would otherwise have been recognized as a movzx. We still need a follow patch to stop moving ands across srl if the and could be represented as a movzx before the shift but not after. I think this should help with some of the cases that D42088 ended up removing during isel. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42265 llvm-svn: 323048	2018-01-20 18:50:09 +00:00
Simon Pilgrim	89540d9665	[X86][SSE] Check for out of bounds PEXTR/PINSR indices during faux shuffle combining. llvm-svn: 323045	2018-01-20 17:16:01 +00:00
Jonas Paulsson	7ad28863fb	[SelectionDAG] Fix codegen of vector stores with non byte-sized elements. This was completely broken, but hopefully fixed by this patch. In cases where it is needed, a vector with non byte-sized elements is stored by extracting, zero-extending, shift:ing and or:ing the elements into an integer of the same width as the vector, which is then stored. Review: Eli Friedman, Ulrich Weigand https://reviews.llvm.org/D42100#inline-369520 https://bugs.llvm.org/show_bug.cgi?id=35520 llvm-svn: 323042	2018-01-20 16:05:10 +00:00
Martin Storsjo	f641d0d4f2	[COFF] Keep the underscore on exported decorated stdcall functions in MSVC mode This (together with the corresponding LLD commit, that contains the testcase updates) fixes PR35733. Differential Revision: https://reviews.llvm.org/D41631 llvm-svn: 323035	2018-01-20 11:44:32 +00:00
Saleem Abdulrasool	99f479abcf	CodeGen: handle llvm.used properly for COFF `llvm.used` contains a list of pointers to named values which the compiler, assembler, and linker are required to treat as if there is a reference that they cannot see. Ensure that the symbols are preserved by adding an explicit `-include` reference to the linker command. llvm-svn: 323017	2018-01-20 00:28:02 +00:00
Craig Topper	08bd14803c	[X86] Teach X86 codegen to use vector width preference to avoid promoting to 512-bit types when VLX is enabled and the preference is for a smaller size. This change applies to places where we would turn 128/256-bit code into 512-bit in order to get a wider element type through sext/zext. Any 512-bit types that already existed in the IR/DAG will be left that way. The width preference has no effect on codegen behavior when the target does not have AVX512 enabled. So AVX/AVX2 codegen cannot be limited via this mechanism yet. If the preference is lower than 256 we may still use a 256 bit type to do the operation. Constraining to 128 bits makes it much more difficult to support some operations. For many of these cases we need to change element width while keeping element count constant which is easiest done by switching between 256 and 128 bit. The preference is only obeyed when AVX512 and VLX are available. This means the preference is not obeyed for KNL, but is obeyed for SKX, Cannonlake, and Icelake. For KNL, the only way to do masked operation is on 512-bit registers so we would have to completely disable masking to obey the preference. We would also lose support for gather, scatter, ctlz, vXi64 multiplies, etc. This may change in the future, but this simplifies the initial implementation. Differential Revision: https://reviews.llvm.org/D41895 llvm-svn: 323016	2018-01-20 00:26:12 +00:00
Craig Topper	0d797a34d8	[X86] Add support for passing 'prefer-vector-width' function attribute into X86Subtarget and exposing via X86's getRegisterWidth TTI interface. This will cause the vectorizers to do some limiting of the vector widths they create. This is not a strict limit. There are reasons I know of that the loop vectorizer will generate larger vectors for. I've written this in such a way that the interface will only return a properly supported width(0/128/256/512) even if the attribute says something funny like 384 or 10. This has been split from D41895 with the remainder in a follow up commit. llvm-svn: 323015	2018-01-20 00:26:08 +00:00
Derek Schuff	a83a665cd4	[WebAssembly] Fix MSVC build nullptr_t can't be used left of boolean && llvm-svn: 323012	2018-01-20 00:01:18 +00:00
Akira Hatanaka	73ceb50d85	[ObjCARC] Do not turn a call to @objc_autoreleaseReturnValue into a call to @objc_autorelease if its operand is a PHI and the PHI has an equivalent value that is used by a return instruction. For example, ARC optimizer shouldn't replace the call in the following example, as doing so breaks the AutoreleaseRV/RetainRV optimization: %v1 = bitcast i32* %v0 to i8* br label %bb3 bb2: %v3 = bitcast i32* %v2 to i8* br label %bb3 bb3: %p = phi i8* [ %v1, %bb1 ], [ %v3, %bb2 ] %retval = phi i32* [ %v0, %bb1 ], [ %v2, %bb2 ] ; equivalent to %p %v4 = tail call i8* @objc_autoreleaseReturnValue(i8* %p) ret i32* %retval Also, make sure ObjCARCContract replaces @objc_autoreleaseReturnValue's operand uses with its value so that the call gets tail-called. rdar://problem/15894705 llvm-svn: 323009	2018-01-19 23:51:13 +00:00
Rui Ueyama	20b49240b8	Fix -Wunused-variable. llvm-svn: 323004	2018-01-19 22:56:04 +00:00
Lang Hames	b72f48452c	[ORC] Re-apply r322913 with a fix for a read-after-free error. ExternalSymbolMap now stores the string key (rather than using a StringRef), as the object file backing the key may be removed at any time. llvm-svn: 323001	2018-01-19 22:24:13 +00:00
Jessica Paquette	a499c3c29d	Add optional DICompileUnit to DIBuilder + make outliner debug info use it Previously, the DIBuilder didn't expose functionality to set its compile unit in any other way than calling createCompileUnit. This meant that the outliner, which creates new functions, had to create a new compile unit for its debug info. This commit adds an optional parameter in the DIBuilder's constructor which lets you set its CU at construction. It also changes the MachineOutliner so that it keeps track of the DISubprograms for each outlined sequence. If debugging information is requested, then it uses one of the outlined sequence's DISubprograms to grab a CU. It then uses that CU to construct the DISubprogram for the new outlined function. The test has also been updated to reflect this change. See https://reviews.llvm.org/D42254 for more information. Also see the e-mail discussion on D42254 in llvm-commits for more context. llvm-svn: 322992	2018-01-19 21:21:49 +00:00
Ulrich Weigand	426f6bef44	[SystemZ] Prefer LOCHI over generating IPM sequences On current machines we have load-on-condition instructions that can be used to directly implement the SETCC semantics. If we have those, it is always preferable to use them instead of generating the IPM sequence. llvm-svn: 322989	2018-01-19 20:56:04 +00:00
Ulrich Weigand	31112895d9	[SystemZ] Directly use CC result of compare-and-swap In order to implement a test whether a compare-and-swap succeeded, the SystemZ back-end currently emits a rather inefficient sequence of first converting the CC result into an integer, and then testing that integer against zero. This commit changes the back-end to simply directly test the CC value set by the compare-and-swap instruction. llvm-svn: 322988	2018-01-19 20:54:18 +00:00
Ulrich Weigand	849a59fd4b	[SystemZ] Rework IPM sequence generation The SystemZ back-end uses a sequence of IPM followed by arithmetic operations to implement the SETCC primitive. This is currently done early during SelectionDAG. This patch moves generating those sequences to much later in SelectionDAG (during PreprocessISelDAG). This doesn't change much in generated code by itself, but it allows further enhancements that will be checked-in as follow-on commits. llvm-svn: 322987	2018-01-19 20:52:04 +00:00
Ulrich Weigand	9eb858c92f	[SystemZ] Implement computeKnownBitsForTargetNode This provides a computeKnownBits implementation for SystemZ target nodes. Currently only SystemZISD::SELECT_CCMASK is supported. llvm-svn: 322986	2018-01-19 20:49:05 +00:00
Ulrich Weigand	5605be9e50	[SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value The second return value of ATOMIC_CMP_SWAP_WITH_SUCCESS is known to be a boolean, and should therefore be treated by computeKnownBits just like the second return values of SMULO / UMULO. Differential Revision: https://reviews.llvm.org/D42067 llvm-svn: 322985	2018-01-19 20:47:14 +00:00
Sam Clegg	30e1bbc106	[WebAssembly] MC: Start table at offset 1 rather than 0 Summary: For consistency with the output of lld. This is useful in runnable binaries as can them be sure the null function pointer will never be a valid argument call_indirect. Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D42284 llvm-svn: 322978	2018-01-19 18:57:01 +00:00
Joel Galenson	dbc724f764	[ARM] Fix perf regression in compare optimization. Fix a performance regression caused by r322737. While trying to make it easier to replace compares with existing adds and subtracts, I accidentally stopped it from doing so in some cases. This should fix that. I'm also fixing another potential bug in that commit. Differential Revision: https://reviews.llvm.org/D42263 llvm-svn: 322972	2018-01-19 17:46:27 +00:00
Derek Schuff	bfb02aec5a	[WebAssembly] Fix libcall signature lookup RuntimeLibcallSignatures previously manually initialized all the libcall names into an array and searched it linearly for the first match to lookup the corresponding index. r322802 switched that to initializing a map keyed by the libcall name. Neither of these approaches works correctly because some libcall numbers use the same name on different platforms (e.g. the "l" suffixed functions use f80 or f128 or ppcf128). This change fixes that by ensuring that each name only goes into the map once. It also adds tests. Differential Revision: https://reviews.llvm.org/D42271 llvm-svn: 322971	2018-01-19 17:45:54 +00:00
Dan Gohman	5d2b9354b1	[WebAssembly] Make sign-extension opcodes a distinct feature. Sign-extension opcodes have been split into a separate proposal from the main threads proposal, so switch them to their own target feature. See: https://github.com/WebAssembly/sign-extension-ops llvm-svn: 322966	2018-01-19 17:16:24 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])\((.), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Petr Hosek	cc7a8f14bd	Fallback option for colorized output when terminfo isn't available Try to detect the terminal color support by checking the value of the TERM environment variable. This is not great, but it's better than nothing when terminfo library isn't available, which may still be the case on some Linux distributions. Differential Revision: https://reviews.llvm.org/D42055 llvm-svn: 322962	2018-01-19 17:10:55 +00:00
Carey Williams	22c49c6470	Test commit llvm-svn: 322958	2018-01-19 16:55:23 +00:00
Sanjay Patel	74a1eef7c4	[x86] shrink 'and' immediate values by setting the high bits (PR35907) Try to reverse the constant-shrinking that happens in SimplifyDemandedBits() for 'and' masks when it results in a smaller sign-extended immediate. We are also able to detect dead 'and' ops here (the mask is all ones). In that case, we replace and return without selecting the 'and'. Other targets might want to share some of this logic by enabling this under a target hook, but I didn't see diffs for simple cases with PowerPC or AArch64, so they may already have some specialized logic for this kind of thing or have different needs. This should solve PR35907: https://bugs.llvm.org/show_bug.cgi?id=35907 Differential Revision: https://reviews.llvm.org/D42088 llvm-svn: 322957	2018-01-19 16:37:25 +00:00
Sanjay Patel	33cb84571f	[InstSimplify] use m_Specific and commutative matcher to reduce code; NFCI llvm-svn: 322955	2018-01-19 16:12:55 +00:00
Nirav Dave	72d32f24f5	[X86] Extend load-op-store fusion merge to ADC/SBB. Summary: Add handling of EFLAG input to X86 Load-op-store fusion checking. Reviewers: craig.topper, RKSimon Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42128 llvm-svn: 322952	2018-01-19 15:37:57 +00:00
Sander de Smalen	909cf956a1	[AArch64][SVE] Asm: Add support for RDVL/ADDVL/ADDPL instructions Reviewers: fhahn, rengolin, t.p.northover, echristo, olista01, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41900 llvm-svn: 322951	2018-01-19 15:22:00 +00:00
Alexey Bataev	fa80c47c6a	[SLP] Fix vectorization for tree with trunc to minimum required bit width. Summary: If the vectorized tree has truncate to minimum required bit width and the vector type of the cast operation after the truncation is the same as the vector type of the cast operands, count cost of the vector cast operation as 0, because this cast will be later removed. Also, if the vectorization tree root operations are integer cast operations, do not consider them as candidates for truncation. It will just create extra number of the same vector/scalar operations, which will be removed by instcombiner. Reviewers: RKSimon, spatel, mkuper, hfinkel, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41948 llvm-svn: 322946	2018-01-19 14:40:13 +00:00
Dmitry Preobrazhensky	0e074e349d	[AMDGPU][MC] Corrected parsing of image modifiers and encoding of image atomics See bugs 35962: https://bugs.llvm.org/show_bug.cgi?id=35962 35963: https://bugs.llvm.org/show_bug.cgi?id=35963 Differential Revision: https://reviews.llvm.org/D42184 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322942	2018-01-19 13:49:53 +00:00
Francis Visoiu Mistrih	548add992e	[CodeGen] Unify printing format of debug-location in both MIR and -debug Use "debug-location" instead of "; dbg:" in MI::print. llvm-svn: 322936	2018-01-19 11:44:42 +00:00
Hiroshi Inoue	d24ddcd6c4	[NFC] fix trivial typos in comments "the the" -> "the" llvm-svn: 322934	2018-01-19 10:55:29 +00:00
Alina Sbirlea	df26cf8117	[ModRefInfo] Return NoModRef for Must and NoModRef. Summary: In ModRefInfo "Must" was introduced to track presence of MustAlias, but we still want to return NoModRef when there is neither Mod or Ref, even when MustAlias is found. Patch has small fixes to ensure this happens. Minor cleanup to remove nesting for 2 if statements when calling getModRefInfo for 2 ImmutableCallSites. Reviewers: sanjoy Subscribers: jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D42209 llvm-svn: 322932	2018-01-19 10:26:40 +00:00
John Brawn	2867bd72c0	[InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr Three (or more) operand getelementptrs could plausibly also be handled, but handling only two-operand fits in easily with the existing BinaryOperator handling. Differential Revision: https://reviews.llvm.org/D39958 llvm-svn: 322930	2018-01-19 10:05:15 +00:00
Matthias Braun	4a7c8e7aa2	Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened. Note that this renames: - MachineLICMID -> EarlyMachineLICM - PostRAMachineLICMID -> MachineLICMID to be consistent with the EarlyTailDuplicate/TailDuplicate naming. llvm-svn: 322927	2018-01-19 06:46:10 +00:00
Matthias Braun	3ab9fcb98e	Split TailDuplicatePass into pre- and post-RA variant; NFC Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state. llvm-svn: 322926	2018-01-19 06:08:17 +00:00
Craig Topper	f4cd9083ac	[X86] Make better use of instregex for cmovcc/setcc/jcc instructions in the Intel scheduler models. Combine all the separate condition codes into a singular expression when possible. llvm-svn: 322924	2018-01-19 05:47:32 +00:00
Serguei Katkov	22bb1c0e17	Revert [CGP] Re-enable Select in complex addressing mode One of buildbots failed. Revert for now till fix the issue. llvm-svn: 322923	2018-01-19 04:52:39 +00:00
Matthias Braun	5c290dc206	AArch64: Fix emergency spillslot being out of reach for large callframes Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919	2018-01-19 03:16:36 +00:00
Matthias Braun	dc4b3e87f4	AArch64: Omit callframe setup/destroy when not necessary Do not create CALLSEQ_START/CALLSEQ_END when there is no callframe to setup and the callframe size is 0. - Fixes an invalid callframe nesting for byval arguments, which would look like this before this patch (as in `big-byval.ll`): ... ADJCALLSTACKDOWN 32768, 0, ... # Setup for extfunc ... ADJCALLSTACKDOWN 0, 0, ... # setup for memcpy ... BL &memcpy ... ADJCALLSTACKUP 0, 0, ... # destroy for memcpy ... BL &extfunc ADJCALLSTACKUP 32768, 0, ... # destroy for extfunc - Saves us two instructions in the common case of zero-sized stackframes. - Remove an unnecessary scheduling barrier (hence the small unittest changes). Differential Revision: https://reviews.llvm.org/D42006 llvm-svn: 322917	2018-01-19 02:45:38 +00:00
Sam Clegg	b6c5bc27c4	[WebAssembly] Add test expectations for gcc C++ tests (gcc/testsuite/g++.dg) Differential Revision: https://reviews.llvm.org/D42226 llvm-svn: 322915	2018-01-19 01:40:52 +00:00
Lang Hames	44efd042a2	[ORC] Revert r322913 while I investigate an ASan failure. llvm-svn: 322914	2018-01-19 01:40:26 +00:00
Lang Hames	817df9fa0c	[ORC] Redesign the JITSymbolResolver interface to support bulk queries. Bulk queries reduce IPC/RPC overhead for cross-process JITing and expose opportunities for parallel compilation. The two new query methods are lookupFlags, which finds the flags for each of a set of symbols; and lookup, which finds the address and flags for each of a set of symbols. (See doxygen comments for more details.) The existing JITSymbolResolver class is renamed LegacyJITSymbolResolver, and modified to extend the new JITSymbolResolver class using the following scheme: - lookupFlags is implemented by calling findSymbolInLogicalDylib for each of the symbols, then returning the result of calling getFlags() on each of these symbols. (Importantly: lookupFlags does NOT call getAddress on the returned symbols, so lookupFlags will never trigger materialization, and lookupFlags will never call findSymbol, so only symbols that are part of the logical dylib will return results.) - lookup is implemented by calling findSymbolInLogicalDylib for each symbol and falling back to findSymbol if findSymbolInLogicalDylib returns a null result. Assuming a symbol is found its getAddress method is called to materialize it and the result (if getAddress succeeds) is stored in the result map, or the error (if getAddress fails) is returned immediately from lookup. If any symbol is not found then lookup returns immediately with an error. This change will break any out-of-tree derivatives of JITSymbolResolver. This can be fixed by updating those classes to derive from LegacyJITSymbolResolver instead. llvm-svn: 322913	2018-01-19 01:12:40 +00:00
Craig Topper	84b26b90d1	[X86] Add intrinsic support for the RDPID instruction This adds a new instrinsic to support the rdpid instruction. The implementation is a bit weird because the intrinsic is defined as always returning 32-bits, but the assembler support thinks the instruction produces a 64-bit register in 64-bit mode. But really it zeros the upper 32 bits. So I had to add separate patterns where 64-bit mode uses an extract_subreg. Differential Revision: https://reviews.llvm.org/D42205 llvm-svn: 322910	2018-01-18 23:52:31 +00:00
Changpeng Fang	ba6240cc71	AMDGPU/SI: Fix typos in d16 support patch the buffer intrinsics. llvm-svn: 322906	2018-01-18 22:57:57 +00:00
Reid Kleckner	7897a789e7	[CodeView] Add line numbers for inlined call sites We did this for inline call site line tables, but we hadn't done it for regular function line tables yet. This patch copies that logic from encodeInlineLineTable. llvm-svn: 322905	2018-01-18 22:55:43 +00:00
Reid Kleckner	b525872288	[CodeView] Sink complex inline functions to .cpp file, NFC I'm cleaning up this code before I attempt to fix a line table bug. llvm-svn: 322904	2018-01-18 22:55:14 +00:00
Changpeng Fang	4737e892de	AMDGPU/SI: Add d16 support for image intrinsics. Summary: This patch implements d16 support for image load, image store and image sample intrinsics. Reviewers: Matt, Brian. Differential Revision: https://reviews.llvm.org/D3991 llvm-svn: 322903	2018-01-18 22:08:53 +00:00
Eric Christopher	668e6b4b05	Typo fix SIBABRT -> SIGABRT. Based on a patch by Henry Wong! llvm-svn: 322902	2018-01-18 21:45:51 +00:00
Peter Collingbourne	719c1f7405	Support: Add missing #include. This #include is necessary to provide the definitions of _fpclass and _FPCLASS_NZ when building with libc++. llvm-svn: 322885	2018-01-18 20:49:33 +00:00
Paul Robinson	8181d23b3d	[DWARFv5] Number the line-table's directory array correctly. The compilation directory has always been #0, but as of DWARF v5 it is explicitly listed in the line-table section instead of implicitly being a reference to the compile_unit DIE's DW_AT_comp_dir attribute. This means the dumper should number the dumped array starting with 0 or 1 depending on the DWARF version of the line table. References in the generated DWARF are correct, it's just the dumper that was wrong. Also some assembler-coded tests were similarly confused about directory numbers. llvm-svn: 322884	2018-01-18 20:33:35 +00:00
Amara Emerson	d5785775f8	[AArch64][GlobalISel] Add isel support for global values in the large code model. Fixes PR35958. Differential Revision: https://reviews.llvm.org/D42175 llvm-svn: 322878	2018-01-18 19:21:27 +00:00
Ana Pazos	1b57c7a0f4	[RISCV] Fixed setting predicates for compressed instructions. Summary: Fixed setting predicates for compressed instructions. Some instructions were being generated with C extension enabled only, without proper checks for the other required extensions like F, D and 32 and 64-bit target checks. Affected instructions: C_FLD, C_FLW, C_LD, C_FSD, C_FSW, C_SD, C_JAL, C_ADDIW, C_SUBW, C_ADDW, C_FLDSP, C_FLWSP, C_LDSP, C_FSDSP, C_FSWSP, C_SDSP Reviewers: asb, shiva0217 Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, llvm-commits Differential Revision: https://reviews.llvm.org/D42132 llvm-svn: 322876	2018-01-18 18:54:05 +00:00
Zachary Turner	1bc2ce6b9b	Speed up iteration of CodeView record streams. There's some abstraction overhead in the underlying mechanisms that were being used, and it was leading to an abundance of small but not-free copies being made. This showed up on a profile. Eliminating this and going back to a low-level byte-based implementation speeds up lld with /DEBUG between 10 and 15%. Differential Revision: https://reviews.llvm.org/D42148 llvm-svn: 322871	2018-01-18 18:35:01 +00:00
Francis Visoiu Mistrih	eb3f76fc83	[CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::print Committed r322867 too soon. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322868	2018-01-18 18:05:15 +00:00
Francis Visoiu Mistrih	378b5f3de6	[CodeGen] Print RegClasses on MI in verbose mode r322086 removed the trailing information describing reg classes for each register. This patch adds printing reg classes next to every register when individual operands/instructions/basic blocks are printed. In the case of dumping MIR or printing a full function, by default don't print it. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322867	2018-01-18 17:59:06 +00:00
Sanjay Patel	0e690b05d1	[TargetLowering] add punctuation for readability; NFC llvm-svn: 322855	2018-01-18 15:25:32 +00:00
Francis Visoiu Mistrih	586444e42b	[CodeGen][NFC] Refactor MachineInstr::print * Handle more cases where the MI is not attached yet * Add similar asserts like in MIRPrinter::print llvm-svn: 322848	2018-01-18 14:52:14 +00:00
Benjamin Kramer	bfc1d976ca	[HWAsan] Fix uninitialized variable. Found by msan. llvm-svn: 322847	2018-01-18 14:19:04 +00:00
Alex Bradbury	921383828e	[RISCV] Codegen support for the standard RV32M instruction set extension llvm-svn: 322843	2018-01-18 12:36:38 +00:00
Alex Bradbury	7d6aa1f7ae	[RISCV] Implement frame pointer elimination llvm-svn: 322839	2018-01-18 11:34:02 +00:00
Sam Parker	1f8c035d6b	[SelectionDAG] Convert assert to condtion Follow-up to r322120 which can cause assertions for AArch64 because v1f64 and v1i64 are legal types. Differential Revision: https://reviews.llvm.org/D42097 llvm-svn: 322823	2018-01-18 09:22:24 +00:00
Craig Topper	83b0a98902	[X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for consistency with loads. Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything just like do for loads. llvm-svn: 322820	2018-01-18 07:44:09 +00:00
Craig Topper	21c8a8fa49	[X86] Remove isel patterns for using unmasked vmovdqa32/vmovdqu32 for integer vector loads. These patterns were just looking for a vXi64 bitcasted to vXi32, but there is no advantage to using vmovdqa32 over vmovdqa64. llvm-svn: 322819	2018-01-18 07:44:06 +00:00
Rafael Espindola	8a1ca67461	Don't drop dso_local in LTO. LTO sets dso_local as an optimization, so don't clear it. This avoid clearing it from undefined hidden symbols, which would then fail the verifier. llvm-svn: 322814	2018-01-18 05:38:43 +00:00
Craig Topper	7f0d85ec1e	[DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector For example, a build_vector of i64 bitcasted from v2i32 can be turned into a concat_vectors of the v2i32 vectors with a bitcast to a vXi64 type Differential Revision: https://reviews.llvm.org/D42090 llvm-svn: 322811	2018-01-18 04:17:06 +00:00
Rafael Espindola	9fbc040599	Make GlobalValues with non-default visibilility dso_local. This is similar to r322317, but for visibility. It is not as neat because we have to special case extern_weak. The idea is the same as the previous change, make the transition to explicit dso_local easier for the frontends. With this they only have to add dso_local to symbols where we need some external information to decide if it is dso_local (like it being part of an ELF executable). llvm-svn: 322806	2018-01-18 02:08:23 +00:00
Justin Bogner	a9346e050f	GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel Right now, it is not possible to run MachineCSE in the middle of the GlobalISel pipeline. Being able to run generic optimizations between the core passes of GlobalISel was one of the goals of the new ISel framework. This is the first attempt to do it. The problem is that MachineCSE pass assumes all register operands have a register class, which, in GlobalISel context, won't be true until after the InstructionSelect pass. The reason for this behaviour is that before replacing one virtual register with another, MachineCSE pass (and most of the other optimization machine passes) must check if the virtual registers' constraints have a (sufficiently large) intersection, and constrain the resulting register appropriately if such intersection exists. GlobalISel extends the representation of such constraints from just a register class to a triple (low-level type, register bank, register class). This commit adds MachineRegisterInfo::constrainRegAttrs method that extends MachineRegisterInfo::constrainRegClass to such a triple. The idea is that going forward we should use: - RegisterBankInfo::constrainGenericRegister within GlobalISel's InstructionSelect pass - MachineRegisterInfo::constrainRegClass within SelectionDAG ISel - MachineRegisterInfo::constrainRegAttrs everywhere else regardless the target and instruction selector it uses. Patch by Roman Tereshin. Thanks! llvm-svn: 322805	2018-01-18 02:06:56 +00:00
Derek Schuff	53b3855b2b	[WebAssembly] Remove duplicated RTLIB names Remove the tight coupling between llvm/CodeGenRuntimeLibcalls.def and the table of supported singatures for wasm. This will allow adding new libcalls without changing wasm's signature table. Also, some cleanup: Use ManagedStatics instead of const tables to avoid memory/binary bloat. Use a StringMap instead of a linear search for name lookup. Differential Revision: https://reviews.llvm.org/D35592 llvm-svn: 322802	2018-01-18 01:15:45 +00:00
Volkan Keles	4aa73a649a	Fix the failure caused by r322773 Do not run GlobalISel if `-fast-isel=0 -global-isel=false`. llvm-svn: 322800	2018-01-18 01:10:30 +00:00
Jessica Paquette	729e68693f	[MachineOutliner] Add DISubprograms to outlined functions. Before, it wasn't possible to get backtraces inside outlined functions. This commit adds DISubprograms to the IR functions created by the outliner which makes this possible. Also attached a test that ensures that the produced debug information is correct. This is useful to users that want to debug outlined code. llvm-svn: 322789	2018-01-18 00:00:58 +00:00
Reid Kleckner	1aa9061c5f	[CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64 Every known PE COFF target emits /EXPORT: linker flags into a .drective section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788	2018-01-17 23:55:23 +00:00
Evgeniy Stepanov	5bd669dc8f	[hwasan] LLVM-level flags for linux kernel-compatible hwasan instrumentation. Summary: -hwasan-mapping-offset defines the non-zero shadow base address. -hwasan-kernel disables calls to __hwasan_init in module constructors. Unlike ASan, -hwasan-kernel does not force callback instrumentation. This is controlled separately with -hwasan-instrument-with-calls. Reviewers: kcc Subscribers: srhines, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42141 llvm-svn: 322785	2018-01-17 23:24:38 +00:00
Volkan Keles	a79b0620a0	Add a TargetOption to enable/disable GlobalISel Summary: This patch adds a new target option in order to control GlobalISel. This will allow the users to enable/disable GlobalISel prior to the backend by calling `TargetMachine::setGlobalISel(bool Enable)`. No test case as there is already a test to check GlobalISel command line options. See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll. Reviewers: qcolombet, aemerson, ab, dsanders Reviewed By: qcolombet Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42137 llvm-svn: 322773	2018-01-17 22:34:21 +00:00
Benjamin Kramer	8b1986b5cb	Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versa compiler_rt doesn't provide them (yet), but libgcc does. PR34076. llvm-svn: 322772	2018-01-17 22:29:16 +00:00
Easwaran Raman	e5b8de2f1f	Add a ProfileCount class to represent entry counts. Summary: The class wraps a uint64_t and an enum to represent the type of profile count (real and synthetic) with some helper methods. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41883 llvm-svn: 322771	2018-01-17 22:24:23 +00:00
Eli Friedman	c60a23a6af	[LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization. The code wasn't zero-extending correctly, so the comparison could spuriously fail. Adds some AArch64 tests to cover this case. Inspired by D41791. Differential Revision: https://reviews.llvm.org/D41798 llvm-svn: 322767	2018-01-17 22:04:36 +00:00
Javed Absar	1e28194a40	[SCEV] Fix typo. NFC. Fix confusing typo in comment. llvm-svn: 322765	2018-01-17 21:58:35 +00:00
Zaara Syeda	c9dc7b451b	Revert [PowerPC] This reverts commit rL322721 Failing build bots. Revert the commit now. llvm-svn: 322748	2018-01-17 20:00:15 +00:00
Philip Reames	f5ff5d584e	[MDA] Use common code instead of reimplementing same. [NFC] llvm-svn: 322747	2018-01-17 19:57:19 +00:00
Aditya Nandakumar	18b3f9d384	[GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC https://reviews.llvm.org/D42149 llvm-svn: 322743	2018-01-17 19:31:33 +00:00
Sam Clegg	9f3fe42e19	[WebAssembly] Remove debug names from symbol table Get rid of DEBUG_FUNCTION_NAME symbols. When we actually debug data, maybe we'll want somewhere to put it... but having a symbol that just stores the name of another symbol seems odd. It means you have multiple Symbols with the same name, one containing the actual function and another containing the name! Store the names in a vector on the WasmObjectFile when reading them in. Also stash them on the WasmFunctions themselves. The names are //not// "symbol names" or aliases or anything, they're just the name that a debugger should show against the function body itself. NB. The WasmObjectFile stores them so that they can be exported in the YAML losslessly, and hence the tests can be precise. Enforce that the CODE section has been read in before reading the "names" section. Requires minor adjustment to some tests. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42075 llvm-svn: 322741	2018-01-17 19:28:43 +00:00
Rafael Espindola	d700869235	Use a got to access a hidden weak undefined on MachO. Trying to link __attribute__((weak, visibility("hidden"))) extern int foo; int *main(void) { return &foo; } on OS X fails with ld: 32-bit RIP relative reference out of range (-4294971318 max is +/-2GB): from _main (0x100000FAB) to _foo@0x00001000 (0x00000000) in '_main' from test.o for architecture x86_64 The problem being that 0 cannot be computed as a fixed difference from %rip. Exactly the same issue exists on ELF and we can use the same solution. llvm-svn: 322739	2018-01-17 19:19:55 +00:00
Joel Galenson	bbcaf4ac5c	[ARM] Optimize {s,u}mul.with.overflow. This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738	2018-01-17 19:19:05 +00:00
Joel Galenson	fe7fa40869	[ARM] Optimize {s,u}{add,sub}.with.overflow. The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with a "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other). Differential revision: https://reviews.llvm.org/D38378 llvm-svn: 322737	2018-01-17 19:19:05 +00:00
Daniel Neilson	88dddb8948	[Attributes] Fix crash when attempting to remove alignment from an attribute list/set Summary: Discovered while working on a patch to move alignment in @llvm.memcpy/move/set from an arg into parameter attributes. The current implementations of AttributeSet::removeAttribute() and AttributeList::removeAttribute crash when attempting to remove the alignment attribute. Currently, these implementations add the to-be-removed attributes to an AttrBuilder and then remove the builder from the list/set. Alignment is special in that it must be added to a builder with an integer value for the alignment; attempts to add alignment to a builder without a value is an error. This change fixes the removeAttribute implementations for AttributeSet and AttributeList to make them able to remove the alignment, and other similar, attributes. Reviewers: rnk, chandlerc, pete, javed.absar, reames Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41951 llvm-svn: 322735	2018-01-17 19:15:21 +00:00
Simon Pilgrim	8c87a2e7bd	[X86][BTVER2] Reduce instregex usage (PR35955) Most are just replaced with instrs lists, but a few regexps have been further generalized to match more instructions with a single pattern. llvm-svn: 322734	2018-01-17 19:12:48 +00:00
Craig Topper	b70ca5060f	[X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast If we are splatting pairs of 32-bit elements, we can use a 64-bit broadcast to get the job done. We could probably could probably do this with other sizes too, for example four 16-bit elements. Or we could broadcast pairs of 16-bit elements using a 32-bit element broadcast. But I've left that as a future improvement. I've also restricted this to AVX2 only because we can only broadcast loads under AVX. Differential Revision: https://reviews.llvm.org/D42086 llvm-svn: 322730	2018-01-17 18:58:22 +00:00
Craig Topper	279ace187a	[X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode We legalize selects of masks with scalar conditions using a bitcast to an integer type. But if we are in 32-bit mode we can't convert v64i1 to i64. So instead split the v64i1 to v32i1 and concat it back together. Each half will then be legalized by bitcasting to i32 which is fine. The test case is a little indirect. If we have the v64i1 select in IR it will get legalized by legalize vector ops which has a run of type legalization after it. That type legalization run is able to fix this i64 bitcast. So in order to avoid that we need a build_vector of a splat which legalize vector ops will ignore. Legalize DAG will then turn that into a select via LowerBUILD_VECTORvXi1. And the select will get legalized. In this case there is no type legalizer run to cleanup the bitcast. This fixes pr35972. llvm-svn: 322724	2018-01-17 18:46:01 +00:00
Zaara Syeda	8e951fd2f6	[PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721	2018-01-17 18:22:55 +00:00
Tatyana Krasnukha	8979eea04e	[ARC] Add missing condition codes. Summary: Added VS and VC, required for disassembling. Reviewers: petecoup Reviewed By: petecoup Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42172 llvm-svn: 322718	2018-01-17 17:58:28 +00:00
Jonas Paulsson	ef785694f2	[SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp. BRCTH is capable of a long branch which needs to be recognized during branch relaxation. This is done by checking for ExtraRelaxSize == 0. Review: Ulrich Weigand llvm-svn: 322688	2018-01-17 17:16:07 +00:00
Matt Arsenault	1491ca8911	AMDGPU: Error in SIAnnotateControlFlow instead of assert This assert typically happens if an unstructured CFG is passed to the pass. This can happen if the pass is run independently without the structurizer. llvm-svn: 322685	2018-01-17 16:30:01 +00:00
Diana Picus	01bcfd2112	[ARM GlobalISel] Rename local variable. NFC llvm-svn: 322667	2018-01-17 15:25:37 +00:00
Pablo Barrio	f2c29571da	[AArch64] Fix incorrect LD1 of 16-bit FP vectors in big endian Summary: Loading a vector of 4 half-precision FP sometimes results in an LD1 of 2 single-precision FP + a reversal. This results in an incorrect byte swap due to the conversion from little endian to big endian. In order to generate the correct byte swap, it is easier to generate the correct LD1 of 4 half-precision FP, thus avoiding the subsequent reversal. Reviewers: craig.topper, jmolloy, olista01 Reviewed By: olista01 Subscribers: efriedma, samparker, SjoerdMeijer, rogfer01, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41863 llvm-svn: 322663	2018-01-17 14:39:29 +00:00
Sanjay Patel	aa766efd09	[InstCombine] fix demanded-bits propagation for zext/trunc I was comparing the demanded-bits implementations between InstCombine and TargetLowering as part of investigating questions in D42088 and noticed that this was wrong in IR. We were losing all of the prior known bits when we got back to the 'zext'. llvm-svn: 322662	2018-01-17 14:39:28 +00:00
Alex Bradbury	d93f889d89	[RISCV] Allow RISCVAsmBackend::writeNopData to generate c.nop when supported When the compressed instruction set is enabled, the 16-bit c.nop can be generated if necessary. Differential Revision: https://reviews.llvm.org/D41221 Patch by Shiva Chen. llvm-svn: 322658	2018-01-17 14:17:12 +00:00
Diana Picus	c62a16234b	[ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR llvm-svn: 322657	2018-01-17 14:14:14 +00:00
Daniil Fukalov	d5fca554e2	[AMDGPU] add LDS f32 intrinsics added llvm.amdgcn.atomic.{add\|min\|max}.f32 intrinsics to allow generate ds_{add\|min\|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 llvm-svn: 322656	2018-01-17 14:05:05 +00:00
Dmitry Preobrazhensky	6b65f7c380	[AMDGPU][MC][GFX9] Enable inline constants for SDWA operands See bug 35771: https://bugs.llvm.org/show_bug.cgi?id=35771 Differential Revision: https://reviews.llvm.org/D42058 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322655	2018-01-17 14:00:48 +00:00
Diana Picus	65ed364fac	[ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651	2018-01-17 13:34:10 +00:00
Ivan A. Kosarev	4d0ff0c74d	[Transforms] Support making mutable versions of new-format TBAA access tags Differential Revision: https://reviews.llvm.org/D41565 llvm-svn: 322650	2018-01-17 13:29:54 +00:00
Benjamin Kramer	8d073a2c2d	[X86] Don't mutate shuffle arguments after early-out for AVX512 The match* functions have the annoying behavior of modifying its inputs. Save and restore the inputs, just in case the early out for AVX512 is hit. This is still not great and its only a matter of time this kind of bug happens again, but I couldn't come up with a better pattern without rewriting significant chunks of this code. Fixes PR35977. llvm-svn: 322644	2018-01-17 13:01:06 +00:00
Benjamin Kramer	05dc3527de	[X86] Constify DebugLoc parameters. No functionality change. llvm-svn: 322643	2018-01-17 13:00:58 +00:00
Hiroshi Inoue	8f976ba0bf	[NFC] fix trivial typos in comments "the the" -> "the" llvm-svn: 322636	2018-01-17 12:29:38 +00:00
Pavel Labath	67530e478b	Don't emit apple accelerator tables on non-darwin targets Summary: Currently -glldb turns on emission of apple tables on all targets, but lldb is only really capable of consuming them on darwin. Furthermore, making lldb consume these tables is not straight-forward because of the differences in how the debug info is distributed on darwin vs. elf targets. The darwin debug model assumes that the debug info (along with accelerator tables) will either remain in the .o files or it will be linked into a dsym bundle by a linker that knows how to merge these tables. In the elf world, all present linkers will simply concatenate these accelerator tables into the shared object. Since the tables are not self-terminating, this renders the tables unusable, as the debugger cannot pry the individual tables apart anymore. It might theoretically be possible to make the tables work with split dwarf, as that is somewhat similar to the apple .o model, but unfortunately right now the combination of -glldb and -gsplit-dwarf produces broken object files. Until these issues are resolved there is no point in emitting the apple tables for these targets. At best, it wastes space; at worst, it breaks compilation and prevents the user from getting other benefits of -glldb. Reviewers: probinson, aprantl, dblaikie Subscribers: emaste, dim, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D41986 llvm-svn: 322633	2018-01-17 11:52:13 +00:00
Javed Absar	0b05f327d6	[SCEV] fix typo llvm-svn: 322629	2018-01-17 11:03:06 +00:00
George Rimar	2421b6fa5c	[ThinLTO] - Remove code duplication. NFC. Refactors 3 copies of isExpected. Splitted from D42107. llvm-svn: 322627	2018-01-17 10:33:05 +00:00
Andrew V. Tischenko	f7706994a6	Allow usage of X86-prefixes as separate instrs. Differential Revision: https://reviews.llvm.org/D42102 llvm-svn: 322623	2018-01-17 10:12:06 +00:00
Sean Eveson	2ae6037dd1	[MC] Fix -stack-size-section on ARM Change symbol values in the stack_size section from being 8 bytes, to being a target dependent size. Differential Revision: https://reviews.llvm.org/D42108 llvm-svn: 322619	2018-01-17 09:01:29 +00:00
Craig Topper	77ba1e7c08	[X86] In LowerBUILD_VECTOR, rename ExtVT to EltVT so it makes sense. llvm-svn: 322616	2018-01-17 03:58:21 +00:00
Craig Topper	de1d28e053	[X86] Remove duplicate lines from scheduler models. NFC llvm-svn: 322615	2018-01-17 03:50:21 +00:00
George Burgess IV	41e646d8ea	[Support] Return an enum instead of an unsigned; NFC. We seem to be (logically) returning ArchExtKinds here in all cases, so the return type should reflect that. The static_cast is necessary because `A.ID` is actually an `unsigned`, presumably since we use `decltype(A)` to represent extended attributes for both ARM and AArch64, which use distinct `ArchExtKinds`. We can't trivially make the same change for ARM, because one of the values it returns is the bitwise-or of two `ARM::ArchExtKind`s. llvm-svn: 322613	2018-01-17 03:12:06 +00:00
Aaron Smith	53a1a1616c	Fix pretty printing the unspecified param of a variadic function Summary: - Fix a bug in PrettyBuiltinDumper that returns "void" as the name for an unspecified builtin type. Since the unspecified param of a variadic function is considered a builtin of unspecified type in PDBs, we set "..." for its name. - Provide a method to determine if a PDBSymbolFunc is variadic in PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the last unspecified-type param. - Add a pretty-func-dumper.test to test pretty dumping of variadic functions. Reviewers: zturner, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D41801 llvm-svn: 322608	2018-01-17 01:22:03 +00:00
Evgeniy Stepanov	c07e0bd533	[hwasan] Rename sized load/store callbacks to be consistent with ASan. Summary: __hwasan_load is now __hwasan_loadN. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42138 llvm-svn: 322601	2018-01-16 23:15:08 +00:00
Simon Pilgrim	a8e6b885bd	[X86][BTVER2] Fix scheduling of VCMPSD/VCMPSS instructions For some reason they don't have a trailing i like the packed equivalents. llvm-svn: 322600	2018-01-16 22:15:41 +00:00
Florian Hahn	c6c89bffdc	[CallSiteSplitting] Pass list of (BB, Conditions) pairs to splitCallSite. This removes some duplication from splitCallSite and makes it easier to add additional code dealing with each predecessor. It also allows us to split for more than 2 predecessors, although that is not enabled for now. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41858 llvm-svn: 322599	2018-01-16 22:13:15 +00:00
Simon Pilgrim	3c66e2c541	[X86][BTVER2] Use instrs instead of instregex for low match counts (PR35955) llvm-svn: 322598	2018-01-16 22:08:43 +00:00
Simon Pilgrim	e9a2832f32	[X86][BTVER2] Use instrs instead of instregex for single use matches (PR35955) llvm-svn: 322597	2018-01-16 21:44:48 +00:00
Rui Ueyama	af4ddd5a6e	Specify inline for isWhitespace in CommandLine.cpp Patch by Takuto Ikuta. In chromium's component build, there are many directive sections and commandline parsing takes much time. This patch is for speed up of lld in RelWithDebInfo build by forcing inline heavily called isWhitespace function. 10 times link perf stats of blink_core.dll changed like below. master: TotalSeconds: 9.8764878 TotalSeconds: 10.1455242 TotalSeconds: 10.075279 TotalSeconds: 10.3397347 TotalSeconds: 9.8361665 TotalSeconds: 9.9544441 TotalSeconds: 9.8960686 TotalSeconds: 9.8877865 TotalSeconds: 10.0551879 TotalSeconds: 10.0492254 Avg: 10.01159047 with this patch: TotalSeconds: 8.8696762 TotalSeconds: 9.1021585 TotalSeconds: 9.0233893 TotalSeconds: 9.1886175 TotalSeconds: 9.156954 TotalSeconds: 9.0978564 TotalSeconds: 9.1316824 TotalSeconds: 8.8354606 TotalSeconds: 9.2549431 TotalSeconds: 9.4473085 Avg: 9.11080465 llvm-svn: 322595	2018-01-16 20:52:32 +00:00
Lang Hames	4a793c0667	[ExecutionEngine] Rename JITSymbol::isStrongDefinition to isStrong. For symmetry with isWeak, isCommon. llvm-svn: 322594	2018-01-16 20:39:51 +00:00
Guozhi Wei	e6fb4e1f8a	[PPC] Add a new register XER aliased to CARRY When "xer" is specified as clobbered register in inline assembler, clang can accept it, but llvm simply ignore it when lowered to machine instructions. It may cause problems later in scheduler. This patch adds a new register XER aliased to CARRY, and adds it to register class CARRYRC. Now PPCTargetLowering::getRegForInlineAsmConstraint can return correct register number for inline asm constraint "{xer}", and scheduler behave correctly. Differential Revision: https://reviews.llvm.org/D41967 llvm-svn: 322591	2018-01-16 19:28:50 +00:00
Francis Visoiu Mistrih	54a9e7a400	[CodeGen] Skip some instructions that shouldn't affect shrink-wrapping r320606 checked for MI.isMetaInstruction which skips all DBG_VALUEs. This also skips IMPLICIT_DEFs and other instructions that may def / read a register. Differential Revision: https://reviews.llvm.org/D42119 llvm-svn: 322584	2018-01-16 18:55:26 +00:00
Volkan Keles	f7f2568613	[GlobalISel][TableGen] Add support for SDNodeXForm Summary: This patch adds CustomRenderer which renders the matched operands to the specified instruction. Targets can enable the matching of SDNodeXForm by adding a definition that inherits from GICustomOperandRenderer and GISDNodeXFormEquiv as follows. def gi_imm8 : GICustomOperandRenderer<"renderImm8”>, GISDNodeXFormEquiv<imm8_xform>; Custom renderer functions should be of the form: void render(MachineInstrBuilder &MIB, const MachineInstr &I); Reviewers: dsanders, ab, rovka Reviewed By: dsanders Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet Differential Revision: https://reviews.llvm.org/D42012 llvm-svn: 322582	2018-01-16 18:44:05 +00:00
Alexey Bataev	6977dbcc7b	[SLP] Fix for PR32164: Improve vectorization of reverse order of extract operations. Summary: Sometimes vectorization of insertelement instructions with extractelement operands may produce an extra shuffle operation, if these operands are in the reverse order. Patch tries to improve this situation by the reordering of the operands to remove this extra shuffle operation. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D33954 llvm-svn: 322579	2018-01-16 18:17:01 +00:00
Simon Pilgrim	3e0aafbfcc	[X86][MMX] Accept UNDEF upper bits for MOVD GR32->MMX llvm-svn: 322574	2018-01-16 17:01:31 +00:00
Petar Jovanovic	0b464e4f0e	[LiveDebugValues] recognize spilled reg killed in instruction after spill Current condition for spill instruction recognition in LiveDebugValues does not recognize case when register is spilled and killed in next instruction. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D41226 llvm-svn: 322554	2018-01-16 14:46:05 +00:00
Simon Pilgrim	85e6139633	[X86][MMX] Improve MMX constant generation Extend the MMX zero code to take any constant with zero'd upper 32-bits llvm-svn: 322553	2018-01-16 14:21:28 +00:00
Jonas Devlieghere	6f24c8778c	[DebugInfo] Unify dumping of address ranges Summary: This patch unifies the printing of address ranges as [0x0, 0x1). rdar://34822059 Reviewers: aprantl, dblaikie Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D42056 llvm-svn: 322543	2018-01-16 11:17:57 +00:00
Francis Visoiu Mistrih	5eaddb3f68	[CodeGen] Remove special case of printing subRegIdx from MachineInstr::print Support in MachineOperand has been added in r320209. No need to special case this anymore. llvm-svn: 322542	2018-01-16 10:53:14 +00:00
Francis Visoiu Mistrih	ecd0b83312	[CodeGen][NFC] Correct case for printSubRegIdx llvm-svn: 322541	2018-01-16 10:53:11 +00:00
Yonghong Song	035dd256d5	[BPF] Mark pseudo insn patterns as isCodeGenOnly These pseudos are not supposed to be visible to user. This patch reduced the auto-generated instruction matcher. For example, the following words are removed from keyword list of LLVM BPF assembler. - MCK__35_, // '#' - MCK__COLON_, // ':' - MCK__63_, // '?' - MCK_ADJCALLSTACKDOWN, // 'ADJCALLSTACKDOWN' - MCK_ADJCALLSTACKUP, // 'ADJCALLSTACKUP' - MCK_PSEUDO, // 'PSEUDO' - MCK_Select, // 'Select' Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 322535	2018-01-16 07:27:20 +00:00
Yonghong Song	b42c7c7863	[BPF] Teach DAG2DAG AND elimination about load intrinsics As commented on the existing code: // The Reg operand should be a virtual register, which is defined // outside the current basic block. DAG combiner has done a pretty // good job in removing truncating inside a single basic block. However, when the Reg operand comes from bpf_load_[byte \| half \| word] intrinsics, the generic optimizer doesn't understand their results are zero extended, so these single basic block elimination opportunities were missed. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 322534	2018-01-16 07:27:19 +00:00
Hiroshi Inoue	99a8faa615	[SROA] fix assetion failure This patch fixes the assertion failure in SROA reported in PR35657. PR35657 reports the assertion failure due to r319522 (splitting for non-whole-alloca slices), but this problem can happen even without r319522. The problem exists in a check for reusing an existing alloca when rewriting partitions. As the original comment said, we can reuse the existing alloca if the new alloca has the same type and offset with the existing one. But the code checks only type of the alloca and then check the offset using an assert. In a corner case with out-of-bounds access (e.g. @PR35657 function added in unit test), it is possible that the two allocas have the same type but different offsets. This patch makes the check of the offset in the if condition, and re-enables the splitting for non-whole-alloca slices. Differential Revision: https://reviews.llvm.org/D41981 llvm-svn: 322533	2018-01-16 06:23:05 +00:00
Craig Topper	7a0c601f95	[X86] Revisit the fix I made years ago to make 'xchgl %eax, %eax' not encode using the 0x90 encoding in 64-bit mode. Prior to this we had a separate instruction and register class that excluded eax to prevent matching the instruction that would encode with 0x90. This patch changes this to just use an InstAlias to force xchgl %eax, %eax to use XCHG32rr instruction in 64-bit mode. This gets rid of the separate instruction and register class. llvm-svn: 322532	2018-01-16 06:07:16 +00:00
Craig Topper	daa385f480	[X86] Make 'xchgq %rax, %rax' an alias for the 0x90 nop encoding to match gas. Previously we encoded it as 0x48 0x90. llvm-svn: 322531	2018-01-16 06:07:14 +00:00
Simon Pilgrim	e5dad1365c	Avoid Wparentheses warning. llvm-svn: 322526	2018-01-15 22:40:06 +00:00
Simon Pilgrim	85bd9141ca	[X86][MMX] Add support for MMX zero vector creation As mentioned on PR35869, (and came up recently on D41517) we don't create a MMX zero register via the PXOR but instead perform a spill to stack from a XMM zero register. This patch adds support for direct MMX zero vector creation and should make it easier to add better constant vector creation in the future as well. Differential Revision: https://reviews.llvm.org/D41908 llvm-svn: 322525	2018-01-15 22:32:40 +00:00
Simon Pilgrim	940eae3cc1	[X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873) Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW. Differential Revision: https://reviews.llvm.org/D42042 llvm-svn: 322524	2018-01-15 22:18:45 +00:00
Craig Topper	1393ccf949	[X86] Use MVT::getVectorVT instead of EVT::getVectorVT when splitting 256/512 bit build_vectors. NFC We must be creating a legal type here which means it can be an MVT. llvm-svn: 322512	2018-01-15 20:33:53 +00:00
Craig Topper	aacc622564	[X86] Generalize some code in LowerBUILD_VECTOR. NFC llvm-svn: 322511	2018-01-15 20:33:52 +00:00
Craig Topper	4f7fadd029	[X86] Remove unnecessary if statement from LowerBUILD_VECTOR. NFCI We were checking for 128, 256, or 512 bit vectors, but those are the only types that can get here. llvm-svn: 322510	2018-01-15 20:33:50 +00:00
Dan Gohman	7aa1fcdf3e	[WebAssembly] Update README.txt. Describe more of the current status, mention Rust as another easy way to use this backend, and add more documentation links. llvm-svn: 322508	2018-01-15 20:08:14 +00:00
Stanislav Mekhanoshin	62875fcd6c	[AMDGPU] Add HW_REG_SH_MEM_BASES symbolic name for s_getreg_b32 Differential Revision: https://reviews.llvm.org/D41617 llvm-svn: 322500	2018-01-15 18:49:15 +00:00

1 2 3 4 5 ...

109919 Commits