llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	4b2e7daa7e	Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP" This broke Chromium's Android build (https://crbug.com/889390) and the polly-aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable). > Originally committed in rL342210 but was reverted in rL342260 because > it was causing issues in vectorized code, because I had forgotten to > ensure that we're operating on scalar values. > > Original commit message: > > On failing to find sequences that can be converted into dual macs, > try to find sequential 16-bit loads that are used by muls which we > can then use smultb, smulbt, smultt with a wide load. > > Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 343082	2018-09-26 08:41:50 +00:00
Lang Hames	d8048675f4	[ORC] Update CompileOnDemandLayer2 to use the new lazyReexports mechanism for lazy compilation, rather than a callback manager. The new mechanism does not block compile threads, and does not require function bodies to be renamed. Future modifications should allow laziness on a per-module basis to work without any modification of the input module. llvm-svn: 343065	2018-09-26 05:08:29 +00:00
Hsiangkai Wang	55321d82bd	[DebugInfo] Do not generate address info for removed debug labels. In some senario, LLVM will remove llvm.dbg.labels in IR. For example, when the labels are in unreachable blocks, these labels will not be generated in LLVM IR. In the case, these debug labels will have address zero as their address. It is not legal address for debugger to set breakpoints or query sources. So, the patch inhibits the address info (DW_AT_low_pc) of removed labels. Fix build failed in BuildBot, clang-stage1-cmake-RA-incremental, on macOS. Differential Revision: https://reviews.llvm.org/D51908 llvm-svn: 343062	2018-09-26 04:19:23 +00:00
Lang Hames	225a32af72	[ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT. LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads arguments. If this is non-zero then a thread-pool will be created with the given number of threads, and compile tasks will be dispatched to the thread pool. To enable testing of this feature, two new flags are added to lli: (1) -compile-threads=N (N = 0 by default) controls the number of compile threads to use. (2) -thread-entry can be used to execute code on additional threads. For each -thread-entry argument supplied (multiple are allowed) a new thread will be created and the given symbol called. These additional thread entry points are called after static constructors are run, but before main. llvm-svn: 343058	2018-09-26 02:39:42 +00:00
Vyacheslav Zakharin	e06831a3b2	Remove LoopID metadata from the branch instruction that follows the peeled iterations. Differential Revision: https://reviews.llvm.org/D52176 llvm-svn: 343054	2018-09-26 01:03:21 +00:00
Zhaoshi Zheng	95710337b4	Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant"" This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346. Reland r342994: disabled the optimization and explicitly enable it in test. -mllvm -consthoist-min-num-to-rebase<unsigned>=0 [ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 343053	2018-09-26 00:59:09 +00:00
Thomas Lively	c949857a7f	[WebAssembly] SIMD conversions Summary: Lowers (s\|u)itofp and fpto(s\|u)i instructions for vectors. The fp to int conversions produce poison values if their arguments are out of the convertible range, so a future CL will have to add an LLVM intrinsic to make the saturating behavior of this conversion usable. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52372 llvm-svn: 343052	2018-09-26 00:34:36 +00:00
Stanislav Mekhanoshin	8dfcd83371	[AMDGPU] Fix ds combine with subregs Differential Revision: https://reviews.llvm.org/D52522 llvm-svn: 343047	2018-09-25 23:33:18 +00:00
Craig Topper	12c18840fa	[X86] Allow movmskpd/ps ISD nodes to be created and selected with integer input types. This removes an int->fp bitcast between the surrounding code and the movmsk. I had already added a hack to combineMOVMSK to try to look through this bitcast to improve the SimplifyDemandedBits there. But I found an additional issue where the bitcast was preventing combineMOVMSK from being called again after earlier nodes in the DAG are optimized. The bitcast gets revisted, but not the user of the bitcast. By using integer types throughout, the bitcast doesn't get in the way. llvm-svn: 343046	2018-09-25 23:28:27 +00:00
Craig Topper	d8c68840c8	[X86] Add some more movmsk test cases. NFC These IR patterns represent the exact behavior of a movmsk instruction using (zext (bitcast (icmp slt X, 0))). For the v4i32/v8i32/v2i64/v4i64 we currently emit a PCMPGT for the icmp slt which is unnecessary since we only care about the sign bit of the result. This is because of the int->fp bitcast we put on the input to the movmsk nodes for these cases. I'll be fixing this in a future patch. llvm-svn: 343045	2018-09-25 23:28:24 +00:00
Sanjay Patel	f23727d972	[InstCombine] add fneg variation of shuffle-binop fold; NFC If the fsub in this pattern was replaced by an actual fneg instruction, we would need to add a fold to recognize that because fneg would not be a binop. llvm-svn: 343041	2018-09-25 22:48:58 +00:00
Changpeng Fang	6f4922ccc9	AMDGPU: Add Selection patterns to support add of one bit. Summary: We generate s_xor to lower add of i1s in general cases, and s_not to lower add with a one-bit imm of -1 (true). Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D52518 llvm-svn: 343030	2018-09-25 21:21:18 +00:00
Anna Thomas	b1e3d45318	[LV][LAA] Vectorize loop invariant values stored into loop invariant address Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 llvm-svn: 343028	2018-09-25 20:57:20 +00:00
Teresa Johnson	7fb39dfa7c	[ThinLTO] Efficiency fix for writing type id records in per-module indexes Summary: In D49565/r337503, the type id record writing was fixed so that only referenced type ids were emitted into each per-module index for ThinLTO distributed builds. However, this still left an efficiency issue: each per-module index checked all type ids for membership in the referenced set, yielding O(M*N) performance (M indexes and N type ids). Change the TypeIdMap in the summary to be indexed by GUID, to facilitate correlating with type identifier GUIDs referenced in the function summary TypeIdInfo structures. This allowed simplifying other places where a map from type id GUID to type id map entry was previously being used to aid this correlation. Also fix AsmWriter code to handle the rare case of type id GUID collision. For a large internal application, this reduced the thin link time by almost 15%. Reviewers: pcc, vitalybuka Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51330 llvm-svn: 343021	2018-09-25 20:14:40 +00:00
Sanjay Patel	10c11b867a	[x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449) This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749 We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op. The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence: t29: v8i32 = concat_vectors t27, t28 t30: v4i64 = bitcast t29 t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ... t31: v4i64 = bitcast t18 t32: v4i64 = xor t30, t31 t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ... t34: v4i64 = bitcast t9 t35: v4i64 = and t32, t34 t36: v8i32 = bitcast t35 t37: v4i32 = extract_subvector t36, Constant:i64<0> t38: v4i32 = extract_subvector t36, Constant:i64<4> Differential Revision: https://reviews.llvm.org/D52318 llvm-svn: 343008	2018-09-25 19:09:34 +00:00
Yury Delendik	7c18d6083a	[WebAssembly] Move/clone DBG_VALUE during WebAssemblyRegStackify pass Summary: The MoveForSingleUse or MoveAndTeeForMultiUse functions move wasm instructions, however DBG_VALUE stay unchanged -- moving or cloning these. Reviewers: dschuff Reviewed By: dschuff Subscribers: mattd, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits, aardappel Tags: #debug-info Differential Revision: https://reviews.llvm.org/D49034 llvm-svn: 343007	2018-09-25 18:59:34 +00:00
Jessica Paquette	e02de05b32	Revert "[ConstHoist] Do not rebase single (or few) dependent constant" This caused a couple test failures on a bot: CodeGen/X86/constant-hoisting-bfi.ll Transforms/ConstantHoisting/X86/ehpad.ll Example: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/ llvm-svn: 343005	2018-09-25 18:41:40 +00:00
Daniil Fukalov	349b5943b4	[RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled For the AMDGPU target if a MBB contains exec mask restore preamble, SplitEditor may get state when it cannot insert a spill instruction. E.g. for a MIR bb.100: %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec and if the regalloc will try to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert spill instruction before the S_OR_SAVEEXEC_B64 instruction. But it is not possible since can generate incorrect code in terms of exec mask. The change makes regalloc to ignore such physreg candidates. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D52052 llvm-svn: 343004	2018-09-25 18:37:38 +00:00
Daniel Sanders	06f4ff1952	[globalisel][tblgen] Table optimization should consider the C++ code in C++ predicates This fixes PR39045 llvm-svn: 342997	2018-09-25 17:59:02 +00:00
Zhaoshi Zheng	2c1a09188f	[ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 342994	2018-09-25 17:45:37 +00:00
Justin Bogner	ef2ae740c6	Revert "[DebugInfo] Do not generate address info for removed debug labels." The added test is failing on macOS: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/ This reverts r342943. llvm-svn: 342993	2018-09-25 17:29:30 +00:00
Craig Topper	6fb1358a98	[X86] Add AVX512 support to combineVectorSizedSetCCEquality. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52424 llvm-svn: 342989	2018-09-25 16:27:12 +00:00
Sanjay Patel	69ed4710b8	[InstCombine] narrow binops on concatenated vectors (PR33026) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=33026 ...has no shuffles now. This kind of pattern may occur during vectorization when targets have lumpy ISAs like SSE/AVX. llvm-svn: 342988	2018-09-25 15:57:37 +00:00
Guillaume Chatelet	345fae5d56	[llvm-exegesis] Serializes registers initial values. Summary: Adds the registers initial values to the YAML output of llvm-exegesis. Reviewers: courbet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52460 llvm-svn: 342982	2018-09-25 15:15:54 +00:00
Guillaume Chatelet	6078f82241	[llvm-exegesis] Fix missing document separator in YAML output. Reviewers: courbet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52496 llvm-svn: 342981	2018-09-25 14:48:24 +00:00
Clement Courbet	86baebc5fd	[llvm-exegesis] Add lit tests (v2). Summary: This revisits rL342953 by adding detection of host support. Reviewers: gchatelet, lebedev.ri, alexshap Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52464 llvm-svn: 342975	2018-09-25 13:59:35 +00:00
Simon Pilgrim	b56be79e0c	Revert rL342916: [X86] Remove shift/rotate by CL memory (RMW) overrides As suggested by Craig Topper - I'm going to look at cleaning up the RMW sequences instead. The uops are slightly different to the register variant, so requires a +1uop tweak llvm-svn: 342969	2018-09-25 13:01:26 +00:00
David Green	9108c2b921	[LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder, similar to how it is already done in the llvm::UnrollLoop. The compiler would crash if this function is called with a malformed loop. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D51486 llvm-svn: 342958	2018-09-25 10:08:47 +00:00
Sameer Sahasrabuddhe	b4f2d1cb68	[AMDGPU] restore r342722 which was reverted with r342743 [AMDGPU] lower-switch in preISel as a workaround for legacy DA Summary: The default target of the switch instruction may sometimes be an "unreachable" block, when it is guaranteed that one of the cases is always taken. The dominator tree concludes that such a switch instruction does not have an immediate post dominator. This confuses divergence analysis, which is unable to propagate sync dependence to the targets of the switch instruction. As a workaround, the AMDGPU target now invokes lower-switch as a preISel pass. LowerSwitch is designed to handle the unreachable default target correctly, allowing the divergence analysis to locate the correct immediate dominator of the now-lowered switch. llvm-svn: 342956	2018-09-25 09:39:21 +00:00
Clement Courbet	6d92c198ac	Revert rL342953 "[llvm-exegesis] Add lit tests." We also need to make sure that we're on the right subtarget. llvm-svn: 342955	2018-09-25 09:36:44 +00:00
Clement Courbet	7f1322dc4d	[llvm-exegesis] Add lit tests. Summary: Right now we only have unit tests. This will allow testing the whole tool. Even though We can't really check actual values, this will avoid regressions such as PR39055. Reviewers: gchatelet, alexshap Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52407 llvm-svn: 342953	2018-09-25 09:27:43 +00:00
Hsiangkai Wang	9c2463622d	[DebugInfo] Do not generate address info for removed debug labels. In some senario, LLVM will remove llvm.dbg.labels in IR. For example, when the labels are in unreachable blocks, these labels will not be generated in LLVM IR. In the case, these debug labels will have address zero as their address. It is not legal address for debugger to set breakpoints or query sources. So, the patch inhibits the address info (DW_AT_low_pc) of removed labels. Differential Revision: https://reviews.llvm.org/D51908 llvm-svn: 342943	2018-09-25 06:09:50 +00:00
Thomas Lively	12da0f9c3d	[WebAssembly] SIMD sqrt Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52387 llvm-svn: 342937	2018-09-25 03:39:28 +00:00
Stanislav Mekhanoshin	14fefe7f8e	[AMDGPU] Remove useless check from test. NFC. The check for assignment of zero is practically useless while the assignment moves around with different scheduling. llvm-svn: 342935	2018-09-25 01:24:54 +00:00
Craig Topper	9ce5da7b62	[X86] Don't create FILD ISD nodes when X87 is disabled. The included test case previously asserted because the type legalizer tried to soften the FILD ISD node. Fixes PR38819. llvm-svn: 342934	2018-09-25 00:16:57 +00:00
Thomas Lively	586153652c	[WebAssembly][NFC] Fix hardcoded stack indices in tests Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52388 llvm-svn: 342928	2018-09-24 23:42:07 +00:00
Evgeniy Stepanov	090f0f9504	[hwasan] Record and display stack history in stack-based reports. Summary: Display a list of recent stack frames (not a stack trace!) when tag-mismatch is detected on a stack address. The implementation uses alignment tricks to get both the address of the history buffer, and the base address of the shadow with a single 8-byte load. See the comment in hwasan_thread_list.h for more details. Developed in collaboration with Kostya Serebryany. Reviewers: kcc Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52249 llvm-svn: 342923	2018-09-24 23:03:34 +00:00
Evgeniy Stepanov	20c4999e8b	Revert "[hwasan] Record and display stack history in stack-based reports." This reverts commit r342921: test failures on clang-cmake-arm* bots. llvm-svn: 342922	2018-09-24 22:50:32 +00:00
Evgeniy Stepanov	9043e17edd	[hwasan] Record and display stack history in stack-based reports. Summary: Display a list of recent stack frames (not a stack trace!) when tag-mismatch is detected on a stack address. The implementation uses alignment tricks to get both the address of the history buffer, and the base address of the shadow with a single 8-byte load. See the comment in hwasan_thread_list.h for more details. Developed in collaboration with Kostya Serebryany. Reviewers: kcc Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52249 llvm-svn: 342921	2018-09-24 21:38:42 +00:00
Christy Lee	e94374809e	Re-submitting changes in D51550 because it failed to patch. Reviewers: javed.absar, trentxintong, courbet Reviewed By: trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52433 llvm-svn: 342919	2018-09-24 20:47:12 +00:00
Simon Pilgrim	0b4ad7596f	[X86] Remove shift/rotate by CL memory (RMW) overrides The uops are slightly different to the register variant, so requires a +1uop tweak llvm-svn: 342916	2018-09-24 20:11:50 +00:00
Stefan Pintilie	b5305771fb	[Power9] [LLVM] Add __float128 exponent GET and SET builtins Added __builtin_vsx_scalar_extract_expq __builtin_vsx_scalar_insert_exp_qp Builtins should behave the same way as in GCC. Differential Revision: https://reviews.llvm.org/D48185 llvm-svn: 342910	2018-09-24 18:14:13 +00:00
Simon Pilgrim	51cbd838d0	[X86][AVX] Add truncation as shuffle test for PR31451 llvm-svn: 342908	2018-09-24 17:26:31 +00:00
Christy Lee	bf112ea25b	Reland r342494 after fixing LIT checks. llvm-svn: 342907	2018-09-24 17:26:30 +00:00
Sanjay Patel	7b86bc22de	[InstCombine] add/move tests for extractelement; NFC llvm-svn: 342905	2018-09-24 17:17:16 +00:00
Zhaoshi Zheng	05b46dc300	[Thumb1] Any imm8 should have cost of 1 A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or [0, 255] in unsigned i8 type on Thumb1. Differential Revision: https://reviews.llvm.org/D52257 llvm-svn: 342898	2018-09-24 16:15:23 +00:00
Fedor Sergeev	662e5686fe	[New PM][PassInstrumentation] IR printing support for New Pass Manager Implementing -print-before-all/-print-after-all/-filter-print-func support through PassInstrumentation callbacks. - PrintIR routines implement printing callbacks. - StandardInstrumentations class provides a central place to manage all the "standard" in-tree pass instrumentations. Currently it registers PrintIR callbacks. Reviewers: chandlerc, paquette, philip.pfaffe Differential Revision: https://reviews.llvm.org/D50923 llvm-svn: 342896	2018-09-24 16:08:15 +00:00
Simon Pilgrim	00865a48d1	[X86] Split WriteIMul into 8/16/32/64 implementations (PR36931) Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases. This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions. llvm-svn: 342892	2018-09-24 15:21:57 +00:00
Luke Cheeseman	ab7f9b170d	[Arm][AsmParser] Restrict register list size for VSTM/VLDM - The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified - The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers - This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52082 llvm-svn: 342891	2018-09-24 15:13:48 +00:00
Sanjay Patel	2c901742ca	[DAGCombiner] use UADDO to optimize saturated unsigned add This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select. As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities. This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86 which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests. Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place. Differential Revision: https://reviews.llvm.org/D51929 llvm-svn: 342886	2018-09-24 14:47:15 +00:00
Petar Jovanovic	f9808c5f09	[Mips][FastISel] Fix selectBranch on icmp i1 The r337288 tried to fix result of icmp i1 when its input is not sanitized by falling back to DagISel. While it now produces the correct result for bit 0, the other bits can still hold arbitrary value which is not supported by MipsFastISel branch lowering. This patch fixes the issue by falling back to DagISel in this case. Patch by Dragan Mladjenovic. Differential Revision: https://reviews.llvm.org/D52045 llvm-svn: 342884	2018-09-24 14:14:19 +00:00
Zaara Syeda	edefda48d2	[PowerPC] Support operand modifier 'x' in inline asm gcc uses operand modifier 'x' in inline asm for VSX registers. Without this modifier, instructions which use VSX numbering for their operands are printed as VMX registers. This patch adds support for the operand modifier 'x'. Differential Revision: https://reviews.llvm.org/D52244 llvm-svn: 342882	2018-09-24 14:01:16 +00:00
Jonas Devlieghere	8a7cfc6c86	[dsymutil] Set LSan blacklist whenever sanitizers are enabled. LSan can be enabled by itself or as part of the address sanitizer. Rather than checking the enabled sanitizers for both, just set the LSan env options whenever a sanitizer is enabled. llvm-svn: 342881	2018-09-24 13:56:36 +00:00
Roman Lebedev	fb697d0f1b	[NFC][CodeGen][X86][AArch64] More tests for 'bit field extract' w/ constants It would be best to introduce ISD::BitFieldExtract, because clearly more than one backend faces the same problem. But for now let's solve this in the x86-specific DAG combine. https://bugs.llvm.org/show_bug.cgi?id=38938 llvm-svn: 342880	2018-09-24 13:24:20 +00:00
Matt Arsenault	f432011d33	AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses If the alignment is at least 4, this should report true. Something still seems off with how < 4-byte types are handled here though. Fixing this seems to change how some combines get to where they get, but somehow isn't changing the net result. llvm-svn: 342879	2018-09-24 13:18:15 +00:00
Matt Arsenault	b53feca372	Fix some missing opcodes in bcanalyzer llvm-svn: 342878	2018-09-24 12:47:17 +00:00
Sjoerd Meijer	d986ede313	[ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33 A sequence of VMUL and VADD instructions always give the same or better performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33. Executing the VMUL and VADD back-to-back requires the same cycles, but having separate instructions allows scheduling to avoid the hazard between these 2 instructions. Differential Revision: https://reviews.llvm.org/D52289 llvm-svn: 342874	2018-09-24 12:02:50 +00:00
Luke Cheeseman	bda54bca39	[ARM][ARMLoadStoreOptimizer] - The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers - This is an UNPREDICTABLE instruction and shouldn't be done - It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch - This fixes https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52085 llvm-svn: 342872	2018-09-24 10:42:22 +00:00
Petar Jovanovic	c451c9ef50	[deadargelim] Update dbg.value of 'unused' parameters DeadArgElim pass marks unused function arguments as ‘undef’ without updating existing dbg.values referring to it. As a consequence the debug info metadata in the final executable was wrong. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D51968 llvm-svn: 342871	2018-09-24 10:01:24 +00:00
Sam Parker	a7b2405b06	[ARM] bottom-top mul support ARMParallelDSP Originally committed in rL342210 but was reverted in rL342260 because it was causing issues in vectorized code, because I had forgotten to ensure that we're operating on scalar values. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342870	2018-09-24 09:34:06 +00:00
Craig Topper	2b8107614c	[X86] Add 512-bit test cases to setcc-wide-types.ll. NFC llvm-svn: 342860	2018-09-24 05:46:01 +00:00
Matt Arsenault	9a71e80645	Fix asserts when linking wrong address space declarations llvm-svn: 342858	2018-09-24 04:42:14 +00:00
Matt Arsenault	ce5f203415	llvm-diff: Fix crash on anonymous functions Not sure what the correct behavior is for this. Skip them and report how many there were. llvm-svn: 342857	2018-09-24 04:42:13 +00:00
Simon Pilgrim	9202c9fb47	[X86] RORmCL instruction models should match ROLmCL etc. Confirmed with Craig Topper - fix a typo that was missing a Port4 uop for ROR*mCL instructions on some Intel models. Yet another step on the scheduler model cleanup marathon...... llvm-svn: 342846	2018-09-23 19:16:01 +00:00
Sanjay Patel	0027946915	[DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some existing code that overlaps with this transform. This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 (still an open question about transforming legal vector multiplies, but we could open another bug report for those) llvm-svn: 342844	2018-09-23 18:41:38 +00:00
Simon Pilgrim	19952add7c	[X86] Added missing RCL/RCR schedule overrides to the generic SNB model The SandyBridge model was missing schedule values for the RCL/RCR values - instead using the (incredibly optimistic) WriteShift (now WriteRotate) defaults. I've added overrides with more realistic (slow) values, based on a mixture of Agner/instlatx64 numbers and what later Intel models do as well. This is necessary to allow WriteRotate to be updated to remove other rotate overrides. It'd probably be a good idea to investigate a WriteRotateCarry class at some point but its not high priority given the unusualness of these instructions. llvm-svn: 342842	2018-09-23 17:40:24 +00:00
Sanjay Patel	151efca3fe	[x86] add tests for mul decomposition with negative constant; NFC llvm-svn: 342838	2018-09-23 16:07:46 +00:00
Eugene Leviant	2b70d616f0	[WholeProgramDevirt] Don't process declarations when building type id map Differential revision: https://reviews.llvm.org/D52175 llvm-svn: 342836	2018-09-23 13:27:47 +00:00
Craig Topper	c296436a30	[X86] Add isel pattern for (v8i16 (sext (v8i1))) with DQI and no BWI. Our lowering that tries to avoid this sign extend can be defeated by the DAG combine folding it with a truncate. The pattern needs to extend to an v8i32 then truncate back down to v8i16. llvm-svn: 342830	2018-09-23 06:49:48 +00:00
Tri Vo	6c47c62588	[AArch64] Support adding X[8-15,18] registers as CSRs. Summary: Specifying X[8-15,18] registers as callee-saved is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. As part of this patch we: - use custom CSR list/mask when user specifies custom CSRs - update Machine Register Info's list of CSRs with additional custom CSRs in LowerCall and LowerFormalArguments. Reviewers: srhines, nickdesaulniers, efriedma, javed.absar Reviewed By: nickdesaulniers Subscribers: kristof.beyls, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52216 llvm-svn: 342824	2018-09-22 22:17:50 +00:00
Yonghong Song	55b9156730	[bpf] Test case for symbol information in object file This patch tests the change introduced in r342556. Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> llvm-svn: 342807	2018-09-22 17:31:01 +00:00
Sanjay Patel	09e02fbf51	[InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814) Follow-up to rL342324 (D52059): Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814 This is an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. llvm-svn: 342806	2018-09-22 14:43:55 +00:00
George Rimar	ce95ac6490	[lib/MC] - Set SHF_EXCLUDE flag for .dwo sections. DWARF5 spec says about single file split case: "The sections that do not require relocation, however, can be written to the relocatable object (.o) file but ignored by the the linker or they can be written to a separate DWARF object (.dwo) file that need not be accessed by the linker." Nice way to make linker to ignore them is to set SHF_EXCLUDE flag. It seems to be not harmful to always set it for .dwo sections. That is what this patch does. Differential revision: https://reviews.llvm.org/D52303 llvm-svn: 342800	2018-09-22 07:36:20 +00:00
Craig Topper	2b3f5df73a	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163. Reviewers: spatel, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52070 llvm-svn: 342797	2018-09-22 05:53:27 +00:00
Craig Topper	082e04c61d	[X86] Fix inline expansion for memset in x32 Summary: Similar to D51893 which was for memcpy Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52063 llvm-svn: 342796	2018-09-22 05:16:35 +00:00
Craig Topper	9995760df4	[X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) for vXi8 vectors. We don't have a vXi8 shift left so we need to bitcast to a vXi16 vector to perform the shift. If we let lowering legalize the vXi8 shift we get an extra and that we don't need and fail to remove. llvm-svn: 342795	2018-09-22 05:08:38 +00:00
Jordan Rupprecht	8d60f9b6d2	[llvm-size] Berkeley formatting: use tabs instead of spaces as field delimeters. This matches GNU behavior for size and allows use of cut to parse the output of llvm-size. llvm-svn: 342791	2018-09-21 23:48:12 +00:00
Craig Topper	ecdab03d10	[X86] Teach fast isel to use MOV32ri64 for loading an unsigned 32 immediate into a 64-bit register. Previously we used SUBREG_TO_REG+MOV32ri. But regular isel was changed recently to use the MOV32ri64 pseudo. Fast isel now does the same. llvm-svn: 342788	2018-09-21 23:14:05 +00:00
Warren Ristow	4f27730eaf	[Loop Vectorizer] Abandon vectorization when no integer IV found Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization to be done. If an FP IV was found, but no integer IV was found at all (primary or secondary), the attempt to vectorize still went forward, causing a compiler-crash. This change abandons that attempt when no integer IV is found. (Vectorizing FP-only cases like this, rather than bailing out, is discussed as possible future work in D52327.) See PR38800 for more information. Differential Revision: https://reviews.llvm.org/D52327 llvm-svn: 342786	2018-09-21 23:03:50 +00:00
Zachary Turner	6345e84dde	[NativePDB] Add support for reading function signatures. This adds support for parsing function signature records and returning them through the native DIA interface. llvm-svn: 342780	2018-09-21 22:36:28 +00:00
Zachary Turner	355ffb0032	[PDB] Add native reading support for UDT / class types. This allows the native reader to find records of class/struct/ union type and dump them. This behavior is tested by using the diadump subcommand against golden output produced by actual DIA SDK on the same PDB file, and again using pretty -native to confirm that we actually dump the classes. We don't find class members or anything like that yet, for now it's just the class itself. llvm-svn: 342779	2018-09-21 22:36:04 +00:00
Fedor Sergeev	10febb0779	[New PM][PassInstrumentation] Adding PassInstrumentation to the AnalysisManager runs As a prerequisite to time-passes implementation which needs to time both passes and analyses, adding instrumentation points to the Analysis Manager. The are two functional differences between Pass and Analysis instrumentation: - the latter does not increment pass execution counter - it does not provide ability to skip execution of the corresponding analysis Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D51275 llvm-svn: 342778	2018-09-21 22:10:17 +00:00
Adrian Prantl	2e102480ac	llvm-dwarfdump --statistics: Unique abstract origins across multiple CUs. Instead of indexing local variables by DIE offset, use the variable name + the path through the lexical block tree. This makes the lookup key consistent across duplicate abstract origins in different CUs. llvm-svn: 342776	2018-09-21 21:59:34 +00:00
Sanjay Patel	db1fb8cd20	[x86] add more tests for poetntial andnp splitting with AVX1; NFC llvm-svn: 342775	2018-09-21 21:25:16 +00:00
Simon Pilgrim	eabc582b62	[X86] Add AVX512 target to load scalar to vector tests To investigate broadcast instruction codegen for D51553 llvm-svn: 342773	2018-09-21 21:08:26 +00:00
Thomas Lively	720fbf553d	[WebAssembly][NFC] Rename simd-conversions test to simd-bitcasts Summary: This name is more accurate and I want to reuse the simd-conversions name for testing the actual conversion ops. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52333 llvm-svn: 342761	2018-09-21 18:46:39 +00:00
Caroline Tice	3dea3f9e0a	Pass code-model through Module IR to LTO which will use it. Currently the code-model does not get saved in the module IR, so if a code model is specified when compiling with LTO, it gets lost and is not propagated properly to LTO. This patch, along with one for the front end, fixes that. Differential Revision: https://reviews.llvm.org/D52322 llvm-svn: 342760	2018-09-21 18:41:31 +00:00
Sanjay Patel	19b5eb580b	[x86] add (negative) andnp test for D52318; NFC llvm-svn: 342756	2018-09-21 18:24:53 +00:00
Sanjay Patel	83a1b66c43	[x86] add test with optsize attribute for scalar->vector transform; NFC llvm-svn: 342755	2018-09-21 18:03:49 +00:00
Wouter van Oortmerssen	7beaa30e4e	[WebAssembly] Made assembler only use stack instruction tablegen defs Summary: This ensures we have the non-register version of the instruction. The stack version of call_indirect now wants a type index argument, so that has been added in the existing tests. Tested: llvm-lit -v `find test -name WebAssembly` Reviewers: dschuff Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51662 llvm-svn: 342753	2018-09-21 17:47:58 +00:00
Sameer Sahasrabuddhe	0807e94951	revert changes from r342722 "[AMDGPU] lower-switch in preISel as a workaround for legacy DA" This broke regression tests. The first breakage was noticed here: http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549 llvm-svn: 342743	2018-09-21 16:31:51 +00:00
Sanjay Patel	72d627e5ec	[InstCombine] add tests for extractelement; NFC There are folds under visitExtractElementInst() that don't appear to have any test coverage, so adding a few basic cases here. llvm-svn: 342740	2018-09-21 14:43:49 +00:00
Clement Courbet	8171bd8e0f	[X86][Sched] Add zero idiom sched data to the SNB model. Summary: On SNB, renamer-based zeroing does not work for: - 16 and 8-bit GPRs[1]. - MMX [2]. - ANDN variants [3] [1] echo 'sub %ax, %ax' \| /tmp/llvm-exegesis -mode=uops -snippets-file=- [2] echo 'pxor %mm0, %mm0' \| /tmp/llvm-exegesis -mode=uops -snippets-file=- [3] echo 'andnps %xmm0, %xmm0' \| /tmp/llvm-exegesis -mode=uops -snippets-file=- Reviewers: RKSimon, andreadb Subscribers: gbedwell, craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D52358 llvm-svn: 342736	2018-09-21 14:07:20 +00:00
Andrea Di Biagio	4cd5cf9fc8	[X86][BtVer2] Fix latency and resource cycles of AVX 256-bit zero-idioms. This patch introduces a SchedWriteVariant to describe zero-idiom VXORP(S\|D)Yrr and VANDNP(S\|D)Yrr. This is a follow-up of r342555. On Jaguar, a VXORPSYrr is 2 macro opcodes. Only one opcode is eliminated at register-renaming stage. The other opcode has to be executed to set the upper half of the destination YMM. Same for VANDNP(S\|D)Yrr. Differential Revision: https://reviews.llvm.org/D52347 llvm-svn: 342728	2018-09-21 12:43:07 +00:00
Jonas Devlieghere	7cf529c11f	[test] Fix Assembler/debug-info.ll Update Assembler/debug-info.ll to contain discriminator. llvm-svn: 342727	2018-09-21 12:28:44 +00:00
Andrea Di Biagio	eebfecee4f	[X86] Add scheduling tests for AVX1 256-bit zero-idioms. NFC llvm-svn: 342726	2018-09-21 12:22:14 +00:00
Jonas Devlieghere	a0c9cb1913	Ensure that variant part discriminator is read by MetadataLoader https://reviews.llvm.org/D42082 introduced variant parts to debug info in LLVM. Subsequent work on the Rust compiler has found a bug in that patch; namely, there is a path in MetadataLoader that fails to restore the discriminator. This patch fixes the bug. Patch by: Tom Tromey Differential revision: https://reviews.llvm.org/D52340 llvm-svn: 342725	2018-09-21 12:03:14 +00:00
Jonas Devlieghere	907ed15f99	[dsymutil] Suppress CoreFoundation leaks in tests. This suppresses CoreFoundation originated leaks in the dsymutil tests. I'm not sure if this is a false positive or not, but either way we don't have control over it and shouldn't keep the bot red. http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan/ llvm-svn: 342724	2018-09-21 11:55:17 +00:00
Sameer Sahasrabuddhe	2de7653fd5	[AMDGPU] lower-switch in preISel as a workaround for legacy DA Summary: The default target of the switch instruction may sometimes be an "unreachable" block, when it is guaranteed that one of the cases is always taken. The dominator tree concludes that such a switch instruction does not have an immediate post dominator. This confuses divergence analysis, which is unable to propagate sync dependence to the targets of the switch instruction. As a workaround, the AMDGPU target now invokes lower-switch as a preISel pass. LowerSwitch is designed to handle the unreachable default target correctly, allowing the divergence analysis to locate the correct immediate dominator of the now-lowered switch. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll Differential Revision: https://reviews.llvm.org/D52221 llvm-svn: 342722	2018-09-21 11:26:55 +00:00
Alexander Timofeev	36617f0160	[AMDGPU] Divergence driven instruction selection. Part 1. Summary: This change is the first part of the AMDGPU target description change. The aim of it is the effective splitting the vector and scalar flows at the selection stage. Selection uses predicate functions based on the framework implemented earlier - https://reviews.llvm.org/D35267 Differential revision: https://reviews.llvm.org/D52019 Reviewers: rampitec llvm-svn: 342719	2018-09-21 10:31:22 +00:00

1 2 3 4 5 ...

56245 Commits