llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	ba375263e8	[DAGCombiner][X86] Teach visitCONCAT_VECTORS to combine (concat_vectors (concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef) I also had to add a new combine to X86's combineExtractSubvector to prevent a regression. This helps our vXi1 code see the full concat operation and allow it optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation. I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that. Differential Revision: https://reviews.llvm.org/D66456 llvm-svn: 369459	2019-08-20 22:12:50 +00:00
Sean Fertile	1e46d4cec5	Adds support for writing the .bss section for XCOFF object files. Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target object writer. Also adds a class to represent the top-level sections, which we materialize in the ObjectWriter. executePostLayoutBinding will map all csects into the appropriate container depending on its storage mapping class, and map all symbols into their containing csect. Once all symbols have been processed we - Assign addresses and symbol table indices. - Calaculte section sizes. - Build the section header table. - Assign the sections raw-pointer value for non-virtual sections. Since the .bss section is virtual, writing the header table is enough to add support. Writing of a sections raw data, or of any relocations is not included in this patch. Testing is done by dumping the section header table, but it needs to be extended to include dumping the symbol table once readobj support for dumping auxiallary entries lands. Differential Revision: https://reviews.llvm.org/D65159 llvm-svn: 369454	2019-08-20 22:03:18 +00:00
Aditya Nandakumar	08bd080872	[GlobalISel] Handle multiple registers in dbg.value intrinsic https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which need a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403	2019-08-20 16:28:37 +00:00
Thomas Raoux	be699bf389	[CodeGen] Add a pass to do block predication on SSA machine IR. For targets requiring aggressive scheduling and/or software pipeline we need to apply predication before preRA scheduling. This adds a pass re-using the early if-cvt infrastructure but generating predicated instructions instead of speculatively executing instructions. It allows doing if conversion on blocks containing instructions with side-effects. The pass re-use the target hook from postRA if-conversion to let the target decide on the heuristic to apply. Differential Revision: https://reviews.llvm.org/D66190 llvm-svn: 369395	2019-08-20 15:54:59 +00:00
Karl-Johan Karlsson	40da6be2bd	[AsmPrinter] Remove const qualifier from EmitBasicBlockStart. Overriders may want to modify state in it. AMDGPU wants to, but has to make its members mutable in order to do so. Besides, EmitBasicBlockEnd is not const, so why should Start be? Patch by Bevin Hansson. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D66341 llvm-svn: 369325	2019-08-20 05:13:57 +00:00
Vyacheslav Zakharin	f7229ac7d8	Fixed placement of llvm.global_dtors on Windows. Differential revision: https://reviews.llvm.org/D66373 llvm-svn: 369299	2019-08-19 21:07:03 +00:00
Craig Topper	93c2787193	[CGP] Remove ModifiedDT from the makeBitReverse loop I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283	2019-08-19 18:02:24 +00:00
Roman Lebedev	edfaee0811	[TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handling Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bailout if we see them. But that is too strict, we can just fix-up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as it can be seen while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268	2019-08-19 15:01:42 +00:00
Jinsong Ji	0776da5236	[PeepholeOptimizer] Don't assume bitcast def always has input Summary: If we have a MI marked with bitcast bits, but without input operands, PeepholeOptimizer might crash with assert. eg: If we apply the changes in PPCInstrVSX.td as in this patch: [(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>; We will get assert in PeepholeOptimizer. ``` llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. ``` The fix is to abort if we found out of bound access. Reviewers: qcolombet, MatzeB, hfinkel, arsenm Reviewed By: qcolombet Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65542 llvm-svn: 369261	2019-08-19 14:19:04 +00:00
David Stenberg	88df53e6ea	[DebugInfo] Allow bundled calls in the MIR's call site info Summary: Extend the MIR parser and writer so that the call site information can refer to calls that are bundled. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: aprantl Subscribers: arsenm, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66145 llvm-svn: 369256	2019-08-19 12:41:22 +00:00
Jeremy Morse	176bbd5cde	[DebugInfo] Make postra sinking of DBG_VALUEs subregister-safe Currently the machine instruction sinker identifies DBG_VALUE insts that also need to sink by comparing register numbers. Unfortunately this isn't safe, because (after register allocation) a DBG_VALUE may read a register that aliases what's being sunk. To fix this, identify the DBG_VALUEs that need to sink by recording & examining their register units. Register units gives us the following guarantee: "Two registers overlap if and only if they have a common register unit" [MCRegisterInfo.h] Thus we can always identify aliasing DBG_VALUEs if the set of register units read by the DBG_VALUE, and the register units of the instruction being sunk, intersect. (MachineSink already uses classes like "LiveRegUnits" for determining sinking validity anyway). The test added checks for super and subregister DBG_VALUE reads of a sunk copy being sunk as well. Differential Revision: https://reviews.llvm.org/D58191 llvm-svn: 369247	2019-08-19 09:53:07 +00:00
Craig Topper	74168ded03	[TargetLowering] Teach computeRegisterProperties to only widen v3i16/v3f16 vectors to the next power of 2 type if that's legal. These were recently made simple types. This restores their behavior back to something like their EVT legalization. We might be able to fix the code in type legalization where the assert was failing, but I didn't investigate too much as I had already looked at the computeRegisterProperties code during the review for v3i16/v3f16. Most of the test changes restore the X86 codegen back to what it looked like before the recent change. The test case in vec_setcc.ll and is a reduced version of the reproducer from the fuzzer. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16490 llvm-svn: 369205	2019-08-18 06:28:06 +00:00
Craig Topper	f43106e341	[SelectionDAG] Add a node creation debug message to getMachineNode. llvm-svn: 369204	2019-08-18 06:28:00 +00:00
Kang Zhang	b3d258fc44	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: Fix a bug of preducessors. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 369191	2019-08-17 14:37:05 +00:00
Sanjay Patel	acceedb15f	[CodeGenPrepare] Fix use-after-free If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168	2019-08-16 23:10:34 +00:00
Evgeniy Stepanov	187c63f145	Escape % in printf format string. Fixes branch-relax-block-size.mir on the ASan builder. llvm-svn: 369138	2019-08-16 18:23:54 +00:00
Amara Emerson	c809230a69	[AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask. Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135	2019-08-16 18:06:53 +00:00
Guozhi Wei	e03f6a1631	[CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization In function Analysis.cpp:isInTailCallPosition, instructions between call and ret are checked to see if they block tail call optimization. If an instruction is an intrinsic call, only llvm.lifetime_end is allowed and other intrinsic functions block tail call. When compiling tcmalloc, we found llvm.assume between a hot function call and ret, it blocks the optimization. But llvm.assume doesn't generate instructions, it should not block tail call. Differential Revision: https://reviews.llvm.org/D66096 llvm-svn: 369125	2019-08-16 16:26:12 +00:00
Florian Hahn	403e85cbc5	Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks This reverts r368997 (git commit `2a903c0b67`) It looks like this commit adds invalid predecessors to MBBs. The example below fails the verifier after MachineBlockPlacement (run llc -verify-machineinstrs): @global.4 = external constant i8* declare i32 @zot(...) define i16* @snork.67() personality i8* bitcast (i32 (...)* @zot to i8) { bb: invoke void undef() to label %bb5 unwind label %bb4 bb4: ; preds = %bb %tmp = landingpad { i8, i32 } catch i8* null unreachable bb5: ; preds = %bb %tmp6 = load i32, i32* null, align 4 %tmp7 = icmp eq i32 %tmp6, 0 br i1 %tmp7, label %bb14, label %bb8 bb8: ; preds = %bb11, %bb5 invoke void undef() to label %bb9 unwind label %bb11 bb9: ; preds = %bb8 %tmp10 = invoke i16* undef() to label %bb14 unwind label %bb11 bb11: ; preds = %bb9, %bb8 %tmp12 = landingpad { i8, i32 } cleanup catch i8 bitcast (i8** @global.4 to i8) %tmp13 = icmp ult i64 undef, undef br i1 %tmp13, label %bb8, label %bb14 bb14: ; preds = %bb11, %bb9, %bb5 %tmp15 = phi i16 [ null, %bb5 ], [ null, %bb11 ], [ %tmp10, %bb9 ] ret i16* %tmp15 } llvm-svn: 369104	2019-08-16 13:19:29 +00:00
Bjorn Pettersson	9dddd26e31	[DAGCombiner] Add simple folds for SMULFIX/UMULFIX/SMULFIXSAT Summary: Add the following DAGCombiner folds for mulfix being one of SMULFIX/UMULFIX/SMULFIXSAT: (mulfix x, undef, scale) -> 0 (mulfix x, 0, scale) -> 0 Also added canonicalization of constants to RHS. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66052 llvm-svn: 369103	2019-08-16 13:16:48 +00:00
Jeremy Morse	8b593480d3	[DebugInfo] Handle complex expressions with spills in LiveDebugValues In r369026 we disabled spill-recognition in LiveDebugValues for anything that has a complex expression. This is because it's hard to recover the complex expression once the spill location is baked into it. This patch re-enables spill-recognition and slightly adjusts the DBG_VALUE insts that LiveDebugValues tracks: instead of tracking the last DBG_VALUE for a variable, it tracks the last _unspilt_ DBG_VALUE. The spill-restore code is then able to access and copy the original complex expression; but the rest of LiveDebugValues has to be aware of the slight semantic shift, and produce a new spilt location if a spilt location is propagated between blocks. The test added produces an incorrect variable location (see FIXME), which will be the subject of future work. Differential Revision: https://reviews.llvm.org/D65368 llvm-svn: 369092	2019-08-16 10:04:17 +00:00
Volkan Keles	0ae6006bee	[GlobalISel] CSEMIRBuilder: Add support for G_GEP Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070	2019-08-15 23:45:45 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Matt Arsenault	1f2b727298	MVT: Add v3i16/v3f16 vectors AMDGPU has some buffer intrinsics which theoretically could use this. Some of the generated tables include the 3 and 4 element vector versions of these rounded to 64-bits, which is ambiguous. Add these to help the table disambiguate these. Assertion change is for the path odd sized vectors now take for R600. v3i16 is widened to v4i16, which then needs to be promoted to v4i32. llvm-svn: 369038	2019-08-15 18:58:25 +00:00
Philip Reames	d202899431	[NFC] Add a couple of dump routines for RegisterPressure helper classes llvm-svn: 369037	2019-08-15 18:49:39 +00:00
Jeremy Morse	c476124bc8	[DebugInfo] Avoid crash from dropped fragments in LiveDebugValues This patch avoids a crash caused by DW_OP_LLVM_fragments being dropped from DIExpressions by LiveDebugValues spill-restore code. The appearance of a previously unseen fragment configuration confuses LDV, as documented in PR42773, and reproduced by the test function this patch adds (Crashes on a x86_64 debug build). To avoid this, on spill restore, we now use fragment information from the spilt-location-expression. In addition, when spilling, we now don't spill any DBG_VALUE with a complex expression, as it can't be safely restored and will definitely lead to an incorrect variable location. The discussion of this is in D65368. Differential Revision: https://reviews.llvm.org/D66284 llvm-svn: 369026	2019-08-15 17:49:46 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Simon Pilgrim	d4df81f463	Remove SmallBitVector.h include. NFCI. SmallBitVector/BitVector types aren't used at all in the cpp file. llvm-svn: 369008	2019-08-15 14:40:37 +00:00
Simon Pilgrim	983e9118a2	Remove BitVector.h include. NFCI. BitVector type isn't used at all in the cpp file. llvm-svn: 369007	2019-08-15 14:39:28 +00:00
Simon Pilgrim	ed804dad1e	[DAGCombine] MergeConsecutiveStores - fix cppcheck/MSVC extension warning. NFCI. Set the StartIdx type to size_t so that it matches the StoreNodes SmallVector size() and index types. Silences the MSVC analyzer warning that unsigned increment might overflow before exceeding size_t on 64-bit targets - this isn't likely to happen but it means we use consistent types and reduces the warning "noise" a little. llvm-svn: 368998	2019-08-15 13:07:14 +00:00
Kang Zhang	2a903c0b67	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: This patch has trigger a bug of r368339, and the r368339 has been reverted, So upstream this patch again. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368997	2019-08-15 13:05:16 +00:00
Sanjay Patel	57d459309d	[SDAG][x86] check for relaxed math when matching an FP reduction If the last step in an FP add reduction allows reassociation and doesn't care about -0.0, then we are free to recognize that computation as a reduction that may reorder the intermediate steps. This is requested directly by PR42705: https://bugs.llvm.org/show_bug.cgi?id=42705 and solves PR42947 (if horizontal math instructions are actually faster than the alternative): https://bugs.llvm.org/show_bug.cgi?id=42947 Differential Revision: https://reviews.llvm.org/D66236 llvm-svn: 368995	2019-08-15 12:43:15 +00:00
Florian Hahn	de1d6c8220	Add ptrmask intrinsic This patch adds a ptrmask intrinsic which allows masking out bits of a pointer that must be zero when accessing it, because of ABI alignment requirements or a restriction of the meaningful bits of a pointer through the data layout. This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged pointers) and allows us to not lose information about the underlying object. Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune Reviewed by: sanjoy, jdoerfert Differential Revision: https://reviews.llvm.org/D59065 llvm-svn: 368986	2019-08-15 10:12:26 +00:00
Craig Topper	e7ea06b7d2	[SelectionDAGBuilder] Teach gather/scatter getUniformBase to look through vector zeroinitializer indices in addition to scalar zeroes. llvm-svn: 368926	2019-08-14 21:38:56 +00:00
Sanjay Patel	ecccf29e6c	[SDAG] move variable closer to use; NFC llvm-svn: 368905	2019-08-14 19:46:15 +00:00
Taewook Oh	df7022825c	[DebugInfo] Consider debug label scope has an extra lexical block file Summary: There are places where a case that debug label scope has an extra lexical block file is not considered properly. The modified test won't pass without this patch. Reviewers: aprantl, HsiangKai Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66187 llvm-svn: 368891	2019-08-14 17:58:45 +00:00
Jeremy Morse	90c2794bfc	[DebugInfo] MCP: collect and update DBG_VALUEs encountered in local block MCP currently uses changeDebugValuesDefReg / collectDebugValues to find debug users of a register, however those functions assume that all DBG_VALUEs immediately follow the specified instruction, which isn't necessarily true. This is going to become very often untrue when we turn off CodeGenPrepare::placeDbgValues. Instead of calling changeDebugValuesDefReg on an instruction to change its debug users, in this patch we instead collect DBG_VALUEs of copies as we iterate over insns, and update the debug users of copies that are made dead. This isn't a non-functional change, because MCP will now update DBG_VALUEs that aren't immediately after a copy, but refer to the same register. I've hijacked the regression test for PR38773 to test for this new behaviour, an entirely new test seemed overkill. Differential Revision: https://reviews.llvm.org/D56265 llvm-svn: 368835	2019-08-14 12:20:02 +00:00
Fangrui Song	8caa0aaa4d	[AsmPrinter] Delete redundant .type foo, @function when emitting an ifunc In MCAsmStreamer: .type foo,@function # <--- this is redundant .type foo,@gnu_indirect_function In MCELFStreamer, the latter STT_GNU_IFUNC overrides STT_FUNC. llvm-svn: 368823	2019-08-14 10:30:27 +00:00
Aditya Nandakumar	c65ac865c3	[GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong source index https://reviews.llvm.org/D66182 llvm-svn: 368781	2019-08-14 01:23:33 +00:00
Aditya Nandakumar	615eee6402	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources https://reviews.llvm.org/D66171 llvm-svn: 368753	2019-08-13 21:49:11 +00:00
Matt Arsenault	28215caa60	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES Odd sized vectors aren't handled yet. llvm-svn: 368713	2019-08-13 16:26:28 +00:00
Matt Arsenault	690645bda0	GlobalISel: Implement lower for G_SHUFFLE_VECTOR llvm-svn: 368709	2019-08-13 16:09:07 +00:00
Matt Arsenault	0a04a06250	GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR llvm-svn: 368705	2019-08-13 15:52:21 +00:00
Matt Arsenault	5af9cf042f	GlobalISel: Change representation of shuffle masks Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704	2019-08-13 15:34:38 +00:00
Roman Lebedev	676594305a	[CodeGen][SelectionDAG] More efficient code for X % C == 0 (SREM case) Summary: This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. One huge caveat: this signed case is only valid for positive divisors. While we can freely negate negative divisors, we can't negate `INT_MIN`, so for now if `INT_MIN` is encountered, we bailout. As a follow-up, it should be possible to handle that more gracefully via extra `and`+`setcc`+`select`. This passes llvm's test-suite, and from cursory(!) cross-examination the folds (the assembly) match those of GCC, and manual checking via alive did not reveal any issues (other than the `INT_MIN` case) Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65366 llvm-svn: 368702	2019-08-13 14:57:37 +00:00
Roman Lebedev	f4de7eda4a	[TargetLowering][NFC] prepareUREMEqFold(): fixup comment The comment initially matched the code, but the code was incorrect and was fixed after the initial revert back back when it was introduced, but the comment was never updated. llvm-svn: 368701	2019-08-13 14:57:08 +00:00
Hans Wennborg	5390d25f2b	Revert r368276 "[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660	2019-08-13 09:33:25 +00:00
Amara Emerson	e14c91b71a	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652	2019-08-13 06:26:59 +00:00
Aditya Nandakumar	70fdfed45f	[GlobalISel]: Add KnownBits for G_XOR https://reviews.llvm.org/D66119 llvm-svn: 368648	2019-08-13 04:32:33 +00:00
Daniel Sanders	a58a27513b	Eliminate implicit Register->unsigned conversions in VirtRegMap. NFC Summary: This was mostly an experiment to assess the feasibility of completely eliminating a problematic implicit conversion case in D61321 in advance of landing that* but it also happens to align with the goal of propagating the use of Register/MCRegister instead of unsigned so I believe it makes sense to commit it. The overall process for eliminating the implicit conversions from Register/MCRegister -> unsigned was to: 1. Add an explicit conversion to support genuinely required conversions to unsigned. For example, using them as an index for IndexedMap. Sadly it's not possible to have an explicit and implicit conversion to the same type and only deprecate the implicit one so I called the explicit conversion get(). 2. Temporarily annotate the implicit conversion to unsigned with LLVM_ATTRIBUTE_DEPRECATED to make them visible 3. Eliminate implicit conversions by propagating Register/MCRegister/ explicit-conversions appropriately 4. Remove the deprecation added in 2. * My conclusion is that it isn't feasible as there's too much code to update in one go. Depends on D65678 Reviewers: arsenm Subscribers: MatzeB, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65685 llvm-svn: 368643	2019-08-13 00:55:24 +00:00

1 2 3 4 5 ...

26866 Commits