llvm-project

Commit Graph

Author	SHA1	Message	Date
Denis Antrushin	248a67f144	[Statepoint] Turn assert into check in foldPatchpoint. Original D81646 had check for tied regs in foldPatchpoint(). Due to unfortunate miscommunication with review comments and adressing some comments post commit, it turned into assertion. We had an offline talk and agreed that with current implementation this path is possible, so I'm changing it back to check. Note that this is workaround until ussues described in PR46917 are resolved.	2020-08-28 20:00:23 +07:00
Philip Reames	a96fc4638b	Remove deopt and gc transition arguments from gc.statepoint intrinsic (Forgot to land this a couple of weeks back.) In a recent series of changes, I've introduced support for using the respective operand bundle kinds on the statepoint. At the moment, code supports either/or, but there's no need to keep the old support around. For the moment, I am simply changing the specification and verifier to require zero length argument sets in the intrinsic. The intrinsic itself is experimental. Given that, there's no forward serialization needed. The in tree uses and generation have already been updated to use the new operand bundle based forms, the only folks broken by the change will be those with frontends generating statepoints directly and the updates should be easy. Why not go ahead and just remove the arguments entirely? Well, I plan to. But while working on this I've found that almost all of the arguments to the statepoint can be expressed via operand bundles or attributes. Given that, I'm planning a radical simplification of the arguments and figured I'd do one update not several small ones. Differential Revision: https://reviews.llvm.org/D80892	2020-08-14 16:07:40 -07:00
Denis Antrushin	d21ce40821	[Statepoints] Operand folding in presense of tied registers. Implement proper folding of statepoint meta operands (deopt and GC) when statepoint uses tied registers. For deopt operands it is just about properly preserving tiedness in new instruction. For tied GC operands folding is a little bit more tricky. We can fold tied GC operands only from InlineSpiller, because it knows how to properly reload tied def after it was turned into memory operand. Other users (e.g. peephole) cannot properly fold such operands as they do not know how (or when) to reload them from memory. We do this by un-tieing operand we want to fold in InlineSpiller and allowing to fold only untied operands in foldPatchpoint.	2020-08-05 20:18:28 +07:00
James Y Knight	4b0aa5724f	Change the INLINEASM_BR MachineInstr to be a non-terminating instruction. Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block. Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already. Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate. Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D79794	2020-07-01 12:51:50 -04:00
Wang, Pengfei	6eb9eae010	[MS] Copy the symbols assigned to the former instruction when memory folding. The memory folding raplaced the old instruction without copying the symbols assigned. Which will resulted in built fail due to the lost symbols. Reviewed by craig.topper Differential Revision: https://reviews.llvm.org/D78471	2020-06-10 15:38:32 +08:00
hsmahesha	0ed2c04636	[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545	2020-06-01 22:52:34 +05:30
Sam McCall	d10c995b4d	std::isspace -> llvm::isSpace (where locale should be ignored) I've left out some cases where I wasn't totally sure this was right or whether the include was ok (compiler-rt) or idiomatic (flang).	2020-05-02 15:36:04 +02:00
Konstantin Schwarz	1a3e89aa2b	[MIR] Add comments to INLINEASM immediate flag MachineOperands Summary: The INLINEASM MIR instructions use immediate operands to encode the values of some operands. The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature. Reviewers: SjoerdMeijer, arsenm, efriedma Reviewed By: arsenm Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78088	2020-04-16 13:46:14 +02:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
Guillaume Chatelet	b9810988b2	[Alignment][NFC] Transitionning more getMachineMemOperand call sites Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77127	2020-03-31 11:04:10 +00:00
Vedant Kumar	0368b42295	[entry values] ARM: Add a describeLoadedValue override (PR45025) As a narrow stopgap for the assertion failure described in PR45025, add a describeLoadedValue override to ARMBaseInstrInfo and use it to detect copies in which the forwarding reg is a super/sub reg of the copy destination. For the moment this is unsupported. Several follow ups are possible: 1) Handle VORRq. At the moment, we do not, because isCopyInstrImpl returns early when !MI.isMoveReg(). 2) In the case where forwarding reg is a super-reg of the copy destination, we should be able to describe the forwarding reg as a subreg within the copy destination. I'm not 100% sure about this, but it looks like that's what's done in AArch64InstrInfo. 3) In the case where the forwarding reg is a sub-reg of the copy destination, maybe we could describe the forwarding reg using the copy destinaion and a DW_OP_LLVM_fragment (I guess this should be possible after D75036). https://bugs.llvm.org/show_bug.cgi?id=45025 rdar://59772698 Differential Revision: https://reviews.llvm.org/D75273	2020-02-28 14:30:40 -08:00
Sanjay Patel	90fd859f51	[x86] use instruction-level fast-math-flags to drive MachineCombiner The code changes here are hopefully straightforward: 1. Use MachineInstruction flags to decide if FP ops can be reassociated (use both "reassoc" and "nsz" to be consistent with IR transforms; we probably don't need "nsz", but that's a safer interpretation of the FMF). 2. Check that both nodes allow reassociation to change instructions. This is a stronger requirement than we've usually implemented in IR/DAG, but this is needed to solve the motivating bug (see below), and it seems unlikely to impede optimization at this late stage. 3. Intersect/propagate MachineIR flags to enable further reassociation in MachineCombiner. We managed to make MachineCombiner flexible enough that no changes are needed to that pass itself. So this patch should only affect x86 (assuming no other targets have implemented the hooks using MachineIR flags yet). The motivating example in PR43609 is another case of fast-math transforms interacting badly with special FP ops created during lowering: https://bugs.llvm.org/show_bug.cgi?id=43609 The special fadd ops used for converting int to FP assume that they will not be altered, so those are created without FMF. However, the MachineCombiner pass was being enabled for FP ops using the global/function-level TargetOption for "UnsafeFPMath". We managed to run instruction/node-level FMF all the way down to MachineIR sometime in the last 1-2 years though, so we can do better now. The test diffs require some explanation: 1. llvm/test/CodeGen/X86/fmf-flags.ll - no target option for unsafe math was specified here, so MachineCombiner kicks in where it did not previously; to make it behave consistently, we need to specify a CPU schedule model, so use the default model, and there are no code diffs. 2. llvm/test/CodeGen/X86/machine-combiner.ll - replace the target option for unsafe math with the equivalent IR-level flags, and there are no code diffs; we can't remove the NaN/nsz options because those are still used to drive x86 fmin/fmax codegen (special SDAG opcodes). 3. llvm/test/CodeGen/X86/pow.ll - similar to #1 4. llvm/test/CodeGen/X86/sqrt-fastmath.ll - similar to #1, but MachineCombiner does some reassociation of the estimate sequence ops; presumably these are perf wins based on latency/throughput (and we get some reduction of move instructions too); I'm not sure how it affects numerical accuracy, but the test reflects reality better now because we would expect MachineCombiner to be enabled if the IR was generated via something like "-ffast-math" with clang. 5. llvm/test/CodeGen/X86/vec_int_to_fp.ll - this is the test added to model PR43609; the fadds are not reassociated now, so we should get the expected results. 6. llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll - similar to #1 7. llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll - similar to #1 Differential Revision: https://reviews.llvm.org/D74851	2020-02-27 15:19:37 -05:00
Djordje Todorovic	016d91ccbd	[CallSiteInfo] Handle bundles when updating call site info This will address the issue: P8198 and P8199 (from D73534). The methods was not handle bundles properly. Differential Revision: https://reviews.llvm.org/D74904	2020-02-27 13:57:06 +01:00
Sander de Smalen	8fbc925807	Add OffsetIsScalable to getMemOperandWithOffset Summary: Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo, has the effect that all places where this information is used (notably, TargetInstrInfo::getMemOperandWithOffset) will need to consider Scale - and derived, Offset - possibly being scalable. This patch adds a new operand `bool &OffsetIsScalable` to TargetInstrInfo::getMemOperandWithOffset and fixes up all the places where this function is used, to consider the offset possibly being scalable. In most cases, this means bailing out because the algorithm does not (or cannot) support scalable offsets in places where it does some form of alias checking for example. Reviewers: rovka, efriedma, kristof.beyls Reviewed By: efriedma Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72758	2020-02-18 15:53:29 +00:00
Djordje Todorovic	68908993eb	[CSInfo] Use isCandidateForCallSiteEntry() when updating the CSInfo Use the isCandidateForCallSiteEntry(). This should mostly be an NFC, but there are some parts ensuring the moveCallSiteInfo() and copyCallSiteInfo() operate with call site entry candidates (both Src and Dest should be the call site entry candidates). Differential Revision: https://reviews.llvm.org/D74122	2020-02-10 10:03:14 +01:00
Djordje Todorovic	de90d73e03	[DebugInfo] Avoid the call site param for mem instrs with multiple defs We currently only handle mem instructions with a single define. Avoid the call site parameter debug info when we find the case with multiple defs, rather than throwing an assert. Differential Revision: https://reviews.llvm.org/D73954	2020-02-05 10:03:14 +01:00
Jay Foad	e0f0d0e55c	[MachineScheduler] Allow clustering mem ops with complex addresses The generic BaseMemOpClusterMutation calls into TargetInstrInfo to analyze the address of each load/store instruction, and again to decide whether two instructions should be clustered. Previously this had to represent each address as a single base operand plus a constant byte offset. This patch extends it to support any number of base operands. The old target hook getMemOperandWithOffset is now a convenience function for callers that are only prepared to handle a single base operand. It calls the new more general target hook getMemOperandsWithOffset. The only requirements for the base operands returned by getMemOperandsWithOffset are: - they can be sorted by MemOpInfo::Compare, such that clusterable ops get sorted next to each other, and - shouldClusterMemOps knows what they mean. One simple follow-on is to enable clustering of AMDGPU FLAT instructions with both vaddr and saddr (base register + offset register). I've left a FIXME in the code for this case. Differential Revision: https://reviews.llvm.org/D71655	2020-01-22 14:28:24 +00:00
David Green	b891490ceb	[Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC All the callers of this function will be ScheduleDAGMI from the MachineScheduler. This allows us to use the extra info available in ScheduleDAGMI without resorting to awkward casts.	2020-01-15 07:21:44 +00:00
David Green	90555d9253	[Scheduler] Remove superfluous casts. NFC	2020-01-13 16:34:13 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Vedant Kumar	ba71ca3720	[DebugInfo] Describe size of spilled values in call site params A call site parameter description of a memory operand needs to unambiguously convey the size of the operand to prevent incorrect entry value evaluation. Thanks for David Stenberg for pointing this issue out!	2019-11-19 12:03:52 -08:00
Vedant Kumar	1ee84e5ab2	[DebugInfo] Allow spill slots in call site parameter descriptions Allow call site paramter descriptions to reference spill slots. Spill slots are not visible to high-level LLVM IR, so they can safely be referenced during entry value evaluation (as they cannot be clobbered by some other function). This gives a 5% increase in the number of call site parameter DIEs in an LTO x86_64 build of the xnu kernel. This reverts commit `eb4c98ca3d` ( [DebugInfo] Exclude memory location values as parameter entry values), effectively reintroducing the portion of D60716 which dealt with memory locations (authored by Djordje, Nikola, Ananth, and Ivan). This partially addresses llvm.org/PR43343. However, not all memory operands forwarded to callees live in spill slots. In the xnu build, it may be possible to use an escape analysis to increase the number of call site parameter by another 15% (more details in PR43343). Differential Revision: https://reviews.llvm.org/D70254	2019-11-14 12:48:51 -08:00
Djordje Todorovic	8d2ccd1ac3	Reland: [TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-11-08 13:00:39 +01:00
Simon Pilgrim	3842b94c4e	Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375	2019-10-31 18:00:29 +00:00
Djordje Todorovic	57ee0435bd	[TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-10-31 15:34:49 +01:00
Djordje Todorovic	532815dd5c	[ARM][AArch64][DebugInfo] Improve call site instruction interpretation Extend the describeLoadedValue() with support for target specific ARM and AArch64 instructions interpretation. The patch provides specialization for ADD and SUB operations that include a register and an immediate/offset operand. Some of the instructions can operate with global string addresses or constant pool indexes but such cases are omitted since we currently lack flexible support for processing such operands at DWARF production stage. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67556	2019-10-30 13:58:14 +01:00
David Stenberg	74a72e6848	[DebugInfo] Stop describing imms in TargetInstrInfo's describeLoadedValue() impl Summary: The default implementation of the describeLoadedValue() hook uses the MoveImm property to determine if an instruction moves an immediate. If an instruction has that property the function returns the second operand, assuming that that is the immediate value the instruction moves. As far as I can tell, the MoveImm property does not imply that the second operand is the immediate value, nor that any other operand necessarily holds the immediate value; it just means that the instruction moves some immediate value. One example where the second operand is not the immediate is SystemZ's LZER instruction, which moves a zero immediate implicitly: $f0S = LZER. That case triggered an out-of-bound assertion when getting the operand. I have added a test case for that instruction. Another example is ARM's MVN instruction, which holds the logical bitwise NOT'd value of the immediate that is moved. For the following reproducer: extern void foo(int); int main() { foo(-11); } an incorrect call site value would be emitted: $ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \ -c -o - \| ./build/bin/llvm-dwarfdump - \| \ grep -A2 call_site_parameter 0x00000058: DW_TAG_GNU_call_site_parameter DW_AT_location (DW_OP_reg0 R0) DW_AT_GNU_call_site_value (DW_OP_lit10) Another example is the A2_combineii instruction on Hexagon which moves two immediates to a super-register: $d0 = A2_combineii 20, 10. Perhaps these are rare exceptions, and most MoveImm instructions hold the immediate in the second operand, but in my opinion the default implementation of the hook should only describe values that it can, by some contract, guarantee are safe to describe, rather than leaving it up to the targets to override the exceptions, as that can silently result in incorrect call site values. This patch adds X86's relevant move immediate instructions to the target's hook implementation, so this commit should be a NFC for that target. We need to do the same for ARM and AArch64. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D69109	2019-10-23 11:41:29 +02:00
Reid Kleckner	0ad6c191de	Prune Analysis includes from SelectionDAG.h Only forward declarations are needed here. Follow-on to r375311. llvm-svn: 375319	2019-10-19 01:07:48 +00:00
Nikola Prica	98603a8153	[DebugInfo][If-Converter] Update call site info during the optimization During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068	2019-10-08 15:43:12 +00:00
Djordje Todorovic	eb4c98ca3d	[DebugInfo] Exclude memory location values as parameter entry values Abandon describing of loaded values due to safety concerns. Loaded values are described as derefed memory location at caller point. At callee we can unintentionally change that memory location which would lead to different entry being printed value before and after the memory location clobbering. This problem is described in llvm.org/PR43343. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67717 llvm-svn: 373089	2019-09-27 13:52:43 +00:00
Simon Pilgrim	5f2d8b2618	[TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr& Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future. Committed on behalf of @hvdijk (Harald van Dijk) Differential Revision: https://reviews.llvm.org/D66138 llvm-svn: 372882	2019-09-25 14:55:57 +00:00
Simon Pilgrim	0d6684d7e5	TargetInstrInfo::getStackSlotRange - fix "variable used but never read" analyzer warning. NFCI. We don't need to divide the BitSize local variable at all. llvm-svn: 372582	2019-09-23 11:36:24 +00:00
James Molloy	8a74eca398	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463	2019-09-21 08:19:41 +00:00
Mitch Phillips	72a3d8597d	Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit `15e27b0b6d`. llvm-svn: 372425	2019-09-20 20:25:16 +00:00
James Molloy	15e27b0b6d	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376	2019-09-20 08:57:46 +00:00
David Stenberg	8b70139e95	[NFC] Make the describeLoadedValue() hook return machine operand objects Summary: This changes the ParamLoadedValue pair which the describeLoadedValue() hook returns so that MachineOperand objects are returned instead of pointers. When describing call site values we may need to describe operands which are not part of the instruction. One such example is zero-materializing XORs on x86, which I have implemented support for in a child revision. Instead of having to return a pointer to an operand stored somewhere outside the instruction, start returning objects directly instead, as that simplifies the code. The MachineOperand class only holds POD members, and on x86-64 it is 32 bytes large. That combined with copy elision means that the overhead of returning a machine operand object from the hook does not become very large. I benchmarked this on a 8-thread i7-8650U machine with 32 GB RAM. The benchmark consisted of building a clang 8.0 binary configured with: -DCMAKE_BUILD_TYPE=RelWithDebInfo \ -DLLVM_TARGETS_TO_BUILD=X86 \ -DLLVM_USE_SANITIZER=Address \ -DCMAKE_CXX_FLAGS="-Xclang -femit-debug-entry-values -stdlib=libc++" The average wall clock time increased by 4 seconds, from 62:05 to 62:09, which is an 0.1% increase. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: hiraditya, ychen, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67261 llvm-svn: 371332	2019-09-08 14:05:10 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Djordje Todorovic	b9973f87c6	Reland "[DwarfDebug] Dump call site debug info" The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446	2019-07-31 16:51:28 +00:00
Djordje Todorovic	0739ccd3b5	Revert "[DwarfDebug] Dump call site debug info" A build failure was found on the SystemZ platform. This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc. llvm-svn: 365886	2019-07-12 09:45:12 +00:00
Djordje Todorovic	01eaae6dd1	[DwarfDebug] Dump call site debug info Dump the DWARF information about call sites and call site parameters into debug info sections. The patch also provides an interface for the interpretation of instructions that could load values of a call site parameters in order to generate DWARF about the call site parameters. ([13/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 365467	2019-07-09 11:33:56 +00:00
Djordje Todorovic	71d3869f60	[Backend] Keep call site info valid through the backend Handle call instruction replacements and deletions in order to preserve valid state of the call site info of the MachineFunction. NOTE: If the call site info is enabled for a new target, the assertion from the MachineFunction::DeleteMachineInstr() should help to locate places where the updateCallSiteInfo() should be called in order to preserve valid state of the call site info. ([10/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D61062 llvm-svn: 364536	2019-06-27 13:10:29 +00:00
Matt Arsenault	e3a676e9ad	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191	2019-06-24 15:50:29 +00:00
Jonas Paulsson	fdc4ea34e3	[SystemZ, RegAlloc] Favor 3-address instructions during instruction selection. This patch aims to reduce spilling and register moves by using the 3-address versions of instructions per default instead of the 2-address equivalent ones. It seems that both spilling and register moves are improved noticeably generally. Regalloc hints are passed to increase conversions to 2-address instructions which are done in SystemZShortenInst.cpp (after regalloc). Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are the same), foldMemoryOperandImpl() can no longer trivially fold a spilled source register since the reg/reg instruction is now 3-address. In order to remedy this, new 3-address pseudo memory instructions are used to perform the folding only when the dst and lhs virtual registers are known to be allocated to the same physreg. In order to not let MachineCopyPropagation run and change registers on these transformed instructions (making it 3-address), a new target pass called SystemZPostRewrite.cpp is run just after VirtRegRewriter, that immediately lowers the pseudo to a target instruction. If it would have been possibe to insert a COPY instruction and change a register operand (convert to 2-address) in foldMemoryOperandImpl() while trusting that the caller (e.g. InlineSpiller) would update/repair the involved LiveIntervals, the solution involving pseudo instructions would not have been needed. This is perhaps a potential improvement (see Phabricator post). Common code changes: * A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a target pass immediately before MachineCopyPropagation. * VirtRegMap is passed as an argument to foldMemoryOperand(). Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D60888 llvm-svn: 362868	2019-06-08 06:19:15 +00:00
Ulrich Weigand	6c5d5ce551	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663	2019-06-05 22:33:10 +00:00
Matt Arsenault	ca64ef2043	MC: Allow getMaxInstLength to depend on the subtarget Keep it optional in cases this is ever needed in some global context. Currently it's only used for getting an upper bound inline asm code size. For AMDGPU, gfx10 increases the maximum instruction size to 20-bytes. This avoids penalizing older subtargets when estimating code size, and making some annoying branch relaxation test adjustments. llvm-svn: 361405	2019-05-22 16:28:41 +00:00
Austin Kerbow	6e6480e216	[CodeGen] Rename DEBUG_TYPE for default hazard recognizer. Summary: The DEBUG_TYPE of the default hazard recognizer should be updated to match the DEBUG_TYPE of the machine-scheduler pass. Reviewers: rampitec Reviewed By: rampitec Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61359 llvm-svn: 360198	2019-05-07 22:09:04 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00

1 2 3

142 Commits