llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	c6426e2657	[SafeStack,NFC] Remove unneeded branch	2020-06-14 23:05:43 -07:00
Vitaly Buka	7282da1ea8	[SafeStack,NFC] Fix naming style	2020-06-14 23:05:42 -07:00
Vitaly Buka	2f5e535a84	[SafeStack,NFC] Cleanup LiveRange interface	2020-06-14 23:05:42 -07:00
Vitaly Buka	adefa9ca2e	[SafeStack,NFC] "const" cleanup	2020-06-14 23:05:42 -07:00
Vitaly Buka	fb1e0f324f	[SafeStack,NFC] Add BlockLifetimeInfo constructor	2020-06-14 23:05:42 -07:00
Vitaly Buka	645058036a	[SafeStack,NFC] Use IntrinsicInst instead of Instruction	2020-06-14 23:05:41 -07:00
Vitaly Buka	f8e411656e	[SafeStack,NFC] Move ClColoring into SafeStack.cpp This allows to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Vitaly Buka	05590a9cb8	[SafeStack,NFC] Move unconditional code into constructor Prepare to move ClColoring from SafeStackCode to SafeStackLayout. This will allow to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Chen Zheng	bd7096b977	[PowerPC] fma chain break to expose more ILP This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC. // Pattern 1: // A = FADD X, Y (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMA X, M21, M22 // B = FMA Y, M31, M32 // C = FADD A, B // Pattern 2: // A = FMA X, M11, M12 (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMUL M11, M12 // B = FMA X, M21, M22 // D = FMA A, M31, M32 // C = FADD B, D Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D80175	2020-06-15 00:00:04 -04:00
Qiu Chaofan	f8ef7c99a0	[DAGCombiner] Require ninf for division estimation Current implementation of division estimation isn't correct for some cases like 1.0/0.0 (result is nan, not expected inf). And this change exposes a potential infinite loop: we use isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the divisor is some constant. But it doesn't work after legalized on some platforms. This patch restricts the method to act before LegalDAG. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D80542	2020-06-14 22:58:22 +08:00
Amanieu d'Antras	6973125cb7	Fix FastISel dropping srcloc metadata from InlineAsm Summary: Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060 I've also added the Extra_IsConvergent flag which was missing from FastISel. Reviewers: echristo Reviewed By: echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80759	2020-06-13 16:52:37 +01:00
Roman Lebedev	17f7654152	[NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet. This decreases the time consumed by the pass [during RawSpeed unity build] by 25% (0.0586 s -> 0.04388 s). While that isn't really impressive overall, that wasn't the goal here. The memory results here are noticeable. The baseline results are: ``` total runtime: 55.65s. calls to allocation functions: 19754254 (354960/s) temporary memory allocations: 4951609 (88974/s) peak heap memory consumption: 239.13MB peak RSS (including heaptrack overhead): 463.79MB total memory leaked: 198.01MB ``` While with this patch the results are: ``` total runtime: 55.37s. calls to allocation functions: 19068237 (344403/s) # -3.47 % temporary memory allocations: 4261772 (76974/s) # -13.93 % (!!!) peak heap memory consumption: 239.13MB peak RSS (including heaptrack overhead): 463.73MB total memory leaked: 198.01MB ``` So we get rid of a lot of temporary allocations. Using `SmallSet<8>` makes sense to me because at least here for x86 BdVer2, the size of that set is never more than 3, over all of llvm test-suite + RawSpeed. The story might be different on other targets, not sure if it will ever justify whole DenseSet, but if it does SmallDenseSet might be a compromise.	2020-06-12 23:10:54 +03:00
Michael Liao	e7b920e6fe	[DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) Reviewers: arsenm Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81708	2020-06-12 13:53:08 -04:00
Matt Arsenault	350ee7fb3f	GlobalISel: Fix not erasing old instruction in sitofp/uitofp lowering	2020-06-12 10:33:23 -04:00
Simon Pilgrim	5509e2cc2e	[DAG] foldAddSubOfSignBit - add support for non-uniform vector constants	2020-06-12 14:58:15 +01:00
diggerlin	c6be3ea524	[NFC] clean up the AsmPrinter::emitLinkage for AIX part SUMMARY: Since AIX emitLinkage is handled in PPCAIXAsmPrinter::emitLinkage() as of the patch https://reviews.llvm.org/D75866, it no longer goes through AsmPrinter::emitLinkage(), so we clean up some AIX-related code in AsmPrinter::emitLinkage(). Reviewers: Jason liu Differential Revision: https://reviews.llvm.org/D81613	2020-06-11 13:33:51 -04:00
Petar Avramovic	bd3d951b8b	AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16 in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16. Differential Revision: https://reviews.llvm.org/D81666	2020-06-11 18:19:27 +02:00
Dominik Montada	f24e2e9eeb	[GlobalISel] fix crash in IRTranslator, MachineIRBuilder when translating @llvm.dbg.value intrinsic and using -debug Summary: Fix crash when using -debug caused by the GlobalISel observer trying to print an incomplete DBG_VALUE instruction. This was caused by the MachineIRBuilder using buildInstr, which immediately inserts the instruction causing print, instead of using BuildMI to first build up the instruction and using insertInstr when finished. Add RUN-line to existing debug-insts.ll test with -debug flag set to make sure no crash is happening. Also fixed a missing %s in the 2nd RUN-line of the same test. Reviewers: t.p.northover, aditya_nandakumar, aemerson, dsanders, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76934	2020-06-11 10:47:49 +02:00
David Sherwood	bd97342a0c	[CodeGen] Let computeKnownBits do something sensible for scalable vectors Until we have a real need for computing known bits for scalable vectors I have simply changed the code to bail out for now and pretend we know nothing. I've also fixed up some simple callers of computeKnownBits too. Differential Revision: https://reviews.llvm.org/D80437	2020-06-11 08:17:11 +01:00
Matt Arsenault	0671a4c508	RegAllocFast: Avoid unused method warning in release builds	2020-06-10 15:23:56 -04:00
Matt Arsenault	0f2af15c1b	GlobalISel: Make default implementation of legalizeCustom unreachable If the target explicitly requested custom legalization, it should be required to implement this. Also move default legalizeIntrinsic implementation into the header so it's next to the related legalizeCustom.	2020-06-10 11:05:59 -04:00
Wang, Pengfei	6eb9eae010	[MS] Copy the symbols assigned to the former instruction when memory folding. The memory folding replaced the old instruction without copying the symbols assigned to it, which resulted in build failures due to the lost symbols. Reviewed by craig.topper Differential Revision: https://reviews.llvm.org/D78471	2020-06-10 15:38:32 +08:00
diggerlin	edd819c757	[AIX] supporting the visibility attribute for aix assembly SUMMARY: AIX assembly does not have the .hidden and .protected directives. In current LLVM, a function or variable with a visibility attribute generates something like .hidden or .protected, which the AIX assembler cannot recognize. In AIX assembly, the visibility attribute is supported in pseudo-ops such as .extern Name [ , Visibility ] .globl Name [, Visibility ] .weak Name [, Visibility ] In this patch, we implement the visibility attribute for global variables, functions, and extern functions. For example: extern __attribute__ ((visibility ("hidden"))) int bar(int* ip); __attribute__ ((visibility ("hidden"))) int b = 0; __attribute__ ((visibility ("hidden"))) int foo(int* ip){ return (*ip)++; } Visibility for .comm linkage is not supported yet; we will have a separate patch for it. The unsupported cases ("default" and "internal") will also be implemented in a separate patch. Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson Differential Revision: https://reviews.llvm.org/D75866	2020-06-09 16:15:06 -04:00
Matt Arsenault	32823091c3	GlobalISel: Set instr/debugloc before any legalizer action It was annoying enough that every custom lowering needed to set the insert point, but this was made worse since now these all needed to be updated to setInstrAndDebugLoc. Consolidate these so every legalization action has the right insert position by default. This should fix dropping debug info in every custom AMDGPU legalization.	2020-06-09 15:37:02 -04:00
Matt Arsenault	b94c9e3b55	GlobalISel: Improve MachineIRBuilder construction The current relationship between LegalizerHelper and MachineIRBuilder confuses me, because the LegalizerHelper modifies the MachineIRBuilder which it does not own. Constructing a LegalizerHelper destroys the insert point, since the constructor calls setMF, which clears all the fields. Try to separate these functions, so it's possible to construct a LegalizerHelper from an existing MachineIRBuilder without losing the insert point/debug loc.	2020-06-09 15:05:04 -04:00
Matt Arsenault	babbf4441b	GlobalISel: Move some trivial MIRBuilder methods into the header The construction APIs for MachineIRBuilder don't make much sense, and it's been annoying to sort through it with these trivial functions separate from the declaration.	2020-06-09 15:04:48 -04:00
Matt Arsenault	bb6cb6bfe4	GlobalISel: Remove redundant check in verifier This was already checked earlier for all instructions.	2020-06-09 15:04:27 -04:00
Matt Arsenault	6eeac6ae33	GlobalISel: Fix double printing new instructions in legalizer New instructions were getting printed both in createdInstr, and in the final printNewInstrs, so it made it look like the same instructions were created twice. This overall made reading the debug output harder. Stop printing the initial construction and only print new instructions in the summary at the end. This avoids printing the less useful case where instructions are sometimes initially created with no operands. I'm not sure this is the correct instance to remove; now the visible ordering is different. Now you will typically see the one erased instruction message before all the new instructions in order. I think this is the more logical view of typical legalization changes, although it's mechanically backwards from the normal insert-new-erase-old pattern.	2020-06-09 15:02:31 -04:00
David Green	2fea3fe41c	[MachineScheduler] Update available queue on the first mop of a new cycle If a resource can be held for multiple cycles in the schedule model then an instruction can be placed into the available queue, another instruction can be scheduled, but the first will not be taken back out if the two instructions hazard. To fix this make sure that we update the available queue even on the first MOp of a cycle, pushing available instructions back into the pending queue if they now conflict. This happens with some downstream schedules we have around MVE instruction scheduling where we use ResourceCycles=[2] to show the instruction executing over two beats. Apparently the test changes here are OK too. Differential Revision: https://reviews.llvm.org/D76909	2020-06-09 19:13:53 +01:00
Sanjay Patel	702cf93356	[DAGCombiner] allow more folding of fadd + fmul into fma If fmul and fadd are separated by an fma, we can fold them together to save an instruction: fadd (fma A, B, (fmul C, D)), N1 --> fma(A, B, fma(C, D, N1)) The fold implemented here is actually a specialization - we should be able to peek through >1 fma to find this pattern. That's another patch if we want to try that enhancement though. This transform was guarded by the TLI hook enableAggressiveFMAFusion(), so it was done for some in-tree targets like PowerPC, but not AArch64 or x86. The hook is protecting against forming a potentially more expensive computation when fma takes longer to execute than a single fadd. That hook may be needed for other transforms, but in this case, we are replacing fmul+fadd with fma, and the fma should never take longer than the 2 individual instructions. 'contract' FMF is all we need to allow this transform. That flag corresponds to -ffp-contract=fast in Clang, so we are allowed to form fma ops freely across expressions. Differential Revision: https://reviews.llvm.org/D80801	2020-06-09 10:41:27 -04:00
Guillaume Chatelet	800e100588	Revert "[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess" This reverts commit `f21c52667e`.	2020-06-09 10:43:59 +00:00
Simon Wallis	4dba59689d	[ARM] prologue instructions emitted for naked function with >64 byte argument Summary: The naked function attribute is meant to suppress all function prologue/epilogue instructions. On ARM, some are still emitted if an argument greater than 64 bytes in size (the threshold for using the byval attribute in IR) is passed partially in registers. Perform the check for Attribute::Naked and early exit in SelectionDAGISel::LowerArguments(). Checking in ARMFrameLowering::determineCalleeSaves() is too late. A test case is included. Reviewers: llvm-commits, olista01, danielkiss Reviewed By: danielkiss Subscribers: kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D80715 Change-Id: Icedecf2a4ad31bc3c35ab0df7489a9d346e1f7cc	2020-06-09 11:33:03 +01:00
Guillaume Chatelet	3b6196c9b3	[Alignment][NFC] TargetLowering::allowsMisalignedMemoryAccesses Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMisalignedMemoryAccesses` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81374	2020-06-09 10:17:42 +00:00
Guillaume Chatelet	f21c52667e	[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMemoryAccess` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81379	2020-06-09 10:11:07 +00:00
Guillaume Chatelet	e26ed6bdae	Fix unused variable warning	2020-06-09 08:56:05 +00:00
Kang Zhang	1b6602275d	[MachineVerifier] Add TiedOpsRewritten flag to fix verify two-address error Summary: Currently, MachineVerifier will attempt to verify that tied operands satisfy register constraints as soon as the function is no longer in SSA form. However, PHIElimination will take the function out of SSA form while TwoAddressInstructionPass will actually rewrite tied operands to match the constraints. PHIElimination runs first in the pipeline. Therefore, whenever the MachineVerifier is run after PHIElimination, it will encounter verification errors on any tied operands. This patch adds a function property called TiedOpsRewritten that will be set by TwoAddressInstructionPass and will control when the verifier checks tied operands. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D80538	2020-06-09 07:39:42 +00:00
David Sherwood	cc8872400c	[CodeGen] Ensure callers of CreateStackTemporary use sensible alignments In two instances of CreateStackTemporary we are sometimes promoting alignments beyond the stack alignment. I have introduced a new function called getReducedAlign that will return the alignment for the broken down parts of illegal vector types. For example, on NEON a <32 x i8> type is made up of two <16 x i8> types - in this case the sensible alignment is 16 bytes, not 32. In the legalization code wherever we create stack temporaries I have started using the reduced alignments instead for illegal vector types. I added a test to CodeGen/AArch64/build-one-lane.ll that tries to insert an element into an illegal fixed vector type that involves creating a temporary stack object. Differential Revision: https://reviews.llvm.org/D80370	2020-06-09 08:10:17 +01:00
Yonghong Song	3eb465a329	[DebugInfo] Fix assertion for extern void type Commit `d77ae1552f` ("[DebugInfo] Support to emit debugInfo for extern variables") added support to emit debuginfo for extern variables. Currently, only the BPF target emits debuginfo for extern variables. But if the extern variable has "void" type, the compilation will fail. -bash-4.4$ cat t.c extern void bla; void *test() { void *x = &bla; return x; } -bash-4.4$ clang -target bpf -g -O2 -S t.c missing global variable type !1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1, isLocal: false, isDefinition: false) ... fatal error: error in backend: Broken module found, compilation aborted! PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: ... The IR requires that a DIGlobalVariable have a valid type, and the "void" type does not generate any type, hence the above fatal error. Note that if the extern variable is defined as "const void", the compilation will succeed. -bash-4.4$ cat t.c extern const void bla; const void *test() { const void *x = &bla; return x; } -bash-4.4$ clang -target bpf -g -O2 -S t.c -bash-4.4$ cat t.ll ... !1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1, type: !6, isLocal: false, isDefinition: false) !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null) ... Since currently, "const void extern_var" is supported by the debug info, it is natural that "void extern_var" should also be supported. This patch disables the assertion for "void extern_var" in the IR verifier and adds proper guarding when emitting a potentially null debug info type to dwarf types. Differential Revision: https://reviews.llvm.org/D81131	2020-06-08 13:43:18 -07:00
Andrew Litteken	bb677cacc8	[SuffixTree][MachOpt] Factoring out Suffix Tree and adding Unit Tests This moves the SuffixTree test used in the Machine Outliner and moves it into Support for use in other outliners elsewhere in the compilation pipeline. Differential Revision: https://reviews.llvm.org/D80586	2020-06-08 12:44:18 -07:00
Hendrik Greving	f3d8a93970	[ModuloSchedule] Support instructions with > 1 destination when walking canonical use. Fixes a minor bug that led to finding the wrong register if the definition had more than one register destination.	2020-06-08 11:43:59 -07:00
Jan-Willem Maessen	3610d31e7a	[NFC] Fix quadratic LexicalScopes::constructScopeNest We sometimes have functions with large numbers of sibling basic blocks (usually with an error path exit from each one). This was triggering the quadratic behavior in this function - after visiting each child llvm would re-scan the parent from the beginning again. We modify the work stack to record the next index to be worked on alongside the pointer. This avoids the need to linearly search for the next unfinished child. Differential Revision: https://reviews.llvm.org/D80029	2020-06-08 18:40:56 +01:00
Christopher Tetreault	caa2fddce7	[SVE] Eliminate calls to default-false VectorType::get() from CodeGen Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet Reviewed By: spatel, gchatelet Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80313	2020-06-08 10:26:10 -07:00
Guillaume Chatelet	54076610dc	[Alignment][NFC] Deprecate dead code from CallingConvLower.h Summary: This is a followup on D81196. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81362	2020-06-08 14:49:39 +00:00
Matt Arsenault	5f7e38d8f4	GlobalISel: Use Register	2020-06-08 10:15:53 -04:00
Matt Arsenault	f13ba22227	GlobalISel: Remove unused header	2020-06-08 10:15:53 -04:00
Matt Arsenault	f41994f85b	GlobalISel: Make it clearer that regbank/class are mutually exclusive	2020-06-08 10:15:53 -04:00
Matt Arsenault	c1d771dc4b	GlobalISel: Simplify debug printing	2020-06-08 10:15:53 -04:00
Guillaume Chatelet	94b0c32a0b	[Alignment][NFC] Migrate HandleByVal to Align Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::HandleByVal` without marking it `override`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81365	2020-06-08 10:50:27 +00:00
Sander de Smalen	ae09670ee4	[CodeGen][SVE] CopyToReg: Split scalable EVTs that are not powers of 2 Scalable vectors cannot use 'BUILD_VECTOR', so it is necessary to properly split and widen scalable vectors when passing them to CopyToReg/CopyFromReg. This functionality is added to TargetLoweringBase::getVectorTypeBreakdown(). This patch only adds support for 'splitting' scalable vectors that are a multiple of some legal type, e.g. <vscale x 6 x i64> -> 3 x <vscale x 2 x i64> Reviewers: efriedma, c-rhodes Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80139	2020-06-08 10:39:18 +01:00
James Y Knight	748d92b4d3	Simplify MachineVerifier's block-successor verification. There are two properties we want to verify: 1. That the successors returned by analyzeBranch are in the CFG successor list, and 2. That there are no extraneous successors in the CFG successor list. The previous implementation mostly accomplished this, but in a very convoluted manner. Differential Revision: https://reviews.llvm.org/D79793	2020-06-06 22:30:51 -04:00
James Y Knight	1978309db1	MachineBasicBlock::updateTerminator now requires an explicit layout successor. Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky prospect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.) Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks. Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there. Differential Revision: https://reviews.llvm.org/D79605	2020-06-06 22:30:51 -04:00
Simon Pilgrim	f14d4c9c54	EHPersonalities.h - reduce Triple.h include to forward declaration. NFC. Move implicit include dependencies down to source files.	2020-06-06 15:48:31 +01:00
Sanjay Patel	302cc8a121	[DAGCombiner] clean-up FMA+FMUL folds; NFC D80801 suggests some readability improvements before moving this block.	2020-06-06 10:32:54 -04:00
Nikita Popov	cb5724c71e	[CGP] Remove unnecessary MaybeAlign use (NFC) Stores now always have an alignment.	2020-06-05 23:18:26 +02:00
Matt Arsenault	eaa8af9322	GlobalISel: Add helper for constructing load from offset	2020-06-05 15:06:03 -04:00
Matt Arsenault	45e1a22a92	GlobalISel: Make known bits/alignment API more consistent Just computing the alignment makes sense without caring about the general known bits, such as for non-integral pointers. Separate the two and start calling into the TargetLowering hooks for frame indexes. Start calling the TargetLowering implementation for FrameIndexes, which improves the AMDGPU matching for stack addressing modes. Also introduce a new hook for returning known alignment of target instructions. For AMDGPU, it would be useful to report the known alignment implied by certain intrinsic calls. Also stop using MaybeAlign.	2020-06-05 14:57:22 -04:00
Nikita Popov	d370088611	[LiveDebugValues] Fix output stream (NFC) This should dump to the provided Out, rather than dbgs(), though they coincide in current usage.	2020-06-05 20:02:22 +02:00
Nikita Popov	6a53264926	[LiveDebugValues] Remove PendingInLocs (NFC) PendingInLocs ends up having the same value as InLocs, just computed a bit more indirectly. It is a leftover of a previous implementation approach. This patch drops PendingInLocs, as well as the Diff and Removed calculations, which are no longer needed. Differential Revision: https://reviews.llvm.org/D80868	2020-06-05 20:01:29 +02:00
Sander de Smalen	937cb7a8c7	Reland D80640: [CodeGen][SVE] Calculate correct type legalization for scalable vectors. This reverts commit `9bcef270d7`.	2020-06-05 18:09:31 +01:00
Sander de Smalen	9bcef270d7	Revert "[CodeGen][SVE] Calculate correct type legalization for scalable vectors." Seems to break some buildbots, reverting the patch for now. This reverts commit `164f4b9d26`.	2020-06-05 16:03:52 +01:00
Sander de Smalen	164f4b9d26	[CodeGen][SVE] Calculate correct type legalization for scalable vectors. This patch updates TargetLoweringBase::computeRegisterProperties and TargetLoweringBase::getTypeConversion to support scalable vectors, and make the right calls on how to legalise them. These changes are required to legalise both MVTs and EVTs. Reviewers: efriedma, david-arm, ctetreau Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80640	2020-06-05 15:20:34 +01:00
Denis Antrushin	dae64d8f42	Fix build breakage caused by `66a1b83bf9`	2020-06-05 15:53:09 +03:00
Denis Antrushin	66a1b83bf9	[TargetLowering][NFC] More efficient emitPatchpoint(). Current implementation of emitPatchpoint() is very inefficient: for every FrameIndex operand it creates a new MachineInstr with that operand expanded and all others copied as is. Since PATCHPOINT/STATEPOINT instructions may have a lot of FrameIndex operands, we end up creating and erasing many machine instructions. But we can do it in a single pass, with only one new machine instruction generated. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D81181	2020-06-05 14:57:29 +03:00
Kerry McLaughlin	89fc0166f5	[CodeGen][SVE] Legalisation of extends with scalable types Summary: This patch adds legalisation of extensions where the operand of the extend is a legal scalable type but the result is not. EXTRACT_SUBVECTOR is used to split the result, before being replaced by target-specific [S\|U]UNPK[HI\|LO] operations. For example: ``` zext <vscale x 16 x i8> %a to <vscale x 16 x i16> ``` should emit: ``` uunpklo z2.h, z0.b uunpkhi z1.h, z0.b ``` Reviewers: sdesmalen, efriedma, david-arm Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79587	2020-06-05 12:08:42 +01:00
Philip Reames	4c735439fd	[Statepoint] Migrate a few tests to gc-live bundle format and fix assert The assert was missed in `0e7c7705`, migrating the test revealed the problem.	2020-06-04 18:15:58 -07:00
Vedant Kumar	198762680e	[LiveDebugValues] Cache LexicalScopes::getMachineBasicBlocks, NFCI Summary: Cache the results from getMachineBasicBlocks in LexicalScopes to speed up UserValueScopes::dominates queries. This replaces the caching done in UserValueScopes. Compared to the old caching method, this reduces memory traffic when a VarLoc is copied (e.g. when a VarLocMap grows), and enables caching across basic blocks. When compiling sqlite 3.5.7 (CTMark version), this patch reduces the number of calls to getMachineBasicBlocks from 10,207 to 1,093. I also measured a small compile-time reduction (~ 0.1% of total wall time, on average, on my machine). As a drive-by, I made the DebugLoc in UserValueScopes a const reference to cut down on MetadataTracking traffic. Reviewers: jmorse, Orlando, aprantl, nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80957	2020-06-04 16:58:45 -07:00
Matt Arsenault	af867b7850	DAG: Change computeKnownBitsForFrameIndex to be usable by GISel This wasn't getting much value from the DAG or depth arguments, since it's only called on the frame index root nodes. FrameIndexes can also only return a scalar value, so it also didn't need DemandedElts.	2020-06-04 10:50:26 -04:00
Matt Arsenault	931a68f26b	RegAllocFast: Remove dead code	2020-06-04 09:38:31 -04:00
Sanjay Patel	652b3757c8	[x86] add test/code comment for chain value use (PR46195); NFC	2020-06-04 09:15:17 -04:00
Simon Pilgrim	adf10dcf2e	[DAG] scalarizeBinOpOfSplats - extract from the source of splat vector (PR46189) D79003/rG9fa58d1bf2f8 exposed an issue with scalarizeBinOpOfSplats that we were extracting from the splatted vector result instead of the source, the splat index is only valid for the source vector not the result, which may contain undefs, including at the splat index.	2020-06-04 11:58:59 +01:00
Tim Northover	87e24c3200	Revert "[DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI" This reverts commit `21dadd774f`. In at least PromoteIntBinOps, they wanted to know about users of all values produced by the node not just the integer being promoted. For example not replacing chain users if the operation was a load breaks the ordering of the DAG.	2020-06-04 11:53:14 +01:00
Madhur Amilkanthwar	b3cff3c720	Utility to dump .dot representation of SelectionDAG without firing viewer Summary: This patch adds support for dumping .dot representation of SelectionDAG. It is motivated by the fact that a developer may want to just dump the graph at a predictable path with a simple name to compare. The existing utilities (i.e. viewGraph) are overkill for this purpose, hence this patch adds the required support while using the core routines from GraphWriter. Example usage: DAG.dumpDotGraph("/tmp/graph.dot", "MyGraph") will create /tmp/graph.dot file when DAG is an object of SelectionDAG class. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D80711	2020-06-04 11:51:48 +05:30
Philip Reames	ab6779bbd8	[Statepoint] Remove last of old ImmutableStatepoint code To do so, I had to sink the old school inline operand handling into GCStatepointInst which is non ideal. This code should be removed shortly and I was able to at least clean it up a bunch.	2020-06-03 20:31:17 -07:00
Philip Reames	91dd2f2536	[Statepoint] Delete more dead code from old wrappers The verify() routine duplicates IR/Verifier.cpp checks, so while not technically dead it doesn't add any value either.	2020-06-03 20:10:30 -07:00
Matt Arsenault	ed5017e153	GlobalISel: Start defining strict FP instructions The AMDGPU lowering for unconstrained G_FDIV sometimes needs to introduce a mode switch in the middle, so it's helpful to have constrained instructions available to legalize this. Right now nothing is preventing reordering of the mode switch with the other instructions in the expansion.	2020-06-03 20:46:37 -04:00
Quentin Colombet	ccb3c8e861	[RegisterCoalescer] Update empty subranges when rematerializing When we rematerialize a value as part of the coalescing, we may widen the register class of the destination register. When this happens, updateRegDefUses may create additional subranges to account for the wider register class. The created subranges are empty and if they are not defined by the rematerialized instruction we clean them up. However, if they are defined by the rematerialized instruction but unused, we failed to flag them as dead definition and would leave them as empty live-range. This is wrong because empty live-ranges don't interfere with anything, thus if we don't fix them, we would fail to account that the rematerialized instruction clobbers some lanes. E.g., let us consider the following pseudo code: def.lane_low64:reg128 = ldimm newdef:reg32 = COPY def.lane_low64_low32 When rematerialization happens for newdef, we end up with: newdef.lane_low64:reg128 = ldimm = use newdef.lane_low64_low32 Let's look at the live interval of newdef. Before rematerialization, we would get: newdef [defIdx, useIdx:0) 0@defIdx Right after updateRegDefUses, newdef register class is widen to reg128 and the subrange definitions will be augmented to fill the subreg that is used at the definition point, here lane_low64. The resulting live interval would be: newdef [newDefIdx, useIdx:0) 0@newDefIdx * lane_low64_high32 EMPTY * lane_low64_low32 [newDefIdx, useIdx:0) Before this patch this would be the final status of the live interval. Therefore we miss that lane_low64_high32 is actually live on the definition point of newdef. With this patch, after rematerializing, we check all the added subranges and for the ones that are defined but empty, we flag them as dead def. Thus, in that case, newdef would look like this: newdef [newDefIdx, useIdx:0) 0@newDefIdx * lane_low64_high32 [newDefIdx, newDefIdxDead) ; <-- instead of EMPTY * lane_low64_low32 [newDefIdx, useIdx:0) This fixes https://www.llvm.org/PR46154	2020-06-03 17:10:55 -07:00
Matt Arsenault	3866e0a563	GlobalISel: Fail expansion of G_DYN_STACKALLOC for StackGrowsUp	2020-06-03 19:56:07 -04:00
Philip Reames	382b3023cb	[Statepoints][CGP] Minor parameter type cleanup	2020-06-03 16:00:38 -07:00
Matt Arsenault	66251f7e1d	RegAllocFast: Record internal state based on register units Record internal state based on register units. This is often more efficient as there are typically fewer register units to update compared to iterating over all the aliases of a register. Original patch by Matthias Braun, but I've been rebasing and fixing it for almost 2 years and fixed a few bugs causing intermediate failures to make this patch independent of the changes in https://reviews.llvm.org/D52010.	2020-06-03 16:51:46 -04:00
Victor Huang	3abe7aca45	[CodeGen] Enable tail call position check for speculatable functions In the function "Analysis.cpp:isInTailCallPosition", it only checks whether a call is in a tail call position if the call has side effects, accesses memory, or is not safe to speculatively execute. Therefore, a speculatable function will not go through the tail call position check and can be improperly tail called when it is not in a tail-call position. This patch enables tail call position check for speculatable functions. Differential Revision: https://reviews.llvm.org/D80661	2020-06-03 10:37:45 -05:00
Kang Zhang	2cc77b2b8a	[LiveVariables] Don't set undef reg PHI used as live for FromMBB Summary: In the patch D73152, it adds a new function LiveVariables::addNewBlock. This new function will add the reg which PHI used to the MBB which reg is from. But the new function may cause LiveVariable Verification failed when the Src reg in PHI is undef. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D80077	2020-06-03 15:25:30 +00:00
Henry Kao	c57e41c000	[CodeGen][SVE] Replace deprecated calls in getCopyFromPartsVector() Summary: Replaced getVectorNumElements() with getVectorElementCount(). Added operator overloads for class ElementCount. Fixes warning in several AArch64 unit tests. Reviewers: sdesmalen, kmclaughlin, dancgr, efriedma, each, andwar, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80826	2020-06-03 11:20:02 -04:00
Simon Pilgrim	ea80b40669	[DAG] SimplifyDemandedBits - peek through SHL if we only demand sign bits. If we're only demanding the (shifted) sign bits of the shift source value, then we can use the value directly. This handles SimplifyDemandedBits/SimplifyMultipleUseDemandedBits for both ISD::SHL and X86ISD::VSHLI. Differential Revision: https://reviews.llvm.org/D80869	2020-06-03 16:11:54 +01:00
Simon Pilgrim	c438b257f1	[DAG] GetDemandedBits - don't bother asserting for a non-null cast<> result. NFC. cast<> will assert on failure anyhow. This lets us fold the cast<> with the getAPIntValue() that uses it.	2020-06-03 12:43:07 +01:00
Simon Pilgrim	7a96c181d0	TargetFrameLowering.h - remove unnecessary includes. NFC. Move TargetFrameLowering.h include to the top of the TargetFrameLoweringImpl.cpp includes (clang-format doesn't do this by default as the filenames don't match).	2020-06-03 11:12:42 +01:00
Kadir Cetinkaya	c5468253aa	[llvm] Fix unused variable warnings	2020-06-03 11:49:01 +02:00
Djordje Todorovic	dd1bc59b72	[CSInfo][MIPS][DwarfDebug] Add support for delay slots This adds call site info support for call instructions with delay slot. Search for instructions inside call delay slot, which load value into parameter forwarding registers. Return address of the call points to instruction after call delay slot, which is not the one, immediately after the call instruction. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D78107	2020-06-03 11:25:17 +02:00
Eric Christopher	153a24ab0f	Undo initialization of TRI in CGP as this is unconditionally initialized later.	2020-06-02 15:08:54 -07:00
Kadir Cetinkaya	af86a10bad	[llvm] Fix unused variable warning	2020-06-02 22:46:24 +02:00
Eric Christopher	971459c3ef	Fix up clang-tidy warnings around null and pointers.	2020-06-02 13:24:20 -07:00
Amy Kwan	a3ada630d8	[DAGCombiner] Combine shifts into multiply-high This patch implements a target independent DAG combine to produce multiply-high instructions from shifts. This DAG combine will combine shifts for any type as long as the MULH on the narrow type is legal. For now, it is enabled on PowerPC as PowerPC is the only target that has an implementation of the isMulhCheaperThanMulShift TLI hook introduced in D78271. Moreover, this DAG combine focuses on catching the pattern: (shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>) to produce mulhs when we have a sign-extend, and mulhu when we have a zero-extend. The patch performs the following checks: - Operation is a right shift arithmetic (sra) or logical (srl) - Input to the shift is a multiply - Both operands to the shift are sext/zext nodes - The extends into the multiply are both the same - The narrow type is half the width of the wide type - The shift amount is the width of the narrow type - The respective mulh operation is legal Differential Revision: https://reviews.llvm.org/D78272	2020-06-02 15:22:48 -05:00
Djordje Todorovic	4e8e5d60b4	[CSInfo][NFC] Interpret loaded parameter value separately The collectCallSiteParameters() method searches for instructions which load values into registers used for parameters passing. Previously, interpretation of those values, loaded by one such instruction, was implemented inside collectCallSiteParameters() method. This patch moves the interpretation code from collectCallSiteParameters() method into a separate static method named interpretValue. New method is called from collectCallSiteParameters() to process each instruction from targeted instruction scope. The collectCallSiteParameters() searches for loaded parameter value among instructions which precede the call instruction, inside the same basic block. When needed, new method (interpretValue) could be used for searching any instruction scope. This is preparation for search of parameter value, loaded inside call delay slot. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D78106	2020-06-02 13:05:04 +02:00
Sriraman Tallam	e0bca46b08	Options for Basic Block Sections, enabled in D68063 and D73674. This patch adds clang options: -fbasic-block-sections={all,<filename>,labels,none} and -funique-basic-block-section-names. LLVM Support for basic block sections is already enabled. + -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic block sections for all or a subset of basic blocks. "labels" only enables basic block symbols. + -funique-basic-block-section-names: Enables unique section names for basic block sections, disabled by default. Differential Revision: https://reviews.llvm.org/D68049	2020-06-02 00:23:32 -07:00
Denis Antrushin	fa818ded24	[StatepointLowering] Handle UNDEF gc values. Do not spill UNDEF GC values. Instead, replace corresponding gc.relocate intrinsic with an (arbitrary, but recognizable) constant. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D80714	2020-06-02 10:18:33 +03:00
Richard Smith	4ccb6c36a9	Fix violations of [basic.class.scope]p2. These cases all follow the same pattern: struct A { friend class X; //... class X {}; }; But 'friend class X;' injects 'X' into the surrounding namespace scope, rather than introducing a class member. So the second 'class X {}' is a completely different type, which changes the meaning of the earlier name 'X' from '::X' to 'A::X'. Additionally, the friend declaration is pointless -- members of a class don't need to be befriended to be able to access private members.	2020-06-01 22:03:05 -07:00
Vedant Kumar	776708b00b	[LiveDebugValues] Remove early-exit when testing regmasks, NFC In transferRegisterDef, if the instruction has a regmask attached, we'll check if any currently used register is clobbered by the regmask. The early exit in this scan isn't necessary, costs a set lookup, and is almost never taken [1]. Delete it. [1] http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136	2020-06-01 15:16:10 -07:00
Vedant Kumar	11c617c417	[LiveDebugValues] Add LocIndex::u32_{location,index}_t types for readability, NFC This is per Adrian's suggestion in https://reviews.llvm.org/D80684.	2020-06-01 11:02:36 -07:00
Vedant Kumar	2ecaf93525	[LiveDebugValues] Speed up removeEntryValue, NFC Summary: Instead of iterating over all VarLoc IDs in removeEntryValue(), just iterate over the interval reserved for entry value VarLocs. This changes the iteration order, hence the test update -- otherwise this is NFC. This appears to give an ~8.5x wall time speed-up for LiveDebugValues when compiling sqlite3.c 3.30.1 with a Release clang (on my machine): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- Before: 2.5402 ( 18.8%) 0.0050 ( 0.4%) 2.5452 ( 17.3%) 2.5452 ( 17.3%) Live DEBUG_VALUE analysis After: 0.2364 ( 2.1%) 0.0034 ( 0.3%) 0.2399 ( 2.0%) 0.2398 ( 2.0%) Live DEBUG_VALUE analysis ``` The change in removeEntryValue() is the only one that appears to affect wall time, but for consistency (and to resolve a pending TODO), I made the analogous changes for iterating over SpillLocKind VarLocs. Reviewers: nikic, aprantl, jmorse, djtodoro Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80684	2020-06-01 11:02:36 -07:00
Matt Arsenault	836c7dcf12	DAG: Fix getNode dropping flags if there's a glue output The AMDGPU non-strict fdiv lowering needs to introduce an FP mode switch in some cases, and has custom nodes to provide chain/glue for the intermediate FP operations. We need to propagate nofpexcept here, but getNode was dropping the flags. Adding nofpexcept in the AMDGPU custom lowering is left to a future patch. Also fix a second case where flags were dropped, but in this case it seems it just didn't handle this number of operands. Test will be included in future AMDGPU patch.	2020-06-01 13:48:02 -04:00
hsmahesha	0ed2c04636	[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545	2020-06-01 22:52:34 +05:30


28829 Commits