llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	9607ccf626	GlobalISel: Remove leftover lit.local.cfg The global-isel feature has been required for a long time and was removed in `c9455d3c57`, so this was causing all tests to be skipped.	2020-08-27 13:49:06 -04:00
Mikhail Maltsev	ae1396c7d4	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a holdover from an implementation that lacked a bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (fewer useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Lucas Prates	3d943bcd22	[CodeGen] Properly propagating Calling Convention information when lowering vector arguments When joining the legal parts of vector arguments into their original value during the lowering of formal arguments in SelectionDAGBuilder, the Calling Convention information was not being propagated for the handling of each individual part. The same did not happen when lowering calls, causing a mismatch. This patch fixes the issue by properly propagating the Calling Convention details. This fixes Bugzilla #47001. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86715	2020-08-27 17:01:10 +01:00
Sam Parker	03141aa04a	[ARM] Enable outliner at -Oz for M-class Enable default outlining when the function has the minsize attribute and we're targeting an m-class core. Differential Revision: https://reviews.llvm.org/D82951	2020-08-27 08:02:56 +01:00
Sam Parker	a3e41d4581	[ARM] Make MachineVerifier more strict about terminators Fix the ARM backend's analyzeBranch so it doesn't ignore predicated return instructions, and make the MachineVerifier rule more strict. Differential Revision: https://reviews.llvm.org/D40061	2020-08-27 07:10:20 +01:00
Yvan Roux	0459f29e8b	[ARM][MachineOutliner] Add default mode. Use the stack to save and restore the link register when there is no available register to do it. Differential Revision: https://reviews.llvm.org/D76069	2020-08-20 09:25:33 +02:00
Dávid Bolvanský	0f14b2e6cb	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `50c743fa71`. Patch will be split to smaller ones.	2020-08-17 20:44:33 +02:00
Ben Shi	05047f0b36	[ARM][test] Add more tests of two-part immediates The ARM backend breaks some specific immediates into two parts in binary operations. This patch adds more tests for that. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D84100	2020-08-14 23:11:01 +08:00
Ben Dunbobbin	4cb016cd2d	[X86][ELF] Prefer lowering MC_GlobalAddress operands to .Lfoo$local for STV_DEFAULT only This patch restricts the behaviour of referencing via .Lfoo$local local aliases, introduced in https://reviews.llvm.org/D73230, to STV_DEFAULT globals only. Hidden symbols via -fvisibility=hidden (https://gcc.gnu.org/wiki/Visibility) are an important scenario. Benefits: - Improves the size of object files by using fewer STT_SECTION symbols. - The code reads a bit better (it was not obvious to me without going back to the code reviews why the canBenefitFromLocalAlias function currently doesn't consider visibility). - There is also a side benefit in restoring the effectiveness of the --wrap linker option and making the behavior of --wrap consistent between LTO and normal builds for references within a translation-unit. Note: this --wrap behavior (which is specific to LLD) should not be considered reliable. See comments on https://reviews.llvm.org/D73230 for more. Differential Revision: https://reviews.llvm.org/D85782	2020-08-14 00:09:15 +01:00
Dávid Bolvanský	50c743fa71	[BPI] Improve static heuristics for integer comparisons As with pointers, a == b is usually false for integers too. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 19:54:27 +02:00
Dávid Bolvanský	f9264995a6	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `44587e2f7e`. Sanitizer tests need to be updated.	2020-08-13 14:37:40 +02:00
Dávid Bolvanský	44587e2f7e	[BPI] Improve static heuristics for integer comparisons As with pointers, a == b is usually false for integers too. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 14:23:58 +02:00
Dávid Bolvanský	a0485421d2	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `385c9d673f`.	2020-08-13 12:59:15 +02:00
Dávid Bolvanský	385c9d673f	[BPI] Improve static heuristics for integer comparisons As with pointers, a == b is usually false for integers too. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 12:45:40 +02:00
Sam Parker	4f9f4b21e0	[ARM] Unrestrict Armv8-a IT when at minsize IT blocks with more than one instruction were performance deprecated in Armv8 but that doesn't mean we should follow that advice when optimising for size. Differential Revision: https://reviews.llvm.org/D85638	2020-08-10 14:59:53 +01:00
Simon Pilgrim	66a163f328	[DAG] GetDemandedBits - remove custom AND handling. As mentioned on D85463, we should be using SimplifyMultipleUseDemandedBits (which is the default fallback). The minor regression in illegal-bitfield-loadstore.ll will be addressed properly by D77804.	2020-08-07 12:55:47 +01:00
Meera Nakrani	20283ff491	[ARM] Generated SSAT and USAT instructions with shift Added patterns so that both SSAT and USAT instructions are generated with shifts. Added corresponding regression tests. Differential Revision: https://reviews.llvm.org/D85120	2020-08-04 09:38:17 +00:00
Simon Wallis	6a05c6bfc8	[MachineCopyPropagation] BackwardPropagatableCopy: add check for hasOverlappingMultipleDef In MachineCopyPropagation::BackwardPropagatableCopy(), a check is added for multiple destination registers. The copy propagation is avoided if the copied destination register is the same register as another destination on the same instruction. A new test is added. This used to fail on ARM like this: error: unpredictable instruction, RdHi and RdLo must be different umull r9, r9, lr, r0 Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82638	2020-07-29 16:21:01 +01:00
Sjoerd Meijer	85342c27a3	[ARM] Optimize immediate selection Optimize some specific immediates selection by materializing them with sub/mvn instructions as opposed to loading them from the constant pool. Patch by Ben Shi, powerman1st@163.com. Differential Revision: https://reviews.llvm.org/D83745	2020-07-29 13:29:17 +01:00
Tim Northover	39108f4c7a	ARM: make Thumb1 instructions non-flag-setting in IT block. Many Thumb1 instructions are defined to set CPSR if executed outside an IT block, but leave it alone from inside one. In MachineIR this is represented by whether an optional register is CPSR or NoReg (0), and affects how the instructions are printed. This sets the instruction to the appropriate form during if-conversion.	2020-07-28 13:31:17 +01:00
Simon Wallis	94e4e37d55	[Thumb] set code alignment for 16-bit load from constant pool Summary: LLVM miscompiles this code when compiling for a target with v8.2-A FP16 and the Thumb ISA at -O0: extern void bar(__fp16 P5); int main() { __fp16 P5 = 1.96875; bar(P5); } The code section containing main has 2-byte alignment. It needs to have 4-byte alignment, because the load literal instruction has an offset from the load address with the low 2 bits zeroed. I do not include a test case in this check-in. llc and llvm-mc do not exhibit this bug. They do not set code section alignment in the same manner as clang. Reviewers: dnsampaio Reviewed By: dnsampaio Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84169	2020-07-22 10:12:41 +01:00
Yuanfang Chen	589c646a7e	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel, which could not be done with `-print-after`/`-stop-after` since the isel pass does not have a command-line name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to the new pass manager, since the isel pass would then have a command-line name to use with `print-after` or equivalent switches. The motivation of this patch is to reduce the tests' dependency on a would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Elvina Yakubova	b36a3e6140	[llvm-readobj] Update tests because of changes in llvm-readobj behavior This patch updates tests using llvm-readobj and llvm-readelf, because soon reading from stdin will be achievable only via a '-' as described here: https://bugs.llvm.org/show_bug.cgi?id=46400. Patch with changes to llvm-readobj behavior is here: https://reviews.llvm.org/D83704 Differential Revision: https://reviews.llvm.org/D83912 Reviewed by: jhenderson, MaskRay, grimar	2020-07-20 10:39:04 +01:00
Florian Hahn	e297006d6f	[ScheduleDAG] Move DBG_VALUEs after first term forward. MBBs are not allowed to have non-terminator instructions after the first terminator. Currently in some cases (see the modified test), EmitSchedule can add DBG_VALUEs after the last terminator, for example when referring to a debug value that gets folded into a TCRETURN instruction on ARM. This patch updates EmitSchedule to move inserted DBG_VALUEs just before the first terminator. I am not sure if there are terminators that produce values that can in turn be used by a DBG_VALUE. In that case, moving the DBG_VALUE might result in referencing an undefined register. But in any case, it seems like currently there is no way to insert proper DBG_VALUEs for such registers anyway. Alternatively it might make sense to just remove those extra DBG_VALUEs. I am not too familiar with the details of debug info in the backend and would appreciate any suggestions on how to address the issue in the best possible way. Reviewers: vsk, aprantl, jpaquette, efriedma, paquette Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83561	2020-07-17 10:27:43 +01:00
Simon Wallis	3e0ccf9a90	[ARM] halfword store hits llvm_unreachable with big-endian Summary: Provide the missing case in getFixupKindContainerSizeBytes(). This stops execution reaching llvm_unreachable("Unknown fixup kind!") D83947 Reviewers: olista01, ostannard Reviewed By: ostannard Subscribers: ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83947 Change-Id: I598aa1fb51fd1c6f424c557c85d6df6d1958bc62	2020-07-17 08:56:44 +01:00
Pavel Iliin	b9a6fb6428	[ARM] VBIT/VBIF support added. Vector bitwise selects are matched by the pseudo VBSP instruction and expanded to VBSL/VBIT/VBIF after register allocation, depending on the operands' registers, to minimize extra copies.	2020-07-16 11:25:53 +01:00
Roger Ferrer Ibanez	14bc5e149d	[DAGCombiner] Rebuild (setcc x, y, ==) from (xor (xor x, y), 1) The existing code already considered this case. Unfortunately a typo in the condition prevents it from triggering. Also the existing code, had it run, forgot to do the folding. This fixes PR42876. Differential Revision: https://reviews.llvm.org/D65802	2020-07-15 07:34:22 +00:00
Roger Ferrer Ibanez	2b6215f188	[NFC] Add tests for boolean comparisons They currently show that the not equal case may be improved. See PR42876 Differential Revision: https://reviews.llvm.org/D65801	2020-07-15 07:33:43 +00:00
Pavel Iliin	8f7d3430b7	[ARM][NFC] More detailed vbsl checks in ARM & Thumb2 tests.	2020-07-13 17:00:43 +01:00
Florian Hahn	864586d0fd	[ARM] Pass -verify-machineinstr to test and XFAIL until fixed. Some bots run with -verify-machineinstr enabled. Add it to the new test and XFAIL it until fixed.	2020-07-10 16:44:52 +01:00
Florian Hahn	eb5c7f6b8f	[ARM] Add test with tcreturn and debug value. In the attached test case, a non-terminator instruction (DBG_VALUE) is inserted after a terminator, producing an invalid MBB.	2020-07-10 16:32:21 +01:00
Puyan Lotfi	7e169cec74	[NFC][test] Adding fastcc test case for promoted 16-bit integer bitcasts. The following: https://reviews.llvm.org/D82552 fixed an assert in the SelectionDag ISel legalizer for some CCs on armv7. I noticed that this fix also fixes the assert when using fastcc, so I am adding a fastcc regression test here. Differential Revision: https://reviews.llvm.org/D82443	2020-07-09 11:38:49 -07:00
Lucas Prates	fc39a9ca0e	[CodeGen] Matching promoted type for 16-bit integer bitcasts from fp16 operand Summary: When legalizing a bitcast operation from an fp16 operand to an i16 on a target that requires both input and output types to be promoted to 32-bits, an assertion can fail when building the new node due to a mismatch between the operation's result size and the type specified to the node. This patch fixes the issue by making sure the bit width of the types match for the FP_TO_FP16 node, covering the difference with an extra ANY_EXTEND operation. Reviewers: ostannard, efriedma, pirama, jmolloy, plotfi Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82552	2020-07-09 09:46:17 +01:00
David Green	74ca67c109	[ARM] Remove hasSideEffects from FP converts Whether an instruction is deemed to have side effects is determined by whether it has a tblgen pattern that emits a single instruction. Because of the way a lot of the vcvt instructions are specified either in dagtodag code or with patterns that emit multiple instructions, they don't get marked as not having side effects. This just marks them as not having side effects manually. It can help especially with instruction scheduling, to not create artificial barriers, but one of these tests also managed to produce fewer instructions. Differential Revision: https://reviews.llvm.org/D81639	2020-07-05 16:23:24 +01:00
Nicholas Guy	dc8e4d8566	[ARM] Rearrange SizeReduction when using -Oz Move the Thumb2SizeReduce pass to before IfConversion when optimising for minimal code size. Running the Thumb2SizeReduction pass before IfConversion allows T1 instructions to propagate to the final output, rather than the ifConverter modifying T2 instructions and preventing them from being reduced later. This change does introduce a regression regarding execution time, so it's only applied when optimising for size. Running the LLVM Test Suite with this change produces a geomean difference of -0.1% for the size..text metric. Differential Revision: https://reviews.llvm.org/D82439	2020-07-02 09:19:38 +01:00
James Y Knight	4b0aa5724f	Change the INLINEASM_BR MachineInstr to be a non-terminating instruction. Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block. Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already. Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate. Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D79794	2020-07-01 12:51:50 -04:00
Guillaume Chatelet	3500d9ec95	Fix invalid alignment in DAGCombiner::isLegalNarrowLdSt `ShAmt / 8` can be a non power of two, this can lead to an invalid alignment. context: https://reviews.llvm.org/D41350#inline-749165 Differential Revision: https://reviews.llvm.org/D82565	2020-06-29 09:22:15 +00:00
David Green	d428f88152	[ARM] VCVTT fpround instruction selection Similar to the recent patch for fpext, this adds vcvtb and vcvtt with insert into vector instruction selection patterns for fptruncs. This helps clear up a lot of register shuffling that we would otherwise do. Differential Revision: https://reviews.llvm.org/D81637	2020-06-26 10:24:06 +01:00
David Green	76e0e1a55d	[ARM] VCVTT instruction selection We currently extract and convert from a top lane of a f16 vector using a VMOVX;VCVTB pair. We can simplify that to use a single VCVTT. The pattern is mostly copied from a vector extract pattern, but produces a VCVTTHS f32 directly. This had to move some code around so that ARMInstrVFP had access to the required pattern frags that were previously part of ARMInstrNEON. Differential Revision: https://reviews.llvm.org/D81556	2020-06-26 08:58:55 +01:00
Simon Tatham	b769eb02b5	[ARM][BFloat] Legalize bf16 type even without fullfp16. Summary: This change permits scalar bfloats to be loaded, stored, moved and used as function call arguments and return values, whenever the bf16 feature is supported by the subtarget. Previously that was only supported in the presence of the fullfp16 feature, because the code generation strategy depended on instructions from that extension. This change adds alternative code generation strategies so that those operations can be done even without fullfp16. The strategy for loads and stores is to replace VLDRH/VSTRH with integer LDRH/STRH plus a move between register classes. I've written isel patterns for those, conditional on not having the fullfp16 feature (so that in the fullfp16 case, the existing patterns will still be used). For function arguments and returns, instead of writing isel patterns to match `VMOVhr` and `VMOVrh`, I've avoided generating those SDNodes in the first place, by factoring out the code that constructs them into helper functions `MoveToHPR` and `MoveFromHPR` which have a fallback for non-fullfp16 subtargets. The current output code is not especially pretty: in the new test file you can see unnecessary store/load pairs implementing no-op bitcasts, and lots of pointless moves back and forth between FP registers and GPRs. But it at least works, which is an improvement on the previous situation. Reviewers: dmgreen, SjoerdMeijer, stuij, chill, miyuki, labrinea Reviewed By: dmgreen, labrinea Subscribers: labrinea, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82372	2020-06-24 09:36:26 +01:00
Momchil Velikov	adf7973fd3	[ARM] Describe defs/uses of VLLDM and VLSTM The VLLDM and VLSTM instructions are incompletely specified. They (potentially) write (or read, respectively) registers Q0-Q7, VPR, and FPSCR, but the compiler is unaware of it. In the new test case `cmse-vlldm-no-reorder.ll` the compiler missed an anti-dependency and reordered a `VLLDM` ahead of the instruction, which stashed the return value from the non-secure call, effectively clobbering said value. This test case does not fail with upstream LLVM, because of scheduling differences and I couldn't find a test case for the VLSTM either. Differential Revision: https://reviews.llvm.org/D81586	2020-06-23 16:04:23 +01:00
Mikhail Maltsev	3f353a2e5a	[BFloat] Add convert/copy intrinsic support This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a Specifically it adds intrinsic support in clang and llvm for Arm and AArch64. The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Alexandros Lamprineas - Luke Cheeseman - Mikhail Maltsev - Momchil Velikov - Luke Geeson Differential Revision: https://reviews.llvm.org/D80928	2020-06-23 14:27:05 +00:00
Mikhail Maltsev	9c579540ff	[ARM] BFloat MatMul Intrinsics&CodeGen Summary: This patch adds support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman - Simon Tatham Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81740	2020-06-23 12:06:37 +00:00
Mikhail Maltsev	490f78c038	[ARM][BFloat] Implement lowering of bf16 load/store intrinsics Reviewers: labrinea, dmgreen, pratlucas, LukeGeeson Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81486	2020-06-19 14:02:35 +00:00
Mikhail Maltsev	7526881246	[ARM][BFloat] Lowering of create/get/set/dup intrinsics This patch adds codegen for the following BFloat operations to the ARM backend: * concatenation of bf16 vectors * bf16 vector element extraction * bf16 vector element insertion * duplication of a bf16 value into each lane of a vector * duplication of a bf16 vector lane into each lane Differential Revision: https://reviews.llvm.org/D81411	2020-06-19 12:52:40 +00:00
Alexandros Lamprineas	ecdf48f15b	[ARM] Basic bfloat support This patch adds basic support for BFloat in the Arm backend. For now the code generation relies on fullfp16 being present. Briefly: * adds the bfloat scalar and vector types in the necessary register classes, * adjusts the calling convention to cope with bfloat argument passing and return, * adds codegen patterns for moves, loads and stores. It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy). The following people contributed to this patch: * Alexandros Lamprineas * Ties Stuij Differential Revision: https://reviews.llvm.org/D81373	2020-06-18 17:26:24 +01:00
Lucas Prates	92ad6d57c2	[ARM] Moving CMSE handling of half arguments and return to the backend Summary: As half-precision floating point arguments and returns were previously coerced to either float or int32 by clang's codegen, the CMSE handling of those was also performed on clang's side by zeroing the unused MSBs of the coerced values. This patch moves this handling to the backend's calling convention lowering, making sure the high bits of the registers used by half-precision arguments and returns are zeroed. Reviewers: chill, rjmccall, ostannard Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81428	2020-06-18 13:16:29 +01:00
Lucas Prates	a255931c40	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, such as causing arguments to be stored in the wrong bits on big-endian architectures and missing overflow detection in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored in the proper bits of the registers and performing the necessary floating point conversions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
Yvan Roux	ffe8f6d33b	[ARM][MachineOutliner] Fix no-lr-save testcase. Now that saving LR into a register is handled, some register constraints are needed to keep machine-outliner-no-lr-save.mir meaningful.	2020-06-15 16:09:31 +02:00
Yvan Roux	669066de65	[ARM][MachineOutliner] Add LR RegSave mode. Outline chunks of code which need to save and restore the link register when a spare register can be used to do it. Differential Revision: https://reviews.llvm.org/D80127	2020-06-15 15:22:08 +02:00
Michael Liao	e7b920e6fe	[DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) Reviewers: arsenm Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81708	2020-06-12 13:53:08 -04:00
Yvan Roux	6b8628a1f0	[ARM][MachineOutliner] Add NoLRSave mode. Outline chunks of code which don't need a save/restore mechanism of the link register. Differential Revision: https://reviews.llvm.org/D80125	2020-06-11 08:45:46 +02:00
David Green	61ef2d27c4	[ARM] Update fp16-insert-extract.ll test checks. NFC	2020-06-10 17:50:27 +01:00
Simon Wallis	4dba59689d	[ARM] prologue instructions emitted for naked function with >64 byte argument Summary: The naked function attribute is meant to suppress all function prologue/epilogue instructions. On ARM, some are still emitted if an argument greater than 64 bytes in size (the threshold for using the byval attribute in IR) is passed partially in registers. Perform the check for Attribute::Naked and early exit in SelectionDAGISel::LowerArguments(). Checking in ARMFrameLowering::determineCalleeSaves() is too late. A test case is included. Reviewers: llvm-commits, olista01, danielkiss Reviewed By: danielkiss Subscribers: kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D80715 Change-Id: Icedecf2a4ad31bc3c35ab0df7489a9d346e1f7cc	2020-06-09 11:33:03 +01:00
Eli Friedman	7e58d0ded0	Revert "[arm][darwin] Don't generate libcalls for wide shifts on Darwin" This reverts commit `2ba016cd5c`. This is causing a failure on the clang-cmake-armv7-full bot, and there are outstanding review comments.	2020-06-08 16:37:29 -07:00
Simon Wallis	7432fb2c78	[ARM][XO] Execute-only miscompiles double literals for big-endian Summary: With -mbig-endian -mexecute-only and targeting an fpu, an incorrect sequence of movw/movt was generated to construct a double literal. The test suite was hardwired to check these wrong values. The fault was caused by the explicit word swap in LowerConstantFP(). With -mbig-endian -mexecute-only -mfpu=none, a correct sequence of movw/movt is generated to construct a double literal. The test suite did not test this no fpu case. The test suite expected values have been corrected. The test file is updated to add testing of fpu=none case Reviewers: christof, llvm-commits, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D81259 Change-Id: Ia3737df243218c89c82f02b7f9f4032ecd5a3917	2020-06-08 08:13:08 +01:00
Alex Lorenz	2ba016cd5c	[arm][darwin] Don't generate libcalls for wide shifts on Darwin Similar to `ceb801612a`. Darwin doesn't always use compiler-rt, and so we can't assume that these functions are available on arm.	2020-06-05 15:41:23 -07:00
Matt Arsenault	66251f7e1d	RegAllocFast: Record internal state based on register units Record internal state based on register units. This is often more efficient as there are typically fewer register units to update compared to iterating over all the aliases of a register. Original patch by Matthias Braun, but I've been rebasing and fixing it for almost 2 years and fixed a few bugs causing intermediate failures to make this patch independent of the changes in https://reviews.llvm.org/D52010.	2020-06-03 16:51:46 -04:00
Matt Arsenault	056a375b7c	ARM: Reduce debug info testcase This had multiple functions and only one vague check. Reduce it.	2020-06-03 10:33:32 -04:00
Zequan Wu	80e107ccd0	Add NoMerge MIFlag to avoid MIR branch folding Let the codegen recognize the nomerge attribute and disable branch folding when the attribute is given. Differential Revision: https://reviews.llvm.org/D79537	2020-05-29 12:31:06 -07:00
Victor Campos	c010d4d195	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. The code generation explicitly deviates from using the register-offset variant of LDRD/STRD. In this variant, the register allocated to the register-offset cannot be reused in any of the remaining operands. Such restriction seems to be non-trivial to implement in LLVM, thus it is left as a to-do. Differential Revision: https://reviews.llvm.org/D70072	2020-05-28 10:52:43 +01:00
Juneyoung Lee	54b6457240	[TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR Summary: This patch adds CanonicalizeFreezeInLoops before LSR. Relevant patch: https://reviews.llvm.org/D77523 Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00 Reviewed By: nikic Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes Tags: #llvm Differential Revision: https://reviews.llvm.org/D77524	2020-05-28 05:21:12 +09:00
Nikita Popov	0c6bba71e3	[TargetPassConfig] Don't add alias analysis at optnone When performing codegen at optnone, don't add alias analysis to the pipeline. We don't need it, but it causes an unnecessary dominator tree calculation. I've also moved the module verifier call to the top so that a bunch of disabled-at-optnone passes group more nicely. Differential Revision: https://reviews.llvm.org/D80378	2020-05-23 10:35:03 +02:00
Jean-Michel Gorius	65cd2c7a80	Revert "[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias" This temporarily reverts commit `7019cea26d`. It seems that, for some targets, there are instructions with a lot of memory operands (probably more than would be expected). This causes a lot of buildbots to timeout and notify failed builds. While investigations are ongoing to find out why this happens, revert the changes.	2020-05-22 21:26:46 +02:00
Jon Roelofs	5a8db275f8	Revert "[llvm][test] Add COM: directives before colon-less non-CHECKs in comments. NFC" This reverts commit `183d6af081`. Revert pending further consensus building: https://reviews.llvm.org/D79963#2050521	2020-05-22 05:36:15 -06:00
Victor Campos	872ee78f65	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `8a12553223`. A bug has been found when generating code for Thumb2. In some very specific cases, the prologue/epilogue emitter generates erroneous stack offsets for the new LDRD instructions that access the stack. This bug does not seem to be caused by the reverted patch though. Likely the latter has made an undiscovered issue emerge in the prologue/epilogue emission pass. Nevertheless, this reversion is necessary since it is blocking users of the ARM backend.	2020-05-22 11:01:57 +01:00
Jean-Michel Gorius	7019cea26d	[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias Summary: To support all targets, the mayAlias member function needs to support instructions with multiple operands. This revision also changes the order of the emitted instructions in some test cases. Reviewers: efriedma, hfinkel, craig.topper, dmgreen Reviewed By: efriedma Subscribers: MatzeB, dmgreen, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80161	2020-05-21 23:02:54 +02:00
Jon Roelofs	183d6af081	[llvm][test] Add COM: directives before colon-less non-CHECKs in comments. NFC Differential Revision: https://reviews.llvm.org/D79963	2020-05-21 09:29:27 -06:00
Diogo Sampaio	6c68f75ee4	Prevent register coalescing in functions with setjmp Summary: In the given example, a stack slot pointer is merged between a setjmp and longjmp. This pointer is spilled, so it does not get correctly restored, adding undefined behaviour where it shouldn't. Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb Reviewers: eli.friedman, thanm, efriedma Reviewed By: efriedma Subscribers: MatzeB, qcolombet, tpr, rnk, efriedma, hiraditya, llvm-commits, chill Tags: #llvm Differential Revision: https://reviews.llvm.org/D77767	2020-05-16 00:36:34 +01:00
Yvan Roux	0e4827aa4e	[ARM][MachineOutliner] Add Machine Outliner support for ARM. Enables Machine Outlining for ARM and Thumb2 modes. This is the first patch of the series which adds all the basic logic for the support, and only handles tail-calls and thunks. The outliner can be turned on by using clang -moutline option or -mllvm -enable-machine-outliner one (like AArch64). Differential Revision: https://reviews.llvm.org/D76066	2020-05-15 08:44:23 +02:00
Eli Friedman	4532a50899	Infer alignment of unmarked loads in IR/bitcode parsing. For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout. The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time. Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate. This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test. Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places. Differential Revision: https://reviews.llvm.org/D78403	2020-05-14 13:03:50 -07:00
Momchil Velikov	bc2e572f51	Re-commit: [ARM] CMSE code generation This patch implements the final bits of CMSE code generation: * emit special linker symbols * restrict parameter passing to not use memory * emit BXNS and BLXNS instructions for returns from non-secure entry functions, and non-secure function calls, respectively * emit code to save/restore secure floating-point state around calls to non-secure functions * emit code to save/restore non-secure floating-point state upon entry to non-secure entry function, and return to non-secure state * emit code to clobber registers not used for arguments and returns when switching to non-secure state Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green, possibly others. Differential Revision: https://reviews.llvm.org/D76518	2020-05-14 16:46:16 +01:00
Craig Topper	de92dc2850	[Statepoint] Mark FixupStatepointCallerSaved as preserving the CFG I'm hoping this will restore some compile time lost by D75936 and D75937. Differential Revision: https://reviews.llvm.org/D79813	2020-05-13 10:59:44 -07:00
Eli Friedman	c9c930ae67	[SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment. allocas in LLVM IR have a specified alignment. When that alignment is specified, the alloca has at least that alignment at runtime. If the specified type of the alloca has a higher preferred alignment, SelectionDAG currently ignores that specified alignment, and increases the alignment. It does this even if it would trigger stack realignment. I don't think this makes sense, so this patch changes that. I was looking into this for SVE in particular: for SVE, overaligning vscale'ed types is extra expensive because it requires realigning the stack multiple times, or using dynamic allocation. (This currently isn't implemented.) I updated the expected assembly for a couple tests; in particular, for arg-copy-elide.ll, the optimization in question does not increase the alignment the way SelectionDAG normally would. For the rest, I just increased the specified alignment on the allocas to match what SelectionDAG was inferring. Differential Revision: https://reviews.llvm.org/D79532	2020-05-11 17:39:00 -07:00
James Y Knight	7af9d386da	Correctly modify the CFG in IfConverter, and then remove the CorrectExtraCFGEdges function. The latter was a workaround for "Various pieces of code" leaving bogus extra CFG edges in place. Where by "various" it meant only IfConverter::MergeBlocks, which failed to clear all of the successors of dead blocks it emptied out. This wouldn't matter a whole lot, except that the dead blocks remained listed as predecessors of still-useful blocks, inhibiting optimizations. This fix slightly changed two thumb tests, because the correct CFG successors allowed for the "diamond" if-conversion pattern to be detected, when it could only use "simple" before. Additionally, the removal of a now-redundant call to analyzeBranch (with AllowModify=true) in BranchFolder::OptimizeFunction caused a later check for an empty block in BranchFolder::OptimizeBlock to fail. Correct this by moving the call to analyzeBranch in OptimizeBlock higher. Differential Revision: https://reviews.llvm.org/D79527	2020-05-07 18:17:07 -04:00
Momchil Velikov	fb18dffaeb	Revert "[ARM] CMSE code generation" This reverts commit `7cbbf89d23`. The regression tests fail with the expensive checks.	2020-05-05 19:05:40 +01:00
Momchil Velikov	7cbbf89d23	[ARM] CMSE code generation This patch implements the final bits of CMSE code generation: * emit special linker symbols * restrict parameter passing to not use memory * emit BXNS and BLXNS instructions for returns from non-secure entry functions, and non-secure function calls, respectively * emit code to save/restore secure floating-point state around calls to non-secure functions * emit code to save/restore non-secure floating-point state upon entry to non-secure entry function, and return to non-secure state * emit code to clobber registers not used for arguments and returns when switching to non-secure state Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green, possibly others. Differential Revision: https://reviews.llvm.org/D76518	2020-05-05 18:23:28 +01:00
Eli Friedman	1eb160fe8d	[ARM] Fix tail call validity checking for varargs calls. If a varargs function is calling a non-varargs function, or vice versa, make sure we use the correct "varargs" bit for each. Fixes https://bugs.llvm.org/show_bug.cgi?id=45234 Differential Revision: https://reviews.llvm.org/D79199	2020-05-04 12:34:14 -07:00
Evgeniy Brevnov	3e68a66704	[BPI][NFC] Reuse post dominator tree from analysis manager when available Summary: Currently BPI unconditionally creates a post dominator tree each time. While this is not incorrect, we can save compile time by reusing the existing post dominator tree (when it's valid) provided by the analysis manager. Reviewers: skatkov, taewookoh, yrouban Reviewed By: skatkov Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78987	2020-04-30 11:31:03 +07:00
LemonBoy	f30416fdde	[AsmPrinter] Fix emission of non-standard integer constants for BE targets The code assumed that zero-extending the integer constant to the designated alloc size would be fine even for BE targets, but that's not the case as that pulls in zeros from the MSB side while we actually expect the padding zeros to go after the LSB. I've changed the codepath handling the constant integers to use the store size for both small(er than u64) and big constants and then add zero padding right after that. Differential Revision: https://reviews.llvm.org/D78011	2020-04-27 14:57:29 -07:00
David Green	8807139026	[ARM] Only produce qadd8b under hasV6Ops When compiling for an arm5te CPU from clang, the +dsp attribute is set. This meant we could try to generate qadd8 instructions where we would end up having no pattern. I've changed the condition here to be hasV6Ops && hasDSP, which is what other parts of ARMISelLowering seem to use for similar instructions. Fixed PR45677. Differential Revision: https://reviews.llvm.org/D78877	2020-04-27 10:13:29 +01:00
Fangrui Song	25e22613df	[XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address) Follow-up of D78082 (x86-64). This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le. MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined. Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well. Tested on AArch64 Linux and ppc64le Linux. Reviewed By: ianlevesque Differential Revision: https://reviews.llvm.org/D78590	2020-04-24 08:35:43 -07:00
Luke Geeson	7da1905125	[AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics This patch upstreams support for the Armv8.6-a Matrix Multiplication Extension. A summary of the features can be found here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a This patch includes: - Assembly support for AArch32 - Intrinsics Support for AArch32 Neon Intrinsics for Matrix Multiplication Note: these extensions are optional in the 8.6a architecture and so have to be enabled by default No additional IR types or C Types are needed for this extension. This is part of a patch series, starting with BFloat16 support and the other components in the armv8.6a extension (in previous patches linked in phabricator) Based on work by: - Luke Geeson - Oliver Stannard - Luke Cheeseman Reviewers: t.p.northover, miyuki Reviewed By: miyuki Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77872	2020-04-24 15:54:06 +01:00
David Green	a947be51bd	[ARM] Various tests for MVE and FP16 codegen. NFC	2020-04-24 12:11:46 +01:00
Vedant Kumar	6b58018c05	[ARM] Mark some tests as not safe for -debugify-and-strip-all, NFC These tests contain debug instructions which get checked, so we can't insert synthetic debug info and expect the tests to pass. The rest of the ARM backend tests appear to be fair game.	2020-04-22 17:03:39 -07:00
David Green	eecba95067	[ARM] Replace arm vendor with none. NFC	2020-04-22 18:19:35 +01:00
David Green	892af45c86	[ARM] Distribute MVE post-increments This adds some extra processing into the Pre-RA ARM load/store optimizer to detect and merge MVE loads/stores and adds of the same base. We don't always turn these into a post-inc during ISel: because ISel works on a graph, we don't always know an order for the nodes, so we don't know which node to make post-inc and which nodes should use the new post-inc result. After ISel, we have an order that we can use to post-inc the following instructions. So this looks for a load/store with a starting offset of 0, and an add/sub from the same base, plus a number of other loads/stores. We then do some checks and convert the zero offset load/store into a postinc variant. Any loads/stores after it have the offset subtracted from their immediates. For example (before => after): LDR #4 => LDR #4; LDR #0 => LDR_POSTINC #16; LDR #8 => LDR #-8; LDR #12 => LDR #-4; ADD #16 => (folded into the post-increment). It only handles MVE loads/stores at the moment. Normal loads/stores will be added in a followup patch; they just have some extra details to ensure that we keep generating LDRD/LDM successfully. Differential Revision: https://reviews.llvm.org/D77813	2020-04-22 14:16:51 +01:00
Eli Friedman	704293b168	[ARM] Fix MIR tests with invalid live-ins. A register can't be live if it isn't defined; fix issues in various testcases. Differential Revision: https://reviews.llvm.org/D78529	2020-04-21 12:13:35 -07:00
Sam Parker	27d19101e9	[ARM][ParallelDSP] Handle squaring multiplies The logic in ARMParallelDSP is set up to merge two 16-bit loads into a 32-bit load and feed them into the smlads. This requires that four loads are combined for the four inputs, but there wasn't actually a check for this. Differential Revision: https://reviews.llvm.org/D78492	2020-04-21 08:39:56 +01:00
David Green	8e8c3c3408	[ARM] Mir test for machine sinking multiple def instructions. NFC	2020-04-16 20:58:14 +01:00
David Green	44c4ba34d0	[MachineSink] Fix for breaking phi edges with instructions with multiple defs BreakPHIEdge would be set based on whether the instruction needs to insert a new critical edge to allow sinking into a block where the uses are PHI nodes. But for instructions with multiple defs it would be reset on the second def, allowing the instruction to sink where it should not. Fixes PR44981 Differential Revision: https://reviews.llvm.org/D78087	2020-04-16 16:42:07 +01:00
Konstantin Schwarz	1a3e89aa2b	[MIR] Add comments to INLINEASM immediate flag MachineOperands Summary: The INLINEASM MIR instructions use immediate operands to encode the values of some operands. The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature. Reviewers: SjoerdMeijer, arsenm, efriedma Reviewed By: arsenm Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78088	2020-04-16 13:46:14 +02:00
Sterling Augustine	bf94c96007	Write ignored output to stdout, so this test runs on read-only filesystems.	2020-04-15 10:45:14 -07:00
Victor Campos	d85b3877dc	[CodeGen][ARM] Error when writing to specific reserved registers in inline asm Summary: No error or warning is emitted when specific reserved registers are written to in inline assembly. Therefore, writes to the program counter or to the frame pointer, for instance, were permitted, which could have led to undesirable behaviour. Example: int foo() { register int a __asm__("r7"); /* r7 = frame-pointer in M-class ARM */ __asm__ __volatile__("mov %0, r1" : "=r"(a) : : ); return a; } In contrast, GCC issues an error in the same scenario. This patch detects writes to specific reserved registers in inline assembly for ARM and emits an error in such cases. The detection works for output and input operands. Clobber operands are not handled here: they are already covered at a later point in AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers covered are: program counter, frame pointer and base pointer. This is ARM only. Therefore the implementation of other targets' counterparts remains to be done. Reviewers: efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76848	2020-04-15 14:40:42 +01:00
Eli Friedman	2876b3eef3	[SelectionDAG] Always preserve offset in MachinePointerInfo Previously, getWithOffset() would drop the offset if the base was null. Because of this, MachineMemOperand would return the wrong result from getAlign() in these cases. MachineMemOperand stores the alignment of the pointer without the offset. A bunch of MIR tests changed because we print the offset now. Split off from D77687. Differential Revision: https://reviews.llvm.org/D78049	2020-04-14 15:29:41 -07:00
Pierre-vh	4563024356	[Target][ARM] Adding MVE VPT Optimisation Pass Differential Revision: https://reviews.llvm.org/D76709	2020-04-14 15:16:27 +01:00
Jon Roelofs	0b0bb1969f	[llvm] Fix yet more missing FileCheck colons	2020-04-13 10:49:19 -06:00
Serguei Katkov	4275eb1331	Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from a register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, danstrushin Reviewed By: dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77797	2020-04-10 10:13:39 +07:00
Jay Foad	9c7bd94ce8	Fix typo in comment	2020-04-09 10:36:00 +01:00
Simon Pilgrim	5f25d22d3f	[ARM] Fix thumb1_return_sequence typo in check to fix issue reported on D77354	2020-04-08 16:00:45 +01:00

4182 Commits