llvm-project

Commit Graph

Author	SHA1	Message	Date
Diana Picus	1d4421f6a6	[ARM GlobalISel] Modernize LegalizerInfo. NFCI Start using the new LegalizerInfo API introduced in r323681. Keep the old API for opcodes that need Lowering in some circumstances (G_FNEG and G_UREM/G_SREM). llvm-svn: 323876	2018-01-31 14:55:07 +00:00
Pablo Barrio	2e442a7831	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves. In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or orr, while in ARM/thumb2 mode, which has a flexible second operand, the shift can be folded into a single bic/orr instruction. In most cases this results in smaller code and possibly fewer branches, and in no case larger than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869	2018-01-31 13:20:10 +00:00
Sjoerd Meijer	98d5359ea2	[ARM] Armv8.2-A FP16 code generation (part 2/3) Half-precision arguments and return values are passed as if they were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861	2018-01-31 10:18:29 +00:00
Roger Ferrer Ibanez	aea4208720	[ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR. In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do copies CPSR ↔ GPR but not all Thumb1 targets implement them. The scheduler can attempt, before attempting a copy, to clone the instructions but it does not currently do that for nodes with input glue. In this patch we introduce a target-hook to let the hook decide if a glued machinenode is still eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS. As a follow-up of this change we should actually implement the copies for the Thumb1 targets that do implement them and restrict the hook to the targets that can't really do such a copy, as these clones are not ideal. This change fixes PR35836. Differential Revision: https://reviews.llvm.org/D42051 llvm-svn: 323857	2018-01-31 09:23:43 +00:00
Diana Picus	2a5b962030	[ARM GlobalISel] Map G_SITOFP and G_UITOFP Straightforward mapping (integer operand to GPR, floating point operand to FPR). llvm-svn: 323731	2018-01-30 09:15:23 +00:00
Diana Picus	517531e5a5	[ARM GlobalISel] Legalize G_SITOFP and G_UITOFP Legal if we have hardware support, libcall otherwise. Also add supporting code to the legalizer helper for libcalls. llvm-svn: 323730	2018-01-30 09:15:17 +00:00
Diana Picus	a2da03022c	[ARM GlobalISel] Map G_FPTOSI and G_FPTOUI Straightforward mapping (integer operand goes to GPR, floating point operand goes to FPR). llvm-svn: 323727	2018-01-30 07:54:58 +00:00
Diana Picus	4ed0ee7b5f	[ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI Legal if we have hardware support for floating point, libcalls otherwise. Also add the necessary support for libcalls in the legalizer helper. llvm-svn: 323726	2018-01-30 07:54:52 +00:00
Daniel Sanders	08464524c3	[ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern Summary: Apparently, we missed on constraining register classes of VReg-operands of all the instructions built from a destination pattern but the root (top-level) one. The issue exposed itself while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained, while nested VTOSIZS (or rather its destination virtual register to be exact) does not. Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI. https://bugs.llvm.org/show_bug.cgi?id=35965 rdar://problem/36886530 Patch by Roman Tereshin Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan Reviewed By: dsanders, qcolombet, rovka Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42565 llvm-svn: 323692	2018-01-29 21:09:12 +00:00
Daniel Sanders	9ade5592d9	[globalisel] Make LegalizerInfo::LegalizeAction available outside of LegalizerInfo. NFC Summary: The improvements to the LegalizerInfo discussed in D42244 require that LegalizerInfo::LegalizeAction be available for use in other classes. As such, it needs to be moved out of LegalizerInfo. This has been done separately to the next patch to minimize the noise in that patch. llvm-svn: 323669	2018-01-29 17:37:29 +00:00
Sjoerd Meijer	3ddb7fb663	[ARM] FP16Pat and FullFP16Pat patterns. NFC. Create and use FP16Pat FullFP16Pat helper patterns to make the difference explicit. Differential Revision: https://reviews.llvm.org/D42634 llvm-svn: 323640	2018-01-29 11:28:06 +00:00
Momchil Velikov	d2cc6fd90b	[ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514	2018-01-26 10:20:58 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation. Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issues before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH descriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not to be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts.
In patch 2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Weiming Zhao	665784f170	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354	2018-01-24 18:00:57 +00:00
Martin Storsjo	4ed94a06ac	[ARM] Call __chkstk for dynamic stack allocation in all windows environments This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308	2018-01-24 06:40:11 +00:00
Joel Galenson	1d89cd2bb4	[ARM] Cleanup part of ARMBaseInstrInfo::optimizeCompareInstr (NFCI). As noted in another review, this loop is confusing. This commit cleans it up somewhat. Differential Revision: https://reviews.llvm.org/D42312 llvm-svn: 323136	2018-01-22 17:53:47 +00:00
Marina Yatsina	0bf841ac2a	Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files. This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended to refactor the existing code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40333 Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56 llvm-svn: 323095	2018-01-22 10:06:50 +00:00
Marina Yatsina	3d8efa4f0c	Rename ExecutionDepsFix files to ExecutionDomainFix This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended to refactor the existing code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40332 Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8 llvm-svn: 323094	2018-01-22 10:06:33 +00:00
Marina Yatsina	6fc2aaae8d	Separate ExecutionDepsFix into 4 parts: 1. ReachingDefsAnalysis - Identifies, for each instruction, the "closest" reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains). 2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings. 3. BreakFalseDeps - Breaks false dependencies. 4. LoopTraversal - Creates a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order in which they will traverse the basic blocks. This also included the following changes to the original ExecutionDepsFix logic: 1. BreakFalseDeps and ReachingDefsAnalysis logic is no longer restricted by a register class. 2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class. Additional changes in affected files: 1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps was also added to the passes they activate. 2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate. Additional refactoring changes will follow. This commit is (almost) NFC. The only functional change is that now BreakFalseDeps will break dependencies for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added. This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended to refactor the existing code.
Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40330 Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275 llvm-svn: 323087	2018-01-22 10:05:23 +00:00
Joel Galenson	dbc724f764	[ARM] Fix perf regression in compare optimization. Fix a performance regression caused by r322737. While trying to make it easier to replace compares with existing adds and subtracts, I accidentally stopped it from doing so in some cases. This should fix that. I'm also fixing another potential bug in that commit. Differential Revision: https://reviews.llvm.org/D42263 llvm-svn: 322972	2018-01-19 17:46:27 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. 
s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])\((.), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])\)~call 
void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. 
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Reid Kleckner	1aa9061c5f	[CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64 Every known PE COFF target emits /EXPORT: linker flags into a .drectve section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788	2018-01-17 23:55:23 +00:00
Joel Galenson	bbcaf4ac5c	[ARM] Optimize {s,u}mul.with.overflow. This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738	2018-01-17 19:19:05 +00:00
Joel Galenson	fe7fa40869	[ARM] Optimize {s,u}{add,sub}.with.overflow. The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with an "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other). Differential revision: https://reviews.llvm.org/D38378 llvm-svn: 322737	2018-01-17 19:19:05 +00:00
Diana Picus	01bcfd2112	[ARM GlobalISel] Rename local variable. NFC llvm-svn: 322667	2018-01-17 15:25:37 +00:00
Diana Picus	c62a16234b	[ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR llvm-svn: 322657	2018-01-17 14:14:14 +00:00
Diana Picus	65ed364fac	[ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651	2018-01-17 13:34:10 +00:00
Diana Picus	2dc5405693	[ARM GlobalISel] Map G_FMA to FPR llvm-svn: 322367	2018-01-12 12:06:01 +00:00
Diana Picus	e74243d473	[ARM GlobalISel] Legalize G_FMA For hard float with VFP4, it is legal. Otherwise, we use libcalls. This needs a bit of support in the LegalizerHelper for soft float because we didn't handle G_FMA libcalls yet. The support is trivial, as the only difference between G_FMA and other libcalls that we already handle is that it has 3 input operands rather than just 2. llvm-svn: 322366	2018-01-12 11:30:45 +00:00
Andre Vieira	5627c218e1	[ARM] Add codegen for SMMULR, SMMLAR and SMMLSR This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR instructions from equivalent IR patterns. Differential Revision: https://reviews.llvm.org/D41775 llvm-svn: 322361	2018-01-12 09:24:41 +00:00
Andre Vieira	26b9de9ebb	[ARM] Fix erroneous availability of SMMLS for Armv7-M Differential Revision: https://reviews.llvm.org/D41855 llvm-svn: 322360	2018-01-12 09:21:09 +00:00
Matthias Braun	ea4359e922	PeepholeOptimizer: Fix for vregs without defs The PeepholeOptimizer would fail for vregs without a definition. If this was caused by an undef operand, abort to keep the code simple (so we don't need to add logic everywhere to replicate the undef flag). Differential Revision: https://reviews.llvm.org/D40763 llvm-svn: 322319	2018-01-11 22:30:43 +00:00
Evgeniy Stepanov	5223b5d9d6	[arm] Implement Target Operand Flag MIR serialization. Reviewers: efriedma, pcc Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D39975 llvm-svn: 322312	2018-01-11 21:37:58 +00:00
Diana Picus	0ed7513c83	[ARM GlobalISel] Map G_FNEG to the FPR bank llvm-svn: 322169	2018-01-10 11:13:31 +00:00
Diana Picus	f949a0abac	[ARM GlobalISel] Legalize G_FNEG for s32 and s64 For hard float, it is legal. For soft float, we need to lower to 0 - x first, and then we can use the libcall for G_FSUB. This is undoing some of the canonicalization performed by the IRTranslator (which introduces G_FNEG when it sees a 0 - x). Ideally, that canonicalization would be performed by a pre-legalizer pass that would allow targets to opt out of this behaviour rather than dance around it in the legalizer. llvm-svn: 322168	2018-01-10 10:45:34 +00:00
Diana Picus	8f14886630	[ARM GlobalISel] Legalize s32/s64 G_FCONSTANT Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164	2018-01-10 10:01:49 +00:00
Diana Picus	734a5e8912	[ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162	2018-01-10 09:32:01 +00:00
Francis Visoiu Mistrih	7d9bef8f5c	[CodeGen] Don't print "pred:" and "opt:" in -debug output In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096	2018-01-09 17:31:07 +00:00
Momchil Velikov	ac7c5c1d92	[ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz The patch makes the unwind information not mention registers, which were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996	2018-01-08 14:47:19 +00:00
Momchil Velikov	d17dabca31	[ARM] Fix PR35481 This patch allows `r7` to be used, regardless of its use as a frame pointer, as a temporary register when popping `lr`, and also falls back to using a high temporary register if, for some reason, we weren't able to find a suitable low one. Differential revision: https://reviews.llvm.org/D40961 Fixes https://bugs.llvm.org/show_bug.cgi?id=35481 llvm-svn: 321989	2018-01-08 11:32:37 +00:00
Reid Kleckner	5619669a5a	Fix -Wsign-compare warnings on Windows These arise because enums are 'int' by default. llvm-svn: 321887	2018-01-05 19:53:51 +00:00
Momchil Velikov	7efdd090e2	[ARM] Issue an error when non-general-purpose registers are used in address operands Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar. This patch adds checks that only general-purpose registers are used in address operands, shifted registers, and shift amounts. Differential revision: https://reviews.llvm.org/D39910 llvm-svn: 321866	2018-01-05 13:28:10 +00:00
Oliver Stannard	7d9198b296	[ARM] Fix endianness of Thumb .inst.w directive Wide Thumb2 instructions should be emitted into the object file as pairs of 16-bit words of the appropriate endianness, not one 32-bit word. Differential revision: https://reviews.llvm.org/D41185 llvm-svn: 321799	2018-01-04 13:56:40 +00:00
Diana Picus	865f7fecb2	[ARM GlobalISel] Select G_PHI Select G_PHI to PHI and manually constrain the result register. This is very similar to how COPY is handled, so extract and reuse some of that code. llvm-svn: 321797	2018-01-04 13:09:25 +00:00
Diana Picus	c768bbe2e7	[ARM GlobalISel] Legalize scalar G_PHI Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard float. Widen any smaller types. llvm-svn: 321795	2018-01-04 13:09:14 +00:00
Diana Picus	37ae9f68a4	[ARM GlobalISel] Fix selection of pointer constants We used to handle G_CONSTANT with pointer type by forcing the type of the result register to s32 and then letting TableGen handle it. Unfortunately, setting the type only works for generic virtual registers, that haven't yet been constrained to a register class (e.g. those used only by a COPY later on). If the result register has already been constrained as a use of a previously selected instruction, then setting the type will assert. It would be nice to be able to teach TableGen to select pointer constants the same as integer constants, but since it's such an edge case (at the moment the only pointer constant that we're generally interested in is 0, and that is mostly used for comparisons and selects, which are also not supported by TableGen) it's probably not worth the effort right now. Instead, handle pointer constants with some trivial handwritten code. llvm-svn: 321793	2018-01-04 10:54:57 +00:00
Alex Bradbury	46db78b743	[ARM][NFC] Avoid recreating MCSubtargetInfo in ARMAsmBackend After D41349, we can now directly access MCSubtargetInfo from createARM*AsmBackend. This patch makes use of this, avoiding the need to create a fresh MCSubtargetInfo (which was previously always done with a blank CPU and feature string). Given the total size of the change remains pretty tiny and we're removing the old explicit destructor, I changed the STI field to a reference rather than a pointer. Differential Revision: https://reviews.llvm.org/D41693 llvm-svn: 321707	2018-01-03 13:46:21 +00:00
Alex Bradbury	b22f751fa7	Thread MCSubtargetInfo through Target::createMCAsmBackend Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692	2018-01-03 08:53:05 +00:00
Sanjoy Das	26d11ca4b0	(Re-landing) Expose a TargetMachine::getTargetTransformInfo function Re-land r321234. It had to be reverted because it broke the shared library build. The shared library build broke because there was a missing LLVMBuild dependency from lib/Passes (which calls TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can tell, this problem was always there but was somehow masked before (perhaps because TargetMachine::getTargetIRAnalysis was a virtual function). Original commit message: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321375	2017-12-22 18:21:59 +00:00
Diana Picus	28a6d0e639	[ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32 Mark conversions between pointers and 32-bit scalars as legal, map them to the GPR and select to a simple COPY. llvm-svn: 321356	2017-12-22 13:05:51 +00:00


9525 Commits