llvm-project

Commit Graph

Author	SHA1	Message	Date
Oliver Stannard	4a9086b537	[ARM] Fix select_cc lowering for fp16 When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 llvm-svn: 355385	2019-03-05 10:42:34 +00:00
Oliver Stannard	181afc7f3b	[ARM] Fix selection of VLDR.16 instruction with imm offset The isScaledConstantInRange function takes upper and lower bounds which are checked after dividing by the scale, so the bounds checks for half, single and double precision should all be the same. Previously, we had wrong bounds checks for half precision, so selected an immediate the instructions can't actually represent. Differential revision: https://reviews.llvm.org/D58822 llvm-svn: 355305	2019-03-04 09:17:38 +00:00
Oliver Stannard	82fbbc21fd	[ARM] Fix FP16 stack loads/stores for Thumb2 with frame pointer The new addressing mode added for the v8.2A FP16 instructions uses bit 8 of the immediate to encode the sign of the offset, like the other FP loads/stores, so it needs to be treated the same way. Differential revision: https://reviews.llvm.org/D58816 llvm-svn: 355201	2019-03-01 14:20:28 +00:00
Oliver Stannard	e019e6223b	[ARM] Consider undefined-on-NaN conditions in checkVSELConstraints This function was not checking for the condition code variants which are undefined if either input is NaN, so we were missing selection of the VSEL instruction in some cases when using -fno-honor-nans or -ffast-math. Differential revision: https://reviews.llvm.org/D58812 llvm-svn: 355199	2019-03-01 13:58:25 +00:00
Diana Picus	54829ec5d0	[ARM GlobalISel] Support G_CTLZ for Thumb2 Same as ARM mode but with different opcode. llvm-svn: 355191	2019-03-01 10:12:28 +00:00
Diana Picus	afb3398da0	[ARM GlobalISel] Check target flags in test. NFCI There was a time when we couldn't dump target-specific flags such as arm-sbrel etc, so the tests didn't check for them. We can now be more specific in our tests. llvm-svn: 355189	2019-03-01 10:01:22 +00:00
Diana Picus	3b7beafc77	[ARM GlobalISel] Support global variables for Thumb2 Add the same level of support as for ARM mode (i.e. still no TLS support). In most cases, it is sufficient to replace the opcodes with the t2-equivalent, but there are some idiosyncrasies that I decided to preserve because I don't understand the full implications: * For ARM we use LDRi12 to load from constant pools, but for Thumb we use t2LDRpci (I'm not sure if the ideal would be to use t2LDRi12 for Thumb as well, or to use LDRcp for ARM). * For Thumb we don't have an equivalent for MOV|LDRLIT_ga_pcrel_ldr, so we have to generate MOV|LDRLIT_ga_pcrel plus a load from GOT. The tests are in separate files because they're hard enough to read even without doubling the number of checks. llvm-svn: 355077	2019-02-28 10:42:47 +00:00
Luke Cheeseman	9e285bef2b	[ARM] Add Cortex-M35P - Add LLVM backend support for Cortex-M35P - Documentation can be found at https://developer.arm.com/products/processors/cortex-m/cortex-m35p Differential Revision: https://reviews.llvm.org/D57763 llvm-svn: 354868	2019-02-26 12:02:12 +00:00
David Green	b504f104b2	[ARM] Add some more missing T1 opcodes for the peephole optimiser This adds a few extra Thumb1 opcodes to improve the peephole optimiser's ability to remove redundant cmp instructions. tADC and tSBC require a small fixup to prevent MOVS being moved past the instruction, giving the wrong flags. Differential Revision: https://reviews.llvm.org/D58281 llvm-svn: 354791	2019-02-25 15:50:54 +00:00
Dmitri Gribenko	a3a3964f98	Fixed typos in tests: s/CHEKC/CHECK/ Reviewers: ilya-biryukov Subscribers: nemanjai, javed.absar, jsji, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D58611 llvm-svn: 354785	2019-02-25 13:41:59 +00:00
Dmitri Gribenko	751c5fbf6a	Fixed typos in tests: s/CEHCK/CHECK/ Reviewers: ilya-biryukov Subscribers: sanjoy, sdardis, javed.absar, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58608 llvm-svn: 354781	2019-02-25 13:12:33 +00:00
Simon Tatham	b70fc0c5fd	[ARM] Make fullfp16 instructions not conditionalisable. More or less all the instructions defined in the v8.2a full-fp16 extension are defined as UNPREDICTABLE if you put them in an IT block (Thumb) or use with any condition other than AL (ARM). LLVM didn't know that, and was happy to conditionalise them. In order to force these instructions to count as not predicable, I had to make a small Tablegen change. The code generation back end mostly decides if an instruction was predicable by looking for something it can identify as a predicate operand; there's an isPredicable bit flag that overrides that check in the positive direction, but nothing that overrides it in the negative direction. (I considered the alternative approach of actually removing the predicate operand from those instructions, but thought that it would be more painful overall for instructions differing only in data type to have different shapes of operand list. This way, the only code that has to notice the difference is the if-converter.) So I've added an isUnpredicable bit alongside isPredicable, and set that bit on the right subset of FP16 instructions, and also on the VSEL, VMAXNM/VMINNM and VRINT[ANPM] families which should be unpredicable for all data types. I've included a couple of representative regression tests, both of which previously caused an fp16 instruction to be conditionalised in ARM state and (with -arm-no-restrict-it) to be put in an IT block in Thumb. Reviewers: SjoerdMeijer, t.p.northover, efriedma Reviewed By: efriedma Subscribers: jdoerfert, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57823 llvm-svn: 354768	2019-02-25 10:39:53 +00:00
David Green	acb628b2af	[ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs This adds a number of missing Thumb1 opcodes so that the peephole optimiser can remove redundant CMP instructions. Reapplying this after the first attempt broke non-thumb1 code as the t2ADDri instruction can be used with frame indices. In thumb1 we use tADDframe. Differential Revision: https://reviews.llvm.org/D57833 llvm-svn: 354667	2019-02-22 12:23:31 +00:00
Diana Picus	35e1c6663c	[ARM GlobalISel] Support floating point for Thumb2 This is exactly the same as arm mode, so for the instruction selector tests we just extract them to a new file and run with the same checks for both arm and thumb mode. For the legalizer we need to update the tests for soft float a bit, but only because BL and tBL are slightly different. We could be pedantic and check that we get a well-formed BL for arm mode and a tBL for thumb, but for the purposes of the legalizer test it's sufficient to just skip over the predicate operands in the checks. Also note that we have the pedantic checks in the divmod test, so we're covered. llvm-svn: 354665	2019-02-22 09:54:54 +00:00
Diana Picus	dcaa939ab7	[ARM GlobalISel] Support G_FRAME_INDEX for Thumb2 Same as arm mode. llvm-svn: 354579	2019-02-21 13:00:02 +00:00
David Green	7a183a86be	Revert 354564: [ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs I believe it's causing bootstrap failures for A32 code. I'll take a look at what's wrong. llvm-svn: 354569	2019-02-21 11:03:13 +00:00
David Green	89efe24eba	[ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs This adds a number of missing Thumb1 opcodes so that the peephole optimiser can remove redundant CMP instructions. Differential Revision: https://reviews.llvm.org/D57833 llvm-svn: 354564	2019-02-21 10:30:09 +00:00
Sam Parker	6ed47bee27	[ARM] Negative constants mishandled in ARM CGP During type promotion, sometimes we convert an add with a negative constant into a sub with a positive constant. The loop that performs this transformation has two issues: - it iterates over a set, causing non-determinism. - it breaks, instead of continuing, when it finds the first non-negative operand. Differential Revision: https://reviews.llvm.org/D58452 llvm-svn: 354557	2019-02-21 09:33:18 +00:00
Diana Picus	19dbc6245f	[ARM GlobalISel] Support G_PHI for Thumb2 Same as arm mode. llvm-svn: 354310	2019-02-19 10:26:47 +00:00
Diana Picus	a00425ff0d	[ARM GlobalISel] Support branches for Thumb2 Just like arm mode, but with different opcodes. llvm-svn: 354113	2019-02-15 10:24:03 +00:00
Sam Parker	3c17cb7bc4	[ARM CGP] Fix ConvertTruncs ConvertTruncs is used to replace a trunc for an AND mask, however this function wasn't working as expected. By performing the change later, we can create a wide type integer mask instead of a narrow -1 value, which could then be simply removed (incorrectly). Because we now perform this action later, it's necessary to cache the trunc type before we perform the promotion. Differential Revision: https://reviews.llvm.org/D57686 llvm-svn: 354108	2019-02-15 09:04:39 +00:00
Diana Picus	aa4118a873	[ARM GlobalISel] Support G_SELECT for Thumb2 Same as arm mode, but slightly different opcodes. llvm-svn: 353938	2019-02-13 11:25:32 +00:00
Ana Pazos	9a3dc3e60b	[LegalizeTypes] Expand FNEG to bitwise op for IEEE FP types Summary: Except for custom floating point types x86_fp80 and ppc_fp128, expand Y = FNEG(X) to Y = X ^ sign mask to avoid library call. Using bitwise operation can improve code size and performance. Reviewers: efriedma Reviewed By: efriedma Subscribers: efriedma, kpn, arsenm, eli.friedman, javed.absar, rbar, johnrusso, simoncook, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D57875 llvm-svn: 353757	2019-02-11 22:10:08 +00:00
Sam Parker	8ff143033a	[ARM] Add v8m.base pattern for add negative imm The v8m.base ISA contains movw, which can operate on an unsigned 16-bit value. Add the pattern that converts an add with a negative value, that could fit into 16-bits when negated, into a sub with that positive value. Differential Revision: https://reviews.llvm.org/D57942 llvm-svn: 353692	2019-02-11 11:35:42 +00:00
Sam Parker	3fbacd4964	[NFC][ARM] Simplify loop-indexing codegen test Remove unnecessary offset checks, CHECK-BASE checks and add some extra -NOT checks and TODO comments. llvm-svn: 353689	2019-02-11 10:52:49 +00:00
Sjoerd Meijer	150ccb889e	[ARM] LoadStoreOptimizer: reorder limit The whole design of generating LDMs/STMs is fragile and unreliable: it depends on rescheduling here in the LoadStoreOptimizer that isn't register pressure aware and regalloc that isn't aware of generating LDMs/STMs. This patch adds a (hidden) option to control the total number of instructions that can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded constant, but at least it allows easier experimentation with different values for now. Ideally we calculate this reorder limit based on some heuristics, and take register pressure into account. I might be looking into that next. Differential Revision: https://reviews.llvm.org/D57954 llvm-svn: 353678	2019-02-11 09:37:42 +00:00
Nemanja Ivanovic	92a8c36735	[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557	2019-02-08 19:50:58 +00:00
Sam Parker	67756c09f2	[LSR] Generate cross iteration indexes Modify GenerateConstantOffsetsImpl to create offsets that can be used by indexed addressing modes. If formulae can be generated which result in the constant offset being the same size as the recurrence, we can generate a pre-indexed access. This allows the pointer to be updated via the single pre-indexed access so that (hopefully) no add/subs are required to update it for the next iteration. For small cores, this can significantly improve the performance of DSP-like loops. Differential Revision: https://reviews.llvm.org/D55373 llvm-svn: 353403	2019-02-07 13:32:54 +00:00
Diana Picus	75a04e2a77	[ARM GlobalISel] Support G_ICMP for Thumb2 Mark as legal and use the t2* equivalents of the arm mode instructions, e.g. t2CMPrr instead of plain CMPrr. llvm-svn: 353392	2019-02-07 11:05:33 +00:00
Diana Picus	e24b104a11	[ARM GlobalISel] Support G_GEP for Thumb2 Same as ARM, but use a different opcode in the instruction selection. llvm-svn: 353151	2019-02-05 10:21:37 +00:00
Matt Arsenault	1f795e2c2a	GlobalISel: Enforce operand types for constants A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113	2019-02-04 23:29:31 +00:00
Oliver Stannard	bac11518cd	[CodeGen] Don't scavenge non-saved regs in exception throwing functions Previously, LiveRegUnits was assuming that if a block has no successors and does not return, then no registers are live at the end of it (because the end of the block is unreachable). This was causing the register scavenger to use callee-saved registers to materialise stack frame addresses without saving them in the prologue. This would normally be fine, because the end of the block is unreachable, but this is not legal if the block ends by throwing a C++ exception. If this happens, the scratch register will be modified, but its previous value won't be preserved, so it doesn't get restored by the exception unwinder. Differential revision: https://reviews.llvm.org/D57381 llvm-svn: 352844	2019-02-01 09:23:51 +00:00
Sjoerd Meijer	f222259c3c	[ARM] Thumb2: ConstantMaterializationCost Constants can also be materialised using the negated value and a MVN, and this case seems to have been missed for Thumb2. To check the constant materialisation costs, we now call getT2SOImmVal twice, once for the original constant and then also for its negated value, and this function checks if the constant can either be splatted or rotated. This was revealed by a test that optimises for minsize: instead of a LDR literal pool load and having a literal pool entry, just a MVN with an immediate is smaller (and also faster). Differential Revision: https://reviews.llvm.org/D57327 llvm-svn: 352737	2019-01-31 08:38:06 +00:00
Sjoerd Meijer	f7cc34cae8	[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS And instead just generate a libcall. My motivating example on ARM was a simple: shl i64 %A, %B for which the code bloat is quite significant. For other targets that also accept __int128/i128 such as AArch64 and X86, it is also beneficial for these cases to generate a libcall when optimising for minsize. On these 64-bit targets, the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS lowering operation action is not set to custom/expand. Differential Revision: https://reviews.llvm.org/D57386 llvm-svn: 352736	2019-01-31 08:07:30 +00:00
Matt Arsenault	2a64598ef2	GlobalISel: Fix creating MMOs with align 0 llvm-svn: 352712	2019-01-31 01:38:47 +00:00
Matt Arsenault	547a83b4eb	MIR: Reject non-power-of-2 alignments in MMO parsing llvm-svn: 352686	2019-01-30 23:09:28 +00:00
David Green	54b0115547	[ARM] Use sub for negative offset load/store in thumb1 This attempts to optimise negative values used in load/store operands a little. We currently try to select them as rr, materialising the negative constant using a MOV/MVN pair. This instead selects ri with an immediate of 0, forcing the add node to become a simpler sub. Differential Revision: https://reviews.llvm.org/D57121 llvm-svn: 352475	2019-01-29 10:40:31 +00:00
David Green	5c33c5da1a	[ARM] Add extra testcases for D57121. NFC llvm-svn: 352472	2019-01-29 10:25:56 +00:00
Diana Picus	574e0c5e32	[ARM GlobalISel] Support integer division for Thumb2 Support G_SDIV, G_UDIV, G_SREM and G_UREM. The only significant difference between arm and thumb mode is that we need to check a different subtarget feature. llvm-svn: 352346	2019-01-28 10:37:30 +00:00
Diana Picus	8976ad12a9	[ARM GlobalISel] Support shifts for Thumb2 Same as ARM. On this occasion we split some of the instruction select tests for more complicated instructions into their own files, so we can reuse them for ARM and Thumb mode. Likewise for the legalizer tests. llvm-svn: 352188	2019-01-25 10:48:42 +00:00
Aditya Nandakumar	3ba0d94bce	[GISel]: Change how CSE is enabled by default for each pass https://reviews.llvm.org/D57178 Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for O0 opt level but this can be overridden by the target. As a consequence of the default of enabled for non O0, a few tests needed to be updated to not use CSE (by passing in -O0) to the run line. reviewed by: arsenm llvm-svn: 352126	2019-01-24 23:11:25 +00:00
Sam Parker	31bef63bb4	[ARM][CGP] Check trunc type before replacing In the last stage of type promotion, we replace any zext that uses a new trunc with the operand of the trunc. This was okay when we only allowed one type to be optimised, but now the trunc may be needed to produce a narrower type than the one we were optimising for. So we need to check this before doing the replacement. Differential Revision: https://reviews.llvm.org/D57041 llvm-svn: 351935	2019-01-23 09:18:44 +00:00
Sam Parker	9a2a89d58f	[DAGCombine] Enable more pre-indexed stores The current check in CombineToPreIndexedLoadStore is too conservative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store. Differential Revision: https://reviews.llvm.org/D56719 llvm-svn: 351933	2019-01-23 09:11:49 +00:00
Diana Picus	d5c2499aec	[ARM GlobalISel] Allow calls to varargs functions Allow varargs functions to be called, both in arm and thumb mode. This boils down to choosing the correct calling convention, which we can easily test by making sure arm_aapcscc is used instead of arm_aapcs_vfpcc when the callee is variadic. llvm-svn: 351424	2019-01-17 10:11:55 +00:00
Sam Parker	dd8cd6d26b	[DAGCombine] Fix ReduceLoadWidth for shifted offsets ReduceLoadWidth can trigger when a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310	2019-01-16 08:40:12 +00:00
James Y Knight	693d39dd12	Remove irrelevant references to legacy git repositories from compiler identification lines in test-cases. (Doing so only because it's then easier to search for references which are actually important and need fixing.) llvm-svn: 351200	2019-01-15 16:18:52 +00:00
Diana Picus	8987d00653	[ARM GlobalISel] Import MOVi32imm into GlobalISel Make it possible for TableGen to produce code for selecting MOVi32imm. This allows reasonably recent ARM targets to select a lot more constants than before. We achieve this by adding GISelPredicateCode to arm_i32imm. It's impossible to use the exact same code for both DAGISel and GlobalISel, since one uses "Subtarget->" and the other "STI." to refer to the subtarget. Moreover, in GlobalISel we don't have ready access to the MachineFunction, so we need to add a bit of code for obtaining it from the instruction that we're selecting. This is also the reason why it needs to remain a PatLeaf instead of the more specific IntImmLeaf. llvm-svn: 351056	2019-01-14 12:04:08 +00:00
Francis Visoiu Mistrih	b7cef81fd3	Replace "no-frame-pointer-" function attributes with "frame-pointer" Part of the effort to refactor frame pointer code generation. We used to use two function attributes "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" to represent three kinds of frame pointer usage: (all) frames use frame pointer, (non-leaf) frames use frame pointer, (none) no frames use frame pointer. This CL makes the idea explicit by using only one enum function attribute "frame-pointer". Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as llc. "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still supported for easy migration to "frame-pointer". Tests are mostly updated with // replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’ grep -iIrnl '\-disable-fp-elim=false' | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g" // replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’ grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g" Patch by Yuanfang Chen (tabloid.adroit)! Differential Revision: https://reviews.llvm.org/D56351 llvm-svn: 351049	2019-01-14 10:55:55 +00:00
Evandro Menezes	0674762112	[AArch64] Create feature set for Exynos M4 Complete the feature set for Exynos M4 and update test cases. llvm-svn: 350953	2019-01-11 18:54:25 +00:00
Sam Parker	53000a74a5	[ARM] Add missing patterns for DSP muls Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb instructions once the operands had been sign extended. But we also need to use sext_inreg operands along with sext_16_node to catch a few more cases that enable us to remove the unnecessary sxth. Differential Revision: https://reviews.llvm.org/D55992 llvm-svn: 350613	2019-01-08 10:12:36 +00:00


3635 Commits