Commit Graph

435 Commits

Author SHA1 Message Date
Lucas Prates 70a5c52534 [ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.

In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125094
2022-06-27 14:08:48 +01:00
Krasimir Georgiev 8f2ba36336 Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d66 and
dependent cbcce82ef6.

Commit 7625e01d66 causes some new codegen test
failures under asan, e.g., CodeGen/ARM/execute-only.ll:
https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.
2022-06-15 16:10:02 +02:00
Lucas Prates cbcce82ef6 [NFC][Thumb] Update frame-chain codegen test to use thumbv6m
The `llvm/test/CodeGen/Thumb/frame-chain.ll`, recently added by D125094,
currently fails when expensive checks are enabled due to a tMOVr
instruction that is only valid from V6 onwards.

The use of the invalid instruction is unrelated to the contents of the
original patch, and continues to be triggered by this test if its
CodeGen changes are reverted, so this patch updates the test to use V6-M
while the issue is not resolved.
2022-06-14 14:36:57 +01:00
Lucas Prates 7625e01d66 [ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.

In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125094
2022-06-14 13:37:51 +01:00
Lucas Prates 33b9ad647e Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records"
Reverting change due to test failure.

This reverts commit 6119053dab.
2022-06-13 11:00:49 +01:00
Lucas Prates 6119053dab [ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.

In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125094
2022-06-13 10:21:06 +01:00
Hendrik Greving a92ed167f2 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-02 00:49:11 +00:00
Hendrik Greving e9d05cc7d8 Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c302.

Due to failures in Clang tests.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 13:27:49 -07:00
Hendrik Greving 430ac5c302 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 12:48:01 -07:00
Craig Topper 46eef76876 [DAGCombiner] Fix bug in MatchBSwapHWordLow.
This function tries to match (a >> 8) | (a << 8) as (bswap a) >> 16.

If the SRL isn't masked and the high bits aren't demanded, we still
need to ensure that bits 23:16 are zero. After the right shift they
will be in bits 15:8 which is where the important bits from the SHL
end up. It's only a bswap if the OR on bits 15:8 only takes the bits
from the SHL.

Fixes PR55484.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D125641
2022-05-18 09:23:18 -07:00
Craig Topper 74f6ded49d [AArch64][ARM][RISCV][X86] Add test cases for PR55484. NFC
This bug is in generic DAG combine and easily reproducible on many
targets.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D125640
2022-05-16 09:28:11 -07:00
Zhiyao Ma bd606afe26 [ARM] Only update the successor edges for immediate predecessors of PrologueMBB
When adjusting the function prologue for segmented stacks, only update
the successor edges of the immediate predecessors of the original
prologue.

Differential Revision: https://reviews.llvm.org/D122959
2022-05-03 12:36:35 +01:00
Zhiyao Ma adc26b4eae [ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue.
It fixes the overflow of 8-bit immediate field in the emitted
instruction that allocates large stacklet.

For thumb2 targets, load large immediate by a pair of movw and movt
instruction. For thumb1 and ARM targets, load large immediate by reading
from literal pool.

Differential Revision: https://reviews.llvm.org/D118545
2022-03-10 15:15:24 -08:00
Craig Topper 440c4b705a [SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y).
Previous we used sra (X, size(X)-1); xor (add (X, Y), Y).

By placing sub at the end, we allow RISCV to combine sign_extend_inreg
with it to form subw.

Some X86 tests for Z - abs(X) seem to have improved as well.

Other targets look to be a wash.

I had to modify ARM's abs matching code to match from sub instead of
xor. Maybe instead ISD::ABS should be made legal. I'll try that in
parallel to this patch.

This is an alternative to D119099 which was focused on RISCV only.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D119171
2022-02-20 21:11:23 -08:00
Jay Foad f510045d82 [CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC.
Take advantage of D117117 to simplify all {{\[}} to [ and {{\]}} to ].

Differential Revision: https://reviews.llvm.org/D117298
2022-02-18 16:10:56 +00:00
Ties Stuij 5cff77c23f [clang][ARM] PACBTI-M assembly support
Introduce assembly support for Armv8.1-M PACBTI extension. This is an optional
extension in v8.1-M.

There are 10 new system registers and 5 new instructions, all predicated on the
feature.

The attribute for llvm-mc is called "pacbti". For armclang, an architecture
extension also called "pacbti" was created.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Victor Campos
- Ties Stuij

Reviewed By: labrinea

Differential Revision: https://reviews.llvm.org/D112420
2021-11-30 09:28:18 +00:00
Guozhi Wei f1d8345a2a [TwoAddressInstructionPass] Create register mapping for registers with multiple uses in the current MBB
Currently we create register mappings for registers used only once in current
MBB. For registers with multiple uses, when all the uses are in the current MBB,
we can also create mappings for them similarly according to the last use.
For example

    %reg101 = ...
            = ... reg101
    %reg103 = ADD %reg101, %reg102

We can create mapping between %reg101 and %reg103.

Differential Revision: https://reviews.llvm.org/D113193
2021-11-29 19:01:59 -08:00
Jay Foad bdaa181007 [TwoAddressInstructionPass] Update existing physreg live intervals
In TwoAddressInstructionPass::processTiedPairs with
-early-live-intervals, update any preexisting physreg live intervals,
as well as virtreg live intervals. By default (without
-precompute-phys-liveness) physreg live intervals only exist for
registers that are live-in to some basic block.

Differential Revision: https://reviews.llvm.org/D113191
2021-11-05 21:20:30 +00:00
Jay Foad 0321bd64e6 Revert "[TwoAddressInstructionPass] Update existing physreg live intervals"
This reverts commit ec0e1e88d2.

It was pushed by mistake.
2021-11-05 09:54:26 +00:00
Jay Foad ec0e1e88d2 [TwoAddressInstructionPass] Update existing physreg live intervals
In TwoAddressInstructionPass::processTiedPairs with
-early-live-intervals, update any preexisting physreg live intervals,
as well as virtreg live intervals. By default (without
-precompute-phys-liveness) physreg live intervals only exist for
registers that are live-in to some basic block.

Differential Revision: https://reviews.llvm.org/D113191
2021-11-05 09:10:24 +00:00
Guozhi Wei 6599961c17 [TwoAddressInstructionPass] Improve the SrcRegMap and DstRegMap computation
This patch contains following enhancements to SrcRegMap and DstRegMap:

  1 In findOnlyInterestingUse not only check if the Reg is two address usage,
    but also check after commutation can it be two address usage.

  2 If a physical register is clobbered, remove SrcRegMap entries that are
    mapped to it.

  3 In processTiedPairs, when create a new COPY instruction, add a SrcRegMap
    entry only when the COPY instruction is coalescable. (The COPY src is
    killed)

With these enhancements isProfitableToCommute can do better commute decision,
and finally more register copies are removed.

Differential Revision: https://reviews.llvm.org/D108731
2021-10-11 15:28:31 -07:00
Stanislav Mekhanoshin 08d7eec06e Revert "Allow rematerialization of virtual reg uses"
Reverted due to two distcint performance regression reports.

This reverts commit 92c1fd19ab.
2021-09-24 10:26:11 -07:00
Ben Shi 63ca9371c7 [ARM] Implement target hook function to decide folding (mul (add x, c1), c2)
Prevent the folding in DAGCombine if it leads to worse code.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D109124
2021-09-07 15:42:43 +08:00
Stanislav Mekhanoshin 92c1fd19ab Allow rematerialization of virtual reg uses
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.

It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().

The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.

The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.

The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.

Differential Revision: https://reviews.llvm.org/D106408
2021-08-24 11:09:02 -07:00
Petr Hosek 2d4470ab89 Revert "Allow rematerialization of virtual reg uses"
This reverts commit 877572cc19 which
introduced PR51516.
2021-08-18 00:12:41 -07:00
Stanislav Mekhanoshin 877572cc19 Allow rematerialization of virtual reg uses
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.

It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().

The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.

The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.

The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.

Differential Revision: https://reviews.llvm.org/D106408
2021-08-16 12:42:42 -07:00
David Green c140ff493e [ARM] Change a couple of instances of LiveRegs.contains to !LiveRegs.available
This changes a couple of calls to LiveRegs.contains to
!LiveRegs.available, one in Thumb1FrameLoweringInfo (which modifies a
test to look more correct to me, given r7 should be the frame pointer so
is not available), and another in the ARMLoadStoreOptimizer, that I
don't have a test for, it was just found by inspection.

Differential Revision: https://reviews.llvm.org/D107454
2021-08-10 09:53:26 +01:00
David Green 85d6045b88 [ARM] Regenerate Thumb PR35481.ll test. NFC 2021-07-31 16:21:23 +01:00
Matt Arsenault fae05692a3 CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
2021-06-30 16:54:13 -04:00
Lucas Prates 7749b19e9c [NFC] Adding test for clobbering of high registers in Thumb
Prior to the changes from D52010, clobbering Thumb's high registers in
inline asm would cause incorrect code to be generated - or an assertion
failure for debug builds. Now that the issue is no longer reproducible,
this patch adds a MIR test to cover that scenario.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96335
2021-06-28 13:44:44 +01:00
Eli Friedman bf0d0671a1 [ARM] Make sure we don't transform unaligned store to stm on Thumb1.
This isn't likely to come up in practice; the combination of compiler
flags required to hit this issue should be rare. Found by inspection.
2021-06-21 14:32:42 -07:00
Koutheir Attouchi 789708617d Do not generate calls to the 128-bit function __multi3() on 32-bit ARM
Re-applying this patch after bots failures. Should be fine now.

The function __multi3() is undefined on 32-bit ARM, so a call to it should
never be emitted. Instead, plain instructions need to be generated to
perform 128-bit multiplications.

Differential Revision: https://reviews.llvm.org/D103906
2021-06-11 11:45:21 +01:00
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Simonas Kazlauskas 777a58e05b Support {S,U}REMEqFold before legalization
This allows these optimisations to apply to e.g. `urem i16` directly
before `urem` is promoted to i32 on architectures where i16 operations
are not intrinsically legal (such as on Aarch64). The legalization then
later can happen more directly and generated code gets a chance to avoid
wasting time on computing results in types wider than necessary, in the end.

Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88785
2021-04-01 01:35:41 +03:00
David Green dc206be77b [ARM] Regenerate some test checks. NFC 2021-03-24 15:34:34 +00:00
David Green 402f2cae7d [ARM] Use lrdsb for more thumb1 loads.
Given a sextload i16, we can usually generate "ldrsh [rn. rm]". If we
don't naturally have a rn, rm addressing mode, we can either generate
"ldrh [rn, #0]; sxth" or "mov rm, #0; ldrsh [rn. rm]".

We currently generate the first, always creating a sxth. They are both
the same number of instructions, but if we generate the second then the
mov #0 will likely be CSE'd or pulled out of a loop, etc.

This adjusts the ISel patterns to do that, creating a mov instead of a
sxth.

Differential Revision: https://reviews.llvm.org/D98693
2021-03-17 15:29:02 +00:00
Simonas Kazlauskas a2eca31da2 Test cases for rem-seteq fold with illegal types
This also briefly tests a larger set of architectures than the more
exhaustive functionality tests for AArch64 and x86.

As requested in D88785

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98339
2021-03-12 16:28:04 +02:00
Roger Ferrer Ibanez d4ce062340 [RISCV][PrologEpilogInserter] "Float" emergency spill slots to avoid making them immediately unreachable from the stack pointer
In RISC-V there is a single addressing mode of the form imm(reg) where
imm is a signed integer of 12-bit with a range of [-2048..2047] bytes
from reg.

The test MultiSource/UnitTests/C++11/frame_layout of the LLVM test-suite
exercises several scenarios with the stack, including function calls
where the stack will need to be realigned to to a local variable having
a large alignment of 4096 bytes.

In situations of large stacks, the RISC-V backend (in
RISCVFrameLowering) reserves an extra emergency spill slot which can be
used (if no free register is found) by the register scavenger after the
frame indexes have been eliminated. PrologEpilogInserter already takes
care of keeping the emergency spill slots as close as possible to the
stack pointer or frame pointer (depending on what the function will
use). However there is a final alignment step to honour the maximum
alignment of the stack that, when using the stack pointer to access the
emergency spill slots, has the side effect of setting them farther from
the stack pointer.

In the case of the frame_layout testcase, the net result is that we do
have an emergency spill slot but it is so far from the stack pointer
(more than 2048 bytes due to the extra alignment of a variable to 4096
bytes) that it becomes unreachable via any immediate offset.

During elimination of the frame index, many (regular) offsets of the
stack may be immediately unreachable already. Their address needs to be
computed using a register. A virtual register is created and later
RegisterScavenger should be able to find an unused (physical) register.
However if no register is available, RegisterScavenger will pick a
physical register and spill it onto an emergency stack slot, while we
compute the offset (restoring the chosen register after all this). This
assumes that the emergency stack slot is easily reachable (this is,
without requiring another register!).

This is the assumption we seem to break when we perform the extra
alignment in PrologEpilogInserter.

We can "float" the emergency spill slots by increasing (in absolute
value) their offsets from the incoming stack pointer. This way the
emergency spill slots will remain close to the stack pointer (once the
function has allocated storage for the stack, including the needed
realignment). The new size computed in PrologEpilogInserter is padding
so it should be OK to move the emergency spill slots there. Also because
we're increasing the alignment, the new location should stay aligned for
the purpose of the emergency spill slots.

Note that this change also impacts other backends as shown by the tests.
Changes are minor adjustments to the emergency stack slot offset.

Differential Revision: https://reviews.llvm.org/D89239
2021-01-23 09:10:03 +00:00
Matt Arsenault 1d1234b2a4 OpaquePtr: Update more tests to use typed sret 2020-11-20 20:08:43 -05:00
Matt Arsenault 06c192d454 OpaquePtr: Bulk update tests to use typed byval
Upgrade of the IR text tests should be the only thing blocking making
typed byval mandatory. Partially done through regex and partially
manual.
2020-11-20 14:00:46 -05:00
David Green 32556a9832 [ARM] Remove more unused check prefixes, NFC 2020-11-14 15:37:53 +00:00
John Brawn 8211cfb7c8 [ARM] Don't shrink STM if it would cause an unknown base register store
If a 16-bit thumb STM with writeback stores the base register but it isn't the
first register in the list, then an unknown value is stored. The load/store
optimizer knows this and generates a 32-bit STM without writeback instead, but
thumb2 size reduction converts it into a 16-bit STM. Fix this by having thumb2
size reduction notice such STMs and leave them as they are.

Differential Revision: https://reviews.llvm.org/D78493
2020-04-22 14:50:42 +01:00
Keith Walker 01dc10774e [ARM] unwinding .pad instructions missing in execute-only prologue
If the stack pointer is altered for local variables and we are generating
Thumb2 execute-only code the .pad directive is missing.

Usually the size of the adjustment is stored in a PC-relative location
and loaded into a register which is then added to the stack pointer.
However when we are generating execute-only code code the size of the
adjustment is instead generated using the MOVW/MOVT instruction pair.

As a by product of handling the execute-only case this also fixes an
existing issue that in the none execute-only case the .pad directive was
generated against the load of the constant to a register instruction,
instead of the instruction which adds the register to the stack pointer.

Differential Revision: https://reviews.llvm.org/D76849
2020-04-07 11:51:59 +01:00
Sam Parker 62fdb1f534 [DAGCombine] Skip PostInc combine with later users
When decided whether to generate a post-inc load/store, look at the
other memory nodes that use the same base address and, if any proceed
the current node, then don't do the combine.
The change only seems to be affecting the Arm backend, which I was
surprised at, but it appears to fix a lot of our issues around MVE
masked load/stores having to store a temporary address after an early
post-increment on a shared base address.

Differential Revision: https://reviews.llvm.org/D75847
2020-03-23 08:39:53 +00:00
Fangrui Song ecd6d7254e [test] llvm/test/: change llvm-objdump single-dash long options to double-dash options
As announced here: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html

Grouped option syntax (POSIX Utility Conventions) does not play well with -long-option
A subsequent change will reject -long-option.
2020-03-15 17:46:23 -07:00
Fangrui Song 71e2ca6e32 [llvm-objdump] -d: print `00000000 <foo>:` instead of `00000000 foo:`
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.

`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.

```
Disassembly of section .foo:

0000000000001634 .foo:
```

Bdragon: <> has metalinguistic connotation. it just "feels right"

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D75713
2020-03-05 18:05:28 -08:00
Huihui Zhang 44fa47c9e7 [ARM][ConstantIslands] Fix stack mis-alignment caused by undoLRSpillRestore.
Summary:
It is not safe for ARMConstantIslands to undoLRSpillRestore. PrologEpilogInserter is
the one to ensure stack alignment, taking into consideration LR is spilled or not.

For noreturn function with StackAlignment 8 (function contains call/alloc),
undoLRSpillRestore cause stack be mis-aligned. Fixing stack alignment in
ARMConstantIslands doesn't give us much benefit, as undo LR spill/restore only
occur in large function with near branches only, also doesn't have callee-saved LR spill.

Reviewers: t.p.northover, rengolin, efriedma, apazos, samparker, ostannard

Reviewed By: ostannard

Subscribers: dmgreen, ostannard, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75288
2020-03-02 16:28:57 -08:00
Sjoerd Meijer 7efabe5c7d [MIR][ARM] MachineOperand comments
This adds infrastructure to print and parse MIR MachineOperand comments.
The motivation for the ARM backend is to print condition code names instead of
magic constants that are difficult to read (for human beings). For example,
instead of this:

  dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14, $noreg
  t2Bcc %bb.4, 0, killed $cpsr

we now print this:

  dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14 /* CC::always */, $noreg
  t2Bcc %bb.4, 0 /* CC:eq */, killed $cpsr

This shows that MachineOperand comments are enclosed between /* and */. In this
example, the EOR instruction is not conditionally executed (i.e. it is "always
executed"), which is encoded by the 14 immediate machine operand. Thus, now
this machine operand has /* CC::always */ as a comment. The 0 on the next
conditional branch instruction represents the equal condition code, thus now
this operand has /* CC:eq */ as a comment.

As it is a comment, the MI lexer/parser completely ignores it. The benefit is
that this keeps the change in the lexer extremely minimal and no target
specific parsing needs to be done. The changes on the MIPrinter side are also
minimal, as there is only one target hooks that is used to create the machine
operand comments.

Differential Revision: https://reviews.llvm.org/D74306
2020-02-24 14:19:21 +00:00
James Clarke b3cd44f80b Use SETNE directly rather than SUB/SETNE 0 for stack guard check
Summary:
Backends should fold the subtraction into the comparison, but not all
seem to. Moreover, on targets where pointers are not integers, such as
CHERI, an integer subtraction is not appropriate. Instead we should just
compare the two pointers directly, as this should work everywhere and
potentially generate more efficient code.

Reviewers: bogner, lebedev.ri, efriedma, t.p.northover, uweigand, sunfish

Reviewed By: lebedev.ri

Subscribers: dschuff, sbc100, arichardson, jgravelle-google, hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74454
2020-02-18 13:21:26 +00:00