Summary:
This allows enabling useaa on the command-line and will allow enabling the
feature on a per-CPU basis where benchmarking shows improvements.
This is modelled after the ARM/AArch64 target.
Reviewers: RKSimon, andreadb, craig.topper
Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67266
llvm-svn: 371989
Previously we only had a stub document.
Reviewed by: serge-sans-paille, MaskRay
Differential Revision: https://reviews.llvm.org/D67555
llvm-svn: 371983
MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.
There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.
Differential Revision: https://reviews.llvm.org/D66577
llvm-svn: 371982
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.
Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.
This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.
rL371940 is a related patch in this series.
llvm-svn: 371981
This blob was written before match() existed, so it
could probably be reduced significantly.
But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.
llvm-svn: 371978
The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.
llvm-svn: 371975
The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.
llvm-svn: 371973
Summary:
Adds the following inline asm constraints for SVE:
- Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
- Upa: SVE predicate register with full range, P0 to P15
Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin
Reviewed By: rovka
Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66524
llvm-svn: 371967
After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.
Differential Revision: https://reviews.llvm.org/D67576
llvm-svn: 371960
SystemZExpandPseudo:s only job was to expand LOCRMux instructions into jump
sequences. This needs to be done if expandLOCRPseudo() or expandSELRPseudo()
fails to find a legal opcode (all registers "high" or "low"). This task has
now been moved to SystemZPostRewrite while removing the SystemZExpandPseudo
pass.
It is in fact preferred to expand these pseudos directly after register
allocation in SystemZPostRewrite since the hinted register combinations are
then not subject to later optimizations.
Review: Ulrich Weigand
https://reviews.llvm.org/D67432
llvm-svn: 371959
D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features.
Reviewers: jdoerfert, uenoku, hfinkel, efriedma
Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65408
llvm-svn: 371942
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.
Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.
There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.
This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.
llvm-svn: 371940
The static analyzer is warning about a potential null dereference - but as we're in DataMemberLayoutItem we should be able to guarantee that the Symbol is a PDBSymbolData type, allowing us to use cast<PDBSymbolData> - and if not assert will fire for us.
llvm-svn: 371933
Masked loads and store fit naturally with MVE, the instructions being easily
predicated. This adds lowering for the simple cases of masked loads and stores.
It does not yet deal with widening/narrowing or pre/post inc, and so is
currently behind an option.
The llvm masked load intrinsic will accept a "passthru" value, dictating the
values used for the zero masked lanes. In MVE the instructions write 0 to the
zero predicated lanes, so we need to match a passthru that isn't 0 (or undef)
with a select instruction to pull in the correct data after the load.
Differential Revision: https://reviews.llvm.org/D67186
llvm-svn: 371932
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=33958
It seems universally true that we would not want to transform this kind of
sequence on any target, but if that's not correct, then we could view this
as a target-specific cost model problem. We could also white-list ConstantInt,
ConstantFP, etc. rather than blacklist Global and ConstantExpr.
Differential Revision: https://reviews.llvm.org/D67362
llvm-svn: 371931
Some VLIW instruction sets are Very Long Indeed. Using uint64_t constricts the Inst encoding to 64 bits (naturally).
This change switches CodeEmitter to a mode that uses APInts when Inst's bitwidth is > 64 bits (NFC for existing targets).
When Inst.BitWidth > 64 the prototype changes to:
void TargetMCCodeEmitter::getBinaryCodeForInstr(const MCInst &MI,
SmallVectorImpl<MCFixup> &Fixups,
APInt &Inst,
APInt &Scratch,
const MCSubtargetInfo &STI);
The Inst parameter returns the encoded instruction, the Scratch parameter is used internally for manipulating operands and is exposed so that the underlying storage can be reused between calls to getBinaryCodeForInstr. The goal is to elide any APInt constructions that we can.
Similarly the operand encoding prototype changes to:
getMachineOpValue(const MCInst &MI, const MCOperand &MO, APInt &op, SmallVectorImpl<MCFixup> &Fixups, const MCSubtargetInfo &STI);
That is, the operand is passed by reference as APInt rather than returned as uint64_t.
To reiterate, this APInt mode is enabled only when Inst.BitWidth > 64, so this change is NFC for existing targets.
llvm-svn: 371928
Summary: This should be obsolete once the functionality in D66967 is integrated.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67231
llvm-svn: 371915
GNU objcopy documents that -B is only useful with architecture-less
input (i.e. "binary" or "ihex"). After D67144, -O defaults to -I, and
-B is essentially a NOP.
* If -O is binary/ihex, GNU objcopy ignores -B.
* If -O is elf*, -B provides the e_machine field in GNU objcopy.
So to convert a blob to an ELF, `-I binary -B i386:x86-64 -O elf64-x86-64` has to be specified.
`-I binary -B i386:x86-64 -O elf64-x86-64` creates an ELF with its
e_machine field set to EM_NONE in GNU objcopy, but a regular x86_64 ELF
in elftoolchain elfcopy. Follow the elftoolchain approach (ignoring -B)
to simplify code. Users that expect their command line portable should
specify -B.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D67215
llvm-svn: 371914
Fixes PR42171.
In GNU objcopy, if -O (--output-target) is not specified, the value is
copied from -I (--input-target).
```
objcopy -I binary -B i386:x86-64 a.txt b # b is copied from a.txt
llvm-objcopy -I binary -B i386:x86-64 a.txt b # b is an x86-64 object file
```
This patch changes our behavior to match GNU. With this change, we can
delete code related to -B handling (D67215).
Reviewed By: jakehehrlich
Differential Revision: https://reviews.llvm.org/D67144
llvm-svn: 371913
Most GNU binutils don't append full stops in error messages. This
convention has been adopted by a bunch of LLVM binary utilities. Make
llvm-ar follow the convention as well.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67558
llvm-svn: 371912
This adds a reproducer dump commands which makes it possible to inspect
a reproducer from inside LLDB. Currently it supports the Files, Commands
and Version providers. I'm planning to add support for the GDB Remote
provider in a follow-up patch.
Differential revision: https://reviews.llvm.org/D67474
llvm-svn: 371909