to store to a stack location for outgoing args.
During call arg lowering we shouldn't be modifying SP so cache the SP copy
vreg for subsequent uses.
Gives a 0.2% geomean code size improvement on CTMark.
Differential Revision: https://reviews.llvm.org/D77838
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one. This is a
follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.
With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.
ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.
NFCI for AMDGPU, Mips and X86.
Differential Revision: https://reviews.llvm.org/D63551
llvm-svn: 364512
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.
This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.
I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.
Compile time:
Program base cse diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%
test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%
test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%
Geomean difference -0.2%
Code size:
Program base cse diff
test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%
Geomean difference -0.9%
Differential Revision: https://reviews.llvm.org/D60580
llvm-svn: 358369
We currently handle all aggregates by creating one large LLT, and letting the
legalizer deal with splitting them up. However using this approach means that
we can't support big endian code correctly.
This patch changes the way that the IRTranslator deals with aggregate values,
by splitting them up into their constituent element values. To do this, parts
of the translator need to be modified to deal with multiple VRegs for a single
Value.
A new Value to VReg mapper is introduced to help keep compile time under
control, currently there is no measurable impact on CTMark despite the extra
code being generated in some cases.
Patch is based on the original work of Tim Northover.
Differential Revision: https://reviews.llvm.org/D46018
llvm-svn: 332449
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.
llvm-svn: 323922
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,
%1(s64) = COPY %0(s64)
would now be written as
%1:gpr(s64) = COPY %0(s64)
While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.
Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.
llvm-svn: 316479
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.
llvm-svn: 311388
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D32304
llvm-svn: 304779
MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.
Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.
llvm-svn: 288980
When the entry block was empty after arg lowering, we were always placing
constants at the end. This is probably hamrless while translating the same
block, but horribly wrong once its terminator has been translated. So switch to
inserting at the beginning.
llvm-svn: 288720
This makes it more similar to the floating-point constant, and also allows for
larger constants to be translated later. There's no real functional change in
this patch though, just syntax updates.
llvm-svn: 288712