It introduced two regressions on 64-bit big-endian targets running under N32
(MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and
MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets
comparisons such as BEQ compare the whole GPR64 but incorrectly tell the
instruction selector that they operate on GPR32's. This leads to the
elimination of i32->i64 extensions that are actually required by
comparisons to work correctly.
There's currently a patch under review that fixes this problem.
llvm-svn: 243984
Summary:
This hidden option would disable code generation through FastISel by
default. It was removed from the available options and from the
Fast-ISel tests that required it in order to run the tests.
Reviewers: dsanders
Subscribers: qcolombet, llvm-commits
Differential Revision: http://reviews.llvm.org/D11610
llvm-svn: 243638
Summary:
Previously, we would sign-extend non-boolean negative constants and
zero-extend otherwise. This was problematic for PHI instructions with
negative values that had a type with bitwidth less than that of the
register used for materialization.
More specifically, ComputePHILiveOutRegInfo() assumes the constants
present in a PHI node are zero extended in their container and
afterwards deduces the known bits.
For example, previously we would materialize an i16 -4 with the
following instruction:
addiu $r, $zero, -4
The register would end-up with the 32-bit 2's complement representation
of -4. However, ComputePHILiveOutRegInfo() would generate a constant
with the upper 16-bits set to zero. The SelectionDAG builder would use
that information to generate an AssertZero node that would remove any
subsequent trunc & zero_extend nodes.
In theory, we should modify ComputePHILiveOutRegInfo() to consult
target-specific hooks about the way they prefer to materialize the
given constants. However, git-blame reports that this specific code
has not been touched since 2011 and it seems to be working well for every
target so far.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11592
llvm-svn: 243636
Summary:
Currently, we support only the MIPS O32 ABI calling convention for call
lowering. With this change we avoid using the O32 calling convetion for
lowering calls marked as using the fast calling convention.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11515
llvm-svn: 243485
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11506
llvm-svn: 243469
Current implementation handles unordered comparison poorly in soft-float mode.
Consider (a ULE b) which is a <= b. It is lowered to (ledf2(a, b) <= 0 || unorddf2(a, b) != 0) (in general). We can do better job by lowering it to (__gtdf2(a, b) <= 0).
Such replacement is true for other CMP's (ult, ugt, uge). In general, we just call same function as for ordered case but negate comparison against zero.
Differential Revision: http://reviews.llvm.org/D10804
llvm-svn: 242280
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.
Previous attempts at committing this broke the buildbots due to bugs in IAS.
These bugs have now been fixed so trying again.
Reviewers: petarj
Reviewed By: petarj
Subscribers: srhines, joerg, tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9669
llvm-svn: 238863
Summary:
With this change we are able to realign the stack dynamically, whenever it
contains objects with alignment requirements that are larger than the
alignment specified from the given ABI.
We have to use the $fp register as the frame pointer when we perform
dynamic stack realignment. In complex stack frames, with variably-sized
objects, we reserve additionally the callee-saved register $s7 as the
base pointer in order to reference locals.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8633
llvm-svn: 238829
Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for misp32 r1/r2 .
Based on a patch by Reed Kotler.
Test Plan:
bswap1.ll
test-suite
Reviewers: dsanders, rkotler
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D7219
llvm-svn: 238760
Summary:
Implement the intrinsics memset, memcopy and memmove in MIPS FastISel.
Make some needed infrastructure fixes so that this can work.
Based on a patch by Reed Kotler.
Test Plan:
memtest1.ll
The patch passes test-suite for mips32 r1/r2 and at O0/O2
Reviewers: rkotler, dsanders
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D7158
llvm-svn: 238759
Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel.
Based on a patch by Reed Kotler.
Test Plan:
srem1.ll
div1.ll
test-suite at O0/O2 for mips32 r1/r2
Reviewers: dsanders, rkotler
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D7028
llvm-svn: 238757
Summary: Implement the LLVM IR select statement for MIPS FastISelsel.
Based on a patch by Reed Kotler.
Test Plan:
"Make check" test included now.
Passes test-suite at O2/O0 mips32 r1/r2.
Reviewers: dsanders, rkotler
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D6774
llvm-svn: 238756
Summary:
The contents of the HI/LO registers are unpredictable after the execution of
the MUL instruction. In addition to implicitly defining these registers in the
MUL instruction definition, we have to mark those registers as dead too.
Without this the fast register allocator is running out of registers when the
MUL instruction is followed by another one that tries to allocate the AC0
register.
Based on a patch by Reed Kotler.
Reviewers: dsanders, rkotler
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D9825
llvm-svn: 238755
It caused a smaller number of failures than the previous attempt at committing but still caused a couple on the llvm-linux-mips builder. Reverting while I investigate the remainder.
llvm-svn: 238483
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.
Reviewers: petarj
Reviewed By: petarj
Subscribers: srhines, joerg, tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9669
llvm-svn: 238427
This broke the llvm-mips-linux builder and several of our out-of-tree builders.
Initial investigations show that the commit probably isn't the problem but
reverting anyway while I investigate.
llvm-svn: 238302
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.
This commit uses DW_EH_PE_sdata8 for N64 as far as is possible at the moment.
However, it is possible to end up with DW_EH_PE_sdata4 when a TargetMachine is
not available. There's no risk of issues with inconsistency here since the
tables are self describing but it does mean there is a small chance of the
PC-relative offset being out of range for particularly large programs.
Reviewers: petarj
Reviewed By: petarj
Subscribers: srhines, joerg, tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9669
llvm-svn: 238190
Summary:
-check-prefix replaces the default CHECK prefix rather than adding to it and
must be explicitly re-added.
Also added the N32 cases.
Reviewers: petarj
Reviewed By: petarj
Subscribers: tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9668
llvm-svn: 237790
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.
MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: jfb, sandeep, llvm-commits, rafael
Differential Revision: http://reviews.llvm.org/D9821
llvm-svn: 237789
Summary:
The documentation writes vectors highest-index first whereas LLVM-IR writes
them lowest-index first. As a result, instructions defined in terms of
left_half() and right_half() had the halves reversed.
In addition to correcting them, they have been improved to allow shuffles
that use the same operand twice or in reverse order. For example, ilvev
used to accept masks of the form:
<0, n, 2, n+2, 4, n+4, ...>
but now accepts:
<0, 0, 2, 2, 4, 4, ...>
<n, n, n+2, n+2, n+4, n+4, ...>
<0, n, 2, n+2, 4, n+4, ...>
<n, 0, n+2, 2, n+4, 4, ...>
One further improvement is that splati.[bhwd] is now the preferred instruction
for splat-like operations. The other special shuffles are no longer used
for splats. This lead to the discovery that <0, 0, ...> would not cause
splati.[hwd] to be selected and this has also been fixed.
This fixes the enc-3des test from the test-suite on Mips64r6 with MSA.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9660
llvm-svn: 237689
Summary:
When we are trying to fill the delay slot of a call instruction, we must avoid
filler instructions that use the $ra register. This fixes the test
MultiSource/Applications/JM/lencod when we enable the forward delay slot filler.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9670
llvm-svn: 237362
On Mips, frame pointer points to the same side of the frame as the stack
pointer. This function is used to decide where to put register scavenging
spill slot. So far, it was put on the wrong side of the frame, and thus it
was too far away from $fp when frame was larger than 2^15 bytes.
Patch by Vladimir Radosavljevic.
http://reviews.llvm.org/D8895
llvm-svn: 237153
Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel.
Based on a patch by Reed Kotler.
Test Plan:
"Make check" test forthcoming.
Test-suite passes at O0/O2 and with mips32 r1/r2
Reviewers: rkotler, dsanders
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D6770
llvm-svn: 237121
Summary:
Try to compute addresses when the offset from a memory location is a constant
expression.
Based on a patch by Reed Kotler.
Test Plan:
Passes test-suite for -O0/O2 and mips 32 r1/r2
Reviewers: rkotler, dsanders
Subscribers: llvm-commits, aemerson, rfuhler
Differential Revision: http://reviews.llvm.org/D6767
llvm-svn: 237117
Summary:
In microMIPS, labels need to know whether they are on code or data. This is
indicated with STO_MIPS_MICROMIPS and can be inferred by being followed
by instructions. For empty basic blocks, we can ensure this by emitting the
.insn directive after the label.
Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the
exception handling regression tests under: SingleSource/Regression/C++/EH
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9530
llvm-svn: 236815
Summary:
When using the N64 ABI, element-indices use the i64 type instead of i32.
In many cases, we can use iPTR to account for this but additional patterns
and pseudo's are also required.
This fixes most (but not quite all) failures in the test-suite when using
N64 and MSA together.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9342
llvm-svn: 236494
Summary:
The majority of the checks are subtarget independent. The few that aren't
will be corrected shortly.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9340
llvm-svn: 236220
Summary:
This doesn't make much difference to MIPS32, but it will simplify a
MIPS64r6 bugfix which will follow shortly by removing unnecessary
sign-extension of parameters.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9338
llvm-svn: 236216
Summary:
The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now
accounts for both cases.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits, mohit.bhakkad, sagar
Differential Revision: http://reviews.llvm.org/D9337
llvm-svn: 236099
This reapplies r235194, which was reverted in r235495 because it was causing a
failure in our out-of-tree buildbots for MIPS. With the sign-extension patch
in r235718, this patch doesn't cause any problem any more.
llvm-svn: 235878
Same as r235145 for the call instruction - the justification, tradeoffs,
etc are all the same. The conversion script worked the same without any
false negatives (after replacing 'call' with 'invoke').
llvm-svn: 235755
Summary:
Remove the CHECK-DAG calls introduced in r235341, and add a comment that
this test may break due to scheduling variations.
This patch completes the fix discussed in http://reviews.llvm.org/D8804
Reviewers: dsanders, srhines
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9178
llvm-svn: 235530
Summary:
The 64-bit version of the variable shift instructions uses the
shift_rotate_reg class which uses a GPR32Opnd to specify the variable
shift amount. With this patch we avoid the generation of a redundant
SLL instruction for the variable shift instructions in 64-bit targets.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7413
llvm-svn: 235376
Summary:
Set operation action for FP16 conversion opcodes, so the Op legalizer
can choose the gnu_* libcalls for Mips.
Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to
prevent (fpext (load )) and (store (fptrunc)) from getting combined into
unsupported operations.
Added test cases to test that these operations are handled correctly
for f16 scalars and vectors. This patch depends on
http://reviews.llvm.org/D8755.
Reviewers: srhines
Subscribers: llvm-commits, ab
Differential Revision: http://reviews.llvm.org/D8804
llvm-svn: 235341
Summary: Implement the method FastMaterializeAlloca in Mips fast-isel
Based on a patch by Reed Kotler.
Test Plan:
Passes test-suite at O0/O2 for mips32 r1/r2
fastalloca.ll
Reviewers: dsanders, rkotler
Subscribers: rfuhler, llvm-commits
Differential Revision: http://reviews.llvm.org/D6742
llvm-svn: 235213
Summary:
Add shift operators implementation to fast-isel for Mips. These are shift ops
for non legal forms, i.e. i8 and i16.
Based on a patch by Reed Kotler.
Test Plan:
Reviewers: dsanders
Subscribers: echristo, rfuhler, llvm-commits
Differential Revision: http://reviews.llvm.org/D6726
llvm-svn: 235194
Summary:
Previously, the presence of KILL instructions would block valid candidates
from filling a specific delay slot. With the elimination of the KILL
instructions, in the appropriate range, we are able to fill more slots and
keep the information from future def/use analysis consistent.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D7724
llvm-svn: 235183
See r230786 and r230794 for similar changes to gep and load
respectively.
Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.
When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.
This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.
This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).
No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.
This leaves /only/ the varargs case where the explicit type is required.
Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.
About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.
import fileinput
import sys
import re
pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")
def conv(match, line):
if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
return line
return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]
for line in sys.stdin:
sys.stdout.write(conv(re.search(pat, line), line))
llvm-svn: 235145
This commit makes LLVM not estimate branch probabilities when doing a
single bit bitmask tests.
The code that originally made me discover this is:
if ((a & 0x1) == 0x1) {
..
}
In this case we don't actually have any branch probability information
and should not assume to have any. LLVM transforms this into:
%and = and i32 %a, 1
%tobool = icmp eq i32 %and, 0
So, in this case, the result of a bitwise and is compared against 0,
but nevertheless, we should not assume to have probability
information.
CodeGen/ARM/2013-10-11-select-stalls.ll started failing because the
changed probabilities changed the results of
ARMBaseInstrInfo::isProfitableToIfCvt() and led to an Ifcvt of the
diamond in the test. AFAICT, the test was never meant to test this and
thus changing the test input slightly to not change the probabilities
seems like the best way to preserve the meaning of the test.
llvm-svn: 234979