For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter. The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area. A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.
Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot. This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.
This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.
llvm-svn: 165714
Note: [D]M{T,F}CP2 is just a recommended encoding. Vendors often provide a
custom CP2 that interprets instructions differently and may wish to add their
own instructions that use this opcode. We should ensure that this is easy to
do. I will probably add a 'has custom CP{0-3}' subtarget flag to make this
easy: We want to avoid the GCC situation where every MIPS vendor makes a custom
fork that breaks every other MIPS CPU and so can't be merged upstream.
llvm-svn: 165711
Original message:
The attached is the fix to radar://11663049. The optimization can be outlined by following rules:
(select (x != c), e, c) -> select (x != c), e, x),
(select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.
The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.
While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.
The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".
llvm-svn: 165661
the compiler makes use of GPR0. However, there are two flavors of
GPR0 defined by the target: the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0). The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.
This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.
llvm-svn: 165658
the Altivec extensions were introduced. Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state. Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems. It seems best to avoid this logic within LLVM as well.
This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit). The code remains in place for the
Darwin ABI.
llvm-svn: 165656
- Due to the current matching vector elements constraints in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraints.
llvm-svn: 165631
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
to convert it into a target-specific X86ISD::VFPEXT to work around this
constraints. This patch also reverts a previous attempt to fix this issue by
recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
reduces the overhead of supporting non-power-2 vector FP extend.
llvm-svn: 165625
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.
7 ops is needed, but SDNode with only 6 is created.
In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.
The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.
llvm-svn: 165617
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.
Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.
Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values.
llvm-svn: 165616
Allows the new machine model to be used for NumMicroOps and OutputLatency.
Allows the HazardRecognizer to be disabled along with itineraries.
llvm-svn: 165603
This patch provides initial implementation of load address
macro instruction for Mips. We have implemented two kinds
of expansions with their variations depending on the size
of immediate operand:
1) load address with immediate value directly:
* la d,j => addiu d,$zero,j (for -32768 <= j <= 65535)
* la d,j => lui d,hi16(j)
ori d,d,lo16(j) (for any other 32 bit value of j)
2) load load address with register offset value
* la d,j(s) => addiu d,s,j (for -32768 <= j <= 65535)
* la d,j(s) => lui d,hi16(j) (for any other 32 bit value of j)
ori d,d,lo16(j)
addu d,d,s
This patch does not cover the case when the address is loaded
from the value of the label or function.
Contributer: Vladimir Medic
llvm-svn: 165561
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
llvm-svn: 165488
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.
This patch changes the behavior by just returning a DAG with the
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.
It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.
llvm-svn: 165419
into separate versions for the Darwin and 64-bit SVR4 ABIs. This will
facilitate doing more major surgery on the 64-bit SVR4 ABI in the near future.
llvm-svn: 165336