to pass around a struct instead of a large set of individual values. This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
llvm-svn: 157479
instruction encodings can be excluded during mips16 processing.
This revision fixes the issue raised by Jim Grosbach.
bool hasStandardEncoding() const { return !inMips16Mode(); }
When micromips is added it will be
bool StandardEncoding() const { return !inMips16Mode()&& !inMicroMipsMode(); }
No additional testing is needed other than to assure that there is no regression
from this patch.
Patch by Reed Kotler.
llvm-svn: 157234
32-bit offset jump tables just use real branch instructions and so aren't
marked as data regions. We were still emitting the .end_data_region
marker though, which assert()ed.
rdar://11499158
llvm-svn: 157221
The current code will generate a prologue which starts with something like:
mflr 0
stw 31, -4(1)
stw 0, 4(1)
stwu 1, -16(1)
But under the PPC32 SVR4 ABI, access to negative offsets from R1 is not allowed.
This was pointed out by Peter Bergner.
llvm-svn: 157133
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.
Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.
data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"
The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.
rdar://11459456
llvm-svn: 157062
the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use. Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025
llvm-svn: 157019
llc to recognize MIPS16 as a MIPS ASE extension. -mips16 will mean the
mips16 ASE for mips32 by default.
As part of fixing of adding this we discovered some small changes that
need to be made to MipsInstrInfo::storeRegToStackSLot and
MipsInstrInfo::loadRegFromStackSlot. We were using some "==" equality tests
where in fact we should have been using Mips::<regclas>.hasSubClassEQ instead,
per suggestion of Jakob Stoklund Olesen.
Patch by Reed Kotler.
llvm-svn: 156958
The purpose of this option is to silence error messages issued by machine
verifier passes and enable them to run to the end. If this option is not
provided, -verify-machineinstrs complains when it discovers there is a
non-terminator instruction (an instruction that is in a delay slot) after the
first terminator in a basic block.
llvm-svn: 156790
the ones that get or set the frame index for the $gp save slot.
Remove the piece of code in MipsFunctionInfo::getGlobalBaseReg() which returns
GP. This function should always return a virtual register.
llvm-svn: 156695
- Stop creating stack frame objects needed for saving $gp.
- Insert a node that copies the global pointer register to register $gp
before the call node. This will ensure $gp is valid at the entry of the
called function.
llvm-svn: 156692
- Stop emitting instructions needed to initialize the global pointer register.
- Stop emitting .cprestore directive.
- Do not take into account the $gp save slot when computing stack size.
llvm-svn: 156691
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.
llvm-svn: 156689
pointer register.
This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:
- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
allocator decide which register to assign to it or whether spill/reloads are
needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.
llvm-svn: 156671
This patch will optimize the following cases:
sub r1, r3 | sub r1, imm
cmp r3, r1 or cmp r1, r3 | cmp r1, imm
bge L1
TO
subs r1, r3
bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
llvm-svn: 156599
This patch will optimize the following cases:
sub r1, r3 | sub r1, imm
cmp r3, r1 or cmp r1, r3 | cmp r1, imm
bge L1
TO
subs r1, r3
bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
llvm-svn: 156550
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).
So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.
Patch by Yiannis Tsiouris!
llvm-svn: 156328
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:
PreA
SuperRC ----------> RCA
| |
| |
PreB | | SubA
| |
| |
V V
RCB ----------> SubRC
SubB
This can be used to coalesce copies like:
%vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2
llvm-svn: 156317
This patch will optimize -(x != 0) on X86
FROM
cmpl $0x01,%edi
sbbl %eax,%eax
notl %eax
TO
negl %edi
sbbl %eax %eax
In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;
rdar: 10961709
llvm-svn: 156312
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
llvm-svn: 156233
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
default: assert(0 && "Unknown condition code");
^
1 warning generated.
The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.
llvm-svn: 156209
The new target machines are:
nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX
The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.
NV_CONTRIB
llvm-svn: 156196
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).
llvm-svn: 156162
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.
llvm-svn: 156144
for the assembler and disassembler. Which were not being set/read correctly
for offsets greater than 22 bits in some cases.
Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!
llvm-svn: 156118
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.
This ensures that targets define register classes as intended.
llvm-svn: 156046
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65536]
immediate result, effectively the same as a :lower16: variant kind.
rdar://10550147
llvm-svn: 155941
in order to avoid assertion failures in the register scavenger. The assertion
failures were “Bad machine code: Using an undefined physical register” and
“Bad machine code: MBB exits via unconditional fall-through but its successor
differs from its CFG successor!”.
llvm-svn: 155930
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0
FROM
movl %edi, %ecx
subl %esi, %ecx
cmpl %edi, %esi
movl $0, %eax
cmovll %ecx, %eax
TO
xorl %eax, %eax
subl %esi, %edi
cmovll %eax, %edi
movl %edi, %eax
rdar: 10734411
llvm-svn: 155919
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468
llvm-svn: 155902
Replace some assert() calls w/ actual diagnostics. In a perfect world,
there'd be range checks on these values long before things ever reached
this code. For now, though, issuing a better-late-than-never diagnostic
is still a big improvement over assert().
rdar://11347287
llvm-svn: 155851
This was exposed by SingleSource/UnitTests/Vector/constpool.c.
The computed size of a basic block isn't always a multiple of its known
alignment, and that can introduce extra alignment padding after the
block.
<rdar://problem/11347135>
llvm-svn: 155845
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.
(this time, actually commit what was reviewed!)
llvm-svn: 155825
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal. Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.
llvm-svn: 155824
The code could search past the end of the basic block when there was
already a constant pool entry after the block.
Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c
llvm-svn: 155753
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.
rdar://11219154
llvm-svn: 155748
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.
llvm-svn: 155745
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:
RoundUpToAlignment(Offset + UnknownPadding)
This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.
This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.
The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.
<rdar://problem/11339352>
llvm-svn: 155744
x == -y --> x+y == 0
x != -y --> x+y != 0
On x86, the generated code goes from
negl %esi
cmpl %esi, %edi
je .LBB0_2
to
addl %esi, %edi
je .L4
This case is correctly handled for ARM with "cmn".
Patch by Manman Ren.
rdar://11245199
PR12545
llvm-svn: 155739
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.
Fixes PR6679. Patch by Christoph Erhardt!
llvm-svn: 155704