Emit 32-bit register names instead of 64-bit register names if the target does
not have 64-bit general purpose registers.
<rdar://problem/14653996>
llvm-svn: 205067
Turns out debug_frame does use multiple fragments, so it doesn't
compress correctly with the current approach. Disable compressing it for
now while I figure out what's the best solution for it.
llvm-svn: 205059
WinCOFF cannot form PC relative relocations to support absolute
MCValues. We should reenable this once WinCOFF supports emission of
IMAGE_REL_I386_REL32 relocations.
This fixes PR19272.
llvm-svn: 205058
v2[fi]64 values need to be explicitly passed in VSX registers. This is because
the code in TRI that finds the minimal register class given a register and a
value type will assert if given an Altivec register and a non-Altivec type.
llvm-svn: 205041
This reverts commit r204912, and follow-up commit r204948.
This introduced a performance regression, and the fix is not completely
clear yet.
llvm-svn: 205010
This reverts commit r203553, and follow-up commits r203558 and r203574.
I will follow this up on the mailing list to do it in a way that won't
cause subtle PRE bugs.
llvm-svn: 205009
This was causing my llc to go into an infinite loop on
CodeGen/R600/address-space.ll (just triggered recently by some allocator
changes).
llvm-svn: 205005
These are used in the ARM backends to aid type-checking on patterns involving
intrinsics, by making sure one argument is an extended/truncated version of
another.
However, there's no reason to limit them to just vector types. For example
AArch64 has the instruction "uqshrn sD, dN, #imm" which would naturally use an
intrinsic taking an i64 and returning an i32.
llvm-svn: 205003
BumpPtrAllocator significantly less strange by making it a simple
function of the number of slabs allocated rather than by making it
a recurrence. I *think* the previous behavior was essentially that the
size of the slabs would be doubled after the first 128 were allocated,
and then doubled again each time 64 more were allocated, but only if
every allocation packed perfectly into the slab size. If not, the wasted
space wouldn't be counted toward increasing the size, but allocations
over the size threshold *would*. And since the allocations over the size
threshold might be much larger than the slab size, this could have
somewhat surprising consequences where we rapidly grow the slab size.
This currently requires adding state to the allocator to track the
number of slabs currently allocated, but that isn't too bad. I'm
planning further changes to the allocator that will make this state fall
out even more naturally.
It still doesn't fully decouple the growth rate from the allocations
which are over the size threshold. That fix is coming later.
This specific fix will allow making the entire thing into a more
stateless device and lifting the parameters into template parameters
rather than runtime parameters.
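As a rough sketch of what "a simple function of the number of slabs
allocated" can look like: the helper name computeSlabSize, the doubling
interval of 128 slabs and the shift cap of 30 are assumptions made for
illustration, not necessarily what the allocator ends up using.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: the slab size depends only on how many slabs have
// already been allocated, not on how well earlier allocations packed.
// Assumed policy: double the size once per 128 slabs and cap the shift so
// the growth stays bounded.
inline std::size_t computeSlabSize(std::size_t BaseSlabSize,
                                   std::size_t NumSlabs) {
  const std::size_t Shift = std::min<std::size_t>(30, NumSlabs / 128);
  return BaseSlabSize * (static_cast<std::size_t>(1) << Shift);
}
```

With a 4096-byte base size, slabs 0 through 127 stay at 4096 bytes and slab
128 is the first one allocated at 8192 bytes.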
llvm-svn: 204993
top of the default jit memory manager. This will allow them to be used
as template parameters rather than runtime parameters in a subsequent
commit.
llvm-svn: 204992
As explained in r204976, because of how the allocation of VSX registers
interacts with the call-lowering code, we sometimes end up generating self VSX
copies. Specifically, things like this:
%VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
(where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
This adds a small cleanup pass to remove these prior to post-RA scheduling.
llvm-svn: 204980
Construct a uniform Windows target triple nomenclature which is congruent to the
Linux counterpart. The old triples are normalised to the new canonical form.
This cleans up the long-standing issue of odd naming for various Windows
environments.
There are four different environments on Windows:
MSVC: The MS ABI, MSVCRT environment as defined by Microsoft
GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries
Itanium: The MSVCRT environment + libc++ built with Itanium ABI
Cygnus: The Cygwin environment which uses custom libraries for everything
The following spellings are now written as:
i686-pc-win32 => i686-pc-windows-msvc
i686-pc-mingw32 => i686-pc-windows-gnu
i686-pc-cygwin => i686-pc-windows-cygnus
This should be sufficiently flexible to allow us to target other windows
environments in the future as necessary.
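The three respellings above can be sketched as a plain lookup. The helper
name canonicalizeWindowsTriple is hypothetical and it only covers the
spellings listed in this message; the real normalisation handles triples
component by component.

```cpp
#include <string>

// Hypothetical helper covering only the three spellings listed above; any
// other triple is returned unchanged.
inline std::string canonicalizeWindowsTriple(const std::string &Triple) {
  if (Triple == "i686-pc-win32")
    return "i686-pc-windows-msvc";
  if (Triple == "i686-pc-mingw32")
    return "i686-pc-windows-gnu";
  if (Triple == "i686-pc-cygwin")
    return "i686-pc-windows-cygnus";
  return Triple;
}
```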
llvm-svn: 204977
Because of how the allocation of VSX registers interacts with the call-lowering
code, we sometimes end up generating self VSX copies. Specifically, things like
this:
%VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
(where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
The problem is that ExpandPostRAPseudos always assumes that *some* instruction
has been inserted, and adds implicit defs to it. This is a problem if no copy
was inserted because it can cause subtle problems during post-RA scheduling.
These self copies will have to be removed some other way.
llvm-svn: 204976
First, v2f64 vector extract had not been declared legal (and so the existing
patterns were not being used). Second, the patterns for that, and for
scalar_to_vector, should really be a regclass copy, not a subregister
operation, because the VSX registers directly hold both the vector and scalar data.
llvm-svn: 204971
These operations need to be expanded during legalization so that isel does not
crash. In theory, we might be able to custom lower some of these. That,
however, would need to be follow-up work.
llvm-svn: 204963
1) When creating a .debug_* section, instead create a .zdebug_*
section.
2) When creating a fragment in a .zdebug_* section, make it a compressed
fragment.
3) When computing the size of a compressed section, compress the data
and use the size of the compressed data.
4) Emit the compressed bytes.
Also, check that if a section has a compressed fragment, then that
is the only fragment in the section.
Assert-fail if the fragment's data is modified after it is compressed.
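Step 1 amounts to a rename on section creation. A minimal sketch of just
that renaming, where the helper name compressedSectionName is hypothetical
and the compression itself is left out:

```cpp
#include <string>

// Hypothetical helper: map ".debug_foo" to ".zdebug_foo". Names that do not
// start with ".debug_" (including ones that are already ".zdebug_") pass
// through unchanged.
inline std::string compressedSectionName(const std::string &Name) {
  const std::string Prefix = ".debug_";
  if (Name.compare(0, Prefix.size(), Prefix) == 0)
    return ".zdebug_" + Name.substr(Prefix.size());
  return Name;
}
```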
Initial review on llvm-commits by Eric Christopher and Rafael Espindola.
llvm-svn: 204958
Fixes a miscompile introduced in r204912. It would miscompile code like
(unsigned)(a + -49) <= 5U. The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 48], when
it should not.
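To make the difference concrete, here are the two predicates side by side.
The function names are made up for this note, and the addition is done in
unsigned arithmetic so the sketch has no signed overflow. They agree for a
in [49, 54] and disagree for a in [0, 48].

```cpp
#include <cstdint>

// The original comparison, (unsigned)(a + -49) <= 5U. It is true only when
// a is in [49, 54]; smaller values wrap around to a huge unsigned number.
inline bool originalCompare(std::int32_t a) {
  return static_cast<std::uint32_t>(a) - 49U <= 5U;
}

// The result of the bad transform, (unsigned)a < 55U. It is also true for
// every a in [0, 48], which the original comparison rejects.
inline bool brokenTransform(std::int32_t a) {
  return static_cast<std::uint32_t>(a) < 55U;
}
```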
llvm-svn: 204948
Summary:
No functional change since these predicates are (currently) synonymous.
Extracted from a patch by David Chisnall
His work was sponsored by: DARPA, AFRL
Differential Revision: http://llvm-reviews.chandlerc.com/D3202
llvm-svn: 204943
This adds back r204781.
Original message:
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias2 are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204934
Summary:
Patch by Robert N. M. Watson
His work was sponsored by: DARPA, AFRL
Small corrections by myself.
CC: theraven, matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3199
llvm-svn: 204924
of MCELFStreamer.
This is so that changes to MipsELFStreamer will automatically propagate through
its subclasses.
No functional changes (MipsELFStreamer has the same functionality as MCELFStreamer
at the moment).
Differential Revision: http://llvm-reviews.chandlerc.com/D3130
llvm-svn: 204918
This allows us to insert some hooks before emitting data into an actual object file.
For example, we can capture the register usage for a translation unit by overriding
the EmitInstruction method. The register usage information is needed to generate
.reginfo and .Mips.options ELF sections.
No functional changes.
Differential Revision: http://llvm-reviews.chandlerc.com/D3129
llvm-svn: 204917
Summary:
The short name is quite convenient so provide an accessor for them instead.
No functional change
Depends on D3177
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3178
llvm-svn: 204911
Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated at the asm parsing stage,
since it is impossible to resolve labels at that stage. At the end of the stage we still have
an expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it is still an expression.
This patch introduces a fixup (an MCFixup instance) that is processed after the main encoding stage.
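The idea of deferring the evaluation can be sketched roughly as follows.
PendingFixup and resolveFixup are hypothetical names for illustration and do
not mirror the MCFixup interface; the sketch also assumes l1 is not below l2.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record of an operand of the form (L1 - L2) >> Shift whose
// labels were not yet known when the instruction was parsed.
struct PendingFixup {
  std::string L1;
  std::string L2;
  unsigned Shift;
};

// Once label addresses are known, the expression folds to a plain immediate.
// Throws std::out_of_range if a label is still undefined.
inline std::int64_t
resolveFixup(const PendingFixup &F,
             const std::map<std::string, std::int64_t> &LabelAddr) {
  return (LabelAddr.at(F.L1) - LabelAddr.at(F.L2)) >> F.Shift;
}
```

For example, with l1 at offset 64 and l2 at offset 32, the operand of
'cmp r0, #(l1 - l2) >> 3' resolves to 4.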
llvm-svn: 204899
Summary:
Tested with a unit test because we don't appear to have any transforms
that use this other than ASan, I think.
Fixes PR17935.
Reviewers: nicholas
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3194
llvm-svn: 204866
Functions may be in an instrumented binary but not in the original source
when they're inserted by the compiler or the runtime. These functions
aren't meaningful to the user, so teach llvm-cov to skip over them
instead of crashing.
llvm-svn: 204863
vector list parameter that is using all lanes "{d0[], d2[]}" but can
match an instruction with a "{d0, d2}" parameter.
I’m finishing up a fix for proper checking of the unsupported
alignments on vld/vst instructions and ran into this. Thus I don’t
have a test case at this time. And adding all code that will
demonstrate the bug would obscure the very simple one line fix.
So if you would indulge me on not having a test case at this
time I’ll instead offer up a detailed explanation of what is
going on in this commit message.
This instruction:
vld2.8 {d0[], d2[]}, [r4:64]
is not legal as the alignment can only be 16 when the size is 8.
Per this documentation:
A8.8.325 VLD2 (single 2-element structure to all lanes)
<align> The alignment. It can be one of:
16 2-byte alignment, available only if <size> is 8, encoded as a = 1.
32 4-byte alignment, available only if <size> is 16, encoded as a = 1.
64 8-byte alignment, available only if <size> is 32, encoded as a = 1.
omitted Standard alignment, see Unaligned data access on page A3-108.
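Read as code, the table above comes to something like this. The helper name
isLegalVLD2AllLanesAlign is hypothetical and Align == 0 is my stand-in for
an omitted alignment; this is not the check that goes into the assembler.

```cpp
// Hypothetical helper encoding the <align> table quoted above for VLD2
// (single 2-element structure to all lanes). Align == 0 stands for an
// omitted alignment, which is accepted for every valid size.
inline bool isLegalVLD2AllLanesAlign(unsigned SizeInBits, unsigned Align) {
  unsigned Required;
  switch (SizeInBits) {
  case 8:
    Required = 16;
    break;
  case 16:
    Required = 32;
    break;
  case 32:
    Required = 64;
    break;
  default:
    return false; // not a valid element size for this instruction
  }
  return Align == 0 || Align == Required;
}
```

So the [r4:64] alignment on a vld2.8 is rejected, while [r4:16] or no
alignment at all is accepted.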
So when code is added to the llvm integrated assembler to not match
that instruction because of the alignment it then goes on to try to match
other instructions and comes across this:
vld2.8 {d0, d2}, [r4:64]
and matches it. This is because the method
ARMOperand::isVecListDPairSpaced() is missing the check of the Kind.
In this case the Kind is k_VectorListAllLanes. While the name of the method
may suggest that this is OK it really should check that the Kind is
k_VectorList.
As the method ARMOperand::isDoubleSpacedVectorAllLanes() is what was
used to match {d0[], d2[]} and correctly checks the Kind:
bool isDoubleSpacedVectorAllLanes() const {
return Kind == k_VectorListAllLanes && VectorList.isDoubleSpaced;
}
where the original ARMOperand::isVecListDPairSpaced() does not check
the Kind:
bool isVecListDPairSpaced() const {
if (isSingleSpacedVectorList()) return false;
return (ARMMCRegisterClasses[ARM::DPairSpcRegClassID]
.contains(VectorList.RegNum));
}
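A stripped-down model of the two predicates shows the effect of the missing
check. MockOperand is a hypothetical stand-in for ARMOperand: the register
class lookup is replaced by a plain flag, and the body assumed here for
isSingleSpacedVectorList() is a guess at its shape, not a copy of it.

```cpp
// Hypothetical mock with just enough state to show the Kind check. The
// InDPairSpcRegClass flag replaces the real register class lookup.
struct MockOperand {
  enum KindTy { k_VectorList, k_VectorListAllLanes } Kind;
  bool IsDoubleSpaced;
  bool InDPairSpcRegClass;

  // Assumed shape of the helper used by the predicates below.
  bool isSingleSpacedVectorList() const {
    return Kind == k_VectorList && !IsDoubleSpaced;
  }

  // Without the Kind check: an all-lanes list slips through.
  bool isVecListDPairSpacedUnchecked() const {
    if (isSingleSpacedVectorList())
      return false;
    return InDPairSpcRegClass;
  }

  // With the Kind check: only a plain vector list can match.
  bool isVecListDPairSpacedChecked() const {
    if (Kind != k_VectorList)
      return false;
    if (isSingleSpacedVectorList())
      return false;
    return InDPairSpcRegClass;
  }
};
```

An operand parsed from {d0[], d2[]} has Kind k_VectorListAllLanes, so the
unchecked predicate accepts it and the checked one rejects it.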
Jim Grosbach has reviewed the change and said: Yep, that sounds right. …
And by "right" I mean, "wow, that's a nasty latent bug I'm really, really
glad to see fixed." :)
rdar://16436683
llvm-svn: 204861
This commit consists of two parts.
The first part fixes PR15967. The wrong conclusion was made when the MaxLookup
limit was reached. The fix introduces an out parameter (MaxLookupReached) to
DecomposeGEPExpression that the function aliasGEP can act upon.
The second part introduces the constant MaxLookupSearchDepth to make sure
that DecomposeGEPExpression and GetUnderlyingObject use the same search depth.
This is a small cleanup to clarify the original algorithm.
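The shape of the fix, in a much simplified model: Node, walkToBase and
canTrustBase are hypothetical stand-ins, and the depth of 6 is a value
picked for illustration. The point is only that a bounded walk has to tell
its caller when it gave up early.

```cpp
// Hypothetical model: each Node is one level of address computation and
// Parent points one step closer to the underlying object.
struct Node {
  const Node *Parent;
};

// One shared limit for every walk (the value is assumed for illustration).
static const unsigned MaxLookupSearchDepth = 6;

// Walk towards the base, taking at most MaxLookupSearchDepth steps.
// MaxLookupReached is set when the walk stopped before finding the base.
inline const Node *walkToBase(const Node *N, bool &MaxLookupReached) {
  MaxLookupReached = false;
  unsigned Steps = 0;
  while (N->Parent) {
    if (Steps++ == MaxLookupSearchDepth) {
      MaxLookupReached = true;
      return N;
    }
    N = N->Parent;
  }
  return N;
}

// A caller must not draw conclusions from a truncated walk.
inline bool canTrustBase(const Node *N) {
  bool MaxLookupReached;
  walkToBase(N, MaxLookupReached);
  return !MaxLookupReached;
}
```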
Patch by Karl-Johan Karlsson!
llvm-svn: 204859
These patterns are dead (because v4f32 stores are currently promoted to v4i32
and stored using Altivec instructions), and also are likely not correct
(because they'd store the vector elements in the opposite order from that
assumed by the rest of the Altivec code).
llvm-svn: 204839
These instructions have access to the complete VSX register file. In addition,
they "swap" the order of the elements so that element 0 (the scalar part) comes
first in memory and element 1 follows at a higher address.
llvm-svn: 204838
In some cases it is possible for CGP to attempt to reuse a base address from
another basic block. In those cases we have to be sure that all the address
math was either done at the same bit width, or that none of it overflowed
before it was extended.
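A small arithmetic illustration of why the bit width matters. The function
names are made up for this note and the sketch assumes a 32-bit int, so
that the narrow add wraps modulo 2^32.

```cpp
#include <cstdint>

// Address math done at 32 bits and then zero-extended: the add can wrap
// before the extension happens.
inline std::uint64_t addThenExtend(std::uint32_t Base, std::uint32_t Offset) {
  const std::uint32_t Narrow = Base + Offset; // wraps modulo 2^32
  return static_cast<std::uint64_t>(Narrow);
}

// Operands extended first and then added at 64 bits: no wrap is possible
// for 32-bit inputs.
inline std::uint64_t extendThenAdd(std::uint32_t Base, std::uint32_t Offset) {
  return static_cast<std::uint64_t>(Base) + static_cast<std::uint64_t>(Offset);
}
```

The two agree as long as the narrow add does not overflow. For a base of
0xFFFFFFFF and an offset of 1 the first yields 0 and the second yields
0x100000000, so a base address computed one way cannot be reused for the
other.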
Patch by Louis Gerbarg <lgg@apple.com>
rdar://16307442
llvm-svn: 204833
> For functions where esi is used as base pointer, we would previously fall back
> from lowering memcpy with "rep movs" because that clobbers esi.
>
> With this patch, we just store esi in another physical register, and restore
> it afterwards. This adds a little bit of register pressure, but the more
> efficient memcpy should be worth it.
>
> Differential Revision: http://llvm-reviews.chandlerc.com/D2968
This didn't work. I was ending up with code like this:
lea edi,[esi+38h]
mov ecx,0Fh
mov edx,esi
mov esi,ebx
rep movs dword ptr es:[edi],dword ptr [esi]
lea ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx.
add ebx,3Ch
mov esi,edx
I guess if we want to do this we need stronger glue or something, or to do the expansion
much later.
llvm-svn: 204829
v2i64 needs to be a legal VSX type because it is the SetCC result type from
v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations.
This fixes the lowering for v2f64 VSELECT.
llvm-svn: 204828
This enables TableGen to generate an additional two operand matcher
for our ArithLogicR class of instructions (constituted by 3 register operands).
E.g.: and $1, $2 <=> and $1, $1, $2
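The equivalence can be pictured as an operand rewrite. The helper name
expandTwoOperandForm is hypothetical; TableGen generates a matcher entry
for this rather than rewriting operands at run time.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: turn the two-operand spelling "op $rd, $rt" into the
// three-operand form "op $rd, $rd, $rt". Any other operand count is left
// unchanged.
inline std::vector<std::string>
expandTwoOperandForm(const std::vector<std::string> &Ops) {
  if (Ops.size() != 2)
    return Ops;
  return {Ops[0], Ops[0], Ops[1]};
}
```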
llvm-svn: 204826
parseDirectiveWord is a generic function that parses an expression which
means there's no need for it to have such a specific name. Renaming it to
parseDataDirective so that it can also be used to handle .dword directives[1].
[1]To be added in a follow up commit.
No functional changes.
llvm-svn: 204818
The '.set mips64' directive enables the feature Mips::FeatureMips64
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.
llvm-svn: 204817
The '.set mips64r2' directive enables the feature Mips::FeatureMips64r2
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64r2 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.
llvm-svn: 204815
We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.
rdar://problem/16227836
llvm-svn: 204813
Given that we support multiple directives that enable a particular feature
(e.g. '.set mips16'), it's best to hoist that code into a new function
so that we don't repeat the same pattern w.r.t parsing and handling error cases.
No functional changes.
llvm-svn: 204811
After some discussion on IRC, emitting a call to the library function seems
like a better default, since it will move from a compiler internal error to
a linker error, that the user can work around until LLVM is fixed.
I'm also adding a note on the responsibility of the user to confirm that
the cache was cleared on platforms where nothing is done.
llvm-svn: 204806
The directive '.option pic2' enables PIC from assembly source.
At the moment none of the macros/directives check the PIC bit
but that's going to be fixed relatively soon. For example, the
expansion of macros like 'la' depends on the relocation model.
llvm-svn: 204803
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
in case the caches are unified.
Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.
A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.
llvm-svn: 204802
With VSX there is a real vector select instruction, and so we should use it.
Note that VSELECT will still scalarize for v2f64 because the corresponding
SetCC result type (v2i64) is not currently a legal type.
llvm-svn: 204801
Summary: Added test cases for O32 and N32 on MIPS64.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3175
llvm-svn: 204796
This reverts commit r204781.
I will follow up with the msan folks to see what they
were trying to do with aliases to weak aliases.
llvm-svn: 204784