Commit Graph

27665 Commits

Author SHA1 Message Date
Daniel Sanders bd0e39079a [mips] The decision between GOT_DISP and GOT16 for global addresses depends on ABI rather than MIPS64
Summary: No functional change (for supported use cases)

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3191

llvm-svn: 204922
2014-03-27 12:49:34 +00:00
Zoran Jovanovic ada38ef61b Split the file MipsAsmBackend.cpp in Split the file MipsAsmBackend.cpp and Split the file MipsAsmBackend.h.
Differential Revision: http://llvm-reviews.chandlerc.com/D3134

llvm-svn: 204921
2014-03-27 12:38:40 +00:00
Matheus Almeida a805e85cb2 [mips] Remove unused private field.
llvm-svn: 204919
2014-03-27 12:02:48 +00:00
Matheus Almeida 61218ba798 [mips] NaCl should now use the custom MipsELFStreamer (recently added) in spite
of MCELFStreamer.

This is so that changes to MipsELFStreamer will automatically propagate through
its subclasses.

No functional changes (MipsELFStreamer has the same functionality of MCELFStreamer
at the moment).

Differential Revision: http://llvm-reviews.chandlerc.com/D3130

llvm-svn: 204918
2014-03-27 11:52:20 +00:00
Matheus Almeida dac77fb389 [mips] Implement custom MCELFStreamer.
This allows us to insert some hooks before emitting data into an actual object file.
For example, we can capture the register usage for a translation unit by overriding
the EmitInstruction method. The register usage information is needed to generate
.reginfo and .Mips.options ELF sections.
    
No functional changes.
    
Differential Revision: http://llvm-reviews.chandlerc.com/D3129

llvm-svn: 204917
2014-03-27 11:39:03 +00:00
Erik Verbruggen 59a1219846 InstCombine: merge constants in both operands of icmp.
Transform:
    icmp X+Cst2, Cst
into:
    icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.

llvm-svn: 204912
2014-03-27 11:16:05 +00:00
Daniel Sanders d897b564ca [mips] Stop caching the result of hasMips64(), isABI_O32(), isABI_N32(), and isABI_N64() from MipsSubTarget in MipsTargetLowering
Summary:
The short name is quite convenient so provide an accessor for them instead.

No functional change

Depends on D3177

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3178

llvm-svn: 204911
2014-03-27 10:46:12 +00:00
Elena Demikhovsky bb2f6b72d3 AVX-512: Implemented masking for integer arithmetic & logic instructions.
By Robert Khasanov rob.khasanov@gmail.com

llvm-svn: 204906
2014-03-27 09:45:08 +00:00
Stepan Dyatkovskiy e8747e30ef Rejected r204899 and r204900 due to remaining test failures on cmake-llvm-x86_64-linux buildbot.
llvm-svn: 204901
2014-03-27 08:38:18 +00:00
Stepan Dyatkovskiy 3530003008 Fix for pr18931: Crash using integrated assembler with immediate arithmetic
Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage,
since it is impossible to resolve labels on this stage. In the end of stage we still have
expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it still an expression.
Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage.

llvm-svn: 204899
2014-03-27 07:49:39 +00:00
Jiangning Liu 1d3f2c7c82 ARM: raise error message when complex SO expressions can't really be
solved as a constant at compilation time.

llvm-svn: 204898
2014-03-27 07:42:58 +00:00
Quentin Colombet 3914bf516b [X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64
and v4i64->v4f64.

The new costs match what we did for SSE2 and reflect the reality of our codegen.

<rdar://problem/16381225>

llvm-svn: 204884
2014-03-27 00:52:16 +00:00
Jim Grosbach 72fbde84b8 X86: Correct vectorization cost model for v8f32->v8i8.
Fix the cost model to reflect the reality of our codegen.

rdar://16370633

llvm-svn: 204880
2014-03-27 00:04:11 +00:00
Hal Finkel df3e34d944 [PowerPC] Generate VSX permutations for v2[fi]64 vectors
llvm-svn: 204873
2014-03-26 22:58:37 +00:00
Kevin Enderby 5611398b6b Fix a problem with the ARM assembler incorrectly matching a
vector list parameter that is using all lanes "{d0[], d2[]}" but can
match and instruction with a ”{d0, d2}" parameter.

I’m finishing up a fix for proper checking of the unsupported
alignments on vld/vst instructions and ran into this.  Thus I don’t
have a test case at this time.  And adding all code that will
demonstrate the bug would obscure the very simple one line fix.
So if you would indulge me on not having a test case at this
time I’ll instead offer up a detailed explanation of what is
going on in this commit message.

This instruction:

	vld2.8  {d0[], d2[]}, [r4:64]

is not legal as the alignment can only be 16 when the size is 8.
Per this documentation:

A8.8.325 VLD2 (single 2-element structure to all lanes)
 <align> The alignment. It can be one of:
16 2-byte alignment, available only if <size> is 8, encoded as a = 1.
32 4-byte alignment, available only if <size> is 16, encoded as a = 1.
64 8-byte alignment, available only if <size> is 32, encoded as a = 1.
omitted Standard alignment, see Unaligned data access on page A3-108.

So when code is added to the llvm integrated assembler to not match
that instruction because of the alignment it then goes on to try to match
other instructions and comes across this:

	vld2.8  {d0, d2}, [r4:64]

and and matches it. This is because of the method
ARMOperand::isVecListDPairSpaced() is missing the check of the Kind.
In this case the Kind is k_VectorListAllLanes . While the name of the method
may suggest that this is OK it really should check that the Kind is
k_VectorList.

As the method ARMOperand::isDoubleSpacedVectorAllLanes() is what was
used to match {d0[], d2[]}  and correctly checks the Kind:

  bool isDoubleSpacedVectorAllLanes() const {
    return Kind == k_VectorListAllLanes && VectorList.isDoubleSpaced;
  }

where the original ARMOperand::isVecListDPairSpaced() does not check
the Kind:

  bool isVecListDPairSpaced() const {
    if (isSingleSpacedVectorList()) return false;
    return (ARMMCRegisterClasses[ARM::DPairSpcRegClassID]
              .contains(VectorList.RegNum));
  }

Jim Grosbach has reviewed the change and said:  Yep, that sounds right. …
And by "right" I mean, "wow, that's a nasty latent bug I'm really, really
glad to see fixed." :)

rdar://16436683

llvm-svn: 204861
2014-03-26 21:54:11 +00:00
Hal Finkel 6e28e6aaaf [PowerPC] VSX loads and stores support unaligned access
I've not yet updated PPCTTI because I'm not sure what the actual relative cost
is compared to the aligned uses.

llvm-svn: 204848
2014-03-26 19:39:09 +00:00
Kevin Enderby 8108f38437 Fix the ARM VST4 (single 4-element structure from one lane)
size 16 double-spaced registers instruction printing.

This:
	vld4.16 {d17[1], d19[1], d21[1], d23[1]}, [r7]!

was being printed as:

	vld4.16 {d17[1], d18[1], d19[1], d20[1]}, [r7]!

rdar://16435096

llvm-svn: 204847
2014-03-26 19:35:40 +00:00
Hal Finkel 7279f4b00d [PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions
llvm-svn: 204843
2014-03-26 19:13:54 +00:00
Matt Arsenault 90b733a3cf R600: Add a testcase for sext_in_reg I missed.
This sext_inreg i32 in i64 case was already handled, but not enabled.

llvm-svn: 204840
2014-03-26 18:31:06 +00:00
Hal Finkel ea76a44584 [PowerPC] Remove some dead VSX v4f32 store patterns
These patterns are dead (because v4f32 stores are currently promoted to v4i32
and stored using Altivec instructions), and also are likely not correct
(because they'd store the vector elements in the opposite order from that
assumed by the rest of the Altivec code).

llvm-svn: 204839
2014-03-26 18:26:36 +00:00
Hal Finkel 9281c9a38b [PowerPC] Use VSX vector load/stores for v2[fi]64
These instructions have access to the complete VSX register file. In addition,
they "swap" the order of the elements so that element 0 (the scalar part) comes
first in memory and element 1 follows at a higher address.

llvm-svn: 204838
2014-03-26 18:26:30 +00:00
Hans Wennborg d683a22dd2 Revert "X86 memcpy lowering: use "rep movs" even when esi is used as base pointer" (r204174)
>  For functions where esi is used as base pointer, we would previously fall ba
>  from lowering memcpy with "rep movs" because that clobbers esi.
>
>  With this patch, we just store esi in another physical register, and restore
>  it afterwards. This adds a little bit of register preassure, but the more
>  efficient memcpy should be worth it.
>
>  Differential Revision: http://llvm-reviews.chandlerc.com/D2968

This didn't work. I was ending up with code like this:

  lea     edi,[esi+38h]
  mov     ecx,0Fh
  mov     edx,esi
  mov     esi,ebx
  rep movs dword ptr es:[edi],dword ptr [esi]
  lea     ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx.
  add     ebx,3Ch
  mov     esi,edx

I guess if we want to do this we need stronger glue or something, or doing the expansion
much later.

llvm-svn: 204829
2014-03-26 16:30:54 +00:00
Hal Finkel a6c8b51212 [PowerPC] Add v2i64 as a legal VSX type
v2i64 needs to be a legal VSX type because it is the SetCC result type from
v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations.

This fixes the lowering for v2f64 VSELECT.

llvm-svn: 204828
2014-03-26 16:12:58 +00:00
Matheus Almeida ea06727f03 [mips] Use TwoOperandAliasConstraint for ArithLogicR instructions.
This enables TableGen to generate an additional two operand matcher
for our ArithLogicR class of instructions (constituted by 3 register operands).
E.g.: and $1, $2 <=> and $1, $1, $2

llvm-svn: 204826
2014-03-26 16:09:43 +00:00
Matheus Almeida ab5633b70c [mips] Add support to the '.dword' directive.
The '.dword' directive accepts a list of expressions and emits
them in 8-byte chunks in successive locations.

llvm-svn: 204822
2014-03-26 15:44:18 +00:00
Matheus Almeida 3e2a702aa2 [mips] Rename function in MipsAsmParser.
parseDirectiveWord is a generic function that parses an expression which
means there's no need for it to have such an specific name. Renaming it to
parseDataDirective so that it can also be used to handle .dword directives[1].

[1]To be added in a follow up commit.

No functional changes.

llvm-svn: 204818
2014-03-26 15:24:36 +00:00
Matheus Almeida 3b9c63d29b [mips] Add support to '.set mips64'.
The '.set mips64' directive enables the feature Mips:FeatureMips64
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.

llvm-svn: 204817
2014-03-26 15:14:32 +00:00
Matheus Almeida a2cd009c51 [mips] Add support to '.set mips64r2'.
The '.set mips64r2' directive enables the feature Mips:FeatureMips64r2
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64r2 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.

llvm-svn: 204815
2014-03-26 14:52:22 +00:00
Christian Pirker 3aa0e6a1f9 AArch64_BE function argument passing for ARM ABI
llvm-svn: 204814
2014-03-26 14:51:22 +00:00
Tim Northover 1ff5f29fb5 ARM: add intrinsics for the v8 ldaex/stlex
We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.

rdar://problem/16227836

llvm-svn: 204813
2014-03-26 14:39:31 +00:00
Matheus Almeida fe1e39dcba [mips] Hoist common functionality into a new function.
Given that we support multiple directives that enable a particular feature
(e.g. '.set mips16'), it's best to hoist that code into a new function
so that we don't repeat the same pattern w.r.t parsing and handling error cases.

No functional changes.

llvm-svn: 204811
2014-03-26 14:26:27 +00:00
Renato Golin 93010e687f Change @llvm.clear_cache default to call rt-lib
After some discussion on IRC, emitting a call to the library function seems
like a better default, since it will move from a compiler internal error to
a linker error, that the user can work around until LLVM is fixed.

I'm also adding a note on the responsibility of the user to confirm that
the cache was cleared on platforms where nothing is done.

llvm-svn: 204806
2014-03-26 14:01:32 +00:00
Daniel Sanders 6dd7251599 [mips] The decision to use MO_GOT_PAGE and MO_GOT_OFST depends on the ABI being N32 or N64 not the arch being MIPS64
Summary: No functional change (in supported use cases)

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3177

llvm-svn: 204805
2014-03-26 13:59:42 +00:00
Cameron McInally 4532596b8f Fix AVX512 Gather and Scatter execution domains.
llvm-svn: 204804
2014-03-26 13:50:50 +00:00
Matheus Almeida f79b281421 [mips] Add support for '.option pic2'.
The directive '.option pic2' enables PIC from assembly source.
At the moment none of the macros/directives check the PIC bit
but that's going to be fixed relatively soon. For example, the
expansion of macros like 'la' depend on the relocation model.

llvm-svn: 204803
2014-03-26 13:40:29 +00:00
Renato Golin c0a3c1d66b Add @llvm.clear_cache builtin
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
incase the caches are unified.

Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.

A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.

llvm-svn: 204802
2014-03-26 12:52:28 +00:00
Hal Finkel 732f0f73a7 [PowerPC] Lower VSELECT using xxsel when VSX is available
With VSX there is a real vector select instruction, and so we should use it.
Note that VSELECT will still scalarize for v2f64 because the corresponding
SetCC result type (v2i64) is not currently a legal type.

llvm-svn: 204801
2014-03-26 12:49:28 +00:00
Daniel Sanders a4b0c74765 [mips] The register names depend on the ABI being N32/N64 rather than the arch being mips64
Summary: Added test cases for O32 and N32 on MIPS64.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3175

llvm-svn: 204796
2014-03-26 11:39:07 +00:00
Daniel Sanders 85f482b02f [mips] $s8 is an alias for $fp in all ABI's, not just N32/N64.
llvm-svn: 204793
2014-03-26 11:05:24 +00:00
Rafael Espindola 65481d7b97 Revert "Prevent alias from pointing to weak aliases."
This reverts commit r204781.

I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.

llvm-svn: 204784
2014-03-26 06:14:40 +00:00
Hal Finkel bd4de9d478 [PowerPC] Generate logical vector VSX instructions
These instructions are essentially the same as their Altivec counterparts, but
have access to the larger VSX register file.

llvm-svn: 204782
2014-03-26 04:55:40 +00:00
Rafael Espindola 3b712a84a9 Prevent alias from pointing to weak aliases.
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

llvm-svn: 204781
2014-03-26 04:48:47 +00:00
Quentin Colombet 6f12ae0d5c [X86] Add broadcast instructions to the table used by ExeDepsFix pass.
Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are
across domain (int <-> float).

In particular, prior to this patch we were generating:
  vpbroadcastd  LCPI1_0(%rip), %ymm2
  vpand %ymm2, %ymm0, %ymm0
  vmaxps  %ymm1, %ymm0, %ymm0 ## <- domain change penalty

Now, we generate the following nice sequence where everything is in the float
domain:
  vbroadcastss  LCPI1_0(%rip), %ymm2
  vandps  %ymm2, %ymm0, %ymm0
  vmaxps  %ymm1, %ymm0, %ymm0

<rdar://problem/16354675>

llvm-svn: 204770
2014-03-26 00:10:22 +00:00
Hal Finkel 174e590966 [PowerPC] Select between VSX A-type and M-type FMA instructions just before RA
The VSX instruction set has two types of FMA instructions: A-type (where the
addend is taken from the output register) and M-type (where one of the product
operands is taken from the output register). This adds a small pass that runs
just after MI scheduling (and, thus, just before register allocation) that
mutates A-type instructions (that are created during isel) into M-type
instructions when:

 1. This will eliminate an otherwise-necessary copy of the addend

 2. One of the product operands is killed by the instruction

The "right" moment to make this decision is in between scheduling and register
allocation, because only there do we know whether or not one of the product
operands is killed by any particular instruction. Unfortunately, this also
makes the implementation somewhat complicated, because the MIs are not in SSA
form and we need to preserve the LiveIntervals analysis.

As a simple example, if we have:

%vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
                        %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
  ...
  %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
                        %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
  ...

We can eliminate the copy by changing from the A-type to the
M-type instruction. This means:

  %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
                        %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16

is replaced by:

  %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
                        %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9

and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9

llvm-svn: 204768
2014-03-25 23:29:21 +00:00
Hal Finkel 6c32ff31d0 [PowerPC] Correct commutable indices for VSX FMA instructions
Although the first two operands are the ones that can be swapped, the tied
input operand is listed before them, so we need to adjust for that.

I have a test case for this, but it goes along with an upcoming commit (so it
will come soon).

llvm-svn: 204748
2014-03-25 19:26:43 +00:00
Hal Finkel 25e0454f10 [PowerPC] Add a TableGen relation for A-type and M-type VSX FMA instructions
TableGen will create a lookup table for the A-type FMA instructions providing
their corresponding M-form opcodes. This will be used by upcoming commits.

llvm-svn: 204746
2014-03-25 18:55:11 +00:00
Matt Arsenault 0c274feedf R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp
Remove handling of select_cc, since it makes no sense to be there. This
now does nothing, but I'll be adding some handling of other target nodes
soon.

llvm-svn: 204743
2014-03-25 18:18:27 +00:00
Juergen Ributzka 631c4914b2 [X86TTI] Make constant base pointers for getElementPtr opaque.
If getElementPtr uses a constant as base pointer, then make the constant opaque.
This prevents constant folding it with the offset. The offset can usually be
encoded in the load/store instruction itself and the base address doesn't have
to be rematerialized several times.

llvm-svn: 204739
2014-03-25 18:01:25 +00:00
Juergen Ributzka 5eef98cf7a [Stackmaps][X86TTI] Fix think-o in getIntImmCost calculation.
The cost for the first four stackmap operands was always TCC_Free.
This is only true for the first two operands. All other operands
are TCC_Free if they are within 64bit.

llvm-svn: 204738
2014-03-25 18:01:23 +00:00
Adam Nemet 4beef4c90d [X86] Generate VPSHUFB for in-place v16i16 shuffles
This used to resort to splitting the 256-bit operation into two 128-bit
shuffles and then recombining the results.

Fixes <rdar://problem/16167303>

llvm-svn: 204735
2014-03-25 17:47:06 +00:00
Adam Nemet ac6d6383a3 [X86] Factor out new helper getPSHUFB
I found three implementations of this.  This splits it out into a new function
and uses it from the three places.

My plan is to add a fourth use when lowering a vector_shuffle:v16i16.

Compared the assembly output of test/CodeGen/X86 before and after.

The only change is due to how the first PSHUFB was generated in
LowerVECTOR_SHUFFLEv8i16.  If the shuffle mask specified undef (i.e. -1), the
old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the
control mask.  Now we write 0x80.  These are of course interchangeable since
bit 7 decides if a constant zero is written in the result byte.  The other
instances of this code use 0x80 consistently.

Related to <rdar://problem/16167303>

llvm-svn: 204734
2014-03-25 17:47:03 +00:00
Daniel Sanders 71a89d92f6 [mips] '.set at=$0' should be equivalent to '.set noat'
Differential Revision: http://llvm-reviews.chandlerc.com/D3171

llvm-svn: 204714
2014-03-25 13:01:06 +00:00
Cameron McInally 45dc489403 Fix AVX2 Gather execution domains.
llvm-svn: 204713
2014-03-25 12:36:38 +00:00
Daniel Sanders b1d7e53a26 [mips] Correct testcase for .set at=$reg and emit the new warnings for numeric registers too.
Summary:
Remove the XFAIL added in my previous commit and correct the test such that
it correctly tests the expansion of the assembler temporary.

Also added a test to check that $at is always $1 when written by the
user.

Corrected the new assembler temporary warnings so that they are emitted for
numeric registers too.

Differential Revision: http://llvm-reviews.chandlerc.com/D3169

llvm-svn: 204711
2014-03-25 11:16:03 +00:00
Daniel Sanders e231ae9e3a [mips] Fix assembler temporary expansion and add associated warnings about the use of $at.
Summary:
The assembler temporary is normally $at ($1) but can be reassigned using
'.set at=$reg'. Regardless of which register is nominated as the assembler
temporary, $at remains $1 when written by the user.

Adds warnings under the following conditions:
* The register nominated as the assembler temporary is used by the user.
* '.set noat' is in effect and $at is used by the user.
Both of these only work for named registers. I have a follow up commit that makes it work for numeric registers as well.

XFAIL set-at-directive.s since it incorrectly tests that $at is redefined by
'.set at=$reg'. Testcases will follow in a separate commit.

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

Differential Revision: http://llvm-reviews.chandlerc.com/D3167

llvm-svn: 204710
2014-03-25 10:57:07 +00:00
Kevin Enderby 89299400ac Fix crashes when assembler directives are used that are not
for Mach-O object files by generating an error instead.

rdar://16335232

llvm-svn: 204687
2014-03-25 00:05:50 +00:00
Matt Arsenault db8b1d5b6c R600: Don't viewCFG() under DEBUG() except on failure.
Having these popping up every time you use -debug is really
irritating.

llvm-svn: 204664
2014-03-24 20:29:02 +00:00
Matt Arsenault 684dc80b6d R600/SI: Fix extra mov from legalizing 64-bit SALU ops.
Check the register class of each operand individually
to avoid an extra copy to a vgpr.

llvm-svn: 204662
2014-03-24 20:08:13 +00:00
Matt Arsenault 248b7b6ba1 R600/SI: Sub-optimial fix for 64-bit immediates with SALU ops.
No longer asserts, but now you get moves loading legal immediates
into the split 32-bit operations.

llvm-svn: 204661
2014-03-24 20:08:09 +00:00
Matt Arsenault f35182c783 R600/SI: Fix 64-bit bit ops that require the VALU.
Try to match scalar and first like the other instructions.
Expand 64-bit ands to a pair of 32-bit ands since that is not
available on the VALU.

llvm-svn: 204660
2014-03-24 20:08:05 +00:00
Matt Arsenault a7f1e0c44f R600: Implement isNarrowingProfitable.
llvm-svn: 204658
2014-03-24 19:43:31 +00:00
Matt Arsenault bd9958038c R600/SI: Move splitting 64-bit immediates to separate function.
llvm-svn: 204651
2014-03-24 18:26:52 +00:00
Ulrich Weigand cae3a17a21 [PowerPC] Generate little-endian object files
As a first step towards real little-endian code generation, this patch
changes the PowerPC MC layer to actually generate little-endian object
files.  This involves passing the little-endian flag through the various
layers, including down to createELFObjectWriter so we actually get basic
little-endian ELF objects, emitting instructions in little-endian order,
and handling fixups and relocations as appropriate for little-endian.

The bulk of the patch is to update most test cases in test/MC/PowerPC
to verify both big- and little-endian encodings.  (The only test cases
*not* updated are those that create actual big-endian ABI code, like
the TLS tests.)

Note that while the object files are now little-endian, the generated
code itself is not yet updated, in particular, it still does not adhere
to the ELFv2 ABI.

llvm-svn: 204634
2014-03-24 18:16:09 +00:00
Quentin Colombet 2d5c156b96 [X86][ISelDAG] Add missing fallback patterns for avx2 broadcast instructions.
Those patterns are used when the load cannot be folded into the related broadcast
during the select phase.
This happens when the load gets additional uses that were not anticipated during
the previous lowering phases (constant vector to constant load, then constant
load reused) or when selection DAG is not able to prove that folding the load
will not create a cycle in the DAG.

<rdar://problem/16074331>

llvm-svn: 204631
2014-03-24 17:54:19 +00:00
Matt Arsenault ad41d7b531 R600/SI: Fix 64-bit private loads.
llvm-svn: 204630
2014-03-24 17:50:46 +00:00
Adam Nemet b47372f555 [X86] Fix non-determinism in LowerVectorAllZeroTest
This can be observed with the old testcase of CodeGen/X86/pr12312.ll:

47c47
<       vorps   %ymm0, %ymm1, %ymm0
---
>       vorps   %ymm1, %ymm0, %ymm0
97c97
<       vorps   %ymm1, %ymm0, %ymm0
---
>       vorps   %ymm0, %ymm1, %ymm0

The vector VecIns is populated with all the values from VecInMap. This is done
while iterating VecInMap.  VecInMap uses a hash of pointer values so the
resulting order can vary depending on the memory layout.

The fix is to populate the vector VecIns earlier as VecInMap is populated.
This is done in DAG traversal order.

Fixes <rdar://problem/16398806>

llvm-svn: 204623
2014-03-24 16:52:08 +00:00
Daniel Sanders d89b13625e [mips] Add error message when trying to use $at in '.set noat' mode.
Summary:
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

Differential Revision: http://llvm-reviews.chandlerc.com/D3158

llvm-svn: 204621
2014-03-24 16:48:01 +00:00
Eli Bendersky 6de2087ea7 Removes the NVPTXSplitBBatBar pass.
This pass is a historic remnant and actually causes less efficient code to be
generated in some cases.

llvm-svn: 204620
2014-03-24 16:36:39 +00:00
Tom Stellard 8c12fd9252 R600/SI: Fix warning with gcc 4.8.2
llvm-svn: 204618
2014-03-24 16:12:34 +00:00
Tom Stellard da99c6eff5 R600/SI: Promote fp64 SELECT to i64
This type promotion is replacing a Tablegen pattern and it is already
covered by existing tests.

llvm-svn: 204617
2014-03-24 16:07:30 +00:00
Tom Stellard 2c1c9de151 R600: Reorganize tablegen instruction definitions
Each GPU family now has its own file.

llvm-svn: 204615
2014-03-24 16:07:25 +00:00
Will Schmidt 114777e47f [PPC64LE] ELFv2 ABI updates for the .opd section
[PPC64LE] ELFv2 ABI updates for the .opd section
The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as
such, does not have a ".opd" section.  This is keyed off a _CALL_ELF=2
macro check.

The CALL_ELF check is not clearly documented at this time.  The basis
for usage in this patch is from the gcc thread here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html

> Adding comment from Uli:
Looks good to me.  I think the old-style JIT doesn't really work
anyway for 64-bit, but at least with this patch LLVM will compile
and link again on a ppc64le host ...

llvm-svn: 204614
2014-03-24 16:04:15 +00:00
Daniel Sanders 01f9fc06e7 [mips] Allow dsubu to take an immediate as an alias for dsubiu.
Summary:
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

Differential Revision: http://llvm-reviews.chandlerc.com/D3155

llvm-svn: 204611
2014-03-24 15:38:00 +00:00
Hal Finkel e01d32107c [PowerPC] Mark many instructions as commutative
I'm under the impression that we used to infer the isCommutable flag from the
instruction-associated pattern. Regardless, we don't seem to do this (at least
by default) any more. I've gone through all of our instruction definitions, and
marked as commutative all of those that should be trivial to commute (by
exchanging the first two operands). There has been special code for the RL*
instructions, and that's not changed.

Before this change, we had the following commutative instructions:

 RLDIMI
 RLDIMIo
 RLWIMI
 RLWIMI8
 RLWIMI8o
 RLWIMIo
 XSADDDP
 XSMULDP
 XVADDDP
 XVADDSP
 XVMULDP
 XVMULSP

After:

 ADD4
 ADD4o
 ADD8
 ADD8o
 ADDC
 ADDC8
 ADDC8o
 ADDCo
 ADDE
 ADDE8
 ADDE8o
 ADDEo
 AND
 AND8
 AND8o
 ANDo
 CRAND
 CREQV
 CRNAND
 CRNOR
 CROR
 CRXOR
 EQV
 EQV8
 EQV8o
 EQVo
 FADD
 FADDS
 FADDSo
 FADDo
 FMADD
 FMADDS
 FMADDSo
 FMADDo
 FMSUB
 FMSUBS
 FMSUBSo
 FMSUBo
 FMUL
 FMULS
 FMULSo
 FMULo
 FNMADD
 FNMADDS
 FNMADDSo
 FNMADDo
 FNMSUB
 FNMSUBS
 FNMSUBSo
 FNMSUBo
 MULHD
 MULHDU
 MULHDUo
 MULHDo
 MULHW
 MULHWU
 MULHWUo
 MULHWo
 MULLD
 MULLDo
 MULLW
 MULLWo
 NAND
 NAND8
 NAND8o
 NANDo
 NOR
 NOR8
 NOR8o
 NORo
 OR
 OR8
 OR8o
 ORo
 RLDIMI
 RLDIMIo
 RLWIMI
 RLWIMI8
 RLWIMI8o
 RLWIMIo
 VADDCUW
 VADDFP
 VADDSBS
 VADDSHS
 VADDSWS
 VADDUBM
 VADDUBS
 VADDUHM
 VADDUHS
 VADDUWM
 VADDUWS
 VAND
 VAVGSB
 VAVGSH
 VAVGSW
 VAVGUB
 VAVGUH
 VAVGUW
 VMADDFP
 VMAXFP
 VMAXSB
 VMAXSH
 VMAXSW
 VMAXUB
 VMAXUH
 VMAXUW
 VMHADDSHS
 VMHRADDSHS
 VMINFP
 VMINSB
 VMINSH
 VMINSW
 VMINUB
 VMINUH
 VMINUW
 VMLADDUHM
 VMULESB
 VMULESH
 VMULEUB
 VMULEUH
 VMULOSB
 VMULOSH
 VMULOUB
 VMULOUH
 VNMSUBFP
 VOR
 VXOR
 XOR
 XOR8
 XOR8o
 XORo
 XSADDDP
 XSMADDADP
 XSMAXDP
 XSMINDP
 XSMSUBADP
 XSMULDP
 XSNMADDADP
 XSNMSUBADP
 XVADDDP
 XVADDSP
 XVMADDADP
 XVMADDASP
 XVMAXDP
 XVMAXSP
 XVMINDP
 XVMINSP
 XVMSUBADP
 XVMSUBASP
 XVMULDP
 XVMULSP
 XVNMADDADP
 XVNMADDASP
 XVNMSUBADP
 XVNMSUBASP
 XXLAND
 XXLNOR
 XXLOR
 XXLXOR

This is a by-inspection change, and I'm not sure how to write a reliable test
case. I would like advice on this, however.

llvm-svn: 204609
2014-03-24 15:07:28 +00:00
Daniel Sanders a771fefb72 [mips] Implement shorthand add / sub forms for MIPS.
Summary:
- If only two registers are passed to a three-register operation, then the
  first argument is both source and destination register.

- If a non-register is passed as the last argument, generate the immediate
  version of the instruction.

Also mark DADD commutative and add scheduling information (to the generic
scheduler), and implement DSUB.

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

CC: theraven

Differential Revision: http://llvm-reviews.chandlerc.com/D3148

llvm-svn: 204605
2014-03-24 14:05:39 +00:00
Justin Holewinski ba2fa6de4f [NVPTX] Add isel patterns for addrspacecast
llvm-svn: 204600
2014-03-24 11:17:53 +00:00
Hal Finkel 32854b0439 [PowerPC] Don't schedule VSX copy legalization unless VSX is enabled
There is no need to schedule this extra pass if it will have nothing to do.

llvm-svn: 204594
2014-03-24 09:51:41 +00:00
Hal Finkel bbad2332e3 [PowerPC] Update comment re: VSX copy-instruction selection
I've done some experimentation with this, and it looks like using the
lower-latency (but lower throughput) copy instruction is essentially always the
right thing to do.

My assumption is that, in order to be relatively sure that the higher-latency
copy will increase throughput, we'd want to have it unlikely to be in-flight
with its use. On the P7, the global completion table (GCT) can hold a maximum
of 120 instructions, shared among all active threads (up to 4), giving 30
instructions per thread.  So specifically, I'd require at least that many
instructions between the copy and the use before the high-latency variant is
used.

Trying this, however, over the entire test suite resulted in zero cases where
the high-latency form would be preferable. This may be a consequence of the
fact that the scheduler views copies as free, and so they tend to end up close
to their uses. For this experiment I created a function:

  unsigned chooseVSXCopy(MachineBasicBlock &MBB,
                         MachineBasicBlock::iterator I,
                         unsigned DestReg, unsigned SrcReg,
                         unsigned StartDist = 1,
                         unsigned Depth = 3) const;

with an implementation like:

  if (!Depth)
    return PPC::XXLOR;

  const unsigned MaxDist = 30;
  unsigned Dist = StartDist;
  for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) {
    if (J->isTransient() && !J->isCopy())
      continue;

    if (J->isCall() || J->isReturn() || J->readsRegister(DestReg, TRI))
      return PPC::XXLOR;

    ++Dist;
  }

  // We've exceeded the required distance for the high-latency form, use it.
  if (Dist > MaxDist)
    return PPC::XVCPSGNDP;

  // If this is only an exit block, use the low-latency form.
  if (MBB.succ_empty())
    return PPC::XXLOR;

  // We've reached the end of the block, check the successor blocks (up to some
  // depth), and use the high-latency form if that is okay with all successors.
  for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) {
    if (chooseVSXCopy(**J, (*J)->begin(), DestReg, SrcReg,
                      Dist, --Depth) == PPC::XXLOR)
      return PPC::XXLOR;
  }

  // All of our successor blocks seem okay with the high-latency variant, so
  // we'll use it.
  return PPC::XVCPSGNDP;

and then changed the copy opcode selection from:
    Opc = PPC::XXLOR;
to:
    Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg);

In conclusion, I'm removing the FIXME from the comment, because I believe that
there is, at least absent other examples, nothing to fix.

llvm-svn: 204591
2014-03-24 09:36:36 +00:00
Arnaud A. de Grandmaison 1182600f20 ARM: no need to update SplatBits as it is not used
llvm-svn: 204575
2014-03-23 21:14:32 +00:00
Nuno Lopes 31617266ea remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playwing with.
Appologies in advance if I removed someone's WIP code.

 include/llvm/CodeGen/MachineSSAUpdater.h            |    1 
 include/llvm/IR/DebugInfo.h                         |    3 
 lib/CodeGen/MachineSSAUpdater.cpp                   |   10 --
 lib/CodeGen/PostRASchedulerList.cpp                 |    1 
 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    |   10 --
 lib/IR/DebugInfo.cpp                                |   12 --
 lib/MC/MCAsmStreamer.cpp                            |    2 
 lib/Support/YAMLParser.cpp                          |   39 ---------
 lib/TableGen/TGParser.cpp                           |   16 ---
 lib/TableGen/TGParser.h                             |    1 
 lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |    9 --
 lib/Target/ARM/ARMCodeEmitter.cpp                   |   12 --
 lib/Target/ARM/ARMFastISel.cpp                      |   84 --------------------
 lib/Target/Mips/MipsCodeEmitter.cpp                 |   11 --
 lib/Target/Mips/MipsConstantIslandPass.cpp          |   12 --
 lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              |   21 -----
 lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |    2 
 lib/Target/PowerPC/PPCFastISel.cpp                  |    1 
 lib/Transforms/Instrumentation/AddressSanitizer.cpp |    2 
 lib/Transforms/Instrumentation/BoundsChecking.cpp   |    2 
 lib/Transforms/Instrumentation/MemorySanitizer.cpp  |    1 
 lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |    8 -
 lib/Transforms/Scalar/SCCP.cpp                      |    1 
 utils/TableGen/CodeEmitterGen.cpp                   |    2 
 24 files changed, 2 insertions(+), 261 deletions(-)

llvm-svn: 204560
2014-03-23 17:09:26 +00:00
Hal Finkel 4a912250fa [PowerPC] Make use of VSX f64 <-> i64 conversion instructions
When VSX is available, these instructions should be used in preference to the
older variants that only have access to the scalar floating-point registers.

llvm-svn: 204559
2014-03-23 05:35:00 +00:00
Craig Topper a9253267a9 Prune includes in ARM target.
llvm-svn: 204548
2014-03-22 23:51:00 +00:00
Saleem Abdulrasool 44419fc3cd ARM IAS: properly handle function entries in .thumb
When a label is parsed, check if there is type information available for the
label.  If so, check if the symbol is a function.  If the symbol is a function
and we are in thumb mode and no explicit thumb_func has been emitted, adjust the
symbol data to indicate that the function definition is a thumb function.

The application of this inferencing is improved value handling in the object
file (the required thumb bit is set on symbols which are thumb functions).  It
also helps improve compatibility with binutils.

The one complication that arises from this handling is the MCAsmStreamer.  The
default implementation of getOrCreateSymbolData in MCStreamer does not support
tracking the symbol data.  In order to support the semantics of thumb functions,
track symbol data in assembly streamer.  Although O(n) in number of labels in
the TU, this is already done in various other streamers and as such the memory
overhead is not a practical concern in this scenario.

llvm-svn: 204544
2014-03-22 19:26:18 +00:00
Hal Finkel 55805eb562 [PowerPC] Fix the VSX v2f64 return register
v2f64 values, like other 128-bit values, are returned under VSX in register
vs34 (Altivec register v2).

llvm-svn: 204543
2014-03-22 18:24:43 +00:00
Chad Rosier b7747e31ef [AArch64] Add SchedRW lists to NEON instructions.
Previously, only regular AArch64 instructions were annotated with SchedRW lists.
This patch does the same for NEON enabling these instructions to be scheduled by
the MIScheduler. Additionally, store operations are now modeled and a few
SchedRW lists were updated for bug fixes (e.g. multiple def operands).

Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 204505
2014-03-21 19:34:41 +00:00
Matt Arsenault 8e2581b11e R600/SI: Move instruction patterns to scalar versions.
Some of them also had the pattern on both, so this removes the
duplication.

llvm-svn: 204492
2014-03-21 18:01:18 +00:00
Daniel Sanders f88a29e66a [mips] Correct lowering of VECTOR_SHUFFLE to VSHF.
Summary:
VECTOR_SHUFFLE concatenates the vectors in an vectorwise fashion.
  <0b00, 0b01> + <0b10, 0b11> -> <0b00, 0b01, 0b10, 0b11>
VSHF concatenates the vectors in a bitwise fashion:
  <0b00, 0b01> + <0b10, 0b11> ->
  0b0100       + 0b1110       -> 0b01001110
                                 <0b10, 0b11, 0b00, 0b01>
We must therefore swap the operands to get the correct result.

The test case that discovered the issue was MultiSource/Benchmarks/nbench.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3142

llvm-svn: 204480
2014-03-21 16:56:51 +00:00
Tom Stellard 1583409e33 R600/SI: Handle MUBUF instructions in SIInstrInfo::moveToVALU()
llvm-svn: 204476
2014-03-21 15:51:57 +00:00
Tom Stellard e038720702 R600/SI: Handle S_MOV_B64 in SIInstrInfo::moveToVALU()
llvm-svn: 204475
2014-03-21 15:51:54 +00:00
Tom Stellard def38c567d R600/SI: Use SGPR_(32|64) reg clases when lowering SI_ADDR64_RSRC
The SReg_(32|64) register classes contain special registers in addition
to the numbered SGPRs.  This can lead to machine verifier errors when
these register classes are used as sub-registers for SReg_128, since
SReg_128 only uses the numbered SGPRs.

Replacing SReg_(32|64) with SGPR_(32|64) fixes this problem, since
the SGPR_(32|64) register classes contain only numbered SGPRs.

Tests cases for this are comming in a later commit.

llvm-svn: 204474
2014-03-21 15:51:53 +00:00
Richard Sandiford 5676585886 [SystemZ] Use "let Predicates =" for blocks of new instructions
...instead of a separate Requires for each one.  This style was already
used in some places and seems more compact.

No behavioral change intended.

llvm-svn: 204452
2014-03-21 11:04:54 +00:00
Richard Sandiford dc6c2c953d [SystemZ] Add support for z196 float<->unsigned conversions
These complement the older float<->signed instructions.

llvm-svn: 204451
2014-03-21 10:56:30 +00:00
Matheus Almeida 518b4f9fcf [mips] Update namespace.
We should be using the llvm namespace and not an anonymous namespace
in a header file.

llvm-svn: 204450
2014-03-21 10:35:14 +00:00
Juergen Ributzka f0dff49ad0 [Constant Hoisting] Make the constant materialization cost operand dependent
Extend the target hook to take also the operand index into account when
calculating the cost of the constant materialization.

Related to <rdar://problem/16381500>

llvm-svn: 204435
2014-03-21 06:04:45 +00:00
Jiangning Liu db55b02e1c This reverts commit r203762, "ARM: support emission of complex SO expressions".
The commit r203762 introduced silent failure for complext SO expression, and it's even worse than compiler crash.

llvm-svn: 204427
2014-03-21 02:51:01 +00:00
Kevin Qin b2c78b07d6 [AArch64] Remove .data_region directive from AArch64.
.data_region is only used in Darwin, so it shouldn't be generated
for other OS. Currently AArch64 doesn't support darwin yet, so
I removed it from AArch64. When Darwin is supported someday, we can
add it back and associate it with Darwin.

llvm-svn: 204424
2014-03-21 02:12:48 +00:00
Weiming Zhao 0152485679 Fix PR19136: [ARM] Fix Folding SP Update into vpush/vpop
Sicne MBB->computeRegisterLivenes() returns Dead for sub regs like s0,
d0 is used in vpop instead of updating sp, which causes s0 dead before
its use.

This patch checks the liveness of each subreg to make sure the reg is
actually dead.

llvm-svn: 204411
2014-03-20 23:28:16 +00:00
Juergen Ributzka 46357931ab Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."
I will break this up into smaller pieces for review and recommit.

llvm-svn: 204393
2014-03-20 20:17:13 +00:00
Juergen Ributzka 6dab520c70 [Constant Hoisting] Extend coverage of the constant hoisting pass.
This commit extends the coverage of the constant hoisting pass, adds additonal
debug output and updates the function names according to the style guide.

Related to <rdar://problem/16381500>

llvm-svn: 204389
2014-03-20 19:55:52 +00:00
Matt Arsenault 99395fa98f R600: Remove unused method declaration.
llvm-svn: 204357
2014-03-20 16:41:06 +00:00
Kai Nacke 93fe5e810d [MIPS] Add cpu octeon and some instructions
The Octeon cpu from Cavium Networks is mips64r2 based and has an extended
instruction set. In order to utilize this with LLVM, a new cpu feature "octeon"
and a subtarget feature "cnmips" is added. A small set of new instructions
(baddu, dmul, pop, dpop, seq, sne) is also added. LLVM generates dmul, pop and
dpop instructions with option -mcpu=octeon or -mattr=+cnmips.

llvm-svn: 204337
2014-03-20 11:51:58 +00:00
Zoran Jovanovic a0f5328984 Provide an operand for microMIPS wait instruction.
llvm-svn: 204329
2014-03-20 10:41:37 +00:00
Zoran Jovanovic 87d13e5ec1 Implementation of microMIPS 16-bit instructions MOVE and JALR.
Differential Revision: http://llvm-reviews.chandlerc.com/D3112

llvm-svn: 204325
2014-03-20 10:18:24 +00:00
Zoran Jovanovic 28221d8bc1 Mark alias symbols as microMIPS if necessary. Differential Revision: http://llvm-reviews.chandlerc.com/D3080
llvm-svn: 204323
2014-03-20 09:44:49 +00:00
Matheus Almeida 9e1450bce9 [mips] Splitting up class definition from implementation.
Also removed some unnecessary #includes.

No functional changes.

llvm-svn: 204320
2014-03-20 09:29:54 +00:00
Alexey Samsonov 94bc422d7a Add llvm_unreachable after fully-covered switches to appease GCC
llvm-svn: 204318
2014-03-20 07:30:40 +00:00
Saleem Abdulrasool 39f773f939 Reapply 'ARM IAS: support .thumb_set'
Re-apply the change after it was reverted to do conflicts due to another change
being reverted.

llvm-svn: 204306
2014-03-20 06:05:33 +00:00
Craig Topper 38afbfdd76 [X86] Check return value of readSIB in disassembler so errors propagate. In particular this makes a too short instruction with a missing SIB byte fail.
llvm-svn: 204305
2014-03-20 05:56:00 +00:00
Hao Liu 40b5ab8e5b [ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR.
llvm-svn: 204304
2014-03-20 05:36:59 +00:00
Rafael Espindola 7fadc0ea7d Look through variables when computing relocations.
Given

bar = foo + 4
	.long bar

MC would eat the 4. GNU as includes it in the relocation. The rule seems to be
that a variable that defines a symbol is used in the relocation and one that
does not define a symbol is evaluated and the result included in the relocation.

Fixing this unfortunately required some other changes:

* Since the variable is now evaluated, it would prevent the ELF writer from
  noticing the weakref marker the elf streamer uses. This patch then replaces
  that with a VariantKind in MCSymbolRefExpr.

* Using VariantKind then requires us to look past other VariantKind to see

	.weakref	bar,foo
	call	bar@PLT

  doing this also fixes

	zed = foo +2
	call zed@PLT

  so that is a good thing.

* Looking past VariantKind means that the relocation selection has to use
  the fixup instead of the target.

This is a reboot of the previous fixes for MC. I will watch the sanitizer
buildbot and wait for a build before adding back the previous fixes.

llvm-svn: 204294
2014-03-20 02:12:01 +00:00
Matt Arsenault dd78b8059b R600/SI: Add unused LDS 2 form instructions.
llvm-svn: 204275
2014-03-19 22:19:56 +00:00
Matt Arsenault d06ebd93e6 R600/SI: Add support for 64-bit LDS writes
llvm-svn: 204274
2014-03-19 22:19:54 +00:00
Matt Arsenault b943348cb9 R600/SI: Add support for 64-bit LDS loads.
v2:
  -Use correct opcode for DS_READ_64

llvm-svn: 204273
2014-03-19 22:19:52 +00:00
Matt Arsenault 99ed78926b R600/SI: Match i16 immediate offset of LDS instructions.
llvm-svn: 204272
2014-03-19 22:19:49 +00:00
Matt Arsenault 547aff20f5 R600/SI: Don't display the GDS bit.
It isn't actually used now, and probably never will be, plus it makes
tests less annoying. I also think SC prints GDS instructions as a
separate instruction name.

llvm-svn: 204270
2014-03-19 22:19:43 +00:00
Matt Arsenault 9cd8c38a32 R600/SI: Merge offset0 and offset1 fields for single address DS instructions v2
Also remove unused data fields from the DS_Load_Helper class.

v2:
  - Merge fields for DS_WRITE

llvm-svn: 204269
2014-03-19 22:19:39 +00:00
Matheus Almeida c11f305082 [mips] 80-column.
llvm-svn: 204252
2014-03-19 16:29:06 +00:00
Craig Topper c6d4efa1e5 Prune includes in X86 target.
llvm-svn: 204216
2014-03-19 06:53:25 +00:00
Rafael Espindola 7bbd5c2636 Revert "Add back r203962, r204028 and r204059."
This reverts commit r204178.

llvm-svn: 204203
2014-03-19 00:13:43 +00:00
Rafael Espindola 574bfa12fa Add back r203962, r204028 and r204059.
This reverts commit r204137.

This includes a fix for handling aliases of aliases.

llvm-svn: 204178
2014-03-18 20:40:38 +00:00
Hans Wennborg aec21ce43e X86 memcpy lowering: use "rep movs" even when esi is used as base pointer
For functions where esi is used as base pointer, we would previously fall back
from lowering memcpy with "rep movs" because that clobbers esi.

With this patch, we just store esi in another physical register, and restore
it afterwards. This adds a little bit of register preassure, but the more
efficient memcpy should be worth it.

Differential Revision: http://llvm-reviews.chandlerc.com/D2968

llvm-svn: 204174
2014-03-18 20:04:34 +00:00
Manuel Jacob dcb78dbc82 X86: Use enums for memory operand decoding instead of integer literals.
Summary:
X86BaseInfo.h defines an enum for the offset of each operand in a memory operand
sequence.  Some code uses it and some does not.  This patch replaces (hopefully)
all remaining locations where an integer literal was used instead of this enum.
No functionality change intended.

Reviewers: nadav

CC: llvm-commits, t.p.northover

Differential Revision: http://llvm-reviews.chandlerc.com/D3108

llvm-svn: 204158
2014-03-18 16:14:11 +00:00
Krzysztof Parzyszek 4d38c82575 Enable CFI on Hexagon.
llvm-svn: 204157
2014-03-18 16:02:37 +00:00
Bill Schmidt ff9622ef0e Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0.
When converting a signed 32-bit integer to double-precision floating point on
hardware without a lfiwax instruction, we have to instead use a lfd followed
by fcfid.  We were erroneously offsetting the address by 4 bytes in
preparation for either a lfiwax or lfiwzx when generating the lfd.  This fixes
that silly error.

This was not caught in the test suite since the conversion tests were run with
-mcpu=pwr7, which implies availability of lfiwax.  I've added another test
case for older hardware that checks the code we expect in the absence of
lfiwax and other flavors of fcfid.  There are fewer tests in this test case
because we punt to DAG selection in more cases on older hardware.  (We must
generate complex fiddly sequences in those cases, and there is marginal
benefit in duplicating that logic in fast-isel.)

llvm-svn: 204155
2014-03-18 14:32:50 +00:00
Alexander Kornienko 64de613751 Revert r203962 and two revisions depending on it: r204028 and r204059.
The revision I'm reverting breaks handling of transitive aliases. This blocks us
and breaks sanitizer bootstrap:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2651
(and checked locally by Alexey).

This revision is the result of:

  svn merge -r204059:204058 -r204028:204027 -r203962:203961 .

+ the regression test added to test/MC/ELF/alias.s

Another way to reproduce the regression with clang:
  $ cat q.c
  void a1();
  void a2() __attribute__((alias("a1")));
  void a3() __attribute__((alias("a2")));
  void a1() {}

  $ ~/work/llvm-build/bin/clang-3.5-good -c q.c && mv q.o good.o && \
      ~/work/llvm-build/bin/clang-3.5-bad -c q.c && mv q.o bad.o && \
      objdump -t good.o bad.o

    good.o:     file format elf64-x86-64

    SYMBOL TABLE:
    0000000000000000 l    df *ABS*  0000000000000000 q.c
    0000000000000000 l    d  .text  0000000000000000 .text
    0000000000000000 l    d  .data  0000000000000000 .data
    0000000000000000 l    d  .bss   0000000000000000 .bss
    0000000000000000 l    d  .comment       0000000000000000 .comment
    0000000000000000 l    d  .note.GNU-stack        0000000000000000 .note.GNU-stack
    0000000000000000 l    d  .eh_frame      0000000000000000 .eh_frame
    0000000000000000 g     F .text  0000000000000006 a1
    0000000000000000 g     F .text  0000000000000006 a2
    0000000000000000 g     F .text  0000000000000006 a3



    bad.o:     file format elf64-x86-64

    SYMBOL TABLE:
    0000000000000000 l    df *ABS*  0000000000000000 q.c
    0000000000000000 l    d  .text  0000000000000000 .text
    0000000000000000 l    d  .data  0000000000000000 .data
    0000000000000000 l    d  .bss   0000000000000000 .bss
    0000000000000000 l    d  .comment       0000000000000000 .comment
    0000000000000000 l    d  .note.GNU-stack        0000000000000000 .note.GNU-stack
    0000000000000000 l    d  .eh_frame      0000000000000000 .eh_frame
    0000000000000000 g     F .text  0000000000000006 a1
    0000000000000000 g     F .text  0000000000000006 a2
    0000000000000000 g       .text  0000000000000000 a3

llvm-svn: 204137
2014-03-18 10:36:11 +00:00
Alon Mishne ad312155a6 [C++11] Change DebugInfoFinder to use range-based loops
Also changes the iterators to return actual DI type over MDNode.

llvm-svn: 204130
2014-03-18 09:41:07 +00:00
Craig Topper 26696314d5 [C++11] Mark the target fast isel classes as 'final' so that the compiler can de-virtualize some of the internal calls.
llvm-svn: 204123
2014-03-18 07:27:13 +00:00
Saleem Abdulrasool 42b233a836 ARM: add an assertion
Add an assertion that a valid section is referenced.  The potential NULL pointer
dereference was identified by the clang static analyzer.

llvm-svn: 204114
2014-03-18 05:26:55 +00:00
Matt Arsenault f45faaf30d Make methods static
llvm-svn: 204085
2014-03-17 22:23:09 +00:00
Matt Arsenault fae02989b7 R600: Match sign_extend_inreg to BFE instructions
llvm-svn: 204072
2014-03-17 18:58:11 +00:00
Adam Nemet 8a130a5f86 [X86] Fix unused variable warning with NDEBUG from r204058
llvm-svn: 204063
2014-03-17 17:32:53 +00:00
Saleem Abdulrasool 11543a9953 ARM IAS: support .thumb_set
This performs the equivalent of a .set directive in that it creates a symbol
which is an alias for another symbol or value which may possibly be yet
undefined.  This directive also has the added property in that it marks the
aliased symbol as being a thumb function entry point, in the same way that the
.thumb_func directive does.

The current implementation fails one test due to an unrelated issue.  Functions
within .thumb sections are not marked as thumb_func.  The result is that
the aliasee function is not valued correctly.

llvm-svn: 204059
2014-03-17 17:13:54 +00:00
Adam Nemet 24381f1cb7 [VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16
Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get
promoted to fp_to_sint v8f32->v8i32.  This is a legal operation on AVX.

For that to work properly, we also need to teach the legalizer about the
specific promotion required here.  The default vector promotion uses
bitcasting to a vector type of the same total size.  We want to promote the
vector element type, effectively widening the operation and then truncating
the result.  This is analogous to the current logic of how int_to_fp is
promoted.

The change also factors out some code from the int_to_fp promotion code to
ValueType::widenIntegerVectorElementType.  This is now shared between
int_to_fp and fp_to_int.

There is no longer need for the custom lowering of fp_to_sint f32->v8i16 in
X86.  It can now go through the new target-independent fp_to_*int promotion
logic.

I also checked that no other target uses Promote for these ops yet, so there
shouldn't be any unexpected change in behavior.

Fixes <rdar://problem/16202247>

llvm-svn: 204058
2014-03-17 17:06:14 +00:00
Tom Stellard d0084464b5 R600/SI: Fix implementation of isInlineConstant() used by the verifier
The type of the immediates should not matter as long as the encoding is
equivalent to the encoding of one of the legal inline constants.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204056
2014-03-17 17:03:52 +00:00
Tom Stellard fbe435de63 R600/SI: Use correct dest register class for V_READFIRSTLANE_B32
This instructions writes to an 32-bit SGPR.  This change required adding
the 32-bit VCC_LO and VCC_HI registers, because the full VCC register
is 64 bits.

This fixes verifier errors on several of the indirect addressing piglit
tests.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204055
2014-03-17 17:03:51 +00:00
Tom Stellard ca700e41ef R600/SI: Add generic checks to SIInstrInfo::verifyInstruction()
Added checks for number of operands and operand register classes.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204054
2014-03-17 17:03:49 +00:00
Lang Hames 7c8189c6d3 [X86] New and improved VZeroUpperInserter optimization.
- Adds support for inserting vzerouppers before tail-calls.
  This is enabled implicitly by having MachineInstr::copyImplicitOps preserve
  regmask operands, which allows VZeroUpperInserter to see where tail-calls use
  vector registers.

- Fixes a bug that caused the previous version of this optimization to miss some
  vzeroupper insertion points in loops. (Loops-with-vector-code that followed
  loops-without-vector-code were mistakenly overlooked by the previous version).

- New algorithm never revisits instructions.

Fixes <rdar://problem/16228798>

llvm-svn: 204021
2014-03-17 01:22:54 +00:00
Arnaud A. de Grandmaison 75c9e6dedf Remove some dead assignements found by scan-build
llvm-svn: 204013
2014-03-15 22:13:15 +00:00
Patrik Hagglund 8d09a6c674 Replace ValueTypes.h with MachineValueType.h if possible.
Utilize the previous move of MVT to a separate header for all trivial
cases (that don't need any further restructuring).

Reviewed By: Tim Northover

llvm-svn: 204003
2014-03-15 09:11:41 +00:00
Matt Arsenault ea330fbe49 R600: Remove unnecessary attempt to zext a pointer.
Private pointers are now always 32-bits.

llvm-svn: 203989
2014-03-15 00:08:26 +00:00
Matt Arsenault 74891cdefe R600: Code cleanup.
Use sign_extend_inreg and getZeroExtendInReg instead of
using the bit operations they expand into.

llvm-svn: 203988
2014-03-15 00:08:22 +00:00
Duncan P. N. Exon Smith a824862664 x86: Add missing break to getCallPreservedMask()
This change brings getCallPreservedMask()'s logic in line with
getCalleeSavedRegs().

While this changes the control flow slightly, the change is not
currently observable.  is64Bit must be false to get to the accidental
fallthrough, but the case that we fall into (coldcc) does nothing unless
is64Bit is true.

llvm-svn: 203943
2014-03-14 16:29:21 +00:00
Duncan P. N. Exon Smith fea3c8afd6 x86: NFC: Make getCallPreservedMask() more similar to getCalleeSavedRegs()
Changing order of checks in getCallPreservedMask() to match
getCalleeSavedRegs() so that the logic is easier to compare.

llvm-svn: 203939
2014-03-14 16:09:13 +00:00
Duncan P. N. Exon Smith 8f66a3afe0 x86: getCalleeSavedRegs() would crash on 0 (so don't default to it)
The current logic assumes that MF is not 0.  Assert that it isn't, and
remove the default of 0 from the header.

llvm-svn: 203934
2014-03-14 15:38:12 +00:00
Ulrich Weigand f445399870 [ppc64] Avoid copy relocs in named rodata sections
Commit r181723 introduced code to avoid placing initialized variables
needing relocations into the .rodata section, which avoid copy relocs
that do not work as expected on ppc64 function references.

The same treatment is also needed for *named* .rodata.XXX sections.
This patch changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal
to modify "Kind" *before* calling the default SelectSectionForGlobal
routine, instead of first calling the default routine and then just
checking for the (main) .rodata section afterwards.

llvm-svn: 203921
2014-03-14 12:45:22 +00:00
Evgeniy Stepanov 49e2625144 AddressSanitizer instrumentation for MOV and MOVAPS.
This is an initial version of *Sanitizer instrumentation of assembly code.

Patch by Yuri Gorshenin.

llvm-svn: 203908
2014-03-14 08:58:04 +00:00
Rafael Espindola 2fb5bc33a3 Remove the linker_private and linker_private_weak linkages.
These linkages were introduced some time ago, but it was never very
clear what exactly their semantics were or what they should be used
for. Some investigation found these uses:

* utf-16 strings in clang.
* non-unnamed_addr strings produced by the sanitizers.

It turns out they were just working around a more fundamental problem.
For some sections a MachO linker needs a symbol in order to split the
section into atoms, and llvm had no idea that was the case. I fixed
that in r201700 and it is now safe to use the private linkage. When
the object ends up in a section that requires symbols, llvm will use a
'l' prefix instead of a 'L' prefix and things just work.

With that, these linkages were already dead, but there was a potential
future user in the objc metadata information. I am still looking at
CGObjcMac.cpp, but at this point I am convinced that linker_private
and linker_private_weak are not what they need.

The objc uses are currently split in

* Regular symbols (no '\01' prefix). LLVM already directly provides
whatever semantics they need.
* Uses of a private name (start with "\01L" or "\01l") and private
linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm
agrees with clang on L being ok or not for a given section. I have two
patches in code review for this.
* Uses of private name and weak linkage.

The last case is the one that one could think would fit one of these
linkages. That is not the case. The semantics are

* the linker will merge these symbol by *name*.
* the linker will hide them in the final DSO.

Given that the merging is done by name, any of the private (or
internal) linkages would be a bad match. They allow llvm to rename the
symbols, and that is really not what we want. From the llvm point of
view, these objects should really be (linkonce|weak)(_odr)?.

For now, just keeping the "\01l" prefix is probably the best for these
symbols. If we one day want to have a more direct support in llvm,
IMHO what we should add is not a linkage, it is just a hidden_symbol
attribute. It would be applicable to multiple linkages. For example,
on weak it would produce the current behavior we have for objc
metadata. On internal, it would be equivalent to private (and we
should then remove private).

llvm-svn: 203866
2014-03-13 23:18:37 +00:00
Owen Anderson 16c6bf49b7 Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing
operator* on the by-operand iterators to return a MachineOperand& rather than
a MachineInstr&.  At this point they almost behave like normal iterators!

Again, this requires making some existing loops more verbose, but should pave
the way for the big range-based for-loop cleanups in the future.

llvm-svn: 203865
2014-03-13 23:12:04 +00:00
Rafael Espindola 4269b9eed5 Use printable names to implement directional labels.
This changes the implementation of local directional labels to use a dedicated
map. With that it can then just use CreateTempSymbol, which is what the rest
of MC uses.

CreateTempSymbol doesn't do a great job at making sure the names are unique
(or being efficient when the names are not needed), but that should probably
be fixed in a followup patch.

This fixes pr18928.

llvm-svn: 203826
2014-03-13 18:09:26 +00:00
Tom Stellard 08ef1233c6 R600: LDS instructions shouldn't implicitly define OQAP
LDS instructions are pseudo instructions which model
the OQAP defs and uses within a single instruction.

This fixes a hang in the opencv MedianFilter tests.

llvm-svn: 203818
2014-03-13 17:13:04 +00:00