Commit Graph

67939 Commits

Author SHA1 Message Date
Kevin Enderby 89299400ac Fix crashes when assembler directives are used that are not
for Mach-O object files by generating an error instead.

rdar://16335232

llvm-svn: 204687
2014-03-25 00:05:50 +00:00
Manman Ren 9db66b3d34 Register Allocator: refactoring (no functionality change).
Factor out two functions calculateRegionSplitCost and doRegionSplit
from tryRegionSplit. These two functions will be used in coming patches.

rdar://16162005

llvm-svn: 204684
2014-03-24 23:23:42 +00:00
David Blaikie 84d8e18f2b DebugInfo: Simplify debug loc list handling by keeping separate lists
Rather than using a flat list with "empty" entries (ala the actual
on-disk format), keep separate lists for each variable.

llvm-svn: 204680
2014-03-24 22:38:38 +00:00
David Blaikie 34ec5d07e1 DwarfDebug: Simplify debug_loc merging
No functional change intended.

Merging up-front rather than delaying this task until later. This just
seems simpler and more efficient (avoiding growing the debug loc list
only to have to skip over those post-merged entries, etc).

llvm-svn: 204679
2014-03-24 22:27:06 +00:00
Adrian Prantl c95ec91e2a Get rid of an unnecessary use of the * and & operators.
llvm-svn: 204673
2014-03-24 21:33:01 +00:00
David Blaikie 96dea0581e DebugInfo: Add DW_AT_GNU_ranges_base to skeleton CUs
This is used to avoid relocations in the dwo file by allowing
DW_AT_ranges specified in debug_info.dwo to be relative to this base
address. (r204668 implements the base-relative DW_AT_ranges side of
this)

llvm-svn: 204672
2014-03-24 21:31:35 +00:00
David Blaikie 26b2bd04fd DebugInfo: Implement relative addressing for DW_AT_ranges under fission
This removes the debug_ranges relocations from debug_info.dwo (but
doesn't implement the DW_AT_GNU_ranges_base which is also necessary for
correct functioning)

llvm-svn: 204668
2014-03-24 21:07:27 +00:00
David Blaikie 3c9a3cc495 DebugInfo: Don't emit relocations to abbreviations in debug_info.dwo
llvm-svn: 204667
2014-03-24 20:53:02 +00:00
David Blaikie f72ed5f9ed DwarfDebug: Remove an unused parameter
llvm-svn: 204665
2014-03-24 20:31:01 +00:00
Matt Arsenault db8b1d5b6c R600: Don't viewCFG() under DEBUG() except on failure.
Having these popping up every time you use -debug is really
irritating.

llvm-svn: 204664
2014-03-24 20:29:02 +00:00
David Blaikie d82b237785 Remove unused parameter
llvm-svn: 204663
2014-03-24 20:28:10 +00:00
Matt Arsenault 684dc80b6d R600/SI: Fix extra mov from legalizing 64-bit SALU ops.
Check the register class of each operand individually
to avoid an extra copy to a vgpr.

llvm-svn: 204662
2014-03-24 20:08:13 +00:00
Matt Arsenault 248b7b6ba1 R600/SI: Sub-optimal fix for 64-bit immediates with SALU ops.
No longer asserts, but now you get moves loading legal immediates
into the split 32-bit operations.

llvm-svn: 204661
2014-03-24 20:08:09 +00:00
Matt Arsenault f35182c783 R600/SI: Fix 64-bit bit ops that require the VALU.
Try to match the scalar and first, like the other instructions.
Expand 64-bit ands to a pair of 32-bit ands since that is not
available on the VALU.

llvm-svn: 204660
2014-03-24 20:08:05 +00:00
Yaron Keren 7b085a4799 In Release modes, Visual Studio complains that the Operator destructor in User.cpp
never returns, which is true by design. 

Initially assumed that the reason is llvm_unreachable being dependent on NDEBUG.

However, even if llvm_unreachable is replaced by __assume(false), VC still warns in
Release modes but not in Debug modes...

The real reason turned out to be optimization flags.
With /Od in Debug modes the warning is not issued whereas with /O1 it is.

I could not find any documentation to this effect, but it is reproducible:

Try compiling http://msdn.microsoft.com/en-us/library/khwfyc5d(v=vs.90).aspx
with /O1 and then with /Od.

llvm-svn: 204659
2014-03-24 19:48:13 +00:00
Matt Arsenault a7f1e0c44f R600: Implement isNarrowingProfitable.
llvm-svn: 204658
2014-03-24 19:43:31 +00:00
Matt Arsenault bd9958038c R600/SI: Move splitting 64-bit immediates to separate function.
llvm-svn: 204651
2014-03-24 18:26:52 +00:00
Ulrich Weigand cae3a17a21 [PowerPC] Generate little-endian object files
As a first step towards real little-endian code generation, this patch
changes the PowerPC MC layer to actually generate little-endian object
files.  This involves passing the little-endian flag through the various
layers, including down to createELFObjectWriter so we actually get basic
little-endian ELF objects, emitting instructions in little-endian order,
and handling fixups and relocations as appropriate for little-endian.

The bulk of the patch is to update most test cases in test/MC/PowerPC
to verify both big- and little-endian encodings.  (The only test cases
*not* updated are those that create actual big-endian ABI code, like
the TLS tests.)

Note that while the object files are now little-endian, the generated
code itself is not yet updated, in particular, it still does not adhere
to the ELFv2 ABI.

llvm-svn: 204634
2014-03-24 18:16:09 +00:00
Quentin Colombet 2d5c156b96 [X86][ISelDAG] Add missing fallback patterns for avx2 broadcast instructions.
Those patterns are used when the load cannot be folded into the related broadcast
during the select phase.
This happens when the load gets additional uses that were not anticipated during
the previous lowering phases (constant vector to constant load, then constant
load reused) or when selection DAG is not able to prove that folding the load
will not create a cycle in the DAG.

<rdar://problem/16074331>

llvm-svn: 204631
2014-03-24 17:54:19 +00:00
Matt Arsenault ad41d7b531 R600/SI: Fix 64-bit private loads.
llvm-svn: 204630
2014-03-24 17:50:46 +00:00
Adam Nemet b47372f555 [X86] Fix non-determinism in LowerVectorAllZeroTest
This can be observed with the old testcase of CodeGen/X86/pr12312.ll:

47c47
<       vorps   %ymm0, %ymm1, %ymm0
---
>       vorps   %ymm1, %ymm0, %ymm0
97c97
<       vorps   %ymm1, %ymm0, %ymm0
---
>       vorps   %ymm0, %ymm1, %ymm0

The vector VecIns is populated with all the values from VecInMap. This is done
while iterating VecInMap.  VecInMap uses a hash of pointer values so the
resulting order can vary depending on the memory layout.

The fix is to populate the vector VecIns earlier as VecInMap is populated.
This is done in DAG traversal order.

Fixes <rdar://problem/16398806>

llvm-svn: 204623
2014-03-24 16:52:08 +00:00
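
The ordering problem described in r204623 above can be illustrated with a small standalone sketch. It is not the LLVM code: `Node` and `collectInVisitOrder` are hypothetical names, and the only idea carried over from the commit message is that the vector is filled while the map is being populated, so the map's pointer-dependent iteration order never reaches the output.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a DAG node; this is not an LLVM type.
struct Node {
  std::string Name;
};

// Returns each distinct node once, in first-visit order.
// The map is keyed on pointers, so iterating it would give an order that
// depends on memory layout. Here it is used only for membership tests,
// and the vector is appended to as the map is populated.
std::vector<const Node *>
collectInVisitOrder(const std::vector<const Node *> &Visited) {
  std::unordered_map<const Node *, unsigned> Seen;
  std::vector<const Node *> Ordered;
  for (const Node *N : Visited) {
    // emplace() reports whether the key was newly inserted.
    if (Seen.emplace(N, static_cast<unsigned>(Ordered.size())).second)
      Ordered.push_back(N);
  }
  return Ordered;
}
```

With this shape, two runs that visit the nodes in the same order produce the same vector regardless of where the nodes happen to be allocated.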
Daniel Sanders d89b13625e [mips] Add error message when trying to use $at in '.set noat' mode.
Summary:
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

Differential Revision: http://llvm-reviews.chandlerc.com/D3158

llvm-svn: 204621
2014-03-24 16:48:01 +00:00
Eli Bendersky 6de2087ea7 Removes the NVPTXSplitBBatBar pass.
This pass is a historic remnant and actually causes less efficient code to be
generated in some cases.

llvm-svn: 204620
2014-03-24 16:36:39 +00:00
Tom Stellard 8c12fd9252 R600/SI: Fix warning with gcc 4.8.2
llvm-svn: 204618
2014-03-24 16:12:34 +00:00
Tom Stellard da99c6eff5 R600/SI: Promote fp64 SELECT to i64
This type promotion is replacing a Tablegen pattern and it is already
covered by existing tests.

llvm-svn: 204617
2014-03-24 16:07:30 +00:00
Tom Stellard c9a67a2b6d SelectionDAG: Allow promotion of SELECT nodes from float to int types
And vice-versa, as long as the types are the same width.

There are a few R600 tests that will cover this.

llvm-svn: 204616
2014-03-24 16:07:28 +00:00
Tom Stellard 2c1c9de151 R600: Reorganize tablegen instruction definitions
Each GPU family now has its own file.

llvm-svn: 204615
2014-03-24 16:07:25 +00:00
Will Schmidt 114777e47f [PPC64LE] ELFv2 ABI updates for the .opd section
The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as
such, does not have a ".opd" section.  This is keyed off a _CALL_ELF=2
macro check.

The CALL_ELF check is not clearly documented at this time.  The basis
for usage in this patch is from the gcc thread here:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html

> Adding comment from Uli:
Looks good to me.  I think the old-style JIT doesn't really work
anyway for 64-bit, but at least with this patch LLVM will compile
and link again on a ppc64le host ...

llvm-svn: 204614
2014-03-24 16:04:15 +00:00
Daniel Sanders 01f9fc06e7 [mips] Allow dsubu to take an immediate as an alias for dsubiu.
Summary:
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

Differential Revision: http://llvm-reviews.chandlerc.com/D3155

llvm-svn: 204611
2014-03-24 15:38:00 +00:00
Hal Finkel e01d32107c [PowerPC] Mark many instructions as commutative
I'm under the impression that we used to infer the isCommutable flag from the
instruction-associated pattern. Regardless, we don't seem to do this (at least
by default) any more. I've gone through all of our instruction definitions, and
marked as commutative all of those that should be trivial to commute (by
exchanging the first two operands). There has been special code for the RL*
instructions, and that's not changed.

Before this change, we had the following commutative instructions:

 RLDIMI
 RLDIMIo
 RLWIMI
 RLWIMI8
 RLWIMI8o
 RLWIMIo
 XSADDDP
 XSMULDP
 XVADDDP
 XVADDSP
 XVMULDP
 XVMULSP

After:

 ADD4
 ADD4o
 ADD8
 ADD8o
 ADDC
 ADDC8
 ADDC8o
 ADDCo
 ADDE
 ADDE8
 ADDE8o
 ADDEo
 AND
 AND8
 AND8o
 ANDo
 CRAND
 CREQV
 CRNAND
 CRNOR
 CROR
 CRXOR
 EQV
 EQV8
 EQV8o
 EQVo
 FADD
 FADDS
 FADDSo
 FADDo
 FMADD
 FMADDS
 FMADDSo
 FMADDo
 FMSUB
 FMSUBS
 FMSUBSo
 FMSUBo
 FMUL
 FMULS
 FMULSo
 FMULo
 FNMADD
 FNMADDS
 FNMADDSo
 FNMADDo
 FNMSUB
 FNMSUBS
 FNMSUBSo
 FNMSUBo
 MULHD
 MULHDU
 MULHDUo
 MULHDo
 MULHW
 MULHWU
 MULHWUo
 MULHWo
 MULLD
 MULLDo
 MULLW
 MULLWo
 NAND
 NAND8
 NAND8o
 NANDo
 NOR
 NOR8
 NOR8o
 NORo
 OR
 OR8
 OR8o
 ORo
 RLDIMI
 RLDIMIo
 RLWIMI
 RLWIMI8
 RLWIMI8o
 RLWIMIo
 VADDCUW
 VADDFP
 VADDSBS
 VADDSHS
 VADDSWS
 VADDUBM
 VADDUBS
 VADDUHM
 VADDUHS
 VADDUWM
 VADDUWS
 VAND
 VAVGSB
 VAVGSH
 VAVGSW
 VAVGUB
 VAVGUH
 VAVGUW
 VMADDFP
 VMAXFP
 VMAXSB
 VMAXSH
 VMAXSW
 VMAXUB
 VMAXUH
 VMAXUW
 VMHADDSHS
 VMHRADDSHS
 VMINFP
 VMINSB
 VMINSH
 VMINSW
 VMINUB
 VMINUH
 VMINUW
 VMLADDUHM
 VMULESB
 VMULESH
 VMULEUB
 VMULEUH
 VMULOSB
 VMULOSH
 VMULOUB
 VMULOUH
 VNMSUBFP
 VOR
 VXOR
 XOR
 XOR8
 XOR8o
 XORo
 XSADDDP
 XSMADDADP
 XSMAXDP
 XSMINDP
 XSMSUBADP
 XSMULDP
 XSNMADDADP
 XSNMSUBADP
 XVADDDP
 XVADDSP
 XVMADDADP
 XVMADDASP
 XVMAXDP
 XVMAXSP
 XVMINDP
 XVMINSP
 XVMSUBADP
 XVMSUBASP
 XVMULDP
 XVMULSP
 XVNMADDADP
 XVNMADDASP
 XVNMSUBADP
 XVNMSUBASP
 XXLAND
 XXLNOR
 XXLOR
 XXLXOR

This is a by-inspection change, and I'm not sure how to write a reliable test
case. I would like advice on this, however.

llvm-svn: 204609
2014-03-24 15:07:28 +00:00
Daniel Sanders a771fefb72 [mips] Implement shorthand add / sub forms for MIPS.
Summary:
- If only two registers are passed to a three-register operation, then the
  first argument is both source and destination register.

- If a non-register is passed as the last argument, generate the immediate
  version of the instruction.

Also mark DADD commutative and add scheduling information (to the generic
scheduler), and implement DSUB.

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

CC: theraven

Differential Revision: http://llvm-reviews.chandlerc.com/D3148

llvm-svn: 204605
2014-03-24 14:05:39 +00:00
Justin Holewinski ba2fa6de4f [NVPTX] Add isel patterns for addrspacecast
llvm-svn: 204600
2014-03-24 11:17:53 +00:00
Hal Finkel 32854b0439 [PowerPC] Don't schedule VSX copy legalization unless VSX is enabled
There is no need to schedule this extra pass if it will have nothing to do.

llvm-svn: 204594
2014-03-24 09:51:41 +00:00
Hal Finkel bbad2332e3 [PowerPC] Update comment re: VSX copy-instruction selection
I've done some experimentation with this, and it looks like using the
lower-latency (but lower throughput) copy instruction is essentially always the
right thing to do.

My assumption is that, in order to be relatively sure that the higher-latency
copy will increase throughput, we'd want to have it unlikely to be in-flight
with its use. On the P7, the global completion table (GCT) can hold a maximum
of 120 instructions, shared among all active threads (up to 4), giving 30
instructions per thread.  So specifically, I'd require at least that many
instructions between the copy and the use before the high-latency variant is
used.

Trying this, however, over the entire test suite resulted in zero cases where
the high-latency form would be preferable. This may be a consequence of the
fact that the scheduler views copies as free, and so they tend to end up close
to their uses. For this experiment I created a function:

  unsigned chooseVSXCopy(MachineBasicBlock &MBB,
                         MachineBasicBlock::iterator I,
                         unsigned DestReg, unsigned SrcReg,
                         unsigned StartDist = 1,
                         unsigned Depth = 3) const;

with an implementation like:

  if (!Depth)
    return PPC::XXLOR;

  const unsigned MaxDist = 30;
  unsigned Dist = StartDist;
  for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) {
    if (J->isTransient() && !J->isCopy())
      continue;

    if (J->isCall() || J->isReturn() || J->readsRegister(DestReg, TRI))
      return PPC::XXLOR;

    ++Dist;
  }

  // We've exceeded the required distance for the high-latency form, use it.
  if (Dist > MaxDist)
    return PPC::XVCPSGNDP;

  // If this is only an exit block, use the low-latency form.
  if (MBB.succ_empty())
    return PPC::XXLOR;

  // We've reached the end of the block, check the successor blocks (up to some
  // depth), and use the high-latency form if that is okay with all successors.
  for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) {
    if (chooseVSXCopy(**J, (*J)->begin(), DestReg, SrcReg,
                      Dist, --Depth) == PPC::XXLOR)
      return PPC::XXLOR;
  }

  // All of our successor blocks seem okay with the high-latency variant, so
  // we'll use it.
  return PPC::XVCPSGNDP;

and then changed the copy opcode selection from:
    Opc = PPC::XXLOR;
to:
    Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg);

In conclusion, I'm removing the FIXME from the comment, because I believe that
there is, at least absent other examples, nothing to fix.

llvm-svn: 204591
2014-03-24 09:36:36 +00:00
Karthik Bhat 195e9dd91b Allow constant folding of ceil function whenever feasible
llvm-svn: 204583
2014-03-24 04:36:06 +00:00
Rafael Espindola 022bb76879 Propagate section from base to derived symbol.
We were already propagating the section in

a = b

With this patch we also propagate it for

a = b + 1

llvm-svn: 204581
2014-03-24 03:43:21 +00:00
Duncan P. N. Exon Smith d7d83477fa InstrProf: Silence spurious warnings in GCC 4.8
No functionality change.

llvm-svn: 204580
2014-03-24 00:47:18 +00:00
Arnaud A. de Grandmaison 1182600f20 ARM: no need to update SplatBits as it is not used
llvm-svn: 204575
2014-03-23 21:14:32 +00:00
David Majnemer 9338984f57 WinCOFF: Add support for -ffunction-sections
This is a pretty straightforward translation for COFF: we just need to
stick the function in a COMDAT section marked as
IMAGE_COMDAT_SELECT_NODUPLICATES.

llvm-svn: 204565
2014-03-23 17:47:39 +00:00
Nuno Lopes 31617266ea remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playing with.
Apologies in advance if I removed someone's WIP code.

 include/llvm/CodeGen/MachineSSAUpdater.h            |    1 
 include/llvm/IR/DebugInfo.h                         |    3 
 lib/CodeGen/MachineSSAUpdater.cpp                   |   10 --
 lib/CodeGen/PostRASchedulerList.cpp                 |    1 
 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    |   10 --
 lib/IR/DebugInfo.cpp                                |   12 --
 lib/MC/MCAsmStreamer.cpp                            |    2 
 lib/Support/YAMLParser.cpp                          |   39 ---------
 lib/TableGen/TGParser.cpp                           |   16 ---
 lib/TableGen/TGParser.h                             |    1 
 lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |    9 --
 lib/Target/ARM/ARMCodeEmitter.cpp                   |   12 --
 lib/Target/ARM/ARMFastISel.cpp                      |   84 --------------------
 lib/Target/Mips/MipsCodeEmitter.cpp                 |   11 --
 lib/Target/Mips/MipsConstantIslandPass.cpp          |   12 --
 lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              |   21 -----
 lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |    2 
 lib/Target/PowerPC/PPCFastISel.cpp                  |    1 
 lib/Transforms/Instrumentation/AddressSanitizer.cpp |    2 
 lib/Transforms/Instrumentation/BoundsChecking.cpp   |    2 
 lib/Transforms/Instrumentation/MemorySanitizer.cpp  |    1 
 lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |    8 -
 lib/Transforms/Scalar/SCCP.cpp                      |    1 
 utils/TableGen/CodeEmitterGen.cpp                   |    2 
 24 files changed, 2 insertions(+), 261 deletions(-)

llvm-svn: 204560
2014-03-23 17:09:26 +00:00
Hal Finkel 4a912250fa [PowerPC] Make use of VSX f64 <-> i64 conversion instructions
When VSX is available, these instructions should be used in preference to the
older variants that only have access to the scalar floating-point registers.

llvm-svn: 204559
2014-03-23 05:35:00 +00:00
Lang Hames 459b5dc39e Revert r204076 for now - it caused significant regressions in a number of
benchmarks.

<rdar://problem/16368461>

llvm-svn: 204558
2014-03-23 04:22:31 +00:00
Duncan P. N. Exon Smith 4680361d7c InstrProf: Check pointer size in raw profile
Since the profile can come from 32-bit machines, we need to check the
pointer size.  Change the magic number to facilitate this.

Adds tests for reading 32-bit and 64-bit binaries (both big- and
little-endian).  The tests write a binary using printf in RUN lines
(like raw-magic-but-no-header.test).  Assuming the bots don't complain,
this seems like a better way forward for testing RawInstrProfReader than
committing binary files.

<rdar://problem/16400648>

llvm-svn: 204557
2014-03-23 03:38:12 +00:00
Rafael Espindola a6e3a599d1 Propagate types from symbol to aliases.
This is similar, but not identical to what gas does. The logic in MC is to just
compute the symbol table after parsing the entire file. GAS is mixed, given

.type b, @object
a = b
b:
.type b, @function

It will propagate the change and make 'a' a function. Given

.type b, @object
b:
a = b
.type b, @function

the type of 'a' is still object.

Since we do the computation in the end, we produce a function in both cases.

llvm-svn: 204555
2014-03-23 03:33:20 +00:00
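
The "compute at the end" behaviour described in r204555 above can be sketched as a toy model. Everything here is an assumption made for illustration: `Directive` and `resolveTypes` are hypothetical names, types are plain strings, and GAS's mixed behaviour is deliberately not modelled. The sketch only shows why both directive orderings in the message yield a function alias when types are resolved after the whole file has been read.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified assembler directive.
struct Directive {
  enum Kind { Type, Alias, Label } K;
  std::string Sym; // symbol being typed, aliased, or defined
  std::string Arg; // "object"/"function" for Type, target symbol for Alias
};

// Resolves symbol types after the whole file has been seen.
// An alias takes whatever type its target ended up with, no matter
// where the alias appeared relative to the .type directives.
std::map<std::string, std::string>
resolveTypes(const std::vector<Directive> &File) {
  std::map<std::string, std::string> Types;
  std::map<std::string, std::string> Aliases;
  for (const Directive &D : File) {
    if (D.K == Directive::Type)
      Types[D.Sym] = D.Arg; // a later .type overrides an earlier one
    else if (D.K == Directive::Alias)
      Aliases[D.Sym] = D.Arg;
    // Labels do not affect types in this model.
  }
  for (const auto &A : Aliases) {
    auto It = Types.find(A.second);
    if (It != Types.end()) {
      const std::string Final = It->second; // copy before inserting
      Types[A.first] = Final;
    }
  }
  return Types;
}
```

Feeding in either of the two sequences from the message gives 'a' the type "function", which matches the statement that MC produces a function in both cases.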
NAKAMURA Takumi d3415c2440 [CMake] LLVMProfileData: No need to add LINK_LIBS here. LLVMBuild should do.
llvm-svn: 204553
2014-03-23 01:23:36 +00:00
Craig Topper a9253267a9 Prune includes in ARM target.
llvm-svn: 204548
2014-03-22 23:51:00 +00:00
Saleem Abdulrasool 44419fc3cd ARM IAS: properly handle function entries in .thumb
When a label is parsed, check if there is type information available for the
label.  If so, check if the symbol is a function.  If the symbol is a function
and we are in thumb mode and no explicit thumb_func has been emitted, adjust the
symbol data to indicate that the function definition is a thumb function.

The application of this inferencing is improved value handling in the object
file (the required thumb bit is set on symbols which are thumb functions).  It
also helps improve compatibility with binutils.

The one complication that arises from this handling is the MCAsmStreamer.  The
default implementation of getOrCreateSymbolData in MCStreamer does not support
tracking the symbol data.  In order to support the semantics of thumb functions,
track symbol data in assembly streamer.  Although O(n) in number of labels in
the TU, this is already done in various other streamers and as such the memory
overhead is not a practical concern in this scenario.

llvm-svn: 204544
2014-03-22 19:26:18 +00:00
Hal Finkel 55805eb562 [PowerPC] Fix the VSX v2f64 return register
v2f64 values, like other 128-bit values, are returned under VSX in register
vs34 (Altivec register v2).

llvm-svn: 204543
2014-03-22 18:24:43 +00:00
Juergen Ributzka e474752f4c [Constant Hoisting] Erase dead cast instructions.
The cleanup code that removes dead cast instructions only removed them from the
basic block, but didn't delete them. This fix erases them now too.

llvm-svn: 204538
2014-03-22 01:49:30 +00:00
Juergen Ributzka e802d507b0 [Constant Hoisting] Fix multiple entries for the same basic block in PHI nodes.
A PHI node usually has only one value/basic block pair per incoming basic block.
In the case of a switch statement it is possible that a following PHI node may
have more than one such pair per incoming basic block. E.g.:
%0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ]
This is valid and the verifier doesn't complain, because both values are the
same.

Constant hoisting materializes the constant for each operand separately and the
value is still the same, but the variable names have changed. As a result the
verifier can no longer recognize that they are the same value and complains.

This fix adds special update code for PHI node in constant hoisting to prevent
this corner case.

This fixes <rdar://problem/16394449>

llvm-svn: 204537
2014-03-22 01:49:27 +00:00
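
The PHI corner case in r204537 above can be modelled without LLVM. The sketch below is one possible way to keep duplicate entries consistent, not necessarily what the patch does: `Incoming` and `rewritePhi` are hypothetical names, values and blocks are plain strings, and the "%constN" names are invented for the example. The point is that a constant is materialized at most once per incoming block, so repeated entries for the same block still carry the same value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical PHI operand: (value name, incoming block name).
typedef std::pair<std::string, std::string> Incoming;

// Replaces every use of Constant with a materialized value,
// reusing a single materialization per incoming block.
std::vector<Incoming> rewritePhi(const std::vector<Incoming> &Ops,
                                 const std::string &Constant) {
  std::map<std::string, std::string> PerBlock; // block -> materialized name
  std::vector<Incoming> Out;
  unsigned NextId = 0;
  for (const Incoming &Op : Ops) {
    if (Op.first != Constant) {
      Out.push_back(Op); // other operands are left untouched
      continue;
    }
    auto It = PerBlock.find(Op.second);
    if (It == PerBlock.end())
      It = PerBlock
               .emplace(Op.second, "%const" + std::to_string(NextId++))
               .first;
    Out.push_back(Incoming(It->second, Op.second));
  }
  return Out;
}
```

Applied to the PHI from the message, the two `%Entry` operands receive the same materialized name, so a check that duplicate incoming blocks agree on their value would still pass.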