Commit Graph

91258 Commits

Author SHA1 Message Date
Eli Bendersky aa3ffafbde Rewrite test/Verifier tests to use FileCheck instead of grep
llvm-svn: 179036
2013-04-08 18:33:51 +00:00
Arnold Schwaighofer f47d2d7f6b X86 cost model: Model cost for uitofp and sitofp on SSE2
The costs are overfitted so that I can still use the legalization factor.

For example the following kernel has about half the throughput vectorized than
unvectorized when compiled with SSE2. Before this patch we would vectorize it.

unsigned short A[1024];
double B[1024];
void f() {
  int i;
  for (i = 0; i < 1024; ++i) {
    B[i] = (double) A[i];
  }
}

radar://13599001

llvm-svn: 179033
2013-04-08 18:05:48 +00:00
Chad Rosier fce4fab1a4 [ms-inline asm] Add support for ImmDisp [ Symbol ] memory operands.
rdar://13521249

llvm-svn: 179030
2013-04-08 17:43:47 +00:00
Hal Finkel b5aa7e54d9 Generate PPC early conditional returns
PowerPC has a conditional branch to the link register (return) instruction: BCLR.
This should be used any time when we'd otherwise have a conditional branch to a
return. This adds a small pass, PPCEarlyReturn, which runs just prior to the
branch selection pass (and, importantly, after block placement) to generate
these conditional returns when possible. It will also eliminate unconditional
branches to returns (these happen rarely; most of the time these have already
been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice
optimization for small functions that do not maintain a stack frame.

llvm-svn: 179026
2013-04-08 16:24:03 +00:00
Alexey Samsonov c03f2ee0ae DWARF parser: remove duplicated code and fix code style in DIE extractors.
llvm-svn: 179023
2013-04-08 14:37:16 +00:00
Rafael Espindola d66c414619 Add all 4 MachO object types. Use the stored type to implement is64Bits().
llvm-svn: 179021
2013-04-08 13:25:33 +00:00
Vincent Lejeune 5f11dd390a R600: Control Flow support for pre EG gen
llvm-svn: 179020
2013-04-08 13:05:49 +00:00
Chandler Carruth a6d5e3e9a2 Simplify the quoting here. Our lit emulator doesn't deal well with the
nested quoting schemes, and they're not important here...

llvm-svn: 179014
2013-04-08 10:07:50 +00:00
Chandler Carruth 3fa99abfae Remove a global 'endl' variable from the other file as well.
llvm-svn: 179010
2013-04-08 08:55:18 +00:00
Chandler Carruth 1819289607 Clean up namespaces in obj2yaml.cpp.
llvm-svn: 179009
2013-04-08 08:55:14 +00:00
Tim Northover 85c19f5a73 Add ACLE link to ARM documentation sections
llvm-svn: 179006
2013-04-08 08:42:24 +00:00
Tim Northover 15410e98d3 AArch64: remove barriers from AArch64 atomic operations.
I've managed to convince myself that AArch64's acquire/release
instructions are sufficient to guarantee C++11's required semantics,
even in the sequentially-consistent case.

llvm-svn: 179005
2013-04-08 08:40:41 +00:00
Chandler Carruth c224e25d49 Cleanup the formatting of obj2yaml.cpp.
I couldn't touch this file and not clean it up some. These reformattings
brought to you by clang-format, with some minor adjustments by me. More
spring cleaning to follow here.

llvm-svn: 179004
2013-04-08 08:39:59 +00:00
Chandler Carruth 741c00df17 Don't define our own global 'endl' variable. While technically it had
internal linkage and so wasn't a patent bug, it doesn't make any sense
here. We can avoid even calling operator<< by just embedding the newline
in the string literals that were already being streamed out. It also
gives the impression of some line-ending agnosticisms which is not
present, and that flushing happens when it doesn't.

If we want to use std::endl, we could do that, but honestly it doesn't
seem remotely worth it. Using '\n' directly is much more clear when
working with raw_ostream.

It also happens to fix builds with old crufty GCC STL implementations
that include std::endl into the global namespace (or headers written to
be compatible with such atrocities).

llvm-svn: 179003
2013-04-08 08:30:47 +00:00
Benjamin Kramer d56a324e30 ARM: Remove unused variable.
llvm-svn: 179001
2013-04-08 08:07:35 +00:00
Hal Finkel 81f8799fe3 Cleanup and improve PPC fsel generation
First, we should not cheat: fsel-based lowering of select_cc is a
finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes
this clear, as does a note in our own README).

This also adds fsel-based lowering of EQ and NE condition codes. As it turned
out, fsel generation was covered by a grand total of zero regression test
cases. I've added some test cases to cover the existing behavior (which is now
finite-math only), as well as the new EQ cases.

llvm-svn: 179000
2013-04-07 22:11:09 +00:00
Arnold Schwaighofer 995ce6c388 TargetLowering: Fix getTypeConversion handling of extended vector types
The code in getTypeConversion attempts to promote the element vector type
before it trys to split or widen the vector.
After it failed finding a legal vector type by promoting it would continue using
the promoted vector element type. Thereby missing legal splitted vector types.
For example the type v32i32 that has a legal split of 4 x v3i32 on x86/sse2
would be transformed to: v32i256 and from there on successively split to:
v16i256, v8i256, v1i256 and then finally ends up as an i64 type.
By resetting the vector element type to the original vector element type that
existed before the promotion the code will attempt to split the vector type to
smaller vector widths of the same type.

llvm-svn: 178999
2013-04-07 20:22:56 +00:00
Rafael Espindola 421305aff8 Make MachOObjectFile independent from MachOObject.
llvm-svn: 178998
2013-04-07 20:01:29 +00:00
Rafael Espindola c1f28b6a8e Implement MachOObjectFile::getData directly.
llvm-svn: 178997
2013-04-07 19:42:15 +00:00
Rafael Espindola 28814d7911 Implement MachOObjectFile::is64Bit directly.
llvm-svn: 178996
2013-04-07 19:38:15 +00:00
Rafael Espindola 774a8cec37 Implement MachOObjectFile::getHeaderSize directly.
llvm-svn: 178995
2013-04-07 19:31:49 +00:00
Rafael Espindola d665259104 Implement MachOObjectFile::getHeader directly.
llvm-svn: 178994
2013-04-07 19:26:57 +00:00
Jakob Stoklund Olesen a30f4832c9 Implement LowerCall_64 for the SPARC v9 64-bit ABI.
There is still no support for byval arguments (which I don't think are
needed) and varargs.

llvm-svn: 178993
2013-04-07 19:10:57 +00:00
Rafael Espindola 60689987ef Implement MachOObjectFile::getHeaderSize and MachOObjectFile::getData.
These were the last missing forwarding functions. Also consistently use
the forwarding functions instead of using MachOObj directly.

llvm-svn: 178992
2013-04-07 19:05:30 +00:00
Rafael Espindola 3c50f06202 Remove LoadCommandInfo now that we always have a pointer to the command.
LoadCommandInfo was needed to keep a command and its offset in the file. Now
that we always have a pointer to the command, we don't need the offset.

llvm-svn: 178991
2013-04-07 18:42:06 +00:00
Rafael Espindola 224208b868 Add MachOObjectFile::LoadCommandInfo.
This avoids using MachOObject::getLoadCommandInfo.

llvm-svn: 178990
2013-04-07 18:08:12 +00:00
Rafael Espindola 1309a448ff Use getLoadCommandInfo instead of MachOObj->getLoadCommandInfo.
llvm-svn: 178989
2013-04-07 17:41:59 +00:00
Rafael Espindola 17bece31af Construct MachOObject in MachOObjectFile's constructor.
llvm-svn: 178988
2013-04-07 16:58:48 +00:00
Rafael Espindola 717c4d44c4 Remove unused argument.
llvm-svn: 178987
2013-04-07 16:40:00 +00:00
Rafael Espindola 5ffc079c8a Remove MachOObjectFile::getObject.
llvm-svn: 178986
2013-04-07 16:07:35 +00:00
Rafael Espindola 31fce89645 Remove two uses of getObject.
llvm-svn: 178985
2013-04-07 15:46:05 +00:00
Rafael Espindola 79bb550577 Remove usage of InMemoryStruct in getSymbol.
llvm-svn: 178984
2013-04-07 15:35:18 +00:00
Hal Finkel d04bb0b8ff PPC Altivec load/store intrinsics can be marked IntrRead[Write]ArgMem
llvm-svn: 178983
2013-04-07 15:32:40 +00:00
Hal Finkel 7795e47b5e PPC rotate instructions don't have unmodeled side effcts
llvm-svn: 178982
2013-04-07 15:06:53 +00:00
Rafael Espindola 6f5d6c7e78 Remove a use of InMemoryStruct in llvm-readobj.
llvm-svn: 178981
2013-04-07 15:05:12 +00:00
Rafael Espindola 0944c13e6b Make getObject const. Remove a const_cast.
llvm-svn: 178980
2013-04-07 14:50:40 +00:00
Rafael Espindola b7b11f7bac Remove last use of InMemoryStruct in llvm-objdump.
llvm-svn: 178979
2013-04-07 14:40:18 +00:00
Hal Finkel b47a69acde Most PPC M[TF]CR instructions do not have side effects
llvm-svn: 178978
2013-04-07 14:33:13 +00:00
Rafael Espindola 7be6ead7a4 Remove dead code.
llvm-svn: 178977
2013-04-07 14:30:21 +00:00
Rafael Espindola 91e626ebd2 Remove unused argument.
llvm-svn: 178976
2013-04-07 14:25:39 +00:00
Chandler Carruth 0e8a52d18f Fix PR15674 (and PR15603): a SROA think-o.
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.

All of the existing tests continue to pass, and I've added a test from
the PR.

llvm-svn: 178974
2013-04-07 11:47:54 +00:00
Hal Finkel d71cc3a7f3 PPC pre-increment load instructions do not have side effects
A few were missed in r178972.

llvm-svn: 178973
2013-04-07 06:30:47 +00:00
Hal Finkel 6efd45e902 PPC pre-increment load instructions do not have side effects
llvm-svn: 178972
2013-04-07 05:46:58 +00:00
Hal Finkel 933e8f037d PPC MCRF instruction does not have side effects
llvm-svn: 178971
2013-04-07 05:16:57 +00:00
Hal Finkel 94072b98eb PPC FMR instruction does not have side effects
llvm-svn: 178970
2013-04-07 04:56:16 +00:00
Eric Christopher 55863befd1 DW_FORM_sec_offset should be a relocation on platforms that use
a relocation across sections. Do this for DW_AT_stmt list in the
skeleton CU and check the relocations in the debug_info section.

Add a FIXME for multiple CUs.

llvm-svn: 178969
2013-04-07 03:43:09 +00:00
Reid Kleckner c1c01d5313 [cmake] Avoid rel+asserts warnings when passing -UNDEBUG
MSVC 2012 gives warning D9025, "overriding /D NDEBUG with -UNDEBUG".
Removing the original definition of NDEBUG silences this.

llvm-svn: 178967
2013-04-07 01:45:01 +00:00
Jakob Stoklund Olesen edaf66b056 Implement LowerReturn_64 for SPARC v9.
Integer return values are sign or zero extended by the callee, and
structs up to 32 bytes in size can be returned in registers.

The CC_Sparc64 CallingConv definition is shared between
LowerFormalArguments_64 and LowerReturn_64. Function arguments and
return values are passed in the same registers.

The inreg flag is also used for return values. This is required to handle
C functions returning structs containing floats and ints:

  struct ifp {
    int i;
    float f;
  };

  struct ifp f(void);

LLVM IR:

  define inreg { i32, float } @f() {
     ...
     ret { i32, float } %retval
  }

The ABI requires that %retval.i is returned in the high bits of %i0
while %retval.f goes in %f1.

Without the inreg return value attribute, %retval.i would go in %i0 and
%retval.f would go in %f3 which is a more efficient way of returning
%multiple values, but it is not ABI compliant for returning C structs.

llvm-svn: 178966
2013-04-06 23:57:33 +00:00
Jakob Stoklund Olesen 03d9f7fda6 SPARC v9 stack pointer bias.
64-bit SPARC v9 processes use biased stack and frame pointers, so the
current function's stack frame is located at %sp+BIAS .. %fp+BIAS where
BIAS = 2047.

This makes more local variables directly accessible via [%fp+simm13]
addressing.

llvm-svn: 178965
2013-04-06 21:38:57 +00:00
Hal Finkel d61d4f80e6 Implement PPCInstrInfo::FoldImmediate
There are certain PPC instructions into which we can fold a zero immediate
operand. We can detect such cases by looking at the register class required
by the using operand (so long as it is not otherwise constrained).

llvm-svn: 178961
2013-04-06 19:30:30 +00:00
Hal Finkel 8fc33e5d95 PPC ISEL is a select and never has side effects
llvm-svn: 178960
2013-04-06 19:30:28 +00:00
Hal Finkel 537ec71775 Add a comment to TargetInstrInfo about FoldImmediate
This comment documents the current behavior of the ARM implementation of this
callback, and also the soon-to-be-committed PPC version.

llvm-svn: 178959
2013-04-06 19:30:20 +00:00
Jakob Stoklund Olesen 1c9a95ab2a Complete formal arguments for the SPARC v9 64-bit ABI.
All arguments are formally assigned to stack positions and then promoted
to floating point and integer registers. Since there are more floating
point registers than integer registers, this can cause situations where
floating point arguments are assigned to registers after integer
arguments that where assigned to the stack.

Use the inreg flag to indicate 32-bit fragments of structs containing
both float and int members.

The three-way shadowing between stack, integer, and floating point
registers requires custom argument lowering. The good news is that
return values are passed in the exact same way, and we can share the
code.

Still missing:

 - Update LowerReturn to handle structs returned in registers.
 - LowerCall.
 - Variadic functions.

llvm-svn: 178958
2013-04-06 18:32:12 +00:00
Nadav Rotem c4bd84c1d5 typo
llvm-svn: 178949
2013-04-06 04:24:12 +00:00
Rafael Espindola 91af8e84b9 Remove last use of InMemoryStruct from MachOObjectFile.cpp.
llvm-svn: 178948
2013-04-06 03:50:05 +00:00
Rafael Espindola 15e2a9cd57 Don't use InMemoryStruct<macho::SymtabLoadCommand>.
This also required not using the RegisterStringTable API, which is also a
good thing.

llvm-svn: 178947
2013-04-06 03:31:08 +00:00
Rafael Espindola a65f5de499 Don't use InMemoryStruct in getSymbol64TableEntry.
llvm-svn: 178946
2013-04-06 02:15:44 +00:00
Rafael Espindola 2a34c2d8dd Don't use InMemoryStruct in getSymbolTableEntry.
llvm-svn: 178945
2013-04-06 01:59:05 +00:00
Rafael Espindola 7caf2fbdc3 Don't use InMemoryStruct in getRelocation.
llvm-svn: 178943
2013-04-06 01:24:11 +00:00
Manman Ren 5b22f9fe18 Dwarf: use utostr on CUID to append to SmallString.
We used to do "SmallString += CUID", which is incorrect, since CUID will
be truncated to a char.

rdar://problem/13573833

llvm-svn: 178941
2013-04-06 01:02:38 +00:00
Michael Gottesman 7924997c36 Removed trailing whitespace.
llvm-svn: 178932
2013-04-05 23:46:45 +00:00
Tom Stellard 754f80ff3a R600/SI: Add support for buffer stores v2
v2:
  - Use the ADDR64 bit

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178931
2013-04-05 23:31:51 +00:00
Tom Stellard 6db08eb42f R600/SI: Use same names for corresponding MUBUF operands and encoding fields
The code emitter knows how to encode operands whose name matches one of
the encoding fields.  If there is no match, the code emitter relies on
the order of the operand and field definitions to determine how operands
should be encoding.  Matching by order makes it easy to accidentally break
the instruction encodings, so we prefer to match by name.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178930
2013-04-05 23:31:44 +00:00
Tom Stellard 60174bb9ca R600: Add RV670 processor
This is an R600 GPU with double support.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178929
2013-04-05 23:31:40 +00:00
Tom Stellard 2f21c7e551 R600/SI: Add processor types for each SI variant
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178928
2013-04-05 23:31:35 +00:00
Tom Stellard edbf1eb42b R600/SI: Avoid generating S_MOVs with 64-bit immediates v2
SITargetLowering::analyzeImmediate() was converting the 64-bit values
to 32-bit and then checking if they were an inline immediate.  Some
of these conversions caused this check to succeed and produced
S_MOV instructions with 64-bit immediates, which are illegal.

v2:
  - Clean up logic

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178927
2013-04-05 23:31:20 +00:00
Hal Finkel ed6a28597b Enable early if conversion on PPC
On cores for which we know the misprediction penalty, and we have
the isel instruction, we can profitably perform early if conversion.
This enables us to replace some small branch sequences with selects
and avoid the potential stalls from mispredicting the branches.

Enabling this feature required implementing canInsertSelect and
insertSelect in PPCInstrInfo; isel code in PPCISelLowering was
refactored to use these functions as well.

llvm-svn: 178926
2013-04-05 23:29:01 +00:00
Hal Finkel 85526f2e71 Correct the PPC A2 misprediction penalty
The manual states that there is a minimum of 13 cycles from when the
mispredicted branch is issued to when the correct branch target is
issued.

llvm-svn: 178925
2013-04-05 23:28:58 +00:00
Michael Gottesman 31ba23aa56 An objc_retain can serve as a use for a different pointer.
This is the counterpart to commit r160637, except it performs the action
in the bottomup portion of the data flow analysis.

llvm-svn: 178922
2013-04-05 22:54:32 +00:00
Michael Gottesman 1d8d25777d Properly model precise lifetime when given an incomplete dataflow sequence.
The normal dataflow sequence in the ARC optimizer consists of the following
states:

    Retain -> CanRelease -> Use -> Release

The optimizer before this patch stored the uses that determine the lifetime of
the retainable object pointer when it bottom up hits a retain or when top down
it hits a release. This is correct for an imprecise lifetime scenario since what
we are trying to do is remove retains/releases while making sure that no
``CanRelease'' (which is usually a call) deallocates the given pointer before we
get to the ``Use'' (since that would cause a segfault).

If we are considering the precise lifetime scenario though, this is not
correct. In such a situation, we *DO* care about the previous sequence, but
additionally, we wish to track the uses resulting from the following incomplete
sequences:

  Retain -> CanRelease -> Release   (TopDown)
  Retain <- Use <- Release          (BottomUp)

*NOTE* This patch looks large but the most of it consists of updating
test cases. Additionally this fix exposed an additional bug. I removed
the test case that expressed said bug and will recommit it with the fix
in a little bit.

llvm-svn: 178921
2013-04-05 22:54:28 +00:00
Hal Finkel 3005c299b5 Reapply r178845 with fix - Fix bug in PEI's virtual-register scavenging
This fixes PEI as previously described, but correctly handles the case where
the instruction defining the virtual register to be scavenged is the first in
the block. Arnold provided me with a bugpoint-reduced test case, but even that
seems too large to use as a regression test. If I'm successful in cleaning it
up then I'll commit that as well.

Original commit message:

    This change fixes a bug that I introduced in r178058. After a register is
    scavenged using one of the available spills slots the instruction defining the
    virtual register needs to be moved to after the spill code. The scavenger has
    already processed the defining instruction so that registers killed by that
    instruction are available for definition in that same instruction. Unfortunately,
    after this, the scavenger needs to iterate through the spill code and then
    visit, again, the instruction that defines the now-scavenged register. In order
    to avoid confusion, the register scavenger needs the ability to 'back up'
    through the spill code so that it can again process the instructions in the
    appropriate order. Prior to this fix, once the scavenger reached the
    just-moved instruction, it would assert if it killed any registers because,
    having already processed the instruction, it believed they were undefined.

    Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
    for diagnosing the problem and testing this fix.

llvm-svn: 178919
2013-04-05 22:31:56 +00:00
Bill Wendling eb108bad50 Use the target options specified on a function to reset the back-end.
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917
2013-04-05 21:52:40 +00:00
Hal Finkel 81c46d0809 Revert r178845 - Fix bug in PEI's virtual-register scavenging
Reverting because this breaks one of the LTO builders. Original commit message:

    This change fixes a bug that I introduced in r178058. After a register is
    scavenged using one of the available spills slots the instruction defining the
    virtual register needs to be moved to after the spill code. The scavenger has
    already processed the defining instruction so that registers killed by that
    instruction are available for definition in that same instruction. Unfortunately,
    after this, the scavenger needs to iterate through the spill code and then
    visit, again, the instruction that defines the now-scavenged register. In order
    to avoid confusion, the register scavenger needs the ability to 'back up'
    through the spill code so that it can again process the instructions in the
    appropriate order. Prior to this fix, once the scavenger reached the
    just-moved instruction, it would assert if it killed any registers because,
    having already processed the instruction, it believed they were undefined.

    Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
    for diagnosing the problem and testing this fix.

llvm-svn: 178916
2013-04-05 21:30:40 +00:00
Jim Grosbach bdbd73460c Tidy up a bit. No functional change.
llvm-svn: 178915
2013-04-05 21:20:12 +00:00
Shuxin Yang 95adf5258f Disable the optimization about promoting vector-element-access with symbolic index.
This optimization is unstable at this moment; it 
  1) block us on a very important application
  2) PR15200
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK command compare the output against wrong result)

   I personally believe this optimization should not have any impact on the
autovectorized code, as auto-vectorizer is supposed to put gather/scatter
in a "right" way.  Although in theory downstream optimizaters might reveal 
some gather/scatter optimization opportunities, the chance is quite slim.

   For the hand-crafted vectorizing code, in term of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in much simpler way. On the other hand, these optimizers are able to 
improve the code in a incremental way; in contrast, SROA is sort of all-or-none
approach. However, SROA might slighly win in stack size, as it tries to figure 
out a stretch of memory tightenly cover the area accessed by the dynamic index.

 rdar://13174884
 PR15200

llvm-svn: 178912
2013-04-05 21:07:08 +00:00
Akira Hatanaka fac2db4a3d [mips] XFAIL test-interp-vec-loadstore.ll in an attempt to turn builder
llvm-mips-linux green.

llvm-mips-linux runs on a big endian machine. This test passes if I change 'e'
to 'E' in the target data layout string.

llvm-svn: 178910
2013-04-05 20:54:46 +00:00
Douglas Gregor 0cb6846090 <rdar://problem/13551789> Fix a race in the LockFileManager.
It's possible for the lock file to disappear and the owning process to
return before we're able to see the generated file. Spin for a little
while to see if it shows up before failing. 

llvm-svn: 178909
2013-04-05 20:53:57 +00:00
Douglas Gregor 6bd4d8cf72 <rdar://problem/13551789> Fix yet another race in unique_file.
If the directory that will contain the unique file doesn't exist when
we tried to create the file, but another process creates it before we
get a chance to try creating it, we would bail out rather than try to
create the unique file.

llvm-svn: 178908
2013-04-05 20:48:36 +00:00
Michael J. Spencer b8055cbc9d [Support][FileSystem] Fix identify_magic for big endian ELF.
llvm-svn: 178905
2013-04-05 20:10:04 +00:00
Rafael Espindola 3add3e9c4a Move yaml2obj to tools too.
llvm-svn: 178904
2013-04-05 20:00:35 +00:00
Rafael Espindola 4386fa9948 Define versions of Section that are explicitly marked as little endian.
These should really be templated like ELF, but this is a start.

llvm-svn: 178896
2013-04-05 18:45:28 +00:00
Michael Gottesman bab49e976b Added two debug logging messages to VisitInstructionsTopDown to match VisitInstructionsBottomUp.
llvm-svn: 178895
2013-04-05 18:26:08 +00:00
Rafael Espindola 8622f2c14e Don't use InMemoryStruct in getSection and getSection64.
llvm-svn: 178894
2013-04-05 18:18:19 +00:00
Michael Gottesman 89279f8383 Cleaned up whitespace and made debug logging less verbose.
llvm-svn: 178893
2013-04-05 18:10:41 +00:00
Timur Iskhodzhanov dcf44ca4f8 Make the test/CodeGen/X86/win32_sret.ll reliable on any CPU by explicitly specifying the -mcpu
llvm-svn: 178885
2013-04-05 17:05:56 +00:00
Renato Golin 91de828f46 Reverting 178851 as it broke buildbots
llvm-svn: 178883
2013-04-05 16:39:53 +00:00
Chad Rosier 4a7005e976 [ms-inline asm] Add support for numeric displacement expressions in bracketed
memory operands.

Essentially, this layers an infix calculator on top of the parsing state
machine.  The scale on the index register is still expected to be an immediate

 __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions.  For example,

 __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is
zero so as to not change the displacement.  Register operands should never
be an operand for a multiply or divide operation; the scale*indexreg
expression is always replaced with a zero on the operand stack to prevent
such a case.
rdar://13521380

llvm-svn: 178881
2013-04-05 16:28:55 +00:00
Reid Kleckner bd39f21336 [Support] Disable assertion dialogs from the MSVC debug CRT
Summary:
Sets a report hook that emulates pressing "retry" in the "abort, retry,
ignore" dialog box that _CrtDbgReport normally raises.  There are many
other ways to disable assertion reports, but this was the only way I
could find that still calls our exception handler.

Reviewers: Bigcheese

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D625

llvm-svn: 178880
2013-04-05 16:18:03 +00:00
Rafael Espindola da37835a8d Use ScalarBitSetTraits.
What was missing was were the type strong operator|.

llvm-svn: 178879
2013-04-05 16:00:31 +00:00
Rafael Espindola 601b6d4e33 Fix include guards to match new location.
llvm-svn: 178877
2013-04-05 15:31:16 +00:00
Rafael Espindola b0f76a4b75 Don't fetch pointers from a InMemoryStruct.
InMemoryStruct is extremely dangerous as it returns data from an internal
buffer when the endiannes doesn't match. This should fix the tests on big
endian hosts.

llvm-svn: 178875
2013-04-05 15:15:22 +00:00
Jyotsna Verma b1180dc6d9 Enable JIT/MCJIT unit tests for targets with JIT support.
Change unittests/ExecutionEngine/Makefile to include Makefile.config before
TARGET_HAS_JIT flag is checked.

Fixes bug: http://llvm.org/bugs/show_bug.cgi?id=15669

llvm-svn: 178871
2013-04-05 14:26:16 +00:00
Ulrich Weigand 78e9765b19 Respect Addend when processing MCJIT relocations to local/global symbols.
When the RuntimeDyldELF::processRelocationRef routine finds the target
symbol of a relocation in the local or global symbol table, it performs
a section-relative relocation:

    Value.SectionID = lsi->second.first;
    Value.Addend = lsi->second.second;

At this point, however, any Addend that might have been specified in
the original relocation record is lost.  This is somewhat difficult to
trigger for relocations within the code section since they usually
do not contain non-zero Addends (when built with the default JIT code
model, in any case).  However, the problem can be reliably triggered
by a relocation within the data section caused by code like:

 int test[2] = { -1, 0 };
 int *p = &test[1];

The initializer of "p" will need a relocation to "test + 4".  On
platforms using RelA relocations this means an Addend of 4 is required.
Current code ignores this addend when processing the relocation,
resulting in incorrect execution.

Fixed by taking the Addend into account when processing relocations
to symbols found in the local or global symbol table.

Tested on x86_64-linux and powerpc64-linux.

llvm-svn: 178869
2013-04-05 13:29:04 +00:00
Alexey Samsonov d2069321e0 llvm-symbolizer: correctly parse filenames given in quotes
llvm-svn: 178859
2013-04-05 09:22:24 +00:00
Alexey Samsonov c6ee5835d6 Add a basic test for llvm-symbolizer tool
llvm-svn: 178858
2013-04-05 08:30:13 +00:00
Stepan Dyatkovskiy 6b53a2f50a Buildbot fix for r178851: mistake was in wrong TargetRegisterInfo::getRegClass usage.
llvm-svn: 178854
2013-04-05 07:34:08 +00:00
Alexey Samsonov 3eb78ec974 Add obj2yaml to test dependencies
llvm-svn: 178852
2013-04-05 07:26:37 +00:00
Stepan Dyatkovskiy b309b3b33e Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position".
Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass.
It is kind of deep improvement of fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4
But it also tracks conflicts between different register classes (e.g. D2 and S5).
For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion: 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html

llvm-svn: 178851
2013-04-05 05:52:14 +00:00
Hal Finkel 1a958cf30d Add a SchedMachineModel for the PPC G5
llvm-svn: 178850
2013-04-05 05:49:18 +00:00
Rafael Espindola 4e1e3e75b6 The ppc bots say this is the last broken line, so lets try one more :-(
llvm-svn: 178849
2013-04-05 05:36:37 +00:00
Hal Finkel 5fde1b033e Add a SchedMachineModel for the PPC A2
llvm-svn: 178848
2013-04-05 05:34:08 +00:00
Rafael Espindola 1218a40c92 One more try before I just delete the macho bits until tomorrow.
llvm-svn: 178847
2013-04-05 05:15:39 +00:00
Hal Finkel e6f48e4e2f Fix bug in PEI's virtual-register scavenging
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178845
2013-04-05 05:01:13 +00:00
Arnold Schwaighofer fb6b9f48d0 ARM scheduler model: Add scheduler info to more instructions and resource
descriptions for compares

llvm-svn: 178844
2013-04-05 05:01:06 +00:00
Rafael Espindola 531efab615 More test loosening.
Sorry for so many commits, but llvm is still building on my ppc vm.

llvm-svn: 178843
2013-04-05 04:54:42 +00:00
Arnold Schwaighofer 5dde1f39c1 ARM scheduler model: Swift has varying latencies, uops for simple ALU ops
llvm-svn: 178842
2013-04-05 04:42:00 +00:00
Rafael Espindola 61ad74938d Loosen this test too.
llvm-svn: 178841
2013-04-05 04:37:55 +00:00
Rafael Espindola b080267bff Loosen this test.
Looks like there is a big endian/little endian problem here. Loosen the
test to try to get the bots green while llvm builds on a ppc qemu vm.

The failure was in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/

llvm-svn: 178839
2013-04-05 04:31:09 +00:00
Rafael Espindola 87a0290941 Move obj2yaml to tools to sort out make's dependencies.
llvm-svn: 178835
2013-04-05 02:57:22 +00:00
Rafael Espindola 726c8c3bdd Build obj2yaml with configure+make.
llvm-svn: 178833
2013-04-05 02:24:51 +00:00
Rafael Espindola 599e810ae3 Add a test for obj2yaml in preparation for refactoring it.
llvm-svn: 178829
2013-04-05 02:02:05 +00:00
Jakob Stoklund Olesen 45a1157233 Clean up some confusing language, and use more realistic examples.
llvm-svn: 178828
2013-04-05 01:25:41 +00:00
Andrew Trick 80e66ce0b4 RegisterPressure heuristics currently require signed comparisons.
llvm-svn: 178823
2013-04-05 00:31:34 +00:00
Andrew Trick 96ce3848d6 Disable DFSResult for ConvergingScheduler.
For now, just save the compile time since the ConvergingScheduler
heuristics don't use this analysis. We'll probably enable it later
after compile-time investigation.

llvm-svn: 178822
2013-04-05 00:31:31 +00:00
Andrew Trick 419d491747 MachineScheduler: format DEBUG output.
I'm getting more serious about tuning and enabling on x86/ARM. Start
by making the trace readable.

llvm-svn: 178821
2013-04-05 00:31:29 +00:00
Arnold Schwaighofer df6f67ed87 LoopVectorizer: Pass OperandValueKind information to the cost model
Pass down the fact that an operand is going to be a vector of constants.

This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86
back. It had degraded to scalar performance due to my pervious shift cost change
that made all shifts expensive on x86.

radar://13576547

llvm-svn: 178809
2013-04-04 23:26:27 +00:00
Arnold Schwaighofer 44f902ed7d X86 cost model: Differentiate cost for vector shifts of constants
SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account marking all shifts as expensive.
This would prevent vectorization from happening where it is actually beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808
2013-04-04 23:26:24 +00:00
Arnold Schwaighofer b977387112 CostModel: Add parameter to instruction cost to further classify operand values
On certain architectures we can support efficient vectorized version of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.

We can efficiently support

for (i = 0 ; i < ; i += 4)
  w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

for (i = 0; i < ; i += 4)
  w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.

Targets can then choose to return a different cost for instructions with such
operand values.

A follow-up commit will test this feature on x86.

radar://13576547

llvm-svn: 178807
2013-04-04 23:26:21 +00:00
Manman Ren bdcb4464e2 Debug Info: revert 178722 for now.
There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+.
Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are
still in discussion on how to handle this.

The correct solution is to update our header to say version 4 instead of version
2 and update tool chains as well.

rdar://problem/13559431

llvm-svn: 178806
2013-04-04 23:13:11 +00:00
Adrian Prantl 322f41d095 typo
llvm-svn: 178804
2013-04-04 22:56:49 +00:00
Hal Finkel e5680b3c36 Rename the current PPC BCL definition to BCLalways
BCL is normally a conditional branch-and-link instruction, but has
an unconditional form (which is used in the SjLj code, for example).
To make clear that this BCL instruction definition is specifically
the special unconditional form (which does not meaningfully take
a condition-register input), rename it to BCLalways.

No functionality change intended.

llvm-svn: 178803
2013-04-04 22:55:54 +00:00
Hal Finkel f96c18e3bc PPC: Improve code generation for mixed-precision reciprocal sqrt
The DAGCombine logic that recognized a/sqrt(b) and transformed it into
a multiplication by the reciprocal sqrt did not handle cases where the
sqrt and the division were separated by an fpext or fptrunc.

llvm-svn: 178801
2013-04-04 22:44:12 +00:00
Jyotsna Verma a929ab58c0 Hexagon: Expand br_cc.
It fixes following tests for Hexagon:

CodeGen/Generic/2003-07-29-BadConstSbyte.ll
CodeGen/Generic/2005-10-21-longlonggtu.ll
CodeGen/Generic/2009-04-28-i128-cmp-crash.ll
CodeGen/Generic/MachineBranchProb.ll
CodeGen/Generic/builtin-expect.ll
CodeGen/Generic/pr12507.ll

llvm-svn: 178794
2013-04-04 21:18:26 +00:00
Benjamin Kramer dd67654af6 Reassociate: Avoid iterator invalidation.
OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

llvm-svn: 178793
2013-04-04 21:15:42 +00:00
Jyotsna Verma bc03a9792a Disable 2010-10-01-crash.ll for Hexagon as the Hexagon frontend will
never produce a byval parameter with size < 8 bytes.

llvm-svn: 178792
2013-04-04 21:05:46 +00:00
Rafael Espindola 7733466c73 Add back parsing of header charactestics.
It had been dropped during the switch to yaml::IO. Also add a test going
from yaml2obj to llvm-readobj. It can be extended as we add more
fields/formats to yaml2obj.

llvm-svn: 178786
2013-04-04 20:30:52 +00:00
Richard Osborne 0c12d1851e [XCore] Add bru instruction.
llvm-svn: 178783
2013-04-04 20:05:35 +00:00
Richard Osborne f18d95f756 [XCore] The RRegs register class is a superset of GRRegs.
At the time when the XCore backend was added there were some issues with
with overlapping register classes but these all seem to be fixed now.
Describing the register classes correctly allow us to get rid of a
codegen only instruction (LDAWSP_lru6_RRegs) and it means we can
disassemble ru6 instructions that use registers above r11.

llvm-svn: 178782
2013-04-04 19:57:46 +00:00
Eli Bendersky 4ee93cd45b Missing word
llvm-svn: 178774
2013-04-04 18:29:19 +00:00
Jakob Stoklund Olesen 299475e0c6 Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773
2013-04-04 18:25:36 +00:00
Eli Bendersky fc186358f2 Formatting
llvm-svn: 178771
2013-04-04 18:03:41 +00:00
Evan Cheng 2e254d041e Revert r178713
llvm-svn: 178769
2013-04-04 17:40:53 +00:00
Stepan Dyatkovskiy e58df62ea4 New-password-test commit.
llvm-svn: 178765
2013-04-04 16:11:18 +00:00
Vincent Lejeune bcbb13d691 R600: Use a mask for offsets when encoding instructions
llvm-svn: 178763
2013-04-04 14:00:09 +00:00
Vincent Lejeune 8e377fdba6 R600: Fix wrong address when substituting ENDIF
llvm-svn: 178762
2013-04-04 14:00:03 +00:00
Vincent Lejeune c44fa99719 R600: Take export into account when computing cf address
llvm-svn: 178761
2013-04-04 13:59:59 +00:00
Alexey Samsonov e2c772a1b0 Propagate path to ASan/MSan symbolizer into test environment to produce useful reports on errors.
llvm-svn: 178749
2013-04-04 07:41:00 +00:00
Nadav Rotem 319758aa7c Document the return value of SmallSet insert.
llvm-svn: 178742
2013-04-04 04:54:21 +00:00
Jakob Stoklund Olesen 8cfaffaade Add SPARC v9 support for select on 64-bit compares.
This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737
2013-04-04 03:08:00 +00:00
Rafael Espindola 5a2af525ab Explicitly add -Wl,--export-all-symbols on mingw/cygwin.
Looks like cmake on windows is not expanding ENABLE_EXPORTS to
-Wl,--export-all-symbols on mingw or cygwin, so add this back.

llvm-svn: 178730
2013-04-04 01:19:55 +00:00
Rafael Espindola 76f92277ca Don't export symbols in every binary on linux.
On freebsd this makes sure that symbols are exported on the binaries that need
them. The net result is that we should get symbols in the binaries that need
them on every platform.

On linux x86-64 this reduces the size of the bin directory from 262MB to 250MB.

Patch by Stephen Checkoway.

llvm-svn: 178725
2013-04-04 01:01:32 +00:00
Manman Ren 5a15c9ed9f Debug Info: according to DWARF 2, FORM_ref_addr the same size as an address on
the target system.

It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized testing case.

rdar://problem/13559431

llvm-svn: 178722
2013-04-04 00:22:54 +00:00
Michael Gottesman 21a4ed3227 Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from ObjCARCOpt::OptimizeReturns.
Now ObjCARCOpt::OptimizeReturns is easy to read and reason about.

llvm-svn: 178715
2013-04-03 23:39:14 +00:00
Michael Gottesman 6908db148b Refactored out the helper function FindPredecessorRetainWithSafePath from ObjCARCOpt::OptimizeReturns.
llvm-svn: 178714
2013-04-03 23:16:05 +00:00
Evan Cheng 51a7a9d712 Make it possible to include llvm-c without including C++ headers. Patch by Filip Pizlo.
llvm-svn: 178713
2013-04-03 23:12:39 +00:00
Michael Gottesman c2d5bf5c53 Small cleanups.
Cleaned up trailing whitespace and added extra slashes in front of a
function level comment so that it follow the convention of having 3
slashes.

llvm-svn: 178712
2013-04-03 23:07:45 +00:00
Michael Gottesman 54dc7fdefb Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method HasSafePathToPredecessorCall.
llvm-svn: 178710
2013-04-03 23:04:28 +00:00
Michael Gottesman 0a1748bb8c Removed an old comment.
llvm-svn: 178709
2013-04-03 23:04:24 +00:00
Michael Gottesman 43e7e00a68 Clean up arc annotations by moving the top/bottom BB annotations into conditional macros that no-op in Release mode instead of #ifdef sections of the code.
This is to follow the example of the DEBUG macro.

llvm-svn: 178705
2013-04-03 22:41:59 +00:00
Arnold Schwaighofer e9b5016411 X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Rafael Espindola 2025e8b820 Implement the "mips endian" for r_info.
Normally r_info is just a 32 of 64 bit number matching the endian of the rest
of the file. Unfortunately, mips 64 bit little endian is special: The top 32
bits are a little endian number and the following 32 are a big endian one.

llvm-svn: 178694
2013-04-03 21:02:51 +00:00
Richard Osborne 122acb216c [XCore] Check disassembly of the st8 instruction.
llvm-svn: 178689
2013-04-03 20:07:11 +00:00
Richard Osborne fb0b4ea3a7 [XCore] Update disassembler test to improve coverage of the instructions.
Previously some instructions were unintentionally covered twice and
others were not covered at all.

llvm-svn: 178688
2013-04-03 20:07:06 +00:00
Eric Christopher 9cad53cfec Implements low-level object file format specific output for COFF and
ELF with support for:

- File headers
- Section headers + data
- Relocations
- Symbols
- Unwind data (only COFF/Win64)

The output format follows a few rules:
- Values are almost always output one per line (as elf-dump/coff-dump already do). - Many values are translated to something readable (like enum names), with the raw value in parentheses.
- Hex numbers are output in uppercase, prefixed with "0x".
- Flags are sorted alphabetically.
- Lists and groups are always delimited.

Example output:
---------- snip ----------
Sections [
  Section {
    Index: 1
    Name: .text (5)
    Type: SHT_PROGBITS (0x1)
    Flags [ (0x6)
      SHF_ALLOC (0x2)
      SHF_EXECINSTR (0x4)
    ]
    Address: 0x0
    Offset: 0x40
    Size: 33
    Link: 0
    Info: 0
    AddressAlignment: 16
    EntrySize: 0
    Relocations [
      0x6 R_386_32 .rodata.str1.1 0x0
      0xB R_386_PC32 puts 0x0
      0x12 R_386_32 .rodata.str1.1 0x0
      0x17 R_386_PC32 puts 0x0
    ]
    SectionData (
      0000: 83EC04C7 04240000 0000E8FC FFFFFFC7  |.....$..........|
      0010: 04240600 0000E8FC FFFFFF31 C083C404  |.$.........1....|
      0020: C3                                   |.|
    )
  }
]
---------- snip ----------

Relocations and symbols can be output standalone or together with the section header as displayed in the example.
This feature set supports all tests in test/MC/COFF and test/MC/ELF (and I suspect all additional tests using elf-dump), making elf-dump and coff-dump deprecated.

Patch by Nico Rieck!

llvm-svn: 178679
2013-04-03 18:31:38 +00:00
Eric Christopher 2d4b3a6b94 Don't disassemble symbols with an unknown address or size.
Patch by Nico Rieck!

llvm-svn: 178678
2013-04-03 18:31:23 +00:00
Eric Christopher 8d67ab4f70 Implement sectionContainsSymbol for ELF.
Patch by Nico Rieck!

llvm-svn: 178677
2013-04-03 18:31:19 +00:00
Eric Christopher d5972ea8fc When dumping clear the arm/thumb flag for now.
Patch by Nico Rieck!

llvm-svn: 178676
2013-04-03 18:31:12 +00:00
Vincent Lejeune c3d3f9b66e R600: Fix last ALU of a clause being emitted in a separate clause
llvm-svn: 178675
2013-04-03 18:24:47 +00:00
Aaron Ballman 5e6d20524a Ensuring that both bits are set, and not just a combination of one or the other.
llvm-svn: 178674
2013-04-03 18:00:22 +00:00
Hal Finkel b0c810ff6d Cleanup PPC reciprocal-estimate functionality
Incorporating review feedback from Bill Schmidt on r178617. No functionality
change intended.

llvm-svn: 178672
2013-04-03 17:44:56 +00:00
Vincent Lejeune 80031d9fc4 R600: Factorize maximum alu per clause in a single location
llvm-svn: 178667
2013-04-03 16:49:34 +00:00
Aaron Ballman edc03c660c Testing for Visual Studio 2010 SP1 or greater before calling the _xgetbv intrinsic. This also fixes a minor code formatting issue.
llvm-svn: 178666
2013-04-03 16:28:24 +00:00
Vincent Lejeune b6d6c0d458 R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer
llvm-svn: 178665
2013-04-03 16:24:09 +00:00
Vincent Lejeune 9931298b30 R600: Consider KILLGT as an ALU instruction
Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664
2013-04-03 16:24:04 +00:00
Eli Bendersky b35a211f61 Measure time that IR parsing took as part of the -time-passes measurement.
llvm-svn: 178662
2013-04-03 15:33:45 +00:00
Hal Finkel 7ac4592e97 PPC: Enable FRES and FRSQRTE on the default PPC64 description
I discussed this with Bill Schmidt on IRC, and it was decided that this is a
safe and reasonable default.

llvm-svn: 178659
2013-04-03 14:40:18 +00:00
Hal Finkel 0c6d21933a PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern
llvm-svn: 178658
2013-04-03 14:40:16 +00:00
Hal Finkel 2ed21a8ca6 Remove some obsolete PowerPC/README entries
llvm-svn: 178657
2013-04-03 14:25:55 +00:00
Ulrich Weigand 084ff8e891 More direct types in PowerPC AltiVec intrinsics.
This patch follows up on work done by Bill Schmidt in r178277,
and replaces most of the remaining uses of VRRC in ISEL DAG patterns.

The resulting .inc files are identical except for comments, so
no change in code generation is expected.

llvm-svn: 178656
2013-04-03 14:08:13 +00:00
Bill Schmidt 92e26646bc Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Tim Northover 5816ca117b AArch64: implement ETMv4 trace system registers.
llvm-svn: 178637
2013-04-03 12:31:29 +00:00
Aaron Ballman 5f7c680fdc Second pass at addressing PR15351 by explicitly checking for AVX support
when getting the host processor information.  It emits a .byte sequence on GNUC compilers to work around lack of xgetbv support with older assemblers, and resolves a comment typo found in the previous patch.

llvm-svn: 178636
2013-04-03 12:25:06 +00:00
Timur Iskhodzhanov 7205c72d84 Temporarily relax the WIN32 checks in the SRet test to fix the Atom D2700 bot
llvm-svn: 178635
2013-04-03 12:17:15 +00:00
Timur Iskhodzhanov f4e0665e56 Fix SRet for thiscall in i686-pc-win32
llvm-svn: 178634
2013-04-03 11:27:54 +00:00
Tim Northover 5b097a735f AArch64: switch patterns to be type-based rather than RegClass-based
It's a bit of churn in the blame log, but I think there are real benefits to
the newer system so I'm making the change in one go.

llvm-svn: 178633
2013-04-03 11:19:16 +00:00
Eric Christopher 14c2067ca1 Fix grammar.
llvm-svn: 178624
2013-04-03 05:29:58 +00:00
Eric Christopher 5590949f29 Remove ZeroOrMore from the option description. We don't need it here.
llvm-svn: 178623
2013-04-03 05:26:07 +00:00
Jakob Stoklund Olesen d9bbdfd3cc Add 64-bit compare + branch for SPARC v9.
The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621
2013-04-03 04:41:44 +00:00
Hal Finkel b00fc87608 Remove some unsupported-feature comments from PPC.td
These refer to the reciprocal estimate support recently committed.

llvm-svn: 178618
2013-04-03 04:03:58 +00:00
Hal Finkel 2e10331057 Use PPC reciprocal estimates with Newton iteration in fast-math mode
When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of missing fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617
2013-04-03 04:01:11 +00:00
Rafael Espindola b9b7ae0c78 Fix the fde encoding used by mips to match gas.
This finally fixes the encoding. The patch also
* Removes eh-frame.ll. It was an unnecessary .ll to .o test that was checking
  the wrong value.
* Merge fde-reloc.s and eh-frame.s into a single test, since the only difference
  was the run lines.
* Don't blindly test the content of the entire .eh_frame section. It makes it
  hard to anyone actually fixing a bug and hitting a difference in a binary
  blob. Instead, use a CHECK for each field and document what is being checked.

llvm-svn: 178615
2013-04-03 03:13:19 +00:00
Aaron Ballman 9c0f0af54f Rolling back the AVX support patch due to breaking a gcc 4.6 build bot that doesn't understand the xgetbv instruction for some reason. Will revisit when time permits.
llvm-svn: 178614
2013-04-03 03:11:39 +00:00
Michael Gottesman b8c8836594 Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue.
The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.

Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.

*Even though objc_autoreleaseReturnValue if you know that no RV
optimization will occur is more computationally expensive.

llvm-svn: 178612
2013-04-03 02:57:24 +00:00
Michael Gottesman 624243914f Improved comment. No functionality change.
llvm-svn: 178605
2013-04-03 01:57:16 +00:00
Aaron Ballman 56be6ba5e4 Attempting to fix the build on older GCC versions.
llvm-svn: 178604
2013-04-03 01:39:37 +00:00
Rafael Espindola aff6081415 Remove anonymous namespace.
Looks like the gcc in http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32/ doesn't like "not external linkage":

/Volumes/Macintosh_HD2/buildbots/clang-x86_64-darwin11-self-mingw32/llvm.src/include/llvm/Support/YAMLTraits.h: In instantiation of 'const bool llvm::yaml::has_SequenceMethodTraits<std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> > >::value':
/Volumes/Macintosh_HD2/buildbots/clang-x86_64-darwin11-self-mingw32/llvm.src/include/llvm/Support/YAMLTraits.h:281:   instantiated from 'llvm::yaml::has_SequenceTraits<std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> > >'
/Volumes/Macintosh_HD2/buildbots/clang-x86_64-darwin11-self-mingw32/llvm.src/utils/yaml2obj/yaml2obj.cpp:627:   instantiated from here
/Volumes/Macintosh_HD2/buildbots/clang-x86_64-darwin11-self-mingw32/llvm.src/include/llvm/Support/YAMLTraits.h:243: error: 'llvm::yaml::SequenceTraits<std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> > >::size' is not a valid template argument for type 'size_t (*)(llvm::yaml::IO&, std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> >&)' because function 'static size_t llvm::yaml::SequenceTraits<std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> > >::size(llvm::yaml::IO&, std::vector<<unnamed>::COFFYAML::Relocation, std::allocator<<unnamed>::COFFYAML::Relocation> >&)' has not external linkage

llvm-svn: 178600
2013-04-03 01:07:53 +00:00
Aaron Ballman 6bc0dfc7bd This patch addresses PR15351 by explicitly checking for AVX support
when getting the host processor information.

llvm-svn: 178598
2013-04-03 00:33:32 +00:00
Rafael Espindola e1d9afa82d Use yaml::IO in yaml2obj.cpp.
The generic structs and specializations will be refactored when obj2yaml is
changed to use yaml::IO.

llvm-svn: 178593
2013-04-02 23:56:40 +00:00
Eric Christopher e2fbc67e81 Formatting.
llvm-svn: 178589
2013-04-02 23:06:40 +00:00
Akira Hatanaka 023c678a0d [mips] Small update to the implementation of eh.return for Mips.
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where handler to which eh.return jumps 
points to the start of the function.

Patch by Sasa Stankovic.

llvm-svn: 178588
2013-04-02 23:02:07 +00:00
Eric Christopher 6476f908b3 Support and test template arguments for unions.
llvm-svn: 178586
2013-04-02 22:55:56 +00:00
Eric Christopher 17dd8f07c6 Reformat arguments.
llvm-svn: 178585
2013-04-02 22:55:52 +00:00
Akira Hatanaka 2ffc5734e7 [mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp.
This patch fixes the following two tests which have been failing on
llvm-mips-linux builder since r178403:

LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll
LLVM :: Analysis/Profiling/load-branch-weights-loops.ll

llvm-svn: 178584
2013-04-02 22:53:58 +00:00
NAKAMURA Takumi fc613f4d61 llvm/test/CodeGen/X86: Unmark them out of XFAIL:cygming, in atomic{32|64}.ll and handle-move.ll, corresponding to r178549.
This reverts r176808, r176798, and r177914.

llvm-svn: 178583
2013-04-02 22:35:08 +00:00
Jakob Stoklund Olesen aeb69a5481 Allow MachineTraceMetrics to be used when the model has no resources.
It it still possible to extract information from itineraries, for
example.

llvm-svn: 178582
2013-04-02 22:27:45 +00:00
Jakub Staszak 09da37d10d Fix a typo.
llvm-svn: 178567
2013-04-02 20:02:36 +00:00
Chad Rosier 8a24466f69 [ms-inline asm] Add support for parsing variables with namespace alias
qualifiers.

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566
2013-04-02 20:02:33 +00:00
Manman Ren c018eea684 Add MDBuilder utilities for path-aware TBAA.
Add utilities to create struct nodes in TBAA type DAG and to create path-aware
tags. The format of struct nodes in TBAA type DAG: a unique name, a list of
fields with field offsets and field types. The format of path-aware tags:
a base type in TBAA type DAG, an access type and an offset relative to the base
type.

llvm-svn: 178564
2013-04-02 19:50:49 +00:00
Bill Schmidt 3581cd4b4c Fix PR15630: Replace faulty stdcx. with stwcx.
When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559
2013-04-02 18:37:08 +00:00
Jakob Stoklund Olesen 8fbfc59164 Don't attempt MTM heuristics without a scheduling model present.
This should fix the PPC buildbots.

llvm-svn: 178558
2013-04-02 18:26:45 +00:00
Jakob Stoklund Olesen 3ca14772d0 Count processor resources individually in MachineTraceMetrics.
The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.

The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.

This gives more precise resource information, particularly in traces
that use one resource a lot more than others.

llvm-svn: 178553
2013-04-02 17:49:51 +00:00
Chad Rosier 7925d280ff [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
llvm-svn: 178549
2013-04-02 16:31:41 +00:00
Arnold Schwaighofer d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Preston Gurd 95cbee6ce4 Simplify test cases for Atom preferring call register indirect over
call memory indirect (32 and 64 bit).

llvm-svn: 178541
2013-04-02 14:25:06 +00:00
Justin Holewinski a922c7e90e [NVPTX] Fix a few style issues in NVVMReflect
llvm-svn: 178536
2013-04-02 12:37:11 +00:00
Bill Wendling 88d06c3b2d Use a worklist to avoid a sneaky iterator invalidation.
The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.

Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.

PR15440

llvm-svn: 178531
2013-04-02 08:16:45 +00:00
Jakob Stoklund Olesen 8eabc3ffde Add 64-bit load and store instructions.
There is only a few new instructions, the rest is handled with patterns.

llvm-svn: 178528
2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen 917e07f095 Basic 64-bit ALU operations.
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527
2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen bddb20eeef Materialize 64-bit immediates.
The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526
2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen c1d1a4816e Add 64-bit shift instructions.
SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525
2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen 739d722ef7 Add predicates for distinguishing 32-bit and 64-bit modes.
The 'sparc' architecture produces 32-bit code while 'sparcv9' produces
64-bit code.

It is also possible to run 32-bit code using SPARC v9 instructions with:

  llc -march=sparc -mattr=+v9

llvm-svn: 178524
2013-04-02 04:09:06 +00:00
Jakob Stoklund Olesen 0b21f35aca Add support for 64-bit calling convention.
This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523
2013-04-02 04:09:02 +00:00
Jakob Stoklund Olesen 5ad3b35377 Add an I64Regs register class for 64-bit registers.
We are going to use the same registers for 32-bit and 64-bit values, but
in two different register classes. The I64Regs register class has a
larger spill size and alignment.

The addition of an i64 register class confuses TableGen's type
inference, so it is necessary to clarify the type of some immediates and
the G0 register.

In 64-bit mode, pointers are i64 and should use the I64Regs register
class. Implement getPointerRegClass() to dynamically provide the pointer
register class depending on the subtarget. Use ptr_rc and iPTR for
memory operands.

Finally, add the i64 type to the IntRegs register class. This register
class is not used to hold i64 values, I64Regs is for that. The type is
required to appease TableGen's type checking in output patterns like this:

  def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>;

SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and
TableGen doesn't know to check the type of register sub-classes.

llvm-svn: 178522
2013-04-02 04:08:54 +00:00
Hal Finkel 93d75ea08a Fix typo in PPCISelLowering
Thanks to Bill Schmidt for finding this in review of r178480.

llvm-svn: 178521
2013-04-02 03:29:51 +00:00
Andrew Trick e1d88cfb57 The divide unit is not pipeline, but it is still buffered.
Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
  let Latency = 25;
  let ResourceCycles = [1, 10];
}

We don't currently distinguish between an unpipeline operation and one
that is split into multiple micro-ops requiring the same unit. Except
that the later may have NumMicroOps > 1 if they also consume
issue/dispatch resources.

llvm-svn: 178519
2013-04-02 01:58:47 +00:00
Chris Lattner e381a8e5d0 unindent the file to follow coding standards, change class doc comment
to be correct.  No functionality or behavior change.

llvm-svn: 178511
2013-04-01 23:00:01 +00:00
NAKAMURA Takumi fd98f7f2b6 Target/R600: Fix CMake build to add missing files.
llvm-svn: 178508
2013-04-01 22:05:58 +00:00
Jack Carter 9423f507b1 Mips direct object exception handling regression
Revision 177141 caused a regression in all but
mips64 little endian. That is because none of the
other Mips targets had test cases checking the 
contents of the .eh_frame section. This patch fixes
both the llvm code and adds an assembler test case 
to include the current 4 flavors.

The test cases unfortunately rely on llvm-objdump. A
preferable method would be to use a pretty printer output
such as what readelf -wf <elf_file> would give.

I also changed the name of the test case to correct a typo.

llvm-svn: 178506
2013-04-01 21:55:15 +00:00
Vincent Lejeune bfaa63a6db R600: Add support for native control flow
llvm-svn: 178505
2013-04-01 21:48:05 +00:00
Vincent Lejeune ace6f7351e R600/SI: Share code recording ShaderTypeAttribute between generations
llvm-svn: 178504
2013-04-01 21:47:53 +00:00
Vincent Lejeune f43bc57b66 R600: Emit CF_ALU and use true kcache register.
llvm-svn: 178503
2013-04-01 21:47:42 +00:00
Eli Bendersky e60fc2f676 Fix top-comment header and some indentation
llvm-svn: 178492
2013-04-01 19:47:56 +00:00
Hal Finkel 3f88d08974 Fix a bad assert in PPCTargetLowering
llvm-svn: 178489
2013-04-01 18:42:58 +00:00
Hal Finkel c2eddb0d02 Add triple to test/CodeGen/PowerPC/stfiwx-2
llvm-svn: 178486
2013-04-01 18:18:44 +00:00
Shuxin Yang 6662fd0f15 Correct assertion condition
llvm-svn: 178484
2013-04-01 18:13:05 +00:00
Arnold Schwaighofer 6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Hal Finkel f6d45f2379 Add more PPC floating-point conversion instructions
The P7 and A2 have additional floating-point conversion instructions which
allow a direct two-instruction sequence (plus load/store) to convert from all
combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores,
only some combinations were directly available).

llvm-svn: 178480
2013-04-01 17:52:07 +00:00
Hal Finkel 39caf9f5ec Use ImmToIdxMap.count in PPCRegisterInfo
Code improvement suggested by Jakob (in review of r178450). No functionality
change intended.

llvm-svn: 178473
2013-04-01 17:02:06 +00:00
Hal Finkel c006295f3c Fix PowerPC/cttz.ll to specify a cpu (and use FileCheck)
llvm-svn: 178472
2013-04-01 16:31:56 +00:00
Hal Finkel 290376dd78 Add the PPC popcntw instruction
The popcntw instruction is available whenever the popcntd instruction is
available, and performs a separate popcnt on the lower and upper 32-bits.
Ignoring the high-order count, this can be used for the 32-bit input case
(saving on the explicit zero extension otherwise required to use popcntd).

llvm-svn: 178470
2013-04-01 15:58:15 +00:00
Nadav Rotem be79a7ac7a Add support for vector data types in the LLVM interpreter.
Patch by:
Veselov, Yuri <Yuri.Veselov@intel.com>

llvm-svn: 178469
2013-04-01 15:53:30 +00:00
Hal Finkel 60c7510711 Treat PPCISD::STFIWX like the memory opcode that it is
PPCISD::STFIWX is really a memory opcode, and so it should come after
FIRST_TARGET_MEMORY_OPCODE, and we should use DAG.getMemIntrinsicNode to create
nodes using it.

No functionality change intended (although there could be optimization benefits
from preserving the MMO information).

llvm-svn: 178468
2013-04-01 15:37:53 +00:00
Duncan Sands fee96f832d Remove unused typedef.
llvm-svn: 178462
2013-04-01 13:46:15 +00:00
Arnold Schwaighofer 6793aebb84 ARM Scheduler Model: Add resources instructions, map resources in subtargets
Reapply r177968:
After commit 178074 we can now have undefined scheduler variants.

Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

Incooperate Andrew's feedback.

llvm-svn: 178460
2013-04-01 13:07:05 +00:00
Benjamin Kramer 52ceb44331 X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts.
llvm-svn: 178459
2013-04-01 10:23:49 +00:00
Joe Abbey bc6f4baea9 Whitespace cleanup
llvm-svn: 178454
2013-04-01 02:28:07 +00:00
Vincent Lejeune 53f3525d35 R600: Emit native instructions for tex
llvm-svn: 178452
2013-03-31 19:33:04 +00:00
Duncan Sands e1aa194aab There is no longer any need to silence this compiler warning as the warning has
been turned off globally.

llvm-svn: 178451
2013-03-31 17:44:09 +00:00
Hal Finkel 8540f7771c Cleanup ImmToIdxMap and noImmForm in PPCRegisterInfo
ImmToIdxMap should be a DenseMap (not a std::map) because there
is no ordering requirement. Also, we don't need a separate list
of instructions for noImmForm in eliminateFrameIndex, because this
list is essentially the complement of the keys in ImmToIdxMap.

No functionality change intended.

llvm-svn: 178450
2013-03-31 14:43:31 +00:00
Benjamin Kramer b60633fb87 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

llvm-svn: 178448
2013-03-31 12:49:15 +00:00
Hal Finkel beb296bea1 Add the PPC lfiwax instruction
This instruction is available on modern PPC64 CPUs, and is now used
to improve the SINT_TO_FP lowering (by eliminating the need for the
separate sign extension instruction and decreasing the amount of
needed stack space).

llvm-svn: 178446
2013-03-31 10:12:51 +00:00
Hal Finkel e53429a13e Cleanup PPC(64) i32 -> float/double conversion
The existing SINT_TO_FP code for i32 -> float/double conversion was disabled
because it relied on broken EXTSW_32/STD_32 instruction definitions. The
original intent had been to enable these 64-bit instructions to be used on CPUs
that support them even in 32-bit mode.  Unfortunately, this form of lying to
the infrastructure was buggy (as explained in the FIXME comment) and had
therefore been disabled.

This re-enables this functionality, using regular DAG nodes, but only when
compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead)
are removed.

llvm-svn: 178438
2013-03-31 01:58:02 +00:00
Benjamin Kramer 9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Justin Holewinski 45df882045 Add start of user documentation for NVPTX
Summary: This is the beginning of user documentation for the NVPTX back-end.  I want to ensure I am integrating this properly into the rest of the LLVM documentation.

Differential Revision: http://llvm-reviews.chandlerc.com/D600

llvm-svn: 178428
2013-03-30 16:41:14 +00:00
Benjamin Kramer 9c9e0a2c04 Change '@SECREL' suffix to GAS-compatible '@SECREL32'.
'@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'.
With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here).

Patch by David Nadlinger!
Differential Revision: http://llvm-reviews.chandlerc.com/D429

llvm-svn: 178427
2013-03-30 16:21:50 +00:00
Sean Silva 2a74699dd5 [docs] llvmbugs is not the place for patches.
llvm-svn: 178426
2013-03-30 15:33:02 +00:00
Sean Silva ab4997d0ec [docs] Annotate mailing lists with their "name".
Nobody says "the developer's list" or "commits archive"; they always say
"llvmdev" or "llvm-commits". It makes sense for our documentation to
at least make that association explicitly.

llvm-svn: 178425
2013-03-30 15:33:01 +00:00
Sean Silva 84b296c8d1 [docs] Reorganize mailing lists.
Order them roughly by "which one should a newbie join first".

llvm-svn: 178424
2013-03-30 15:32:54 +00:00
Sean Silva 0129924384 [docs] Pull IRC and Mailing Lists under a new "Community" heading.
llvm-svn: 178423
2013-03-30 15:32:51 +00:00
Sean Silva 163b5c4bd5 [docs] The GEP FAQ is not "design and overview"
llvm-svn: 178422
2013-03-30 15:32:50 +00:00
Sean Silva 9d205c8edc [docs] Put DeveloperPolicy under "Development Process Documentation"
llvm-svn: 178421
2013-03-30 15:32:47 +00:00
Benjamin Kramer a73cc5eead Put private class into an anonmyous namespace.
llvm-svn: 178420
2013-03-30 15:23:08 +00:00
Justin Holewinski 59fd8ba5f5 [NVPTX] Remove support for SM < 2.0. This was never fully supported anyway.
llvm-svn: 178417
2013-03-30 14:29:30 +00:00
Justin Holewinski b94bd05b95 [NVPTX] Add NVVMReflect pass to allow compile-time selection of
specific code paths.

This allows us to write code like:

  if (__nvvm_reflect("FOO"))
    // Do something
  else
    // Do something else

and compile into a library, then give "FOO" a value at kernel
compile-time so the check becomes a no-op.

llvm-svn: 178416
2013-03-30 14:29:25 +00:00
Justin Holewinski 0497ab142d [NVPTX] Run clang-format on all NVPTX sources.
Hopefully this resolves any outstanding style issues and gives us
an automated way of ensuring we conform to the style guidelines.

llvm-svn: 178415
2013-03-30 14:29:21 +00:00
Benjamin Kramer 06c44dce73 Object: Turn a couple of degenerate for loops into while loops.
No functionality change.

llvm-svn: 178413
2013-03-30 13:07:51 +00:00
Shuxin Yang 7b0c94e207 Implement XOR reassociation. It is based on following rules:
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviwed by Pete Cooper. Thanks a lot!

 rdar://13212115  

llvm-svn: 178409
2013-03-30 02:15:01 +00:00
Akira Hatanaka b3c1847b30 [mips] Add patterns for DSP indexed load instructions.
llvm-svn: 178408
2013-03-30 02:14:45 +00:00
Akira Hatanaka b1457304cc [mips] Define reg+imm load/store pattern templates.
llvm-svn: 178407
2013-03-30 02:01:48 +00:00
Akira Hatanaka fb221c197d [mips] Fix DSP instructions to have explicit accumulator register operands.
Check that instruction selection can select multiply-add/sub DSP instructions
from a pattern that doesn't have intrinsics.

llvm-svn: 178406
2013-03-30 01:58:00 +00:00
Akira Hatanaka 33c060480d Remove unused variables.
llvm-svn: 178405
2013-03-30 01:46:28 +00:00
Akira Hatanaka 9efcd76c2c [mips] Move the code which does dag-combine for multiply-add/sub nodes to
derived class MipsSETargetLowering.

We shouldn't be generating madd/msub nodes if target is Mips16, since Mips16
doesn't have support for multipy-add/sub instructions.

llvm-svn: 178404
2013-03-30 01:42:24 +00:00
Akira Hatanaka be8612f6f4 [mips] Fix definitions of multiply, multiply-add/sub and divide instructions.
The new instructions have explicit register output operands and use table-gen
patterns instead of C++ code to do instruction selection.

Mips16's instructions are unaffected by this change.

llvm-svn: 178403
2013-03-30 01:36:35 +00:00
Akira Hatanaka f0ea500c14 [mips] Remove function getFPBranchCodeFromCond. Rename invertFPCondCodeAdd.
llvm-svn: 178396
2013-03-30 01:16:38 +00:00
Akira Hatanaka d5a0e096bc Fix indentation.
llvm-svn: 178395
2013-03-30 01:15:17 +00:00
Akira Hatanaka 28721bd7dd [mips] Add mips-specific nodes which will be used to select multiply and divide
instructions.

llvm-svn: 178394
2013-03-30 01:14:04 +00:00
Akira Hatanaka 3a34d14745 [mips] Implement getRepRegClassFor in MipsSETargetLowering. This function is
called in several places in ScheduleDAGRRList.cpp.

llvm-svn: 178393
2013-03-30 01:12:05 +00:00
Akira Hatanaka cd77e15cfb [mips] Fix MipsSEInstrInfo::copyPhysReg, loadRegFromStack and storeRegToStack
to handle accumulator registers.

llvm-svn: 178392
2013-03-30 01:08:05 +00:00
Akira Hatanaka 3b70145184 [mips] Expand pseudo load, store and copy instructions right before
callee-saved scan.

The code makes use of register's scavenger's capability to spill multiple
registers.

llvm-svn: 178391
2013-03-30 01:04:11 +00:00
Akira Hatanaka c8d85025a0 [mips] Define pseudo instructions for spilling and copying accumulator
registers.

llvm-svn: 178390
2013-03-30 00:54:52 +00:00
Eric Christopher 4887c8f4ff Use SmallVectorImpl instead of SmallVector at the uses.
llvm-svn: 178386
2013-03-29 23:34:06 +00:00
Bob Wilson f36f15fc06 Run the ObjCARCContract pass for LTO. <rdar://problem/13538084>
llvm-svn: 178385
2013-03-29 23:28:55 +00:00
Michael Gottesman 9412830090 Updated test0 of retain-not-declared.ll to reflect the fact that objc-arc-expand runs before objc-arc/objc-arc-contract.
Specifically, objc-arc-expand will make sure that the
objc_retainAutoreleasedReturnValue, objc_autoreleaseReturnValue, and ret
will all have %call as an argument.

llvm-svn: 178382
2013-03-29 22:44:59 +00:00
Jean-Luc Duprat 89fe247094 SmallVector and SmallPtrSet allocations now power-of-two aligned.
This time tested on both OSX and Linux.

llvm-svn: 178377
2013-03-29 22:07:12 +00:00
Sean Silva c9fbd23621 [docs] The STL "binary search" has a non-obvious name.
std::lower_bound is the canonical "binary search" in the STL
(std::binary_search generally is not what you want). The name actually
makes a lot of sense (and also has a beautiful symmetry with the
std::upper_bound algorithm). The name is nonetheless non-obvious.

Also, remove mention of "radix search". It's not even clear how that
would work in the context of a sorted vector. AFAIK "radix search" only
makes sense when you have a trie-like data structure.

llvm-svn: 178376
2013-03-29 21:57:47 +00:00
Timur Iskhodzhanov 64a5cf5617 Exclude the X86/complex-fca.ll test at it probably wasn't supposed to work on Windows
llvm-svn: 178375
2013-03-29 21:54:00 +00:00
Michael Gottesman 3b8f877860 Add clang.arc.used to ModuleHasARC so ARC always runs if said call is present in a module.
clang.arc.used is an interesting call for ARC since ObjCARCContract
needs to run to remove said intrinsic to avoid a linker error (since the
call does not exist).

llvm-svn: 178369
2013-03-29 21:15:23 +00:00
Jyotsna Verma add82b3c75 Hexagon: Add emitFrameIndexDebugValue function to emit debug information.
llvm-svn: 178368
2013-03-29 21:09:53 +00:00
Eric Christopher 9c8414f84a Use 12 as the magic number for our abbreviation data and our
die values. A lot of DIEs have 10 attributes in C++ code (example
clang), none had more than 12. Seems like a good default.

llvm-svn: 178366
2013-03-29 20:23:06 +00:00
Eric Christopher 6be35037b5 Move the construction of the skeleton compile unit after the
entire original compile unit has been constructed.

llvm-svn: 178365
2013-03-29 20:23:02 +00:00
Adrian Prantl 30cf851ef2 move testcase into appropriate X86 subdirectory.
llvm-svn: 178364
2013-03-29 20:14:08 +00:00
Hal Finkel f8ac57e289 Implement FRINT lowering on PPC using frin
Like nearbyint, rint can be implemented on PPC using the frin instruction. The
complication comes from the fact that rint needs to set the FE_INEXACT flag
when the result does not equal the input value (and frin does not do that). As
a result, we use a custom inserter which, after the rounding, compares the
rounded value with the original, and if they differ, explicitly sets the XX bit
in the FPSCR register (which corresponds to FE_INEXACT).

Once LLVM has better modeling of the floating-point environment we should be
able to (often) eliminate this extra complexity.

llvm-svn: 178362
2013-03-29 19:41:55 +00:00
Akira Hatanaka 7b8b9b9abf [mips] Define a function which returns the GPR register class.
llvm-svn: 178359
2013-03-29 19:17:42 +00:00
Andrew Trick d97ff1fcee Fix TableGen subtarget-emitter to handle A9/Swift.
A9 uses itinerary classes, Swift uses RW lists. This tripped some
verification when we're expanding variants. I had to refine the
verification a bit.

llvm-svn: 178357
2013-03-29 19:08:31 +00:00
Matt Arsenault 19f773be37 Build fixes for STLPort + GCC
llvm-svn: 178356
2013-03-29 18:48:45 +00:00
Matt Arsenault 2080ecd107 Fix loop style
llvm-svn: 178355
2013-03-29 18:48:42 +00:00
Adrian Prantl 4b7cf64f66 Split the llvm/tools/clang/test/CodeGenObjC/debug-info-blocks.m testcase into a CFE and LLVM part.
rdar://problem/12767564

llvm-svn: 178353
2013-03-29 18:08:14 +00:00
Benjamin Kramer 70671b9937 Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Nadav Rotem 6036f581aa Fix a typo
llvm-svn: 178346
2013-03-29 16:34:23 +00:00
Jyotsna Verma 26226cea4b Hexagon: Disable DwarfUsesInlineInfoSection flag.
llvm-svn: 178345
2013-03-29 15:46:12 +00:00
Hal Finkel c20a08d25b Add PPC FP rounding instructions fri[mnpz]
These instructions are available on the P5x (and later) and on the A2. They
implement the standard floating-point rounding operations (floor, trunc, etc.).
One caveat: frin (round to nearest) does not implement "ties to even", and so
is only enabled in fast-math mode.

llvm-svn: 178337
2013-03-29 08:57:48 +00:00
Rafael Espindola de65751493 Revert "Fix allocations of SmallVector and SmallPtrSet so they are more prone to"
This reverts commit 617330909f0c26a3f2ab8601a029b9bdca48aa61.

It broke the bots:

/home/clangbuild2/clang-ppc64-2/llvm.src/unittests/ADT/SmallVectorTest.cpp:150: PushPopTest
/home/clangbuild2/clang-ppc64-2/llvm.src/unittests/ADT/SmallVectorTest.cpp:118: Failure
Value of: v[i].getValue()
  Actual: 0
Expected: value
Which is: 2

llvm-svn: 178334
2013-03-29 07:11:21 +00:00
Jean-Luc Duprat 67ce1472b4 Fix allocations of SmallVector and SmallPtrSet so they are more prone to
being power-of-two sized.

llvm-svn: 178332
2013-03-29 05:45:22 +00:00
Michael Gottesman 60f6b28c58 Removed trailing whitespace.
llvm-svn: 178329
2013-03-29 05:13:07 +00:00
Akira Hatanaka f05e9ad59f [mips] Change type of accumulator registers to Untyped. Add two more accumulator
register classes for Mips64 and DSP-ASE.

No functionality changes.

llvm-svn: 178328
2013-03-29 03:27:21 +00:00
Akira Hatanaka 465faccafa [mips] Define overloaded versions of storeRegToStack and loadRegFromStack.
No functionality changes.

llvm-svn: 178327
2013-03-29 02:14:12 +00:00
Akira Hatanaka 11184e4c8c [mips] Add parameter Alignment to MipsFrameLowering's constructor.
No functionality changes.

llvm-svn: 178326
2013-03-29 01:51:04 +00:00
Dan Gohman f6169d020c Revert r178166. According to Howard, this code is actually ok.
llvm-svn: 178319
2013-03-29 00:13:08 +00:00
Jack Carter 311246c6d5 [Mips Assembler] Add support for OR macro with imediate opperand
Mips assembler supports macros that allows the OR instruction 
to have an immediate parameter. This patch adds an instruction 
alias that converts this macro into a Mips ORI instruction. 

Contributer: Vladimir Medic
llvm-svn: 178316
2013-03-28 23:45:13 +00:00
Michael Liao a486a11dcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao 5fff5c7b26 Enhance boolean simplification to handle 16-/64-bit RDRAND
- RDRAND always clears the destination value when a random value is not
  available (i.e. CF == 0). This value is truncated or zero-extended as
  the false boolean value to be returned. Boolean simplification needs
  to skip this 'zext' or 'trunc' node.

llvm-svn: 178312
2013-03-28 23:38:52 +00:00
Michael Liao 96b42608ab Skip moving call address loading into callseq when targets prefer register indirect call.
To enable a load of a call address to be folded with that call, this
load is moved from outside of callseq into callseq. Such a moving
adds a non-glued node (that load) into a glued sequence. This non-glue
load is only removed when DAG selection folds them into a memory form
call instruction. When such instruction selection is disabled, it breaks
DAG schedule.

To prevent that, such moving is disabled when target favors register
indirect call.

Previous workaround disabling CALL32m/CALL64m insn selection is removed.

llvm-svn: 178308
2013-03-28 23:13:21 +00:00
Michael Gottesman ba64859e6e Removed dead code from ObjCARCOpts relating to tracking objc_retainBlocks through the ARC Dataflow analysis. By the time we get to the ARC dataflow analysis, any objc_retainBlock calls are not optimizable.
llvm-svn: 178306
2013-03-28 23:08:44 +00:00
Chad Rosier dbac025d84 [fast-isel] Add a preemptive fix for the case where we fail to materialize an
immediate in a register.  I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.

I've added an assert to ensure my assumtion is correct.  If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.

llvm-svn: 178305
2013-03-28 23:04:47 +00:00
Jack Carter e1d85d55e6 [Mips Assembler] Add alias definitions for jal
Mips assembler allows following to be used as aliased instructions:
jal $rs for jalr $rs
jal $rd,$rd for jalr $rd,$rs

This patch provides alias definitions in td files and test cases to show the usage.

Contributer: Vladimir Medic
llvm-svn: 178304
2013-03-28 23:02:21 +00:00
Nadav Rotem ff8c45529c Add the X86 FMAs to the scheduling model.
llvm-svn: 178303
2013-03-28 22:54:45 +00:00
Bill Wendling 85722f48da Minor simplification.
Go ahead and use the full path for both the .gcno and .gcda files.

llvm-svn: 178302
2013-03-28 22:40:08 +00:00
Nadav Rotem e7b6a8aa8c Add the Haswell machine model.
llvm-svn: 178301
2013-03-28 22:34:46 +00:00
Nadav Rotem a20ec3164e Remove the unused port from the SandyBridge machine model
llvm-svn: 178300
2013-03-28 22:32:41 +00:00
Michael Liao c93fe7f8b2 Add ADX CPUID detection
llvm-svn: 178299
2013-03-28 22:29:53 +00:00
Eric Christopher 6c75232cf0 These two are default in the constructor for MCAsmInfo.
llvm-svn: 178293
2013-03-28 21:37:18 +00:00
Timur Iskhodzhanov a2fd5fdd7a Make Win32 put the SRet address into EAX, fixes PR15556
llvm-svn: 178291
2013-03-28 21:30:04 +00:00
Hal Finkel fd53ddd71c Specify CPUs on the PPC bswap-load-store test
Otherwise, the CHECK-NOT's might trigger depending on the host's CPU.

llvm-svn: 178287
2013-03-28 20:35:18 +00:00
Hal Finkel 22e41c411e Only enable 64-bit bswap DAG combines for PPC64
Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were
added for bswap with load/store. This is because these combines are really only
valid in 64-bit mode, regardless of the CPU (and this was not being checked).

llvm-svn: 178286
2013-03-28 20:23:46 +00:00
Michael Gottesman 49f9885a2a Non optimizable objc_retainBlock calls are not forwarding.
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.

<rdar://problem/13249661>.

llvm-svn: 178285
2013-03-28 20:11:30 +00:00
Michael Gottesman 158fdf699e [ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the objc_retainBlock is optimizable.
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.

Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.

This patch simplifies said code by:

1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).

2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).

This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.

<rdar://problem/13249661>.

llvm-svn: 178284
2013-03-28 20:11:19 +00:00
Jyotsna Verma a46059b74d Hexagon: Replace switch-case in isDotNewInst with TSFlags.
llvm-svn: 178281
2013-03-28 19:44:04 +00:00
Hal Finkel 93492fa696 Fix bad indentation in r178276
Thanks to Bill Schmidt for pointing this out!

llvm-svn: 178280
2013-03-28 19:43:12 +00:00
Jyotsna Verma 27c06f3322 Hexagon: Enable SupportDebugInfomation and DwarfInSection flags.
llvm-svn: 178279
2013-03-28 19:34:49 +00:00
Akira Hatanaka 19468cafad Remove -O3.
llvm-svn: 178278
2013-03-28 19:34:14 +00:00
Bill Schmidt 74b2e72ab3 Use direct types in most PowerPC Altivec instructions and patterns.
This follows up Ulrich Weigand's work in PPCInstrInfo.td and
PPCInstr64Bit.td by doing the corresponding work for most of the
Altivec patterns.  I have not been able to do anything for the
following classes of instructions:

(1) Vector logicals.  These don't have corresponding intrinsics and
don't have a single obvious vector type.  So far as I can tell I need
to leave these as VRRC.  Affected instructions are:  VAND, VANDC,
VNOR, VOR, VXOR, V_SET0.

(2) Instructions that make use of vector shuffle.  The selection code
promotes all shuffles to v16i8, so any pattern that matches on a
shuffle is constrained.  I haven't found any way to make the patterns
match on their natural types, so I plan to leave these as VRRC.
Affected instructions are:  VMRG*, VSPLTB, VSPLTH, VSPLTW, VPKUHUM,
VPKUWUM.

No change in behavior is anticipated.

llvm-svn: 178277
2013-03-28 19:27:24 +00:00
Hal Finkel 31d2956510 Add the PPC64 ldbrx/stdbrx instructions
These are 64-bit load/store with byte-swap, and available on the P7 and the A2.
Like the similar instructions for 16- and 32-bit words, these are matched in the
target DAG-combine phase against load/store-bswap pairs.

llvm-svn: 178276
2013-03-28 19:25:55 +00:00
Gordon Keiser 772cf466da Fix issue with disassembler decoding CBZ/CBNZ immediates as negatives when the upper bit is set.
They should always be zero-extended, not sign extended.  Added test case.

llvm-svn: 178275
2013-03-28 19:22:28 +00:00
Gordon Keiser fb1ce5fa25 Testing commit access to llvm. Remove two lines of whitespace from the Thumb README.
llvm-svn: 178256
2013-03-28 18:26:15 +00:00
Thomas Schwinge b1322d5691 Correct spelling of Git.
llvm-svn: 178254
2013-03-28 18:06:20 +00:00
Rafael Espindola ddbf4f47dc Move test since it depends on the X86 backend.
llvm-svn: 178249
2013-03-28 17:01:28 +00:00
Jyotsna Verma 93e740485f Hexagon: Use multiclass for gp-relative instructions.
Remove noV4T gp-relative instructions.

llvm-svn: 178246
2013-03-28 16:25:57 +00:00
Howard Hinnant 24d7f8c342 Seciton 24.2.2 of the C++ standard, [iterator.iterators], Table 106
requires that the return type of *r for all iterators r be reference,
where reference is defined in [iterator.requirements.general]/p11 as
iterator_traits<X>::reference, and X is the type of r.

But in CFG.h, the dereference operator of PredIterator and SuccIterator
return pointer, not reference.

Furthermore the nested type reference is value_type&, which is not the
type returned from operator*().

This patch simply makes the iterator::reference type value_type*, which
is what the operator*() returns, and then re-lables the return type as
reference.

From a functionality point of view, the only difference is that the
nested reference type is now value_type* instead of value_type&.

llvm-svn: 178240
2013-03-28 15:47:50 +00:00
Tim Northover 08bb3ce383 AArch64: implement GICv3 system registers
llvm-svn: 178236
2013-03-28 14:30:46 +00:00
Hal Finkel a4d074863a Add the PPC64 popcntd instruction
PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and
tell TTI about it so that popcount-loop recognition will know about it.

llvm-svn: 178233
2013-03-28 13:29:47 +00:00
Kostya Serebryany 463aa81418 [tsan] make sure memset/memcpy/memmove are not inlined in tsan mode
llvm-svn: 178230
2013-03-28 11:21:13 +00:00
Michael Gottesman f23bb4e558 Revert "Updated ELF relocation test for .eh_frame section"
This reverts commit c8d65364223a04b179958a50a4bf0f89b21dd7d2.

This broke a bunch of the buildbots.

llvm-svn: 178222
2013-03-28 05:14:26 +00:00
Jyotsna Verma 8542abd762 Disable JIT/MCJIT tests in unittests/ExecutionEngine for the targets that don't support JIT.
llvm-svn: 178221
2013-03-28 03:38:29 +00:00
Hal Finkel 035b4825ce Cleanup PPC CR-spill kill flags and 32- vs. 64-bit instructions
There were a few places where kill flags were not being set correctly, and
where 32-bit instruction variants were being used with 64-bit registers. After
r178180, this code was being triggered causing llc to assert.

llvm-svn: 178220
2013-03-28 03:38:16 +00:00
Hal Finkel 25aab01058 Fix typo in PPCInstr64Bit
llvm-svn: 178219
2013-03-28 03:38:08 +00:00
David Blaikie 5692e72f30 Revert "Adding DIImportedModules to DIScopes."
This reverts commit 342d92c7a0adeabc9ab00f3f0d88d739fe7da4c7.

Turns out we're going with a different schema design to represent
DW_TAG_imported_modules so we won't need this extra field.

llvm-svn: 178215
2013-03-28 02:44:59 +00:00
Akira Hatanaka 99866dd535 Check if Type is a vector before calling function Type::getVectorNumElements.
llvm-svn: 178208
2013-03-28 01:28:02 +00:00
Preston Gurd d6be4bf87f This patch follows is a follow up to r178171, which uses the register
form of call in preference to memory indirect on Atom.

In this case, the patch applies the optimization to the code for reloading
spilled registers.

The patch also includes changes to sibcall.ll and movgs.ll, which were
failing on the Atom buildbot after the first patch was applied.

This patch by Sriram Murali.

llvm-svn: 178193
2013-03-27 23:16:18 +00:00
Jack Carter ccc266559f Updated ELF relocation test for .eh_frame section
Made sure we were looking a correct section
Added Mips32/64 as an extra check
Updated llvm-objdump to generate symbolic info for Mips relocations

llvm-svn: 178190
2013-03-27 22:58:49 +00:00
Chad Rosier 1530ba5e73 [ms-inline asm] Add support of imm displacement before bracketed memory
expression.  Specifically, this syntax:

 ImmDisp [ BaseReg + Scale*IndexReg + Disp ] 

We don't currently support:

 ImmDisp [ Symbol ]

rdar://13518671

llvm-svn: 178186
2013-03-27 21:49:56 +00:00
Hal Finkel 37714b8a48 Resynchronize isLoadFromStackSlot with LoadRegFromStackSlot (and stores) in PPCInstrInfo
These functions should have the same list of load/store instructions. Now that
all load/store forms have been normalized (to single instructions or pseudos)
they can be resynchronized.

Found by inspection, although hopefully this will improve optimization.  I've
also added some comments.

llvm-svn: 178180
2013-03-27 21:21:15 +00:00
Jack Carter d224bc4f25 test file name change to correct typo
llvm-svn: 178174
2013-03-27 20:07:48 +00:00
Preston Gurd 663e6f9558 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Hal Finkel 1996f3d87f Fix typo (common to both X86 and PPC)
Thanks to Bill Schmidt for pointing this out during code review!

llvm-svn: 178170
2013-03-27 19:10:42 +00:00
Hal Finkel 5791f51449 Remove more dead LR-as-GPR PPC code
I had removed similar code a few days ago, but somehow missed this.

llvm-svn: 178169
2013-03-27 19:10:40 +00:00
Dan Gohman 38a723830c Avoid undefined behavior from passing a std::vector's own contents
in as an argument to push_back.

llvm-svn: 178166
2013-03-27 18:44:56 +00:00
Hal Finkel f1af79ab45 Remove "gpr0 allocation" from the PPC README TODO list
As Chris pointed out, post r178123, this is now done!

llvm-svn: 178165
2013-03-27 18:39:52 +00:00
Chad Rosier 7ce810cb84 Don't try to generate crash diagnostics if we had an I/O failure. It's very
likely the crash diagnostics generation will fail as well.
Part of rdar://13296693

llvm-svn: 178163
2013-03-27 18:30:00 +00:00
Chad Rosier 583b4d3961 Add a boolean parameter to the llvm::report_fatal_error() function to indicated
if crash diagnostics should be generated.  By default this is enabled.
Part of rdar://13296693

llvm-svn: 178161
2013-03-27 18:27:54 +00:00
Bill Wendling fa2287825d Specutively revert r178130.
This may be causing a failure on some buildbots:

Referencing function in another module!
  tail call fastcc void @_ZL11EvaluateOpstPtRj(i16 zeroext %17, i16* %Vals, i32* %NumVals), !dbg !219
Referencing function in another module!
  tail call fastcc void @_ZL11EvaluateOpstPtRj(i16 zeroext %19, i16* %Vals, i32* %NumVals), !dbg !221
Broken module found, compilation aborted!
Stack dump:
0.    Running pass 'Function Pass Manager' on module 'ld-temp.o'.
1.    Running pass 'Module Verifier' on function '@_ZL11EvaluateOpstPtRj'
clang: error: unable to execute command: Illegal instruction: 4
clang: error: linker command failed due to signal (use -v to see invocation)

<rdar://problem/13516485>

llvm-svn: 178156
2013-03-27 17:54:41 +00:00
David Blaikie 986d7049cd Fix comment
llvm-svn: 178155
2013-03-27 17:50:12 +00:00
Rafael Espindola 0afce43be1 Cleanup the simplify_type implementation.
As far as simplify_type is concerned, there are 3 kinds of smart pointers:

* const correct: A 'const MyPtr<int> &' produces a 'const int*'. A
'MyPtr<int> &' produces a 'int *'.
* always const: Even a 'MyPtr<int> &' produces a 'const int*'.
* no const: Even a 'const MyPtr<int> &' produces a 'int*'.

This patch then does the following:

* Removes the unused specializations. Since they are unused, it is hard
to know which kind should be implemented.
* Make sure we don't drop const.
* Fix the default forwarding so that const correct pointer only need
one specialization.
* Simplifies the existing specializations.

llvm-svn: 178147
2013-03-27 16:43:11 +00:00
Christian Konig 08f5929942 R600/SI: add SETO/SETUO patterns
6 more piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178145
2013-03-27 15:27:31 +00:00
Benjamin Kramer b6bf15cba7 Silence warning about mixing || in &&, fix up 80-cols.
llvm-svn: 178144
2013-03-27 15:03:14 +00:00
Hal Finkel 687143557d Print PPC ZERO as 0 (not r0) even on Darwin
It seems that the Darwin PPC assembler requires r0 to be written as 0 when it
means 0 (at least in lwarx/stwcx.). Fixes PR15605.

llvm-svn: 178142
2013-03-27 13:20:52 +00:00
Tim Northover d3490dc06a Switch to LLVM support function abs64 to keep VS2008 happy.
llvm-svn: 178141
2013-03-27 13:15:08 +00:00
Evgeniy Stepanov 8dcc970760 Disable ASan/MSan symbolization of reports in tests.
It was using an instrumented symbolizer binary, which is a potential fork bomb.

llvm-svn: 178139
2013-03-27 13:11:12 +00:00
Hal Finkel 35dd5c5932 Fix target-customized spilling in the register scavenger
This is a follow-up to r178073 (which should actually make target-customized
spilling work again).

I still don't have a regression test for this (but it would be good to have
one; Thumb 1 and Mips16 use this callback as well).

Patch by Richard Sandiford.

llvm-svn: 178137
2013-03-27 13:00:56 +00:00
Evgeniy Stepanov 11820f4bf0 Disable Initialize.MultipleThreads test under MemorySanitizer.
Fails due to insufficient thread stack.

llvm-svn: 178135
2013-03-27 12:50:49 +00:00
Silviu Baranga dc45336d09 Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32.
llvm-svn: 178134
2013-03-27 12:38:44 +00:00
Jyotsna Verma 653d8839c8 Hexagon: Disable optimizations at O0.
llvm-svn: 178132
2013-03-27 11:14:24 +00:00
James Molloy ec25de421c Improve performance of LinkModules when linking with modules with large numbers of functions which link lazily. Instead of creating and destroying function prototypes irrespective of if they are used, only create them if they are used.
llvm-svn: 178130
2013-03-27 10:23:32 +00:00
Christian Konig 3c14580acb R600/SI: add cummuting of rev instructions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178127
2013-03-27 09:12:59 +00:00
Christian Konig 70a5032c1b R600/SI: add mulhu/mulhs patterns
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178126
2013-03-27 09:12:51 +00:00
Christian Konig 20a7e6b764 R600/SI: add srl/sha patterns for SI
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178125
2013-03-27 09:12:44 +00:00
Hal Finkel 0f77861d9f Allocate r0 on PPC
The R0 register can now be allocated because instructions
that cannot use R0 as a GPR have been appropriately marked.

llvm-svn: 178123
2013-03-27 06:52:27 +00:00
Hal Finkel 573fc28d64 Use the PPC no-r0 class on the TOC LD pseudos
The register parameter in these instructions becomes the base register in an
r+i ld instruction (and, thus, cannot be r0).

This is not yet testable because we don't yet allocate r0 (and even then any
test would be very fragile).

llvm-svn: 178121
2013-03-27 06:36:55 +00:00
Hal Finkel 3fa362a51a Apply the no-r0 register class to the PPC SELECT_CC_I[4|8] pseudos
Either operand of these pseudo instructions can be transformed into the first
operand of an isel instruction (and this operand cannot be r0).

This is not yet testable because we don't yet allocate r0 (and even when we do,
any test would be very fragile).

llvm-svn: 178119
2013-03-27 05:57:58 +00:00
Hal Finkel 42a312b261 Apply the no-r0 class to PPC TOC ADDI[S] pseudo instructions
Like the addi/addis instructions themselves, these pseudo instructions also
cannot have r0 as their register parameter (because it will be interpreted as
the value 0).

This is not yet testable because we don't yet allocate r0 (and even when we do,
any regression test would be very fragile because it would depend on the
register allocator heuristics).

llvm-svn: 178118
2013-03-27 05:57:56 +00:00
Bill Schmidt a1b72d0f6a Remove the link register from the GPR classes on PowerPC.
Some implementation detail in the forgotten past required the link
register to be placed in the GPRC and G8RC register classes.  This is
just wrong on the face of it, and causes several extra intersection
register classes to be generated.  I found this was having evil
effects on instruction scheduling, by causing the wrong register class
to be consulted for register pressure decisions.

No code generation changes are expected, other than some minor changes
in instruction order.  Seven tests in the test bucket required minor
tweaks to adjust to the new normal.

llvm-svn: 178114
2013-03-27 02:40:14 +00:00
Michael Gottesman 46ebe53ead Added back in the test for arc-annotations.
The test was removed since I had not turned off the test during release
builds. This fails since ARC annotations support  is conditionally
compiled out during release builds. I added the proper requires header
to assuage this issue.

llvm-svn: 178101
2013-03-27 00:09:58 +00:00
David Blaikie a26d70358f Adding DIImportedModules to DIScopes.
This is just the basic groundwork for supporting DW_TAG_imported_module but I
wanted to commit this before pushing support further into Clang or LLVM so that
this rather churny change is isolated from the rest of the work. The major
churn here is obviously adding another field (within the common DIScope prefix)
to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should
be the last big churny change needed for DW_TAG_imported_module/using directive
support/PR14606.

llvm-svn: 178099
2013-03-27 00:07:26 +00:00
Hal Finkel a7b0630ba8 Don't spill PPC VRSAVE on non-Darwin (even in SjLj)
As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore
VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've
added some asserts to make sure that we're not).

As it turns out, we're not currently handling the Darwin case correctly (I've
added a FIXME in the test case). I've tried adding various implied register
definitions/uses to force the spill without success, so I'll need to address
this later.

llvm-svn: 178096
2013-03-27 00:02:20 +00:00
David Blaikie a7310a3d1b Make DIBuilder::createClassType more type safe by returning DICompositeType rather than DIType
llvm-svn: 178091
2013-03-26 23:46:39 +00:00
David Blaikie fe6db31ce8 DebugInfo: more support for mutating DICompositeType to reduce magic number usage in Clang
llvm-svn: 178090
2013-03-26 23:46:36 +00:00
Chad Rosier 654190a12d Add a boolean parameter to the ExecuteAndWait static function to indicated
if execution failed.  ExecuteAndWait returns -1 upon an execution failure, but
checking the return value isn't sufficient because the wait command may
return -1 as well.  This new parameter is to be used by the clang driver in a
subsequent commit.
Part of rdar://13362359

llvm-svn: 178087
2013-03-26 23:35:00 +00:00
Bill Wendling 5aa82397e8 Use the full path when outputting the `.gcda' file.
If we compile a single source program, the `.gcda' file will be generated where
the program was executed. This isn't desirable, because that place may be at an
unpredictable place (the program could call `chdir' for instance).

Instead, we will output the `.gcda' file in the same place we output the `.gcno'
file. I.e., the directory where the executable was generated. This matches GCC's
behavior.

<rdar://problem/13061072> & PR11809

llvm-svn: 178084
2013-03-26 22:47:50 +00:00
Michael Liao 03f9ad0e67 Add XTEST codegen support
llvm-svn: 178083
2013-03-26 22:47:01 +00:00
Michael Liao e344ec919f Add HLE target feature
llvm-svn: 178082
2013-03-26 22:46:02 +00:00
Jakob Stoklund Olesen 1ac7e662d4 Enable SandyBridgeModel for all modern Intel P6 descendants.
All Intel CPUs since Yonah look a lot alike, at least at the granularity
of the scheduling models. We can add more accurate models for
processors that aren't Sandy Bridge if required. Haswell will probably
need its own.

The Atom processor and anything based on NetBurst is completely
different. So are the non-Intel chips.

llvm-svn: 178080
2013-03-26 22:19:12 +00:00
David Blaikie d03326b75a Debug Info: Provide a means to update the members of a composite type
This will be used to factor out some uses of magic number operand offsets
inside Clang where these fields were updated in an effort to resolve forward
declarations/circular references.

llvm-svn: 178078
2013-03-26 21:59:17 +00:00
Hal Finkel 567fa62ddc Restore real bit lengths on PPC register numbers
As suggested by Bill Schmidt (in reviewing r178067), use the real register
number bit lengths (which is self-documenting, and prevents using illegal
numbers), and set only the relevant bits in HWEncoding (which defaults to 0).

No functionality change intended.

llvm-svn: 178077
2013-03-26 21:50:26 +00:00
Andrew Trick e97978f94c TableGen SubtargetEmitter fix to allow A9 and Swift to coexist.
Allow variants to be defined only for some processors on a target.

llvm-svn: 178074
2013-03-26 21:36:39 +00:00
Hal Finkel 1fa2f945ea Fix the register scavenger for targets that provide custom spilling
As pointed out by Richard Sandiford, my recent updates to the register
scavenger broke targets that use custom spilling (because the new code assumed
that if there were no valid spill slots, than spilling would be impossible).

I don't have a test case, but it should be possible to create one for Thumb 1,
Mips 16, etc.

llvm-svn: 178073
2013-03-26 21:20:15 +00:00
Hal Finkel feea653974 PPC: Use HWEncoding and TRI->getEncodingValue
As pointed out by Jakob, we don't need to maintain a separate
register-numbering table. Instead we should let TableGen generate the table for
us from the information (already present) in PPCRegisterInfo.td.
TRI->getEncodingValue is now used to access register-encoding values.

No functionality change intended.

llvm-svn: 178067
2013-03-26 20:08:20 +00:00
NAKAMURA Takumi 3234178bf9 R600/SIMCCodeEmitter.cpp: Prune a couple of unused members, STI and Ctx. [-Wunused-private-field]
llvm-svn: 178065
2013-03-26 19:42:48 +00:00
Hal Finkel 0dfbb05aff Use multiple virtual registers in PPC CR spilling
Now that the register scavenger can support multiple spill slots, and PEI can
use virtual-register-based scavenging for multiple simultaneous registers, we
can use a virtual register for the transfer register in the CR spilling code.

This should eliminate the last place (outside of the prologue/epilogue) where
we depend on the unconditional availability of the r0 register. We will soon be
able to allocate it (in a somewhat restricted sense) as a GPR.

llvm-svn: 178060
2013-03-26 18:57:22 +00:00
Hal Finkel d8a423cd71 Update PPCRegisterInfo's use of virtual registers to be SSA
PPC's use of PEI's virtual-register-based scavenging functionality had
redefined the virtual registers (it was non-SSA). Now that PEI supports
dealing with instructions with multiple virtual registers, this can be
cleanup up to use multiple virtual registers and keep SSA form.

No functionality change intended.

llvm-svn: 178059
2013-03-26 18:57:20 +00:00
Hal Finkel 4e05788cc3 Update PEI's virtual-register-based scavenging to support multiple simultaneous mappings
The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.

In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.

These new features will be tested in forthcoming commits.

llvm-svn: 178058
2013-03-26 18:56:54 +00:00
Jakob Stoklund Olesen e440d476ee Annotate the remaining x86 instructions with SchedRW lists.
Now all x86 instructions that have itinerary classes also have SchedRW
lists. This is required before the new scheduling models can be used.

There are still unannotated instructions remaining, but they don't have
itinerary classes either.

llvm-svn: 178051
2013-03-26 18:24:22 +00:00
Jakob Stoklund Olesen 267dd946f6 Annotate x87 and mmx instructions with SchedRW lists.
This only covers the instructions that were given itinerary classes for
the Atom model.

llvm-svn: 178050
2013-03-26 18:24:20 +00:00
Jakob Stoklund Olesen d59419eb67 Annotate control instructions with SchedRW lists.
This could definitely be more granular. I am not sure if it makes a
difference.

llvm-svn: 178049
2013-03-26 18:24:17 +00:00
Jakob Stoklund Olesen 7c8a760d28 Annotate the rest of X86InstrInfo.td with SchedRW lists.
llvm-svn: 178048
2013-03-26 18:24:15 +00:00
Michael Liao 4a44e556cc Fix PRFCHW test on non-x86 builds
- 'prefetch' intrinsics are only lowered when SSE is available. On non-X86
  builds, 'generic' CPU is used and stops lowering any prefetch intrinsics.

llvm-svn: 178046
2013-03-26 18:15:45 +00:00
Arnold Schwaighofer aadf10435a BasicAA: Only query twice if the result of the more general query was MayAlias
This is a compile time optimization. Before the patch we would do two traversals
on each call to aliasGEP - one with a set size parameter one with UnknownSize.
We can do better by first checking the result of the alias query with
UnknownSize.
Only if this one returns MayAlias do we query a second time using size and type.

This recovers an about 7% compile time regression on spec/ammp.

radar://12349960

llvm-svn: 178045
2013-03-26 18:07:53 +00:00
Michael Liao 5173ee03af Add PREFETCHW codegen support
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
2013-03-26 17:47:11 +00:00
Ulrich Weigand b1e02b2af2 Add test case for commit r178031.
llvm-svn: 178038
2013-03-26 17:30:02 +00:00
Jyotsna Verma 15957b129f Hexagon: Use multiclass for aslh, asrh, sxtb, sxth, zxtb and zxth.
llvm-svn: 178032
2013-03-26 15:43:57 +00:00
Ulrich Weigand 8a51d8ea95 Make InstCombineCasts.cpp:OptimizeIntToFloatBitCast endian safe.
The OptimizeIntToFloatBitCast converts shift-truncate sequences
into extractelement operations.  The computation of the element
index to be used in the resulting operation is currently only
correct for little-endian targets.

This commit fixes the element index computation to be correct
for big-endian targets as well.  If the target byte order is
unknown, the optimization cannot be performed at all.

llvm-svn: 178031
2013-03-26 15:36:14 +00:00
Jyotsna Verma f299668aeb Hexagon: Remove HexagonMCInst.h file. It has been replaced with MCTargetDesc/HexagonMCInst.h.
llvm-svn: 178030
2013-03-26 15:34:22 +00:00