Commit Graph

19024 Commits

Author SHA1 Message Date
Rafael Espindola b080267bff Loosen this test.
Looks like there is a big endian/little endian problem here. Loosen the
test to try to get the bots green while llvm builds on a ppc qemu vm.

The failure was in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/

llvm-svn: 178839
2013-04-05 04:31:09 +00:00
Rafael Espindola 599e810ae3 Add a test for obj2yaml in preparation for refactoring it.
llvm-svn: 178829
2013-04-05 02:02:05 +00:00
Andrew Trick 80e66ce0b4 RegisterPressure heuristics currently require signed comparisons.
llvm-svn: 178823
2013-04-05 00:31:34 +00:00
Arnold Schwaighofer df6f67ed87 LoopVectorizer: Pass OperandValueKind information to the cost model
Pass down the fact that an operand is going to be a vector of constants.

This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86
back. It had degraded to scalar performance due to my pervious shift cost change
that made all shifts expensive on x86.

radar://13576547

llvm-svn: 178809
2013-04-04 23:26:27 +00:00
Arnold Schwaighofer 44f902ed7d X86 cost model: Differentiate cost for vector shifts of constants
SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account marking all shifts as expensive.
This would prevent vectorization from happening where it is actually beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808
2013-04-04 23:26:24 +00:00
Hal Finkel f96c18e3bc PPC: Improve code generation for mixed-precision reciprocal sqrt
The DAGCombine logic that recognized a/sqrt(b) and transformed it into
a multiplication by the reciprocal sqrt did not handle cases where the
sqrt and the division were separated by an fpext or fptrunc.

llvm-svn: 178801
2013-04-04 22:44:12 +00:00
Jyotsna Verma bc03a9792a Disable 2010-10-01-crash.ll for Hexagon as the Hexagon frontend will
never produce a byval parameter with size < 8 bytes.

llvm-svn: 178792
2013-04-04 21:05:46 +00:00
Rafael Espindola 7733466c73 Add back parsing of header charactestics.
It had been dropped during the switch to yaml::IO. Also add a test going
from yaml2obj to llvm-readobj. It can be extended as we add more
fields/formats to yaml2obj.

llvm-svn: 178786
2013-04-04 20:30:52 +00:00
Richard Osborne 0c12d1851e [XCore] Add bru instruction.
llvm-svn: 178783
2013-04-04 20:05:35 +00:00
Richard Osborne f18d95f756 [XCore] The RRegs register class is a superset of GRRegs.
At the time when the XCore backend was added there were some issues with
with overlapping register classes but these all seem to be fixed now.
Describing the register classes correctly allow us to get rid of a
codegen only instruction (LDAWSP_lru6_RRegs) and it means we can
disassemble ru6 instructions that use registers above r11.

llvm-svn: 178782
2013-04-04 19:57:46 +00:00
Jakob Stoklund Olesen 299475e0c6 Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773
2013-04-04 18:25:36 +00:00
Stepan Dyatkovskiy e58df62ea4 New-password-test commit.
llvm-svn: 178765
2013-04-04 16:11:18 +00:00
Vincent Lejeune c44fa99719 R600: Take export into account when computing cf address
llvm-svn: 178761
2013-04-04 13:59:59 +00:00
Alexey Samsonov e2c772a1b0 Propagate path to ASan/MSan symbolizer into test environment to produce useful reports on errors.
llvm-svn: 178749
2013-04-04 07:41:00 +00:00
Jakob Stoklund Olesen 8cfaffaade Add SPARC v9 support for select on 64-bit compares.
This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737
2013-04-04 03:08:00 +00:00
Arnold Schwaighofer e9b5016411 X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Rafael Espindola 2025e8b820 Implement the "mips endian" for r_info.
Normally r_info is just a 32 of 64 bit number matching the endian of the rest
of the file. Unfortunately, mips 64 bit little endian is special: The top 32
bits are a little endian number and the following 32 are a big endian one.

llvm-svn: 178694
2013-04-03 21:02:51 +00:00
Richard Osborne 122acb216c [XCore] Check disassembly of the st8 instruction.
llvm-svn: 178689
2013-04-03 20:07:11 +00:00
Richard Osborne fb0b4ea3a7 [XCore] Update disassembler test to improve coverage of the instructions.
Previously some instructions were unintentionally covered twice and
others were not covered at all.

llvm-svn: 178688
2013-04-03 20:07:06 +00:00
Eric Christopher 9cad53cfec Implements low-level object file format specific output for COFF and
ELF with support for:

- File headers
- Section headers + data
- Relocations
- Symbols
- Unwind data (only COFF/Win64)

The output format follows a few rules:
- Values are almost always output one per line (as elf-dump/coff-dump already do). - Many values are translated to something readable (like enum names), with the raw value in parentheses.
- Hex numbers are output in uppercase, prefixed with "0x".
- Flags are sorted alphabetically.
- Lists and groups are always delimited.

Example output:
---------- snip ----------
Sections [
  Section {
    Index: 1
    Name: .text (5)
    Type: SHT_PROGBITS (0x1)
    Flags [ (0x6)
      SHF_ALLOC (0x2)
      SHF_EXECINSTR (0x4)
    ]
    Address: 0x0
    Offset: 0x40
    Size: 33
    Link: 0
    Info: 0
    AddressAlignment: 16
    EntrySize: 0
    Relocations [
      0x6 R_386_32 .rodata.str1.1 0x0
      0xB R_386_PC32 puts 0x0
      0x12 R_386_32 .rodata.str1.1 0x0
      0x17 R_386_PC32 puts 0x0
    ]
    SectionData (
      0000: 83EC04C7 04240000 0000E8FC FFFFFFC7  |.....$..........|
      0010: 04240600 0000E8FC FFFFFF31 C083C404  |.$.........1....|
      0020: C3                                   |.|
    )
  }
]
---------- snip ----------

Relocations and symbols can be output standalone or together with the section header as displayed in the example.
This feature set supports all tests in test/MC/COFF and test/MC/ELF (and I suspect all additional tests using elf-dump), making elf-dump and coff-dump deprecated.

Patch by Nico Rieck!

llvm-svn: 178679
2013-04-03 18:31:38 +00:00
Eric Christopher 8d67ab4f70 Implement sectionContainsSymbol for ELF.
Patch by Nico Rieck!

llvm-svn: 178677
2013-04-03 18:31:19 +00:00
Eric Christopher d5972ea8fc When dumping clear the arm/thumb flag for now.
Patch by Nico Rieck!

llvm-svn: 178676
2013-04-03 18:31:12 +00:00
Vincent Lejeune c3d3f9b66e R600: Fix last ALU of a clause being emitted in a separate clause
llvm-svn: 178675
2013-04-03 18:24:47 +00:00
Bill Schmidt 92e26646bc Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Tim Northover 5816ca117b AArch64: implement ETMv4 trace system registers.
llvm-svn: 178637
2013-04-03 12:31:29 +00:00
Timur Iskhodzhanov 7205c72d84 Temporarily relax the WIN32 checks in the SRet test to fix the Atom D2700 bot
llvm-svn: 178635
2013-04-03 12:17:15 +00:00
Timur Iskhodzhanov f4e0665e56 Fix SRet for thiscall in i686-pc-win32
llvm-svn: 178634
2013-04-03 11:27:54 +00:00
Jakob Stoklund Olesen d9bbdfd3cc Add 64-bit compare + branch for SPARC v9.
The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621
2013-04-03 04:41:44 +00:00
Hal Finkel 2e10331057 Use PPC reciprocal estimates with Newton iteration in fast-math mode
When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of missing fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617
2013-04-03 04:01:11 +00:00
Rafael Espindola b9b7ae0c78 Fix the fde encoding used by mips to match gas.
This finally fixes the encoding. The patch also
* Removes eh-frame.ll. It was an unnecessary .ll to .o test that was checking
  the wrong value.
* Merge fde-reloc.s and eh-frame.s into a single test, since the only difference
  was the run lines.
* Don't blindly test the content of the entire .eh_frame section. It makes it
  hard to anyone actually fixing a bug and hitting a difference in a binary
  blob. Instead, use a CHECK for each field and document what is being checked.

llvm-svn: 178615
2013-04-03 03:13:19 +00:00
Michael Gottesman b8c8836594 Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue.
The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.

Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.

*Even though objc_autoreleaseReturnValue if you know that no RV
optimization will occur is more computationally expensive.

llvm-svn: 178612
2013-04-03 02:57:24 +00:00
Akira Hatanaka 023c678a0d [mips] Small update to the implementation of eh.return for Mips.
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where handler to which eh.return jumps 
points to the start of the function.

Patch by Sasa Stankovic.

llvm-svn: 178588
2013-04-02 23:02:07 +00:00
Eric Christopher 6476f908b3 Support and test template arguments for unions.
llvm-svn: 178586
2013-04-02 22:55:56 +00:00
NAKAMURA Takumi fc613f4d61 llvm/test/CodeGen/X86: Unmark them out of XFAIL:cygming, in atomic{32|64}.ll and handle-move.ll, corresponding to r178549.
This reverts r176808, r176798, and r177914.

llvm-svn: 178583
2013-04-02 22:35:08 +00:00
Bill Schmidt 3581cd4b4c Fix PR15630: Replace faulty stdcx. with stwcx.
When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559
2013-04-02 18:37:08 +00:00
Jakob Stoklund Olesen 8fbfc59164 Don't attempt MTM heuristics without a scheduling model present.
This should fix the PPC buildbots.

llvm-svn: 178558
2013-04-02 18:26:45 +00:00
Chad Rosier 7925d280ff [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
llvm-svn: 178549
2013-04-02 16:31:41 +00:00
Arnold Schwaighofer d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Preston Gurd 95cbee6ce4 Simplify test cases for Atom preferring call register indirect over
call memory indirect (32 and 64 bit).

llvm-svn: 178541
2013-04-02 14:25:06 +00:00
Bill Wendling 88d06c3b2d Use a worklist to avoid a sneaky iterator invalidation.
The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.

Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.

PR15440

llvm-svn: 178531
2013-04-02 08:16:45 +00:00
Jakob Stoklund Olesen 8eabc3ffde Add 64-bit load and store instructions.
There is only a few new instructions, the rest is handled with patterns.

llvm-svn: 178528
2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen 917e07f095 Basic 64-bit ALU operations.
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527
2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen bddb20eeef Materialize 64-bit immediates.
The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526
2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen c1d1a4816e Add 64-bit shift instructions.
SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525
2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen 0b21f35aca Add support for 64-bit calling convention.
This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523
2013-04-02 04:09:02 +00:00
Jack Carter 9423f507b1 Mips direct object exception handling regression
Revision 177141 caused a regression in all but
mips64 little endian. That is because none of the
other Mips targets had test cases checking the 
contents of the .eh_frame section. This patch fixes
both the llvm code and adds an assembler test case 
to include the current 4 flavors.

The test cases unfortunately rely on llvm-objdump. A
preferable method would be to use a pretty printer output
such as what readelf -wf <elf_file> would give.

I also changed the name of the test case to correct a typo.

llvm-svn: 178506
2013-04-01 21:55:15 +00:00
Vincent Lejeune bfaa63a6db R600: Add support for native control flow
llvm-svn: 178505
2013-04-01 21:48:05 +00:00
Vincent Lejeune f43bc57b66 R600: Emit CF_ALU and use true kcache register.
llvm-svn: 178503
2013-04-01 21:47:42 +00:00
Hal Finkel 3f88d08974 Fix a bad assert in PPCTargetLowering
llvm-svn: 178489
2013-04-01 18:42:58 +00:00
Hal Finkel c2eddb0d02 Add triple to test/CodeGen/PowerPC/stfiwx-2
llvm-svn: 178486
2013-04-01 18:18:44 +00:00
Shuxin Yang 6662fd0f15 Correct assertion condition
llvm-svn: 178484
2013-04-01 18:13:05 +00:00
Arnold Schwaighofer 6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Hal Finkel f6d45f2379 Add more PPC floating-point conversion instructions
The P7 and A2 have additional floating-point conversion instructions which
allow a direct two-instruction sequence (plus load/store) to convert from all
combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores,
only some combinations were directly available).

llvm-svn: 178480
2013-04-01 17:52:07 +00:00
Hal Finkel c006295f3c Fix PowerPC/cttz.ll to specify a cpu (and use FileCheck)
llvm-svn: 178472
2013-04-01 16:31:56 +00:00
Hal Finkel 290376dd78 Add the PPC popcntw instruction
The popcntw instruction is available whenever the popcntd instruction is
available, and performs a separate popcnt on the lower and upper 32-bits.
Ignoring the high-order count, this can be used for the 32-bit input case
(saving on the explicit zero extension otherwise required to use popcntd).

llvm-svn: 178470
2013-04-01 15:58:15 +00:00
Nadav Rotem be79a7ac7a Add support for vector data types in the LLVM interpreter.
Patch by:
Veselov, Yuri <Yuri.Veselov@intel.com>

llvm-svn: 178469
2013-04-01 15:53:30 +00:00
Benjamin Kramer 52ceb44331 X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts.
llvm-svn: 178459
2013-04-01 10:23:49 +00:00
Benjamin Kramer b60633fb87 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

llvm-svn: 178448
2013-03-31 12:49:15 +00:00
Hal Finkel beb296bea1 Add the PPC lfiwax instruction
This instruction is available on modern PPC64 CPUs, and is now used
to improve the SINT_TO_FP lowering (by eliminating the need for the
separate sign extension instruction and decreasing the amount of
needed stack space).

llvm-svn: 178446
2013-03-31 10:12:51 +00:00
Hal Finkel e53429a13e Cleanup PPC(64) i32 -> float/double conversion
The existing SINT_TO_FP code for i32 -> float/double conversion was disabled
because it relied on broken EXTSW_32/STD_32 instruction definitions. The
original intent had been to enable these 64-bit instructions to be used on CPUs
that support them even in 32-bit mode.  Unfortunately, this form of lying to
the infrastructure was buggy (as explained in the FIXME comment) and had
therefore been disabled.

This re-enables this functionality, using regular DAG nodes, but only when
compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead)
are removed.

llvm-svn: 178438
2013-03-31 01:58:02 +00:00
Benjamin Kramer 9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Benjamin Kramer 9c9e0a2c04 Change '@SECREL' suffix to GAS-compatible '@SECREL32'.
'@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'.
With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here).

Patch by David Nadlinger!
Differential Revision: http://llvm-reviews.chandlerc.com/D429

llvm-svn: 178427
2013-03-30 16:21:50 +00:00
Justin Holewinski 59fd8ba5f5 [NVPTX] Remove support for SM < 2.0. This was never fully supported anyway.
llvm-svn: 178417
2013-03-30 14:29:30 +00:00
Justin Holewinski b94bd05b95 [NVPTX] Add NVVMReflect pass to allow compile-time selection of
specific code paths.

This allows us to write code like:

  if (__nvvm_reflect("FOO"))
    // Do something
  else
    // Do something else

and compile into a library, then give "FOO" a value at kernel
compile-time so the check becomes a no-op.

llvm-svn: 178416
2013-03-30 14:29:25 +00:00
Shuxin Yang 7b0c94e207 Implement XOR reassociation. It is based on following rules:
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviwed by Pete Cooper. Thanks a lot!

 rdar://13212115  

llvm-svn: 178409
2013-03-30 02:15:01 +00:00
Akira Hatanaka b3c1847b30 [mips] Add patterns for DSP indexed load instructions.
llvm-svn: 178408
2013-03-30 02:14:45 +00:00
Akira Hatanaka fb221c197d [mips] Fix DSP instructions to have explicit accumulator register operands.
Check that instruction selection can select multiply-add/sub DSP instructions
from a pattern that doesn't have intrinsics.

llvm-svn: 178406
2013-03-30 01:58:00 +00:00
Akira Hatanaka 9efcd76c2c [mips] Move the code which does dag-combine for multiply-add/sub nodes to
derived class MipsSETargetLowering.

We shouldn't be generating madd/msub nodes if target is Mips16, since Mips16
doesn't have support for multipy-add/sub instructions.

llvm-svn: 178404
2013-03-30 01:42:24 +00:00
Michael Gottesman 9412830090 Updated test0 of retain-not-declared.ll to reflect the fact that objc-arc-expand runs before objc-arc/objc-arc-contract.
Specifically, objc-arc-expand will make sure that the
objc_retainAutoreleasedReturnValue, objc_autoreleaseReturnValue, and ret
will all have %call as an argument.

llvm-svn: 178382
2013-03-29 22:44:59 +00:00
Timur Iskhodzhanov 64a5cf5617 Exclude the X86/complex-fca.ll test at it probably wasn't supposed to work on Windows
llvm-svn: 178375
2013-03-29 21:54:00 +00:00
Michael Gottesman 3b8f877860 Add clang.arc.used to ModuleHasARC so ARC always runs if said call is present in a module.
clang.arc.used is an interesting call for ARC since ObjCARCContract
needs to run to remove said intrinsic to avoid a linker error (since the
call does not exist).

llvm-svn: 178369
2013-03-29 21:15:23 +00:00
Adrian Prantl 30cf851ef2 move testcase into appropriate X86 subdirectory.
llvm-svn: 178364
2013-03-29 20:14:08 +00:00
Hal Finkel f8ac57e289 Implement FRINT lowering on PPC using frin
Like nearbyint, rint can be implemented on PPC using the frin instruction. The
complication comes from the fact that rint needs to set the FE_INEXACT flag
when the result does not equal the input value (and frin does not do that). As
a result, we use a custom inserter which, after the rounding, compares the
rounded value with the original, and if they differ, explicitly sets the XX bit
in the FPSCR register (which corresponds to FE_INEXACT).

Once LLVM has better modeling of the floating-point environment we should be
able to (often) eliminate this extra complexity.

llvm-svn: 178362
2013-03-29 19:41:55 +00:00
Adrian Prantl 4b7cf64f66 Split the llvm/tools/clang/test/CodeGenObjC/debug-info-blocks.m testcase into a CFE and LLVM part.
rdar://problem/12767564

llvm-svn: 178353
2013-03-29 18:08:14 +00:00
Benjamin Kramer 70671b9937 Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Hal Finkel c20a08d25b Add PPC FP rounding instructions fri[mnpz]
These instructions are available on the P5x (and later) and on the A2. They
implement the standard floating-point rounding operations (floor, trunc, etc.).
One caveat: frin (round to nearest) does not implement "ties to even", and so
is only enabled in fast-math mode.

llvm-svn: 178337
2013-03-29 08:57:48 +00:00
Jack Carter 311246c6d5 [Mips Assembler] Add support for OR macro with imediate opperand
Mips assembler supports macros that allows the OR instruction 
to have an immediate parameter. This patch adds an instruction 
alias that converts this macro into a Mips ORI instruction. 

Contributer: Vladimir Medic
llvm-svn: 178316
2013-03-28 23:45:13 +00:00
Michael Liao a486a11dcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao 5fff5c7b26 Enhance boolean simplification to handle 16-/64-bit RDRAND
- RDRAND always clears the destination value when a random value is not
  available (i.e. CF == 0). This value is truncated or zero-extended as
  the false boolean value to be returned. Boolean simplification needs
  to skip this 'zext' or 'trunc' node.

llvm-svn: 178312
2013-03-28 23:38:52 +00:00
Jack Carter e1d85d55e6 [Mips Assembler] Add alias definitions for jal
Mips assembler allows following to be used as aliased instructions:
jal $rs for jalr $rs
jal $rd,$rd for jalr $rd,$rs

This patch provides alias definitions in td files and test cases to show the usage.

Contributer: Vladimir Medic
llvm-svn: 178304
2013-03-28 23:02:21 +00:00
Timur Iskhodzhanov a2fd5fdd7a Make Win32 put the SRet address into EAX, fixes PR15556
llvm-svn: 178291
2013-03-28 21:30:04 +00:00
Hal Finkel fd53ddd71c Specify CPUs on the PPC bswap-load-store test
Otherwise, the CHECK-NOT's might trigger depending on the host's CPU.

llvm-svn: 178287
2013-03-28 20:35:18 +00:00
Hal Finkel 22e41c411e Only enable 64-bit bswap DAG combines for PPC64
Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were
added for bswap with load/store. This is because these combines are really only
valid in 64-bit mode, regardless of the CPU (and this was not being checked).

llvm-svn: 178286
2013-03-28 20:23:46 +00:00
Michael Gottesman 49f9885a2a Non optimizable objc_retainBlock calls are not forwarding.
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.

<rdar://problem/13249661>.

llvm-svn: 178285
2013-03-28 20:11:30 +00:00
Michael Gottesman 158fdf699e [ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the objc_retainBlock is optimizable.
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.

Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.

This patch simplifies said code by:

1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).

2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).

This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.

<rdar://problem/13249661>.

llvm-svn: 178284
2013-03-28 20:11:19 +00:00
Jyotsna Verma 27c06f3322 Hexagon: Enable SupportDebugInfomation and DwarfInSection flags.
llvm-svn: 178279
2013-03-28 19:34:49 +00:00
Akira Hatanaka 19468cafad Remove -O3.
llvm-svn: 178278
2013-03-28 19:34:14 +00:00
Hal Finkel 31d2956510 Add the PPC64 ldbrx/stdbrx instructions
These are 64-bit load/store with byte-swap, and available on the P7 and the A2.
Like the similar instructions for 16- and 32-bit words, these are matched in the
target DAG-combine phase against load/store-bswap pairs.

llvm-svn: 178276
2013-03-28 19:25:55 +00:00
Gordon Keiser 772cf466da Fix issue with disassembler decoding CBZ/CBNZ immediates as negatives when the upper bit is set.
They should always be zero-extended, not sign extended.  Added test case.

llvm-svn: 178275
2013-03-28 19:22:28 +00:00
Rafael Espindola ddbf4f47dc Move test since it depends on the X86 backend.
llvm-svn: 178249
2013-03-28 17:01:28 +00:00
Jyotsna Verma 93e740485f Hexagon: Use multiclass for gp-relative instructions.
Remove noV4T gp-relative instructions.

llvm-svn: 178246
2013-03-28 16:25:57 +00:00
Tim Northover 08bb3ce383 AArch64: implement GICv3 system registers
llvm-svn: 178236
2013-03-28 14:30:46 +00:00
Hal Finkel a4d074863a Add the PPC64 popcntd instruction
PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and
tell TTI about it so that popcount-loop recognition will know about it.

llvm-svn: 178233
2013-03-28 13:29:47 +00:00
Kostya Serebryany 463aa81418 [tsan] make sure memset/memcpy/memmove are not inlined in tsan mode
llvm-svn: 178230
2013-03-28 11:21:13 +00:00
Michael Gottesman f23bb4e558 Revert "Updated ELF relocation test for .eh_frame section"
This reverts commit c8d65364223a04b179958a50a4bf0f89b21dd7d2.

This broke a bunch of the buildbots.

llvm-svn: 178222
2013-03-28 05:14:26 +00:00
Hal Finkel 035b4825ce Cleanup PPC CR-spill kill flags and 32- vs. 64-bit instructions
There were a few places where kill flags were not being set correctly, and
where 32-bit instruction variants were being used with 64-bit registers. After
r178180, this code was being triggered causing llc to assert.

llvm-svn: 178220
2013-03-28 03:38:16 +00:00
David Blaikie 5692e72f30 Revert "Adding DIImportedModules to DIScopes."
This reverts commit 342d92c7a0adeabc9ab00f3f0d88d739fe7da4c7.

Turns out we're going with a different schema design to represent
DW_TAG_imported_modules so we won't need this extra field.

llvm-svn: 178215
2013-03-28 02:44:59 +00:00
Akira Hatanaka 99866dd535 Check if Type is a vector before calling function Type::getVectorNumElements.
llvm-svn: 178208
2013-03-28 01:28:02 +00:00
Preston Gurd d6be4bf87f This patch follows is a follow up to r178171, which uses the register
form of call in preference to memory indirect on Atom.

In this case, the patch applies the optimization to the code for reloading
spilled registers.

The patch also includes changes to sibcall.ll and movgs.ll, which were
failing on the Atom buildbot after the first patch was applied.

This patch by Sriram Murali.

llvm-svn: 178193
2013-03-27 23:16:18 +00:00
Jack Carter ccc266559f Updated ELF relocation test for .eh_frame section
Made sure we were looking a correct section
Added Mips32/64 as an extra check
Updated llvm-objdump to generate symbolic info for Mips relocations

llvm-svn: 178190
2013-03-27 22:58:49 +00:00
Chad Rosier 1530ba5e73 [ms-inline asm] Add support of imm displacement before bracketed memory
expression.  Specifically, this syntax:

 ImmDisp [ BaseReg + Scale*IndexReg + Disp ] 

We don't currently support:

 ImmDisp [ Symbol ]

rdar://13518671

llvm-svn: 178186
2013-03-27 21:49:56 +00:00
Jack Carter d224bc4f25 test file name change to correct typo
llvm-svn: 178174
2013-03-27 20:07:48 +00:00
Preston Gurd 663e6f9558 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Christian Konig 08f5929942 R600/SI: add SETO/SETUO patterns
6 more piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178145
2013-03-27 15:27:31 +00:00
Hal Finkel 687143557d Print PPC ZERO as 0 (not r0) even on Darwin
It seems that the Darwin PPC assembler requires r0 to be written as 0 when it
means 0 (at least in lwarx/stwcx.). Fixes PR15605.

llvm-svn: 178142
2013-03-27 13:20:52 +00:00
Evgeniy Stepanov 8dcc970760 Disable ASan/MSan symbolization of reports in tests.
It was using an instrumented symbolizer binary, which is a potential fork bomb.

llvm-svn: 178139
2013-03-27 13:11:12 +00:00
Silviu Baranga dc45336d09 Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32.
llvm-svn: 178134
2013-03-27 12:38:44 +00:00
Christian Konig 3c14580acb R600/SI: add cummuting of rev instructions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178127
2013-03-27 09:12:59 +00:00
Christian Konig 70a5032c1b R600/SI: add mulhu/mulhs patterns
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178126
2013-03-27 09:12:51 +00:00
Christian Konig 20a7e6b764 R600/SI: add srl/sha patterns for SI
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 178125
2013-03-27 09:12:44 +00:00
Hal Finkel 0f77861d9f Allocate r0 on PPC
The R0 register can now be allocated because instructions
that cannot use R0 as a GPR have been appropriately marked.

llvm-svn: 178123
2013-03-27 06:52:27 +00:00
Bill Schmidt a1b72d0f6a Remove the link register from the GPR classes on PowerPC.
Some implementation detail in the forgotten past required the link
register to be placed in the GPRC and G8RC register classes.  This is
just wrong on the face of it, and causes several extra intersection
register classes to be generated.  I found this was having evil
effects on instruction scheduling, by causing the wrong register class
to be consulted for register pressure decisions.

No code generation changes are expected, other than some minor changes
in instruction order.  Seven tests in the test bucket required minor
tweaks to adjust to the new normal.

llvm-svn: 178114
2013-03-27 02:40:14 +00:00
Michael Gottesman 46ebe53ead Added back in the test for arc-annotations.
The test was removed since I had not turned off the test during release
builds. This fails since ARC annotations support  is conditionally
compiled out during release builds. I added the proper requires header
to assuage this issue.

llvm-svn: 178101
2013-03-27 00:09:58 +00:00
David Blaikie a26d70358f Adding DIImportedModules to DIScopes.
This is just the basic groundwork for supporting DW_TAG_imported_module but I
wanted to commit this before pushing support further into Clang or LLVM so that
this rather churny change is isolated from the rest of the work. The major
churn here is obviously adding another field (within the common DIScope prefix)
to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should
be the last big churny change needed for DW_TAG_imported_module/using directive
support/PR14606.

llvm-svn: 178099
2013-03-27 00:07:26 +00:00
Hal Finkel a7b0630ba8 Don't spill PPC VRSAVE on non-Darwin (even in SjLj)
As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore
VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've
added some asserts to make sure that we're not).

As it turns out, we're not currently handling the Darwin case correctly (I've
added a FIXME in the test case). I've tried adding various implied register
definitions/uses to force the spill without success, so I'll need to address
this later.

llvm-svn: 178096
2013-03-27 00:02:20 +00:00
Michael Liao 03f9ad0e67 Add XTEST codegen support
llvm-svn: 178083
2013-03-26 22:47:01 +00:00
Jakob Stoklund Olesen 1ac7e662d4 Enable SandyBridgeModel for all modern Intel P6 descendants.
All Intel CPUs since Yonah look a lot alike, at least at the granularity
of the scheduling models. We can add more accurate models for
processors that aren't Sandy Bridge if required. Haswell will probably
need its own.

The Atom processor and anything based on NetBurst is completely
different. So are the non-Intel chips.

llvm-svn: 178080
2013-03-26 22:19:12 +00:00
Hal Finkel 0dfbb05aff Use multiple virtual registers in PPC CR spilling
Now that the register scavenger can support multiple spill slots, and PEI can
use virtual-register-based scavenging for multiple simultaneous registers, we
can use a virtual register for the transfer register in the CR spilling code.

This should eliminate the last place (outside of the prologue/epilogue) where
we depend on the unconditional availability of the r0 register. We will soon be
able to allocate it (in a somewhat restricted sense) as a GPR.

llvm-svn: 178060
2013-03-26 18:57:22 +00:00
Hal Finkel 4e05788cc3 Update PEI's virtual-register-based scavenging to support multiple simultaneous mappings
The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.

In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.

These new features will be tested in forthcoming commits.

llvm-svn: 178058
2013-03-26 18:56:54 +00:00
Michael Liao 4a44e556cc Fix PRFCHW test on non-x86 builds
- 'prefetch' intrinsics are only lowered when SSE is available. On non-X86
  builds, 'generic' CPU is used and stops lowering any prefetch intrinsics.

llvm-svn: 178046
2013-03-26 18:15:45 +00:00
Michael Liao 5173ee03af Add PREFETCHW codegen support
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
2013-03-26 17:47:11 +00:00
Ulrich Weigand b1e02b2af2 Add test case for commit r178031.
llvm-svn: 178038
2013-03-26 17:30:02 +00:00
Jyotsna Verma 15957b129f Hexagon: Use multiclass for aslh, asrh, sxtb, sxth, zxtb and zxth.
llvm-svn: 178032
2013-03-26 15:43:57 +00:00
Christian Konig 727d06de1d R600/SI: mark most intrinsics as readnone v2
They read from constant register space anyway.

v2: fix lit tests

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178020
2013-03-26 14:03:57 +00:00
Joe Abbey f686be4674 Patch by Gordon Keiser!
If PC or SP is the destination, the disassembler erroneously failed with the
invalid encoding, despite the manual saying that both are fine.

This patch addresses failure to decode encoding T4 of LDR (A8.8.62) which is a
postindexed load, where the offset 0xc is applied to SP after the load occurs.

llvm-svn: 178017
2013-03-26 13:58:53 +00:00
Alexey Samsonov 3a9d50396a Add asan/msan to the list of available features in LIT test runner
llvm-svn: 177994
2013-03-26 08:27:39 +00:00
Andrew Trick 9093e15066 Fix SCEV forgetMemoizedResults should search and destroy backedge exprs.
Fixes PR15570: SEGV: SCEV back-edge info invalid after dead code removal.

Indvars creates a SCEV expression for the loop's back edge taken
count, then determines that the comparison is always true and
removes it.

When loop-unroll asks for the expression, it contains a NULL
SCEVUnknkown (as a CallbackVH).

forgetMemoizedResults should invalidate the loop back edges expression.

llvm-svn: 177986
2013-03-26 03:14:53 +00:00
Bill Wendling 1ee6d8cef4 Remove testcase. It's failing on some platforms but not others.
llvm-svn: 177956
2013-03-26 01:10:03 +00:00
Bill Wendling d78cfa81be Hmm...not failing...odd
llvm-svn: 177955
2013-03-26 01:08:02 +00:00
Bill Wendling a5a44d4fd6 Temporarily XFAIL this test until Michael can look at it.
llvm-svn: 177953
2013-03-26 00:46:31 +00:00
Michael Gottesman cd4de0f9bb [ObjCARC Annotations] Added support for displaying the state of pointers at the bottom/top of BBs of the ARC dataflow analysis for both bottomup and topdown analyses.
This will allow for verification and analysis of the merge function of
the data flow analyses in the ARC optimizer.

The actual implementation of this feature is by introducing calls to
the functions llvm.arc.annotation.{bottomup,topdown}.{bbstart,bbend}
which are only declared. Each such call takes in a pointer to a global
with the same name as the pointer whose provenance is being tracked and
a pointer whose name is one of our Sequence states and points to a
string that contains the same name.

To ensure that the optimizer does not consider these annotations in any
way, I made it so that the annotations are considered to be of IC_None
type.

A test case is included for this commit and the previous
ObjCARCAnnotation commit.

llvm-svn: 177952
2013-03-26 00:42:09 +00:00
Michael Liao 5fbcd81793 Revise alignment checking/calculation on 256-bit unaligned memory access
- It's still considered aligned when the specified alignment is larger
  than the natural alignment;
- The new alignment for the high 128-bit vector should be min(16,
  alignment) as the pointer is advanced by 16, a power-of-2 offset.

llvm-svn: 177947
2013-03-25 23:50:10 +00:00
Michael Liao bb05a1d7b5 Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx)
- Handle the case where the result of 'insert_subvect' is bitcasted
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.

llvm-svn: 177945
2013-03-25 23:47:35 +00:00
Jakob Stoklund Olesen 52576719e0 Add an -mcpu option to a test that is apparently scheduler-sensitive.
This should fix the clang-atom-d2700-ubuntu-rel buildbot.

llvm-svn: 177943
2013-03-25 23:43:23 +00:00
Shuxin Yang 93b1f12ac1 Disable some unsafe-fp-math DAG-combine transformation after legalization.
For instance, following transformation will be disabled:
    x + x + x => 3.0f * x;

The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.

Reviewed by Nadav, thanks a lot!

rdar://13445387

llvm-svn: 177933
2013-03-25 22:52:29 +00:00
John McCall a237239097 Add an optimizer-side test case for ARC bug <rdar://13195034>, fixed
in the frontend with @clang.arc.use.

llvm-svn: 177928
2013-03-25 22:09:52 +00:00
Jyotsna Verma 5bd02926b4 Disable profiling tests for Hexagon since it doesn't support JIT.
llvm-svn: 177917
2013-03-25 21:15:11 +00:00
NAKAMURA Takumi 8c0d63c120 llvm/test/CodeGen/X86/atomic{32|64}.ll: Unmark them out of XFAIL:win32.
I know it is incorrect and they'd fail with +Asserts for win32 targets, though.
I'll try to fix them tonight.

llvm-svn: 177914
2013-03-25 21:07:53 +00:00
Jyotsna Verma 18ca8101ee XFAIL some of the generic CodeGen tests for Hexagon.
test/CodeGen/Generic/2008-02-20-MatchingMem.ll: Test contains inline assembly not supported by Hexagon.

Following tests are XFAILed due to multiple return values which Hexagon doesn't support.

test/CodeGen/Generic/multiple-return-values-cross-block-with-invoke.ll
test/CodeGen/Generic/select-cc.ll
test/CodeGen/Generic/vector.ll

llvm-svn: 177912
2013-03-25 21:04:16 +00:00
Shuxin Yang 389ed4b8f7 Fix a bug in fast-math fadd/fsub simplification.
The problem is that the code mistakenly took for granted that following constructor 
is able to create an APFloat from a *SIGNED* integer:
   
  APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value)

rdar://13486998

llvm-svn: 177906
2013-03-25 20:43:41 +00:00
Jyotsna Verma 2dd5fcc8f5 XFAIL DebugInfo tests for Hexagon.
Hexagon does not support -filetype=obj(direct object generation) flag. Therefore,
the following tests are being XFAILed:

test/DebugInfo/dwarf-public-names.ll
test/DebugInfo/member-pointers.ll
test/DebugInfo/two-cus-from-same-file.ll

llvm-svn: 177901
2013-03-25 20:20:34 +00:00
Jyotsna Verma a6c51833ef Disable Execution Engine tests not supported by Hexagon.
llvm-svn: 177896
2013-03-25 20:02:14 +00:00
NAKAMURA Takumi 951a9b169b Disable, for now, llvm/test/Transforms/GCOVProfiling on win32. I'll investigate them later.
llvm-svn: 177894
2013-03-25 19:47:20 +00:00
Dave Zarzycki 656e8515fc x86 -- add the XTEST instruction
llvm-svn: 177888
2013-03-25 18:59:43 +00:00
Dave Zarzycki 07fabeecad x86 -- disassemble the REP/REPNE prefix when needed
This fixes Apple bug: 13493622

llvm-svn: 177887
2013-03-25 18:59:38 +00:00
Chad Rosier 1ad494d35b Remove unnecessary attributes from test case.
llvm-svn: 177882
2013-03-25 18:36:19 +00:00
Shankar Easwaran 82050340c6 [tools][llvm-readobj] print the name of the section when iterating the symbol table / dynamic symbol table
llvm-svn: 177873
2013-03-25 16:06:51 +00:00
Yiannis Tsiouris dbb4adf134 Add a GC plugin for Erlang
llvm-svn: 177867
2013-03-25 13:47:46 +00:00
Arnaud A. de Grandmaison 3ee88e8a77 Address issues found by Duncan during post-commit review of r177856.
llvm-svn: 177863
2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison 9c383d68cf InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst)
This simplification happens at 2 places :
 - using the nsw attribute when the shl / mul is used by a sign test
 - when the shl / mul is compared for (in)equality to zero

llvm-svn: 177856
2013-03-25 09:48:49 +00:00
Justin Holewinski e988409423 [NVPTX] Fix handling of vector arguments
llvm-svn: 177847
2013-03-24 21:17:47 +00:00
Jakob Stoklund Olesen 91a5848cab Allow TableGen DAG arguments to be just a name.
DAG arguments can optionally be named:

  (dag node, node:$name)

With this change, the node is also optional:

  (dag node, node:$name, $name)

The missing node is treated as an UnsetInit, so the above is equivalent
to:

  (dag node, node:$name, ?:$name)

This syntax is useful in output patterns where we currently require the
types of variables to be repeated:

  def : Pat<(subc i32:$b, i32:$c), (SUBCCrr i32:$b, i32:$c)>;

This is preferable:

  def : Pat<(subc i32:$b, i32:$c), (SUBCCrr $b, $c)>;

llvm-svn: 177843
2013-03-24 19:36:51 +00:00
Benjamin Kramer eab1c5f98b Move X86-dependent test into the right subdirectory.
llvm-svn: 177821
2013-03-23 09:35:44 +00:00
Owen Anderson c81616b0a9 Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm.fmuladd to ISD::FMA nodes.
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.

NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook.  They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.
llvm-svn: 177820
2013-03-23 08:26:53 +00:00
Manman Ren 0827e97700 Support in AAEvaluator to print alias queries of loads/stores with TBAA tags.
Add "evaluate-tbaa" to print alias queries of loads/stores. Alias queries
between pointers do not include TBAA tags.

Add testing case for "placement new". TBAA currently says NoAlias.

llvm-svn: 177772
2013-03-22 22:34:41 +00:00
John McCall 20182ac0c7 Kill every call to @clang.arc.use in the ARC contract phase.
llvm-svn: 177769
2013-03-22 21:38:36 +00:00
Bill Wendling d96a7a6be8 Update test. There may be multiple catches, but those will be cleaned up.
llvm-svn: 177758
2013-03-22 20:36:39 +00:00
David Blaikie 7c8dbc12bb reorder the fields in DILexicalBlockFile to match the common prefix for DIScopes
llvm-svn: 177754
2013-03-22 19:13:22 +00:00
Galina Kistanova 36ffafc388 Reverted r176374. In some cases the lit.site.cfg file does not get generated in tools/clang/tools/extra.
llvm-svn: 177751
2013-03-22 18:54:14 +00:00
Jyotsna Verma fdc660bf2e Hexagon: Add and enable memops setbit, clrbit, &,|,+,- for byte, short, and word.
llvm-svn: 177747
2013-03-22 18:41:34 +00:00
David Blaikie 30ce0788e7 Refactor out the DIFile parameter to DILexicalBlock to refer to the raw file/directory pair
llvm-svn: 177742
2013-03-22 17:33:20 +00:00
Michel Danzer 3de8ae38e6 R600: Fix up test/CodeGen/R600/llvm.pow.ll for r177730
llvm-svn: 177736
2013-03-22 15:24:16 +00:00
Dmitry Vyukov 7b261db8e6 tsan: fix the test
Add missed file from r177717 commit that adds __tsan_vptr_read.

llvm-svn: 177719
2013-03-22 09:04:01 +00:00
Dmitry Vyukov 55e63ef454 tsan: handle vptr loads specially
This is required to determine ctor/dtor vs virtual call races.
http://llvm-reviews.chandlerc.com/D566

llvm-svn: 177717
2013-03-22 08:51:22 +00:00
Evgeniy Stepanov 2a066afce5 Fix llvm::removeUnreachableBlocks to handle unreachable loops.
llvm-svn: 177713
2013-03-22 08:43:04 +00:00
Arnaud A. de Grandmaison f364bc63e7 InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test.
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.

Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by : Duncan

llvm-svn: 177712
2013-03-22 08:25:01 +00:00
David Blaikie f333dc9571 Reorder the DIFile field in DILexicalBlock to become a prefix common with other DIScopes
llvm-svn: 177703
2013-03-22 05:47:44 +00:00
Chandler Carruth 273acf3395 Remove the ARM-specific variant of this test. It's already covered by
the ARM build bots, and it adds a weird case to the test suite where
a test uses as inputs files in the parent directory.

Talked about this with Dave on IRC and he's fine with this approach even
though it isn't optimal.

llvm-svn: 177700
2013-03-22 05:16:46 +00:00
Jack Carter 4f69a0f25d Fix the invalid opcode for Mips branch instructions in the assembler
For mips a branch an 18-bit signed offset (the 16-bit 
offset field shifted left 2 bits) is added to the 
address of the instruction following the branch 
(not the branch itself), in the branch delay slot, 
to form a PC-relative effective target address. 

Previously, the code generator did not perform the 
shift of the immediate branch offset which resulted 
in wrong instruction opcode. This patch fixes the issue.

Contributor: Vladimir Medic
llvm-svn: 177687
2013-03-22 00:29:10 +00:00
Jack Carter 9e65aa35a0 This patch that enables the Mips assembler to use symbols for offset for instructions
This patch uses the generated instruction info tables to 
identify memory/load store instructions.
After successful matching and based on the operand type 
and size, it generates additional instructions to the output.

Contributor: Vladimir Medic
llvm-svn: 177685
2013-03-22 00:05:30 +00:00
Hal Finkel 891671afe5 Fix a register-class comparison bug in PPCCTRLoops
Thanks to Jakob for isolating the underlying problem from the
test case in r177423. The original commit had introduced
asymmetric copy operations, but these turned out to be a work-around
to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops).

llvm-svn: 177679
2013-03-21 23:23:34 +00:00
David Blaikie 5ef3fcb745 Refactor the filename/directory information in DISubprogram to refer directly to the pair rather than the DIFile.
llvm-svn: 177677
2013-03-21 23:08:34 +00:00
David Blaikie 0d7d62e4b2 Move the DIFile in DISubprogram to the beginning to be a common prefix along with other DIScopes
llvm-svn: 177674
2013-03-21 22:29:36 +00:00
Jack Carter d76b2376f2 This patch enables the Mips .set directive to define aliases
The .set directive in the Mips the assembler can be 
used to set the value of a symbol to an expression. 
This changes the symbol's value and type to conform 
to the expression's.

Syntax: .set symbol, expression

This patch implements the parsing of the above syntax 
and enables the parser to use defined symbols when 
parsing operands.

Contributor: Vladimir Medic
llvm-svn: 177667
2013-03-21 21:44:16 +00:00
Hal Finkel 756810fe36 Implement builtin_{setjmp/longjmp} on PPC
This implements SJLJ lowering on PPC, making the Clang functions
__builtin_{setjmp/longjmp} functional on PPC platforms. The implementation
strategy is similar to that on X86, with the exception that a branch-and-link
variant is used to get the right jump address. Credit goes to Bill Schmidt for
suggesting the use of the unconditional bcl form (instead of the regular bl
instruction) to limit return-address-cache pollution.

Benchmarking the speed at -O3 of:

static jmp_buf env_sigill;

void foo() {
                __builtin_longjmp(env_sigill,1);
}

main() {
	...

        for (int i = 0; i < c; ++i) {
                if (__builtin_setjmp(env_sigill)) {
                        goto done;
                } else {
                        foo();
                }

done:;
        }

	...
}

vs. the same code using the libc setjmp/longjmp functions on a P7 shows that
this builtin implementation is ~4x faster with Altivec enabled and ~7.25x
faster with Altivec disabled. This comparison is somewhat unfair because the
libc version must also save/restore the VSX registers which we don't yet
support.

llvm-svn: 177666
2013-03-21 21:37:52 +00:00
Renato Golin a4d563526e Fix Darwin NEON FP and increase coverage
llvm-svn: 177664
2013-03-21 21:30:49 +00:00
David Blaikie cc8d090163 Remove unused field in DISubprogram
llvm-svn: 177661
2013-03-21 20:28:52 +00:00
Hal Finkel a1431df540 Add support for spilling VRSAVE on PPC
Although there is only one Altivec VRSAVE register, it is a member of
a register class, and we need the ability to spill it. Because this
register is normally callee-preserved and handled by special code this
has never before been necessary. However, this capability will be required by
a forthcoming commit adding SjLj support.

llvm-svn: 177654
2013-03-21 19:03:21 +00:00
Hal Finkel aa03c03a2d Correct PPC FRAMEADDR lowering using a pseudo-register
The old code used to lower FRAMEADDR tried to replicate the logic in the real
frame-lowering code that determines whether or not the frame pointer (r31) will
be used. When it seemed as through the frame pointer would not be used, the
stack pointer (r1) was used instead. Unfortunately, because the stack size is
not yet known, this does not work. Instead, this change introduces new
always-reserved pseudo-registers (FP and FP8) that are replaced during prologue
insertion with the real frame-pointer register (either r1 or r31).

It is important that this intrinsic always return a valid frame address because
it is used by Clang to store the frame address as part of code generation for
__builtin_setjmp.

llvm-svn: 177653
2013-03-21 19:03:19 +00:00
Renato Golin b4dd6c5945 Avoid NEON SP-FP unless unsafe-math or Darwin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.

llvm-svn: 177651
2013-03-21 18:47:47 +00:00
Bill Wendling 5b98172115 Update some EH tests that were violating the new EH model.
The landingpad instruction needs to be the first non-PHI instruction in the
unwind destination block.

llvm-svn: 177650
2013-03-21 18:30:10 +00:00
Meador Inge 6b6a161ccf Move library call prototype attribute inference to functionattrs
The simplify-libcalls pass implemented a doInitialization hook to infer
function prototype attributes for well-known functions.  Given that the
simplify-libcalls pass is going away *and* that the functionattrs pass
is already in place to deduce function attributes, I am moving this logic
to the functionattrs pass.  This approach was discussed during patch
review:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html.

llvm-svn: 177619
2013-03-21 00:55:59 +00:00
David Blaikie efb0d65ed7 Debug info: refactor the first field of DICompileUnit to be a raw file/directory pair
This removes the DICompileUnit special case from DIScope.

llvm-svn: 177610
2013-03-20 23:58:12 +00:00
Nadav Rotem 4536d582fd When computing the demanded bits of Load SDNodes, make sure that we are looking at the loaded-value operand and not the ptr result (in case of pre-inc loads).
rdar://13348420

llvm-svn: 177596
2013-03-20 22:53:44 +00:00
David Blaikie 3b88852a2d Debug Info: Swap the 2nd and 3rd parameters to DICompileUnit to match the common DIScope prefix
llvm-svn: 177595
2013-03-20 22:52:54 +00:00
David Blaikie 43a729d165 Remove unused field in DICompileUnit
llvm-svn: 177590
2013-03-20 22:34:33 +00:00
Michael Liao 70dd7f999d Correct cost model for vector shift on AVX2
- After moving logic recognizing vector shift with scalar amount from
  DAG combining into DAG lowering, we declare to customize all vector
  shifts even vector shift on AVX is legal. As a result, the cost model
  needs special tuning to identify these legal cases.

llvm-svn: 177586
2013-03-20 22:01:10 +00:00
David Blaikie 2881da7929 Refactor file/directory path in namespace debug info to refer directly to the pair rather than the DIFile
(paired to a Clang test - excuse the buildbot skew/fallout)

llvm-svn: 177566
2013-03-20 19:39:15 +00:00
David Blaikie a354eedcdb Enhance debug info namespace test to check for context/scope reference
The differing file (due to the #line directive in the original source) is for
future testing improvements coming soon.

llvm-svn: 177560
2013-03-20 19:14:16 +00:00
David Blaikie 7a84d4f864 Make target-specific test case in r177474 only run when that target is built
llvm-svn: 177545
2013-03-20 17:39:02 +00:00
David Blaikie 4f278f9b80 Reorder the DIFile parameter in DINameSpace
Moving the DIFile parameter to immediately proceed the tag so that it will be a
common prefix with other DIScopes (once the DIFile is replaced with the raw
file/directory pair).

llvm-svn: 177492
2013-03-20 06:27:06 +00:00
Hao Liu a7521131da Add a test case for PR15318 fixed in r177472
llvm-svn: 177489
2013-03-20 06:18:06 +00:00
Nick Lewycky 2b7fe9fc93 Don't assume the test directory is writable, use %T to find a writable
directory.

llvm-svn: 177488
2013-03-20 05:59:40 +00:00
David Blaikie fdabd48005 Test DW_TAG_namespace support in the backend
This is the backend portion of a Clang test case
(clang/test/CodeGenCXX/debug-info-namespace.cpp) that was roughly/coarsely
testing LLVM.

llvm-svn: 177487
2013-03-20 05:15:37 +00:00
Michael Liao 0f4ea0c4a9 Fix PR15296
- Move SRA/SRL/SHL lowering support from DAG combination to DAG lowering
  to support extended 256-bit integer in AVX but not AVX2.

llvm-svn: 177478
2013-03-20 02:33:21 +00:00
David Blaikie 67ff9b6752 Fix test case regression on ARM & PPC introduced r177239
llvm-svn: 177474
2013-03-20 01:55:11 +00:00
David Blaikie 200b6ed80f Refactor the DIFile (2nd) parameter to DITypes to be an MDNode reference to a raw directory/file pair
This makes DIType's first non-tag parameter the same as DIFile's, allowing them
to both share the common implementation of getFilename/getDirectory in DIScope.

llvm-svn: 177467
2013-03-20 00:26:26 +00:00
Justin Holewinski d068943809 Propagate DAG node ordering during type legalization and instruction selection
A node's ordering is only propagated during legalization if (a) the new node does
not have an ordering (is not a CSE'd node), or (b) the new node has an ordering
that is higher than the node being legalized.

llvm-svn: 177465
2013-03-20 00:10:32 +00:00
Chad Rosier b162a5ca4d Fix pr13145 - Naming a function like a register name confuses the asm parser.
Patch by Stepan Dyatkovskiy <stpworld@narod.ru>
rdar://13457826

llvm-svn: 177463
2013-03-19 23:44:03 +00:00
David Blaikie abec74b38e Move the DIFile operand to DITypes from the 4th operand to the 2nd.
This is another step along the way to making all DIScopes have a common prefix
which can be added to in a general manner to support using directives
(DW_TAG_imported_module).

llvm-svn: 177462
2013-03-19 23:25:22 +00:00
Hal Finkel 559754a482 Add a comment to the CodeGen/PowerPC/asym-regclass-copy.ll test
llvm-svn: 177434
2013-03-19 20:22:32 +00:00
Arnaud A. de Grandmaison 87c473f0d1 IndVarSimplify: do not recompute an IV value outside of the loop if :
- it is trivially known to be used inside the loop in a way that can not be optimized away
- there is no use outside of the loop which can take advantage of the computation hoisting

llvm-svn: 177432
2013-03-19 20:00:22 +00:00
Ulrich Weigand d850167a19 Rewrite pre-increment store patterns to use standard memory operands.
Currently, pre-increment store patterns are written to use two separate
operands to represent address base and displacement:

  stwu $rS, $ptroff($ptrreg)

This causes problems when implementing the assembler parser, so this
commit changes the patterns to use standard (complex) memory operands
like in all other memory access instruction patterns:

  stwu $rS, $dst

To still match those instructions against the appropriate pre_store
SelectionDAG nodes, the patch uses the new feature that allows a Pat
to match multiple DAG operands against a single (complex) instruction
operand.

Approved by Hal Finkel.

llvm-svn: 177429
2013-03-19 19:52:04 +00:00
Hal Finkel 638a9fa43e Prepare to make r0 an allocatable register on PPC
Currently the PPC r0 register is unconditionally reserved. There are two reasons
for this:

 1. r0 is treated specially (as the constant 0) by certain instructions, and so
    cannot be used with those instructions as a regular register.

 2. r0 is used as a temporary register in the CR-register spilling process
    (where, under some circumstances, we require two GPRs).

This change addresses the first reason by introducing a restricted register
class (without r0) for use by those instructions that treat r0 specially. These
register classes have a new pseudo-register, ZERO, which represents the r0-as-0
use. This has the side benefit of making the existing target code simpler (and
easier to understand), and will make it clear to the register allocator that
uses of r0 as 0 don't conflict will real uses of the r0 register.

Once the CR spilling code is improved, we'll be able to allocate r0.

Adding these extra register classes, for some reason unclear to me, causes
requests to the target to copy 32-bit registers to 64-bit registers. The
resulting code seems correct (and causes no test-suite failures), and the new
test case covers this new kind of asymmetric copy.

As r0 is still reserved, no functionality change intended.

llvm-svn: 177423
2013-03-19 18:51:05 +00:00
Nadav Rotem 0f1bc60d51 Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>

llvm-svn: 177421
2013-03-19 18:38:27 +00:00
Hal Finkel 6681486375 Cleanup PPC64 unaligned i64 load/store
Remove an accidentally-added instruction definition and add a comment in the
test case. This is in response to a post-commit review by Bill Schmidt.

No functionality change intended.

llvm-svn: 177404
2013-03-19 15:23:39 +00:00
Renato Golin 227eb6fc5f Improve long vector sext/zext lowering on ARM
The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.

This partially addresses PR14867.

Patch by Pete Couperus

llvm-svn: 177380
2013-03-19 08:15:38 +00:00
Hal Finkel d9e10d51fa Don't reserve R31 on PPC64 unless the frame pointer is needed
llvm-svn: 177379
2013-03-19 08:09:38 +00:00
Nick Lewycky d67186337a Emit the linkage name instead of the function name, when available. This means
that we'll prefer to emit the mangled C++ name (pending a clang change).

llvm-svn: 177371
2013-03-19 01:37:55 +00:00
Hal Finkel fc9aad6436 Fix a sign-extension bug in PPCCTRLoops
Don't sign extend the immediate value from the OR instruction in
an LIS/OR pair.

llvm-svn: 177361
2013-03-18 23:58:28 +00:00
Hal Finkel b09680b0f7 Fix PPC unaligned 64-bit loads and stores
PPC64 supports unaligned loads and stores of 64-bit values, but
in order to use the r+i forms, the offset must be a multiple of 4.
Unfortunately, this cannot always be determined by examining the
immediate itself because it might be available only via a TOC entry.

In order to get around this issue, we additionally predicate the
selection of the r+i form on the alignment of the load or store
(forcing it to be at least 4 in order to select the r+i form).

llvm-svn: 177338
2013-03-18 23:00:58 +00:00
Arnold Schwaighofer ae0052f114 ARM cost model: Make some vector integer to float casts cheaper
The default logic marks them as too expensive.

For example, before this patch we estimated:
  cost of 16 for instruction:   %r = uitofp <4 x i16> %v0 to <4 x float>

While this translates to:
  vmovl.u16 q8, d16
  vcvt.f32.u32  q8, q8

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13445992

llvm-svn: 177334
2013-03-18 22:47:09 +00:00
Arnold Schwaighofer 6c9c3a8b99 ARM cost model: Correct cost for some cheap float to integer conversions
Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
  cost of 16 for instruction:   %r = fptoui <4 x float> %v0 to <4 x i16>

While we would emit:
  vcvt.s32.f32  q8, q8
  vmovn.i32 d16, q8
  vuzp.8  d16, d17

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13434072

llvm-svn: 177333
2013-03-18 22:47:06 +00:00
Quentin Colombet 8fc340976d Extend global merge pass to optionally consider global constant variables.
Also add some checks to not merge globals used within landing pad instructions or marked as "used".

llvm-svn: 177331
2013-03-18 22:30:07 +00:00
Bill Schmidt 45bdbe7ec0 Change test cases to handle unaligned references.
Hal Finkel recently added code to allow unaligned memory references
for PowerPC.  Two tests were temporarily modified with
-disable-ppc-unaligned to keep them from failing.  This patch adjusts
the expected code generation for the unaligned references.

llvm-svn: 177328
2013-03-18 22:12:04 +00:00
David Blaikie 3053029310 Remove unnecessary leading comment characters in lit-only file
llvm-svn: 177327
2013-03-18 22:08:16 +00:00
Manman Ren 1217112d11 Check whether a pointer is non-null (isKnownNonNull) in isKnownNonZero.
This handles the case where we have an inbounds GEP with alloca as the pointer.
This fixes the regression in PR12750 and rdar://13286434.
Note that we can also fix this by handling some GEP cases in isKnownNonNull.

llvm-svn: 177321
2013-03-18 21:23:25 +00:00
David Blaikie 99067af791 Include '.test' suffix in target specific lit configs that need it
Apparently my final cleanup to use a relevant suffix for these tests before
committing r176831 caused them to stop running since lit wasn't configured to
run tests with that suffix in those directories (why don't we just have a
global suffix list?). So, add the suffix to the relevant directories & fix the
test that has bitrotted over the last week due to my debug info schema changes.

llvm-svn: 177315
2013-03-18 20:31:44 +00:00
Hal Finkel 21f2a43ab4 Fix large count and negative constant count handling in PPCCTRLoops
This commit fixes an assert that would occur on loops with large constant counts
(like looping for ((uint32_t) -1) iterations on PPC64). The existing code did
not handle counts that it computed to be negative (asserting instead), but
these can be created with valid inputs.

This bug was discovered by bugpoint while I was attempting to isolate a
completely different problem.

Also, in writing test cases for the negative-count problem, I discovered that
the ori/lsi handling was broken (there was a typo which caused the logic that
was supposed to detect these pairs and extract the iteration count to always
fail). This has now also been corrected (and is covered by one of the new test
cases).

llvm-svn: 177295
2013-03-18 17:40:44 +00:00
Hal Finkel 12337e4e7d Cleanup initial-value constants in PPCCTRLoops
Because the initial-value constants had not been added to the list
of instructions considered for DCE the resulting code had redundant
constant-materialization instructions.

llvm-svn: 177294
2013-03-18 17:40:27 +00:00
David Tweed d505b24277 Initially forgotten-to-svn-add test case for r177279.
llvm-svn: 177280
2013-03-18 12:07:24 +00:00
Kostya Serebryany 10cc12f2b7 [asan] when creating string constants, set unnamed_attr and align 1 so that equal strings are merged by the linker. Observed up to 1% binary size reduction. Thanks to Anton Korobeynikov for the suggestion
llvm-svn: 177264
2013-03-18 09:38:39 +00:00
Kostya Serebryany 6b5b58deeb [asan] don't instrument functions with available_externally linkage. This saves a bit of compile time and reduces the number of redundant global strings generated by asan (https://code.google.com/p/address-sanitizer/issues/detail?id=167)
llvm-svn: 177250
2013-03-18 07:33:49 +00:00
Craig Topper 0498b88d48 Post process ADC/SBB and use a shorter encoding if they use a sign extended immediate.
llvm-svn: 177243
2013-03-18 03:34:55 +00:00
Craig Topper 7e9a1cb199 Refactor some duplicated code into helper functions.
llvm-svn: 177242
2013-03-18 02:53:34 +00:00
Michael Gottesman a8b60a4fda Reduced dont-infinite-loop-during-block-escape-analysis.ll with bugpoint and moved it to retain-block-escape-analysis.ll.
*NOTE* I verified that the original bug behind
dont-infinite-loop-during-block-escape-analysis.ll occurs when using opt on
retain-block-escape-analysis.ll.

llvm-svn: 177240
2013-03-17 21:31:12 +00:00
David Blaikie 8fb8224578 Split out filename & directory from DIFile to start generalizing over DIScopes
This is the first step to making all DIScopes have a common metadata prefix (so
that things (using directives, for example) that can appear in any scope can be
added to that common prefix). DIFile is itself a DIScope so the common prefix
of all DIScopes cannot be a DIFile - instead it's the raw filename/directory
name pair.

llvm-svn: 177239
2013-03-17 21:13:55 +00:00
David Blaikie 2e488d1f0d Generalize debug info test to be resilient to changes in metadata node numbering
llvm-svn: 177238
2013-03-17 21:08:22 +00:00
Michael Gottesman 9782183126 The promised test case for r175939.
This test makes sure that the ObjCARC escape analysis looks at the uses of
instructions which copy the block pointer value by checking all four cases where
that can occur.

llvm-svn: 177232
2013-03-17 08:42:58 +00:00
Hal Finkel fcc51d4ff1 Improve PPC VR (Altivec) register spilling
This change cleans up two issues with Altivec register spilling:

  1. The spilling code was inefficient (using two instructions, and add and a
     load, when just one would do)

  2. The code assumed that r0 would always be available (true for now, but this
     will change)

The new code handles VR spilling just like GPR spills but forced into r+r mode.
As a result, when any VR spills are present, we must now always allocate the
register-scavenger spill slot.

llvm-svn: 177231
2013-03-17 04:43:44 +00:00
Hal Finkel 57080382e6 Remove FIXMEs in PPC test cases related to unaligned loads/stores
As pointed out by Bill in response to r177160, these two FIXMEs
can also be removed.

llvm-svn: 177229
2013-03-16 23:02:31 +00:00
Craig Topper 612f7bfa4d Add X86 code emitter support AVX encoded MRMDestReg instructions.
Previously we weren't skipping the VVVV encoded register. Based on patch by Michael Liao.

llvm-svn: 177221
2013-03-16 03:44:31 +00:00
Arnold Schwaighofer 9d7a3827e4 ARM cost model: Fix costs for some vector selects
I was too pessimistic in r177105. Vector selects that fit into a legal register
type lower just fine. I was mislead by the code fragment that I was using. The
stores/loads that I saw in those cases came from lowering the conditional off
an address.

Changing the code fragment to:

%T0_3 = type <8 x i18>
%T1_3 = type <8 x i1>

define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
                         %T1_3* %blend, %T0_3* %storeaddr) {
  %v0 = load %T0_3* %loadaddr
  %v1 = load %T0_3* %loadaddr2
==> FROM:
  ;%c = load %T1_3* %blend
==> TO:
  %c = icmp slt %T0_3 %v0, %v1
==> USE:
  %r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1

  store %T0_3 %r, %T0_3* %storeaddr
  ret void
}

revealed this mistake.

radar://13403975

llvm-svn: 177170
2013-03-15 18:31:01 +00:00
Silviu Baranga 82dd6ac3bc Adding an A15 specific optimization pass for interactions between S/D/Q registers. The pass handles all the required transformations pre-regalloc.
llvm-svn: 177169
2013-03-15 18:28:25 +00:00
Benjamin Kramer 2f5457141a ARM: Fix an old refacto.
Fixes PR15520.

llvm-svn: 177167
2013-03-15 17:27:39 +00:00
Hal Finkel 8d7fbc9dad Enable unaligned memory access on PPC for scalar types
Unaligned access is supported on PPC for non-vector types, and is generally
more efficient than manually expanding the loads and stores.

A few of the existing test cases were using expanded unaligned loads and stores
to test other features (like load/store with update), and for these test cases,
unaligned access remains disabled.

llvm-svn: 177160
2013-03-15 15:27:13 +00:00
Arnold Schwaighofer f5284ff61f ARM cost model: Fix cost of fptrunc and fpext instructions
A vector fptrunc and fpext simply gets split into scalar instructions.

radar://13192358

llvm-svn: 177159
2013-03-15 15:10:47 +00:00
Hal Finkel b0fac42987 Protect PPC Altivec patterns with a predicate
In preparation for the addition of other SIMD ISA extensions (such as QPX) we
need to make sure that all Altivec patterns are properly predicated on having
Altivec support.

No functionality change intended (one test case needed to be updated b/c it
assumed that Altivec intrinsics would be supported without enabling Altivec
support).

llvm-svn: 177152
2013-03-15 13:21:21 +00:00
Alexey Samsonov cd27b98d38 Fixup for r176933: more careful setup of path to llvm-symbolizer
llvm-svn: 177144
2013-03-15 07:27:49 +00:00
Rafael Espindola ef9d3494b2 Fix the FDE encoding to be relative on ELF.
This is a very late complement to r130637 which fixed this on x86_64. Fixes
pr15448.

Since it looks like that every elf architecture uses this encoding when using
cfi, make it the default for elf. Just exclude mips64el. It has a lovely
.ll -> .o test (ef_frame.ll) that tests that nothing changes in the binary
content of the .eh_frame produced by llc. Oblige it.

llvm-svn: 177141
2013-03-15 05:51:57 +00:00
Hal Finkel bb420f10e9 Allocate the RS spill slot for any PPC function with spills and a large stack frame
For spills into a large stack frame, the FI-elimination code uses the register
scavenger to obtain a free GPR for use with an r+r-addressed load or store.
When there are no available GPRs, the scavenger gets one by using its spill
slot. Previously, we were not always allocating that spill slot and the RS
would assert when the spill slot was needed.

I don't currently have a small test that triggered the assert, but I've
created a small regression test that verifies that the spill slot is now
added when the stack frame is sufficiently large.

llvm-svn: 177140
2013-03-15 05:06:04 +00:00
Nadav Rotem 4a4827ce21 Add a triple to the test.
llvm-svn: 177131
2013-03-15 00:10:23 +00:00
Nadav Rotem adfa5eaf8c Unaligned loads should use the VMOVUPS opcode.
llvm-svn: 177130
2013-03-14 23:49:44 +00:00
Arnold Schwaighofer 9b55e31bcb LoopVectorizer: Insert some white space to make test case more readable
Also remove some unneeded function attributes.

llvm-svn: 177114
2013-03-14 21:31:09 +00:00
Chad Rosier 4b54f594b4 [fast-isel] The X86FastISel::FastLowerArguments function doesn't properly handle
the win64 calling convention.
rdar://13423768

llvm-svn: 177113
2013-03-14 21:25:04 +00:00
Hal Finkel e987a311ba Not all PPC functions with a frame pointer need a RS spill slot
We used to add a spill slot for the register scavenger whenever the function
has a frame pointer. This is unnecessarily conservative: We may need the spill
slot for dynamic stack allocations, and functions with dynamic stack
allocations always have a FP, but we might also have a FP for other reasons
(such as the user explicitly disabling frame-pointer elimination), and we don't
necessarily need a spill slot for those functions.

The structsinregs test needed adjustment because it disables FP elimination.

llvm-svn: 177106
2013-03-14 19:34:32 +00:00
Arnold Schwaighofer 8070b382ec ARM cost model: Increase cost of some vector selects we do terrible on
By terrible I mean we store/load from the stack.

This matters on PAQp8 in _Z5trainPsS_ii (which is inlined into Mixer::update)
where we decide to vectorize a loop with a VF of 8 resulting in a 25%
degradation on a cortex-a8.

LV: Found an estimated cost of 2 for VF 8 For instruction:   icmp slt i32
LV: Found an estimated cost of 2 for VF 8 For instruction:   select i1, i32, i32

The bug that tracks the CodeGen part is PR14868.

radar://13403975

llvm-svn: 177105
2013-03-14 19:17:02 +00:00
Jyotsna Verma ec613665c2 Hexagon: Removed asserts regarding alignment and offset.
We are warning the user about the alignment, so we should not assert.

llvm-svn: 177103
2013-03-14 19:08:03 +00:00
Arnold Schwaighofer 4991ce9d49 Add missing asserts flag to test - it uses debug flags
llvm-svn: 177102
2013-03-14 19:01:58 +00:00
Arnold Schwaighofer c63cf3a0ae LoopVectorize: Invert case when we use a vector cmp value to query select cost
We generate a select with a vectorized condition argument when the condition is
NOT loop invariant. Not the other way around.

llvm-svn: 177098
2013-03-14 18:54:36 +00:00