Commit Graph

23001 Commits

Author SHA1 Message Date
Nico Rieck a0abeb3548 Fix more broken CHECK lines
llvm-svn: 201493
2014-02-16 13:28:39 +00:00
Nico Rieck 5ba5226ab9 Add extra CHECK prefix to tests with explicit prefix
These tests mistakenly assume that CHECK is still available even if an
explicit prefix is specified.

llvm-svn: 201492
2014-02-16 13:28:15 +00:00
Nico Rieck 35a237d4ed Actually call FileCheck in tests
llvm-svn: 201491
2014-02-16 13:27:39 +00:00
Elena Demikhovsky 1fad075974 AVX-512: simpyfied BUILD_VECTOR for masks; fixed cmp/test sequence
llvm-svn: 201487
2014-02-16 11:34:23 +00:00
Eric Christopher 4a74104933 Add a DIELoc class to cover the DW_FORM_exprloc set of expressions
alongside DIEBlock and replace uses accordingly. Use DW_FORM_exprloc
in DWARF4 and later code. Update testcases.

Adding a DIELoc instead of using extra forms inside DIEBlock so
that we can keep location expressions separate from other uses. No
direct use at the moment, however, it's not a lot of code and
using a separately named class keeps it somewhat more obvious
what's going on in various locations.

llvm-svn: 201481
2014-02-16 08:46:55 +00:00
Nico Rieck 7647178738 Fix broken CHECK lines
llvm-svn: 201479
2014-02-16 07:31:05 +00:00
Saleem Abdulrasool 27304cb189 MCAsmParser: relax declaration parsing
The Linux kernel defines empty macros for compatibility with ARM UAL syntax.
The comma after the name is optional, and if present can be safely lexed.  This
improves compatibility with the GNU assembler.

llvm-svn: 201474
2014-02-16 04:56:31 +00:00
Saleem Abdulrasool 49480bf01c ARM IAS: (partially) support .arch_extension directive
This adds a partial implementation of the .arch_extension directive to the
integrated ARM assembler.  There are a number of limitations to this
implementation arising from the target backend support rather than the
implementation itself.  Namely, iWMMXT (v1 and v2), Maverick, and XScale support
is not present in the ARM backend.  Currently, there is no check for A-class
only (needed for virt), and no ARMv6k detection (needed for os and sec).  The
remainder of the extensions are fully supported.

llvm-svn: 201471
2014-02-16 00:16:41 +00:00
David Blaikie f1a6dea82c DebugInfo: Deduplicate entries in the fission address table
This broke in r185459 while TLS support was being generalized to handle
non-symbol TLS representations.

I thought about/tried having an enum rather than a bool to track the
TLS-ness of the address table entry, but namespaces and naming seemed
more hassle than it was worth for only one caller that needed to specify
this.

llvm-svn: 201469
2014-02-15 19:34:03 +00:00
Arnold Schwaighofer 847d96142c Revert "SCEVExpander: Try hard not to create derived induction variables in other loops"
This reverts commit r201465. It broke an arm bot.

llvm-svn: 201466
2014-02-15 18:16:56 +00:00
Arnold Schwaighofer 1e12f8563d SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived of the induction variable of a preceeding loop. SCEV
has cannonicalized this value to a different recurrence than the recurrence of
the preceeding loop's induction variable (the type and/or step direction) has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceeding loop.  This patch tries to base such derived
induction variables of the preceeding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

radar://15970709

llvm-svn: 201465
2014-02-15 17:11:56 +00:00
Craig Topper 34875ab0b5 Add opcode extension forms of MOV8ri/MOV16ri/MOV32ri.
llvm-svn: 201463
2014-02-15 07:29:18 +00:00
David Blaikie 60e6386b87 DebugInfo: Implement DW_AT_stmt_list for type units
Type units will share the statement list of their defining compile unit.
This is a tradeoff that reduces .o debug info size at the cost of some
linked debug info size (since the contents of those string tables won't
be deduplicated along with the type unit) which seems right for now.

llvm-svn: 201445
2014-02-14 23:58:13 +00:00
Quentin Colombet 867c550947 [CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if the
transformation does not bring any immediate benefits and introduce an illegal
operation. 

llvm-svn: 201439
2014-02-14 22:23:22 +00:00
Tom Stellard 728d4172df TargetLowering: n * r where n > 2 should be an illegal addressing mode
llvm-svn: 201433
2014-02-14 21:10:34 +00:00
David Blaikie 9acebfdd94 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
Recommitting r201380 (reverted in r201389)
Recommitting r201351 and r201355 (reverted in r201351 and r201355)

We weren't emitting the an empty (header only) line table when the line
table was empty - this made the DWARF invalid (the compile unit would
point to the zero-size debug_lines section where there should've been an
empty line table but there was nothing at all). Fix that, and as a
consequence this works around/addresses PR18809.

Also, we emit a non-empty line table to workaround a darwin linker bug,
so XFAILing on darwin too.

Also, mark the test as 'REQUIRES: object-emission' because it does.

llvm-svn: 201429
2014-02-14 19:51:35 +00:00
Diego Novillo 5b5cf503b5 Support DWARF discriminators in object streamer.
Summary:
This adds support for emitting DWARF path discriminator values in
the object streamer. It also changes the DWARF dumper to show
discriminator values in the line table output.

Reviewers: echristo

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2794

llvm-svn: 201427
2014-02-14 19:27:53 +00:00
Reed Kotler 4cdaa7d778 This patch has two main functions:
1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and
the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for
mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass
but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation
and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc.

2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more
logical place to handle this and we have known for some time that we needed to move the code later and not implement it using
inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us
how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done
in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list
during the original putback of the Mips16HardFloat pass but was not practical for us do at that time.

llvm-svn: 201426
2014-02-14 19:16:39 +00:00
Artyom Skrobov f6830f47b8 Generate the DWARF stack frame decode operations in the function prologue for ARM/Thumb functions.
Patch by Keith Walker!

llvm-svn: 201423
2014-02-14 17:19:07 +00:00
Kevin Qin edc95ee196 [AArch64 NEON] Fix a bug to avoid using floating type as condition type in lowering SELECT_CC.
llvm-svn: 201395
2014-02-14 09:41:15 +00:00
Eric Christopher abc621668d Revert "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded"
This reverts commit r201380 for now while we investigate.

llvm-svn: 201389
2014-02-14 05:33:16 +00:00
NAKAMURA Takumi 602978ae0e llvm/test/DebugInfo/empty.ll: Mark it as XFAIL:win32 lacking of line table.
llvm-svn: 201388
2014-02-14 05:26:49 +00:00
NAKAMURA Takumi b54440f9ed [PR18809] Remove XFAIL from DebugInfo/empty.ll.
I added it in r201211.

llvm-svn: 201383
2014-02-14 03:59:43 +00:00
Hao Liu 7146ef8542 [AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node.
As v1i1 is illegal, the type legalizer tries to scalarize such node. But if the type operands of SETCC is legal, the scalarization algorithm will cause an assertion failure.

llvm-svn: 201381
2014-02-14 02:21:56 +00:00
David Blaikie 177585d1d9 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
Recommitting r201351 and r201355 (reverted in r201351 and r201355)

We weren't emitting the an empty (header only) line table when the line
table was empty - this made the DWARF invalid (the compile unit would
point to the zero-size debug_lines section where there should've been an
empty line table but there was nothing at all). Fix that, and as a
consequence this works around/addresses PR18809.

llvm-svn: 201380
2014-02-14 01:57:59 +00:00
Eric Christopher 02dbadb3a0 Disable emission of aranges by default and add a command line
option to enable again that will be matched with a commit to enable
in clang.

llvm-svn: 201378
2014-02-14 01:26:55 +00:00
Matt Arsenault aa689f5079 Do more addrspacecast transforms that happen for bitcast.
Makes addrspacecast (gep) do addrspacecast (gep) instead.

llvm-svn: 201376
2014-02-14 00:49:12 +00:00
Tom Stellard 967bf5813f R600/SI: Expand all v8[if]32 operations
llvm-svn: 201371
2014-02-13 23:34:15 +00:00
Tom Stellard f16d38cbb5 R600/SI: Add a pattern for i32 anyext
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 201370
2014-02-13 23:34:13 +00:00
Tom Stellard 6c7a7e82a7 R600/SI: Completely Disable TypeRewriter on compute
llvm-svn: 201369
2014-02-13 23:34:12 +00:00
Tom Stellard 80be9650e3 R600/SI: Split global vector loads with more than 4 elements
llvm-svn: 201368
2014-02-13 23:34:10 +00:00
Tom Stellard 60b6153044 R600/SI: Add ShaderType attribute to some tests
llvm-svn: 201367
2014-02-13 23:34:07 +00:00
Rafael Espindola 1f3de49f37 Use __literal16. It has been supported by the linker since 2005.
llvm-svn: 201365
2014-02-13 23:16:11 +00:00
Diego Novillo 8d9be9a4cf Simplify checks in MC/AsmParser/directive_loc.s
llvm-svn: 201361
2014-02-13 20:16:42 +00:00
Diego Novillo b1b5007c52 Fix generation of 'isa' and 'discriminator' keywords.
Summary:
There should be a space before each of these two keywords to avoid
generating invalid assembly files.

NOTE: I could not find an obvious maintainers in CODE_OWNERS.TXT, but
      this seems related to debug info.

Reviewers: echristo

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2791

llvm-svn: 201359
2014-02-13 20:05:03 +00:00
NAKAMURA Takumi f71dc448b9 Tweak llvm/test/DebugInfo/X86/generate-odr-hash.ll corresponding to r201351 (Revert r201187).
llvm-svn: 201355
2014-02-13 18:28:28 +00:00
NAKAMURA Takumi 4f2a067df1 [PR18809] Revert r201187, "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded"
It really crashes cygwin's stage2 configure with "clang -g".

llvm-svn: 201351
2014-02-13 18:18:56 +00:00
Rafael Espindola 8459762c88 Add triples to try to fix the windows bots.
llvm-svn: 201345
2014-02-13 16:49:47 +00:00
Rafael Espindola 98f0cfdd90 .file is only available on ELF, use a triple instead of -march.
llvm-svn: 201337
2014-02-13 15:38:16 +00:00
Rafael Espindola e6f37b908a "foo" is not a ppc instruction, don't try to parse it.
llvm-svn: 201336
2014-02-13 15:33:35 +00:00
Rafael Espindola 432acf5cef Specify a triple. MachO AArch64 support is missing.
llvm-svn: 201335
2014-02-13 15:30:06 +00:00
Daniel Sanders 753e17629d Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for
targets with mature MC support. Such targets will always parse the inline
assembly (even when emitting assembly). Targets without mature MC support
continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced
with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler
to parse inline assembly (even when emitting assembly output). UseIntegratedAs
is set to true for targets that consider any failure to parse valid assembly
to be a bug. Target specific subclasses generally enable the integrated
assembler in their constructor. The default value can be overridden with
-no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example,
those that use mnemonics such as 'foo' or 'hello world') have been updated to
disable the integrated assembler.

Changes since review (and last commit attempt):
- Fixed test failures that were missed due to configuration of local build.
  (fixes crash.ll and a couple others).
- Fixed tests that happened to pass because the local build was on X86
  (should fix 2007-12-17-InvokeAsm.ll)
- mature-mc-support.ll's should no longer require all targets to be compiled.
  (should fix ARM and PPC buildbots)
- Object output (-filetype=obj and similar) now forces the integrated assembler
  to be enabled regardless of default setting or -no-integrated-as.
  (should fix SystemZ buildbots)

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201333
2014-02-13 14:44:26 +00:00
NAKAMURA Takumi 6fe94621b9 llvm/test/CodeGen/AArch64/cpus.ll: Tweak to use -mtriple=aarch64-unknown-unknown, or this would crash for targeting pecoff like *-mingw32.
llvm-svn: 201315
2014-02-13 11:06:23 +00:00
Tim Northover 914af6273b ARM: remove floating-point patterns for @llvm.arm.neon.vabs
The front-end is now generating the generic @llvm.fabs for this
operation now, so the extra patterns are no longer needed.

llvm-svn: 201314
2014-02-13 10:44:30 +00:00
Oliver Stannard 5bbb72f37e Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend
llvm-svn: 201305
2014-02-13 09:46:11 +00:00
Hao Liu 7b6dfcf06a [AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types.
As this problems are similar to shl/sra/srl, also add patterns for shift nodes.

llvm-svn: 201298
2014-02-13 05:42:33 +00:00
Rafael Espindola 75ec01f3de Copy dll storage in copyAttributes.
llvm-svn: 201295
2014-02-13 05:11:35 +00:00
Juergen Ributzka 2b97f9b211 [DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder.
This fix checks the original LLVM IR node to identify opaque constants by
looking for the bitcast-constant pattern. Originally we looked at the generated
SDNode, but this might lead to incorrect results. The SDNode could have been
generated by an constant expression that was folded to a constant.

This fixes <rdar://problem/16050719>

llvm-svn: 201291
2014-02-13 04:19:26 +00:00
Hao Liu 4f345f3c03 [AArch64]Add support for spilling FPR8/FPR16.
llvm-svn: 201287
2014-02-13 02:36:58 +00:00
Reid Kleckner 22b19da9fc GlobalOpt: Aliases don't have sections, don't copy them when replacing
As defined in LangRef, aliases do not have sections.  However, LLVM's
GlobalAlias class inherits from GlobalValue, which means we can read and
set its section.  We should probably ban that as a separate change,
since it doesn't make much sense for an alias to have a section that
differs from its aliasee.

Fixes PR18757, where the section was being lost on the global in code
from Clang like:

extern "C" {
__attribute__((used, section("CUSTOM"))) static int in_custom_section;
}

Reviewers: rafael.espindola

Differential Revision: http://llvm-reviews.chandlerc.com/D2758

llvm-svn: 201286
2014-02-13 02:18:36 +00:00
Owen Anderson 883b5add8e Remove a very old instcombine where we would turn sequences of selects into
logical operations on the i1's driving them.  This is a bad idea for every
target I can think of (confirmed with micro tests on all of: x86-64, ARM,
AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into
a general purpose register, whereas consuming it directly into a select generally
allows it to exist only transiently in a predicate or flags register.

Chandler ran a set of performance tests with this change, and reported no
measurable change on x86-64.

llvm-svn: 201275
2014-02-12 23:54:07 +00:00
Andrea Di Biagio b7882b3bd1 [Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called
'OK_NonUniformConstValue' to identify operands which are constants but
not constant splats.

The cost model now allows returning 'OK_NonUniformConstValue'
for non splat operands that are instances of ConstantVector or
ConstantDataVector.

With this change, targets are now able to compute different costs
for instructions with non-uniform constant operands.
For example, On X86 the cost of a vector shift may vary depending on whether
the second operand is a uniform or non-uniform constant.

This patch applies the following changes:
 - The cost model computation now takes into account non-uniform constants;
 - The cost of vector shift instructions has been improved in
   X86TargetTransformInfo analysis pass;
 - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish
   between non-uniform and uniform constant operands.

Added a new test to verify that the output of opt
'-cost-model -analyze' is valid in the following configurations: SSE2,
SSE4.1, AVX, AVX2.

llvm-svn: 201272
2014-02-12 23:43:47 +00:00
Andrea Di Biagio 386d566395 [X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.
Instead of expanding a packed shift into a sequence of scalar shifts,
the backend now tries (when possible) to convert the vector shift into a
vector multiply.

Before this change, a shift of a MVT::v8i16 vector by a
build_vector of constants was always scalarized into a long sequence of "vector
extracts + scalar shifts + vector insert".
With this change, if there is SSE2 support, we emit a single vector multiply.

This change also affects SSE4.1, AVX, AVX2 shifts:
 - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants
is now lowered when possible into a single SSE4.1 vector multiply.
 - Packed v16i16 shift left by constant build_vector are now expanded when
possible into a single AVX2 vpmullw.
This change also improves the lowering of AVX512f vector shifts.

Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected
by this change.

llvm-svn: 201271
2014-02-12 23:42:28 +00:00
David Blaikie 3c7e961b29 DebugInfo: Demonstrate that we're not currently uniquing address table entries in fission
Since I just discovered this while poking at other things, here's the
test case so I have it to come back to later.

llvm-svn: 201267
2014-02-12 23:03:54 +00:00
David Blaikie 23afec97ba DebugInfo: Merge fission and non-fission (and 32 and 64 bit) tests for TLS support.
llvm-svn: 201266
2014-02-12 23:03:51 +00:00
Adrian Prantl 7199fd532c Debug info: Bugfix for r201190: DW_OP_piece takes bytes, not bits.
rdar://problem/16015314

llvm-svn: 201253
2014-02-12 19:34:44 +00:00
Akira Hatanaka a07ffb5b31 Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to
preserve branch probability information.

<rdar://problem/15893208>

llvm-svn: 201245
2014-02-12 18:09:18 +00:00
Renato Golin c2ae1b1b41 PC-rel implemented in AArch64, test now pass
llvm-svn: 201243
2014-02-12 17:17:41 +00:00
Daniel Sanders abe212a3b8 Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
It introduced multiple test failures in the buildbots.

llvm-svn: 201241
2014-02-12 15:39:20 +00:00
Daniel Sanders a7d504cf58 Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201237
2014-02-12 14:44:54 +00:00
NAKAMURA Takumi 2dfe92e010 [PR18809] Mark DebugInfo/empty.ll as XFAIL:cygming.
llvm-svn: 201211
2014-02-12 07:15:05 +00:00
Craig Topper 522af29465 Test case I forgot to 'add' for r201126.
llvm-svn: 201207
2014-02-12 03:58:47 +00:00
David Blaikie 5b85858b77 DwarfUnit: Include type unit's file strings in the defining compile unit's file_names table
There's still one piece missing here, which is adding the
DW_AT_stmt_list to the type unit that refer's to the compile unit's line
table. Working on that.

llvm-svn: 201198
2014-02-12 00:40:47 +00:00
Evan Cheng 57add3e4ee Tweak ARM fastcc by adopting these two AAPCS rules:
* CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers
* When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable

The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g.
7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause
stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various
calling conventions.

rdar://16039676

llvm-svn: 201193
2014-02-11 23:49:31 +00:00
Adrian Prantl cbcd578f0c Reapply r201180 with an additional error path.
Debug info: Emit values in subregisters that do not have a separate
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.

rdar://problem/16015314

llvm-svn: 201190
2014-02-11 22:22:15 +00:00
Adrian Prantl 80b6fd02fa Revert "Debug info: Emit values in subregisters that do not have a separate"
This reverts commit r201179 for buildbot breakage.

llvm-svn: 201188
2014-02-11 22:03:30 +00:00
David Blaikie 284cfc1089 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
This comes up in empty files or files containing #file directives that
never reference the actual source file name. Came up in a small test of
line tables I was playing with.

llvm-svn: 201187
2014-02-11 21:49:46 +00:00
David Blaikie 6bd395f3f0 DebugInfo: Remove dependence on file numbering in the line table.
These tests were unnecessarily sensitive to the presence and ordering of
elements in the line table file_names list which will break on a future
change I'm working on.

llvm-svn: 201185
2014-02-11 21:46:46 +00:00
Adrian Prantl a83cc8a356 Debug info: Emit values in subregisters that do not have a separate
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.

rdar://problem/16015314

llvm-svn: 201180
2014-02-11 21:22:59 +00:00
Matt Arsenault 71b71d25eb R600/SI: Fix assertion on infinite loops.
This isn't the most useful case to fix in the real world,
but bugpoint runs into this.

llvm-svn: 201177
2014-02-11 21:12:38 +00:00
Benjamin Kramer 94fc18d040 InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2.
This happens in bitfield code. While there reorganize the existing code
a bit.

llvm-svn: 201176
2014-02-11 21:09:03 +00:00
Jim Grosbach 7422074df0 Tidy up a bit. Formatting only.
llvm-svn: 201174
2014-02-11 20:48:41 +00:00
Jim Grosbach 8bfcb735fa ARM: Thumb2 LDR(literal) can target SP.
Fix a slightly overzealous destination register restriction for the
'without .w' alias. Add some explicit testcases.

rdar://16033140

llvm-svn: 201173
2014-02-11 20:48:39 +00:00
Benjamin Kramer 5a188549ad ScalarEvolution: Analyze trip count of loops with a switch guarding the exit.
llvm-svn: 201159
2014-02-11 15:44:32 +00:00
Robert Lougher 7d9084ffa1 Teach the DAGCombiner how to fold concat_vector nodes when the input is two
BUILD_VECTOR nodes, e.g.:

(concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
->
(BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)

This fixes an issue with AVX, where a sequence was not recognized as a 256-bit
vbroadcast due to the concat_vectors.

llvm-svn: 201158
2014-02-11 15:42:46 +00:00
Chandler Carruth fc25854b09 [LPM] Switch LICM to actively use LCSSA in addition to preserving it.
Fixes PR18753 and PR18782.

This is necessary for LICM to preserve LCSSA correctly and efficiently.
There is still some active discussion about whether we should be using
LCSSA, but we can't just immediately stop using it and we *need* LICM to
preserve it while we are using it. We can restore the old SSAUpdater
driven code if and when there is a serious effort to remove the reliance
on LCSSA from all of the loop passes.

However, this also serves as a great example of why LCSSA is very nice
to have. This change significantly simplifies the process of sinking
instructions for LICM, and makes it quite a bit less expensive.

It wouldn't even be as complex as it is except that I had to start the
process of removing the big recursive LCSSA formation hammer in order to
switch even this much of the re-forming code to asserting that LCSSA was
preserved. I'll fully remove that next just to tidy things up until the
LCSSA debate settles one way or the other.

llvm-svn: 201148
2014-02-11 12:52:27 +00:00
Robert Lytton 70b5ba49c3 XCore target: fix const section handling
Xcore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.

Clang now emits ".cp.rodata" section information.

All other externally visible constant data will be placed in the DP section.

llvm-svn: 201144
2014-02-11 10:36:26 +00:00
Robert Lytton 9b6bb461b1 XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE
llvm-svn: 201143
2014-02-11 10:36:18 +00:00
Elena Demikhovsky 1f32c313f1 AVX: fixed a bug in LowerVECTOR_SHUFFLE
llvm-svn: 201140
2014-02-11 10:21:53 +00:00
Elena Demikhovsky 2aafc22ed9 AVX-512: Optimized BUILD_VECTOR pattern;
fixed encoding of VEXTRACTPS instruction.

llvm-svn: 201134
2014-02-11 07:25:59 +00:00
Quentin Colombet 64ca544d95 [CodeGenPrepare] Test case for the promotions that bypass the
profitability check due to some other checks in the addressing
mode matcher. I.e., test case for commit r201121.

<rdar://problem/16020230>

llvm-svn: 201132
2014-02-11 06:55:43 +00:00
Tom Stellard 5d7aaaed7d R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used
DS instructions that access local memory can only uses addresses that
are less than or equal to the value of M0.  When M0 is uninitialized,
then we experience undefined behavior.

This patch also changes the behavior to emit S_WQM_B64 on pixel shaders
no matter what kind of DS instruction is used.

llvm-svn: 201097
2014-02-10 16:58:30 +00:00
Tim Northover b0430415e6 ARM: use natural LLVM IR for vshll instructions
Similarly to the vshrn instructions, these are simple zext/sext + trunc
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.

llvm-svn: 201093
2014-02-10 16:20:29 +00:00
Chad Rosier bcde0c49cb [AArch64] Handle aliases of conditional branches without b.pred form.
llvm-svn: 201091
2014-02-10 15:43:11 +00:00
Oliver Stannard 8dcaa761a2 ARM: r12 is callee-saved for interrupt handlers
For A- and R-class processors, r12 is not normally callee-saved, but is for
interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker".

llvm-svn: 201089
2014-02-10 14:24:23 +00:00
Tim Northover 170daafe01 ARM: use LLVM IR to represent the vshrn operation
vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.

llvm-svn: 201085
2014-02-10 14:04:07 +00:00
Robert Lougher 48ee75b7e3 Test commit - added a new line to vec_shuf-insert.ll.
llvm-svn: 201083
2014-02-10 12:42:13 +00:00
Matheus Almeida 4b27eb588c [mips][msa] Add DLSA instruction.
llvm-svn: 201081
2014-02-10 12:05:17 +00:00
Matheus Almeida b4133b25e7 [mips][msa] Update FileCheck prefix in preparation for
the addition of Mips64 tests.

No functional changes.

llvm-svn: 201080
2014-02-10 11:30:09 +00:00
Kostya Serebryany 8baa386670 [asan] support for FreeBSD, LLVM part. patch by Viktor Kutuzov
llvm-svn: 201067
2014-02-10 07:37:04 +00:00
Elena Demikhovsky 9f423d6f25 AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors.
llvm-svn: 201066
2014-02-10 07:02:39 +00:00
Craig Topper a0869dceea Recommit r201059 and r201060 with hopefully a fix for its original failure.
Original commits messages:

Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.

Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information.

llvm-svn: 201065
2014-02-10 06:55:41 +00:00
Hao Liu 6e73761dc8 [AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg.
llvm-svn: 201061
2014-02-10 03:16:22 +00:00
Benjamin Kramer 9d94a4eee9 AsmParser: Parse (and ignore) nested .macro definitions.
This enables a slightly odd feature of gas. The macro is defined when
the outermost macro is instantiated.

PR18599

llvm-svn: 201045
2014-02-09 16:22:00 +00:00
Saleem Abdulrasool 40726a1c11 ARM: change attribute tests to use parsed form
This makes the tests more readable by using the -arm-attributes decoding support
in llvm-readobj since that is now available.  Change the invocation commands to
be similar to other test and use a more precise triple (the tests only require
ARM EABI support).

llvm-svn: 201029
2014-02-08 23:17:02 +00:00
Arnold Schwaighofer 348e1b60be LoopVectorizer: Keep track of conditional store basic blocks
Before conditional store vectorization/unrolling we had only one
vectorized/unrolled basic block. After adding support for conditional store
vectorization this will not only be one block but multiple basic blocks. The
last block would have the back-edge. I updated the code to use a vector of basic
blocks instead of a single basic block and fixed the users to use the last entry
in this vector. But, I forgot to add the basic blocks to this vector!

Fixes PR18724.

llvm-svn: 201028
2014-02-08 20:41:13 +00:00
Juergen Ributzka 9479b31f97 [Constant Hoisting] Fix insertion point for constant materialization.
The bitcast instruction during constant materialization was not placed correcly
in the presence of phi nodes. This commit fixes the insertion point to be in the
idom instead.

This fixes PR18768

llvm-svn: 201009
2014-02-08 00:20:49 +00:00
Rafael Espindola 5054362920 Always create a temporary symbol to use with the cfi frame.
This is a small simplification and a small step in fixing pr18743 since
private functions on MachO should be using a 'l' prefix.

llvm-svn: 200994
2014-02-07 21:23:18 +00:00
Rafael Espindola e393aab13b Use FileCheck variables to simplify this test.
llvm-svn: 200992
2014-02-07 21:11:33 +00:00
Renato Golin 3dc5ade8bb Fix Darwin bots from EHABI change
llvm-svn: 200990
2014-02-07 20:32:32 +00:00
Matt Arsenault 9ad27e8d76 R600/SI: Add failing test for 3 x i64 vectors.
Stores of <4 x i64> do work (although they do expand to 4 stores
instead of 2), but 3 x i64 vectors fail to select.

llvm-svn: 200989
2014-02-07 20:29:40 +00:00
Renato Golin 78a6eba862 Remove -arm-disable-ehabi option
llvm-svn: 200988
2014-02-07 20:12:49 +00:00
Rafael Espindola 66f273be34 Don't internalize linkonce_odr non constant variables.
llvm-svn: 200983
2014-02-07 19:04:43 +00:00
Sasa Stankovic 4c80bdae72 [mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl.
Differential Revision: http://llvm-reviews.chandlerc.com/D2694

llvm-svn: 200978
2014-02-07 17:16:40 +00:00
Rafael Espindola 61acf5d9b0 Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it.
Thanks to John McCall for noticing it.

llvm-svn: 200977
2014-02-07 16:21:30 +00:00
Oliver Stannard 1dc1034218 LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.

I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.

llvm-svn: 200970
2014-02-07 11:19:53 +00:00
Sasa Stankovic 6a284eecec Changed comment.
llvm-svn: 200969
2014-02-07 11:16:02 +00:00
Venkatraman Govindaraju de98fae368 [Sparc] Add support for parsing synthetic instruction 'mov'.
llvm-svn: 200965
2014-02-07 09:06:52 +00:00
Venkatraman Govindaraju ced9226b0f [Sparc] Emit correct encoding for atomic instructions. Also, add support for parsing CAS instructions to test the CAS encoding.
llvm-svn: 200963
2014-02-07 07:34:49 +00:00
Venkatraman Govindaraju fd07500dd1 [Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used.
llvm-svn: 200962
2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju 104643d0aa [Sparc] Emit correct relocations for PIC code when integrated assembler is used.
llvm-svn: 200961
2014-02-07 04:24:35 +00:00
Manman Ren 37c9267107 PGO branch weight: fix PR18752.
Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors
by getting the weights before removing a successor.

llvm-svn: 200958
2014-02-07 00:38:56 +00:00
Jim Grosbach e9008de652 X86: Resolve a long standing FIXME and properly isel pextr[bw].
Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957
2014-02-07 00:16:33 +00:00
Rafael Espindola 803fb10874 Convert test to FileCheck.
llvm-svn: 200955
2014-02-06 23:35:22 +00:00
Quentin Colombet 3a4bf0405e [CodeGenPrepare] Move away sign extensions that get in the way of addressing
mode.

Basically the idea is to transform code like this:
%idx = add nsw i32 %a, 1
%sextidx = sext i32 %idx to i64
%gep = gep i8* %myArray, i64 %sextidx
load i8* %gep

Into:
%sexta = sext i32 %a to i64
%idx = add nsw i64 %sexta, 1
%gep = gep i8* %myArray, i64 %idx
load i8* %gep

That way the computation can be folded into the addressing mode.

This transformation is done as part of the addressing mode matcher.
If the matching fails (not profitable, addressing mode not legal, etc.), the
matcher will revert the related promotions.

<rdar://problem/15519855>

llvm-svn: 200947
2014-02-06 21:44:56 +00:00
Tom Stellard e236794578 R600/SI: Add a MUBUF store pattern for Reg+Imm offsets
llvm-svn: 200935
2014-02-06 18:36:41 +00:00
Tom Stellard 2937cbc005 R600/SI: Add a MUBUF store pattern for Imm offsets
llvm-svn: 200934
2014-02-06 18:36:39 +00:00
Tom Stellard 11624bc577 R600/SI: Add a MUBUF load pattern for Reg+Imm offsets
llvm-svn: 200933
2014-02-06 18:36:38 +00:00
Tom Stellard 044e418f15 R600/SI: Use immediates offsets for SMRD instructions whenever possible
There was a problem with the old pattern, so we were copying some
larger immediates into registers when we could have been encoding
them in the instruction.

llvm-svn: 200932
2014-02-06 18:36:34 +00:00
Tim Northover f0e21616f3 X86: add costs for 64-bit vector ext/trunc & rebalance
The most important part of this is probably adding any cost at all for
operations like zext <8 x i8> to <8 x i32>. Before they were being
recorded as extremely costly (24, I believe) which made LLVM fall back
on a 4-wide vectorisation of a loop.

It also rebalances the values for sext, zext and trunc. Lacking any
other sane metric that might work across CPU microarchitectures I went
for instructions. This seems to be in reasonable accord with the rest
of the table (sitofp, ...) though no doubt at least one value is
sub-optimal for some bizarre reason.

Finally, separate AVX and AVX2 values are provided where appropriate.
The CodeGen is quite different in many cases.

rdar://problem/15981990

llvm-svn: 200928
2014-02-06 18:18:36 +00:00
Nick Lewycky 993849490e A memcpy out of an fresh alloca is a no-op, delete it. Patch by Patrick Walton!
llvm-svn: 200907
2014-02-06 06:29:19 +00:00
Chandler Carruth bf71a34eb9 [PM] Add a new "lazy" call graph analysis pass for the new pass manager.
The primary motivation for this pass is to separate the call graph
analysis used by the new pass manager's CGSCC pass management from the
existing call graph analysis pass. That analysis pass is (somewhat
unfortunately) over-constrained by the existing CallGraphSCCPassManager
requirements. Those requirements make it *really* hard to cleanly layer
the needed functionality for the new pass manager on top of the existing
analysis.

However, there are also a bunch of things that the pass manager would
specifically benefit from doing differently from the existing call graph
analysis, and this new implementation tries to address several of them:

- Be lazy about scanning function definitions. The existing pass eagerly
  scans the entire module to build the initial graph. This new pass is
  significantly more lazy, and I plan to push this even further to
  maximize locality during CGSCC walks.
- Don't use a single synthetic node to partition functions with an
  indirect call from functions whose address is taken. This node creates
  a huge choke-point which would preclude good parallelization across
  the fanout of the SCC graph when we got to the point of looking at
  such changes to LLVM.
- Use a memory dense and lightweight representation of the call graph
  rather than value handles and tracking call instructions. This will
  require explicit update calls instead of some updates working
  transparently, but should end up being significantly more efficient.
  The explicit update calls ended up being needed in many cases for the
  existing call graph so we don't really lose anything.
- Doesn't explicitly model SCCs and thus doesn't provide an "identity"
  for an SCC which is stable across updates. This is essential for the
  new pass manager to work correctly.
- Only form the graph necessary for traversing all of the functions in
  an SCC friendly order. This is a much simpler graph structure and
  should be more memory dense. It does limit the ways in which it is
  appropriate to use this analysis. I wish I had a better name than
  "call graph". I've commented extensively this aspect.

This is still very much a WIP, in fact it is really just the initial
bits. But it is about the fourth version of the initial bits that I've
implemented with each of the others running into really frustrating
problms. This looks like it will actually work and I'd like to split the
actual complexity across commits for the sake of my reviewers. =] The
rest of the implementation along with lots of wiring will follow
somewhat more rapidly now that there is a good path forward.

Naturally, this doesn't impact any of the existing optimizer. This code
is specific to the new pass manager.

A bunch of thanks are deserved for the various folks that have helped
with the design of this, especially Nick Lewycky who actually sat with
me to go through the fundamentals of the final version here.

llvm-svn: 200903
2014-02-06 04:37:03 +00:00
Juergen Ributzka fa0eba6c8b [DAG] Don't pull the binary operation though the shift if the operands have opaque constants.
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.

llvm-svn: 200900
2014-02-06 04:09:06 +00:00
Manman Ren d461244972 Set default of inlinecold-threshold to 225.
225 is the default value of inline-threshold. This change will make sure
we have the same inlining behavior as prior to r200886.

As Chandler points out, even though we don't have code in our testing
suite that uses cold attribute, there are larger applications that do
use cold attribute.

r200886 + this commit intend to keep the same behavior as prior to r200886.
We can later on tune the inlinecold-threshold.

The main purpose of r200886 is to help performance of instrumentation based
PGO before we actually hook up inliner with analysis passes such as BPI and BFI.
For instrumentation based PGO, we try to increase inlining of hot functions and
reduce inlining of cold functions by setting inlinecold-threshold.

Another option suggested by Chandler is to use a boolean flag that controls
if we should use OptSizeThreshold for cold functions. The default value
of the boolean flag should not change the current behavior. But it gives us
less freedom in controlling inlining of cold functions.

llvm-svn: 200898
2014-02-06 01:59:22 +00:00
Kevin Enderby d6b107136a Update the X86 assembler for .intel_syntax to accept
the << and >> bitwise operators.

rdar://15975725

llvm-svn: 200896
2014-02-06 01:21:15 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Manman Ren e8781b1a36 Inliner uses a smaller inline threshold for callees with cold attribute.
Added command line option inlinecold-threshold to set threshold for inlining
functions with cold attribute. Listen to the cold attribute when it would
decrease the inline threshold.

llvm-svn: 200886
2014-02-05 22:53:44 +00:00
Quentin Colombet 87769713cf [RegAlloc] Add a last chance recoloring mechanism when everything else failed to
find a register.

The idea is to choose a color for the variable that cannot be allocated and
recolor its interferences around. Unlike the current register allocation scheme,
it is allowed to change the color of an already assigned (but maybe not
splittable or spillable) live interval while propagating this change to its
neighbors.
In other word, there are two things that may help finding an available color:
- Already assigned variables (RS_Done) can be recolored to different color.
- The recoloring allows to catch solutions that needs to touch more that just
  the neighbors of the current allocated variable.

E.g.,
vA can use {R1, R2    }
vB can use {    R2, R3}
vC can use {R1        }
Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
they all interfere.

vA is assigned R1
vB is assigned R2
vC tries to evict vA but vA is already done.
=> Regular register allocation heuristic fails.

Last chance recoloring kicks in:
vC does as if vA was evicted => vC uses R1.
vC is marked as fixed.
vA needs to find a color.
None are available.
vA cannot evict vC: vC is a fixed virtual register now.
vA does as if vB was evicted => vA uses R2.
vB needs to find a color.
R3 is available.
Recoloring => vC = R1, vA = R2, vB = R3.

<rdar://problem/15947839>

llvm-svn: 200883
2014-02-05 22:13:59 +00:00
Rafael Espindola b4eec1daa1 Remove support for not using .loc directives.
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
2014-02-05 18:00:21 +00:00
Petar Jovanovic 9725016af3 [mips] Add NaCl target and forbid indexed loads and stores for it
This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2690

llvm-svn: 200855
2014-02-05 17:19:30 +00:00
Petar Jovanovic f387387835 mips: XFAIL non-extern-addend-smallcodemodel test
Small code model (and default reloc model) set Reloc::PIC_ in this test,
and PIC is not yet supported in MCJIT for MIPS.

llvm-svn: 200852
2014-02-05 16:47:59 +00:00
Elena Demikhovsky 0b79be8ab2 AVX-512: optimized icmp -> sext -> icmp pattern
llvm-svn: 200849
2014-02-05 16:17:36 +00:00
Logan Chien d5c48aa3d3 ARM: Resolve thumb_bl fixup in same MCFragment.
In Thumb1 mode, bl instruction might be selected for branches between
basic blocks in the function if the offset is greater than 2KB.
However, this might cause SEGV because the destination symbol
is not marked as thumb function and the execution mode will be reset
to ARM mode.

Since we are sure that these symbols are in the same data fragment, we
can simply resolve these local symbols, and don't emit any relocation
information for this bl instruction.

llvm-svn: 200842
2014-02-05 14:15:16 +00:00
Elena Demikhovsky a38114c45e AVX-512: fixed a bug in EVEX encoding (the bug appeared after r200624)
llvm-svn: 200837
2014-02-05 13:03:01 +00:00
Michel Danzer 5d26fdfcba R600/SI: Add pattern for zero-extending i1 to i32
Fixes opencl-example if_* tests with radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200830
2014-02-05 09:48:05 +00:00
Kai Nacke 382c140567 ARM: Enable use of relocation type tlsldo in debug info for tls data.
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200826
2014-02-05 07:23:09 +00:00
Elena Demikhovsky a30e437659 AVX-512: Added intrinsic for cvtph2ps.
Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823
2014-02-05 07:05:03 +00:00
Rafael Espindola 02eac9a270 Add a test for printing absolute symbols in ELF.
llvm-svn: 200818
2014-02-05 04:36:47 +00:00
Rafael Espindola f42c58d2ef Small fix for llvm-nm handling of weak symbols on ELF (print 'v').
llvm-svn: 200808
2014-02-04 23:53:15 +00:00
Manman Ren 81f136f94a Update testing case for r200806.
llvm-svn: 200807
2014-02-04 23:53:12 +00:00
Rafael Espindola fb66ef05f7 Add a test for common symbols in coff.
llvm-svn: 200803
2014-02-04 23:18:52 +00:00
Benjamin Kramer 34f460ed29 SimplifyLibCalls: Push TLI through the exp2->ldexp transform.
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
2014-02-04 20:27:23 +00:00
Petar Jovanovic a5da588b2f [mips] Implement %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions
Patch implements %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions for MIPS
by creating target expression class MipsMCExpr.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2592

llvm-svn: 200783
2014-02-04 18:41:57 +00:00
David Peixotto b9b7362cdc Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777
2014-02-04 17:22:40 +00:00
Tom Stellard 0ec134f3d6 R600/SI: Custom lower i64 ISD::SELECT
llvm-svn: 200774
2014-02-04 17:18:40 +00:00
Tom Stellard bfebd1fc7e R600: Enable vector fpow.
The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773
2014-02-04 17:18:37 +00:00
Tim Northover 103e648d30 OS X: the correct function is __sincospif_stret, not __sincospi_stretf
rdar://problem/13729466

llvm-svn: 200771
2014-02-04 16:28:20 +00:00
Tim Northover fdbdb4b6d5 ARM & AArch64: merge NEON absolute compare intrinsics
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768
2014-02-04 14:55:42 +00:00
Justin Bogner c6af350698 llvm-cov: Implement the preserve-paths flag
Until now, when a path in a gcno file included a directory, we would
emit our .gcov file in that directory, whereas gcov always emits the
file in the current directory. In doing so, this implements gcov's
strange name-mangling -p flag, which is needed to avoid clobbering
files when two with the same name exist in different directories.

The path mangling is a bit ugly and only handles unix-like paths, but
it's simple, and it doesn't make any guesses as to how it should
behave outside of what gcov documents. If we decide this should be
cross platform later, we can consider the compatibility implications
then.

llvm-svn: 200754
2014-02-04 10:45:02 +00:00
Tim Northover e42fb07618 ARM: fix fast-isel assertion failure
Missing braces on if meant we inserted both ARM and Thumb load for a litpool
entry. This didn't end well.

rdar://problem/15959157

llvm-svn: 200752
2014-02-04 10:38:46 +00:00
Michel Danzer 624b02aa67 R600/SI: Fix fneg for 0.0
V_ADD_F32 with source modifier does not produce -0.0 for this. Just
manipulate the sign bit directly instead.

Also add a pattern for (fneg (fabs ...)).

Fixes a bunch of bit encoding piglit tests with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200743
2014-02-04 07:12:38 +00:00
Justin Bogner f69557e670 llvm-cov: Implement the object-directory flag
llvm-svn: 200741
2014-02-04 06:41:43 +00:00
Justin Bogner 93d1edbb7d llvm-cov: Ignore missing .gcda files
When gcov is run without gcda data, it acts as if the counts are all
zero and labels the file as - to indicate that there was no data. We
should do the same.

llvm-svn: 200740
2014-02-04 06:41:39 +00:00
Justin Bogner b5bff69ac6 llvm-cov: Document the llvm-cov tests
llvm-svn: 200739
2014-02-04 06:41:33 +00:00
Kai Nacke ab7ee461c2 Revert: ARM: Enable use of relocation type tlsldo in debug info for tls data.
There seems to be a new problem with the debug info in the test case.
I'll have to investigate this.

llvm-svn: 200737
2014-02-04 06:07:00 +00:00
Kai Nacke a56bb78021 Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
and remove the ToDo comment.

Reviewer: Duncan P.N. Exan Smith
llvm-svn: 200736
2014-02-04 05:55:16 +00:00
Kai Nacke 5e8c30f192 ARM: Enable use of relocation type tlsldo in debug info for tls data.
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200735
2014-02-04 05:43:09 +00:00
David Blaikie 5e390e4df7 DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0}
A bunch of test cases needed to be cleaned up for this, many my fault -
when implementid imported modules I updated test cases by simply
duplicating the prior metadata field - which wasn't always the empty
metadata entry.

llvm-svn: 200731
2014-02-04 01:23:52 +00:00
Reid Kleckner d47a59a4f8 inalloca: Don't remove dead arguments in the presence of inalloca args
It disturbs the layout of the parameters in memory and registers,
leading to problems in the backend.

The plan for optimizing internal inalloca functions going forward is to
essentially SROA the argument memory and demote any captured arguments
(things that aren't trivially written by a load or store) to an indirect
pointer to a static alloca.

llvm-svn: 200717
2014-02-03 20:42:49 +00:00
Tim Northover 24979d8e10 AArch64 & ARM: refactor crypto intrinsics to take scalars
Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.

This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.

llvm-svn: 200706
2014-02-03 17:27:49 +00:00
Hal Finkel 5c968d9440 Expand vector bswap in LegalizeVectorOps
ISD::BSWAP was missing from the list of node types that should be expanded
element-wise.

llvm-svn: 200705
2014-02-03 17:27:25 +00:00
Duncan P. N. Exon Smith 1ff08e389f Lower llvm.expect intrinsic correctly for i1
LowerExpectIntrinsic previously only understood the idiom of an expect
intrinsic followed by a comparison with zero. For llvm.expect.i1, the
comparison would be stripped by the early-cse pass.

Patch by Daniel Micay.

llvm-svn: 200664
2014-02-02 22:43:55 +00:00
Arnold Schwaighofer 17455633c7 LoopVectorizer: Enable unrolling of conditional stores and the load/store
unrolling heuristic per default

Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed
up some benchmarks while not causing any interesting regressions.

llvm-svn: 200621
2014-02-02 03:12:34 +00:00
Matt Arsenault 6e63dd27a2 Add some xfailed R600 tests for 64-bit private accesses.
llvm-svn: 200620
2014-02-02 00:13:12 +00:00
Matt Arsenault f5958dded4 R600/SI: Fix insertelement with dynamic indices.
This didn't work for any integer vectors, and didn't
work with some sizes of float vectors. This should now
work with all sizes of float and i32 vectors.

llvm-svn: 200619
2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju 52b6473d74 [Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly.
llvm-svn: 200617
2014-02-01 18:54:16 +00:00
Arnold Schwaighofer 445f7fb064 ARMTTI: We don't have 16 allocatable scalar registers
This caused an regression on libquantum after enabling the new loop vectorizer
unroll heuristics.

llvm-svn: 200616
2014-02-01 18:00:25 +00:00
David Woodhouse 6c9a6f9b3d MC: Fix .octa output for APInts with BitWidth > 128
llvm-svn: 200615
2014-02-01 16:52:33 +00:00
David Woodhouse d6de0d99c5 MC: Add support for .octa
This is a minimal implementation which accepts only constants rather than
full expressions, but that should be perfectly sufficient for all known
users for now.

Patch from PaX Team <pageexec@freemail.hu>

llvm-svn: 200614
2014-02-01 16:20:59 +00:00
Chandler Carruth 1665152cce [LPM] Apply a really big hammer to fix PR18688 by recursively reforming
LCSSA when we promote to SSA registers inside of LICM.

Currently, this is actually necessary. The promotion logic in LICM uses
SSAUpdater which doesn't understand how to place LCSSA PHI nodes.
Teaching it to do so would be a very significant undertaking. It may be
worthwhile and I've left a FIXME about this in the code as well as
starting a thread on llvmdev to try to figure out the right long-term
solution.

For now, the PR needs to be fixed. Short of using the promition
SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes,
I don't see a cleaner or cheaper way of achieving this. Fortunately,
LCSSA is relatively lazy and sparse -- it should only update
instructions which need it. We can also skip the recursive variant when
we don't promote to SSA values.

llvm-svn: 200612
2014-02-01 13:35:14 +00:00
Chandler Carruth 6b4cc8b66a [inliner] Skip debug intrinsics even earlier in computing the inline
cost so that they don't impact the vector bonus. Fundamentally, counting
unsimplified instructions is just *wrong*; it will continue to introduce
instability as things which do not generate code bizarrely impact
inlining. For example, sufficiently nested inlined functions could turn
off the vector bonus with lifetime markers just like the debug
intrinsics do. =/

This is a short-term tactical fix. Long term, I think we need to remove
the vector bonus entirely. That's a separate patch and discussion
though.

The patch to fix this provided by Dario Domizioli. I've added some
comments about the planned direction and used a heavily pruned form of
debug info intrinsics for the test case. While this debug info doesn't
work or "do" anything useful, it lets us easily test all manner of
interference easily, and I suspect this will not be the last time we
want to craft a pattern where debug info interferes with the inliner in
a problematic way.

llvm-svn: 200609
2014-02-01 10:38:17 +00:00
David Majnemer 5a67e2b1b8 Update a .fill test to use the updated semantics.
Something funny happened, this should've been part of r200606.

llvm-svn: 200607
2014-02-01 07:36:52 +00:00
David Majnemer 522d3db745 MC: Improve the .fill directive's compatibility with GAS
Per the GAS documentation, .fill should permit pattern widths that
aren't a power of two. While I was in the neighborhood, I added some
sanity checking. This change was motivated by a use of this construct
in the Linux Kernel.

llvm-svn: 200606
2014-02-01 07:19:38 +00:00
Reid Kleckner a04504fe97 Revert "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..."
This reverts commit r200576.  It broke 32-bit self-host builds by
vectorizing two calls to @llvm.bswap.i64, which we then fail to expand.

llvm-svn: 200602
2014-02-01 01:37:30 +00:00
Josh Magee 24c7f06333 [stackprotector] Implement the sspstrong rules for stack layout.
This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to
follow the extended stack layout rules for sspstrong and sspreq.

The sspstrong layout rules are:
 1. Large arrays and structures containing large arrays (>= ssp-buffer-size)
are closest to the stack protector.
 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are
2nd closest to the protector.
 3. Variables that have had their address taken are 3rd closest to the
protector.


Differential Revision: http://llvm-reviews.chandlerc.com/D2546

llvm-svn: 200601
2014-02-01 01:36:16 +00:00
Reid Kleckner f5b76518c9 Implement inalloca codegen for x86 with the new inalloca design
Calls with inalloca are lowered by skipping all stores for arguments
passed in memory and the initial stack adjustment to allocate argument
memory.

Now the frontend is responsible for the memory layout, and the backend
doesn't have to do any work.  As a result these changes are pretty
minimal.

Reviewers: echristo

Differential Revision: http://llvm-reviews.chandlerc.com/D2637

llvm-svn: 200596
2014-01-31 23:50:57 +00:00
Reid Kleckner dfbed59cc2 Don't put non-static allocas in the static alloca map
Allocas marked inalloca are never static, but we were trying to put them
into the static alloca map if they were in the entry block.  Also add an
assertion in x86 fastisel.

llvm-svn: 200593
2014-01-31 23:45:12 +00:00
Lang Hames b0bd489e4a Split out small-code-model MCJIT testcase in order to xfail for AArch64, where
PC-rel relocations aren't yet fully implemented.

llvm-svn: 200592
2014-01-31 23:36:25 +00:00
Rafael Espindola 972e71ab5a Remove another hasRawTextSupport.
To remove this one simply move the end of file logic from the asm printer to
the target mc streamer.

This removes the last call to hasRawTextSupport from lib/Target.

llvm-svn: 200590
2014-01-31 23:10:26 +00:00
Reid Kleckner edb94c70c1 Set -mcpu to make this test pass on atom bots
llvm-svn: 200588
2014-01-31 22:58:10 +00:00
Rafael Espindola e9d1102545 Mark the first dynamic elf symbol as SF_FormatSpecific.
llvm-svn: 200578
2014-01-31 21:40:13 +00:00
Lang Hames 5ec150c967 Replace X86 FMA intrinsic pseduo-instructions with def pats.
It looks like these pseudos were only used for pattern matching. Def pats are
the appropriate way to do that. As a bonus, these intrinsics will now have
memory operands folded properly, and better FMA3 variants selected where
appropriate (see r199933).

<rdar://problem/15611947>

llvm-svn: 200577
2014-01-31 21:29:19 +00:00
Chandler Carruth b3da389e30 [SLPV] Recognize vectorizable intrinsics during SLP vectorization and
transform accordingly. Based on similar code from Loop vectorization.
Subsequent commits will include vectorization of function calls to
vector intrinsics and form function calls to vector library calls.

Patch by Raul Silvera! (Much delayed due to my not running dcommit)

llvm-svn: 200576
2014-01-31 21:14:40 +00:00
David Blaikie 322d79b4a2 DebugInfo: Flag type unit references as declarations
This ensures DWARF consumers don't confuse these references for
definitions. I'd argue it might be nice to improve debuggers so we don't
need this, but it's just one field in an abbreviation anyway - so it
doesn't seem worth the fight.

llvm-svn: 200569
2014-01-31 19:52:26 +00:00
Reid Kleckner 1c843228f8 [ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret'
MSVC always places the 'this' parameter for a method first.  The
implicit 'sret' pointer for methods always comes second.  We already
implement this for __thiscall by putting sret parameters on the stack,
but __cdecl methods require putting both parameters on the stack in
opposite order.

Using a special calling convention allows frontends to keep the sret
parameter first, which avoids breaking lots of assumptions in LLVM and
Clang.

Fixes PR15768 with the corresponding change in Clang.

Reviewers: ributzka, majnemer

Differential Revision: http://llvm-reviews.chandlerc.com/D2663

llvm-svn: 200561
2014-01-31 17:41:22 +00:00
Matheus Almeida 1ace1f1236 [mips][msa] Add insert.d instruction.
This instruction is only available on Mips64 cores that implement the MSA ASE.

llvm-svn: 200543
2014-01-31 13:31:20 +00:00
Matheus Almeida 8114cf70aa Update FileCheck prefixes in preparation for the addition of Mips64 MSA tests.
No functional changes.

llvm-svn: 200541
2014-01-31 13:05:56 +00:00
Chandler Carruth c12224cb93 [vectorizer] Tweak the way we do small loop runtime unrolling in the
loop vectorizer to not do so when runtime pointer checks are needed and
share code with the new (not yet enabled) load/store saturation runtime
unrolling. Also ensure that we only consider the runtime checks when the
loop hasn't already been vectorized. If it has, the runtime check cost
has already been paid.

I've fleshed out a test case to cover the scalar unrolling as well as
the vector unrolling and comment clearly why we are or aren't following
the pattern.

llvm-svn: 200530
2014-01-31 10:51:08 +00:00
Craig Topper 327f13b43b Move address override handling in X86MCCodeEmitter to a place where it works for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode.
llvm-svn: 200516
2014-01-31 05:33:45 +00:00
Manman Ren 413a6cb42b This patch teaches the DAGCombiner how to fold insert_subvector nodes
when the input is a concat_vectors and the insert replaces one of the
concat halves:

Lower half: fold (insert_subvector (concat_vectors X, Y), Z) ->
(concat_vectors Z, Y)
Upper half: fold (insert_subvector (concat_vectors X, Y), Z) ->
(concat_vectors X, Z)

This can be seen with the following IR:

define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x
float> %v3) {
  %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32
0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x
float> %1, <4 x float> %v3, i8 0)

The vinsertf128 intrinsic is converted into an insert_subvector node
in SelectionDAGBuilder.cpp.

Using AVX, without the patch this generates two vinsertf128 instructions:

vinsertf128 $1, %xmm1, %ymm0, %ymm0
vinsertf128 $0, %xmm2, %ymm0, %ymm0

With the patch this is optimized into:

vinsertf128 $1, %xmm1, %ymm2, %ymm0

Patch by Robert Lougher.

llvm-svn: 200506
2014-01-31 01:10:35 +00:00
Manman Ren 4ece7452ba PGO branch weight: update edge weights in SelectionDAGBuilder.
When converting from "or + br" to two branches, or converting from
"and + br" to two branches, we correctly update the edge weights of
the two branches.

The previous attempt at r200431 was reverted at r200434 because of
two testing case failures. I modified my patch a little, but forgot
to re-run "make check-all".

Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of
the patch's impact on branch probability which causes changes in
spill placement.

llvm-svn: 200502
2014-01-31 00:42:44 +00:00
Matt Arsenault ee364ee729 Allow speculating llvm.sqrt, fma and fmuladd
This doesn't set errno, so this should be OK.
Also update the documentation to explicitly state
that errno are not set.

llvm-svn: 200501
2014-01-31 00:09:00 +00:00
Timur Iskhodzhanov f33d8b979b Add a link to a bug to a couple of FIXMEs
llvm-svn: 200500
2014-01-30 23:14:38 +00:00
David Woodhouse 0b6c94909e [x86] Fix signed relocations for i64i32imm operands
These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32.
Kill the horrid and incomplete special case and FIXME in
EncodeInstruction() and set things up so it can infer the signedness
from the ImmType just like it can the size and whether it's PC-relative.

llvm-svn: 200495
2014-01-30 22:20:41 +00:00
Chad Rosier fe5ab2f5ba [AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types.
llvm-svn: 200491
2014-01-30 21:46:54 +00:00
Timur Iskhodzhanov 3e4ac4eeca Fix PR18381 - print a minimal diagnostic rather than assert on unresolved .secidx target
llvm-svn: 200490
2014-01-30 21:13:05 +00:00
Rafael Espindola 196666cf5f Only ELF has a dynamic symbol table. Remove it from ObjectFile.
COFF has only one symbol table.
MachO has a LC_DYSYMTAB, but that is not a symbol table, just extra info about
the one symbol table (LC_SYMTAB).
IR (coming soon) also has only one table.

llvm-svn: 200488
2014-01-30 20:45:33 +00:00
Rafael Espindola 41186dd288 This has been fixed.
llvm-svn: 200487
2014-01-30 20:29:25 +00:00
Juergen Ributzka fb4d648295 [Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic.
Re-applying the patch, but this time without using AsmPrinter methods.

Reviewed by Andy

llvm-svn: 200481
2014-01-30 18:58:27 +00:00
Timur Iskhodzhanov 4a83bf168f Explicitly specify the CPU to avoid Atom-specific assembly mismatch
llvm-svn: 200473
2014-01-30 17:53:45 +00:00
Evgeniy Stepanov 02bc78b9cd Reenable ARM EHABI on Android.
Broken in r200388.

llvm-svn: 200466
2014-01-30 14:18:25 +00:00
Jakob Stoklund Olesen ef1d59a175 Implement SPARCv9 atomic_swap_64 with a pseudo.
The SWAP instruction only exists in a 32-bit variant, but the 64-bit
atomic swap can be implemented in terms of CASX, like the other atomic
rmw primitives.

llvm-svn: 200453
2014-01-30 04:48:46 +00:00
Saleem Abdulrasool 4c4789be5e ARM IAS: support .object_arch
The .object_arch directive indicates an alternative architecture to be specified
in the object file.  The directive does *not* effect the enabled feature bits
for the object file generation.  This is particularly useful when the code
performs runtime detection and would like to indicate a lower architecture as
the requirements than the actual instructions used.

llvm-svn: 200451
2014-01-30 04:46:41 +00:00
Saleem Abdulrasool 15d16d809b tools: add support for decoding ARM attributes
Enhance the ARM specific parsing support in llvm-readobj to support attributes.
This allows for simpler tests to validate encoding of the build attributes as
specified in the ARM ELF specification.

llvm-svn: 200450
2014-01-30 04:46:33 +00:00
Saleem Abdulrasool 5d962d31f1 ARM IAS: support .movsp
.movsp is an ARM unwinding directive that indicates to the unwinder that a
register contains an offset from the current stack pointer.  If the offset is
unspecified, it defaults to zero.

llvm-svn: 200449
2014-01-30 04:46:24 +00:00
Saleem Abdulrasool 56e06e8640 ARM: suuport .tlsdescseq directive
This enhances the ARMAsmParser to handle .tlsdescseq directives.  This is a
slightly special relocation.  We must be able to generate them, but not consume
them in assembly.  The relocation is meant to assist the linker in generating a
TLS descriptor sequence.  The ELF target streamer is enhanced to append
additional fixups into the current segment and that is used to emit the new
R_ARM_TLS_DESCSEQ relocations.

llvm-svn: 200448
2014-01-30 04:02:47 +00:00
Saleem Abdulrasool a3f12bdeec ARM: support TLS descriptor relocations
Add support for tlsdesc relocations which are part of the ABI, marked as
experimental.  These relocations permit the linker to perform TLS reference
optimizations.

llvm-svn: 200447
2014-01-30 04:02:38 +00:00
Saleem Abdulrasool 6e00ca887e ARM: support tlscall relocations
This adds support for TLS CALL relocations.  TLS CALL relocations are used to
indicate to the linker to generate appropriate entries to resolve TLS references
via an appropriate function invocation (e.g. __tls_get_addr(PLT)).

In order to accomodate the linker relaxation of the TLS access model for the
references (GD/LD -> IE, IE -> LE), the relocation addend must be incomplete.
This requires that the partial inplace value is also incomplete (i.e. 0).  We
simply avoid the offset value calculation at the time of the fixup adjustment in
the ARM assembler backend.

llvm-svn: 200446
2014-01-30 04:02:31 +00:00
Juergen Ributzka f6f0ce903e Revert "[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic."
This reverts commit r200444 to unbreak buildbots.

llvm-svn: 200445
2014-01-30 03:34:02 +00:00
Juergen Ributzka aece7583a7 [Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic.
Reviewed by Andy

llvm-svn: 200444
2014-01-30 03:06:14 +00:00
Timur Iskhodzhanov f166f6c8d0 Reland r200340 - 'Add line table debug info to COFF files when using a win32 triple'
This incorporates a couple of fixes reviewed at http://llvm-reviews.chandlerc.com/D2651

llvm-svn: 200440
2014-01-30 01:39:17 +00:00
Manman Ren 7407e0e31c Revert r200431 due to bot failures.
llvm-svn: 200434
2014-01-30 00:53:27 +00:00
Rafael Espindola 9fa52155a9 Fix TLS handling in ELF's getAddress and llvm-nm to print 'D' for it.
llvm-svn: 200433
2014-01-30 00:42:30 +00:00
Manman Ren 104e0c80cc PGO branch weight: update edge weights in SelectionDAGBuilder.
When converting from "or + br" to two branches, or converting from
"and + br" to two branches, we correctly update the edge weights of
the two branches.

llvm-svn: 200431
2014-01-30 00:24:37 +00:00
Manman Ren b681918ddd PGO branch weight: update edge weights in IfConverter.
This commit only handles IfConvertTriangle. To update edge weights
of a successor, one interface is added to MachineBasicBlock:
/// Set successor weight of a given iterator.
setSuccWeight(succ_iterator I, uint32_t weight)

An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated,
since we now correctly update the edge weights, the cold block
is placed at the end of the function and we jump to the cold block.

llvm-svn: 200428
2014-01-29 23:18:47 +00:00
Eric Christopher 8873adaa60 If we use DW_AT_ranges we need to specify a base address that ranges
are relative to in the compile unit. Currently let's just use 0...

Thanks to Greg Clayton for the catch!

llvm-svn: 200425
2014-01-29 22:22:56 +00:00
Eric Christopher fb8dd0085e Turn on CU ranges if we've got multiple compile units in the same
module since there's no range guarantee that we could make given
output order. This also fixes up the testcases that have multiple
CUs to have the correct range offset.

llvm-svn: 200422
2014-01-29 22:06:27 +00:00
Justin Bogner a99a3902a5 llvm-cov: Expect a source file as a positional parameter
Currently, llvm-cov isn't command-line compatible with gcov, which
accepts a source file name as its first parameter and infers the gcno
and gcda file names from that. This change keeps our -gcda and -gcno
options available for convenience in overriding this behaviour, but
adds the required parameter and inference behaviour as a compatible
default.

llvm-svn: 200417
2014-01-29 21:31:34 +00:00
David Majnemer 91fc4c2169 MC: Better management of macro arguments
The linux kernel makes uses of a GAS `feature' which substitutes nothing
for macro arguments which aren't specified.

Proper support for these kind of macro arguments necessitated a cleanup of
differences between `GAS' and `Darwin' dialect macro processing.

Differential Revision: http://llvm-reviews.chandlerc.com/D2634

llvm-svn: 200409
2014-01-29 18:57:46 +00:00
Arnold Schwaighofer 85a26704e9 LoopVectorizer: Add a test case for unrolling of small loops that need a runtime
check.

llvm-svn: 200408
2014-01-29 18:55:44 +00:00
Lang Hames 2cbbdf41c0 Add support for PC-relative non-extern relocations to RuntimeDyldMachO.
Also replaces testcase for r180790 (support for absolute non-externs relocs)
with a more robust version.

<rdar://problem/15864721>

llvm-svn: 200404
2014-01-29 18:31:35 +00:00
Matheus Almeida ec079d9e1d [mips][msa] Add fill.d instruction.
This instruction is only available on Mips64 cores
that implement the MSA ASE.

llvm-svn: 200400
2014-01-29 15:12:02 +00:00
Matheus Almeida 4cb577c614 [mips][msa] CHECK-DAG-ize MSA 2r_vector_scalar.ll test.
This update is a preparation for the addition of Mips64 MSA tests.

No functional changes.

llvm-svn: 200399
2014-01-29 14:32:03 +00:00
Matheus Almeida 74070327b2 [mips][msa] Add copy_{u,s}.d.
These instructions are only available on Mips64 cores
that implement the MSA ASE.

llvm-svn: 200398
2014-01-29 14:05:28 +00:00
Matheus Almeida a64f0600f3 [mips][msa] CHECK-DAG-ize MSA elm_copy.ll test.
This update is a preparation for the addition of Mips64 MSA tests.

No functional changes.

llvm-svn: 200395
2014-01-29 13:51:34 +00:00
Chandler Carruth d4be9dc02d [LPM] Fix PR18643, another scary place where loop transforms failed to
preserve loop simplify of enclosing loops.

The problem here starts with LoopRotation which ends up cloning code out
of the latch into the new preheader it is buidling. This can create
a new edge from the preheader into the exit block of the loop which
breaks LoopSimplify form. The code tries to fix this by splitting the
critical edge between the latch and the exit block to get a new exit
block that only the latch dominates. This sadly isn't sufficient.

The exit block may be an exit block for multiple nested loops. When we
clone an edge from the latch of the inner loop to the new preheader
being built in the outer loop, we create an exiting edge from the outer
loop to this exit block. Despite breaking the LoopSimplify form for the
inner loop, this is fine for the outer loop. However, when we split the
edge from the inner loop to the exit block, we create a new block which
is in neither the inner nor outer loop as the new exit block. This is
a predecessor to the old exit block, and so the split itself takes the
outer loop out of LoopSimplify form. We need to split every edge
entering the exit block from inside a loop nested more deeply than the
exit block in order to preserve all of the loop simplify constraints.

Once we try to do that, a problem with splitting critical edges
surfaces. Previously, we tried a very brute force to update LoopSimplify
form by re-computing it for all exit blocks. We don't need to do this,
and doing this much will sometimes but not always overlap with the
LoopRotate bug fix. Instead, the code needs to specifically handle the
cases which can start to violate LoopSimplify -- they aren't that
common. We need to see if the destination of the split edge was a loop
exit block in simplified form for the loop of the source of the edge.
For this to be true, all the predecessors need to be in the exact same
loop as the source of the edge being split. If the dest block was
originally in this form, we have to split all of the deges back into
this loop to recover it. The old mechanism of doing this was
conservatively correct because at least *one* of the exiting blocks it
rewrote was the DestBB and so the DestBB's predecessors were fixed. But
this is a much more targeted way of doing it. Making it targeted is
important, because ballooning the set of edges touched prevents
LoopRotate from being able to split edges *it* needs to split to
preserve loop simplify in a coherent way -- the critical edge splitting
would sometimes find the other edges in need of splitting but not
others.

Many, *many* thanks for help from Nick reducing these test cases
mightily. And helping lots with the analysis here as this one was quite
tricky to track down.

llvm-svn: 200393
2014-01-29 13:16:53 +00:00
Renato Golin 8cea6e8fc6 Enable EHABI by default
After all hard work to implement the EHABI and with the test-suite
passing, it's time to turn it on by default and allow users to
disable it as a work-around while we fix the eventual bugs that show
up.

This commit also remove the -arm-enable-ehabi-descriptors, since we
want the tables to be printed every time the EHABI is turned on
for non-Darwin ARM targets.

Although MCJIT EHABI is not working yet (needs linking with the right
libraries), this commit also fixes some relocations on MCJIT regarding
the EH tables/lib calls, and update some tests to avoid using EH tables
when none are needed.

The EH tests in the test-suite that were previously disabled on ARM
now pass with these changes, so a follow-up commit on the test-suite
will re-enable them.

llvm-svn: 200388
2014-01-29 11:50:56 +00:00
David Majnemer ad5c1734e2 MC: Reorganize macro MC test along dialect lines
This commit seeks to do two things:
 - Run the surfeit of tests under the Darwin dialect.  This ends up
   affecting tests which assumed that spaces could deliminate arguments.
 - The GAS dialect tests should limit their surface area to things that
   could plausibly work under GAS. For example, Darwin style arguments
   have no business being in such a test.

llvm-svn: 200383
2014-01-29 09:18:43 +00:00
Kostya Serebryany 6e15ec1442 [asan] simplify a test
llvm-svn: 200378
2014-01-29 07:35:43 +00:00
Venkatraman Govindaraju 141d0e2221 [Sparc] Use %r_disp32 for pc_rel entries in FDE as well.
This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo.

llvm-svn: 200376
2014-01-29 06:59:20 +00:00
NAKAMURA Takumi b366f01f83 Revert r200340, "Add line table debug info to COFF files when using a win32 triple."
It was incompatible with --target=i686-win32.

llvm-svn: 200375
2014-01-29 06:05:38 +00:00
Venkatraman Govindaraju fd5c1f9497 [Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame.
Otherwise, assembler (gas) fails to assemble them with error message "operation
combines symbols in different segments". This is because MC computes
pc_rel entries with subtract expression between labels from different sections.

llvm-svn: 200373
2014-01-29 04:51:35 +00:00
Chandler Carruth 66f0b16360 [LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered"
because of the inside-out run of LoopSimplify in the LoopPassManager and
the fact that LoopSimplify couldn't be "preserved" across two
independent LoopPassManagers.

Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI
node because it thought it was rewriting (via SCEV) the incoming value
to a loop invariant value. While it may well be invariant for the
current loop, it may be rewritten in terms of an enclosing loop's
values. This in and of itself is fine, as the LCSSA PHI node in the
enclosing loop for the inner loop value we're rewriting will have its
own LCSSA PHI node if used outside of the enclosing loop. With me so
far?

Well, the current loop and the enclosing loop may share an exiting
block and exit block, and when they do they also share LCSSA PHI nodes.
In this case, its not valid to RAUW through the LCSSA PHI node.

Expected crazy test included.

llvm-svn: 200372
2014-01-29 04:40:19 +00:00
Rafael Espindola 7a578026ae We do use pipefail these days. Update the test.
llvm-svn: 200370
2014-01-29 04:08:05 +00:00
Venkatraman Govindaraju 50f32d949b [SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9.
llvm-svn: 200368
2014-01-29 03:35:08 +00:00
Rafael Espindola 310f501ef0 Use a raw_stream to implement the mangler.
This is a bit more convenient for some callers, but more importantly, it is
easier to implement correctly. Doing this removes the patching of already
printed data that was used for fastcall, fixing a crash with private fastcall
symbols.

llvm-svn: 200367
2014-01-29 02:30:38 +00:00
Kevin Qin 92d64d2d56 [AArch64 NEON] Lower SELECT_CC with vector operand.
When the scalar compare is between floating point and operands are
vector, we custom lower SELECT_CC to use NEON SIMD compare for
generating less instructions.

llvm-svn: 200365
2014-01-29 01:57:30 +00:00
David Woodhouse 21bfc71752 [ARM] Remove superfluous inline asm mode switch test
llvm-svn: 200361
2014-01-29 00:49:28 +00:00
David Woodhouse 7db3705f9e Tests for mode switching
1. test that inlineasm works
2. test that relaxable instructions are re-encoded in the correct mode.

llvm-svn: 200351
2014-01-28 23:13:30 +00:00
Timur Iskhodzhanov 7523743075 Disable the COFF tests on non-X86 archs
llvm-svn: 200341
2014-01-28 21:47:33 +00:00
Timur Iskhodzhanov 2c659648b3 Add line table debug info to COFF files when using a win32 triple.
Reviewed at http://llvm-reviews.chandlerc.com/D2232

llvm-svn: 200340
2014-01-28 21:33:27 +00:00
Matheus Almeida 2e03f24301 [mips] Fix ELF header flags.
As opposed to GCC/GAS the default ABI for Mips64 is n64.
Compatibility bit should be set if o32 ABI is used when targeting Mips64.

llvm-svn: 200332
2014-01-28 19:24:11 +00:00
Gautam Chakrabarti 2c283400f9 [NVPTX] Fix emitting aggregate parameters
The code was missing the case for aggregate parameters and
hence was emitting them as .b0 type. Also fixed a couple
of comments.

llvm-svn: 200325
2014-01-28 18:35:29 +00:00
Andrea Di Biagio 2ea61f17ad [X86] Add extra rules for combining vselect dag nodes into movsd.
This improves the fix committed at revision 199683 adding the
following new target specific combine rules:

1) fold (v4i32: vselect <0,0,-1,-1>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) ))

2) fold (v4f32: vselect <0,0,-1,-1>, A, B) ->
        (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) ))

3) fold (v4i32: vselect <-1,-1,0,0>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) ))

4) fold (v4f32: vselect <-1,-1,0,0>, A, B) ->
        (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) ))

llvm-svn: 200324
2014-01-28 18:14:21 +00:00
Rafael Espindola ab73c493ea Fix pr14893.
When simplifycfg moves an instruction, it must drop metadata it doesn't know
is still valid with the preconditions changes. In particular, it must drop
the range and tbaa metadata.

The patch implements this with an utility function to drop all metadata not
in a white list.

llvm-svn: 200322
2014-01-28 16:56:46 +00:00
Andrea Di Biagio b6d39afbda [DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend.
Make sure that we don't introduce illegal build_vector dag nodes
when trying to fold a sign_extend of a build_vector.

This fixes a regression introduced by r200234.
Added test CodeGen/X86/fold-vector-sext-crash.ll
to verify that llc no longer crashes with an assertion failure
due to an illegal build_vector of type MVT::v4i64.

Thanks to Ilia Filippov for spotting this regression and for
providing a reproducible test case.

llvm-svn: 200313
2014-01-28 12:53:56 +00:00
Chandler Carruth b783628560 [vectorizer] Completely disable the block frequency guidance of the loop
vectorizer, placing it behind an off-by-default flag.

It turns out that block frequency isn't what we want at all, here or
elsewhere. This has been I think a nagging feeling for several of us
working with it, but Arnold has given some really nice simple examples
where the results are so comprehensively wrong that they aren't useful.

I'm planning to email the dev list with a summary of why its not really
useful and a couple of ideas about how to better structure these types
of heuristics.

llvm-svn: 200294
2014-01-28 09:10:41 +00:00
Hal Finkel 4e703bcecd Handle spilling the PPC GPRC_NOR0 register class
GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo
register). As a result, we also need to check for it in the spilling code.

llvm-svn: 200288
2014-01-28 05:32:58 +00:00
Michel Danzer bf1a641060 R600/SI: Add pattern for truncating i32 to i1
Fixes half a dozen piglit tests with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200283
2014-01-28 03:01:16 +00:00
Jakob Stoklund Olesen 83c677353b Fix the DWARF EH encodings for Sparc PIC code.
Also emit the stubs that were generated for references to typeinfo
symbols.

llvm-svn: 200282
2014-01-28 02:52:26 +00:00