Commit Graph

67401 Commits

Author SHA1 Message Date
NAKAMURA Takumi 3045767c7f COFFObjectFile.cpp: Appease msvc in r201760.
llvm-svn: 201769
2014-02-20 09:16:23 +00:00
Craig Topper e2347df24d [x86] Switch PAUSE instruction to use XS prefix instead of HasREPPrefix. Remove HasREPPrefix support from disassembler table generator since it's now only used by CodeGenOnly instructions.
llvm-svn: 201767
2014-02-20 07:59:43 +00:00
Elena Demikhovsky 9f09b3ec17 AVX-512: Fixed compilation issue
llvm-svn: 201761
2014-02-20 07:00:10 +00:00
Rui Ueyama 215a586c4b llvm-objdump/COFF: Print SEH table addresses.
SEH table addresses are VA in COFF file. In this patch we convert VA to RVA
before printing it, because dumpbin prints them as RVAs.

llvm-svn: 201760
2014-02-20 06:51:07 +00:00
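The VA-to-RVA conversion described in r201760 amounts to subtracting the image base recorded in the PE optional header. The following is a minimal Python sketch of that arithmetic, not the llvm-objdump code; the function name and the sample addresses are illustrative assumptions.

```python
def va_to_rva(va: int, image_base: int) -> int:
    """Convert a virtual address (VA) to a relative virtual address (RVA).

    An RVA is the address of an item relative to the image base,
    so the conversion is a single subtraction.
    """
    if va < image_base:
        raise ValueError("VA lies below the image base")
    return va - image_base


# Hypothetical values chosen only for illustration.
print(hex(va_to_rva(0x401234, 0x400000)))
```

Printing RVAs keeps the output comparable with tools that report image-relative addresses, which is the motivation the commit message gives.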
Nick Lewycky c4a9f8a019 Fix change in behaviour accidentally introduced in r201754.
llvm-svn: 201758
2014-02-20 06:35:31 +00:00
Elena Demikhovsky c96570172a AVX-512: Assembly parsing of broadcast semantic in AVX-512; implemented by Nis Zinovy (zinovy.y.nis@intel.com)
Fixed truncate i32 to i1; a test will be provided in the next commit.

llvm-svn: 201757
2014-02-20 06:34:39 +00:00
Nick Lewycky b9e44d6bcf Simplify the implementation of getUnderlyingObjectsForInstr, without intending to change the semantics at all.
llvm-svn: 201754
2014-02-20 05:06:26 +00:00
Eric Christopher 420569be04 Add support for hashing attributes with DW_FORM_block. This required
passing down an AsmPrinter instance so we could compute the size of
the block which could be target specific. All of the test cases in
the unittest don't have any target specific data so we can use a NULL
AsmPrinter there. This also depends upon block data being added as
integers.

We can now hash the entire fission-cu.ll compile unit so turn the
flag on there with the hash value.

llvm-svn: 201752
2014-02-20 02:50:45 +00:00
Eric Christopher 5d503b5deb Make DIELoc/DIEBlock's ComputeSize method const. Add a setSize
method to actually set it in the class to avoid computing it
multiple times.

llvm-svn: 201751
2014-02-20 02:40:45 +00:00
Eric Christopher a1b87fdfbf Format.
llvm-svn: 201750
2014-02-20 02:40:41 +00:00
Eric Christopher 8192ba2a7b Add support for hashing DW_FORM_sdata and a small testcase.
llvm-svn: 201747
2014-02-20 00:54:40 +00:00
Eric Christopher 9651bc00eb Remove FIXME that had snuck in.
llvm-svn: 201745
2014-02-20 00:54:35 +00:00
Reed Kotler d2da810961 Make one statement easier to understand from post commit feedback from a
review of the previous patch that introduced this week.

llvm-svn: 201723
2014-02-19 22:11:45 +00:00
Roman Divacky 37136c0333 Expand 64bit {SHL,SHR,SRA}_PARTS on sparcv9.
llvm-svn: 201718
2014-02-19 21:35:39 +00:00
Rafael Espindola a3ad4e693c move getNameWithPrefix and getSymbol to TargetMachine.
TargetLoweringBase is implemented in CodeGen, so before this patch we had
a dependency from Target to CodeGen. This would show up as a link failure of
llvm-stress when building with -DBUILD_SHARED_LIBS=ON.

This fixes pr18900.

llvm-svn: 201711
2014-02-19 20:30:41 +00:00
Rafael Espindola daeafb4c2a Add back r201608, r201622, r201624 and r201625
r201608 made llvm correctly handle private globals with MachO. r201622 fixed
a bug in it and r201624 and r201625 were changes for using private linkage,
assuming that llvm would do the right thing.

They all got reverted because r201608 introduced a crash in LTO. This patch
includes a fix for that. The issue was that TargetLoweringObjectFile now has
to be initialized before we can mangle names of private globals. This is
trivially true during the normal codegen pipeline (the asm printer does it),
but LTO has to do it manually.

llvm-svn: 201700
2014-02-19 17:23:20 +00:00
Christian Pirker bd1eb0db1f Test commit - remove the new line to lib/Target/AArch64/AArch64TargetMachine.cpp.
llvm-svn: 201698
2014-02-19 16:58:28 +00:00
Daniel Sanders acb20adbe4 [mips] In the integrated assembler, select the default feature bits by changing the CPU value.
This is consistent with the way CodeGen achieves this. However, CodeGen
always selects mips32 (even when the architecture is mips64).

llvm-svn: 201694
2014-02-19 16:13:26 +00:00
Christian Pirker 25ff038545 Test commit - added a new line to lib/Target/AArch64/AArch64TargetMachine.cpp.
llvm-svn: 201692
2014-02-19 16:07:32 +00:00
Daniel Sanders b3172307b8 [mips] Use llvm::Triple in ParseMipsTriple() instead of manually parsing it
No functional change.

llvm-svn: 201689
2014-02-19 15:55:21 +00:00
Rafael Espindola 2173603825 This reverts commit r201625 and r201624.
Since r201608 got reverted, it is not safe to use private linkage in these cases
until it is committed back.

llvm-svn: 201688
2014-02-19 15:49:46 +00:00
Daniel Sanders 4d4f3d98de [mips] Remove unused NotN64 predicate
llvm-svn: 201682
2014-02-19 15:16:47 +00:00
Cameron McInally 7b544f0297 Fix AVX512 vector sqrt assembly strings.
llvm-svn: 201681
2014-02-19 15:16:09 +00:00
Daniel Jasper 7e198ad862 Revert r201622 and r201608.
This causes the LLVMgold plugin to segfault. More information on the
replies to r201608.

llvm-svn: 201669
2014-02-19 12:26:01 +00:00
Tim Northover aeb8e06d4c X86 CodeGenPrep: sink shufflevectors before shifts
On x86, shifting a vector by a scalar is significantly cheaper than shifting a
vector by another fully general vector. Unfortunately, because SelectionDAG
operates on just one basic block at a time, the shufflevector instruction that
reveals whether the right-hand side of a shift *is* really a scalar is often
not visible to CodeGen when it's needed.

This adds another handler to CodeGenPrepare, to sink any useful shufflevector
instructions down to the basic block where they're used, predicated on a target
hook (since on other architectures, doing so will often just introduce extra
real work).

rdar://problem/16063505

llvm-svn: 201655
2014-02-19 10:02:43 +00:00
Craig Topper 56f0ed815e Remove special FP opcode maps and instead add enough MRM_XX formats to handle all the FP operations. This increases format by 1 bit, but decreases opcode map by 1 bit so the TSFlags size doesn't change.
llvm-svn: 201649
2014-02-19 08:25:02 +00:00
Craig Topper 8f540272e8 Reduce size of map field in X86 TSFlags since it now requires fewer bits.
llvm-svn: 201646
2014-02-19 07:29:07 +00:00
Craig Topper 2fb696b214 Put some of the X86 formats in a more logical order.
llvm-svn: 201645
2014-02-19 06:59:13 +00:00
Craig Topper 0d1fd55c13 Remove A6/A7 opcode maps. They can all be handled with a TB map, opcode of 0xa6/0xa7, and adding MRM_C0/MRM_E0 forms. Removes 376K from the disassembler tables.
llvm-svn: 201641
2014-02-19 05:34:21 +00:00
Saleem Abdulrasool f903a44728 MCAsmParser: support required parameters
This enhances the macro parser to parse and handle parameter qualifications,
which is needed to support required formal parameters in macro definitions.  A
required parameter may not be defaulted (though providing a default value is
accepted with a warning).  This improves GAS compatibility.

Partially addresses PR9248.

llvm-svn: 201630
2014-02-19 03:00:29 +00:00
Saleem Abdulrasool a08585b233 MCAsmParser: change representation of MCAsmMacroParameter
Rather than using std::pair, create a structure to represent the type.  This is
a preliminary refactoring to enable required parameter handling.  Additional
state is needed to indicate required parameters.  This has a minor side effect
of improving readability by providing more accurate names compared to first and
second.

llvm-svn: 201629
2014-02-19 03:00:23 +00:00
Rafael Espindola 8b27c4edc6 Now that llvm always does the right thing with private, use it.
llvm-svn: 201625
2014-02-19 02:08:39 +00:00
Rafael Espindola b9ea63c551 Avoid an infinite cycle with private linkage and -f{data|function}-sections.
When outputting an object we check its section to find its name, but when
looking for the section with -ffunction-section we look for the symbol name.

Break the loop by requesting a name with the private prefix when constructing
the section name. This matches the behavior before r201608.

llvm-svn: 201622
2014-02-19 01:28:30 +00:00
Rafael Espindola 09dcc6a536 Fix PR18743.
The IR
@foo = private constant i32 42

is valid, but before this patch we would produce an invalid MachO from it. It
was invalid because it would use an L label in a section where the linker needs
the labels in order to atomize it.

One way of fixing it would be to just reject this IR in the backend, but that
would not be very front end friendly.

What this patch does is use an 'l' prefix in sections that we know the linker
requires symbols for atomizing them. This allows frontends to just use
private and not worry about which sections they go to or how the linker handles
them.

One small issue with this strategy is that now a symbol name depends on the
section, which is not available before codegen. This is not a problem in
practice. The reason is that it only happens with private linkage, which will
be ignored by the non codegen users (llvm-nm and llvm-ar).

llvm-svn: 201608
2014-02-18 22:24:57 +00:00
Rafael Espindola ea09c595a6 Rename a DebugLoc variable to DbgLoc and a DataLayout to DL.
This is quite a bit less confusing now that TargetData was renamed DataLayout.

llvm-svn: 201606
2014-02-18 22:05:46 +00:00
Lang Hames 9b2dc930d7 Consistently check 'IsCode' when allocating sections in RuntimeDyld (via
findOrEmitSection).

Vaidas Gasiunas's patch, r201259, fixed one instance where we were always
allocating sections as text. This patch fixes the remaining buggy call sites.

No test case: This isn't breaking anything that I know of, it's just
inconsistent.

<rdar://problem/15943542>

llvm-svn: 201605
2014-02-18 21:46:39 +00:00
Ana Pazos 7c27a265dc [AArch64] Expanded sin, cos, pow with FP vector types inputs
llvm-svn: 201601
2014-02-18 20:31:05 +00:00
Rafael Espindola 7c68bebb9c Rename some member variables from TD to DL.
TargetData was renamed DataLayout back in r165242.

llvm-svn: 201581
2014-02-18 15:33:12 +00:00
Robert Lytton 346e808ec6 XCore target: Handle common linkage
llvm-svn: 201563
2014-02-18 11:21:59 +00:00
Robert Lytton 19ed0d05b8 XCore target: addMemOperand as necessary
BuildMI instructions were not including MachineMemOperand information.
This was discovered by 'SingleSource/Benchmarks/Stanford/Oscar' failing
due to a FrameIndex load incorrectly being hoisted by postra-machine-licm.
No other tests have been found to fail.

llvm-svn: 201562
2014-02-18 11:21:53 +00:00
Robert Lytton af6c256c34 XCore target: Fix llvm.eh.return and EH info register handling
llvm-svn: 201561
2014-02-18 11:21:48 +00:00
Tim Northover f804c178a1 GlobalMerge: move "-global-merge" option to the pass itself.
It's rather odd to have the flag enabling and disabling this pass only affect a
single target.

llvm-svn: 201559
2014-02-18 11:17:29 +00:00
Tim Northover f06df5866f X86: use vpsllvd (& friends) for 16-bit shifts on Haswell
llvm-svn: 201558
2014-02-18 11:15:32 +00:00
Craig Topper 8755740de0 Add PS prefix to some classes I missed in r201538.
llvm-svn: 201551
2014-02-18 08:24:22 +00:00
Craig Topper 6872fd3ad9 Add a bunch of OpSize32 tags to 64-bit mode only instructions to match their 32-bit mode counterparts for cases where there is also a OpSize16 instruction.
llvm-svn: 201550
2014-02-18 08:18:29 +00:00
Elena Demikhovsky 16a03613fa AVX-512: Fixed size of mask registers
llvm-svn: 201546
2014-02-18 07:52:26 +00:00
Jiangning Liu 742c588edc Fix a typo about lowering AArch64 va_copy.
llvm-svn: 201541
2014-02-18 02:37:42 +00:00
Craig Topper 5ccb61781f Add an x86 prefix encoding for instructions that would decode to a different instruction if 0xf2/f3/66 were in front of them, but don't themselves have a prefix. For now this doesn't change any behavior, but plan to use it to fix some bugs in the disassembler.
llvm-svn: 201538
2014-02-18 00:21:49 +00:00
Kevin Enderby 6287371ce6 Fix the arm assembler so that this malformed instruction:
ldrd r6, r7 [r2, #15]
simply gives an error and does not trigger an assertion.

As Jim points out, the diagnostic is really strange here,
but fixing that would be more complicated. The missing
comma results in the parser expecting a construct like r2[2],
which is the vector index thing the error message is talking
about. That's not what the user intended, though, and there's
nothing else in the instruction that looks at all like a vector.
Yet more fallout from not having a real parser here and trying
to do context-free generic matching for addressing modes.

rdar://15097243

llvm-svn: 201531
2014-02-17 21:45:27 +00:00
Anders Waldenborg 8480957486 Add support for assigning to . in AsmParser.
This is implemented by handling assignments to the '.' pseudo symbol
as ".org" directives.

Differential Revision: http://llvm-reviews.chandlerc.com/D2625

llvm-svn: 201530
2014-02-17 20:48:32 +00:00
Craig Topper fae5ac27a2 Fix disassembler handling of rex.b when mod=00/01/10 and bbb=101. Mod=00 should ignore the base register entirely. Mod=01/10 should treat this as R13 plus displacement. Fixes PR18860.
llvm-svn: 201507
2014-02-17 10:03:43 +00:00
Elena Demikhovsky 750498c77b AVX-512: implemented zext from i1 to i16
llvm-svn: 201502
2014-02-17 07:29:33 +00:00
Gerolf Hoflehner 7a463d0650 fix for null VectorizedValue assertion in the SLP Vectorizer (in function vectorizeTree()). radar://16064178
llvm-svn: 201501
2014-02-17 03:06:16 +00:00
Saleem Abdulrasool 6d7c0c203e MCAsmParser: better handling for named arguments
Until this point, macro definitions with named parameters were parsed but the
names were ignored.  This adds support for using that information for named
parameter instantiation.

In order to support the full semantics of the keyword arguments, the arguments
are no longer lazily initialised since the keyword arguments can be specified
out of order and partially if they are defaulted.  Prepopulate the arguments
with the default value for any defaulted parameters, and then parse the
specified arguments.

This simplifies some of the handling of the arguments in the inner loop since
empty arguments simply increment the parameter index and move on.

Note that keyword and positional arguments cannot be mixed.

llvm-svn: 201499
2014-02-17 00:40:17 +00:00
Mark Seaborn be266aa325 Use 16 byte stack alignment for NaCl on ARM
NaCl's ARM ABI uses 16 byte stack alignment, so set that in
ARMSubtarget.cpp.

Using 16 byte alignment exposes an issue in code generation in which a
varargs function leaves a 4 byte gap between the values of r1-r3 saved
to the stack and the following arguments that were passed on the
stack.  (Previously, this code only needed to support 4 byte and 8
byte alignment.)

With this issue, llc generated:

varargs_func:
        sub     sp, sp, #16
        push    {lr}
        sub     sp, sp, #12
        add     r0, sp, #16   // Should be 20
        stm     r0, {r1, r2, r3}
        ldr     r0, .LCPI0_0  // Address of va_list
        add     r1, sp, #16
        str     r1, [r0]
        bl      external_func

Fix the bug by checking for "Align > 4".  Also simplify the code by
using OffsetToAlignment(), and update comments.

Differential Revision: http://llvm-reviews.chandlerc.com/D2677

llvm-svn: 201497
2014-02-16 18:59:48 +00:00
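The 4-byte gap described in r201497 can be reproduced with simple alignment arithmetic: saving r1-r3 takes 12 bytes, and rounding that up to a 16-byte boundary leaves 4 bytes of padding. The Python sketch below illustrates the kind of calculation a helper such as OffsetToAlignment() performs; it is an assumption-laden illustration, not the LLVM implementation, and the function name is hypothetical.

```python
def offset_to_alignment(value: int, align: int) -> int:
    """Return the padding needed to round `value` up to a multiple of `align`."""
    if align <= 0:
        raise ValueError("alignment must be positive")
    # Python's % always yields a non-negative result for a positive divisor.
    return (-value) % align


# Three 4-byte registers (r1-r3) saved under 16-byte stack alignment.
saved_bytes = 3 * 4
padding = offset_to_alignment(saved_bytes, 16)
print(padding)  # 4
```

With 4 bytes of padding, the saved registers would start at an offset of 20 rather than 16, which matches the "Should be 20" note in the assembly listing of the commit message.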
Arnold Schwaighofer 26f567d8a4 SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived of the induction variable of a preceeding loop. SCEV
has cannonicalized this value to a different recurrence than the recurrence of
the preceeding loop's induction variable (the type and/or step direction) has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceeding loop.  This patch tries to base such derived
induction variables of the preceeding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

Reapply with a fix for the case of a value derived from a pointer.

radar://15970709

llvm-svn: 201496
2014-02-16 15:49:50 +00:00
Rafael Espindola 7e78a5a2f5 Remove dead code, we already require cmake 2.8.8.
llvm-svn: 201495
2014-02-16 14:36:26 +00:00
Rafael Espindola 56b663b6e4 Remove unnecessary typename.
Thanks to Elena Demikhovsky for noticing.

llvm-svn: 201494
2014-02-16 14:12:35 +00:00
Elena Demikhovsky 1fad075974 AVX-512: simplified BUILD_VECTOR for masks; fixed cmp/test sequence
llvm-svn: 201487
2014-02-16 11:34:23 +00:00
Gerolf Hoflehner 282949bf4d fixed typo in comment as my test commit
llvm-svn: 201486
2014-02-16 10:43:25 +00:00
Eric Christopher 4a74104933 Add a DIELoc class to cover the DW_FORM_exprloc set of expressions
alongside DIEBlock and replace uses accordingly. Use DW_FORM_exprloc
in DWARF4 and later code. Update testcases.

Adding a DIELoc instead of using extra forms inside DIEBlock so
that we can keep location expressions separate from other uses. No
direct use at the moment, however, it's not a lot of code and
using a separately named class keeps it somewhat more obvious
what's going on in various locations.

llvm-svn: 201481
2014-02-16 08:46:55 +00:00
Saleem Abdulrasool 27304cb189 MCAsmParser: relax declaration parsing
The Linux kernel defines empty macros for compatibility with ARM UAL syntax.
The comma after the name is optional, and if present can be safely lexed.  This
improves compatibility with the GNU assembler.

llvm-svn: 201474
2014-02-16 04:56:31 +00:00
Saleem Abdulrasool 49480bf01c ARM IAS: (partially) support .arch_extension directive
This adds a partial implementation of the .arch_extension directive to the
integrated ARM assembler.  There are a number of limitations to this
implementation arising from the target backend support rather than the
implementation itself.  Namely, iWMMXT (v1 and v2), Maverick, and XScale support
is not present in the ARM backend.  Currently, there is no check for A-class
only (needed for virt), and no ARMv6k detection (needed for os and sec).  The
remainder of the extensions are fully supported.

llvm-svn: 201471
2014-02-16 00:16:41 +00:00
David Blaikie f1a6dea82c DebugInfo: Deduplicate entries in the fission address table
This broke in r185459 while TLS support was being generalized to handle
non-symbol TLS representations.

I thought about/tried having an enum rather than a bool to track the
TLS-ness of the address table entry, but namespaces and naming seemed
more hassle than it was worth for only one caller that needed to specify
this.

llvm-svn: 201469
2014-02-15 19:34:03 +00:00
David Blaikie f28703a181 DwarfDebug: Remove dead code.
llvm-svn: 201467
2014-02-15 18:33:11 +00:00
Arnold Schwaighofer 847d96142c Revert "SCEVExpander: Try hard not to create derived induction variables in other loops"
This reverts commit r201465. It broke an arm bot.

llvm-svn: 201466
2014-02-15 18:16:56 +00:00
Arnold Schwaighofer 1e12f8563d SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived from the induction variable of a preceding loop. SCEV
has canonicalized this value to a different recurrence than the recurrence of
the preceding loop's induction variable (the type and/or step direction has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceding loop.  This patch tries to base such derived
induction variables on the preceding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

radar://15970709

llvm-svn: 201465
2014-02-15 17:11:56 +00:00
Craig Topper 34875ab0b5 Add opcode extension forms of MOV8ri/MOV16ri/MOV32ri.
llvm-svn: 201463
2014-02-15 07:29:18 +00:00
David Blaikie 60e6386b87 DebugInfo: Implement DW_AT_stmt_list for type units
Type units will share the statement list of their defining compile unit.
This is a tradeoff that reduces .o debug info size at the cost of some
linked debug info size (since the contents of those string tables won't
be deduplicated along with the type unit) which seems right for now.

llvm-svn: 201445
2014-02-14 23:58:13 +00:00
David Blaikie dfade747f0 DwarfUnit: Remove unnecessarily explicit/out of line virtual dtors.
These types have an out of line virtual function each (emitHeader at
least) so they won't have weak vtables - no need for more than that.

llvm-svn: 201444
2014-02-14 22:50:59 +00:00
David Blaikie 461c72b7e0 DwarfUnit: Remove unnecessary (void)t; that was previously used to suppress -Wunused-member-variable
llvm-svn: 201442
2014-02-14 22:47:55 +00:00
David Blaikie 2494fdb838 DwarfUnit: Refactor out DW_AT_stmt_list creation into common function for fission and non-fission cases
This probably also addresses the FIXME in the fission case regarding
multiple compile units, though I haven't tested that.

This code still confuses me (the literal zero offset makes little sense,
the limitations surrounding asm output I'm not sure about either - but
perhaps we should just always emit one line table? Or should we not rely
on .loc/.file even in assembly so we can produce the same output between
asm and object output?) but this maintains the existing functionality.

llvm-svn: 201441
2014-02-14 22:41:51 +00:00
Rafael Espindola 30616362d3 Add extern template instantiations of llvm::Calculate.
This should be a small build time improvement in general and fixes
the build on OS X with -DBUILD_SHARED_LIBS=ON.

The issue is that not all users are including GenericDomTreeConstruction.h,
causing undefined references when ld64 managed to hide the
linkonce_odr symbols.

llvm-svn: 201440
2014-02-14 22:36:16 +00:00
Quentin Colombet 867c550947 [CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if the
transformation does not bring any immediate benefits and introduce an illegal
operation. 

llvm-svn: 201439
2014-02-14 22:23:22 +00:00
Tom Stellard 728d4172df TargetLowering: n * r where n > 2 should be an illegal addressing mode
llvm-svn: 201433
2014-02-14 21:10:34 +00:00
David Blaikie 9acebfdd94 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
Recommitting r201380 (reverted in r201389)
Recommitting r201351 and r201355 (reverted in r201351 and r201355)

We weren't emitting an empty (header only) line table when the line
table was empty - this made the DWARF invalid (the compile unit would
point to the zero-size debug_lines section where there should've been an
empty line table but there was nothing at all). Fix that, and as a
consequence this works around/addresses PR18809.

Also, we emit a non-empty line table to workaround a darwin linker bug,
so XFAILing on darwin too.

Also, mark the test as 'REQUIRES: object-emission' because it does.

llvm-svn: 201429
2014-02-14 19:51:35 +00:00
Diego Novillo 5b5cf503b5 Support DWARF discriminators in object streamer.
Summary:
This adds support for emitting DWARF path discriminator values in
the object streamer. It also changes the DWARF dumper to show
discriminator values in the line table output.

Reviewers: echristo

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2794

llvm-svn: 201427
2014-02-14 19:27:53 +00:00
Reed Kotler 4cdaa7d778 This patch has two main functions:
1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and
the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for
mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass
but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation
and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc.

2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more
logical place to handle this and we have known for some time that we needed to move the code later and not implement it using
inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us
how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done
in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list
during the original putback of the Mips16HardFloat pass but was not practical for us to do at that time.

llvm-svn: 201426
2014-02-14 19:16:39 +00:00
Rafael Espindola 8eee97ddce Trivial cleanup: reuse existing variable.
Extracted while trying to understand http://llvm-reviews.chandlerc.com/D1764.

Patch by Matt Arsenault.

llvm-svn: 201425
2014-02-14 19:02:01 +00:00
Artyom Skrobov f6830f47b8 Generate the DWARF stack frame decode operations in the function prologue for ARM/Thumb functions.
Patch by Keith Walker!

llvm-svn: 201423
2014-02-14 17:19:07 +00:00
Kevin Qin edc95ee196 [AArch64 NEON] Fix a bug to avoid using floating type as condition type in lowering SELECT_CC.
llvm-svn: 201395
2014-02-14 09:41:15 +00:00
Eric Christopher abc621668d Revert "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded"
This reverts commit r201380 for now while we investigate.

llvm-svn: 201389
2014-02-14 05:33:16 +00:00
Jiangning Liu 293349e4d7 Enable AArch64 NEON by default.
llvm-svn: 201385
2014-02-14 04:38:09 +00:00
Hao Liu 7146ef8542 [AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node.
As v1i1 is illegal, the type legalizer tries to scalarize such node. But if the type operands of SETCC is legal, the scalarization algorithm will cause an assertion failure.

llvm-svn: 201381
2014-02-14 02:21:56 +00:00
David Blaikie 177585d1d9 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
Recommitting r201351 and r201355 (reverted in r201351 and r201355)

We weren't emitting an empty (header only) line table when the line
table was empty - this made the DWARF invalid (the compile unit would
point to the zero-size debug_lines section where there should've been an
empty line table but there was nothing at all). Fix that, and as a
consequence this works around/addresses PR18809.

llvm-svn: 201380
2014-02-14 01:57:59 +00:00
Eric Christopher 02dbadb3a0 Disable emission of aranges by default and add a command line
option to enable again that will be matched with a commit to enable
in clang.

llvm-svn: 201378
2014-02-14 01:26:55 +00:00
Juergen Ributzka b575878145 [X86] Don't mark movabsq as cheap-as-move - it isn't that cheap.
A simple register copy on X86 is just 3 bytes, whereas movabsq is a 10 byte
instruction. Marking movabsq as not being cheap will allow LICM to move it
out of the loop and it also prevents unnecessary rematerializations if the
value is needed in more than one register.

llvm-svn: 201377
2014-02-14 00:51:13 +00:00
Matt Arsenault aa689f5079 Do more addrspacecast transforms that happen for bitcast.
Makes addrspacecast (gep) do addrspacecast (gep) instead.

llvm-svn: 201376
2014-02-14 00:49:12 +00:00
Tom Stellard 967bf5813f R600/SI: Expand all v8[if]32 operations
llvm-svn: 201371
2014-02-13 23:34:15 +00:00
Tom Stellard f16d38cbb5 R600/SI: Add a pattern for i32 anyext
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 201370
2014-02-13 23:34:13 +00:00
Tom Stellard 6c7a7e82a7 R600/SI: Completely Disable TypeRewriter on compute
llvm-svn: 201369
2014-02-13 23:34:12 +00:00
Tom Stellard 80be9650e3 R600/SI: Split global vector loads with more than 4 elements
llvm-svn: 201368
2014-02-13 23:34:10 +00:00
Rafael Espindola 1f3de49f37 Use __literal16. It has been supported by the linker since 2005.
llvm-svn: 201365
2014-02-13 23:16:11 +00:00
Diego Novillo b1b5007c52 Fix generation of 'isa' and 'discriminator' keywords.
Summary:
There should be a space before each of these two keywords to avoid
generating invalid assembly files.

NOTE: I could not find an obvious maintainer in CODE_OWNERS.TXT, but
      this seems related to debug info.

Reviewers: echristo

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2791

llvm-svn: 201359
2014-02-13 20:05:03 +00:00
Rafael Espindola 79c3ab7c5e Check that GlobalAliases don't have section or alignment.
An alias is always in the section of its aliasee and has the same alignment
(since it has the same address).

llvm-svn: 201354
2014-02-13 18:26:41 +00:00
Benjamin Kramer 920409585f InstCombine: Replace custom constant folding code with ConstantExpr.
llvm-svn: 201352
2014-02-13 18:23:24 +00:00
NAKAMURA Takumi 4f2a067df1 [PR18809] Revert r201187, "DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded"
It really crashes cygwin's stage2 configure with "clang -g".

llvm-svn: 201351
2014-02-13 18:18:56 +00:00
Rafael Espindola b6f72b240f Use mkdir instead of stat+mkdir.
This is an optimistic version of create_directories: it tries to create the
directory first and looks at the parent only if that fails.

Running strace on "mkdir -p" shows that it is pessimistic, calling mkdir on
every element of the path. We could implement that if needed.

In any case, with both strategies there is no reason to call stat, just check
the return of mkdir.

llvm-svn: 201347
2014-02-13 16:58:19 +00:00
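The optimistic strategy described in r201347 can be sketched as follows: try to create the leaf directory first, and fall back to the parent only when that fails because the parent is missing. This Python version is an illustration under assumptions, not the LLVM code; it relies on the exception raised by os.mkdir rather than a separate stat call, and it treats an already existing path as success without checking that it is a directory.

```python
import os
import tempfile


def create_directories(path: str) -> None:
    """Create `path` and any missing parents, trying the leaf first."""
    try:
        os.mkdir(path)
    except FileExistsError:
        # Already present: nothing to do (no stat call is made).
        return
    except FileNotFoundError:
        parent = os.path.dirname(path)
        if not parent or parent == path:
            raise
        # The parent is missing: create it, then retry the leaf once.
        create_directories(parent)
        try:
            os.mkdir(path)
        except FileExistsError:
            pass


# Usage: build a nested path inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b", "c")
    create_directories(target)
    print(os.path.isdir(target))
```

In the common case where the parent already exists, this costs a single mkdir call; the pessimistic "mkdir -p" approach mentioned in the message would instead issue one call per path element.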
Benjamin Kramer 989b92936c Reduce code duplication resulting from the ConstantVector/ConstantDataVector split.
No intended functionality change.

llvm-svn: 201344
2014-02-13 16:48:38 +00:00
Daniel Sanders 753e17629d Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for
targets with mature MC support. Such targets will always parse the inline
assembly (even when emitting assembly). Targets without mature MC support
continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced
with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler
to parse inline assembly (even when emitting assembly output). UseIntegratedAs
is set to true for targets that consider any failure to parse valid assembly
to be a bug. Target specific subclasses generally enable the integrated
assembler in their constructor. The default value can be overridden with
-no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example,
those that use mnemonics such as 'foo' or 'hello world') have been updated to
disable the integrated assembler.

Changes since review (and last commit attempt):
- Fixed test failures that were missed due to configuration of local build.
  (fixes crash.ll and a couple others).
- Fixed tests that happened to pass because the local build was on X86
  (should fix 2007-12-17-InvokeAsm.ll)
- mature-mc-support.ll's should no longer require all targets to be compiled.
  (should fix ARM and PPC buildbots)
- Object output (-filetype=obj and similar) now forces the integrated assembler
  to be enabled regardless of default setting or -no-integrated-as.
  (should fix SystemZ buildbots)

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201333
2014-02-13 14:44:26 +00:00
Rafael Espindola b32292ddf7 Remove dead code.
llvm-svn: 201327
2014-02-13 13:45:45 +00:00
Tim Northover 914af6273b ARM: remove floating-point patterns for @llvm.arm.neon.vabs
The front-end now generates the generic @llvm.fabs for this
operation, so the extra patterns are no longer needed.

llvm-svn: 201314
2014-02-13 10:44:30 +00:00
Oliver Stannard 5bbb72f37e Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend
llvm-svn: 201305
2014-02-13 09:46:11 +00:00
Hao Liu 7b6dfcf06a [AArch64] Fix failures to select mul/add/sub of v1i8/v1i16/v1i32 types.
As these problems are similar to those affecting shl/sra/srl, also add patterns for shift nodes.

llvm-svn: 201298
2014-02-13 05:42:33 +00:00
Quentin Colombet 0e3b5e0b20 [RegAlloc] Fix the assertion in the last chance recoloring to match the
condition at the call site.

llvm-svn: 201296
2014-02-13 05:17:37 +00:00
Rafael Espindola 75ec01f3de Copy dll storage in copyAttributes.
llvm-svn: 201295
2014-02-13 05:11:35 +00:00
Juergen Ributzka 2b97f9b211 [DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder.
This fix checks the original LLVM IR node to identify opaque constants by
looking for the bitcast-constant pattern. Originally we looked at the generated
SDNode, but this might lead to incorrect results. The SDNode could have been
generated by a constant expression that was folded to a constant.

This fixes <rdar://problem/16050719>

llvm-svn: 201291
2014-02-13 04:19:26 +00:00
Rafael Espindola 9e7a638be1 Use simpler version of sys::fs::exists when possible.
llvm-svn: 201289
2014-02-13 04:00:35 +00:00
Hao Liu 4f345f3c03 [AArch64]Add support for spilling FPR8/FPR16.
llvm-svn: 201287
2014-02-13 02:36:58 +00:00
Reid Kleckner 22b19da9fc GlobalOpt: Aliases don't have sections, don't copy them when replacing
As defined in LangRef, aliases do not have sections.  However, LLVM's
GlobalAlias class inherits from GlobalValue, which means we can read and
set its section.  We should probably ban that as a separate change,
since it doesn't make much sense for an alias to have a section that
differs from its aliasee.

Fixes PR18757, where the section was being lost on the global in code
from Clang like:

extern "C" {
__attribute__((used, section("CUSTOM"))) static int in_custom_section;
}

Reviewers: rafael.espindola

Differential Revision: http://llvm-reviews.chandlerc.com/D2758

llvm-svn: 201286
2014-02-13 02:18:36 +00:00
Owen Anderson 883b5add8e Remove a very old instcombine where we would turn sequences of selects into
logical operations on the i1's driving them.  This is a bad idea for every
target I can think of (confirmed with micro tests on all of: x86-64, ARM,
AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into
a general purpose register, whereas consuming it directly into a select generally
allows it to exist only transiently in a predicate or flags register.

Chandler ran a set of performance tests with this change, and reported no
measurable change on x86-64.

llvm-svn: 201275
2014-02-12 23:54:07 +00:00
Andrea Di Biagio b7882b3bd1 [Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called
'OK_NonUniformConstValue' to identify operands which are constants but
not constant splats.

The cost model now allows returning 'OK_NonUniformConstValue'
for non splat operands that are instances of ConstantVector or
ConstantDataVector.

With this change, targets are now able to compute different costs
for instructions with non-uniform constant operands.
For example, on X86 the cost of a vector shift may vary depending on whether
the second operand is a uniform or non-uniform constant.

This patch applies the following changes:
 - The cost model computation now takes into account non-uniform constants;
 - The cost of vector shift instructions has been improved in
   X86TargetTransformInfo analysis pass;
 - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish
   between non-uniform and uniform constant operands.

Added a new test to verify that the output of opt
'-cost-model -analyze' is valid in the following configurations: SSE2,
SSE4.1, AVX, AVX2.
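
As an illustration only, the uniform versus non-uniform distinction can be
sketched in standalone C++. The enum and function below are hypothetical
stand-ins and do not reproduce the TargetTransformInfo interface; the sketch
assumes a constant vector is given as a plain list of lane values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the operand kinds discussed above; this is not
// the actual TargetTransformInfo enum.
enum class ConstOperandKind { UniformConst, NonUniformConst };

// Classify a constant vector operand: a splat (all lanes equal) is uniform,
// anything else is a non-uniform constant. An empty vector counts as uniform.
ConstOperandKind classifyConstantVector(const std::vector<std::int64_t> &Lanes) {
  for (std::size_t I = 1; I < Lanes.size(); ++I)
    if (Lanes[I] != Lanes[0])
      return ConstOperandKind::NonUniformConst;
  return ConstOperandKind::UniformConst;
}
```

A cost model could then pick a cheaper cost for the uniform case and a
different one for the non-uniform case, which is the distinction this patch
makes available to targets.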

llvm-svn: 201272
2014-02-12 23:43:47 +00:00
Andrea Di Biagio 386d566395 [X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.
Instead of expanding a packed shift into a sequence of scalar shifts,
the backend now tries (when possible) to convert the vector shift into a
vector multiply.

Before this change, a shift of a MVT::v8i16 vector by a
build_vector of constants was always scalarized into a long sequence of "vector
extracts + scalar shifts + vector insert".
With this change, if there is SSE2 support, we emit a single vector multiply.

This change also affects SSE4.1, AVX, AVX2 shifts:
 - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants
is now lowered when possible into a single SSE4.1 vector multiply.
 - Packed v16i16 shifts left by a constant build_vector are now expanded when
possible into a single AVX2 vpmullw.
This change also improves the lowering of AVX512f vector shifts.

Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected
by this change.
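
The arithmetic behind the lowering can be sketched in standalone C++: a
lane-wise shift left by constant amounts gives the same 16-bit results as a
lane-wise multiply by the matching powers of two. The type alias and function
names are hypothetical, and the sketch assumes every shift amount is below 16.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8i16 = std::array<std::uint16_t, 8>;

// Lane-wise shift left by per-lane constant amounts (each amount < 16).
V8i16 shiftLeftLanes(const V8i16 &X, const V8i16 &Amt) {
  V8i16 R{};
  for (std::size_t I = 0; I < X.size(); ++I)
    R[I] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(X[I]) << Amt[I]);
  return R;
}

// Same result computed as a lane-wise multiply by 2^Amt, truncated to 16 bits.
V8i16 mulByPow2Lanes(const V8i16 &X, const V8i16 &Amt) {
  V8i16 R{};
  for (std::size_t I = 0; I < X.size(); ++I) {
    std::uint32_t Factor = std::uint32_t{1} << Amt[I]; // constant per lane
    R[I] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(X[I]) * Factor);
  }
  return R;
}
```

Because the shift amounts are constants, the vector of factors is itself a
constant, so the whole operation can become one vector multiply.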

llvm-svn: 201271
2014-02-12 23:42:28 +00:00
Eric Christopher d0d5bba185 Reformat a few lines with clang-format.
llvm-svn: 201265
2014-02-12 22:47:09 +00:00
Eric Christopher 89a575cbdc 80-col.
llvm-svn: 201264
2014-02-12 22:38:04 +00:00
Juergen Ributzka d1777cc344 [Stackmaps] Improve the stackmap lowering code in the SelectionDAGBuilder.
We are now no longer relying on the target-specific call lowering implementation
to lower a stackmap intrinsic call. Instead we perform the call lowering in a
target-independent way directly in the stackmap lowering code. This simplifies
the code and removes the need to fixup the code after the target-specific call
lowering.

llvm-svn: 201263
2014-02-12 22:17:13 +00:00
Juergen Ributzka aa30da30bb [Stackmaps] Fix the ID type to be i64 also for stackmaps (as we claim in the documentation)
The ID type for the stackmap and patchpoint intrinsics is i64 in both cases.
This fixes a zero extend in the SelectionDAGBuilder that still used i32. This
also updates the target independent instructions STACKMAP and PATCHPOINT to use
the correct type.

llvm-svn: 201262
2014-02-12 22:17:10 +00:00
Lang Hames 937ec54951 Extend RTDyld API to enable optionally precomputing the total amount of memory
required for all sections in a module. This can be useful when targets or
code-models place strict requirements on how sections must be laid out
in memory.

If RTDyldMemoryManager::needsToReserveAllocationSpace() is overridden to return
true then the JIT will call the following method on the memory manager, which
can be used to preallocate the necessary memory.

void RTDyldMemoryManager::reserveAllocationSpace(uintptr_t CodeSize,
                                                 uintptr_t DataSizeRO,
                                                 uintptr_t DataSizeRW)

Patch by Vaidas Gasiunas. Thanks very much Vaidas!
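
The opt-in protocol described above can be modelled in standalone C++. The
classes below are a stand-in written for illustration and are not the LLVM
RTDyldMemoryManager class; only the two method names and the three size
parameters come from the description, while MemoryManagerModel,
PreallocatingManager and prepareSections are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the memory-manager interface described above (a model, not
// the real LLVM class).
struct MemoryManagerModel {
  virtual ~MemoryManagerModel() = default;
  // Managers that want the totals up front override this to return true.
  virtual bool needsToReserveAllocationSpace() { return false; }
  virtual void reserveAllocationSpace(std::uintptr_t /*CodeSize*/,
                                      std::uintptr_t /*DataSizeRO*/,
                                      std::uintptr_t /*DataSizeRW*/) {}
};

// Hypothetical manager that opts in and records how much it must preallocate.
struct PreallocatingManager : MemoryManagerModel {
  std::uintptr_t Reserved = 0;
  bool needsToReserveAllocationSpace() override { return true; }
  void reserveAllocationSpace(std::uintptr_t CodeSize, std::uintptr_t DataSizeRO,
                              std::uintptr_t DataSizeRW) override {
    Reserved = CodeSize + DataSizeRO + DataSizeRW;
  }
};

// Hypothetical loader step: report the totals only to managers that ask.
void prepareSections(MemoryManagerModel &MM, std::uintptr_t Code,
                     std::uintptr_t RO, std::uintptr_t RW) {
  if (MM.needsToReserveAllocationSpace())
    MM.reserveAllocationSpace(Code, RO, RW);
}
```

Managers that keep the default answer are never called with the totals, so
existing memory managers are unaffected in this model.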

llvm-svn: 201259
2014-02-12 21:30:07 +00:00
Reid Kleckner d59e2faae1 Rename Windows.h to WindowsSupport.h to avoid ambiguity
llvm-svn: 201258
2014-02-12 21:26:20 +00:00
David Fang 6860c819f4 _CS_DARWIN_USER macros available on darwin>=9. Thanks, Dave Odell!
llvm-svn: 201255
2014-02-12 21:02:12 +00:00
Adrian Prantl 7199fd532c Debug info: Bugfix for r201190: DW_OP_piece takes bytes, not bits.
rdar://problem/16015314

llvm-svn: 201253
2014-02-12 19:34:44 +00:00
Akira Hatanaka a07ffb5b31 Pass edge weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to
preserve branch probability information.

<rdar://problem/15893208>

llvm-svn: 201245
2014-02-12 18:09:18 +00:00
Daniel Sanders abe212a3b8 Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
It introduced multiple test failures in the buildbots.

llvm-svn: 201241
2014-02-12 15:39:20 +00:00
Daniel Sanders a7d504cf58 Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201237
2014-02-12 14:44:54 +00:00
NAKAMURA Takumi 04d39d7a2d Windows/Path.inc: Move <shlobj.h> after "Windows.h" so that some APIs are available.
I found that swapping the order of some header files helped fix a
  build issue that we're seeing on mingw32. Without the swap, windows.h
  was being included before _WIN32_WINNT was being defined and the
  CreateHardLinkW function was #ifdef'd out.

  It looks like the header is mainly used to get the SHGetFolderPathW
  function, so I don't think that there'll be much fallout from the
  switch.

Suggested by Alex Crichton. Thanks!

llvm-svn: 201230
2014-02-12 11:50:22 +00:00
Benjamin Kramer 53f9df4c93 R600: Always implement both versions of isTruncateFree and add a sanity check.
llvm-svn: 201222
2014-02-12 10:17:54 +00:00
Craig Topper ea91f02762 Mark XACQUIRE_PREFIX/XRELEASE_PREFIX as isAsmParserOnly so they'll disappear from the disassembler table build without custom filtering code.
llvm-svn: 201215
2014-02-12 08:02:29 +00:00
David Blaikie 5b85858b77 DwarfUnit: Include type unit's file strings in the defining compile unit's file_names table
There's still one piece missing here, which is adding the
DW_AT_stmt_list to the type unit that refers to the compile unit's line
table. Working on that.

llvm-svn: 201198
2014-02-12 00:40:47 +00:00
David Blaikie d696fac175 Fix some formatting in my last commit (r201196)
llvm-svn: 201197
2014-02-12 00:32:05 +00:00
David Blaikie 15632ae11a DwarfUnit: Provide a reference to a defining DwarfCompileUnit from DwarfTypeUnit.
Type units need to insert their file strings into the compile unit's
line/file table. This is preliminary work to that end.

llvm-svn: 201196
2014-02-12 00:31:30 +00:00
David Blaikie 101613e903 DwarfUnit: Refactor DW_AT_file creation into a common function.
This is preliminary work to fix type unit file strings so they appear in
their originating CU's line table - but it's also just good/simple
cleanup, so I'm committing it ahead of time.

llvm-svn: 201195
2014-02-12 00:11:25 +00:00
David Blaikie 5201930762 DwarfUnit: Replace unnecessary conditionals with asserts.
We used to be pretty vague about what debug entities were what, with
many conditionals to silently drop/skip/accept things. These don't seem
to be relevant anymore.

llvm-svn: 201194
2014-02-11 23:57:03 +00:00
Evan Cheng 57add3e4ee Tweak ARM fastcc by adopting these two AAPCS rules:
* CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers
* When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable

The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g.
7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause
stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various
calling conventions.

rdar://16039676

llvm-svn: 201193
2014-02-11 23:49:31 +00:00
Adrian Prantl cbcd578f0c Reapply r201180 with an additional error path.
Debug info: Emit values in subregisters that do not have a separate
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.

rdar://problem/16015314

llvm-svn: 201190
2014-02-11 22:22:15 +00:00
Adrian Prantl 80b6fd02fa Revert "Debug info: Emit values in subregisters that do not have a separate"
This reverts commit r201179 for buildbot breakage.

llvm-svn: 201188
2014-02-11 22:03:30 +00:00
David Blaikie 284cfc1089 DebugInfo: Don't include the name of the CU file in the line table file list when it's unneeded
This comes up in empty files or files containing #file directives that
never reference the actual source file name. Came up in a small test of
line tables I was playing with.

llvm-svn: 201187
2014-02-11 21:49:46 +00:00
Adrian Prantl c4fd6b71c6 whitespace
llvm-svn: 201181
2014-02-11 21:23:02 +00:00
Adrian Prantl a83cc8a356 Debug info: Emit values in subregisters that do not have a separate
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.

rdar://problem/16015314

llvm-svn: 201180
2014-02-11 21:22:59 +00:00
Adrian Prantl 6f84d31540 Make llvm-dwarfdump a little more resilient when parsing .debug_loc
sections. The call to data.getUnsigned(&Offset, AddressSize) only
increments Offset if the read succeeds, so a failed read previously
resulted in an infinite loop.
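
The failure mode can be sketched in standalone C++, assuming a reader that
leaves the offset untouched when a read fails. readU32 and countEntries are
hypothetical helpers, not the llvm-dwarfdump code; the point is the
no-progress check inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reader modelled on the behaviour described above: the offset
// advances only when a full 4-byte value could be read.
std::uint32_t readU32(const std::vector<std::uint8_t> &Data, std::size_t *Offset) {
  if (*Offset + 4 > Data.size())
    return 0; // failed read: *Offset is left unchanged
  std::uint32_t V = 0;
  for (int I = 0; I < 4; ++I)
    V |= static_cast<std::uint32_t>(Data[*Offset + I]) << (8 * I);
  *Offset += 4;
  return V;
}

// Count complete entries, stopping as soon as a read makes no progress
// instead of spinning forever on a truncated section.
std::size_t countEntries(const std::vector<std::uint8_t> &Data) {
  std::size_t Offset = 0, Count = 0;
  while (Offset < Data.size()) {
    std::size_t Before = Offset;
    readU32(Data, &Offset);
    if (Offset == Before)
      break; // truncated tail: bail out
    ++Count;
  }
  return Count;
}
```

Without the `Offset == Before` check, a section whose size is not a multiple
of the entry size would keep the loop condition true forever.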

llvm-svn: 201179
2014-02-11 21:22:53 +00:00
Matt Arsenault 71b71d25eb R600/SI: Fix assertion on infinite loops.
This isn't the most useful case to fix in the real world,
but bugpoint runs into this.

llvm-svn: 201177
2014-02-11 21:12:38 +00:00
Benjamin Kramer 94fc18d040 InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2.
This happens in bitfield code. While there reorganize the existing code
a bit.
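
One instance of the equivalence, sketched in standalone C++: for a power of
two C, testing that no bit at or above C's bit is set is the same as an
unsigned less-than comparison against C. The exact patterns InstCombine
matches are not spelled out here, so the function names and the chosen form
of the bit test are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bit-test form: true when X has no bit set at or above C's bit position.
bool highBitsClear(std::uint32_t X, std::uint32_t C) {
  return (X & ~(C - 1)) == 0;
}

// Comparison form: unsigned less-than (ULT).
bool unsignedLess(std::uint32_t X, std::uint32_t C) { return X < C; }

// Exhaustively compare the two forms for all X below Limit.
bool formsAgree(std::uint32_t C, std::uint32_t Limit) {
  for (std::uint32_t X = 0; X < Limit; ++X)
    if (highBitsClear(X, C) != unsignedLess(X, C))
      return false;
  return true;
}
```

The forms only coincide when C is a power of two; for a value such as 12 the
mask no longer describes a contiguous range and the check fails.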

llvm-svn: 201176
2014-02-11 21:09:03 +00:00
Jim Grosbach 8bfcb735fa ARM: Thumb2 LDR(literal) can target SP.
Fix a slightly overzealous destination register restriction for the
'without .w' alias. Add some explicit testcases.

rdar://16033140

llvm-svn: 201173
2014-02-11 20:48:39 +00:00
Benjamin Kramer 987b850cf2 SCEV: Cast switched values to make -Wswitch more useful.
llvm-svn: 201170
2014-02-11 19:02:55 +00:00
Benjamin Kramer 5a188549ad ScalarEvolution: Analyze trip count of loops with a switch guarding the exit.
llvm-svn: 201159
2014-02-11 15:44:32 +00:00
Robert Lougher 7d9084ffa1 Teach the DAGCombiner how to fold concat_vector nodes when the input is two
BUILD_VECTOR nodes, e.g.:

(concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
->
(BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)

This fixes an issue with AVX, where a sequence was not recognized as a 256-bit
vbroadcast due to the concat_vectors.

llvm-svn: 201158
2014-02-11 15:42:46 +00:00
Bradley Smith 9d80849714 [AArch64] Add missing PCRel relocations for AArch64 in RuntimeDyldELF
llvm-svn: 201149
2014-02-11 12:59:09 +00:00
Chandler Carruth fc25854b09 [LPM] Switch LICM to actively use LCSSA in addition to preserving it.
Fixes PR18753 and PR18782.

This is necessary for LICM to preserve LCSSA correctly and efficiently.
There is still some active discussion about whether we should be using
LCSSA, but we can't just immediately stop using it and we *need* LICM to
preserve it while we are using it. We can restore the old SSAUpdater
driven code if and when there is a serious effort to remove the reliance
on LCSSA from all of the loop passes.

However, this also serves as a great example of why LCSSA is very nice
to have. This change significantly simplifies the process of sinking
instructions for LICM, and makes it quite a bit less expensive.

It wouldn't even be as complex as it is except that I had to start the
process of removing the big recursive LCSSA formation hammer in order to
switch even this much of the re-forming code to asserting that LCSSA was
preserved. I'll fully remove that next just to tidy things up until the
LCSSA debate settles one way or the other.

llvm-svn: 201148
2014-02-11 12:52:27 +00:00
Robert Lytton 70b5ba49c3 XCore target: fix const section handling
The XCore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.

Clang now emits ".cp.rodata" section information.

All other externally visible constant data will be placed in the DP section.

llvm-svn: 201144
2014-02-11 10:36:26 +00:00
Robert Lytton 9b6bb461b1 XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE
llvm-svn: 201143
2014-02-11 10:36:18 +00:00
Elena Demikhovsky 1f32c313f1 AVX: fixed a bug in LowerVECTOR_SHUFFLE
llvm-svn: 201140
2014-02-11 10:21:53 +00:00
Dmitri Gribenko 70e6585f0c Remove TimeValue::toPosixTime() -- it is buggy, semantics are unclear, and its
only current user should be using toEpochTime() instead.

llvm-svn: 201136
2014-02-11 09:11:18 +00:00
Elena Demikhovsky 2aafc22ed9 AVX-512: Optimized BUILD_VECTOR pattern;
fixed encoding of VEXTRACTPS instruction.

llvm-svn: 201134
2014-02-11 07:25:59 +00:00
Lang Hames d41001706a In RuntimeDyldImpl::emitSection, make Allocate (section size to be allocated) a
uintptr_t. An unsigned could overflow for large sections.

No test case - anything big enough to overflow an unsigned is going to take an
appreciable time to zero when the test passes.

The choice of uintptr_t was made to match the RTDyldMemoryManager APIs, but
these should probably be hardcoded to uint64_ts: It is legitimate to JIT for
64-bit targets from a 32-bit host/compiler.
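
The overflow concern can be shown with a small standalone C++ sketch. It
models a 32-bit counter with std::uint32_t so that it behaves the same on
every host; sizeAs32 and sizeAs64 are hypothetical helpers, and the section
size used in the note below is an arbitrary example.

```cpp
#include <cassert>
#include <cstdint>

// Store a section size in a 32-bit counter: values of 4 GiB or more wrap
// modulo 2^32.
std::uint32_t sizeAs32(std::uint64_t SectionSize) {
  return static_cast<std::uint32_t>(SectionSize);
}

// Store the same size in a 64-bit counter: the value is preserved.
std::uint64_t sizeAs64(std::uint64_t SectionSize) { return SectionSize; }
```

A section of 4 GiB plus 16 bytes would be recorded as 16 bytes by the 32-bit
counter. This is also why the message suggests uint64_t over uintptr_t: on a
32-bit host, uintptr_t is itself only 32 bits wide.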

llvm-svn: 201127
2014-02-11 05:28:24 +00:00
Aaron Ballman 07e7618e95 Using the helper API for random number generation.
llvm-svn: 201125
2014-02-11 03:40:14 +00:00
Aaron Ballman 3f5e8b8dc1 Hopefully fixing the MinGW 32 build, which was broken by r200767. Not using rand_s() since MinGW does not have an implementation for it, but instead using the underlying CryptGenRandom APIs.
llvm-svn: 201124
2014-02-11 02:47:33 +00:00
Quentin Colombet 5a69dda9b0 [CodeGenPrepare] Undo changes that happened for the profitability check.
The addressing mode matcher checks at some point the profitability of folding an
instruction into the addressing mode. When the instruction to be folded has
several uses, it checks that the instruction can be folded in each use.
To do so, it creates a new matcher for each use and checks whether the instruction is
in the list of the matched instructions of this new matcher.

The new matchers may promote some instructions and this has to be undone to keep
the state of the original matcher consistent.

A test case will follow.

<rdar://problem/16020230>

llvm-svn: 201121
2014-02-11 01:59:02 +00:00
David Blaikie a47009dbd3 DebugInfo: Use existing symbol rather than creating it again.
llvm-svn: 201119
2014-02-11 01:23:52 +00:00
Juergen Ributzka 73a7fcc6e1 [Stackmaps] Cleanup code. No functional change intended.
llvm-svn: 201115
2014-02-10 23:30:26 +00:00
Manman Ren 03456a176d LTO API: add lto_module_create_from_memory_with_path.
This function adds an extra path argument to lto_module_create_from_memory.
The path argument will be passed to makeBuffer to make sure the MemoryBuffer
has a name and the created module has a module identifier.

This is mainly for emitting warning messages from the linker. When we emit
a warning message on a module, we can use the module identifier.

rdar://15985737

llvm-svn: 201114
2014-02-10 23:26:14 +00:00
Rafael Espindola efedd3aa1b Mark the methods in the Mangler const.
A const ObjectFile needs to be able to provide its name. For an IRObjectFile,
that means being able to call the mangler. Since each IRObjectFile can have
a different mangling, it is natural for them to contain a Mangler which is
therefore also const.

llvm-svn: 201113
2014-02-10 21:25:13 +00:00
Rafael Espindola b5155a572f Change the begin and end methods in ObjectFile to match the style guide.
llvm-svn: 201108
2014-02-10 20:24:04 +00:00
Matt Arsenault 0cdcd961bf R600: Implement isTruncateFree
Truncation is just accessing a subregister for any multiple of
the register size, so it's free.

llvm-svn: 201107
2014-02-10 19:57:42 +00:00
Chandler Carruth 756c22cded [LPM] A terribly simple fix to a terribly complex bug: PR18773.
The crux of the issue is that LCSSA doesn't preserve stateful alias
analyses. Before r200067, LICM didn't cause LCSSA to run in the LTO pass
manager, where LICM runs essentially without any of the other loop
passes. As a consequence the globalmodref-aa pass run before that loop
pass manager was able to survive the loop pass manager and be used by
DSE to eliminate stores in the function called from the loop body in
Adobe-C++/loop_unroll (and similar patterns in other benchmarks).

When LICM was taught to preserve LCSSA it had to require it as well.
This caused it to be run in the loop pass manager and because it did not
preserve AA, the stateful AA was lost. Most of LLVM's AA isn't stateful
and so this didn't manifest in most cases. Also, in most cases LCSSA was
already running, and so there was no interesting change.

The real kicker is that LCSSA by its definition (injecting PHI nodes
only) trivially preserves AA! All we need to do is mark it, and then
everything goes back to working as intended. It probably was blocking
some other weird cases of stateful AA but the only one I have is
a 1000-line IR test case from loop_unroll, so I don't really have a good
test case here.

Hopefully this fixes the regressions on performance that have been seen
since that revision.

llvm-svn: 201104
2014-02-10 19:39:35 +00:00
Hans Wennborg 4a4be11e62 Copy the ThreadLocalMode in GlobalVariable::copyAttributesFrom
This fixes the oversight from r159077.

llvm-svn: 201098
2014-02-10 17:13:56 +00:00
Tom Stellard 5d7aaaed7d R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used
DS instructions that access local memory can only use addresses that
are less than or equal to the value of M0.  When M0 is uninitialized,
then we experience undefined behavior.

This patch also changes the behavior to emit S_WQM_B64 on pixel shaders
no matter what kind of DS instruction is used.

llvm-svn: 201097
2014-02-10 16:58:30 +00:00
Tom Stellard 9a32e5f29a R600/SI: Only use S_WQM_B64 in pixel shaders
This doesn't change any functionality, since we only have two shader
types (compute and pixel) that use local memory.  We're just changing
the logic to match the documentation.

llvm-svn: 201096
2014-02-10 16:58:27 +00:00
David Blaikie 00107f8203 Remove some prototype code accidentally committed in r201043
Thanks to Chandler for the catch.

llvm-svn: 201095
2014-02-10 16:49:07 +00:00
Tim Northover b0430415e6 ARM: use natural LLVM IR for vshll instructions
Similarly to the vshrn instructions, these are simple zext/sext + trunc
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.

llvm-svn: 201093
2014-02-10 16:20:29 +00:00
Chad Rosier bcde0c49cb [AArch64] Handle aliases of conditional branches without b.pred form.
llvm-svn: 201091
2014-02-10 15:43:11 +00:00
Oliver Stannard 8dcaa761a2 ARM: r12 is callee-saved for interrupt handlers
For A- and R-class processors, r12 is not normally callee-saved, but is for
interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker".

llvm-svn: 201089
2014-02-10 14:24:23 +00:00
Benjamin Kramer 3c29c0704b Make succ_iterator a real random access iterator and clean up a couple of users.
llvm-svn: 201088
2014-02-10 14:17:42 +00:00
Benjamin Kramer b8266d2062 GlobalsModRef: Unify and clean up duplicated pointer analysis code.
llvm-svn: 201087
2014-02-10 14:17:30 +00:00
Tim Northover 170daafe01 ARM: use LLVM IR to represent the vshrn operation
vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.
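
The claim about signedness can be checked with a standalone C++ sketch for
one lane, assuming a 16-bit to 8-bit narrowing with the shift amount limited
to 1 through 8. The function names are hypothetical; the arithmetic shift is
built by hand so the sketch does not rely on how the compiler shifts negative
values.

```cpp
#include <cassert>
#include <cstdint>

// Shift a 16-bit lane right logically, then truncate to 8 bits.
std::uint8_t narrowLogical(std::uint16_t X, unsigned Shift) {
  return static_cast<std::uint8_t>(static_cast<std::uint32_t>(X) >> Shift);
}

// Shift a 16-bit lane right arithmetically (copies of the sign bit are
// shifted in at the top), then truncate to 8 bits. Expects 1 <= Shift <= 15.
std::uint8_t narrowArithmetic(std::uint16_t X, unsigned Shift) {
  std::uint32_t Fill =
      (X & 0x8000u) ? ((0xFFFFu << (16 - Shift)) & 0xFFFFu) : 0u;
  return static_cast<std::uint8_t>((static_cast<std::uint32_t>(X) >> Shift) | Fill);
}

// True when both forms agree for every 16-bit input at this shift amount.
bool shiftsAgree(unsigned Shift) {
  for (std::uint32_t X = 0; X <= 0xFFFFu; ++X)
    if (narrowLogical(static_cast<std::uint16_t>(X), Shift) !=
        narrowArithmetic(static_cast<std::uint16_t>(X), Shift))
      return false;
  return true;
}
```

For shifts of 1 through 8 the sign-fill bits land in the upper byte, which
the truncate discards. A shift of 9 would let one fill bit reach the result,
which is why the immediate limit matters.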

llvm-svn: 201085
2014-02-10 14:04:07 +00:00
Matheus Almeida 4b27eb588c [mips][msa] Add DLSA instruction.
llvm-svn: 201081
2014-02-10 12:05:17 +00:00
Matheus Almeida 883a2f893d [mips][msa] Make LSA_DESC a parameterizable class.
This way it's possible to share the instruction's description for LSA and
DLSA (to be added).

No functional changes.

llvm-svn: 201078
2014-02-10 11:15:37 +00:00
NAKAMURA Takumi 2f96171cb0 [CMake] LLVMSupport should be responsible to provide system_libs.
llvm-svn: 201077
2014-02-10 10:52:19 +00:00
Kostya Serebryany 8baa386670 [asan] support for FreeBSD, LLVM part. patch by Viktor Kutuzov
llvm-svn: 201067
2014-02-10 07:37:04 +00:00
Elena Demikhovsky 9f423d6f25 AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors.
llvm-svn: 201066
2014-02-10 07:02:39 +00:00
Craig Topper a0869dceea Recommit r201059 and r201060 with hopefully a fix for its original failure.
Original commit messages:

Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.

Simplify a bunch of code by removing the need for the x86 disassembler table builder to know about extended opcodes. The modrm forms are sufficient to convey the information.

llvm-svn: 201065
2014-02-10 06:55:41 +00:00
Bob Wilson ebdae7c2ff Revert r201059 and r201060.
r201059 appears to cause a crash in a bootstrapped build of clang. Craig
isn't available to look at it right now, so I'm reverting it while he
investigates.

llvm-svn: 201064
2014-02-10 05:28:30 +00:00
Hao Liu 6e73761dc8 [AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg.
llvm-svn: 201061
2014-02-10 03:16:22 +00:00
Craig Topper 0d88de8c56 Add MRMXr/MRMXm form to X86 for use by instructions which treat the 'reg' field of modrm byte as a don't care value. Will allow for simplification of disassembler code.
llvm-svn: 201059
2014-02-10 00:50:34 +00:00
Saleem Abdulrasool a879fab3b3 MCParser: add a single token lookahead
Some of the more complex directive and macro handling for GAS compatibility
requires lookahead.  Add a single token lookahead in the MCAsmLexer.
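
A single token of lookahead can be sketched with a toy whitespace tokenizer
in standalone C++. PeekingLexer is hypothetical and is not the MCAsmLexer
interface; it only shows the idea of inspecting the next token without
consuming it.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical whitespace tokenizer with one token of lookahead.
class PeekingLexer {
  std::vector<std::string> Tokens;
  std::size_t Pos = 0;

public:
  explicit PeekingLexer(const std::string &Src) {
    std::istringstream In(Src);
    for (std::string T; In >> T;)
      Tokens.push_back(T);
  }

  // Consume and return the current token (empty string at end of input).
  std::string lex() {
    return Pos < Tokens.size() ? Tokens[Pos++] : std::string();
  }

  // Return the next token without consuming it.
  std::string peek() const {
    return Pos < Tokens.size() ? Tokens[Pos] : std::string();
  }
};
```

A directive handler could call peek() to decide how to parse what follows and
only then call lex(), leaving the token stream untouched if it decides not to
handle the input.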

llvm-svn: 201058
2014-02-09 23:29:24 +00:00
Benjamin Kramer d31aaf109e AsmParser: Simplify code with ArrayRef.
No functionality change.

llvm-svn: 201055
2014-02-09 17:13:11 +00:00
Benjamin Kramer 9d94a4eee9 AsmParser: Parse (and ignore) nested .macro definitions.
This enables a slightly odd feature of gas. The macro is defined when
the outermost macro is instantiated.

PR18599

llvm-svn: 201045
2014-02-09 16:22:00 +00:00
Rafael Espindola 15b26696af Use a consistent argument order in TargetLoweringObjectFile.
These methods normally call each other and it is really annoying if the
arguments are in different order. The more common rule was that the arguments
specific to call are first (GV, Encoding, Suffix) and the auxiliary objects
(Mang, TM) come after. This patch changes the exceptions.

llvm-svn: 201044
2014-02-09 14:50:44 +00:00
David Blaikie 9aff95c940 Fix formatting introduced in r200941
llvm-svn: 201043
2014-02-09 09:49:29 +00:00
Arnold Schwaighofer 348e1b60be LoopVectorizer: Keep track of conditional store basic blocks
Before conditional store vectorization/unrolling we had only one
vectorized/unrolled basic block. After adding support for conditional store
vectorization this will not only be one block but multiple basic blocks. The
last block would have the back-edge. I updated the code to use a vector of basic
blocks instead of a single basic block and fixed the users to use the last entry
in this vector. But, I forgot to add the basic blocks to this vector!

Fixes PR18724.

llvm-svn: 201028
2014-02-08 20:41:13 +00:00
Rafael Espindola fa0f72837f Pass the Mangler by reference.
It is never null and it is not used in casts, so there is no reason to use a
pointer. This matches how we pass TM.

llvm-svn: 201025
2014-02-08 14:53:28 +00:00
Rafael Espindola 1070501586 Add LLVM_OVERRIDE to a few declarations.
llvm-svn: 201022
2014-02-08 06:07:27 +00:00
Juergen Ributzka 9479b31f97 [Constant Hoisting] Fix insertion point for constant materialization.
The bitcast instruction during constant materialization was not placed correctly
in the presence of phi nodes. This commit fixes the insertion point to be in the
idom instead.

This fixes PR18768

llvm-svn: 201009
2014-02-08 00:20:49 +00:00
Juergen Ributzka 4c8a02521d [Constant Hoisting] Don't update the use list while traversing it - DOH!
This fix first traverses the whole use list of the constant expression and
keeps track of the instructions that need to be updated. It then performs the
fixup afterwards.
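
The collect-then-update pattern can be sketched in standalone C++. The model
is hypothetical: the use list is a std::list of user IDs, and rewriting a
user removes it from that list, which is what makes updating during the
traversal unsafe.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Rewrite every user of a value in two phases and return how many were
// rewritten. UseList models the users of a constant expression.
std::size_t rewriteAllUses(std::list<int> &UseList, std::vector<int> &Rewritten) {
  // Phase 1: snapshot the users while the list is stable.
  std::vector<int> Pending(UseList.begin(), UseList.end());

  // Phase 2: mutate the list freely; no iterator into it is live any more.
  for (int User : Pending) {
    UseList.remove(User);      // the rewritten user no longer uses the value
    Rewritten.push_back(User); // record the fixup
  }
  return Pending.size();
}
```

Erasing from UseList while iterating over it directly would invalidate the
iterator in use; the snapshot avoids that at the cost of one extra pass.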

llvm-svn: 201008
2014-02-08 00:20:45 +00:00
Rafael Espindola b3b52a7532 Remove dead code.
llvm-svn: 201006
2014-02-07 23:32:41 +00:00
Rafael Espindola 5054362920 Always create a temporary symbol to use with the cfi frame.
This is a small simplification and a small step in fixing pr18743 since
private functions on MachO should be using a 'l' prefix.

llvm-svn: 200994
2014-02-07 21:23:18 +00:00
Renato Golin 78a6eba862 Remove -arm-disable-ehabi option
llvm-svn: 200988
2014-02-07 20:12:49 +00:00
Rafael Espindola 66f273be34 Don't internalize linkonce_odr non constant variables.
llvm-svn: 200983
2014-02-07 19:04:43 +00:00
Alexander Kornienko d772d72140 Fix an invalid check for duplicate option categories.
An intermediate solution until the problems with analyzer plugins linking with
llvm/Support and causing assertions due to duplicate GeneralCategory are solved.

llvm-svn: 200981
2014-02-07 17:42:30 +00:00
Sasa Stankovic 4c80bdae72 [mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl.
Differential Revision: http://llvm-reviews.chandlerc.com/D2694

llvm-svn: 200978
2014-02-07 17:16:40 +00:00
Rafael Espindola 61acf5d9b0 Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it.
Thanks to John McCall for noticing it.

llvm-svn: 200977
2014-02-07 16:21:30 +00:00
Rafael Espindola a005342db3 Refactor logic into a function predicate.
No functionality change.

llvm-svn: 200976
2014-02-07 16:07:11 +00:00
Benjamin Kramer 6128d00052 Try to unbreak the mingw32 build.
llvm-svn: 200973
2014-02-07 12:05:36 +00:00
Oliver Stannard 1dc1034218 LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack
According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.

I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.

llvm-svn: 200970
2014-02-07 11:19:53 +00:00
Venkatraman Govindaraju de98fae368 [Sparc] Add support for parsing synthetic instruction 'mov'.
llvm-svn: 200965
2014-02-07 09:06:52 +00:00
Venkatraman Govindaraju ced9226b0f [Sparc] Emit correct encoding for atomic instructions. Also, add support for parsing CAS instructions to test the CAS encoding.
llvm-svn: 200963
2014-02-07 07:34:49 +00:00
Venkatraman Govindaraju fd07500dd1 [Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used.
llvm-svn: 200962
2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju 104643d0aa [Sparc] Emit correct relocations for PIC code when integrated assembler is used.
llvm-svn: 200961
2014-02-07 04:24:35 +00:00
Venkatraman Govindaraju dfe09b1b5b [Sparc] Use SparcMCExpr::VariantKind itself as MachineOperand's target flags.
llvm-svn: 200960
2014-02-07 02:36:06 +00:00
Manman Ren 37c9267107 PGO branch weight: fix PR18752.
Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors
by getting the weights before removing a successor.

llvm-svn: 200958
2014-02-07 00:38:56 +00:00
Jim Grosbach e9008de652 X86: Resolve a long standing FIXME and properly isel pextr[bw].
Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957
2014-02-07 00:16:33 +00:00
Quentin Colombet 3a4bf0405e [CodeGenPrepare] Move away sign extensions that get in the way of addressing
mode.

Basically the idea is to transform code like this:
%idx = add nsw i32 %a, 1
%sextidx = sext i32 %idx to i64
%gep = gep i8* %myArray, i64 %sextidx
load i8* %gep

Into:
%sexta = sext i32 %a to i64
%idx = add nsw i64 %sexta, 1
%gep = gep i8* %myArray, i64 %idx
load i8* %gep

That way the computation can be folded into the addressing mode.

This transformation is done as part of the addressing mode matcher.
If the matching fails (not profitable, addressing mode not legal, etc.), the
matcher will revert the related promotions.
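
The rewrite above relies on sign extension commuting with an add that does not overflow. A minimal Python sketch of that arithmetic identity follows; the helper names (`sext32_to_64`, `add_i32`, `add_i64`, `before`, `after`) are illustrative only and are not LLVM APIs.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def sext32_to_64(x):
    """Sign-extend a 32-bit bit pattern to a 64-bit bit pattern."""
    x &= MASK32
    return (x | ~MASK32) & MASK64 if x & 0x80000000 else x

def add_i32(a, b):
    """Wrapping 32-bit add."""
    return (a + b) & MASK32

def add_i64(a, b):
    """Wrapping 64-bit add."""
    return (a + b) & MASK64

def before(a):
    # %idx = add nsw i32 %a, 1 ; %sextidx = sext i32 %idx to i64
    return sext32_to_64(add_i32(a, 1))

def after(a):
    # %sexta = sext i32 %a to i64 ; %idx = add nsw i64 %sexta, 1
    return add_i64(sext32_to_64(a), 1)
```

The two forms agree whenever the 32-bit add does not wrap. For `a = 0x7FFFFFFF` they differ, which is why the `nsw` flag on the original add matters.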

<rdar://problem/15519855>

llvm-svn: 200947
2014-02-06 21:44:56 +00:00
Andrew Trick 2a15637ede Track register pressure a bit more carefully (weird corner case).
This solves a problem where a def machine operand has no uses but has
not been marked dead. In this case, the initial RP analysis was being
extra precise and determining from LiveIntervals that the register was
actually dead. This caused us to omit the register from the RP
tracker's block live out. That's all good, but the per-instruction
summary still accounted for it as a valid def. This could cause an
assertion in the tracker later when we underflow pressure.

This is from a bug report on an out-of-tree target. It is not
reproducible on well-behaved targets. I'm just making an obvious fix
without unit test.

llvm-svn: 200941
2014-02-06 19:20:41 +00:00
Evan Cheng 91f205bfc4 Revert r200095 and r200152. It turns out when compiling with -arch armv7 -mcpu=cortex-m3, the triple would still set iOS as the OS so the hack is still needed. rdar://15984891
llvm-svn: 200937
2014-02-06 18:51:34 +00:00
Tom Stellard e236794578 R600/SI: Add a MUBUF store pattern for Reg+Imm offsets
llvm-svn: 200935
2014-02-06 18:36:41 +00:00
Tom Stellard 2937cbc005 R600/SI: Add a MUBUF store pattern for Imm offsets
llvm-svn: 200934
2014-02-06 18:36:39 +00:00
Tom Stellard 11624bc577 R600/SI: Add a MUBUF load pattern for Reg+Imm offsets
llvm-svn: 200933
2014-02-06 18:36:38 +00:00
Tom Stellard 044e418f15 R600/SI: Use immediate offsets for SMRD instructions whenever possible
There was a problem with the old pattern, so we were copying some
larger immediates into registers when we could have been encoding
them in the instruction.

llvm-svn: 200932
2014-02-06 18:36:34 +00:00
David Peixotto ea2bcb9e07 Remove const_cast for STI when parsing inline asm
In a previous commit (r199818) we added a const_cast to an existing
subtarget info instead of creating a new one so that we could reuse
it when creating the TargetAsmParser for parsing inline assembly.
This cast was necessary because we needed to reuse the existing STI
to avoid generating incorrect code when the inline asm contained
mode-switching directives (e.g. .code 16).

The root cause of the failure was that there was an implicit sharing
of the STI between the parser and the MCCodeEmitter. To fix a
different but related issue, we now explicitly pass the STI to the
MCCodeEmitter (see commits r200345-r200351).

The const_cast is no longer necessary and we can now create a fresh
STI for the inline asm parser to use.

Differential Revision: http://llvm-reviews.chandlerc.com/D2709

llvm-svn: 200929
2014-02-06 18:19:40 +00:00
Tim Northover f0e21616f3 X86: add costs for 64-bit vector ext/trunc & rebalance
The most important part of this is probably adding any cost at all for
operations like zext <8 x i8> to <8 x i32>. Before they were being
recorded as extremely costly (24, I believe) which made LLVM fall back
on a 4-wide vectorisation of a loop.

It also rebalances the values for sext, zext and trunc. Lacking any
other sane metric that might work across CPU microarchitectures I went
for instructions. This seems to be in reasonable accord with the rest
of the table (sitofp, ...) though no doubt at least one value is
sub-optimal for some bizarre reason.

Finally, separate AVX and AVX2 values are provided where appropriate.
The CodeGen is quite different in many cases.

rdar://problem/15981990

llvm-svn: 200928
2014-02-06 18:18:36 +00:00
Eli Bendersky e17f37082b Add a -suppress-warnings option to bitcode linking.
llvm-svn: 200927
2014-02-06 18:01:56 +00:00
Puyan Lotfi efbcf4943c Yet another patch to reduce compile time for small programs:
The aim in this patch is to reduce work that VirtRegRewriter needs to do when
telling MachineRegisterInfo which physregs are in use. Up until now
VirtRegRewriter::rewrite has been doing rewriting and populating def info and
then proceeding to set whether a physreg is used based on this info for every
physreg that the target provides. This can be expensive when a target has an
unusually high number of supported physregs, and is a noticeable chunk of
compile time for small programs on such targets.

So to reduce compile time, this patch simply adds the use of a SparseSet to the
rewrite function that is used to flag each physreg that is encountered in a
MachineFunction. Afterward, rather than iterating over the set of all physregs
for a given target to set the physregs used in MachineRegisterInfo, the new way
is to iterate over the set of physregs that were actually encountered and set
in the SparseSet. This improves compile time because the existing rewrite
function was iterating over all MachineOperands already, and because the
iterations afterward to setPhysRegUsed are reduced by use of the SparseSet data.
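
The change in iteration shape can be sketched roughly as follows. This is a Python illustration, not the VirtRegRewriter code: a built-in `set` stands in for `llvm::SparseSet`, and both function names are hypothetical.

```python
def used_physregs_dense(operands, num_physregs):
    """Old shape: visit every physreg the target defines."""
    defined = set(operands)
    used = []
    for reg in range(num_physregs):   # cost grows with the size of the register file
        if reg in defined:
            used.append(reg)
    return used

def used_physregs_sparse(operands):
    """New shape: visit only the physregs seen while rewriting."""
    seen = set()                      # stand-in for llvm::SparseSet
    for reg in operands:              # this walk over operands already happens
        seen.add(reg)
    return sorted(seen)
```

Both functions produce the same answer; the second never touches registers the function does not mention, which is where the saving on targets with many physregs would come from.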

llvm-svn: 200919
2014-02-06 09:57:39 +00:00
Tim Northover 546b57b011 X86: deduplicate V[SZ]EXT_MOVL and V[SZ]EXT nodes
I believe VZEXT_MOVL means "zero all vector elements except the first" (and
should have identical input & output types) whereas VZEXT means "zero extend
each element of a vector (discarding higher elements if necessary)".

For example:
    (v4i32 (vzext (v16i8 ...)))

should zero extend the low 4 bytes of the incoming vector to 32-bits,
discarding higher bytes.

However, somewhere in the past, these two concepts had become confused, even
leading to a nonsensical VSEXT_MOVL.

This re-merges the nodes where appropriate (all VSEXT_MOVL -> VSEXT, VZEXT_MOVL
-> VZEXT when it's an actual extension).
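
A small Python sketch of the two meanings as described above, assuming vectors are modelled as lists of unsigned integers; `vzext_movl` and `vzext` here are illustrative functions, not the ISD nodes themselves.

```python
def vzext_movl(vec):
    """Keep element 0 and zero every other element; the type is unchanged."""
    return [vec[0]] + [0] * (len(vec) - 1)

def vzext(vec, in_bits, out_elts):
    """Zero-extend the low `out_elts` elements, discarding higher elements."""
    mask = (1 << in_bits) - 1
    # Zero extension keeps the unsigned value of each narrow element.
    return [v & mask for v in vec[:out_elts]]
```

With these definitions, `vzext(bytes_of_v16i8, 8, 4)` corresponds to the `(v4i32 (vzext (v16i8 ...)))` example: only the low 4 bytes survive, each widened without sign propagation.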

rdar://problem/15981990

llvm-svn: 200918
2014-02-06 09:54:51 +00:00
Puyan Lotfi 5eb1004889 The following patch's purpose is to reduce compile time for compilation of small
programs on targets with large register files. The root of the compile time
overhead was in the use of llvm::SmallVector to hold PhysRegEntries, which
resulted in slow-down from calling llvm::SmallVector::assign(N, 0). In contrast
std::vector uses the faster __platform_bzero to zero out primitive buffers when
assign is called, while SmallVector uses an iterator.

The fix for this was simply to replace the SmallVector with a dynamically
allocated buffer and to initialize or reinitialize the buffer based on the
total registers that the target architecture requires. The changes support
cases where a pass manager may be reused for different targets, and note that
the PhysRegEntries is allocated using calloc mainly for good form, and also to
quiet tools like Valgrind (see comments for more info on this).

There is an rdar to track the fact that SmallVector doesn't have platform
specific speedup optimizations inside of it for things like this, and I'll
create a bugzilla entry at some point soon as well.

TL;DR: This fix replaces the expensive llvm::SmallVector<unsigned
char>::assign(N, 0) with a call to calloc for N bytes which is much faster
because SmallVector's assign uses iterators.

llvm-svn: 200917
2014-02-06 09:23:24 +00:00
Puyan Lotfi 12ae04bd17 This small change reduces compile time for small programs on targets that have
large register files. The omission of Queries.clear() is perfectly safe because
LiveIntervalUnion::Query doesn't contain any data that needs freeing and
because LiveRegMatrix::runOnFunction happens to reset the OwningArrayPtr
holding Queries every time it is run, so there's no need to zero out the
queries either. Not having to do this for very large numbers of physregs
is a noticeable constant cost reduction in compilation of small programs.

llvm-svn: 200913
2014-02-06 08:42:01 +00:00
Nick Lewycky 993849490e A memcpy out of a fresh alloca is a no-op, delete it. Patch by Patrick Walton!
llvm-svn: 200907
2014-02-06 06:29:19 +00:00
Chandler Carruth d1ba2efb8f [PM] Fix horrible typos that somehow didn't cause a failure in a C++11
build but spectacularly changed behavior of the C++98 build. =]

This shows my one problem with not having unittests -- basic API
expectations aren't well exercised by the integration tests because they
*happen* to not come up, even though they might later. I'll probably add
a basic unittest to complement the integration testing later, but
I wanted to revive the bots.

llvm-svn: 200905
2014-02-06 05:17:02 +00:00
Chandler Carruth bf71a34eb9 [PM] Add a new "lazy" call graph analysis pass for the new pass manager.
The primary motivation for this pass is to separate the call graph
analysis used by the new pass manager's CGSCC pass management from the
existing call graph analysis pass. That analysis pass is (somewhat
unfortunately) over-constrained by the existing CallGraphSCCPassManager
requirements. Those requirements make it *really* hard to cleanly layer
the needed functionality for the new pass manager on top of the existing
analysis.

However, there are also a bunch of things that the pass manager would
specifically benefit from doing differently from the existing call graph
analysis, and this new implementation tries to address several of them:

- Be lazy about scanning function definitions. The existing pass eagerly
  scans the entire module to build the initial graph. This new pass is
  significantly more lazy, and I plan to push this even further to
  maximize locality during CGSCC walks.
- Don't use a single synthetic node to partition functions with an
  indirect call from functions whose address is taken. This node creates
  a huge choke-point which would preclude good parallelization across
  the fanout of the SCC graph when we got to the point of looking at
  such changes to LLVM.
- Use a memory dense and lightweight representation of the call graph
  rather than value handles and tracking call instructions. This will
  require explicit update calls instead of some updates working
  transparently, but should end up being significantly more efficient.
  The explicit update calls ended up being needed in many cases for the
  existing call graph so we don't really lose anything.
- Doesn't explicitly model SCCs and thus doesn't provide an "identity"
  for an SCC which is stable across updates. This is essential for the
  new pass manager to work correctly.
- Only form the graph necessary for traversing all of the functions in
  an SCC friendly order. This is a much simpler graph structure and
  should be more memory dense. It does limit the ways in which it is
  appropriate to use this analysis. I wish I had a better name than
  "call graph". I've commented extensively on this aspect.

This is still very much a WIP, in fact it is really just the initial
bits. But it is about the fourth version of the initial bits that I've
implemented with each of the others running into really frustrating
problems. This looks like it will actually work and I'd like to split the
actual complexity across commits for the sake of my reviewers. =] The
rest of the implementation along with lots of wiring will follow
somewhat more rapidly now that there is a good path forward.

Naturally, this doesn't impact any of the existing optimizer. This code
is specific to the new pass manager.

A bunch of thanks are deserved for the various folks that have helped
with the design of this, especially Nick Lewycky who actually sat with
me to go through the fundamentals of the final version here.

llvm-svn: 200903
2014-02-06 04:37:03 +00:00
Juergen Ributzka fa0eba6c8b [DAG] Don't pull the binary operation through the shift if the operands have opaque constants.
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.

llvm-svn: 200900
2014-02-06 04:09:06 +00:00
Manman Ren d461244972 Set default of inlinecold-threshold to 225.
225 is the default value of inline-threshold. This change will make sure
we have the same inlining behavior as prior to r200886.

As Chandler points out, even though we don't have code in our testing
suite that uses cold attribute, there are larger applications that do
use cold attribute.

r200886 + this commit intend to keep the same behavior as prior to r200886.
We can later on tune the inlinecold-threshold.

The main purpose of r200886 is to help performance of instrumentation based
PGO before we actually hook up inliner with analysis passes such as BPI and BFI.
For instrumentation based PGO, we try to increase inlining of hot functions and
reduce inlining of cold functions by setting inlinecold-threshold.

Another option suggested by Chandler is to use a boolean flag that controls
if we should use OptSizeThreshold for cold functions. The default value
of the boolean flag should not change the current behavior. But it gives us
less freedom in controlling inlining of cold functions.

llvm-svn: 200898
2014-02-06 01:59:22 +00:00
Kevin Enderby d6b107136a Update the X86 assembler for .intel_syntax to accept
the << and >> bitwise operators.

rdar://15975725

llvm-svn: 200896
2014-02-06 01:21:15 +00:00
Rafael Espindola 6a383f9a54 don't set HasReliableSymbolDifference for ELF.
It is only used in MachObjectWriter.cpp. Another leftover from early days
of ELF in MC.

llvm-svn: 200895
2014-02-06 01:06:31 +00:00
Rafael Espindola 12f04984f8 doesSectionRequireSymbols is meaningless on ELF, remove.
This is a nop. doesSectionRequireSymbols is only used from
isSymbolLinkerVisible. The only use of isSymbolLinkerVisible from ELF was in

if (!Asm.isSymbolLinkerVisible(Symbol) && !Symbol.isUndefined())
  return false;

if (Symbol.isTemporary())
  return false;

If the symbol is a temporary this code returns false and it is irrelevant if
we take the first if or not. If the symbol is not a temporary,
Asm.isSymbolLinkerVisible returns true without ever calling
doesSectionRequireSymbols.
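
The argument can be checked by enumeration. The Python model below is a sketch under stated assumptions: `is_symbol_linker_visible` only encodes the behaviour described in this message (non-temporary symbols are visible outright, temporaries consult the section hook), and whatever follows the two checks in the real ELF writer is collapsed to `True`.

```python
from itertools import product

def is_symbol_linker_visible(is_temporary, section_requires_symbols):
    # Assumed model: only temporaries ever reach the section hook.
    if not is_temporary:
        return True
    return section_requires_symbols

def elf_keeps_symbol(is_temporary, is_undefined, section_requires_symbols):
    if (not is_symbol_linker_visible(is_temporary, section_requires_symbols)
            and not is_undefined):
        return False
    if is_temporary:
        return False
    return True  # placeholder for the rest of the original function

def hook_is_irrelevant():
    """True if the hook's return value never changes the outcome."""
    return all(
        elf_keeps_symbol(t, u, True) == elf_keeps_symbol(t, u, False)
        for t, u in product([False, True], repeat=2)
    )
```

Under this model, `hook_is_irrelevant()` holds for all four combinations of temporary and undefined, matching the claim that the removal is a nop.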

This was a horrible leftover from when support for ELF was first added.

llvm-svn: 200894
2014-02-06 00:54:53 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled;
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves; it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Rafael Espindola 4998280fdf Just returning false is the default.
llvm-svn: 200890
2014-02-06 00:03:15 +00:00
Matt Arsenault 1b55dd9a81 Pass address space to allowsUnalignedMemoryAccesses
llvm-svn: 200888
2014-02-05 23:16:05 +00:00
Matt Arsenault 25793a3f22 Add address space argument to allowsUnalignedMemoryAccess.
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
2014-02-05 23:15:53 +00:00
Manman Ren e8781b1a36 Inliner uses a smaller inline threshold for callees with cold attribute.
Added command line option inlinecold-threshold to set threshold for inlining
functions with cold attribute. Listen to the cold attribute when it would
decrease the inline threshold.

llvm-svn: 200886
2014-02-05 22:53:44 +00:00
Quentin Colombet 87769713cf [RegAlloc] Add a last chance recoloring mechanism when everything else failed to
find a register.

The idea is to choose a color for the variable that cannot be allocated and
recolor its interferences around. Unlike the current register allocation scheme,
it is allowed to change the color of an already assigned (but maybe not
splittable or spillable) live interval while propagating this change to its
neighbors.
In other words, there are two things that may help find an available color:
- Already assigned variables (RS_Done) can be recolored to a different color.
- The recoloring can catch solutions that need to touch more than just
  the neighbors of the current allocated variable.

E.g.,
vA can use {R1, R2    }
vB can use {    R2, R3}
vC can use {R1        }
Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
they all interfere.

vA is assigned R1
vB is assigned R2
vC tries to evict vA but vA is already done.
=> Regular register allocation heuristic fails.

Last chance recoloring kicks in:
vC does as if vA was evicted => vC uses R1.
vC is marked as fixed.
vA needs to find a color.
None are available.
vA cannot evict vC: vC is a fixed virtual register now.
vA does as if vB was evicted => vA uses R2.
vB needs to find a color.
R3 is available.
Recoloring => vC = R1, vA = R2, vB = R3.
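
The walk-through above can be reproduced with a small recursive search. This Python sketch is only an illustration of the idea, not the RegAllocGreedy implementation: the function name, the dictionary-based data layout, and the absence of any depth or interference limits are all assumptions.

```python
def last_chance_recolor(vreg, allowed, assignment, interferes, fixed=frozenset()):
    """Try to give `vreg` a register, recoloring interfering vregs if needed.

    Returns a new assignment dict, or None when no recoloring works.
    """
    for reg in allowed[vreg]:
        # Interfering vregs currently holding `reg` would have to move.
        evicted = [v for v in interferes[vreg] if assignment.get(v) == reg]
        if any(v in fixed for v in evicted):
            continue  # a fixed vreg cannot be evicted
        trial = dict(assignment)
        for v in evicted:
            del trial[v]
        trial[vreg] = reg
        now_fixed = fixed | {vreg}  # vreg keeps `reg` for the rest of this attempt
        ok = True
        for v in evicted:
            result = last_chance_recolor(v, allowed, trial, interferes, now_fixed)
            if result is None:
                ok = False
                break
            trial = result
        if ok:
            return trial
    return None

def example():
    """The vA/vB/vC scenario from the message above."""
    allowed = {"vA": ["R1", "R2"], "vB": ["R2", "R3"], "vC": ["R1"]}
    interferes = {"vA": ["vB", "vC"], "vB": ["vA", "vC"], "vC": ["vA", "vB"]}
    assignment = {"vA": "R1", "vB": "R2"}
    return last_chance_recolor("vC", allowed, assignment, interferes)
```

Each recursion level adds one vreg to `fixed`, so the search depth is bounded by the number of vregs involved.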

<rdar://problem/15947839>

llvm-svn: 200883
2014-02-05 22:13:59 +00:00
Chandler Carruth eedf9fca28 [PM] Don't require analysis results to be const in the new pass manager.
I think this was just over-eagerness on my part. The analysis results
need to often be non-const because they need to (in some cases at least)
be updated by the transformation pass in order to remain correct. It
also makes lazy analyses (a common case) needlessly annoying to write in
order to make their entire state mutable.

llvm-svn: 200881
2014-02-05 21:41:42 +00:00
Rafael Espindola b4eec1daa1 Remove support for not using .loc directives.
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
2014-02-05 18:00:21 +00:00
Rafael Espindola 0bca63a33a Revert "Fix an invalid check for duplicate option categories."
This reverts commit r200853.

It was causing clang/Analysis/checker-plugins.c to crash.

llvm-svn: 200858
2014-02-05 17:49:31 +00:00
Petar Jovanovic 9725016af3 [mips] Add NaCl target and forbid indexed loads and stores for it
This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2690

llvm-svn: 200855
2014-02-05 17:19:30 +00:00
Alexander Kornienko e88421b6f7 Fix an invalid check for duplicate option categories.
Summary:
The check performed in the comparator is invalid, as some STL
implementations enforce strict weak ordering by calling the comparator with the
same value. This check was also in a wrong place: the assertion would only fire
when -help was used. The new check is performed each time the category is
registered (we are not going to have thousands of them, so it's fine to do it in
O(N^2)).
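
A rough Python sketch of checking for duplicates at registration time rather than inside a sort comparator. The class and method names are hypothetical and do not correspond to the CommandLine library's API.

```python
class CategoryRegistry:
    """Toy registry that rejects duplicate category names on registration."""

    def __init__(self):
        self._names = []

    def register(self, name):
        # Linear scan per registration: O(N^2) overall, which is acceptable
        # for a small number of categories.
        if any(existing == name for existing in self._names):
            raise ValueError(f"duplicate option category: {name!r}")
        self._names.append(name)

    def sorted_names(self):
        # Sorting stays a pure ordering, so comparing a value with itself
        # has no side effects.
        return sorted(self._names)
```

The duplicate check now fires on every registration instead of only when a sorted listing is requested.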

Reviewers: jordan_rose

Reviewed By: jordan_rose

CC: cfe-commits, alexmc

Differential Revision: http://llvm-reviews.chandlerc.com/D2699

llvm-svn: 200853
2014-02-05 16:56:37 +00:00
Elena Demikhovsky 0b79be8ab2 AVX-512: optimized icmp -> sext -> icmp pattern
llvm-svn: 200849
2014-02-05 16:17:36 +00:00
Alon Mishne 0394c1e615 Test commit
llvm-svn: 200843
2014-02-05 14:23:18 +00:00
Logan Chien d5c48aa3d3 ARM: Resolve thumb_bl fixup in same MCFragment.
In Thumb1 mode, bl instruction might be selected for branches between
basic blocks in the function if the offset is greater than 2KB.
However, this might cause SEGV because the destination symbol
is not marked as thumb function and the execution mode will be reset
to ARM mode.

Since we are sure that these symbols are in the same data fragment, we
can simply resolve these local symbols, and don't emit any relocation
information for this bl instruction.

llvm-svn: 200842
2014-02-05 14:15:16 +00:00
Elena Demikhovsky a38114c45e AVX-512: fixed a bug in EVEX encoding (the bug appeared after r200624)
llvm-svn: 200837
2014-02-05 13:03:01 +00:00
Michel Danzer 5d26fdfcba R600/SI: Add pattern for zero-extending i1 to i32
Fixes opencl-example if_* tests with radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200830
2014-02-05 09:48:05 +00:00
Kai Nacke 382c140567 ARM: Enable use of relocation type tlsldo in debug info for tls data.
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200826
2014-02-05 07:23:09 +00:00
Craig Topper 7ee163842f Move matching for x86 BMI BLSI/BLSMSK/BLSR instructions to isel patterns instead of DAG combine. This weakens the ability to fold loads with them because we aren't able to match patterns that load the same thing twice. But maybe we should fix that if we care. The peephole optimizer will be able to fold some loads in its absence.
llvm-svn: 200824
2014-02-05 07:09:40 +00:00
Elena Demikhovsky a30e437659 AVX-512: Added intrinsic for cvtph2ps.
Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823
2014-02-05 07:05:03 +00:00
Craig Topper 7ca1d18055 Add CheckChildInteger to ISelMatcher operations. Removes nearly 2000 bytes from X86 matcher table.
llvm-svn: 200821
2014-02-05 05:44:28 +00:00
Rafael Espindola 22fe9c1e88 Use the information provided by getFlags to unify some code in llvm-nm.
It is not clear how much we should try to expose in getFlags. For example,
should there be a SF_Object and a SF_Text?

But for information that is already being exposed, we may as well use it in
llvm-nm.

llvm-svn: 200820
2014-02-05 05:19:19 +00:00
Todd Fiala 4ccfe392ed Fix configure to find arc4random via header files.
ISSUE:

On Ubuntu 12.04 LTS, arc4random is provided by libbsd.so, which is a
transitive dependency of libedit. If a system had libedit on it that
was implemented in terms of libbsd.so, then the arc4random test,
previously implemented as a linker test, would succeed with -ledit.
However, on Ubuntu this would also require a #include <bsd/stdlib.h>.
This caused a build breakage on configure-based Ubuntu 12.04 with
libedit installed.

FIX:

This fix changes configure to test for arc4random by searching for it
in the standard header files. On Ubuntu 12.04, this test now properly
fails to find arc4random as it is not defined in the default header
locations. It also tweaks the #define names to match the output of the
header check command, which is slightly different than the linker
function check #defines.

I tested the following scenarios:

(1) Ubuntu 12.04 without the libedit package [did not find arc4random,
as expected]

(2) Ubuntu 12.04 with libedit package [properly did not find
arc4random, as expected]

(3) Ubuntu 12.04 with most recent libedit, custom built, and not
dependent on libbsd.so [properly did not find arc4random, as
expected].

(4) FreeBSD 10.0B1 [properly found arc4random, as expected]

llvm-svn: 200819
2014-02-05 05:04:36 +00:00
Manman Ren 3762808d05 Fix wording of warning message about invalid debug info.
llvm-svn: 200806
2014-02-04 23:49:02 +00:00
Rafael Espindola 975e115eac Remove unused SF_ThreadLocal.
llvm-svn: 200800
2014-02-04 22:50:47 +00:00
Justin Bogner df82c62fcf llvm-cov: Fix include order in GCOV.cpp
llvm-svn: 200796
2014-02-04 21:03:17 +00:00
Benjamin Kramer 34f460ed29 SimplifyLibCalls: Push TLI through the exp2->ldexp transform.
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
2014-02-04 20:27:23 +00:00
Peter Collingbourne 45b4c4995e Avoid using EL_GETFP.
This should fix the build against old versions of libedit.

llvm-svn: 200794
2014-02-04 20:04:46 +00:00
Lang Hames 3303a339b1 [X86] Only 213 FMA3 variants should be marked commutable.
Commuting the 231 and 132 variants would swap addends and
multiplicands/multipliers, which isn't valid.
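
The point can be illustrated with the operand orderings the form digits encode. In this Python sketch the three functions follow the usual reading of the 132/213/231 names, and "commuting" is assumed to mean swapping the first two source operands; neither assumption is spelled out in the commit itself.

```python
def fma132(src1, src2, src3):
    return src1 * src3 + src2

def fma213(src1, src2, src3):
    return src2 * src1 + src3

def fma231(src1, src2, src3):
    return src2 * src3 + src1

def commutes(form, src1, src2, src3):
    """True if swapping the first two operands leaves the result unchanged."""
    return form(src1, src2, src3) == form(src2, src1, src3)
```

Only in the 213 form are the first two operands both multiplication inputs, so only there is the swap harmless; in the other two forms the swap exchanges an addend with a multiplicand.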

I'm still trying to reduce a decent test case for this.

llvm-svn: 200792
2014-02-04 19:42:47 +00:00
Duncan P. N. Exon Smith 8e661efc00 cleanup: scc_iterator consumers should use isAtEnd
No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

llvm-svn: 200789
2014-02-04 19:19:07 +00:00
Petar Jovanovic a5da588b2f [mips] Implement %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions
Patch implements %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions for MIPS
by creating target expression class MipsMCExpr.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2592

llvm-svn: 200783
2014-02-04 18:41:57 +00:00
Rafael Espindola 7cbbd28c67 Every target uses .align. Simplify.
llvm-svn: 200782
2014-02-04 18:39:51 +00:00
Rafael Espindola 7b51496975 Use the default values.
llvm-svn: 200781
2014-02-04 18:34:04 +00:00
David Peixotto b9b7362cdc Fix PR18345: ldr= pseudo instruction produces incorrect code when used in inline assembly
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777
2014-02-04 17:22:40 +00:00
Tom Stellard aeb456438c R600/SI: Expand i1 BR_CC
This fixes crashes in the OpenCV test suite and also the scrypt
kernel in bfgminer.

I was unable to come up with a reduced test case for this.

https://bugs.freedesktop.org/show_bug.cgi?id=72785

llvm-svn: 200776
2014-02-04 17:18:43 +00:00
Tom Stellard b8725d84d6 R600/SI: Don't assume copies will be coalesced in SIFixSGPRCopies
There is no lit test for this, because it would be too big and
complicated, but it does fix a crash in the Arithm/Absdiff.* OpenCV test.

llvm-svn: 200775
2014-02-04 17:18:42 +00:00
Tom Stellard 0ec134f3d6 R600/SI: Custom lower i64 ISD::SELECT
llvm-svn: 200774
2014-02-04 17:18:40 +00:00
Tom Stellard bfebd1fc7e R600: Enable vector fpow.
The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773
2014-02-04 17:18:37 +00:00
Tim Northover 103e648d30 OS X: the correct function is __sincospif_stret, not __sincospi_stretf
rdar://problem/13729466

llvm-svn: 200771
2014-02-04 16:28:20 +00:00
Tim Northover fdbdb4b6d5 ARM & AArch64: merge NEON absolute compare intrinsics
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768
2014-02-04 14:55:42 +00:00
Aaron Ballman 7844073402 Implemented support for Process::GetRandomNumber on Windows.
Patch thanks to Stephan Tolksdorf!

llvm-svn: 200767
2014-02-04 14:49:21 +00:00
Justin Bogner c6af350698 llvm-cov: Implement the preserve-paths flag
Until now, when a path in a gcno file included a directory, we would
emit our .gcov file in that directory, whereas gcov always emits the
file in the current directory. In doing so, this implements gcov's
strange name-mangling -p flag, which is needed to avoid clobbering
files when two with the same name exist in different directories.

The path mangling is a bit ugly and only handles unix-like paths, but
it's simple, and it doesn't make any guesses as to how it should
behave outside of what gcov documents. If we decide this should be
cross platform later, we can consider the compatibility implications
then.
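
For illustration, the kind of mangling gcov documents for -p can be sketched in Python as below. This is an assumption based on gcov's documentation ('/' becomes '#', '.' components are dropped, '..' components become '^'); it is not taken from the llvm-cov source, and `mangle_path` is a hypothetical name.

```python
def mangle_path(path):
    """Flatten a unix-like path into a single file name, gcov -p style."""
    parts = []
    for part in path.split("/"):
        if part == ".":
            continue                      # '.' components are removed
        parts.append("^" if part == ".." else part)
    return "#".join(parts)                # '/' separators become '#'
```

Two sources named `foo.c` in different directories then map to distinct names such as `a#foo.c` and `b#foo.c`, so their output files no longer clobber each other.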

llvm-svn: 200754
2014-02-04 10:45:02 +00:00
Tim Northover e42fb07618 ARM: fix fast-isel assertion failure
Missing braces on if meant we inserted both ARM and Thumb load for a litpool
entry. This didn't end well.

rdar://problem/15959157

llvm-svn: 200752
2014-02-04 10:38:46 +00:00
Michel Danzer 624b02aa67 R600/SI: Fix fneg for 0.0
V_ADD_F32 with source modifier does not produce -0.0 for this. Just
manipulate the sign bit directly instead.
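
Flipping the sign bit directly can be sketched in Python as follows. `fneg_by_sub` is only a generic IEEE-754 illustration of how an arithmetic negation can lose the sign of zero; it is not a model of V_ADD_F32 or of the previous pattern, and both function names are made up for this example.

```python
import math
import struct

def fneg_bits(x):
    """Negate a 32-bit float by flipping its sign bit."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (result,) = struct.unpack("<f", struct.pack("<I", bits ^ 0x80000000))
    return result

def fneg_by_sub(x):
    """Negate by computing 0.0 - x; yields +0.0 when x is +0.0."""
    return 0.0 - x
```

`math.copysign(1.0, value)` can be used to observe the sign of a zero result, since `-0.0 == 0.0` compares equal.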

Also add a pattern for (fneg (fabs ...)).

Fixes a bunch of bit encoding piglit tests with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200743
2014-02-04 07:12:38 +00:00
NAKAMURA Takumi a71003ae10 RegAllocGreedy.cpp: Use more simple value as Hysteresis, to suppress -mfpmath-dependent behavior.
llvm-svn: 200738
2014-02-04 06:29:38 +00:00
Kai Nacke ab7ee461c2 Revert: ARM: Enable use of relocation type tlsldo in debug info for tls data.
There seems to be a new problem with the debug info in the test case.
I'll have to investigate this.

llvm-svn: 200737
2014-02-04 06:07:00 +00:00
Kai Nacke a56bb78021 Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
and remove the ToDo comment.
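
The identity behind the transformation can be checked on a byte buffer. In this Python sketch, `strlen` and `strchr` are simplified stand-ins for the C functions that work on a `bytes` object and integer offsets instead of pointers.

```python
def strlen(buf, p=0):
    """Distance from offset p to the first NUL byte at or after p."""
    return buf.index(0, p) - p

def strchr(buf, p, ch):
    """Offset of the first byte `ch` at or after p, NUL terminator included."""
    end = p + strlen(buf, p)
    for i in range(p, end + 1):
        if buf[i] == ch:
            return i
    return None
```

Searching for the terminator itself always stops at offset `p + strlen(buf, p)`, which is exactly what the rewritten call computes.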

Reviewer: Duncan P.N. Exon Smith
llvm-svn: 200736
2014-02-04 05:55:16 +00:00
Kai Nacke 5e8c30f192 ARM: Enable use of relocation type tlsldo in debug info for tls data.
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200735
2014-02-04 05:43:09 +00:00
David Blaikie 5e390e4df7 DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0}
A bunch of test cases needed to be cleaned up for this, many my fault -
when implementing imported modules I updated test cases by simply
duplicating the prior metadata field - which wasn't always the empty
metadata entry.

llvm-svn: 200731
2014-02-04 01:23:52 +00:00
Nick Lewycky 00703e76dc Self-memcpy-elision and memcpy of constant byte to memset transforms don't care how many bytes you were trying to transfer. Sink that safety test after those transforms. Noticed by inspection.
llvm-svn: 200726
2014-02-04 00:18:54 +00:00
David Blaikie 2c7a2684f2 DIBuilder: simplify array generation to produce true zero-length arrays
For some anachronistic reason we were producing {i32 0} for zero-length
debug info arrays.

(this change is paired with a Clang change and may cause temporary
buildbot noise)

Let's not.

llvm-svn: 200721
2014-02-03 23:08:54 +00:00
Matt Arsenault d5ab971b54 Add DEBUG_TYPE to SIAnnotateControlFlow
llvm-svn: 200720
2014-02-03 22:58:05 +00:00
Reid Kleckner d47a59a4f8 inalloca: Don't remove dead arguments in the presence of inalloca args
It disturbs the layout of the parameters in memory and registers,
leading to problems in the backend.

The plan for optimizing internal inalloca functions going forward is to
essentially SROA the argument memory and demote any captured arguments
(things that aren't trivially written by a load or store) to an indirect
pointer to a static alloca.

llvm-svn: 200717
2014-02-03 20:42:49 +00:00
Tim Northover 24979d8e10 AArch64 & ARM: refactor crypto intrinsics to take scalars
Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.

This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.

llvm-svn: 200706
2014-02-03 17:27:49 +00:00
Hal Finkel 5c968d9440 Expand vector bswap in LegalizeVectorOps
ISD::BSWAP was missing from the list of node types that should be expanded
element-wise.

llvm-svn: 200705
2014-02-03 17:27:25 +00:00
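To illustrate what "expanded element-wise" means in the vector bswap commit above, here is a minimal standalone sketch. It is not the LegalizeVectorOps code: `Vec4`, `bswap32`, and `bswapElementwise` are hypothetical names, and a `std::array` stands in for a `<4 x i32>` value.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Reverse the byte order of a single 32-bit lane.
inline std::uint32_t bswap32(std::uint32_t V) {
  return ((V & 0x000000FFu) << 24) | ((V & 0x0000FF00u) << 8) |
         ((V & 0x00FF0000u) >> 8) | ((V & 0xFF000000u) >> 24);
}

// Hypothetical stand-in for a <4 x i32> vector value.
using Vec4 = std::array<std::uint32_t, 4>;

// Element-wise expansion: when no vector-wide byte swap is available,
// apply the scalar operation to each lane and rebuild the vector.
inline Vec4 bswapElementwise(const Vec4 &In) {
  Vec4 Out{};
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = bswap32(In[I]);
  return Out;
}
```

Each lane is swapped independently, so the lane order is preserved and only the bytes inside each element are reversed.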
Aaron Ballman 42f6622b28 Undef'ing _WIN32_IE to silence an MSVC warning about redefining a macro value.
No functional change intended.

llvm-svn: 200704
2014-02-03 17:20:26 +00:00
Chandler Carruth 173bd7ed2e Rename the non-templated base class of SmallPtrSet to
'SmallPtrSetImplBase'. This more closely matches the organization of
SmallVector and should allow introducing a SmallPtrSetImpl which serves
the same purpose as SmallVectorImpl: isolating the element type from the
particular small size chosen. This in turn allows a lot of
simplification of APIs by not coding them against a specific small size
which is rarely needed.

llvm-svn: 200687
2014-02-03 11:24:18 +00:00
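The "isolating the element type from the particular small size" idea in the SmallPtrSet commit above can be sketched with a toy container. This is only an illustration of the pattern under simplifying assumptions: `TinyVecImpl`, `TinyVec`, and `fillSquares` are hypothetical names, and unlike LLVM's SmallVector the toy never grows past its inline capacity.

```cpp
#include <cstddef>

// Size-erased interface: knows the element type T, but not how many
// elements the concrete container stores inline.
template <typename T> class TinyVecImpl {
  T *Data;
  std::size_t Size = 0;
  std::size_t Capacity;

protected:
  TinyVecImpl(T *Storage, std::size_t Cap) : Data(Storage), Capacity(Cap) {}

public:
  TinyVecImpl(const TinyVecImpl &) = delete;
  TinyVecImpl &operator=(const TinyVecImpl &) = delete;

  // Returns false instead of growing; a deliberate simplification.
  bool push_back(const T &V) {
    if (Size == Capacity)
      return false;
    Data[Size++] = V;
    return true;
  }
  std::size_t size() const { return Size; }
  const T &operator[](std::size_t I) const { return Data[I]; }
};

// Concrete container: the inline capacity N appears only here.
template <typename T, std::size_t N> class TinyVec : public TinyVecImpl<T> {
  T Inline[N];

public:
  TinyVec() : TinyVecImpl<T>(Inline, N) {}
};

// An API coded against the Impl type accepts any inline capacity.
inline void fillSquares(TinyVecImpl<int> &Out, int Count) {
  for (int I = 0; I < Count; ++I)
    Out.push_back(I * I);
}
```

A caller can pass a `TinyVec<int, 4>` or a `TinyVec<int, 8>` to `fillSquares` without the function being templated on the size, which is the API simplification the commit message refers to.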
Craig Topper e7a9ee5c4a Remove unnecessary include of AArch64GenInstrInfo.inc from AArch64Disassembler.cpp. None of the GET_ defines were set that would make the include do anything.
llvm-svn: 200677
2014-02-03 06:33:17 +00:00
Duncan P. N. Exon Smith 1ff08e389f Lower llvm.expect intrinsic correctly for i1
LowerExpectIntrinsic previously only understood the idiom of an expect
intrinsic followed by a comparison with zero. For llvm.expect.i1, the
comparison would be stripped by the early-cse pass.

Patch by Daniel Micay.

llvm-svn: 200664
2014-02-02 22:43:55 +00:00
Joerg Sonnenberger 4455ffc4d0 Unaligned access is supported on ARMv6 and ARMv7 for the NetBSD target.
Patch from Matt Thomas.

llvm-svn: 200654
2014-02-02 21:18:36 +00:00
Craig Topper fa6298a162 Merge x86 HasOpSizePrefix/HasOpSize16Prefix into a 2-bit OpSize field with 0 meaning no 0x66 prefix in any mode. Rename Opsize16->OpSize32 and OpSize->OpSize16. The classes now refer to their operand size rather than the mode in which they need a 0x66 prefix. Hopefully can merge REX_W into this as OpSize64.
llvm-svn: 200626
2014-02-02 09:25:09 +00:00
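As a rough sketch of the 2-bit OpSize field described in the commit above: the field records the operand size, and whether a 0x66 prefix is needed then follows from the current mode. The shift position, enumerator values other than 0, and the helper names below are assumptions made for illustration, not the actual TSFlags layout.

```cpp
#include <cstdint>

// 0 means "no 0x66 prefix in any mode", as in the commit message.
// The other two values are illustrative.
enum OpSizeKind : std::uint64_t { OpSizeFixed = 0, OpSize16 = 1, OpSize32 = 2 };

// Hypothetical position of the 2-bit field inside a 64-bit flags word.
constexpr unsigned OpSizeShift = 7;
constexpr std::uint64_t OpSizeMask = std::uint64_t(0x3) << OpSizeShift;

// Store the operand-size kind into the flags word.
inline std::uint64_t withOpSize(std::uint64_t Flags, OpSizeKind K) {
  return (Flags & ~OpSizeMask) | (static_cast<std::uint64_t>(K) << OpSizeShift);
}

// Read the operand-size kind back out.
inline OpSizeKind getOpSize(std::uint64_t Flags) {
  return static_cast<OpSizeKind>((Flags & OpSizeMask) >> OpSizeShift);
}

// A 16-bit operand needs 0x66 outside 16-bit mode; a 32-bit operand
// needs it inside 16-bit mode; size-agnostic instructions never do.
inline bool needs66Prefix(std::uint64_t Flags, bool Is16BitMode) {
  switch (getOpSize(Flags)) {
  case OpSize16:
    return !Is16BitMode;
  case OpSize32:
    return Is16BitMode;
  default:
    return false;
  }
}
```

Naming the classes after the operand size keeps the mode-dependent decision in one place, rather than encoding "needs a prefix in mode X" in the instruction definition itself.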
Craig Topper d402df3ce8 Merge HasVEXPrefix/HasEVEXPrefix/HasXOPPrefix into a 2-bit 'encoding' field in TSFlags.
llvm-svn: 200624
2014-02-02 07:08:01 +00:00
Hal Finkel a7bbaf6de6 Replace PPC instruction-size code with MCInstrDesc getSize
As part of the cleanup done to enable the disassembler, the PPC instructions
now have a valid Size description field. This can now be used to replace some
custom logic in a few places to compute instruction sizes.

Patch by David Wiberg!

llvm-svn: 200623
2014-02-02 06:12:27 +00:00
Arnold Schwaighofer 17455633c7 LoopVectorizer: Enable unrolling of conditional stores and the load/store
unrolling heuristic by default

Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed
up some benchmarks while not causing any interesting regressions.

llvm-svn: 200621
2014-02-02 03:12:34 +00:00
Matt Arsenault f5958dded4 R600/SI: Fix insertelement with dynamic indices.
This didn't work for any integer vectors, and didn't
work with some sizes of float vectors. This should now
work with all sizes of float and i32 vectors.

llvm-svn: 200619
2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju 52b6473d74 [Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly.
llvm-svn: 200617
2014-02-01 18:54:16 +00:00
Arnold Schwaighofer 445f7fb064 ARMTTI: We don't have 16 allocatable scalar registers
This caused a regression on libquantum after enabling the new loop vectorizer
unroll heuristics.

llvm-svn: 200616
2014-02-01 18:00:25 +00:00
David Woodhouse 6c9a6f9b3d MC: Fix .octa output for APInts with BitWidth > 128
llvm-svn: 200615
2014-02-01 16:52:33 +00:00
David Woodhouse d6de0d99c5 MC: Add support for .octa
This is a minimal implementation which accepts only constants rather than
full expressions, but that should be perfectly sufficient for all known
users for now.

Patch from PaX Team <pageexec@freemail.hu>

llvm-svn: 200614
2014-02-01 16:20:59 +00:00
David Woodhouse f42a666250 MC: Add AsmLexer::BigNum token for integers greater than 64 bits
This will be needed for .octa support, but we don't want to just use the
existing AsmLexer::Integer for it and then have to litter all its users
with explicit checks for the size, and make them use the new getAPIntVal()
method.

So let the lexer produce an AsmLexer::Integer as before for numbers which
are small enough — which appears to cover what was previously a nasty
special case handling of numbers which don't fit in int64_t but *do* fit
in uint64_t.

Where the number is too large even for that, produce an AsmLexer::BigNum
instead. We do nothing with these except complain about them for now,
but that will be changed shortly...

Based on a patch from PaX Team <pageexec@freemail.hu>

llvm-svn: 200613
2014-02-01 16:20:54 +00:00
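The Integer/BigNum split described in the AsmLexer commit above can be sketched as follows. This is a simplified illustration, not the lexer's code: `TokKind` and `classifyHex` are hypothetical names, and only the hexadecimal digits after a `0x` prefix are handled.

```cpp
#include <cstdint>
#include <string>

// Hypothetical token kinds mirroring the split in the commit message.
enum class TokKind { Integer, BigNum, Error };

// Classify a hexadecimal literal body (the digits after "0x").
// Anything that fits in uint64_t is an Integer, including values that
// do not fit in int64_t; anything wider is a BigNum.
// Value is only meaningful when the result is TokKind::Integer.
inline TokKind classifyHex(const std::string &Digits, std::uint64_t &Value) {
  if (Digits.empty())
    return TokKind::Error;
  Value = 0;
  bool Overflow = false;
  for (char C : Digits) {
    unsigned D;
    if (C >= '0' && C <= '9')
      D = static_cast<unsigned>(C - '0');
    else if (C >= 'a' && C <= 'f')
      D = static_cast<unsigned>(C - 'a') + 10;
    else if (C >= 'A' && C <= 'F')
      D = static_cast<unsigned>(C - 'A') + 10;
    else
      return TokKind::Error;
    // If the top nibble is already occupied, shifting left by four
    // bits would drop significant bits: the literal exceeds 64 bits.
    if (Value >> 60)
      Overflow = true;
    Value = (Value << 4) | D;
  }
  return Overflow ? TokKind::BigNum : TokKind::Integer;
}
```

Callers that only understand 64-bit values can keep handling `Integer` tokens unchanged and reject `BigNum`, while a directive such as `.octa` can opt in to the wider token.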