Commit Graph

30534 Commits

Author SHA1 Message Date
Igor Breger abe4a79b75 AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D10430

llvm-svn: 239694
2015-06-14 12:44:55 +00:00
Colin LeMahieu b8575b14be [Hexagon] Adding some codegen tests and updating some to match spec.
llvm-svn: 239690
2015-06-13 21:46:39 +00:00
Simon Pilgrim d3f6427446 [DAGCombiner] Added BSWAP(BSWAP(x)) -> x combine pattern.
llvm-svn: 239682
2015-06-13 16:25:12 +00:00
Simon Pilgrim 011381d48b [DAGCombiner] Added BSWAP vector constant folding support.
llvm-svn: 239675
2015-06-13 14:08:15 +00:00
Tom Stellard 45bb48ea19 R600 -> AMDGPU rename
llvm-svn: 239657
2015-06-13 03:28:10 +00:00
Tim Northover 02cfdbb7f1 AArch64: map bare-metal arm64-macho triple to MachO MC layer.
Far better than an assertion about expecting ELF.

llvm-svn: 239647
2015-06-12 23:37:11 +00:00
Tom Stellard 12a1910e87 R600/SI: Add assembler support for FLAT instructions
- Add glc, slc, and tfe operands to flat instructions
- Add missing flat instructions
- Fix the encoding of flat_load_dwordx3 and flat_store_dwordx3.

llvm-svn: 239637
2015-06-12 20:47:06 +00:00
Colin LeMahieu 79ec06525e [Hexagon] Making intrinsic tests agnostic to register allocation. Narrowing intrinsic parameters to appropriate width.
llvm-svn: 239634
2015-06-12 19:57:32 +00:00
Rafael Espindola de28b7375f Don't depend on the interleaving of stdout and stderr.
That can change as we change the buffering.

llvm-svn: 239602
2015-06-12 12:20:03 +00:00
John Brawn d9e39d53b6 [ARM] Disabling vfp4 should disable fp16
ARMTargetParser::getFPUFeatures should disable fp16 whenever it
disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none
leaves us with fp16 enabled (though the only effect that will have is
a wrong build attribute).

Differential Revision: http://reviews.llvm.org/D10397

llvm-svn: 239599
2015-06-12 09:38:51 +00:00
Peter Collingbourne 005354b1f4 LowerBitSets: Give names to aliases of unnamed bitset element objects.
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.

llvm-svn: 239590
2015-06-12 03:25:05 +00:00
Alexey Samsonov 9947e48cd1 [GVN] Use a simpler form of IRBuilder constructor.
Summary:
A side effect of this change is that it IRBuilder now automatically
created debug info locations for new instructions, which is the
same as debug location of insertion point. This is fine for the
functions in questions (GetStoreValueForLoad and
GetMemInstValueForLoad), as they are used in two situations:
  * GVN::processLoad, which tries to eliminate a load. In this case
    new instructions would have the same debug location as the load they
    eventually replace;
  * MaterializeAdjustedValue, which adds new instructions to the end
    of the basic blocks, which could later be used to replace the load
    definition. In this case we don't yet know the way the load would
    be eventually replaced (either by assembling the precomputed values
    via PHI, or by using them directly), so just using the basic block
    strategy seems to be reasonable. There is also a special case
    in the code that *would* adjust the location of the last
    instruction replacing the load definition to the location of the
    load.

Test Plan: regression test suite

Reviewers: echristo, dberlin, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10405

llvm-svn: 239585
2015-06-12 01:39:48 +00:00
Reid Kleckner 81d1cc00b7 [WinEH] Put finally pointers in the handler scope table field
We were putting them in the filter field, which is correct for 64-bit
but wrong for 32-bit.

Also switch the order of scope table entry emission so outermost entries
are emitted first, and fix an obvious state assignment bug.

llvm-svn: 239574
2015-06-11 23:37:18 +00:00
Reid Kleckner a9d6253572 [WinEH] Create an llvm.x86.seh.exceptioninfo intrinsic
This intrinsic is like framerecover plus a load. It recovers the EH
registration stack allocation from the parent frame and loads the
exception information field out of it, giving back a pointer to an
EXCEPTION_POINTERS struct. It's designed for clang to use in SEH filter
expressions instead of accessing the EXCEPTION_POINTERS parameter that
is available on x64.

This required a minor change to MC to allow defining a label variable to
another absolute framerecover label variable.

llvm-svn: 239567
2015-06-11 22:32:23 +00:00
Peter Collingbourne 82e657b509 Object: Prepend __imp_ when mangling a dllimport symbol in IRObjectFile.
We cannot prepend __imp_ in the IR mangler because a function reference may
be emitted unmangled in a constant initializer. The linker is expected to
resolve such references to thunks. This is covered by the new test case.

Strictly speaking we ought to emit two undefined symbols, one with __imp_ and
one without, as we cannot know which symbol the final object file will refer
to. However, this would require rather intrusive changes to IRObjectFile,
and lld works fine without it for now.

This reimplements r239437, which was reverted in r239502.

Differential Revision: http://reviews.llvm.org/D10400

llvm-svn: 239560
2015-06-11 21:42:18 +00:00
Alexey Samsonov 770f65ca6a Set proper debug location for branch added in BasicBlock::splitBasicBlock().
This improves debug locations in passes that do a lot of basic block
transformations. Important case is LoopUnroll pass, the test for correct
debug locations accompanies this change.

Test Plan: regression test suite

Reviewers: dblaikie, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10367

llvm-svn: 239551
2015-06-11 18:25:54 +00:00
Rafael Espindola 65d37e64a9 This reverts commit r239529 and r239514.
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."

The  test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.

llvm-svn: 239544
2015-06-11 17:30:33 +00:00
Reid Kleckner 2691c59e97 Revert "Fix merges of non-zero vector stores"
This reverts commit r239539.

It was causing SDAG assertions while building freetype.

llvm-svn: 239543
2015-06-11 17:25:24 +00:00
Matt Arsenault 91f90e694f SLSR: Pass address space to isLegalAddressingMode
This only updates one of the uses. The other is used in cases
that may never touch memory, so I'm not sure why this is even
calling it. Perhaps there should be a new, similar hook for such
cases or pass -1 for unknown address space.

llvm-svn: 239540
2015-06-11 16:13:39 +00:00
Matt Arsenault e23a063dc3 Fix merges of non-zero vector stores
Now actually stores the non-zero constant instead of 0.
I somehow forgot to include this part of r238108.

The test change was just an independent instruction order swap,
so just add another check line to satisfy CHECK-NEXT.

llvm-svn: 239539
2015-06-11 16:03:52 +00:00
Tom Stellard 53e015f37d R600/SI: Add -mcpu=bonaire to a test that uses flat address space
Flat instructions don't exist on SI, but there is a bug in the backend that
allows them to be selected.

llvm-svn: 239533
2015-06-11 14:51:46 +00:00
Toma Tabacu e1e460dbc5 Recommit "[mips] [IAS] Add support for BNE and BEQ with an immediate operand." (r239396).
Apparently, Arcanist didn't include some of my local changes in my previous
commit attempt.

llvm-svn: 239523
2015-06-11 10:36:10 +00:00
Zoran Jovanovic cdfcbe41f2 [mips][microMIPS] Implement ERET and ERETNC instructions
http://reviews.llvm.org/D10091

llvm-svn: 239522
2015-06-11 10:22:46 +00:00
Zoran Jovanovic 6b0dcd7b8c [mips] Change existing uimm10 operand to restrict the accepted immediates
http://reviews.llvm.org/D10312

llvm-svn: 239520
2015-06-11 09:51:58 +00:00
Zoran Jovanovic fcecf26092 [mips][microMIPSr6] Change disassembler tests to one line format
llvm-svn: 239519
2015-06-11 09:42:10 +00:00
Hao Liu 4566d18e89 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true"

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
2015-06-11 09:05:02 +00:00
Simon Pilgrim 5965680d53 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts with several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer which can be compared against zero to then feed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of a i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL but perf tests indicate that this is still beneficial.

The i16 implementation is similar but simpler than for i8 - we have to do 8, 4, 2 & 1 bit shifts but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes however.

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

llvm-svn: 239509
2015-06-11 07:46:37 +00:00
Nemanja Ivanovic ea1db8a697 LLVM support for vector quad bit permute and gather instructions through builtins
This patch corresponds to review:
http://reviews.llvm.org/D10096

This is the back end portion of the patch related to D10095.
The patch adds the instructions and back end intrinsics for:
vbpermq
vgbbd

llvm-svn: 239505
2015-06-11 06:21:25 +00:00
Reid Kleckner c35e7f52ba Revert "Move dllimport name mangling to IR mangler."
This reverts commit r239437.

This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol
directly instead of using it to do an indirect function call.

llvm-svn: 239502
2015-06-11 01:31:48 +00:00
Peter Collingbourne 115fe37621 ArgumentPromotion: Drop sret attribute on functions that are only called directly.
If the first argument to a function is a 'this' argument and the second
has the sret attribute, the ArgumentPromotion pass may promote the 'this'
argument to more than one argument, violating the IR constraint that 'sret'
may only be applied to the first or second argument.

Although this IR constraint is arguably unnecessary, it highlighted the fact
that ArgPromotion does not need to preserve this attribute. Dropping the
attribute reduces register pressure in the backend by avoiding the register
copy required by sret. Because sret implies noalias, we also replace the
former with the latter.

Differential Revision: http://reviews.llvm.org/D10353

llvm-svn: 239488
2015-06-10 21:14:34 +00:00
Sanjay Patel 08829bac81 [x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.

Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.

The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.

This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.

Differential Revision: http://reviews.llvm.org/D10321

llvm-svn: 239486
2015-06-10 20:32:21 +00:00
Reid Kleckner c87a6faba1 [WinEH] _except_handlerN uses 0 instead of 1 to indicate catch-all
Our usage of 1 was a holdover from __C_specific_handler.

llvm-svn: 239482
2015-06-10 18:14:07 +00:00
Alexey Samsonov 89645dfa4d [GVN] Set proper debug locations for some instructions created by GVN.
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.

This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.

Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.

See http://reviews.llvm.org/D10351 review thread for more details.

Test Plan: regression test suite

Reviewers: spatel, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10351

llvm-svn: 239479
2015-06-10 17:37:38 +00:00
Colin LeMahieu 1e9d1d768c [Hexagon] Adding decoders for signed operands and ensuring all signed operand types disassemble correctly.
llvm-svn: 239477
2015-06-10 16:52:32 +00:00
Igor Laevsky 965bf6a3ce [Statepoints] Add test case to check that statepoint is marked with Throwable attribute.
Differential Revision: http://reviews.llvm.org/D10215

llvm-svn: 239473
2015-06-10 13:24:00 +00:00
Igor Laevsky 346ff628f7 [StatepointLowering] Reuse stack slots across basic blocks
During statepoint lowering we can sometimes avoid spilling of the value if we know that it was already spilled for previous statepoint.
We were doing this by checking if incoming statepoint value was lowered into load from stack slot. This was working only in boundaries of one basic block.

But instead of looking at the lowered node we can look directly at the llvm-ir value and if it was gc.relocate (or some simple modification of it) look up stack slot for it's derived pointer and reuse stack slot from it. This allows us to look across basic block boundaries.

Differential Revision: http://reviews.llvm.org/D10251

llvm-svn: 239472
2015-06-10 12:31:53 +00:00
Elena Demikhovsky 00c9ad5ec2 AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give kxnor instruction
cmp neq should give kxor 

https://llvm.org/bugs/show_bug.cgi?id=23631

llvm-svn: 239460
2015-06-10 06:49:28 +00:00
Reid Kleckner 673de15af9 [WinEH] Call llvm.stackrestore in __except blocks
We have to do this manually, the runtime only sets up ebp. Fixes a crash
when returning after catching an exception.

llvm-svn: 239451
2015-06-10 01:34:54 +00:00
Reid Kleckner 2bc93ca846 [WinEH] Emit .safeseh directives for all 32-bit exception handlers
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.

llvm-svn: 239448
2015-06-10 01:02:30 +00:00
NAKAMURA Takumi e6c1b56780 Add explicit -mtriple=arm-unknown to llvm/test/CodeGen/ARM/disable-tail-calls.ll, to satisfy *-win32.
llvm-svn: 239442
2015-06-09 23:33:25 +00:00
Alexey Samsonov b7f02d371f [BasicBlockUtils] Set debug locations for instructions created in SplitBlockPredecessors.
Test Plan: regression test suite

Reviewers: eugenis, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10343

llvm-svn: 239438
2015-06-09 22:10:29 +00:00
Peter Collingbourne 9fe51fdf18 Move dllimport name mangling to IR mangler.
This ensures that LTO clients see the correct external symbol name.

Differential Revision: http://reviews.llvm.org/D10318

llvm-svn: 239437
2015-06-09 22:09:53 +00:00
Jingyue Wu 75589ffcc2 [NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.

This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.

Test Plan: @nested_const_expr and @rauw in access-non-generic.ll

Reviewers: broune, jholewinski

Reviewed By: broune, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10345

llvm-svn: 239435
2015-06-09 21:50:32 +00:00
Peter Collingbourne bc05163f15 LibDriver, llvm-lib: introduce.
llvm-lib is intended to be a lib.exe compatible utility that also
understands bitcode. The implementation lives in a library so that
lld can use it to implement /lib.

Differential Revision: http://reviews.llvm.org/D10297

llvm-svn: 239434
2015-06-09 21:50:22 +00:00
Reid Kleckner f12c030f48 [WinEH] Add 32-bit SEH state table emission prototype
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.

The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.

llvm-svn: 239433
2015-06-09 21:42:19 +00:00
Chad Rosier cf90acc104 [AArch64] Remove an overly conservative check when generating store pairs.
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.

Previously, the read of w1 (see below) prevented the formation of a stp.

        str      w0, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        str     w1, [x2, #4]
        ret

We now generate the following code.

        stp      w0, w1, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.

llvm-svn: 239432
2015-06-09 20:59:41 +00:00
Akira Hatanaka d9699bc7bd Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Arnold Schwaighofer 7e226271a1 MergeFunctions: Don't replace a weak function use by another equivalent weak function
We don't know whether the weak functions definition is the definitive definition.

rdar://21303727

llvm-svn: 239422
2015-06-09 18:19:17 +00:00
David Blaikie 0ebe35b278 Revert "[DWARF] Fix a few corner cases in expression emission"
This reverts commit r239380 due to apparently GDB regressions:
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/22562

llvm-svn: 239420
2015-06-09 18:01:51 +00:00
Samuel Antao cd50135a29 The constant initialization for globals in NVPTX is generated as an
array of bytes. The generation of this byte arrays was expecting 
the host to be little endian, which prevents big endian hosts to be 
used in the generation of the PTX code. This patch fixes the 
problem by changing the way the bytes are extracted so that it 
works for either little and big endian.

llvm-svn: 239412
2015-06-09 16:29:34 +00:00
Toma Tabacu 465acfd13c Recommit "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).
Specified the llvm namespace for the 2 calls to make_unique() which caused
compilation errors in Visual Studio 2013.

llvm-svn: 239405
2015-06-09 13:33:26 +00:00
Elena Demikhovsky 6b62b659cb X86-MPX: Implemented encoding for MPX instructions.
Added encoding tests.

llvm-svn: 239403
2015-06-09 13:02:10 +00:00
Toma Tabacu 7977cfd52a Revert "[mips] [IAS] Add support for BNE and BEQ with an immediate operand." (r239396).
It was breaking buildbots.

llvm-svn: 239397
2015-06-09 10:43:49 +00:00
Toma Tabacu 5fa8fb5762 [mips] [IAS] Add support for BNE and BEQ with an immediate operand.
Summary:
For some branches, GAS accepts an immediate instead of the 2nd register operand.
We only implement this for BNE and BEQ for now. Other branch instructions can be added later, if needed.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D9666

llvm-svn: 239396
2015-06-09 10:34:31 +00:00
NAKAMURA Takumi d93855c82f llvm/test/DebugInfo/X86/expressions.ll: %llc_dwarf shouldn't be used with -mtriple, since %llc_dwarf implies the triple.
In this case, use plain "llc".

llvm-svn: 239390
2015-06-09 08:03:33 +00:00
Keno Fischer 979ae9f1f6 Move X86-only test case to appropriate directory
llvm-svn: 239384
2015-06-09 02:52:47 +00:00
Keno Fischer e34147ce2f [DWARF] Fix a few corner cases in expression emission
Summary: I noticed an object file with `DW_OP_reg4 DW_OP_breg4 0` as a DWARF expression,
which I traced to a missing break (and `++I`) in this code snippet.
While I was at it, I also added support for a few other corner cases
along the same lines that I could think of.

Test Plan: Hand-crafted test case to exercises these cases is included.

Reviewers: echristo, dblaikie, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10302

llvm-svn: 239380
2015-06-09 01:53:59 +00:00
Anna Zaks 119046098a [asan] Prevent __attribute__((annotate)) triggering errors on Darwin
The following code triggers a fatal error in the compiler instrumentation
of ASan on Darwin because we place the attribute into llvm.metadata section,
which does not have the proper MachO section name.

void foo() __attribute__((annotate("custom")));
void foo() {;}

This commit reorders the checks so that we skip everything in llvm.metadata
first. It also removes the hard failure in case the section name does not
parse. That check will be done lower in the compilation pipeline anyway.

(Reviewed in http://reviews.llvm.org/D9093.)

llvm-svn: 239379
2015-06-09 00:58:08 +00:00
Matt Arsenault 705eb8f6b1 Implement computeKnownBits for min/max nodes
llvm-svn: 239378
2015-06-09 00:52:41 +00:00
Jingyue Wu 2e4d1dd0ed [NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.

Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.

Reviewers: eliben, jholewinski

Reviewed By: eliben, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10322

llvm-svn: 239368
2015-06-09 00:05:56 +00:00
Arnold Schwaighofer 0302da614a MergeFunctions: Impose a total order on the replacement of functions
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.

If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.

rdar://21265586

llvm-svn: 239367
2015-06-09 00:03:29 +00:00
Ranjeet Singh 10511a493e [AArch64] AsmParser should be case insensitive about accepting vector register names.
Differential Revision: http://reviews.llvm.org/D10320

llvm-svn: 239353
2015-06-08 21:32:16 +00:00
Rafael Espindola 4b23eb85f9 Fix a regression in .pop_section.
It was calling ChangeSection with the wrong current section, eventually leading
to a crash.

llvm-svn: 239335
2015-06-08 20:08:55 +00:00
Simon Pilgrim 4af289d0f2 [X86][SSE] Added lzcnt vector tests.
llvm-svn: 239333
2015-06-08 19:58:43 +00:00
Akira Hatanaka 4a61619ff5 [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717

llvm-svn: 239325
2015-06-08 18:50:43 +00:00
Matthias Braun 6f8db0e1a7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

llvm-svn: 239309
2015-06-08 16:56:23 +00:00
Colin LeMahieu 6aca6f0be5 [Hexagon] Adding functionality for searching for compound instruction pairs. Compound instructions reduce slot resource requirements freeing those packet slots up for more instructions.
llvm-svn: 239307
2015-06-08 16:34:47 +00:00
Simon Pilgrim 4791f6d89b [DAGCombiner] Added CTLZ vector constant folding support.
llvm-svn: 239305
2015-06-08 16:19:00 +00:00
Javed Absar e1c7dc3ee2 ARM]: Add support for MMFR4_EL1 in assembler
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.

llvm-svn: 239302
2015-06-08 15:01:11 +00:00
Petar Jovanovic cf197f0bde [Mips64][mcjit] Add R_MIPS_PC32 relocation
This patch adds R_MIPS_PC32 relocation for Mips64.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D10235

llvm-svn: 239301
2015-06-08 14:10:23 +00:00
Igor Breger 00d9f8457b AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

llvm-svn: 239300
2015-06-08 14:03:17 +00:00
Artur Pilipenko 7fad7e57e8 Minor refactoring of GEP handling in isDereferenceablePointer
For GEP instructions isDereferenceablePointer checks that all indices are constant and within bounds. Replace this index calculation logic to a call to accumulateConstantOffset. Separated from the http://reviews.llvm.org/D9791

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D9874

llvm-svn: 239299
2015-06-08 11:58:13 +00:00
Silviu Baranga 98a137196a [LAA] Fix estimation of number of memchecks
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
 
Assuming we have w writes and r reads, currently this number is  estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.

This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.

Test Plan:
llvm test suite
spec2k
spec2k6

Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).

Reviewers: anemet

Reviewed By: anemet

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D10217

llvm-svn: 239295
2015-06-08 10:27:06 +00:00
Simon Pilgrim c789e1d57b [DAGCombiner] Added CTTZ vector constant folding support.
llvm-svn: 239293
2015-06-08 09:57:09 +00:00
Hao Liu 32c0539691 [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
       a = A[i];         // load of even element
       b = A[i+1];       // load of odd element
       ...               // operations on a, b, c, d
       A[i] = c;         // store of even element
       A[i+1] = d;       // store of odd element
     }

  The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
     %wide.vec = load <8 x i32>, <8 x i32>* %ptr
     %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
     %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

  The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
     %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> 
     store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. 

llvm-svn: 239291
2015-06-08 06:39:56 +00:00
Hao Liu 751004a67d [LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses.
Differential Revision: http://reviews.llvm.org/D9368

llvm-svn: 239285
2015-06-08 04:48:37 +00:00
Colin LeMahieu 14ec76eb63 [objdump] Moving PrintImmHex out of MachODump and in to llvm-objdump and setting instprinter appropriately.
llvm-svn: 239265
2015-06-07 21:07:17 +00:00
Simon Pilgrim 856fc94a2f [X86] Added tzcnt vector tests.
llvm-svn: 239264
2015-06-07 21:01:34 +00:00
Matt Arsenault e81944fd5e SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode
llvm-svn: 239262
2015-06-07 20:17:44 +00:00
Simon Pilgrim 3a7718038d [X86] Added BitScanForward/BitScanReverse memory folding + tests
llvm-svn: 239257
2015-06-07 18:34:25 +00:00
Simon Pilgrim 19cefe6903 Fixed line endings
llvm-svn: 239253
2015-06-07 16:09:48 +00:00
Simon Pilgrim 68cd237f57 [DAGCombiner] Added CTPOP vector constant folding support.
Added tests to the existing SSE/AVX test files.

llvm-svn: 239252
2015-06-07 15:37:14 +00:00
Colin LeMahieu 229a1e69fc Teaching llvm-mc how to understand the defsym command line option. This allows integer-constant symbols to be defined on the command line and used during assembly.
llvm-svn: 239240
2015-06-07 01:46:24 +00:00
Colin LeMahieu 1c8c213529 [MC] Common symbols weren't being checked for redeclaration which allowed an assembly file to generate an assertion in setCommon(): !isCommon(). This change allows redeclaration as long as the size and alignment match exactly, otherwise report a fatal error.
llvm-svn: 239227
2015-06-06 20:12:40 +00:00
Sanjoy Das ad714b1af3 [LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated.  Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10293

llvm-svn: 239218
2015-06-06 05:24:10 +00:00
David Majnemer 1c297e66fb [CVP] Don't assume Constants of type i1 can be known to be true or false
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt.  Instead,
it gets a ConstantExpr.  It then assumes that the Constant must be equal
to false because it isn't equal to true.

Instead, perform an additional comparison.

This fixes PR23752.

llvm-svn: 239217
2015-06-06 04:56:51 +00:00
David Majnemer 468f670021 [InstCombine] Don't miscompile select to poison
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand.  However, doing so is only valid if the
computation doesn't inject poison into the computation.

It might be helpful to consider the following example:
  (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)

The select is equivalent to (add %i, 1) but not (add nsw %i, 1).

Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.

llvm-svn: 239215
2015-06-06 02:30:43 +00:00
Rafael Espindola f3d49b30b5 Handle 16 bit PC relative relocations.
Fixes pr23771.

llvm-svn: 239214
2015-06-06 02:29:56 +00:00
Frederic Riss 123303122d [dsymutil] Fix misspelled CHECK line.
llvm-svn: 239200
2015-06-05 23:46:18 +00:00
Frederic Riss 5a642079d7 [dsymutil] Add support for linking the debug_frame section.
Linking the debug frame section is actually very easy as we just have to
patch the start address in the FDE header and then copy the rest of the
FDE without even looking at it. The only small complexity comes from the
handling of the CIEs that we should unique across object file. This is
also really easy by using a StringMap keyed on the raw contents of the
CIE.

llvm-svn: 239198
2015-06-05 23:06:11 +00:00
Frederic Riss 4f5874a51f [dsymutil] Have the YAML deserialization rewrite the object address of symbols.
The main use of the YAML debug map format is for testing inside LLVM. If we have IR
files in the tests used to generate object files, then we obviously don't know the
addresses of the symbols inside the object files beforehand.

This change lets the YAML import lookup the addresses in the object files and rewrite
them. This will allow to have test that really don't need any binary input.

llvm-svn: 239189
2015-06-05 21:12:07 +00:00
Renato Golin 3dabb23384 Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least  on ARM, increasing the
compilation time (stage 2, clang's) by 5x.

llvm-svn: 239175
2015-06-05 18:24:12 +00:00
Sanjoy Das 72cb5e1087 [InstCombine] Fix PR23751.
PR23751 was caused by a missing ``break;`` in r234388.

llvm-svn: 239171
2015-06-05 18:04:42 +00:00
Peter Collingbourne 6679fc1a79 Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."
as it caused miscompilations and assertion failures (PR23768,
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).

llvm-svn: 239169
2015-06-05 18:01:28 +00:00
Fiona Glaser 666e352440 DAGCombiner: don't duplicate (fmul x, c) in visitFNEG if fneg is free
For targets with a free fneg, this fold is always a net loss if it
ends up duplicating the multiply, so definitely avoid it.

This might be true for some targets without a free fneg too, but
I'll leave that for future investigation.

llvm-svn: 239167
2015-06-05 17:52:34 +00:00
Chandler Carruth 9dabd14d59 [Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:

- '*Threshold' is the threshold for full unrolling. It is measured
  against the estimated unrolled cost as computed by getUserCost in TTI
  (or CodeMetrics, etc). We will exceed this threshold when unrolling
  loops where unrolling exposes a significant degree of simplification
  of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
  estimated dynamic execution cost which needs to be saved by unrolling
  to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
  unrolling cost when the dynamic savings are expected to be high.

When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.

While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.

This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.

The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.

Differential Revision: http://reviews.llvm.org/D9966

llvm-svn: 239164
2015-06-05 17:01:43 +00:00
Frederic Riss c0866ad2c0 [dsymutil] Handle the -oso-prepend-path option when the input is a YAML debug map
All the tests using a YAML debug map will need this.

llvm-svn: 239163
2015-06-05 16:35:44 +00:00
Alexei Starovoitov 8cf9a4c472 [bpf] rename triple names bpf_be -> bpfeb
llvm-svn: 239162
2015-06-05 16:11:14 +00:00
Colin LeMahieu be8c453d58 [Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing.
llvm-svn: 239161
2015-06-05 16:00:11 +00:00
John Brawn 985c04e8fa [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238

llvm-svn: 239151
2015-06-05 13:31:19 +00:00
Simon Pilgrim 7de557e5a9 [X86][AVX2] Added tests for v32i8 vector shifts
Currently still scalarized, but D9474 should remedy that.

llvm-svn: 239146
2015-06-05 12:35:36 +00:00
Toma Tabacu 399a56d771 Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).
This is breaking the Windows buildbots.

llvm-svn: 239145
2015-06-05 12:19:27 +00:00
Toma Tabacu 89ebf88ff3 [mips] [IAS] Restore STI.FeatureBits in .set pop.
Summary:
Only restoring AvailableFeatures is not enough and will lead to buggy behaviour.
For example, if we have a feature enabled and we ".set pop", the next time we try
to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])"
check will be false, because we didn't restore STI.FeatureBits.

In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits
instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop".
This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits,
but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).

I also moved the updating of AssemblerOptions inside the "if" statement in
setFeatureBits() and clearFeatureBits(), as there is no reason to update if
nothing changes.

Reviewers: dsanders, mkuper

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9156

llvm-svn: 239144
2015-06-05 11:48:54 +00:00
David Majnemer b58f32f7a8 [LoopVectorize] Don't crash on zero-sized types in isInductionPHI
isInductionPHI wants to calculate the stride based on the pointee size.
However, this is not possible when the pointee is zero sized.

This fixes PR23763.

llvm-svn: 239143
2015-06-05 10:52:40 +00:00
Andrea Di Biagio eb33134ce7 Simplify code; NFC.
Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into
CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using
update_llc_test_checks.py.

llvm-svn: 239142
2015-06-05 10:29:55 +00:00
David Majnemer 6d8081835d [InstCombine] Rephrase fix to SimplifyWithOpReplaced
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
   it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
   rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
   the value.  One of the instructions is the original select, causing
   LLVM to never reach fixpoint.

Instead, strip the flags only when we are sure we are going to perform
the simplification.

llvm-svn: 239141
2015-06-05 09:57:57 +00:00
Daniel Jasper 917fa5ee66 Revert "[InstCombine] Don't miscompile safe increment idiom"
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?

Likely, we shouldn't return nullptr?

llvm-svn: 239139
2015-06-05 09:31:20 +00:00
Simon Pilgrim b4c562de87 [X86][SSE] Added tests for i8/i16 vector shifts
Currently still scalarized, but D9474 should remedy that.

llvm-svn: 239136
2015-06-05 08:24:23 +00:00
Alexey Samsonov 5c3ed00eb2 Revert "[Object, ELF] Fix segmentation fault in ELFFile::getSectionName()."
This reverts commit r239124.

llvm-svn: 239125
2015-06-04 23:58:31 +00:00
Alexey Samsonov 24558d5520 [Object, ELF] Fix segmentation fault in ELFFile::getSectionName().
Don't do a null dereference if .shstrtab section is missing.

llvm-svn: 239124
2015-06-04 23:40:23 +00:00
Alexey Samsonov 49179ddba4 [Object, ELF] Don't assert on invalid magic in createELFObjectFile.
Instead, return a proper error code from factory.

llvm-svn: 239116
2015-06-04 23:14:43 +00:00
David Majnemer 00f7d9ecc8 [InstCombine] Don't miscompile safe increment idiom
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison.  However, the other operand cannot
have flags.

This fixes PR23757.

llvm-svn: 239115
2015-06-04 23:11:30 +00:00
Swaroop Sridhar 70d18df18f Statepoint: Fix handling of Far Immediate calls
gc.statepoint intrinsics with a far immediate call target 
were lowered incorrectly as pc-rel32 calls.

This change fixes the problem, and generates an indirect call 
via a scratch register.

For example: 

Intrinsic:
  %safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* inttoptr (i64 140727162896504 to void ()*), i32 0, i32 0, i32 0, i32 0)

Old Incorrect Lowering:
  callq 140727162896504

New Correct Lowering:
  movabsq $140727162896504, %rax 
  callq *%rax

In lowerCallFromStatepoint(), the callee-target was modified and 
represented as a "TargetConstant" node, rather than a "Constant" node.
Undoing this modification enabled LowerCall() to generate the 
correct CALL instruction.

llvm-svn: 239114
2015-06-04 23:03:21 +00:00
Alexey Samsonov 18ad2e54ab [Object, ELF] Don't call llvm_unreachable() from createELFObjectFile.
Instead, return a proper error code from factory.

llvm-svn: 239113
2015-06-04 22:58:25 +00:00
Charles Davis da280728b6 [Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows.
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.

Should fix PR23710.

Test Plan: Added test for the correct behavior based on the case I posted to PR23710.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10258

llvm-svn: 239111
2015-06-04 22:50:05 +00:00
Alexey Samsonov f8a7bf8c6e [Object, MachO] Don't crash on incomplete MachO segment load commands.
Report proper error code from MachOObjectFile constructor if we
can't parse another segment load command (we already return a proper
error if segment load command contents is suspicious).

llvm-svn: 239109
2015-06-04 22:26:44 +00:00
Benjamin Kramer ff0fb6936b [SDAG switch lowering] Fix switch case -> or merging for 0 and INT_MIN
The big/small ordering here is based on signed values so SmallValue will
be INT_MIN and BigValue 0. This shouldn't be a problem but the code
assumed that BigValue always had more bits set than SmallValue.

We used to just miss the transformation, but a recent refactoring of
mine turned this into an assertion failure.

llvm-svn: 239105
2015-06-04 22:05:51 +00:00
Colin LeMahieu 348efdbd36 Shouldn't be XFAIL'ed.
llvm-svn: 239103
2015-06-04 21:49:43 +00:00
Colin LeMahieu c40be85adc Revert r239095 incorrect test tree.
llvm-svn: 239102
2015-06-04 21:32:42 +00:00
Jingyue Wu a2f6027a31 [NVPTX] roll forward r239082
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param

Added an llc test in bug21465.

llvm-svn: 239100
2015-06-04 21:28:26 +00:00
Colin LeMahieu fc52c11d80 [Hexagon] Adding functionality for duplexing. Duplexing is a way to compress commonly used pairs of instructions in order to reduce code size. The test case duplex.ll normally would be 8 bytes, assign register to 0 and jump to link register. After duplexing this is only 4 bytes. This also tests the HexagonMCShuffler code path which is used to make sure duplexed instructions still follow slot requirements.
llvm-svn: 239095
2015-06-04 21:16:16 +00:00
Jingyue Wu b8f38668d5 Revert r239082
llc crashed for NVPTX backend

llvm-svn: 239094
2015-06-04 21:07:08 +00:00
Sergey Dmitrouk 3160d02b5b Erase constant dbgloc on reuse in PHI node
Basic block selection involves checking successor BBs for PHI nodes
that depend on the current BB.  In case such BBs are found, the value
being selected is a constant and such constant already exists in
current BB, it's value is reused.

This might lead to wrong locations in some situations, especially if
same constant value ends up being materialized twice in two different
ways, which discards that sharing and leaves us with wrong debug
location in the successor BB.

In code this involves the following sequence of calls:

 SelectionDAGBuilder::HandlePHINodesInSuccessorBlocks ->
 SelectionDAGBuilder::CopyValueToVirtualRegister ->
 SelectionDAGBuilder::getNonRegisterValue

llvm-svn: 239089
2015-06-04 20:48:40 +00:00
Ahmed Bougacha 8207641251 [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054

llvm-svn: 239087
2015-06-04 20:39:23 +00:00
Jim Grosbach 7c76b4cc6e MC: Remove obsolete MachO UseAggressiveSymbolFolding.
Fix the FIXME and remove this old as(1) compat option. It was useful for
bringup of the integrated assembler to diff object files, but now it's
just causing more relocations than strictly necessary to be generated.

rdar://21201804

llvm-svn: 239084
2015-06-04 20:27:42 +00:00
Jingyue Wu f3a8079b75 [NVPTX] kernel pointer arguments point to the global address space
Summary:
With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.

Minor changes:
1. refactor: extract function convertToPointerInAddrSpace
2. fix a bug in the test case in bug21465.ll

Test Plan: lower-kernel-ptr-arg.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: wengxt, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10154

llvm-svn: 239082
2015-06-04 20:19:38 +00:00
Alexey Samsonov 074da9b5e7 [Object, MachO] Don't crash on invalid MachO segment load commands.
Summary:
Properly report the error in segment load commands from MachOObjectFile
constructor instead of crashing the program.

Adjust the test case accordingly.

Test Plan: regression test suite

Reviewers: rafael, filcab

Subscribers: llvm-commits
llvm-svn: 239081
2015-06-04 20:08:52 +00:00
Alexey Samsonov de5a94a6b4 [Object, MachO] Don't crash on invalid MachO load commands.
Summary:
Currently all load commands are parsed in MachOObjectFile constructor.
If the next load command cannot be parsed, or if command size is too
small, properly report it through the error code and fail to construct
the object, instead of crashing the program.

Test Plan: regression test suite

Reviewers: rafael, filcab

Subscribers: llvm-commits
llvm-svn: 239080
2015-06-04 19:57:46 +00:00
Alexey Samsonov 9f336636fe [Object, MachO] Don't crash on parsing invalid MachO header.
Summary: Instead, properly report this error from MachOObjectFile constructor.

Test Plan: regression test suite

Reviewers: rafael

Subscribers: llvm-commits
llvm-svn: 239078
2015-06-04 19:45:22 +00:00
Alexey Samsonov 3642508921 Fix buildbot failure on Windows by relaxing test expectations.
llvm-svn: 239074
2015-06-04 19:22:00 +00:00
Alexei Starovoitov 310deada10 [bpf] add big- and host- endian support
Summary:
-march=bpf    -> host endian
-march=bpf_le -> little endian
-match=bpf_be -> big endian

Test Plan:
v1 was tested by IBM s390 guys and appears to be working there.
It bit rots too fast here.

Reviewers: chandlerc, tstellarAMD

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10177

llvm-svn: 239071
2015-06-04 19:15:05 +00:00
Andrea Di Biagio 9ac8a6b13d [DAGCombiner] Fix wrong folding of a build_vector into a blend with zero.
Method 'visitBUILD_VECTOR' in the DAGCombiner knows how to combine a
build_vector of a bunch of extract_vector_elt nodes and constant zero nodes
into a shuffle blend with a zero vector.

However, method 'visitBUILD_VECTOR' forgot that a floating point
build_vector may contain negative zero as well as positive zero.

Example:

define <2 x double> @example(<2 x double> %A) {
entry:
  %0 = extractelement <2 x double> %A, i32 0
  %1 = insertelement <2 x double> undef, double %0, i32 0
  %2 = insertelement <2 x double> %1, double -0.0, i32 1
  ret <2 x double> %2
}

Before this patch, llc (with -mattr=+sse4.1) wrongly generated
  movq   %xmm0, %xmm0  # xmm0 = xmm0[0],zero

So, the sign bit of the negative zero was effectively lost.

This patch fixes the problem by adding explicit checks for positive zero.

With this patch, llc produces the following code for the example above:
  movhpd .LCPI0_0(%rip), %xmm0

where .LCPI0_0 referes to a 'double -0'.

llvm-svn: 239070
2015-06-04 19:15:01 +00:00
Alexey Samsonov 2b5fe3f5b2 Make test case more readable: move CHECK-lines next to corresponding RUN-lines.
llvm-svn: 239068
2015-06-04 18:50:04 +00:00
Alexey Samsonov 50d0fbd2b9 llvm-objdump: return non-zero exit code for certain cases of invalid input
* If the input file is missing;
* If the type of input object file can't be recognized;
* If the object file can't be parsed correctly.

llvm-svn: 239065
2015-06-04 18:34:11 +00:00
Matt Arsenault 73e06fa262 R600/SI: Reimplement isLegalAddressingMode
Now that we sometimes know the address space, this can
theoretically do a better job.

This needs better test coverage, but this mostly depends on
first updating the loop optimizatiosn to provide the address
space.

llvm-svn: 239053
2015-06-04 16:17:42 +00:00
Matt Arsenault 81c7ae2bf5 R600/SI: Fix some cases for load / store of half
Mostly argument loads were producing broken zextloads
from an FP type.

llvm-svn: 239049
2015-06-04 16:00:27 +00:00
Hans Wennborg d922915685 Switch lowering: fix assert in buildBitTests (PR23738)
When checking (High - Low + 1).sle(BitWidth), BitWidth would be truncated
to the size of the left-hand side. In the case of this PR, the left-hand
side was i4, so BitWidth=64 got truncated to 0 and the assert failed.

llvm-svn: 239048
2015-06-04 15:55:00 +00:00
Rafael Espindola a401eee22f Omit unused section symbols from the symbol table.
Section symbols exist as an optimization: instead of having multiple relocations
point to different symbols, many of them can point to a single section symbol.

When that optimization is unused, a section symbol is also unused and adds no
extra information to the object file.

This saves a bit of space on the object files and makes the output of
llvm-objdump -t easier to read and consequently some tests get quite a bit
simpler.

llvm-svn: 239045
2015-06-04 15:33:30 +00:00
Rafael Espindola 09e5b1ca76 Move test that depends on x86 to the x86 directory.
llvm-svn: 239043
2015-06-04 15:25:47 +00:00
Rafael Espindola 54a381463e No need to check the raw relocation bytes if checking the parsed dump.
llvm-svn: 239042
2015-06-04 15:21:17 +00:00
Rafael Espindola 20733034a7 llvm-readobj can parse relocations, no need to check the raw bytes.x
llvm-svn: 239041
2015-06-04 15:15:12 +00:00
Rafael Espindola 7884c95c7e Disassemble the start of sections even if there is no symbol there.
We already handled a section with no symbols, extend that to also handle a
section with symbols that don't include the section start.

llvm-svn: 239039
2015-06-04 15:01:05 +00:00
James Molloy 37593732a4 Don't create a MIN/MAX node if the underlying compare has more than one use.
If the compare in a select pattern has another use then it can't be removed, so we'd just
be creating repeated code if we created a min/max node.

Spotted by Matt Arsenault!

llvm-svn: 239037
2015-06-04 13:48:23 +00:00
Elena Demikhovsky 2f1a0dabd0 AVX-512: I brought back vector-shuffle-512-v8.ll test.
I re-generated it after all AVX-512 shuffle optimizations.

llvm-svn: 239026
2015-06-04 07:49:56 +00:00
Igor Breger 8bdcb69413 Test commit
llvm-svn: 239019
2015-06-04 07:23:38 +00:00
David Majnemer 0a99278f7f Make the test introduced in r239015 more targeted.
We don't need to go through LSR to trigger this bug.  Instead,
hand-craft a tricky GEP and get the constant folder to hack on it when
parsing the IR.

llvm-svn: 239017
2015-06-04 07:21:42 +00:00
Elena Demikhovsky 4078c75bd4 AVX-512: added all SKX forms of VPERMW/D/Q instructions.
Added all forms of VPERMPS/PD instrcuctions.
Added encoding tests.

llvm-svn: 239016
2015-06-04 07:07:13 +00:00
David Majnemer 38eb9f46db [ConstantFold] Don't skip the first gep index when folding geps
We neglected to check if the first index made the GEP ineligible for
'inbounds'.

This fixes PR23753.

llvm-svn: 239015
2015-06-04 07:01:56 +00:00
Rafael Espindola af5f51f6a8 Add testcase that would crash before the previous revert.
llvm-svn: 239011
2015-06-04 05:51:13 +00:00
Sanjay Patel 667a7e2a0f make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 239001
2015-06-04 01:32:35 +00:00