Commit Graph

31480 Commits

Author SHA1 Message Date
Sanjay Patel fec7965b36 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244617
2015-08-11 15:56:31 +00:00
John Brawn 863bfdbfb4 [GlobalMerge] Use private linkage for MergedGlobals variables
Other objects can never reference the MergedGlobals symbol so external linkage
is never needed. Using private instead of internal linkage means the object is
more similar to what it looks like when global merging is not enabled, with
the only difference being that the merged variables are addressed indirectly
relative to the start of the section they are in.

Also add aliases for merged variables with internal linkage, as this also makes
the object be more like what it is when they are not merged.

Differential Revision: http://reviews.llvm.org/D11942

llvm-svn: 244615
2015-08-11 15:48:04 +00:00
Mehdi Amini b10555cc61 Fix InstCombine test: invalid CHECK line slipped in r231270
I incorrectly wrote CHECK-NEXT with followin with ':', the check was
ignored by FileCheck.
The non-inbound GEP is folded here because the DataLayout is no longer
optional, the fold was originally guarded with a comment that said:
    We need TD information to know the pointer size unless this is inbounds.
Now we always have "TD information" and perform the fold.

Thanks Jonathan Roelofs for noticing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 244613
2015-08-11 15:31:17 +00:00
Sanjay Patel b5c0c58737 remove unnecessary settings/attributes from test case
llvm-svn: 244612
2015-08-11 15:30:53 +00:00
Sanjay Patel c454f07eb1 delete FIXME comment; it's fixed
llvm-svn: 244605
2015-08-11 14:35:29 +00:00
Sanjay Patel 74ca312666 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244604
2015-08-11 14:31:14 +00:00
Sanjay Patel 52c2691829 add missing test for machine combiner when optimizing for size
The minsize test will be fixed in the next commit.

llvm-svn: 244603
2015-08-11 14:29:45 +00:00
Michael Kuperstein 243c073a2e [X86] Allow merging of immediates within a basic block for code size savings
First step in preventing immediates that occur more than once within a single
basic block from being pulled into their users, in order to prevent unnecessary
large instruction encoding .Currently enabled only when optimizing for size.

Patch by: zia.ansari@intel.com
Differential Revision: http://reviews.llvm.org/D11363

llvm-svn: 244601
2015-08-11 14:10:58 +00:00
James Molloy b7b2a1e9b4 [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:

  - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
  - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!

llvm-svn: 244595
2015-08-11 12:06:37 +00:00
Marina Yatsina 8c997af103 [X86] Add SAL mnemonics for Intel syntax
SAL and SHL instructions perform the same operation

Differential Revision: http://reviews.llvm.org/D11882

llvm-svn: 244588
2015-08-11 12:05:06 +00:00
Marina Yatsina d353c45eaf [X86] Fix REPE, REPZ, REPNZ for intel syntax
REPE, REPZ, REPNZ, REPNE should have mnemonics for Intel syntax as well.
Currently using these instructions causes compilation errors for Intel syntax.

Differential Revision: http://reviews.llvm.org/D11794

llvm-svn: 244584
2015-08-11 11:28:10 +00:00
Marina Yatsina f6bc15d763 [X86] Fix imul alias for intel syntax
The "imul reg, imm" alias is not defined for intel syntax. 
In intel syntax there is no w/l/q suffix for the imul instruction.

Differential Revision: http://reviews.llvm.org/D11887

llvm-svn: 244582
2015-08-11 10:43:04 +00:00
James Molloy 134bec2722 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
Vasileios Kalintiris 1c78ca6a09 [mips] Remap move as or.
Summary:
This patch remaps the assembly idiom 'move' to 'or' instead of 'daddu' or
'addu'. The use of addu/daddu instead of or as move was highlighted as a
performance issue during the analysis of a recent 64bit design. Originally
move was encoded as 'or' by binutils but was changed for the r10k cpu family
due to their pipeline which had 2 arithmetic units and a single logical unit,
and so could issue multiple (d)addu based moves at the same time but only 1
logical move.

This patch preserves the disassembly behaviour so that disassembling a old style
(d)addu move still appears as move, but assembling move always gives an or

Patch by Simon Dardis.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11796

llvm-svn: 244579
2015-08-11 08:56:25 +00:00
Michael Kuperstein 7337ee23d8 [X86] When optimizing for minsize, use POP for small post-call stack clean-up
When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp"
following a call by one or two pops, respectively. We don't try to do it in
general, but only when the stack adjustment immediately follows a call - which
is the most common case.

That allows taking a short-cut when trying to find a free register to pop into,
instead of a full-blown liveness check. If the adjustment immediately follows a
call, then every register the call clobbers but doesn't define should be dead at
that point, and can be used.

Differential Revision: http://reviews.llvm.org/D11749

llvm-svn: 244578
2015-08-11 08:48:48 +00:00
Michael Kuperstein 82814f63c0 Allow PeepholeOptimizer to fold a few more cases
The condition for clearing the folding candidate list was clamped together
with the "uninteresting instruction" condition. This is too conservative,
e.g. we don't need to clear the list when encountering an IMPLICIT_DEF.

Differential Revision: http://reviews.llvm.org/D11591

llvm-svn: 244577
2015-08-11 08:19:43 +00:00
Michael Kuperstein 07f31d92ca [GMR] Be a bit smarter about which globals don't alias when doing recursive lookups
Should hopefully fix the remainder of PR24288.

Differential Revision: http://reviews.llvm.org/D11900

llvm-svn: 244575
2015-08-11 08:06:44 +00:00
Lang Hames 0fd3610e6d [RuntimeDyld][AArch64] Add explicit addends before calling relocationValueRef.
relocationValueRef uses the addend, so it has to be set before the call.

llvm-svn: 244574
2015-08-11 06:27:53 +00:00
Yaron Keren 4988786b0f Enable five passing dsymutil tests on Windows.
These tests pass with Windows 7 x64 + MSYS2. I'll see if the bots like
them as well and disable the failing ones.

llvm-svn: 244572
2015-08-11 06:05:27 +00:00
David Majnemer 85a549dbc8 [IR] Verify EH pad predecessors
Make sure that an EH pad's predecessors are using their unwind edge to
transfer control to the EH pad.

llvm-svn: 244563
2015-08-11 02:48:30 +00:00
JF Bastien ef172fc9f0 WebAssembly: add basic floating-point tests
Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11927

llvm-svn: 244562
2015-08-11 02:45:15 +00:00
Tyler Nowicki c94d6ad241 Print vectorization analysis when loop hint is specified.
This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.

llvm-svn: 244555
2015-08-11 01:09:15 +00:00
JF Bastien e73ce68225 WebAssembly: simply assert on SNaN and NaNs with payloads
Summary: convertToHexString doesn't represent them correctly at this point in time. This is a follow-up to sunfish's suggestion in D11914.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11925

llvm-svn: 244551
2015-08-11 00:49:20 +00:00
Alex Lorenz c483808785 MIR Serialization: Serialize UsedPhysRegMask from the machine register info.
This commit serializes the UsedPhysRegMask register mask from the machine
register information class. The mask is serialized as an inverted
'calleeSavedRegisters' mask to keep the output minimal.

This commit also allows the MIR parser to infer this mask from the register
mask operands if the machine function doesn't specify it.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244548
2015-08-11 00:32:49 +00:00
Kostya Serebryany 2569118621 [libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72)
llvm-svn: 244544
2015-08-11 00:24:39 +00:00
Sanjoy Das 7742b8ba15 Address post-commit review from r243378.
This checks that bork_directive occurs exactly twice in the test output.

llvm-svn: 244543
2015-08-11 00:20:24 +00:00
Alex Lorenz c5d35ba009 MIR Parser: Report an error when a stack object is redefined.
llvm-svn: 244536
2015-08-10 23:50:41 +00:00
Joerg Sonnenberger ebe7bf44ec Add lduw and lwua aliases for SPARCv9.
llvm-svn: 244535
2015-08-10 23:47:22 +00:00
Alex Lorenz 1d9a303142 MIR Parser: Report an error when a fixed stack object is redefined.
llvm-svn: 244534
2015-08-10 23:45:02 +00:00
Joerg Sonnenberger 2ee3d76737 Load/store for float registers from/to alternate space.
llvm-svn: 244532
2015-08-10 23:33:17 +00:00
Alex Lorenz b97c9ef4d0 MIR Serialization: Serialize the liveout register mask machine operands.
llvm-svn: 244529
2015-08-10 23:24:42 +00:00
Sanjay Patel d967a878fa fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244528
2015-08-10 23:07:26 +00:00
Tyler Nowicki 652b0dabe6 Extend late diagnostics to include late test for runtime pointer checks.
This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization.

llvm-svn: 244523
2015-08-10 23:01:55 +00:00
JF Bastien 4a6422562d WebAssembly: print immediates
Summary:
For now output using C99's hexadecimal floating-point representation.

This patch also cleans up how machine operands are printed: instead of special-casing per type of machine instruction, the code now handles operands generically.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11914

llvm-svn: 244520
2015-08-10 22:36:48 +00:00
Joerg Sonnenberger 6dce129051 Add support for the signx instrution alias of SPARCv9.
llvm-svn: 244519
2015-08-10 22:32:25 +00:00
Alex Lorenz e5101e2016 MachineVerifier: Handle the optional def operand in a PATCHPOINT instruction.
The PATCHPOINT instructions have a single optional defined register operand,
but the machine verifier can't verify the optional defined register operands.
This commit makes sure that the machine verifier won't report an error when a
PATCHPOINT instruction doesn't have its optional defined register operand.
This change will allow us to enable the machine verifier for the code
generation tests for the patchpoint intrinsics.

Reviewers: Juergen Ributzka
llvm-svn: 244513
2015-08-10 21:47:36 +00:00
Reid Kleckner c25c7944f0 [llvm-symbolizer] Remove underscores and other C mangling on Windows
Summary:
This makes it so that reports symbolized after the fact with
llvm-symbolizer are more similar to the ones we generate at runtime with
in-process dbghelp.

Reviewers: samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11785

llvm-svn: 244512
2015-08-10 21:47:11 +00:00
Alex Lorenz 2f43dd5a12 StackMap: FastISel: Add an appropriate number of immediate operands to the
frame setup instruction.

This commit ensures that the stack map lowering code in FastISel adds an
appropriate number of immediate operands to the frame setup instruction.

The previous code added just one immediate operand, which was fine for a target
like AArch64, but on X86 the ADJCALLSTACKDOWN64 instruction needs two explicit
operands. This caused the machine verifier to report an error when the old code
added just one.

Reviewers: Juergen Ributzka

Differential Revision: http://reviews.llvm.org/D11853

llvm-svn: 244508
2015-08-10 21:27:03 +00:00
Tyler Nowicki 655e573dc5 Make fp vectorization test X86 specified to avoid cost-model related problems on arm-thumb and hexagon.
llvm-svn: 244505
2015-08-10 21:14:38 +00:00
Rafael Espindola 3db2273861 Add a test showing that objdump (and so ObjectFIle) can handle shndx.
It was already passing, we were just not testing the code.

llvm-svn: 244504
2015-08-10 21:00:15 +00:00
JF Bastien fa9746dc8d x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF.

As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.

I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:

| Time per call (ms)  | Runtime (ms) | Benchmark                      |
| 0.000012514         |      6257    | sete.i386                      |
| 0.000012810         |      6405    | sete.i386-fast                 |
| 0.000010456         |      5228    | sete.x86-64                    |
| 0.000010496         |      5248    | sete.x86-64-fast               |
| 0.000012906         |      6453    | lahf-sahf.i386                 |
| 0.000013236         |      6618    | lahf-sahf.i386-fast            |
| 0.000010580         |      5290    | lahf-sahf.x86-64               |
| 0.000010304         |      5152    | lahf-sahf.x86-64-fast          |
| 0.000028056         |     14028    | pushf-popf.i386                |
| 0.000027160         |     13580    | pushf-popf.i386-fast           |
| 0.000023810         |     11905    | pushf-popf.x86-64              |
| 0.000026468         |     13234    | pushf-popf.x86-64-fast         |

Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose.

Reviewers: rnk, jvoung, t.p.northover

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D6629

llvm-svn: 244503
2015-08-10 20:59:36 +00:00
Sanjay Patel d09391c8cd fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244499
2015-08-10 20:45:44 +00:00
Sanjay Patel 178f8cba51 [x86, SSE]]add missing tests for load folding with partial register update
The minsize case is wrong; that will be fixed in the next commit.

llvm-svn: 244498
2015-08-10 20:34:34 +00:00
Simon Pilgrim a3a72b41de [InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner
As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations.

Differential Revision: http://reviews.llvm.org/D11886

llvm-svn: 244495
2015-08-10 20:21:15 +00:00
Jonathan Roelofs f45295c366 Fix a few more cases of 'CHECK[^:]*$'. NFCI
llvm-svn: 244491
2015-08-10 19:56:39 +00:00
Tyler Nowicki c1a86f5866 Late evaluation of the fast-math vectorization requirement.
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. 

llvm-svn: 244489
2015-08-10 19:51:46 +00:00
Jonathan Roelofs 5dcf157443 Fix another case of 'CHECK[^:]*$'. NFCI
llvm-svn: 244486
2015-08-10 19:22:55 +00:00
Tyler Nowicki 4d62f2e039 Modify diagnostic messages to clearly indicate the why interleaving wasn't done.
Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done.

llvm-svn: 244485
2015-08-10 19:14:16 +00:00
James Y Knight 3994be87de [Sparc] Implement i64 load/store support for 32-bit sparc.
The LDD/STD instructions can load/store a 64bit quantity from/to
memory to/from a consecutive even/odd pair of (32-bit) registers. They
are part of SparcV8, and also present in SparcV9. (Although deprecated
there, as you can store 64bits in one register).

As recommended on llvmdev in the thread "How to enable use of 64bit
load/store for 32bit architecture" from Apr 2015, I've modeled the
64-bit load/store operations as working on a v2i32 type, rather than
making i64 a legal type, but with few legal operations. The latter
does not (currently) work, as there is much code in llvm which assumes
that if i64 is legal, operations like "add" will actually work on it.

The same assumption does not hold for v2i32 -- for vector types, it is
workable to support only load/store, and expand everything else.

This patch:
- Adds a new register class, IntPair, for even/odd pairs of registers.

- Modifies the list of reserved registers, the stack spilling code,
  and register copying code to support the IntPair register class.

- Adds support in AsmParser. (note that in asm text, you write the
  name of the first register of the pair only. So the parser has to
  morph the single register into the equivalent paired register).

- Adds the new instructions themselves (LDD/STD/LDDA/STDA).

- Hooks up the instructions and registers as a vector type v2i32. Adds
  custom legalizer to transform i64 load/stores into v2i32 load/stores
  and bitcasts, so that the new instructions can actually be
  generated, and marks all operations other than load/store on v2i32
  as needing to be expanded.

- Copies the unfortunate SelectInlineAsm hack from ARMISelDAGToDAG.
  This hack undoes the transformation of i64 operands into two
  arbitrarily-allocated separate i32 registers in
  SelectionDAGBuilder. and instead passes them in a single
  IntPair. (Arbitrarily allocated registers are not useful, asm code
  expects to be receiving a pair, which can be passed to ldd/std.)

Also adds a bunch of test cases covering all the bugs I've added along
the way.

Differential Revision: http://reviews.llvm.org/D8713

llvm-svn: 244484
2015-08-10 19:11:39 +00:00
Jonathan Roelofs 49e46ce8e2 Fix a bunch of trivial cases of 'CHECK[^:]*$' in the tests. NFCI
I looked into adding a warning / error for this to FileCheck, but there doesn't
seem to be a good way to avoid it triggering on the instances of it in RUN lines.

llvm-svn: 244481
2015-08-10 19:01:27 +00:00
Mark Heffernan 8939154a22 Add new llvm.loop.unroll.enable metadata.
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time.

The "llvm.loop.unroll.enable" is intended to be added for loops annotated with
"#pragma unroll".

llvm-svn: 244466
2015-08-10 17:28:08 +00:00
Sanjay Patel 10294b59de fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244464
2015-08-10 17:15:17 +00:00
Sanjay Patel 0f12d71b49 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244463
2015-08-10 17:00:44 +00:00
Sanjay Patel 68b0325a9e fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244460
2015-08-10 16:47:47 +00:00
Sanjay Patel 9a9003d94c fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244458
2015-08-10 16:43:20 +00:00
Fraser Cormack e29ab2bfab Prevent the scalarizer from caching incorrect entries
The scalarizer can cache incorrect entries when walking up a chain of
insertelement instructions. This occurs when it encounters more than one
instruction that it is not actively searching for, as it unconditionally caches
every element it finds. The fix is to only cache the first element that it
isn't searching for so we don't overwrite correct entries.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D11559

llvm-svn: 244448
2015-08-10 14:48:47 +00:00
Robert Lougher 11a44b78a3 Trace copies when checking for rematerializability in spill weight calculation
PR24139 contains an analysis of poor register allocation. One of the findings
was that when calculating the spill weight, a rematerializable interval once
split is no longer rematerializable. This is because the isRematerializable
check in CalcSpillWeights.cpp does not follow the copies introduced by live
range splitting (after splitting, the live interval register definition is a
copy which is not rematerializable).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D11686

llvm-svn: 244439
2015-08-10 11:59:44 +00:00
David Majnemer 4232fb3f8d [PHITransAddr] Don't assume that instruction operands are translatable
We can only PHI translate instructions.  In our attempt to PHI translate
a bitcast, we attempt to translate its operand; however, the operand
might be an argument or a global instead of an instruction.  Benignly
bail out when this happens.

This fixes PR24397.

Differential Revision: http://reviews.llvm.org/D11879

llvm-svn: 244418
2015-08-09 15:43:02 +00:00
Sanjay Patel e0178262d4 [x86] enable machine combiner reassociations for 128-bit vector single/double adds
llvm-svn: 244403
2015-08-08 19:08:20 +00:00
Sanjay Patel 0f51d14957 add a missing regression test for a DAGCombiner FDIV optimization
There's no test for this transform in any backend. Discovered
while debugging fast-math-flag propagation in the DAG (r244053).

llvm-svn: 244373
2015-08-07 23:19:41 +00:00
Tom Stellard fd25395c72 AMDGPU: Add pass to lower OpenCL image and sampler arguments.
The pass adds new kernel arguments for image attributes, and
resolves calls to dummy attribute and resource id getter functions.

Patch by: Zoltan Gilian

llvm-svn: 244372
2015-08-07 23:19:30 +00:00
James Y Knight 0cab80c9b3 [SPARC] Disable unsupported ExecutionEngine tests, and XFAIL a couple
of DebugInfo tests.

llvm-svn: 244371
2015-08-07 23:01:16 +00:00
Tom Stellard 8ebad11ee9 AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions
Summary:
With InstAlias, we don't need to print the _e32 portion of the mnemonic
when we print the $dst operand.  This change makes it possible to
include vcc in the asm string when we switch VOPC over to having
implicit vcc defs.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11813

llvm-svn: 244362
2015-08-07 22:00:56 +00:00
Sanjay Patel 8d768b5b3a redo r244360 (tighten checks...) after specifying triple
llvm-svn: 244361
2015-08-07 21:42:24 +00:00
Sanjay Patel 0e1d731631 tighten checks using update_llc_test_checks.py
llvm-svn: 244360
2015-08-07 21:38:53 +00:00
Alex Lorenz 61420f790d MIR Serialization: Serialize the base alignment for the machine memory operands.
llvm-svn: 244357
2015-08-07 20:48:30 +00:00
Alex Lorenz 83127739ff MIR Serialization: Serialize the offsets for the machine memory operands.
llvm-svn: 244356
2015-08-07 20:26:52 +00:00
Matt Arsenault 711b390a7c AMDGPU: Assume SMRD access for constant address space
Since r243294 these are selected to SMRD and
moved later if required.

llvm-svn: 244354
2015-08-07 20:18:34 +00:00
Chen Li eafbc9dc47 [ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst
Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion.

Reviewers: sanjoy, weimingz, chenli

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11841

llvm-svn: 244348
2015-08-07 19:30:12 +00:00
Simon Pilgrim 3815c16bf8 [InstCombine] Fix SSE2/AVX2 vector logical shift by constant
This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes.

e.g.

%1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>)

In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result.

In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821).

Differential Revision: http://reviews.llvm.org/D11760

llvm-svn: 244341
2015-08-07 18:22:50 +00:00
Tom Stellard c8733e805e AMDGPU/SI: Use correct encoding of vopc for VI in the assembler
Summary: We were using the SI encoding for VI.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11812

llvm-svn: 244332
2015-08-07 16:45:33 +00:00
Tom Stellard d37631a8ac AMDGPU/SI: Add VI checks to vop3 assembler tests
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11811

llvm-svn: 244331
2015-08-07 16:45:30 +00:00
Rafael Espindola f7eb882176 add missing tests files
llvm-svn: 244323
2015-08-07 15:35:49 +00:00
Rafael Espindola e01f43bcc1 Add dynamic_table iterators back to ELF.h.
In tree they are only used by llvm-readobj, but it is also used by
https://github.com/mono/CppSharp.

While at it, add some missing error checking.

llvm-svn: 244320
2015-08-07 15:25:20 +00:00
Frederic Riss a5e1453ac3 [dsymutil] Use the new MCDwarfLineTableParams customization to emit linetables
llvm-dsymutil has to be able to process debug info produced by other compilers
which use different line table settings. The testcase wasn't generated by
another compiler, but by a modified clang.

llvm-svn: 244319
2015-08-07 15:14:13 +00:00
Silviu Baranga 3e8e51c1a9 [ARM] Update ReconstructShuffle to handle mismatched types
Summary:
Port the ReconstructShuffle function from AArch64 to ARM
to handle mismatched incoming types in the BUILD_VECTOR
node.

This fixes an outstanding FIXME in the ReconstructShuffle
code.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11720

llvm-svn: 244314
2015-08-07 11:40:46 +00:00
John Brawn 64e5a66794 Revert "Make global aliases have symbol size equal to their type"
This reverts r242520, as it caused pr24379. Also removes part of the test added
by r243874 that checks the size of alias symbols.

llvm-svn: 244313
2015-08-07 10:56:21 +00:00
NAKAMURA Takumi b918a404c3 Tweak llvm/test/tools/dsymutil/arch-option.test to avoid globbing on mingw-w64.
llvm-svn: 244311
2015-08-07 08:38:22 +00:00
JF Bastien 315cc06840 WebAssembly: textual emission uses expected opcode names
Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files.

Reviewers: sunfish

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11776

llvm-svn: 244305
2015-08-07 01:57:03 +00:00
Tom Stellard f594fcad73 ELF: Add AMDGPU specific defintions
Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11458

llvm-svn: 244303
2015-08-07 01:35:24 +00:00
Alex Lorenz cba8c5fe31 MIR Serialization: Fix serialization of unnamed IR block references.
The block address machine operands can reference IR blocks in other functions.
This commit fixes a bug where the references to unnamed IR blocks in other
functions weren't serialized correctly.

llvm-svn: 244299
2015-08-06 23:57:04 +00:00
Juergen Ributzka f09c7a3d0f [AArch64][FastISel] Always use AND before checking the branch flag.
When we are not emitting the condition for the branch, because the condition is
in another BB or SDAG did the selection for us, then we have to mask the flag in
the register with AND.

This is required when the condition comes from a truncate, because SDAG only
truncates down to a legal size of i32.

This fixes rdar://problem/22161062.

llvm-svn: 244291
2015-08-06 22:44:15 +00:00
Juergen Ributzka 9f54dbe7a1 Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types."
This reverts commit r243198 and 243304.

Turns out this wasn't the correct fix for this problem. It works only within
FastISel, but fails when the truncate is selected by SDAG.

llvm-svn: 244287
2015-08-06 22:13:48 +00:00
Sean Silva c2b70bf999 [compatibility.ll] Cover explicitly named comdats.
Patch by Vedant Kumar! <vsk@apple.com>

llvm-svn: 244284
2015-08-06 22:04:21 +00:00
Rafael Espindola 8b3b09fdcf Move to llvm-readobj code that is only used there.
lld might end up using a small part of this, but it will be in a much
refactored form. For now this unblocks avoiding the full section scan in the
ELFFile constructor.

This also has a (very small) error handling improvement.

llvm-svn: 244282
2015-08-06 21:54:37 +00:00
Frederic Riss dc5370b9cc [dsymutil] Implement dSYM bundle creation
A dSYM bundle is a file hierarchy that looks slike this:
 <bundle name>.dSYM/
     Contents/
        Info.plist
        Resources/
           DWARF/
              <DWARF file(s)>

This is the default output mode of dsymutil.

llvm-svn: 244270
2015-08-06 21:05:06 +00:00
Frederic Riss 0948db6064 [dsymutil] Add (unimplemented) --flat option
dsymutil should by default generate dSYM bundles which are filesystem
hierarchies containing the debug info and an additional Info.plist.
Currently llvm-dsymutil emits raw binaries containing the debug info.
This is what we call the 'flat mode'. Add a -f/-flat option that is
supposed to enable that flat mode, but don't wire it for now, only
pass it to the tests that will need it to stay functional once we
do bundle generation by default.
This basically makes this commit NFC and removes the noise from the
actual commit that adds support for bundle generation.

llvm-svn: 244269
2015-08-06 21:05:01 +00:00
Sanjoy Das 366acc175e [IndVars] Fix PR24356.
Unsigned predicates increase or decrease agnostic of the signs of their
increments.

llvm-svn: 244265
2015-08-06 20:43:41 +00:00
Rui Ueyama b9583d22eb Update comments.
llvm-svn: 244259
2015-08-06 20:05:27 +00:00
Tom Stellard 217361c33f AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11604

llvm-svn: 244254
2015-08-06 19:28:38 +00:00
Tom Stellard dee26a2876 AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes
Summary: This allows us to consolidate several of the TableGen patterns.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11602

llvm-svn: 244253
2015-08-06 19:28:30 +00:00
Kit Barton a7bf96ab5c Fix possible infinite loop in shrink wrapping when searching for save/restore
points.

There is an infinite loop that can occur in Shrink Wrapping while searching 
for the Save/Restore points. 

Part of this search checks whether the save/restore points are located in
different loop nests and if so, uses the (post) dominator trees to find the
immediate (post) dominator blocks. However, if the current block does not have
any immediate (post) dominators then this search will result in an infinite
loop. This can occur in code containing an infinite loop.

The modification checks whether the immediate (post) dominator is different from
the current save/restore block. If it is not, then the search terminates and the
current location is not considered as a valid save/restore point for shrink wrapping.

Phabricator: http://reviews.llvm.org/D11607
llvm-svn: 244247
2015-08-06 19:01:57 +00:00
Quentin Colombet 6443cce233 [Reassociation] Fix miscompile for va_arg arguments.
iisUnmovableInstruction() had a list of instructions hardcoded which are
considered unmovable. The list lacked (at least) an entry for the va_arg
and cmpxchg instructions.
Fix this by introducing a new Instruction::mayBeMemoryDependent()
instead of maintaining another instruction list.

Patch by Matthias Braun <matze@braunis.de>.

Differential Revision: http://reviews.llvm.org/D11577

rdar://problem/22118647

llvm-svn: 244244
2015-08-06 18:44:34 +00:00
Alex Lorenz e86d51533d MIR Parser: Report an error when parsing duplicate memory operand flags.
llvm-svn: 244240
2015-08-06 18:26:36 +00:00
Alex Lorenz dc8de2a6b7 MIR Serialization: Serialize the 'invariant' machine memory operand flag.
llvm-svn: 244230
2015-08-06 16:55:53 +00:00
Richard Diamond bd753c9315 Fix an alignment error in `llvm::expandAtomicRMWToCmpXchg` without breaking the build where X86 isn't enabled.
Summary: Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test.

Reviewers: rengolin, jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11804

llvm-svn: 244229
2015-08-06 16:55:03 +00:00
Alex Lorenz 10fd03857f MIR Serialization: Serialize the 'non-temporal' machine memory operand flag.
llvm-svn: 244228
2015-08-06 16:49:30 +00:00
Douglas Katzman 63d64da0ce [SPARC] Don't compare arch name as a string, use the enum instead.
Fixes PR22695

llvm-svn: 244221
2015-08-06 15:44:12 +00:00
Renato Golin a02ac60469 Revert "Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test."
This reverts commit r244155, as it was breaking the buildbots for too long.
Should be reapplied with proper fix.

llvm-svn: 244205
2015-08-06 10:37:59 +00:00
Michael Kuperstein 868dc65444 [X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions.
This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo
instructions with the same (or exactly opposite) conditions get lowered using a single
new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points)
when contiguous CMOVs are being lowered.

Patch by: kevin.b.smith@intel.com
Differential Revision: http://reviews.llvm.org/D11428

llvm-svn: 244202
2015-08-06 08:45:34 +00:00
Peter Collingbourne e834f42073 COFF: Assign the correct symbol type to internal functions.
The COFFSymbolRef::isFunctionDefinition() function tests for several conditions
that are not related to whether a symbol is a function, but rather whether
the symbol meets the requirements for a function definition auxiliary record,
which excludes certain symbols such as internal functions and undefined
references. The test we need to determine the symbol type is much simpler:
we only need to compare the complex type against IMAGE_SYM_DTYPE_FUNCTION.

llvm-svn: 244195
2015-08-06 05:26:35 +00:00
Alex Lorenz 49873a8382 MIR Serialization: Initial serialization of the machine operand target flags.
This commit implements the initial serialization of the machine operand target
flags. It extends the 'TargetInstrInfo' class to add two new methods that help
to provide text based serialization for the target flags.

This commit can serialize only the X86 target flags, and the target flags for
the other targets will be serialized in the follow-up commits.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244185
2015-08-06 00:44:07 +00:00
Frederic Riss b5dce473e3 Revert "Make sure all temporary files get created under %T."
This reverts commit r244163. The workaround shouldn't be necessary
after r244172, and moreover the commit was slightly buggy as it
dis a simple mkdir without removing the directory first, which could
cause 'File exists' errors.

llvm-svn: 244182
2015-08-05 23:53:38 +00:00
Frederic Riss 2a6e44518f [dsymutil] Update source used to generate test binary.
Forgot to include that in the last commit.

llvm-svn: 244171
2015-08-05 23:33:45 +00:00
Bjarke Hammersholt Roune 5cbc7d2999 [NVPTX] Use LDG for pointer induction variables.
More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables.

Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads.

llvm-svn: 244166
2015-08-05 23:11:57 +00:00
Artem Belevich d41611cfdf Make sure all temporary files get created under %T.
llvm-svn: 244163
2015-08-05 22:54:36 +00:00
Frederic Riss ae0d436545 [dsymutil] Add support for the -arch option.
This option allows to select a subset of the architectures when
performing a universal binary link. The filter is done completely
in the mach-o specific part of the code.

llvm-svn: 244160
2015-08-05 22:33:28 +00:00
Reid Kleckner 10cac7a190 Fix Windows test failure with triple instead of using the native OS
llvm-svn: 244159
2015-08-05 22:27:08 +00:00
Reid Kleckner 12d2c12023 If the "CodeView" module flag is set, emit codeview instead of DWARF
Summary:
Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module
flags are set.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11756

llvm-svn: 244158
2015-08-05 22:26:20 +00:00
Alex Lorenz 5672a893e5 MIR Serialization: Serialize the machine operand's offset.
This commit serializes the offset for the following operands: target index,
global address, external symbol, constant pool index, and block address.

llvm-svn: 244157
2015-08-05 22:26:15 +00:00
Richard Diamond 559c1d72a9 Divide the primitive size in bits by eight so the initial load's alignment is in
bytes as expected. Tested with the included unit test.

llvm-svn: 244155
2015-08-05 22:10:57 +00:00
Chen Li 50efd9220a [LoopUnswitch] Preserve make.implicit metadata for unswitched conditions
Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header.

Reviewers: sanjoy, weimingz

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11769

llvm-svn: 244132
2015-08-05 21:13:26 +00:00
JF Bastien 8662083770 x86 atomic: optimize a.store(reg op a.load(acquire), release)
Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations.

Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11382

llvm-svn: 244128
2015-08-05 21:04:59 +00:00
JF Bastien 7c4218f49c Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk."
I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff.

llvm-svn: 244121
2015-08-05 20:53:56 +00:00
JF Bastien ce5256f5c5 Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk.
llvm-svn: 244120
2015-08-05 20:49:46 +00:00
Alex Lorenz 3f2058da16 MIR Parser: Report an error when parsing large immediate operands.
llvm-svn: 244100
2015-08-05 19:03:42 +00:00
Alex Lorenz 05e3882e81 MIR Serialization: Serialize the typed immediate integer machine operands.
llvm-svn: 244098
2015-08-05 18:52:21 +00:00
Frederic Riss 6e3278633d [dsymutil] Fix test patterns.
Depending on the filesystem paths, the YAML dump might quote paths.
Account for that in the regex patterns.

llvm-svn: 244094
2015-08-05 18:45:13 +00:00
Frederic Riss 4dd3e0c41e [dsymutil] Implement support for handling mach-o universal binaries as main input/output.
The DWARF linker isn't touched by this, the implementation links
individual files and merges them together into a fat binary by
calling out to the 'lipo' utility.

The main change is that the MachODebugMapParser can now return
multiple debug maps for a single binary.

The test just verifies that lipo would be invoked correctly, but
doesn't actually generate a binary. This mimics the way clang
tests its external iplatform tools integration.

llvm-svn: 244087
2015-08-05 18:27:44 +00:00
Alex Lorenz 2b3cf19332 MIR Parser: Report an error when parsing duplicate register flags.
llvm-svn: 244081
2015-08-05 18:09:03 +00:00
Chandler Carruth 405e4f9051 [GMR] Teach the conservative path of GMR to catch even more easy cases.
In PR24288 it was pointed out that the easy case of a non-escaping
global and something that *obviously* required an escape sometimes is
hidden behind PHIs (or selects in theory). Because we have this binary
test, we can easily just check that all possible input values satisfy
the requirement. This is done with a (very small) recursion through PHIs
and selects. With this, the specific example from the PR is correctly
folded by GVN.

Differential Revision: http://reviews.llvm.org/D11707

llvm-svn: 244078
2015-08-05 17:58:30 +00:00
Alex Lorenz 01c1a5ee58 MIR Serialization: Serialize the 'early-clobber' register operand flag.
llvm-svn: 244075
2015-08-05 17:49:03 +00:00
Alex Lorenz 9075258b6a MIR Serialization: Serialize the 'debug-use' register operand flag.
llvm-svn: 244071
2015-08-05 17:41:17 +00:00
James Y Knight bce20afe0f [Sparc] Fix disassembly of popc instruction.
And add tests.

Patch by David Wiberg!

llvm-svn: 244064
2015-08-05 17:00:30 +00:00
Steven Wu 9927206f8c Force the MachO generated for Darwin to have VERSION_MIN load command
On Darwin, it is required to stamp the object file with VERSION_MIN load
command. This commit will provide a VERSRION_MIN load command to the
MachO file that doesn't specify the version itself by inferring from
Target Triple.

llvm-svn: 244059
2015-08-05 15:36:38 +00:00
Artyom Skrobov 6fbef2a780 ARMISelDAGToDAG.cpp had this self-contradictory code:
return StringSwitch<int>(Flags)
          .Case("g", 0x1)
          .Case("nzcvq", 0x2)
          .Case("nzcvqg", 0x3)
          .Default(-1);
...

  // The _g and _nzcvqg versions are only valid if the DSP extension is
  // available.
  if (!Subtarget->hasThumb2DSP() && (Mask & 0x2))
    return -1;

ARMARM confirms that the comment is right, and the code was wrong.

llvm-svn: 244029
2015-08-05 11:02:14 +00:00
Simon Pilgrim 42c611b9ae [InstCombine] Added more specific SSE2/AVX2 vector shift tests.
llvm-svn: 244022
2015-08-05 08:21:38 +00:00
Hal Finkel 17caf326e5 [MachineCombiner] Don't use the opcode-only form of computeInstrLatency
In r242277, I updated the MachineCombiner to work with itineraries, but I
missed a call that is scheduling-model-only (the opcode-only form of
computeInstrLatency). Using the form that takes an MI* allows this to work with
itineraries (and should be NFC for subtargets with scheduling models).

llvm-svn: 244020
2015-08-05 07:45:28 +00:00
Hal Finkel 23cdeeea0f [RuntimeDyld] Adapt PPC64 relocations to PPC32
Begin adapting some of the implemented PPC64 relocations for PPC32 (with a
test case).

Patch by Pierre-Andre Saulais!

llvm-svn: 243991
2015-08-04 15:29:00 +00:00
Sanjay Patel 75ced2782b [x86] machine combiner reassociation: mark EFLAGS operand as 'dead'
In the commentary for D11660, I wasn't sure if it was alright to create new
integer machine instructions without also creating the implicit EFLAGS operand. 
From what I can see, the implicit operand is always created by the MachineInstrBuilder
based on the instruction type, so we don't have to do that explicitly. However, in
reviewing the debug output, I noticed that the operand was not marked as 'dead'. 
The machine combiner should do that to preserve future optimization opportunities 
that may be checking for that dead EFLAGS operand themselves.

Differential Revision: http://reviews.llvm.org/D11696

llvm-svn: 243990
2015-08-04 15:21:56 +00:00
Vasileios Kalintiris 044e172228 Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions.
It introduced two regressions on 64-bit big-endian targets running under N32
(MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and
MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets
comparisons such as BEQ compare the whole GPR64 but incorrectly tell the
instruction selector that they operate on GPR32's. This leads to the
elimination of i32->i64 extensions that are actually required by
comparisons to work correctly.

There's currently a patch under review that fixes this problem.

llvm-svn: 243984
2015-08-04 14:26:35 +00:00
Simon Pilgrim d19b9d8229 [InstCombine] Split off SSE2/AVX2 vector shift tests.
These aren't vector demanded bits tests. More tests to follow.

llvm-svn: 243963
2015-08-04 08:05:27 +00:00
Duncan P. N. Exon Smith 706f37e8df Linker: Fix references to uniqued nodes after r243883
r243883 started moving 'distinct' nodes instead of duplicated them in
lib/Linker.  This had the side-effect of sometimes not cloning uniqued
nodes that reference them.  I missed a corner case:

    !named = !{!0}
    !0 = !{!1}
    !1 = distinct !{!0}

!0 is the entry point for "remapping", and a temporary clone (say,
!0-temp) is created and mapped in case we need to model a uniquing
cycle.

    Recursive descent into !1.  !1 is distinct, so we leave it alone,
    but update its operand to !0-temp.

Pop back out to !0.  Its only operand, !1, hasn't changed, so we don't
need to use !0-temp.  !0-temp goes out of scope, and we're finished
remapping, but we're left with:

    !named = !{!0}
    !0 = !{!1}
    !1 = distinct !{null} ; uh oh...

Previously, if !0 and !0-temp ended up with identical operands, then
!0-temp couldn't have been referenced at all.  Now that distinct nodes
don't get duplicated, that assumption is invalid.  We need to
!0-temp->replaceAllUsesWith(!0) before freeing !0-temp.

I found this while running an internal `-flto -g` bootstrap.  Strangely,
there was no case of this in the open source bootstrap I'd done before
commit...

llvm-svn: 243961
2015-08-04 06:42:31 +00:00
Mehdi Amini c8d5783114 Update test suite to make "ninja check" succeed without native backend builtin
Requires "native" feature in most places that were failing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243960
2015-08-04 06:32:54 +00:00
Mehdi Amini 63c3989f6a Move generic MIR tests in their own subdir, requires "native" as well
These tests rely on the native backend to be built-in.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243959
2015-08-04 06:32:45 +00:00
Mehdi Amini fc6b6983ac Improve lit "native" feature to check if the native backend is builtin
The goal is to have 'ninja check' passing even if the X86 backend is
not built.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243958
2015-08-04 06:32:31 +00:00
Saleem Abdulrasool 0a2672bb43 ARM: support windows division routines
This adds the software division routines for the Windows RTABI.  These are not
expected to be used often though as most modern Windows ARM capable targets
support hardware division.  In the case that the target CPU doesnt support
hardware division, this will be the fallback.

llvm-svn: 243952
2015-08-04 03:57:56 +00:00
Sanjoy Das 215df9ed98 Revert "[LSR] Generate and use zero extends"
This reverts commit r243348 and r243357.  They caused PR24347.

llvm-svn: 243939
2015-08-04 01:52:05 +00:00
Ahmed Bougacha e0e12db8c8 [AArch64] Add isel support for f16 indexed LD/ST.
llvm-svn: 243935
2015-08-04 01:29:38 +00:00
Ahmed Bougacha b0ae36f0d1 [AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such.
There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and
is actually already vector-aware; let's use it instead of scalarizing!

The only interesting change is that for v2f32, we previously always used
use v4i32 as the integer vector type.
Use v2i32 instead, and mark FCOPYSIGN as Custom.

llvm-svn: 243926
2015-08-04 00:42:34 +00:00
Ahmed Bougacha f65371a235 [CodeGen] Fix FCOPYSIGN legalization to account for mismatched types.
We used to legalize it like it's any other binary operations.  It's not,
because it accepts mismatched operand types.  Because of that, we used
to hit various asserts and miscompiles.

Specialize vector legalizations to, in the worst case, unroll, or, when
possible, to just legalize the operand that needs legalization.

Scalarization isn't covered, because I can't think of a target where
some but not all of the 1-element vector types are to be scalarized.

llvm-svn: 243924
2015-08-04 00:32:55 +00:00
Alex Lorenz a518b79601 MIR Serialization: Serialize the 'volatile' machine memory operand flag.
llvm-svn: 243923
2015-08-04 00:24:45 +00:00
Alex Lorenz 4af7e610c3 MIR Serialization: Initial serialization of the machine memory operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243915
2015-08-03 23:08:19 +00:00
Justin Bogner 45291391b2 lto: Avoid relying on the environment for this test
It's better to pass libLTO to ld64 via the command line flag than rely
on setting DYLD_LIBRARY_PATH.

llvm-svn: 243911
2015-08-03 22:43:14 +00:00
Chandler Carruth 87adb7a2e2 [Unroll] Improve the brute force loop unroll estimate by propagating
through PHI nodes across iterations.

This patch teaches the new advanced loop unrolling heuristics to propagate
constants into the loop from the preheader and around the backedge after
simulating each iteration. This lets us brute force solve simple recurrances
that aren't modeled effectively by SCEV. It also makes it more clear why we
need to process the loop in-order rather than bottom-up which might otherwise
make much more sense (for example, for DCE).

This came out of an attempt I'm making to develop a principled way to account
for dead code in the unroll estimation. When I implemented
a forward-propagating version of that it produced incorrect results due to
failing to propagate *cost* between loop iterations through the PHI nodes, and
it occured to me we really should at least propagate simplifications across
those edges, and it is quite easy thanks to the loop being in canonical and
LCSSA form.

Differential Revision: http://reviews.llvm.org/D11706

llvm-svn: 243900
2015-08-03 20:32:27 +00:00
Derek Schuff b4c1c28c6e Fix testing for end of stream in bitstream reader.
This fixes a bug found while working on the bitcode reader. In
particular, the method BitstreamReader::AtEndOfStream doesn't always
behave correctly when processing a data streamer. The method
fillCurWord doesn't properly set CurWord/BitsInCurWord if the data
streamer was already at eof, but GetBytes had not yet set the
ObjectSize field of the streaming memory object.

This patch fixes this problem, and provides a test to show that
this problem has been fixed.

Patch by Karl Schimpf.

Differential Revision: http://reviews.llvm.org/D11391

llvm-svn: 243890
2015-08-03 18:01:50 +00:00
Duncan P. N. Exon Smith 55ca964e94 DI: Disallow uniquable DICompileUnits
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode.  This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.

Almost all the testcases were updated with this script:

    git grep -e '= !DICompileUnit' -l -- test |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'

I imagine something similar should work for out-of-tree testcases.

llvm-svn: 243885
2015-08-03 17:26:41 +00:00
Tim Northover 910dde7ab2 ARM: prefer allocating VFP regs at stride 4 on Darwin.
This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.

llvm-svn: 243884
2015-08-03 17:20:10 +00:00
Artur Pilipenko 17376c4e02 Currently string attributes on function arguments/return values can be generated using LLVM API. However they are not supported in parser. So, the following scenario will fail:
* generate function with string attribute using API,
* dump it in LL format,
* try to parse.
Add parser support for string attributes to fix the issue.

Reviewed By: reames, hfinkel

Differential Revision: http://reviews.llvm.org/D11058

llvm-svn: 243877
2015-08-03 14:31:49 +00:00
John Brawn f3324cf1a5 [ARM] Make GlobalMerge merge extern globals by default
Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.

Differential Revision: http://reviews.llvm.org/D10966

llvm-svn: 243874
2015-08-03 12:13:33 +00:00