Commit Graph

81524 Commits

Author SHA1 Message Date
Alex Lorenz 31d706836c MIR Serialization: Serialize the jump table index operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242358
2015-07-15 23:38:35 +00:00
Alex Lorenz 6799e9b3e0 MIR Serialization: Serialize the jump table info.
The jump table info is serialized using a YAML mapping that contains its kind
and a YAML sequence of jump table entries. A jump table entry is a YAML mapping
that has an ID and an inline YAML sequence of machine basic block references.

The testcase 'CodeGen/MIR/X86/jump-table-info.mir' doesn't have any instructions
because one of them contains a jump table index operand. The jump table index
operands will be serialized in a follow up patch, and the appropriate
instructions will be added to this testcase.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242357
2015-07-15 23:31:07 +00:00
Rafael Espindola 57c0525d2c llvm-ar: Don't write the directory in the string table.
We were already doing the right thing for short file names, but not long
ones.

llvm-svn: 242354
2015-07-15 23:15:33 +00:00
Cong Hou ab23bfbc0e Create a wrapper pass for BranchProbabilityInfo.
This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence.

http://reviews.llvm.org/D11241

llvm-svn: 242349
2015-07-15 22:48:29 +00:00
David Majnemer 3f6994b3c7 Silence GCC -Wparenthesis warning
llvm-svn: 242348
2015-07-15 22:48:26 +00:00
Rafael Espindola 0a74a60bc4 For new archive member we only need to store the full path.
We were storing both the path and the file name, which was redundant
and easy to get confused up with.

llvm-svn: 242347
2015-07-15 22:46:53 +00:00
Chen Li 3f5ed1566e [LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when checking HeaderTerm instruction type
Summary:
This is a trivial code change with no functionality effect. 

When LoopUnswitch determines trivial unswitch condition, it checks whether the loop header's terminator instruction is a branch instruction or switch instruction since trivial unswitch condition can only apply to these two instruction types. The current code does not fail the check directly on other instruction types, but check the nullness of LoopExitBB variable instead. The added else clause makes the check fail immediately on other instruction types and makes the code more obvious.  

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11239

llvm-svn: 242345
2015-07-15 22:41:13 +00:00
Matthias Braun 5d1f12d1f5 TargetRegisterInfo: Provide a way to check assigned registers in getRegAllocationHints()
Pass a const reference to LiveRegMatrix to getRegAllocationHints()
because some targets can prodive better hints if they can test whether a
physreg has been used for register allocation yet.

llvm-svn: 242340
2015-07-15 22:16:00 +00:00
Alex Lorenz 37643a04a4 MIR Serialization: Serialize references from the stack objects to named allocas.
This commit serializes the references to the named LLVM alloca instructions from
the stack objects in the machine frame info. This commit adds a field 'Name' to
the struct 'yaml::MachineStackObject'. This new field is used to store the name
of the alloca instruction when the alloca is present and when it has a name.

llvm-svn: 242339
2015-07-15 22:14:49 +00:00
Paul Robinson b9de106d04 Add a "debugger tuning" concept that allows us to fine-tune how we
emit debug info, according to the preferences of the different
debuggers used on various targets.
Darwin and FreeBSD default to tuning for LLDB; PS4 defaults to tuning for
the SCE (Sony Computer Entertainment) debugger.  All others default to GDB.

Differential Revision: http://reviews.llvm.org/D8506

llvm-svn: 242338
2015-07-15 22:04:54 +00:00
JF Bastien 7289f73b8d Fix mergefunc infinite loop
Self-referential constants containing references to a merged function
no longer cause the MergeFunctions pass to infinite loop. Also adds a
reproduction IR which would otherwise fail, which was isolated from a similar
issue in Chromium.

Author: jrkoenig
Reviewers: nlewycky, jfb
Subscribers: llvm-commits, nlewycky, jfb

Differential Revision: http://reviews.llvm.org/D11208

llvm-svn: 242337
2015-07-15 21:51:33 +00:00
Rafael Espindola f662e00a68 Simplify a few uses of remove_filename by using parent_path instead.
llvm-svn: 242334
2015-07-15 21:24:07 +00:00
Rafael Espindola 449208d95b Handle the error of trying to convert a regular archive to a thin one.
While at it, test that we can add to a thin archive.

llvm-svn: 242330
2015-07-15 20:45:56 +00:00
Cong Hou 5e67b66640 Rename doFunction() in BFI to calculate() and change its parameters from pointers to references.
http://reviews.llvm.org/D11196

llvm-svn: 242322
2015-07-15 19:58:26 +00:00
Tobias Edler von Koch d8ce16b1e6 Analyze recursive PHI nodes in BasicAA
Summary:
This patch allows phi nodes like
  %x = phi [ %incptr, ... ] [ %var, ... ]
  %incptr = getelementptr %x, 1
to be analyzed by BasicAliasAnalysis.

In aliasPHI, we can detect incoming values that are recursive GEPs with a
constant offset. Instead of trying to analyze a recursive GEP (and failing), 
we now ignore it and instead set the size of the memory referenced by
the PHINode to UnknownSize. This represents all the possible memory
locations the pointer represented by the PHINode could be advanced to
by the GEP.

For now, this new behavior is turned off by default to allow debugging of
performance degradations seen with SPEC/x86 and Hexagon benchmarks.
The flag -basicaa-recphi turns it on.


Reviewers: hfinkel, sanjoy

Subscribers: tobiasvk_caf, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D10368

llvm-svn: 242320
2015-07-15 19:32:22 +00:00
Bruno Cardoso Lopes 9b39693a5d Revert "Refactor optimizeUncoalescable logic"
Likely broke compilation on ARM:

http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/13054

This reverts commit 0b7824464fbe3d3f386e2d4aef6a431422709e53.

llvm-svn: 242311
2015-07-15 18:10:46 +00:00
Bruno Cardoso Lopes ad61f34293 Revert "Look through PHIs to find additional register sources"
Likely broke compilation on ARM:

http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/13054

This reverts commit 131ce4a838c081516cbfed039fc986b33e3979d6.

llvm-svn: 242310
2015-07-15 18:10:35 +00:00
Cong Hou 0881fc1198 Test commit.
This is a test commit (one blank line deleted).

llvm-svn: 242308
2015-07-15 17:58:15 +00:00
Adrian Prantl ee5feafc0f Debug Info: Add basic support for external types references.
This is a necessary prerequisite for bootstrapping the emission
of debug info inside modules.

- Adds a FlagExternalTypeRef to DICompositeType.
  External types must have a unique identifier.
- External type references are emitted using a forward declaration
  with a DW_AT_signature([DW_FORM_ref_sig8]) based on the UID.

http://reviews.llvm.org/D9612

llvm-svn: 242302
2015-07-15 17:01:41 +00:00
Pete Cooper 21ca199cea Add missing load/store flags to thumb2 instructions.
These were the cause of a verifier error when building 7zip with
-verify-machineinstrs.  Running 'make check' with the verifier
triggered the same error on the test here so i've updated the test
to run the verifier on one of its runs instead of adding a new one.

While looking at this code, there was a stale comment that these
instructions were only used for disassembly.  This probably used to
be the case, but they are now used in the 'ARM load / store optimization pass' too.

llvm-svn: 242300
2015-07-15 16:36:38 +00:00
Bill Schmidt 1e77bb12b4 [PPC64LE] Fix vec_sld semantics for little endian
The vec_sld interface provides access to the vsldoi instruction.
Unlike most of the vec_* interfaces, we do not attempt to change the
generated code for vec_sld based on the endian mode.  It is too
difficult to correctly infer the desired semantics because of
different element types, and the corrected instruction sequence is
expensive, involving loading a permute control vector and performing a
generalized permute.

For GCC, this was implemented as "Don't touch the vec_sld"
implementation.  When it came time for the LLVM implementation, I did
the same thing.  However, this was hasty and incorrect.  In LLVM's
version of altivec.h, vec_sld was previously defined in terms of the
vec_perm interface.  Because vec_perm semantics are adjusted for
little endian, this means that leaving vec_sld untouched causes it to
generate something different for LE than for BE.  Not good.

This back-end patch accompanies the changes to altivec.h that change
vec_sld's behavior for little endian.  Those changes mean that we see
slightly different code in the back end when trying to recognize a
VSLDOI instruction in isVSLDOIShuffleMask.  In particular, a
ShuffleKind of 1 (where the two inputs are identical) must now be
treated the same way as a ShuffleKind of 2 (little endian with
different inputs) when little endian mode is in force.  This is
because ShuffleKind of 1 is defined using big-endian numbering.

This has a ripple effect on LowerBUILD_VECTOR, where we create our own
internal VSLDOI instructions.  Because these are a ShuffleKind of 1,
they will now have their shift amounts subtracted from 16 when
recognizing the shuffle mask.  To avoid problems we have to subtract
them from 16 again before creating the VSLDOI instructions.

There are a couple of other uses of BuildVSLDOI, but these do not need
to be modified because the shift amount is 8, which is unchanged when
subtracted from 16.

llvm-svn: 242296
2015-07-15 15:45:30 +00:00
Bruno Cardoso Lopes fadd4fef2a Look through PHIs to find additional register sources
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197

rdar://problem/20404526

llvm-svn: 242295
2015-07-15 15:35:23 +00:00
Bruno Cardoso Lopes bd68a09591 Refactor optimizeUncoalescable logic
- Create a new CopyRewriter for Uncoalescable copy-like instructions
- Change the ValueTracker to return a ValueTrackerResult

This makes optimizeUncoalescable looks more like optimizeCoalescable and
use the CopyRewritter infrastructure.

This is also the preparation for looking up into PHI nodes in the
ValueTracker.

Differential Revision: http://reviews.llvm.org/D11195

llvm-svn: 242294
2015-07-15 15:35:09 +00:00
Benjamin Kramer c11fd3e775 [PPC] Disassemble little endian ppc instructions in the right byte order
PR24122. The test is simply a byte swapped version of ppc64-encoding.txt.

llvm-svn: 242288
2015-07-15 12:56:19 +00:00
Alexandros Lamprineas fcd93d539e -Added API for retrieving the default FPU of a CPU from TargetParser.
-Implemented as a table lookup.

Change-Id: Iaad0eaf4b29b06827e6700269496dc1ba20e9018
Phabricator: http://reviews.llvm.org/D11100
llvm-svn: 242284
2015-07-15 10:46:21 +00:00
Chandler Carruth 6af95d0a08 [PM/AA] Fix *numerous* serious bugs in GlobalsModRef found by
inspection.

While we want to handle calls specially in this code because they should
have been modeled by the call graph analysis that precedes it, we should
*not* be re-implementing the predicates for whether an instruction reads
or writes memory. Those are well defined already. Notably, at least the
following issues seem to be clearly missed before:
- Ordered atomic loads can "write" to memory by causing writes from other
  threads to become visible. Similarly for ordered atomic stores.
- AtomicRMW instructions quite obviously both read and write to memory.
- AtomicCmpXchg instructions also read and write to memory.
- Fences read and write to memory.
- Invokes of intrinsics or memory allocation functions.

I don't have any test cases, and I suspect this has never really come up
in the real world. But there is no reason why it wouldn't, and it makes
the code simpler to do this the right way.

While here, I've tried to make the loops significantly simpler as well
and added helpful comments as to what is going on.

llvm-svn: 242281
2015-07-15 08:53:29 +00:00
Alexey Bataev b9288601a3 [SDAG] Optimize unordered comparison in soft-float mode (patch by Anton Nadolskiy)
Current implementation handles unordered comparison poorly in soft-float mode. 
Consider (a ULE b) which is a <= b. It is lowered to (ledf2(a, b) <= 0 || unorddf2(a, b) != 0) (in general). We can do better job by lowering it to (__gtdf2(a, b) <= 0). 
Such replacement is true for other CMP's (ult, ugt, uge). In general, we just call same function as for ordered case but negate comparison against zero.
Differential Revision: http://reviews.llvm.org/D10804

llvm-svn: 242280
2015-07-15 08:39:35 +00:00
Hal Finkel 5d36b230b5 [PowerPC] Use the MachineCombiner to reassociate fadd/fmul
This is a direct port of the code from the X86 backend (r239486/r240361), which
uses the MachineCombiner to reassociate (floating-point) adds/muls to increase
ILP, to the PowerPC backend. The rationale is the same.

There is a lot of copy-and-paste here between the X86 code and the PowerPC
code, and we should extract at least some of this into CodeGen somewhere.
However, I don't want to do that until this code is enhanced to handle FMAs as
well. After that, we'll be in a better position to extract the common parts.

llvm-svn: 242279
2015-07-15 08:23:05 +00:00
Hal Finkel 673b493e98 [PowerPC] Extend physical register live range in PPCVSXFMAMutate
If the source of the copy that defines the addend is a physical register, then
its existing live range may not extend to the FMA being mutated. Make sure we
extend the live range of the register to meet the FMA because it will become
its operand in this case.

I don't have an independent test case, but it will be exposed by change to be
committed shortly enabling the use of the machine combiner to do fadd/fmul
reassociation, and will be covered by one of the associated regression tests.

llvm-svn: 242278
2015-07-15 08:23:03 +00:00
Hal Finkel e0fa8f2c86 [MachineCombiner] Work with itineraries
MachineCombiner predicated its use of scheduling-based metrics on
hasInstrSchedModel(), but useful conclusions can be drawn from pipeline
itineraries as well. Almost all of the logic (except for resource tracking in
preservesResourceLen) can be used if we have an itinerary, so enable it in that
case as well.

This will be used by the PowerPC backend in an upcoming commit.

llvm-svn: 242277
2015-07-15 08:22:23 +00:00
Petr Pavlu 097adfb98c [AArch64] Fix problems in decoding generic MSR instructions
Bitpatterns rejected by the decoder method of `MSR (immediate)` should be
decoded as the `extended MSR (register)` instruction.

Differential Revision: http://reviews.llvm.org/D7174

llvm-svn: 242276
2015-07-15 08:10:30 +00:00
Chandler Carruth a033bbbe96 [PM/AA] Cleanup some loops to be range-based. NFC.
llvm-svn: 242275
2015-07-15 08:09:23 +00:00
Igor Breger 096e8b0995 AVX : Fix ISA disabling in case AVX512VL , some instructions should be disabled only if AVX512BW present.
Tests added.

Differential Revision: http://reviews.llvm.org/D11122

llvm-svn: 242270
2015-07-15 07:08:10 +00:00
Rafael Espindola e649258272 Initial support for writing thin archives.
llvm-svn: 242269
2015-07-15 05:47:46 +00:00
Pete Cooper 6923461a16 Use enum instead of unsigned. NFC.
The unsigned opcode argument here was the result of BinaryOperator->getOpcode().
That returns a BinaryOps enum which is more accurate than passing around an
unsigned.

llvm-svn: 242265
2015-07-15 01:31:26 +00:00
Pete Cooper a8127d8c92 Use cast<> instead of dyn_cast to remove llvm_unreachable. NFC.
This code was checking if we are an ICmpInst or FCmpInst then throwing
unreachable if we are neither.  We must be one or the other, so use a
cast on the FCmpInst case to ensure that we are that case.  Then we can
avoid having an unreachable but still catch an error if we ever had another
subclass of CmpInst.

llvm-svn: 242264
2015-07-15 01:31:23 +00:00
Pete Cooper 20dc71b1f1 Use another foreach loop. NFC
llvm-svn: 242263
2015-07-15 01:31:20 +00:00
Pete Cooper 6a96c61659 Use getAnyExtOrTrunc helper instead of manually doing ext/trunc check. NFC.
The code here was doing exactly what is already in getAnyExtOrTrunc().
Just use that method instead.

llvm-svn: 242261
2015-07-15 00:43:57 +00:00
Pete Cooper 8acd386969 Use getZExtOrTrunc helper instead of manually doing zext/trunc check. NFC.
The code here was doing exactly what is already in getZExtOrTrunc().
Just use that method instead.

llvm-svn: 242260
2015-07-15 00:43:54 +00:00
Michael Zolotukhin 31b3eaaf28 [LoopUnrolling] Handle cast instructions.
During estimation of unrolling effect we should be able to propagate
constants through casts.

Differential Revision: http://reviews.llvm.org/D10207

llvm-svn: 242257
2015-07-15 00:19:51 +00:00
Pete Cooper e46f7ef385 Change conditional to assert. NFC.
This code was breaking from the case statement if the getStoreSizeInBits()
value was not a multiple of 0.  Given that the implementation returns
getStoreSize() * 8, it can only be a multiple of 8.

llvm-svn: 242255
2015-07-15 00:07:57 +00:00
Pete Cooper 7e747d26c5 Use getStoreSize() instead of getStoreSizeInBits()/8. NFC.
The calls here were both to getStoreSizeInBits() which multiplies by 8.
We then immediately divided by 8.  Calling getStoreSize() returns the
values we need without the extra arithmetic.

llvm-svn: 242254
2015-07-15 00:07:55 +00:00
Rafael Espindola 6fce2e4f26 Use a range loop.
llvm-svn: 242250
2015-07-14 23:51:01 +00:00
Pete Cooper 7e64ef06e6 Use more foreach loops in SelectionDAG. NFC
llvm-svn: 242249
2015-07-14 23:43:29 +00:00
Wei Mi deee61e434 Create a wrapper pass for BlockFrequencyInfo.
This is useful when we want to do block frequency analysis
conditionally (e.g. only in PGO mode) but don't want to add
one more pass dependence.

Patch by congh.
Approved by dexonsmith.
Differential Revision: http://reviews.llvm.org/D11196

llvm-svn: 242248
2015-07-14 23:40:50 +00:00
JF Bastien c8f48c19d3 WebAssembly: fix build breakage.
Summary:
processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909

WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed.

Reviewers: qcolombet, sunfish

Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D11199

llvm-svn: 242242
2015-07-14 23:06:07 +00:00
Hal Finkel 4012024fea [PowerPC] Support symbolic targets in patchpoints
Follow-up r235483, with the corresponding support in PPC. We use a regular call
for symbolic targets (because they're much cheaper than indirect calls).

llvm-svn: 242239
2015-07-14 22:53:11 +00:00
David Majnemer 33b6f82e72 [InstCombine] Generalize sub of selects optimization to all BinaryOperators
This exposes further optimization opportunities if the selects are
correlated.

llvm-svn: 242235
2015-07-14 22:39:23 +00:00
Adam Nemet 9f7dedc376 [LAA] Introduce RuntimePointerChecking::PointerInfo, NFC
Turn this structure-of-arrays (i.e. the various pointer attributes) into
array-of-structures.

llvm-svn: 242219
2015-07-14 22:32:50 +00:00
Adam Nemet 7cdebac0c8 [LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC
I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.

Also rename it to RuntimePointerChecking (i.e. append 'ing').

llvm-svn: 242218
2015-07-14 22:32:44 +00:00
Hal Finkel 9bbad03b98 [PowerPC] Use the ABI indirect-call protocol for patchpoints
We used to take the address specified as the direct target of the patchpoint
and did no TOC-pointer handling.  This, however, as not all that useful,
because MCJIT tends to create a lot of modules, and they have their own TOC
sections. Thus, to call from the generated code to other generated code, you
really need to switch TOC pointers. Make this work as expected, and under
ELFv1, tread the address as the function descriptor address so that the correct
TOC pointer can be loaded.

llvm-svn: 242217
2015-07-14 22:26:06 +00:00
Rafael Espindola 4b83cb5390 Add support for reading members out of thin archives.
For now the Archive owns the buffers of the thin archive members.
This makes for a simple API, but all the buffers are destructed
only when the archive is destructed. This should be fine since we
close the files after mmap so we should not hit an open file
limit.

llvm-svn: 242215
2015-07-14 22:18:43 +00:00
Pete Cooper 65c69407c8 Add allnodes() iterator range to SelectionDAG. NFC.
SelectionDAG already had begin/end methods for iterating over all
the nodes, but didn't define an iterator_range for us in foreach
loops.

This adds such a method and uses it in some of the eligible places
throughout the backends.

llvm-svn: 242212
2015-07-14 22:10:54 +00:00
Pete Cooper 06e249e713 Constify parameters in SelectionDAG methods. NFC
llvm-svn: 242210
2015-07-14 21:54:52 +00:00
Pete Cooper cf17e18f4e Remove unnecessary .getNode() in SelectionDAG. NFC.
The simplify_type specialisation allows us to cast directly from
SDValue to an SDNode* subclass so we don't need to pass a SDNode*
to cast<>.

llvm-svn: 242209
2015-07-14 21:54:48 +00:00
Pete Cooper e89ba67f72 Use more foreach loops in SelectionDAG. NFC
llvm-svn: 242208
2015-07-14 21:54:45 +00:00
Alex Lorenz 9fab370d79 MIR Serialization: Serialize the machine basic block live in registers.
llvm-svn: 242204
2015-07-14 21:24:41 +00:00
Alex Lorenz 15a00a858a MIR Printer: move the function 'printReg'. NFC.
This commit moves the function 'printReg' towards the start of the file so that
it can be used by the conversion methods in MIRPrinter and not just the printing
methods in MIPrinter.

llvm-svn: 242203
2015-07-14 21:18:25 +00:00
Tim Northover 586b741959 GVN: use a static array instead of regenerating it each time. NFC.
llvm-svn: 242202
2015-07-14 21:14:58 +00:00
JF Bastien d9767a368f WebAssembly: add basic int/fp instruction codegen.
Summary: This patch has the most basic instruction codegen for 32 and 64 bit int/fp.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11193

llvm-svn: 242201
2015-07-14 21:13:29 +00:00
Krzysztof Parzyszek fdfaae42b6 Fix NDEBUG build warning
llvm-svn: 242200
2015-07-14 21:03:24 +00:00
Tim Northover d5fdef016d GVN: tolerate an instruction being replaced without existing in the leaderboard
Sometimes an incidentally created instruction can duplicate a Value used
elsewhere. It then often doesn't end up in the leader table. If it's later
removed, we attempt to remove it from the leader table and segfault.

Instead we should just ignore the removal request, which won't cause any
problems. The reverse situation, where the original instruction is replaced by
the new one (which you might think could leave the leader table empty) cannot
occur, because the incidental instruction will never be found in the first
place.

llvm-svn: 242199
2015-07-14 21:03:18 +00:00
Krzysztof Parzyszek 2e82883620 Fix Windows build: replace __func__ with LLVM_FUNCTION_NAME
llvm-svn: 242192
2015-07-14 20:11:28 +00:00
Bruno Cardoso Lopes 9e6dea1df8 [MMX] Use the appropriate instructions for GR64 <-> VR64 copies.
MOVSDto64rr and MOV64toSDrr are defined to convert between FR64 (%xmm)
<-> GR64 registers, not VR64 (%mm) <-> GR64. This is wrong.

I found this by inspection and could not find a suitable testcase for it
since (1) we don't handle MMX bitcasts in Peephole optimizer as to
generate COPYs that (2) could be expanded back to the appropriate x86
instruction in ExpandPostRA.

Switch to use the appropriate instructions: MMX_MOVD64from64rr and
MMX_MOVD64to64rr here.

llvm-svn: 242191
2015-07-14 20:09:34 +00:00
Hal Finkel 8acae5276e [PowerPC] Fix the PPCInstrInfo::getInstrLatency implementation
PowerPC uses itineraries to describe processor pipelines (and dispatch-group
restrictions for P7/P8 cores). Unfortunately, the target-independent
implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that
looks for the largest cycle count in the pipeline for any given instruction.
This, however, yields the wrong answer for the PPC itineraries, because we
don't encode the full pipeline. Because the functional units are fully
pipelined, we only model the initial stages (there are no relevant hazards in
the later stages to model), and so the technique employed by getStageLatency
does not really work. Instead, we should take the maximum output operand
latency, and that's what PPCInstrInfo::getInstrLatency now does.

This caused some test-case churn, including two unfortunate side effects.
First, the new arrangement of copies we get from function parameters now
sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the
test cases), and we have one significant test-suite regression:

SingleSource/Benchmarks/BenchmarkGame/spectral-norm
	56.4185% +/- 18.9398%

In this benchmark we have a loop with a vectorized FP divide, and it with the
new scheduling both divides end up in the same dispatch group (which in this
case seems to cause a problem, although why is not exactly clear). The grouping
structure is hard to predict from the bottom of the loop, and there may not be
much we can do to fix this.

Very few other test-suite performance effects were really significant, but
almost all weakly favor this change. However, in light of the issues
highlighted above, I've left the old behavior available via a
command-line flag.

llvm-svn: 242188
2015-07-14 20:02:02 +00:00
Krzysztof Parzyszek 758744706a [Hexagon] Generate instructions for operations on predicate registers
Convert logical operations on general-purpose registers to the correspon-
ding operations on predicate registers.

llvm-svn: 242186
2015-07-14 19:30:21 +00:00
Keno Fischer aff703a2ca [CodeGen] Force emission of personality directive if explicitly specified
Summary:
Before this change, personality directives were not emitted
if there was no invoke left in the function (of course until
recently this also meant that we couldn't know what
the personality actually was). This patch forces personality directives
to still be emitted, unless it is known to be a noop in the absence of
invokes, or the user explicitly specified `nounwind` (and not
`uwtable`) on the function.

Reviewers: majnemer, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D10884

llvm-svn: 242185
2015-07-14 19:22:51 +00:00
Matt Arsenault 24692118ba AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32)
This can be done only with moves which theoretically
will optimize better later.

Although this transform increases the instruction count,
it should be code size / cycle count neutral in the worst
VALU case. It also seems to slightly improve a couple
of testcases due to other DAG combines this exposes.

This is probably slightly worse for the SALU case, so
it might be better to handle this during moveToVALU,
although then you lose some simplifications like
the load width reducing in the simple testcase.

llvm-svn: 242177
2015-07-14 18:20:33 +00:00
Matt Arsenault 84db5d97b0 AMDGPU/SI: Fix read2 merging into a super register.
If the read2 produced was supposed to be writing into a
super register, it would use the wrong subregister indices.
Fix this by inserting copies, so we only ever write to a vreg_64.
Run the register coalescer again to clean this up, although this
isn't ideal and often does result in an extra move.

Also remove the assert that offset1 > offset0.

There isn't a real reason to not allow this other than a minor
convenience in the compiler, and it doesn't seem worth the effort
of avoiding it.

llvm-svn: 242174
2015-07-14 17:57:36 +00:00
Matthias Braun 9912bb817c MachineRegisterInfo: Remove UsedPhysReg infrastructure
We have a detailed def/use lists for every physical register in
MachineRegisterInfo anyway, so there is little use in maintaining an
additional bitset of which ones are used.

Removing it frees us from extra book keeping. This simplifies
VirtRegMap.

Differential Revision: http://reviews.llvm.org/D10911

llvm-svn: 242173
2015-07-14 17:52:07 +00:00
Matthias Braun 953393a72c RAGreedy: Keep track of allocated PhysRegs internally
Do not use MachineRegisterInfo::setPhysRegUsed()/isPhysRegUsed()
anymore. This bitset changes function-global state and is set by the
VirtRegRewriter anyway.
Simply use a bitvector private to RAGreedy.

Differential Revision: http://reviews.llvm.org/D10910

llvm-svn: 242169
2015-07-14 17:38:17 +00:00
Nemanja Ivanovic 984a3613b3 Add missing builtins to the PPC back end for ABI compliance (vol. 4)
This patch corresponds to review:
http://reviews.llvm.org/D11183

Back end portion of the fourth round of additions to altivec.h.

llvm-svn: 242167
2015-07-14 17:25:20 +00:00
Matthias Braun 0256486532 PrologEpilogInserter: Rewrite API to determine callee save regsiters.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():

- Rename the function to determineCalleeSaves()
- Pass a bitset of callee saved registers by reference, thus avoiding
  the function-global PhysRegUsed bitset in MachineRegisterInfo.
- Without PhysRegUsed the implementation is fine tuned to not save
  physcial registers which are only read but never modified.

Related to rdar://21539507

Differential Revision: http://reviews.llvm.org/D10909

llvm-svn: 242165
2015-07-14 17:17:13 +00:00
Tim Northover c962d4f28b AArch64: add rev64 alias for 64-bit rev instruction.
It could be useful to assembly programmers and makes the permitted variants a
little more uniform.

llvm-svn: 242164
2015-07-14 17:07:29 +00:00
Krzysztof Parzyszek a0ecf07c0b [Hexagon] Generate "extract" instructions more aggressively
Generate extract instructions (via intrinsics) before the DAG combiner
folds shifts into unrecognizable forms.

llvm-svn: 242163
2015-07-14 17:07:24 +00:00
Hans Wennborg 61f9efe73b ARMAsmParser: Take MCInst param by const-ref
(Broken out from http://reviews.llvm.org/D11167)

llvm-svn: 242160
2015-07-14 16:39:01 +00:00
Alexandros Lamprineas a04e6b1853 Caused regressions: compile Release+Asserts failed on clang-native-arm-cortex-a9
Revert "-Added API for retrieving the default FPU of a CPU from TargetParser."

This reverts commit 01199ab0c6ff2d5c4f6b2c05a95ec011e41c4669.

llvm-svn: 242147
2015-07-14 14:34:06 +00:00
Tom Stellard e48fe2a27a AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11061

llvm-svn: 242146
2015-07-14 14:15:03 +00:00
Aaron Ballman a927a86c72 Silencing two MSVC warnings; 'argument' : truncation from 'unsigned int' to 'int16_t' and truncation of constant value. NFC intended.
llvm-svn: 242145
2015-07-14 14:14:00 +00:00
Alexandros Lamprineas ab9907a217 -Added API for retrieving the default FPU of a CPU from TargetParser.
-Implemented as a table lookup.

Change-Id: Ibf7217f6bd2769e9c06835a5aede3d072dee6757
Phabricator: http://reviews.llvm.org/D11100
llvm-svn: 242141
2015-07-14 13:20:48 +00:00
Daniel Sanders 03f9c019d2 [mips] Fix li/la differences between IAS and GAS.
Summary:
- Signed 16-bit should have priority over unsigned.
- For la, unsigned 16-bit must use ori+addu rather than directly use ori.
- Correct tests on 32-bit immediates with 64-bit predicates by
  sign-extending the immediate beforehand. For example, isInt<16>(0xffff8000)
  should be true and use addiu.

Also split li/la testing into separate files due to their size.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10967

llvm-svn: 242139
2015-07-14 12:24:22 +00:00
Chandler Carruth 466d7ad42d [PM/AA] Reformat GlobalsModRef so that subsequent patches I make here
don't continually introduce formatting deltas. NFC

llvm-svn: 242129
2015-07-14 08:42:39 +00:00
David Majnemer 62690b1952 [SROA] Don't de-atomic volatile loads and stores
Volatile loads and stores are made visible in global state regardless of
what memory is involved.  It is not correct to disregard the ordering
and synchronization scope because it is possible to synchronize with
memory operations performed by hardware.

This partially addresses PR23737.

llvm-svn: 242126
2015-07-14 06:19:58 +00:00
Yaron Keren d1ba2d9d8b Generate correct asm info for mingw and cygwin ARM targets.
http://reviews.llvm.org/D11075

Patch by Martell Malone
Reviewed by Reid Kleckner

llvm-svn: 242123
2015-07-14 05:51:05 +00:00
NAKAMURA Takumi 0b305db8df Prune trailing whitespaces and CRs.
llvm-svn: 242117
2015-07-14 04:03:49 +00:00
Matthias Braun 75e668ea6e Revert "LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization"
Accidental commit, needs review first.

This reverts commit r242107.

llvm-svn: 242108
2015-07-14 02:09:57 +00:00
Matthias Braun 4ac4ecdadf LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

llvm-svn: 242107
2015-07-14 02:08:26 +00:00
Andrew Wilkins 3bdfc1cd0c Add capability to get and set the personalitty function from the C API
Summary:
The capability was lost with D10429 where the personality function was set at function level rather than landing pad level. Now there is no way to get/set the personality function from the C API. That is a problem.

Note that the whole thing could be avoided by improving the C API testing, as started by D10725

Reviewers: chandlerc, bogner, majnemer, andrew.w.kaylor, rafael, rnk, axw

Subscribers: rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D10946

llvm-svn: 242104
2015-07-14 01:23:06 +00:00
Rafael Espindola 2b05416be8 Add a herper function. NFC.
llvm-svn: 242100
2015-07-14 01:06:16 +00:00
Alex Lorenz 418f3ec17d MIR Serialization: Serialize the variable sized stack objects.
llvm-svn: 242095
2015-07-14 00:26:26 +00:00
Reid Kleckner 486fa3977a Update enforceKnownAlignment after the isWeakForLinker semantic change
Previously we would refrain from attempting to increase the linkage of
available_externally globals because they were considered weak for the
linker. Now they are treated more like a declaration instead of a weak
definition.

This was causing SSE alignment faults in Chromuim, when some code
assumed it could increase the alignment of a dllimported global that it
didn't control.  http://crbug.com/509256

llvm-svn: 242091
2015-07-14 00:11:08 +00:00
Alex Lorenz 2eacca86ef MIR Serialization: Serialize the sub register indices.
This commit serializes the sub register indices from the register machine
operands.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242084
2015-07-13 23:24:34 +00:00
Rafael Espindola c60d0d2a15 Fix reading archive members with / in the name.
This is important for thin archives.

llvm-svn: 242082
2015-07-13 23:07:05 +00:00
Bill Schmidt 15deb803b4 [PPC64LE] More improvements to VSX swap optimization
This patch allows VSX swap optimization to succeed more frequently.
Specifically, it is concerned with common code sequences that occur
when copying a scalar floating-point value to a vector register.  This
patch currently handles cases where the floating-point value is
already in a register, but does not yet handle loads (such as via an
LXSDX scalar floating-point VSX load).  That will be dealt with later.

A typical case is when a scalar value comes in as a floating-point
parameter.  The value is copied into a virtual VSFRC register, and
then a sequence of SUBREG_TO_REG and/or COPY operations will convert
it to a full vector register of the class required by the context.  If
this vector register is then used as part of a lane-permuted
computation, the original scalar value will be in the wrong lane.  We
can fix this by adding a swap operation following any widening
SUBREG_TO_REG operation.  Additional COPY operations may be needed
around the swap operation in order to keep register assignment happy,
but these are pro forma operations that will be removed by coalescing.

If a scalar value is otherwise directly referenced in a computation
(such as by one of the many XS* vector-scalar operations), we
currently disable swap optimization.  These operations are
lane-sensitive by definition.  A MentionsPartialVR flag is added for
use in each swap table entry that mentions a scalar floating-point
register without having special handling defined.

A common idiom for PPC64LE is to convert a double-precision scalar to
a vector by performing a splat operation.  This ensures that the value
can be referenced as V[0], as it would be for big endian, whereas just
converting the scalar to a vector with a SUBREG_TO_REG operation
leaves this value only in V[1].  A doubleword splat operation is one
form of an XXPERMDI instruction, which takes one doubleword from a
first operand and another doubleword from a second operand, with a
two-bit selector operand indicating which doublewords are chosen.  In
the general case, an XXPERMDI can be permitted in a lane-swapped
region provided that it is properly transformed to select the
corresponding swapped values.  This transformation is to reverse the
order of the two input operands, and to reverse and complement the
bits of the selector operand (derivation left as an exercise to the
reader ;).

A new test case that exercises the scalar-to-vector and generalized
XXPERMDI transformations is added as CodeGen/PowerPC/swaps-le-5.ll.
The patch also requires a change to CodeGen/PowerPC/swaps-le-3.ll to
use CHECK-DAG instead of CHECK for two independent instructions that
now appear in reverse order.

There are two small unrelated changes that are added with this patch.
First, the XXSLDWI instruction was incorrectly omitted from the list
of lane-sensitive instructions; this is now fixed.  Second, I observed
that the same webs were being rejected over and over again for
different reasons.  Since it's sufficient to reject a web only once, I
added a check for this to speed up the compilation time slightly.

llvm-svn: 242081
2015-07-13 22:58:19 +00:00
Pete Cooper 90d95edbb4 Loop idiom recognizer was replacing too many uses of popcount.
When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

 %tobool.5 = icmp ne i32 %num, 0
 store i1 %tobool.5, i1* %ptr
 br i1 %tobool.5, label %for.body.lr.ph, label %for.end

to

 store i1 %1, i1* %ptr
 %0 = call i32 @llvm.ctpop.i32(i32 %num)
 %1 = icmp ne i32 %0, 0
 br i1 %1, label %for.body.lr.ph, label %for.end

Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect.  The store doesn’t want the result of ctpop.

The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

Reviewed by Hal Finkel.

llvm-svn: 242068
2015-07-13 21:25:33 +00:00
Reid Kleckner 9a1a919465 [WinEH] Emit the LSDA even if no lpads remain but outlining occurred
The outlined funclets call intrinsics which reference labels from the
LSDA. This situation can easily arise in small functions with a single
cleanup at -O0, where Clang marks a definition as nounwind, and then
WinEHPrepare "discovers" that the landingpad is dead by accident and
deletes it.

We now need to ask the LLVM IR Function for it's personality directly,
rather than going through MachineModuleInfo.

Fixes PR23892.

llvm-svn: 242063
2015-07-13 20:41:46 +00:00
Benjamin Kramer d8861517bf [Hexagon] Move BitTracker into the llvm namespace and remove redundant qualifications
No functional change intended.

llvm-svn: 242062
2015-07-13 20:38:16 +00:00
Rafael Espindola 6a8e86f26e Add support deterministic output in llvm-ar and make it the default.
llvm-svn: 242061
2015-07-13 20:38:09 +00:00
Matt Arsenault ca95d44110 AMDGPU: Minor cleanups to always inline pass
llvm-svn: 242053
2015-07-13 19:08:36 +00:00
David Majnemer 1305e2c0f5 [MC] Correctly escape .safeseh's symbol
This fixes PR24107.

llvm-svn: 242050
2015-07-13 18:51:15 +00:00
Mark Heffernan 4c8ca53f7e Enable partial and runtime loop unrolling for NVPTX.
Enable partial and runtime loop unrolling for NVPTX backend via
TTI::UnrollingPreferences with a small threshold. This partially unrolls
small loops which are often unrolled by the PTX to SASS compiler
and unrolling earlier can be beneficial.

llvm-svn: 242049
2015-07-13 18:33:21 +00:00
Mark Heffernan d7ebc24112 Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.

llvm-svn: 242047
2015-07-13 18:26:27 +00:00
Adrian Prantl 857237ee70 Service the doxygen comments in DwarfUnit and DwarfDebug.
llvm-svn: 242046
2015-07-13 18:25:29 +00:00
Alex Lorenz de491f0515 MIR Serialization: Serialize the fixed stack objects.
This commit serializes the fixed stack objects, including fixed spill slots.
The fixed stack objects are serialized using a YAML sequence of YAML inline
mappings. Each mapping has the object's ID, type, size, offset, and alignment.
The objects that aren't spill slots also serialize the isImmutable and isAliased
flags.

The fixed stack objects are a part of the machine function's YAML mapping.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242045
2015-07-13 18:07:26 +00:00
Reid Kleckner 5f4dd92209 [WinEH] Strip the \01 character from the __CxxFrameHandler3 thunk name
Add another C++ 32-bit EH table test.

llvm-svn: 242044
2015-07-13 17:55:14 +00:00
Benjamin Kramer a667d1adb7 Remove macro guards for extern template instantiations.
This is a C++11 feature that both GCC and MSVC have supported as ane extension
long before C++11 was approved.

llvm-svn: 242042
2015-07-13 17:21:31 +00:00
Benjamin Kramer e448b5be05 Avoid using Loop::getSubLoopsVector.
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.

llvm-svn: 242041
2015-07-13 17:21:14 +00:00
James Y Knight 46f91c8457 Fix handling of the 'n' asm constraint with invalid operands.
It had accidently accepted a symbol+offset value (and emitted
incorrect code for it, keeping only the offset part) instead of
properly reporting the constraint as invalid.

Differential Revision: http://reviews.llvm.org/D11039

llvm-svn: 242040
2015-07-13 16:36:22 +00:00
Tom Stellard db5a11f698 AMDGPU/SI: Select mad patterns to v_mac_f32
The two-address instruction pass will convert these back to v_mad_f32
if necessary.

Differential Revision: http://reviews.llvm.org/D11060

llvm-svn: 242038
2015-07-13 15:47:57 +00:00
Logan Chien 0a43abc9f8 ARM: Fix cttz expansion on vector types.
The 64/128-bit vector types are legal if NEON instructions are
available.  However, there was no matching patterns for @llvm.cttz.*()
intrinsics and result in fatal error.

This commit fixes the problem by lowering cttz to:
a. ctpop((x & -x) - 1)
b. width - ctlz(x & -x) - 1

llvm-svn: 242037
2015-07-13 15:37:30 +00:00
Scott Douglass 69bf1ce03a [ARM] Handle commutativity when converting to tADDhirr in Thumb2
Also, run thumb_rewrite.s tests in Thumb2 now that they pass.

Differential Revision: http://reviews.llvm.org/D11132

llvm-svn: 242036
2015-07-13 15:31:48 +00:00
Scott Douglass d9d8d26458 [ARM] Add Thumb2 ADD with SP narrowing from 3 operand to 2
Differential Revision: http://reviews.llvm.org/D11131

llvm-svn: 242035
2015-07-13 15:31:40 +00:00
Scott Douglass 039f768c42 [ARM] Small refactor of tryConvertingToTwoOperandForm (nfc)
Also, add more Thumb2 ADD tests requested during review of
http://reviews.llvm.org/D11053.

Differential Revision: http://reviews.llvm.org/D11130

llvm-svn: 242034
2015-07-13 15:31:33 +00:00
Silviu Baranga a647c30f88 Cleanup after r241809 - remove uncessary call to std::sort
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11145

llvm-svn: 242033
2015-07-13 14:48:24 +00:00
Rafael Espindola c1d63f7499 Remove unused variable.
Sorry I missed it in the previous commit.

llvm-svn: 242032
2015-07-13 14:43:33 +00:00
Rafael Espindola 5895d6845e Aliases don't have available_externally linkage.
Allowing that is probably a good idea, but currently we don't, so
this is dead code.

llvm-svn: 242031
2015-07-13 14:39:02 +00:00
Rafael Espindola 237c3a6def Don't change the visibility when converting a definition to a declaration.
llvm-svn: 242030
2015-07-13 14:18:22 +00:00
Aaron Ballman 6d8f785073 Removing several -Wunused-but-set-variable warnings; NFC intended.
llvm-svn: 242028
2015-07-13 14:04:30 +00:00
Rafael Espindola 7068cbbc1a Print the visibility of available_externally functions.
We were already printing it for declarations, but not available_externally.

llvm-svn: 242027
2015-07-13 13:55:18 +00:00
Manuel Klimek 779cf85a4f Revert r241981 "Revert "Revert r236894 "[BasicAA] Fix zext & sext handling"""
The repros from PR23626 still fail.

llvm-svn: 242025
2015-07-13 13:50:55 +00:00
Elena Demikhovsky 0f370936a0 AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long types.
In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.

Differential Revision: http://reviews.llvm.org/D11134

llvm-svn: 242023
2015-07-13 13:26:20 +00:00
Jingyue Wu 9a92d4fb04 [LSR] don't attempt to promote ephemeral values to indvars
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11115

llvm-svn: 242011
2015-07-13 03:28:53 +00:00
David Majnemer 599ca4426c [InstSimplify] Teach InstSimplify how to simplify extractelement
llvm-svn: 242008
2015-07-13 01:15:53 +00:00
David Majnemer 25a796e148 [InstSimplify] Teach InstSimplify how to simplify extractvalue
llvm-svn: 242007
2015-07-13 01:15:46 +00:00
Renato Golin 1ef7a0f7c0 [ARM] Add support for nest attribute using r12
Register r12 ('ip') is used by GCC for this purpose
and hence is used here. As discussed on the GCC mailing
list, the register choice is an ABI issue and so
choosing the same register as GCC means
__builtin_call_with_static_chain is compatible.

A similar patch has just gone in the AArch64 backend,
so this is just the ARM counterpart, following the same
discussion.

Patch by Stephen Cross.

llvm-svn: 241996
2015-07-12 18:16:40 +00:00
Simon Pilgrim 4f500525ef [X86][SSE] (V)PMINSB is commutable.
(V)PMINSB is no different to the other (V)PMIN/(V)PMAX B/D/W instructions - it is fully commutable.

llvm-svn: 241994
2015-07-12 16:44:11 +00:00
Simon Pilgrim ae5cd2773d Trim trailing whitespaces. NFC.
llvm-svn: 241990
2015-07-12 11:17:33 +00:00
Simon Pilgrim 64cc4ad0a2 [X86][SSE] Vectorized v4i32 non-uniform shifts.
While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized.

This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together.

Differential Revision: http://reviews.llvm.org/D11063

llvm-svn: 241989
2015-07-12 11:15:19 +00:00
David Majnemer 6bc83e0f43 [LICM] Don't try to sink values out of loops without any exits
There is no suitable basic block to sink instructions in loops without
exits.  The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI.  In such cases, the incoming block for
the corresponding value is unreachable.

This fixes PR24013.

Differential Revision: http://reviews.llvm.org/D10903

llvm-svn: 241987
2015-07-12 03:53:05 +00:00
Hal Finkel cbf08925ef [PowerPC] Make use of the TargetRecip system
r238842 added the TargetRecip system for controlling use of reciprocal
estimates for sqrt and division using a set of parameters that can be set by
the frontend. Clang now supports a sophisticated -mrecip option, and this will
allow that option to effectively control the relevant code-generation
functionality of the PPC backend.

llvm-svn: 241985
2015-07-12 02:33:57 +00:00
Hal Finkel 965cea5670 [PowerPC] Support the nest parameter attribute
This adds support for the 'nest' attribute, which allows the static chain
register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11
is the chain register (which the PPC64 ELF ABI calls the "environment
pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded
from the function descriptor, but providing an explicit 'nest' parameter will
override that process and use the value provided.

This allows __builtin_call_with_static_chain to work as expected on PowerPC.

llvm-svn: 241984
2015-07-12 00:37:44 +00:00
Hal Finkel ef28aad9f4 Revert "Revert r236894 "[BasicAA] Fix zext & sext handling""
r236894 caused PR23626 (Clang miscompiles webkit's base64 decoder), and was
reverted in r237984. This reapplies the patch with an additional test case for
PR23626 and the associated fix (both scales and offsets in the
BasicAliasAnalysis::constantOffsetHeuristic should initially be zero).

Patch by Nick White, thanks!

llvm-svn: 241981
2015-07-11 11:04:54 +00:00
Hal Finkel 9cf58c4095 Move getStrideFromPointer and friends from LoopVectorize to VectorUtils
The following functions are moved from the LoopVectorizer to VectorUtils:

  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer

These used to be static functions in LoopVectorize, but will also be used by
the upcoming loop versioning LICM transformation.

Patch by Ashutosh Nema!

llvm-svn: 241980
2015-07-11 10:52:42 +00:00
Igor Laevsky 39d662f7ba Add argmemonly attribute.
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.

Differential Revision: http://reviews.llvm.org/D10398

llvm-svn: 241979
2015-07-11 10:30:36 +00:00
Chandler Carruth 00ebdbcc47 [PM/AA] Completely remove the AliasAnalysis::copyValue interface.
No in-tree alias analysis used this facility, and it was not called in
any particularly rigorous way, so it seems unlikely to be correct.

Note that one of the only stateful AA implementations in-tree,
GlobalsModRef is completely broken currently (and any AA passes like it
are equally broken) because Module AA passes are not effectively
invalidated when a function pass that fails to update the AA stack runs.

Ultimately, it doesn't seem like we know how we want to build stateful
AA, and until then trying to support and maintain correctness for an
untested API is essentially impossible. To that end, I'm planning to rip
out all of the update API. It can return if and when we need it and know
how to build it on top of the new pass manager and as part of *tested*
stateful AA implementations in the tree.

Differential Revision: http://reviews.llvm.org/D10889

llvm-svn: 241975
2015-07-11 04:39:00 +00:00
Tyler Nowicki 3960d85262 Renamed some uses of unroll to interleave in the vectorizer.
llvm-svn: 241971
2015-07-11 00:31:11 +00:00
Adrian Prantl 12d528493e Cleanup a couple of comments in DIBuilder.cpp
llvm-svn: 241966
2015-07-10 23:26:02 +00:00
Duncan P. N. Exon Smith e463e470f8 MC: Only allow changing feature bits in MCSubtargetInfo
Disallow all mutation of `MCSubtargetInfo` expect the feature bits.

Besides deleting the assignment operators -- which were dead "code" --
this restricts `InitMCProcessorInfo()` to subclass initialization
sequences, and exposes a new more limited function called
`setDefaultFeatures()` for use by the ARMAsmParser `.cpu` directive.

There's a small functional change here: ARMAsmParser used to adjust
`MCSubtargetInfo::CPUSchedModel` as a side effect of calling
`InitMCProcessorInfo()`, but I've removed that suspicious behaviour.
Since the AsmParser shouldn't be doing any scheduling, there shouldn't
be any observable change...

llvm-svn: 241961
2015-07-10 22:52:15 +00:00
Matt Arsenault cf13d18730 AMDGPU: Fix chains for memory ops dependent on argument loads
Most loads and stores are derived from pointers derived from
a kernel argument load inserted during argument lowering.
This was just using the EntryToken chain for the argument loads,
and any users of these loads were also on the EntryToken chain.

Return the chain of the lowered argument load so that dependent loads
end up on the correct chain.

No test since I'm not aware of any case where this actually
broke.

llvm-svn: 241960
2015-07-10 22:51:36 +00:00
Alex Lorenz 53464510cc MIR Serialization: Serialize the virtual register operands.
Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D11005

llvm-svn: 241959
2015-07-10 22:51:20 +00:00
David Majnemer a5c7051a60 [IR] Switch static const to an enum to silence MSVC linker warnings
Integral class statics are handled oddly in MSVC, we don't need them
in this case, use an enum instead.

llvm-svn: 241958
2015-07-10 22:46:02 +00:00
Duncan P. N. Exon Smith 754e21f244 MC: Remove MCSubtargetInfo() default constructor
Force all creators of `MCSubtargetInfo` to immediately initialize it,
merging the default constructor and the initializer into an initializing
constructor.  Besides cleaning up the code a little, this makes it clear
that the initializer is never called again later.

Out-of-tree backends need a trivial change: instead of calling:

    auto *X = new MCSubtargetInfo();
    InitXYZMCSubtargetInfo(X, ...);
    return X;

they should call:

    return createXYZMCSubtargetInfoImpl(...);

There's no real functionality change here.

llvm-svn: 241957
2015-07-10 22:43:42 +00:00
Duncan P. N. Exon Smith bb57d73805 MC: Remove MCSubtargetInfo::InitCPUSched()
Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body
into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`.
We were only calling the former after explicitly calling the latter with
the same CPU; it's confusing to have both methods exposed.

Besides a minor (surely unmeasurable) speedup in ARM and X86 from
avoiding running the logic twice, no functionality change.

llvm-svn: 241956
2015-07-10 22:33:01 +00:00
Bjorn Steinbrink a6b929dfe2 [InstCombine] Actually combine AA metadata when replacing one load with another
Fixes PR24083

llvm-svn: 241955
2015-07-10 22:30:17 +00:00
Matt Arsenault 0d5197380c AMDGPU: Use requested chain when lowering arguments
No test since I'm not aware of any case where this will
end up being a different chain.

llvm-svn: 241954
2015-07-10 22:28:41 +00:00
Matthias Braun e5a112f5e1 ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920
llvm-svn: 241951
2015-07-10 22:23:57 +00:00
Reid Kleckner 7ea7708d92 [SEH] Push reloads of the SEH code past phi nodes
This in turn would sometimes introduce new cleanupblocks that didn't
previously exist. The uses were being introduced by SSA value demotion.
We actually want to *promote* uses of EH pointers and selectors, so I
added some spcecial casing to avoid demoting such instructions.  This is
getting overly complicated, but hopefully we'll come along and delete it
in the new representation.

llvm-svn: 241950
2015-07-10 22:21:54 +00:00
Duncan P. N. Exon Smith f787ed0b35 Add <type_traits> for is_pod, fixing r241947
llvm-svn: 241949
2015-07-10 22:17:49 +00:00
Matt Arsenault f54dc2384d DAGCombiner: Assume invariant load cannot alias a store
The motivation is to allow GatherAllAliases / FindBetterChain
to not give up on dependent loads of a pointer from constant memory.

This is important for AMDGPU, because most loads are pointers
derived from a load of a kernel argument from constant memory.

llvm-svn: 241948
2015-07-10 22:17:40 +00:00
Duncan P. N. Exon Smith f862f87ff2 MC: Remove the copy of MCSchedModel in MCSubtargetInfo
`MCSchedModel` is large.  Make `MCSchedModel::GetDefaultSchedModel()`
return by-reference instead of by-value, so we can store a pointer in
`MCSubtargetInfo::CPUSchedModel` instead of a copy.

Note: since `MCSchedModel` is POD, this doesn't create a static
constructor.

llvm-svn: 241947
2015-07-10 22:13:43 +00:00
Quentin Colombet 8b984d19f2 [ShrinkWrap][PEI] Do not insert epilogue for unreachable blocks.
Although this is not incorrect to insert such code, it is useless
and it hurts the binary size.

llvm-svn: 241946
2015-07-10 22:09:55 +00:00
David Majnemer 3f0a0e4a28 [MC] Switch static const to an enum to silence MSVC linker warnings
Integral class statics are handled oddly in MSVC, we don't need them in
this case, use an enum instead.

llvm-svn: 241945
2015-07-10 21:50:04 +00:00
Evgeniy Stepanov 00b3020453 Fix AArch64 prologue for empty frame with dynamic allocas.
Fixes PR23804: assertion failure in emitPrologue in the case of a
function with an empty frame and a dynamic alloca that needs stack
realignment. This is a typical case for AddressSanitizer.

llvm-svn: 241943
2015-07-10 21:24:07 +00:00
Jingyue Wu a277561922 [TTI] BasicTTIImpl assumes no vector registers
Summary:
Following the discussion on r241884, it's more reasonable to assume that a
target has no vector registers by default instead of letting every such
target overrides getNumberOfRegisters.

Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to
return 0 when Vector is true, and partially reverts r241884 which
modifies NVPTXTTIImpl::getNumberOfRegisters.

It also fixes a performance bug in LoopVectorizer. Even if a target has
no vector registers, vectorization may still help ILP. So, we need both
checks to be false before disabling loop vectorization all together.

Reviewers: hfinkel

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11108

llvm-svn: 241942
2015-07-10 21:14:54 +00:00
Adam Nemet 215746b45a [LoopDist/LoopVer] Move LoopVersioning to a new module, NFC
Summary:
The class will obviously need improvement down the road.  For one, there
is no reason that addPHINodes would have to be exposed like that.  I
will make this and other improvements in follow-up patches.

The main goal is to be able to share this functionality.  The
LoopLoadElimination pass I am working on needs it too.  Later we can
move other clients as well (LV and Ashutosh's LICMVer).

Reviewers: hfinkel, ashutosh.nema

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10577

llvm-svn: 241932
2015-07-10 18:55:13 +00:00
Adam Nemet 1a689188c4 [LoopDist] Move loop-versioning helper functions to Cloning, NFC
Summary:
This makes them available to the LoopVersioning class as that is moved
to its own module in the next patch.

Reviewers: ashutosh.nema, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10576

llvm-svn: 241931
2015-07-10 18:55:09 +00:00
Matthias Braun d9bd22b2c4 ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code
This commit factors out common code from MergeBaseUpdateLoadStore() and
MergeBaseUpdateLSMultiple() and introduces a new function
MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a
strd/ldrd instruction into an strd/ldrd instruction with writeback where
possible.

Differential Revision: http://reviews.llvm.org/D10676

llvm-svn: 241928
2015-07-10 18:37:33 +00:00
Fiona Glaser b08ae7affb ComputeKnownBits: be a bit smarter about ADDs
If our two inputs have known top-zero bit counts M and N, we trivially
know that the output cannot have any bits set in the top (min(M, N)-1)
bits, since nothing could carry past that point.

llvm-svn: 241927
2015-07-10 18:29:02 +00:00
Matthias Braun e4ba6b8c24 ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2
Differential Revision: http://reviews.llvm.org/D10623

llvm-svn: 241926
2015-07-10 18:28:49 +00:00
JF Bastien 5ca0baca4a WebAssembly: basic instructions todo, and basic register info.
Summary:
This code is based on AArch64 for modern backend good practice, and NVPTX for
virtual ISA concerns.

Reviewers: sunfish

Subscribers: aemerson, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11070

llvm-svn: 241923
2015-07-10 18:23:10 +00:00
Alex Lorenz f6bc8667cd MIR Serialization: Initial serialization of stack objects.
This commit implements the initial serialization of stack objects from the
MachineFrameInfo class. It can only serialize the ordinary stack objects
(including ordinary spill slots), but it doesn't serialize variable sized or
fixed stack objects yet.

The stack objects are serialized using a YAML sequence of YAML inline mappings.
Each mapping has the object's ID, type, size, offset and alignment. The stack
objects are a part of machine function's YAML mapping.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 241922
2015-07-10 18:13:57 +00:00
JF Bastien b73a2ed20e Target RegisterInfo: devirtualize TargetFrameLowering
Summary:
The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can.

This change was suggested by sunfish in D11070 for WebAssembly, I figure that I may as well improve the other targets while I'm here.

Subscribers: sunfish, ted, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11093

llvm-svn: 241921
2015-07-10 18:13:17 +00:00
Matthias Braun a4a3182ded ARMLoadStoreOptimizer: Rewrite LDM/STM matching logic.
This improves the logic in several ways and is a preparation for
followup patches:
- First perform an analysis and create a list of merge candidates, then
  transform. This simplifies the code in that you have don't have to
  care to much anymore that you may be holding iterators to
  MachineInstrs that get removed.
- Analyze/Transform basic blocks in reverse order. This allows to use
  LivePhysRegs to find free registers instead of the RegisterScavenger.
  The RegisterScavenger will become less precise in the future as it
  relies on the deprecated kill-flags.
- Return the newly created node in MergeOps so there's no need to look
  around in the schedule to find it.
- Rename some MBBI iterators to InsertBefore to make their role clear.
- General code cleanup.

Differential Revision: http://reviews.llvm.org/D10140

llvm-svn: 241920
2015-07-10 18:08:49 +00:00
Eli Bendersky 5c0039a014 Actually support volatile memcpys in NVPTX lowering
Differential Revision: http://reviews.llvm.org/D11091

llvm-svn: 241914
2015-07-10 15:40:33 +00:00
Nemanja Ivanovic d9e4b4ff36 NFC. Added a blank line for consistency.
llvm-svn: 241913
2015-07-10 14:25:17 +00:00
Benjamin Kramer f4ebfa3ae1 [InstSimplify] Fold away ord/uno fcmps when nnan is present.
This is important to fold away the slow case of complex multiplies
emitted by clang.

llvm-svn: 241911
2015-07-10 14:02:02 +00:00
James Molloy 88eb535b2d Add support for fast-math flags to the FCmp instruction.
FCmp behaves a lot like a floating-point binary operator in many ways,
and can benefit from fast-math information. Flags such as nsz and nnan
can affect if this fcmp (in combination with a select) can be treated
as a fminnum/fmaxnum operation.

This adds backwards-compatible bitcode support, IR parsing and writing,
LangRef changes and IRBuilder changes. I'll need to audit InstSimplify
and InstCombine in a followup to find places where flags should be
copied.

llvm-svn: 241901
2015-07-10 12:52:00 +00:00
Nemanja Ivanovic 5655fb320c Add missing builtins to the PPC back end for ABI compliance (vol. 3)
This patch corresponds to review:
http://reviews.llvm.org/D10973

Back end portion of the third round of additions to altivec.h.

llvm-svn: 241900
2015-07-10 12:38:08 +00:00
Alexey Bataev da33d80e9a Disable loop re-rotation for -Oz (patch by Andrey Turetsky)
After changes in rL231820 loop re-rotation is performed even in -Oz mode. Since loop rotation is disabled for -Oz, it seems loop re-rotation should be disabled too.
Differential Revision: http://reviews.llvm.org/D10961

llvm-svn: 241897
2015-07-10 10:37:09 +00:00
David Majnemer db82d2f338 Revert the new EH instructions
This reverts commits r241888-r241891, I didn't mean to commit them.

llvm-svn: 241893
2015-07-10 07:15:17 +00:00
David Majnemer 82771b1ad6 Tighten the verifier check for catchblock.
llvm-svn: 241891
2015-07-10 07:01:07 +00:00
David Majnemer 11aeb90aaa Address Joseph's review comments.
llvm-svn: 241890
2015-07-10 07:01:03 +00:00
David Majnemer 1d3fe98d57 Address Reid's review feedback.
llvm-svn: 241889
2015-07-10 07:00:58 +00:00
David Majnemer ae2ffc8a8c New EH representation for MSVC compatibility
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11041

llvm-svn: 241888
2015-07-10 07:00:44 +00:00
Bjorn Steinbrink 8350534772 [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue
llvm-svn: 241887
2015-07-10 06:55:49 +00:00
Bjorn Steinbrink a91fd0998f [InstCombine] Properly combine metadata when replacing a load with another
Not doing this can lead to misoptimizations down the line, e.g. because
of range metadata on the replacing load excluding values that are valid
for the load that is being replaced.

llvm-svn: 241886
2015-07-10 06:55:44 +00:00
Jingyue Wu ad85c8c204 [NVPTX] declare no vector registers
Summary:
Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll)
produces code with complex control flow which hurts later optimizations. Since
NVPTX doesn't have vector registers in LLVM's sense
(NVPTXTTI::getRegisterBitWidth(true) == 32), we for now declare no vector
registers to effectively disable loop vectorization.

Reviewers: jholewinski

Subscribers: jingyue, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11089

llvm-svn: 241884
2015-07-10 04:31:56 +00:00
Reid Kleckner 85a2450d56 [WinEH] Make sure LSDA tables are 4 byte aligned
Apparently this is important, otherwise _except_handler3 assumes that
the registration node is corrupted and ignores it.

Also fix a bug in WinEHPrepare where we would insert code after a
terminator instruction.

llvm-svn: 241877
2015-07-10 00:08:49 +00:00
Eli Bendersky d880520bc2 Replace index-loops by range-based loops
NFC

llvm-svn: 241875
2015-07-09 23:06:03 +00:00
Sanjay Patel 81beefc541 [x86] enable machine combiner reassociations for scalar double-precision multiplies
llvm-svn: 241873
2015-07-09 22:58:39 +00:00
Sanjay Patel ea81edf351 [x86] enable machine combiner reassociations for scalar double-precision adds
llvm-svn: 241871
2015-07-09 22:48:54 +00:00
Alex Lorenz 28148ba82d MIR Serialization: Serialize the virtual register definitions.
The virtual registers are serialized using a YAML sequence of YAML inline
mappings. Each mapping has the id of the virtual register and the register
class.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10981

llvm-svn: 241868
2015-07-09 22:23:13 +00:00
Adam Nemet 0f67c6c1d5 [LAA] Fix grammar in debug output
llvm-svn: 241867
2015-07-09 22:17:41 +00:00
Adam Nemet ee61474a61 [LAA] Hide NeedRTCheck logic completely inside canCheckPtrAtRT, NFC
Currently canCheckPtrAtRT returns two flags NeedRTCheck and CanDoRT.
NeedRTCheck says whether we need checks and CanDoRT whether we can
generate the checks.  The idea is to encode three states with these:

     Need/Can:
(1) false/dont-care: no checks are needed
(2) true/false: we need checks but can't generate them
(3) true/true: we need checks and we can generate them

This is pretty unnecessary since the caller (analyzeLoop) is only
interested in whether we can generate the checks if we actually need
them (i.e. 1 or 3).

So this change cleans up to return just that (CanDoRTIfNeeded) and pulls
all the underlying logic into canCheckPtrAtRT.

By doing all this, we simplify analyzeLoop which is the complex function
in LAA.

There is further room for improvement here by using RtCheck.Need
directly rather than a new local variable NeedRTCheck but that's for a
later patch.

llvm-svn: 241866
2015-07-09 22:17:38 +00:00
Reid Kleckner 8eecb3c160 [WinEH] Give up on using CSRs across 32-bit invokes for now
The runtime does not restore CSRs when transferring control back to the
function handling the exception. According to the experts on IRC, LLVM's
register allocator has no way to model register clobbers that only
happen on one edge of the CFG. For now, don't worry about trying to use
the meager three CSRs available on 32-bit X86 and just say that such
invokes preserve nothing.

llvm-svn: 241865
2015-07-09 22:09:41 +00:00
Reid Kleckner c16b1078df Expose sjlj preparation through opt for my own debugging purposes
llvm-svn: 241864
2015-07-09 21:48:40 +00:00
Alex Lorenz c8704b02df MIR Parser: Report an error when parsing machine function with an empty body.
This commit adds a new error which is reported when the MIR Parser encounters
a machine function without any machine basic blocks. The machine verifier
expects that the machine functions have at least one MBB, and this error will
prevent machine functions without MBBs from reaching the machine verifier and
crashing with an assertion.

llvm-svn: 241862
2015-07-09 21:21:33 +00:00
Tom Stellard dcb9f0907f AMDGPU: Add helper function for implicit parameter offsets.
Patch by: Zoltan Gilian

llvm-svn: 241861
2015-07-09 21:20:37 +00:00
JF Bastien b379643f7c Unbreak WebAssembly build
Summary: D11021 and D11045 didn't update the WebAssembly target's code. It's still experimental so all tests passed.

Reviewers: sunfish, joker.eph, echristo

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11084

llvm-svn: 241859
2015-07-09 21:00:09 +00:00
Sanjoy Das c3a8e398a2 [ImplicitNullChecks] Fix a memory leak.
llvm-svn: 241851
2015-07-09 20:13:31 +00:00
Sanjoy Das b771845461 [ImplicitNullChecks] Be smarter in picking the memory op.
Summary:
Before this change ImplicitNullChecks would only pick loads of the form:

```
   test Reg, Reg
   jz elsewhere
 fallthrough:
   movl 32(Reg), Reg2
```

but not (say)

```
   test Reg, Reg
   jz elsewhere
 fallthrough:
   inc Reg3
   movl 32(Reg), Reg2
```

This change teaches ImplicitNullChecks to look through "unrelated"
instructions like `inc Reg3` when searching for a load instruction
to convert to a trapping load.

Reviewers: atrick, JosephTremoulet, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11044

llvm-svn: 241850
2015-07-09 20:13:25 +00:00
Alex Lorenz 60541c1d44 MIR Serialization: Serialize the simple MachineFrameInfo attributes.
This commit serializes the 13 scalar boolean and integer attributes from the
MachineFrameInfo class: IsFrameAddressTaken, IsReturnAddressTaken, HasStackMap,
HasPatchPoint, StackSize, OffsetAdjustment, MaxAlignment, AdjustsStack,
HasCalls, MaxCallFrameSize, HasOpaqueSPAdjustment, HasVAStart, and
HasMustTailInVarArgFunc. These attributes are serialized as part
of the frameInfo YAML mapping, which itself is a part of the machine function's
YAML mapping.

llvm-svn: 241844
2015-07-09 19:55:27 +00:00
Rafael Espindola 594e676cbe llvm-ar: Pad the symbol table to 4 bytes.
It looks like ld64 requires it. With this we seem to be able to bootstrap using
llvm-ar+/usr/bin/true instead of ar+ranlib (currently on stage2).

llvm-svn: 241842
2015-07-09 19:48:06 +00:00
Matt Arsenault 8b03e6c164 AMDGPU/R600: Return correct chain when lowering loads
The other LowerLOAD should be returning the correct chain.

llvm-svn: 241839
2015-07-09 18:47:03 +00:00
Sanjoy Das 6f062c8c2a [IndVars] Try to use existing values in RewriteLoopExitValues.
Summary:
In RewriteLoopExitValues, before expanding out an SCEV expression using
SCEVExpander, try to see if an existing LLVM IR expression already
computes the value we're interested in.  If so use that existing
expression.

Apart from reducing IndVars' reliance on the rest of the compilation
pipeline, this also prevents IndVars from concluding some expressions as
"high cost" when they're not.  For instance,
`InductiveRangeCheckElimination` often emits code of the following form:

```
len = umin(len_A, len_B)

loop:
  ...
  if (i++ < len)
    goto loop

outside_loop:
    use(i)
```

`SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`,
since it thinks the value of `i` on loop exit, `len`, is a high cost
expansion since it contains an `umax` in it.  With this change,
`IndVars` can see that it can re-use `len` instead of creating a new
expression to compute `umin(len_A, len_B)`.

I considered putting this cleverness in `SCEVExpander`, but I was
worried that it may then have a deterimental effect on other passes
that use it.  So I decided it was better to just do this in the one
place where it seems like an obviously good idea, with the intent of
generalizing later if needed.

Reviewers: atrick, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10782

llvm-svn: 241838
2015-07-09 18:46:12 +00:00
Reid Kleckner 0f7f8d41f7 Remove dead code from old 64-bit SEH lowering
llvm-svn: 241829
2015-07-09 17:46:39 +00:00
Pat Gavlin a717f255b6 Allow {e,r}bp as the target of {read,write}_register.
This patch allows the read_register and write_register intrinsics to
read/write the RBP/EBP registers on X86 iff the targeted register is
the frame pointer for the containing function.

Differential Revision: http://reviews.llvm.org/D10977

llvm-svn: 241827
2015-07-09 17:40:29 +00:00
Sanjay Patel e2361d4a18 fix an invisible bug when combining repeated FP divisors
This patch fixes bugs that were exposed by the addition of fast-math-flags in the DAG:
r237046 ( http://reviews.llvm.org/rL237046 ):

1. When replacing a division node, it's not enough to RAUW.
   We should call CombineTo() to delete dead nodes and combine again.
2. Because we are changing the DAG, we can't return an empty SDValue
   after the transform. As the code comments say:

    Visitation implementation - Implement dag node combining for different node types.
    The semantics are as follows: Return Value:
      SDValue.getNode() == 0 - No change was made
      SDValue.getNode() == N - N was replaced, is dead and has been handled.
      otherwise - N should be replaced by the returned Operand.

The new test case shows no difference with or without this patch, but it will crash if
we re-apply r237046 or enable FMF via the current -enable-fmf-dag cl::opt.

Differential Revision: http://reviews.llvm.org/D9893

llvm-svn: 241826
2015-07-09 17:28:37 +00:00
Juergen Ributzka 216ed03ebb [StackMap] Use lambdas to specify the sort and erase conditions. NFC.
llvm-svn: 241823
2015-07-09 17:11:15 +00:00
Juergen Ributzka aef76cafa0 [StackMap] Rename variables to be more consistent. NFC.
Rename a few variables and use auto for long iterator names.

llvm-svn: 241822
2015-07-09 17:11:11 +00:00
Juergen Ributzka e4685a1c0d [StackMaps] Use emplace_back when possible. NFC.
llvm-svn: 241821
2015-07-09 17:11:08 +00:00
Tom Stellard ab6e9c0f94 AMDGPU/SI: The SIShrinkInstructions pass should only fold immediates with one use
This is convered by existing testcases and will be exposed by a future
commit.

llvm-svn: 241817
2015-07-09 16:30:36 +00:00
Tom Stellard 9ebf7ca2f0 AMDGPU/SI: Fix crash on physical registers in SIInstrInfo::isOperandLegal()
No test case for this.  I ran into it while working on some improvements
to SIShrinkInstructions.cpp.

llvm-svn: 241816
2015-07-09 16:30:27 +00:00
Rafael Espindola c79bff6bb1 Basic support for BSD symbol tables in archives.
This could be optimized and for now we only produce __.SYMDEF
and not "__.SYMDEF SORTED".

llvm-svn: 241814
2015-07-09 15:56:23 +00:00
Krzysztof Parzyszek 8b26fbf758 [Hexagon] Add missing preamble to a source file
llvm-svn: 241813
2015-07-09 15:40:25 +00:00
Rafael Espindola 2ba806c702 Remove redundant variable. NFC.
llvm-svn: 241810
2015-07-09 15:24:39 +00:00
Silviu Baranga ce3877fc8c Don't rely on the DepCands iteration order when constructing checking pointer groups
Summary:
The checking pointer group construction algorithm relied on the iteration on DepCands.
We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism.

This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic.
We need to update the tests, since the order in which pointers appear has changed.

No new tests were added, since it is impossible to test for non-determinism.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11064

llvm-svn: 241809
2015-07-09 15:18:25 +00:00
Rafael Espindola b870e9ca93 Add a helper to printing BE of LE depending on the format.
The gnu ar format uses BE numbers. The BSD one uses LE. Add a helper for one or the
other. NFC for now, just removes some noise from the following patch.

llvm-svn: 241808
2015-07-09 15:13:41 +00:00
Mehdi Amini eaabc51e78 Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user
A documentation for this function would be nice by the way.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241807
2015-07-09 15:12:23 +00:00
Pawel Bylica d1b818bcf4 Reapply fixed r241790: Fix shift legalization and lowering for big constants.
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.

Reviewers: nadav, majnemer, sanjoy, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: http://reviews.llvm.org/D10767

llvm-svn: 241806
2015-07-09 14:58:04 +00:00
Rafael Espindola 8cde5c01d8 Extract printBSDMemberHeader.
It will get another use in the following patch. Also rename the other helper to
printGNUSmallMemberHeader for consistency.

llvm-svn: 241805
2015-07-09 14:54:12 +00:00
Krzysztof Parzyszek feaf7b8d35 [Hexagon] Add support for atomic RMW operations
llvm-svn: 241804
2015-07-09 14:51:21 +00:00
Arnaud A. de Grandmaison f40f99e3a4 [AArch64] Select SBFIZ or UBFIZ instead of left + right shifts
And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology.

llvm-svn: 241803
2015-07-09 14:33:38 +00:00
Scott Douglass 8143bc25ee [ARM] Thumb1 3 to 2 operand convertion for commutative operations
Differential Revision: http://reviews.llvm.org/D11057

llvm-svn: 241802
2015-07-09 14:13:55 +00:00
Scott Douglass 2740a63725 [ARM] Don't be overzealous converting Thumb1 3 to 2 operands
Differential Revision: http://reviews.llvm.org/D11056

llvm-svn: 241801
2015-07-09 14:13:48 +00:00
Scott Douglass 47a3fce461 [ARM] Add Thumb2 ADD with PC narrowing from 3 operand to 2
Differential Revision: http://reviews.llvm.org/D11055

llvm-svn: 241800
2015-07-09 14:13:41 +00:00
Scott Douglass 8c7803f4c1 [ARM] Refactor converting Thumb1 from 3 to 2 operand (nfc)
Also adds some test cases.

Differential Revision: http://reviews.llvm.org/D11054

llvm-svn: 241799
2015-07-09 14:13:34 +00:00
Renato Golin 17d4efe7c1 Add support for nest attribute to AArch64 backend
The nest attribute is currently supported on the x86 (32-bit) and x86-64
backends, but not on ARM (32-bit) or AArch64. This patch adds support for
nest to the AArch64 backend.

Register x18 is used by GCC for this purpose and hence is used here.
As discussed on the GCC mailing list the register choice is an ABI issue
and so choosing the same register as GCC means __builtin_call_with_static_chain
is compatible.

Patch by Stephen Cross.

llvm-svn: 241794
2015-07-09 10:18:02 +00:00
Tamas Berghammer b6b0ddfc95 Add getSizeInBits function to the APFloat class
The newly added function returns the size of the specified floating
point semantics in bits.

Differential revision: http://reviews.llvm.org/D8413

llvm-svn: 241793
2015-07-09 10:13:39 +00:00
Pawel Bylica 627762fda5 Revert r241790: Fix shift legalization and lowering for big constants.
llvm-svn: 241792
2015-07-09 09:50:54 +00:00
Pawel Bylica eb122f2baf Fix shift legalization and lowering for big constants.
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.

Reviewers: nadav, majnemer, sanjoy, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: http://reviews.llvm.org/D10767

llvm-svn: 241790
2015-07-09 08:01:36 +00:00
Elena Demikhovsky 37a4da825f Extended syntax of vector version of getelementptr instruction.
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html

According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.

%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.

(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

In all cases the %A type is <4 x i8*>

In the case (2) we add the same offset to all pointers.

The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].

The documentation is updated.

http://reviews.llvm.org/D10496

llvm-svn: 241788
2015-07-09 07:42:48 +00:00
Adam Nemet b41d2d3fa3 [LAA] Fix line break in comment
llvm-svn: 241785
2015-07-09 06:47:21 +00:00
Adam Nemet 5dc3b2cf53 [LAA] Rename IsRTNeeded to IsRTCheckAnalysisNeeded
The original name was too close to NeedRTCheck which is what the actual
memcheck analysis returns.  This flag, as the new name suggests, is only
used to whether to initiate that analysis.

Also a comment is added to answer one question I had about this code for
a long time.  Namely, how does this flag differ from
isDependencyCheckNeeded since they are seemingly set at the same time.

llvm-svn: 241784
2015-07-09 06:47:18 +00:00
Mehdi Amini 157e5a6d10 Remove getDataLayout() from TargetSelectionDAGInfo (had no users)
Summary:
Remove empty subclass in the process.

This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted

Differential Revision: http://reviews.llvm.org/D11045

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241780
2015-07-09 02:10:08 +00:00
Mehdi Amini a749f2ad47 Remove getDataLayout() from TargetLowering
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: yaron.keren, rafael, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11042

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241779
2015-07-09 02:09:52 +00:00
Mehdi Amini 0cdec1e2ab Make isLegalAddressingMode() taking DataLayout as an argument
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11040

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241778
2015-07-09 02:09:40 +00:00
Mehdi Amini 5c183d5239 Make getByValTypeAlignment() taking DataLayout as an argument
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: yaron.keren, rafael, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11038

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241777
2015-07-09 02:09:28 +00:00
Mehdi Amini 9639d650bb Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11037

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241776
2015-07-09 02:09:20 +00:00
Mehdi Amini 44ede33a69 Make TargetLowering::getPointerTy() taking DataLayout as an argument
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D11028

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241775
2015-07-09 02:09:04 +00:00
Mehdi Amini 5010ebf181 Make TargetTransformInfo keeping a reference to the Module DataLayout
DataLayout is no longer optional. It was initialized with or without
a DataLayout, and the DataLayout when supplied could have been the
one from the TargetMachine.

Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11021

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241774
2015-07-09 02:08:42 +00:00
Mehdi Amini 56228dabfa Redirect DataLayout from TargetMachine to Module in ComputeValueVTs()
Summary:
Avoid using the TargetMachine owned DataLayout and use the Module owned
one instead. This requires passing the DataLayout up the stack to
ComputeValueVTs().

This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, yaron.keren, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D11019

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241773
2015-07-09 01:57:34 +00:00
David Majnemer 3f49e662c8 [CodeView] Add support for emitting column information
Column information is present in CodeView when the line table subsection
has bit 0 set to 1 in it's flags field.  The column information is
represented as a pair of 16-bit quantities: a starting and ending
column.  This information is present at the end of the chunk, after all
the line-PC pairs.

llvm-svn: 241764
2015-07-09 00:19:51 +00:00
Adam Nemet 943befedf1 [LAA] Fix misleading use of word 'consecutive'
Fix some places where the word consecutive is used but the code really
means constant-stride (i.e. not just unit stride).

llvm-svn: 241763
2015-07-09 00:03:22 +00:00
Alex Lorenz 4d026b89da MIR Serialization: Serialize the 'undef' register machine operand flag.
llvm-svn: 241762
2015-07-08 23:58:31 +00:00
Sanjay Patel 1319446195 [SLPVectorizer] Try different vectorization factors for store chains
...and set max vector register size based on target 

This patch is based on discussion on the llvmdev mailing list:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html

and also solves:
https://llvm.org/bugs/show_bug.cgi?id=17170

Several FIXME/TODO items are noted in comments as potential improvements.

Differential Revision: http://reviews.llvm.org/D10950

llvm-svn: 241760
2015-07-08 23:40:55 +00:00
Matthias Braun 91e85d4327 RegisterPressure: Add PressureDiff::dump()
Also display the pressure diff in the case of a
getMaxUpwardPressureDelta() verify failure.

llvm-svn: 241759
2015-07-08 23:40:27 +00:00
Adam Nemet 424edc6c80 [LAA] Revert a small part of r239295
This commit ([LAA] Fix estimation of number of memchecks) regressed the
logic a bit.  We shouldn't quit the analysis if we encounter a pointer
without known bounds *unless* we actually need to emit a memcheck for
it.

The original code was using NumComparisons which is now computed
differently.  Instead I compute NeedRTCheck from NumReadPtrChecks and
NumWritePtrChecks.

As side note, I find the separation of NeedRTCheck and CanDoRT
confusing, so I will try to merge them in a follow-up patch.

llvm-svn: 241756
2015-07-08 22:58:48 +00:00
Juergen Ributzka d25407e972 Run clang-format before making changes to StackMaps. NFC.
llvm-svn: 241754
2015-07-08 22:42:09 +00:00
Sanjay Patel 093fb170a6 [x86] enable machine combiner reassociations for scalar single-precision multiplies
llvm-svn: 241752
2015-07-08 22:35:20 +00:00
Rafael Espindola 4104fe8ae9 Don't reject an archive with just a symbol table.
It is pretty unambiguous how to interpret it and gnu ar accepts it too.

llvm-svn: 241750
2015-07-08 22:27:54 +00:00
Rafael Espindola c91177e410 Disallow Archive::child_iterator that don't point to an archive.
NFC, just less error prone.

llvm-svn: 241747
2015-07-08 22:15:07 +00:00
Michael Zolotukhin 97295ea7dd [LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to NewVectorPH. NFCI.
llvm-svn: 241742
2015-07-08 21:48:03 +00:00
Michael Zolotukhin 8c874bb2f1 [LoopVectorizer] Restructurize code for emitting RT checks. NFCI.
Place all code corresponding to a run-time check in one place.
Previously we generated some code, then proceeded to a next check, then
finished the code for the first check (like splitting blocks and
generating branches). Now the code for generating a check is
self-contained.

llvm-svn: 241741
2015-07-08 21:47:59 +00:00
Michael Zolotukhin 66f5591f9b [LoopVectorizer] Remove redundant variables PastOverflowCheck and OverflowCheckAnchor. NFCI.
llvm-svn: 241740
2015-07-08 21:47:56 +00:00
Michael Zolotukhin 00345cadd5 [LoopVectorizer] Move some code around to ease further refactoring. NFCI.
llvm-svn: 241739
2015-07-08 21:47:53 +00:00
Michael Zolotukhin 7db3063f87 [LoopVectorizer] Remove redundant variable LastBypassBlock. NFC.
llvm-svn: 241738
2015-07-08 21:47:47 +00:00
Alex Lorenz df08179d1b MIR Parser: Remove redundant TODO comment. NFC.
This TODO comment has been redundant since r240474.

llvm-svn: 241737
2015-07-08 21:30:21 +00:00
Alex Lorenz 495ad87919 MIR Serialization: Serialize the 'killed' register machine operand flag.
llvm-svn: 241734
2015-07-08 21:23:34 +00:00
Diego Novillo 13e20f1bbf Add missing dependency to Hexagon target.
A recent patch added calls to isInstructionTriviallyDead without the
corresponding dependency on TransformUtils.

llvm-svn: 241731
2015-07-08 21:13:37 +00:00