Commit Graph

141655 Commits

Author SHA1 Message Date
Peter Collingbourne bc0705240e IR: Move NumElements field from {Array,Vector}Type to SequentialType.
Now that PointerType is no longer a SequentialType, all SequentialTypes
have an associated number of elements, so we can move that information to
the base class, allowing for a number of simplifications.

Differential Revision: https://reviews.llvm.org/D27122

llvm-svn: 288464
2016-12-02 03:20:58 +00:00
Dehao Chen c3be225895 Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)
llvm-svn: 288463
2016-12-02 03:17:07 +00:00
Peter Collingbourne 4568158c4d IR: Change PointerType to derive from Type rather than SequentialType.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html

This is for a couple of reasons:

- Values of type PointerType are unlike the other SequentialTypes (arrays
  and vectors) in that they do not hold values of the element type. By moving
  PointerType we can unify certain aspects of how the other SequentialTypes
  are handled.
- PointerType will have no place in the SequentialType hierarchy once
  pointee types are removed, so this is a necessary step towards removing
  pointee types.

Differential Revision: https://reviews.llvm.org/D26595

llvm-svn: 288462
2016-12-02 03:05:41 +00:00
Peter Collingbourne 25a40759c1 Fix GlobalISel build.
llvm-svn: 288460
2016-12-02 02:55:30 +00:00
Matt Arsenault 47a4b39646 ConstantFolding: Factor code into helper function
llvm-svn: 288459
2016-12-02 02:26:02 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Paul Robinson dad4907bc1 [DWARF] Put linkage-name on abstract origin even when there's a declaration.
In r266692, we made it possible to emit linkage names for just inlined
functions, putting the attribute on the abstract origin. Make sure we
don't think the linkage-name was already emitted on a declaration.

Differential Revision: http://reviews.llvm.org/D27320

llvm-svn: 288450
2016-12-02 01:55:17 +00:00
Teresa Johnson 185b4ab6d4 [ThinLTO] Stop importing constant global vars as copies in the backend
Summary:
We were doing an optimization in the ThinLTO backends of importing
constant unnamed_addr globals unconditionally as a local copy (regardless
of whether the thin link decided to import them). This should be done in
the thin link instead, so that resulting exported references are marked
and promoted appropriately, but will need a summary enhancement to mark
these variables as constant unnamed_addr.

The function import logic during the thin link was trying to handle
this proactively, by conservatively marking all values referenced in
the initializer lists of exported global variables as also exported.
However, this only handled values referenced directly from the
initializer list of an exported global variable. If the value is itself
a constant unnamed_addr variable, we could end up exporting its
references as well. This caused multiple issues. The first is that the
transitively exported references weren't promoted. Secondly, some could
not be promoted/renamed (e.g. they had a section or other constraint).
recursively, instead of just adding the first level of initializer list
references to the ExportList directly.

Remove this optimization and the associated handling in the function
import backend. SPEC measurements indicate we weren't getting much
from it in any case.

Fixes PR31052.

Reviewers: mehdi_amini

Subscribers: krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26880

llvm-svn: 288446
2016-12-02 01:02:30 +00:00
Matt Arsenault c47701c0e9 AMDGPU: Use wider scalar spills for SGPR spilling
Since the spill is for the whole wave, these
don't have the swizzling problems that vector stores do
and a single 4-byte allocation is enough to spill a 64 element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.

This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.

llvm-svn: 288445
2016-12-02 00:54:45 +00:00
Wolfgang Pieb 42f92a7225 When instructions are hoisted out of loops by MachineLICM, remove their debug loc.
This prevents erratic stepping behavior as well as incorrect source attribution
for sample profiling.

Reviewers: dblakie

Subscribers: llvm-commit

Differential Revision: https://reviews.llvm.org/D27290

llvm-svn: 288442
2016-12-02 00:37:57 +00:00
Justin Bogner 35c5e58f8c SDAG: Avoid a large, usually empty SmallVector in a recursive function
This SmallVector is using up 128 bytes on the stack every time despite
almost always being empty[1], and since this function can recurse quite
deeply that adds up to a lot of overhead. We've seen this run afoul of
ulimits in some cases with ASAN on.

Replacing the SmallVector with a std::vector trades an occasional heap
allocation for vastly less stack usage.

[1]: I gathered some stats on an internal test suite and the vector
was non-empty in only 45,000 of 10,000,000 calls to this function.

llvm-svn: 288441
2016-12-02 00:11:01 +00:00
Geoff Berry 7ffce7be0c [AArch64] Fold more spilled/refilled COPYs.
Summary:
Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all
full COPYs between register classes of the same size that are either
spilled or refilled.

Reviewers: MatzeB, qcolombet

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27271

llvm-svn: 288439
2016-12-01 23:43:55 +00:00
Dan Gohman 734c59d501 [MC] Refactor emitELFSize to make usage more consistent. NFC.
Move the cast<MCSymbolELF> inside emitELFSize, so that: 
 - it's done in one place instead of at each call
 - it's more consistent with similar functions like EmitCOFFSafeSEH
 - ambiguity between cast<> and dyn_cast<> is avoided (which also
   eliminates an unnecessary dyn_cast call)

This also makes it easier to experiment with using ".size" directives on
non-ELF targets.

llvm-svn: 288437
2016-12-01 23:39:08 +00:00
Peter Collingbourne 85c2184a8e llvm-modextract: Call keep() on the output stream before exiting.
llvm-svn: 288435
2016-12-01 23:13:11 +00:00
Oleg Ranevskyy e2ae41519f [ARM] Fix for 64-bit CAS expansion on ARM32 with -O0
Summary:
This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion.

Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality.

Example of the broken C++ code:
```
std::atomic<long long> at(2);

long long ll = 1;
std::atomic_compare_exchange_strong(&at, &ll, 3);
```
Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3.

The patch replaces SBC with CMPEQ.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, llvm-commits, asl

Differential Revision: https://reviews.llvm.org/D27315

llvm-svn: 288433
2016-12-01 22:58:35 +00:00
Artem Belevich 704395a25a Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."
This reverts r288412 which causes severe compile-time regression.

llvm-svn: 288431
2016-12-01 22:52:15 +00:00
Matthias Braun 709a4cc238 RegisterCoalscer: Only coalesce complete reserved registers.
The coalescer eliminates copies from reserved registers of the form:
   %vregX = COPY %rY
in the case where %rY is a reserved register. However this turns out to
be invalid if only some of the subregisters are reserved (see also
https://reviews.llvm.org/D26648).

Differential Revision: https://reviews.llvm.org/D26687

llvm-svn: 288428
2016-12-01 22:39:51 +00:00
Eugene Zelenko e7c0b2e0f8 Fix broken buildbots because of r288424 (NFC).
llvm-svn: 288426
2016-12-01 22:26:55 +00:00
Eugene Zelenko f65e4ce2c4 [ADT, Support, TableGen] Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 288424
2016-12-01 22:13:24 +00:00
David Blaikie 4aa8175a92 [dsymutil] Simplify a lazy-init condition/expression
llvm-svn: 288423
2016-12-01 22:04:16 +00:00
David Blaikie e40caaee99 [debug info] Minor cleanup from D27170/r288399
llvm-svn: 288421
2016-12-01 21:59:09 +00:00
Chih-Hung Hsieh 76b913c470 [SelectionDAG] getRawSubclassData should not return HasDebugValue.
This change fixes a regression in r279537 and
makes getRawSubclassData behave like r279536.
Without this change, the fp128-g.ll test case will have an
infinite loop involving SoftenFloatRes_LOAD.

Differential Revision: http://reviews.llvm.org/D26942

llvm-svn: 288420
2016-12-01 21:56:33 +00:00
Tim Northover 5bb87b6769 AArch64: fix 128-bit cmpxchg at -O0 (again, again).
This time the issue is fortunately just a simple mistake rather than a horrible
design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for
comparing two 64-bit values, but they don't.

The fix is slightly clunkier in AArch64 because we can't use conditional
execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to
an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a
CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway.

Thanks to Anton Korobeynikov for pointing out the issue.

llvm-svn: 288418
2016-12-01 21:31:59 +00:00
Mehdi Amini 873947141b Improve documentation on MSVC workaround for AlignedCharArray (NFC)
The comment only mentioned "old version of MSVC".

Differential Revision: https://reviews.llvm.org/D27312

llvm-svn: 288417
2016-12-01 20:54:29 +00:00
Benjamin Kramer 215b22e612 Fix unused variable warning in Release builds. NFC.
llvm-svn: 288416
2016-12-01 20:49:34 +00:00
Philip Reames 89e92d21b4 [PR29121] Don't fold if it would produce atomic vector loads or stores
The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal.

Differential Revision: https://reviews.llvm.org/D24365

llvm-svn: 288415
2016-12-01 20:17:06 +00:00
Philip Reames 4d00af1bde Factor out common parts of LVI and Float2Int into ConstantRange [NFCI]
This just extracts out the transfer rules for constant ranges into a single shared point. As it happens, neither bit of code actually overlaps in terms of the handled operators, but with this change that could easily be tweaked in the future.

I also want to have this separated out to make experimenting with a eager value info implementation and possibly a ValueTracking-like fixed depth recursion peephole version. There's no reason all four of these can't share a common implementation which reduces the chances of bugs.

Differential Revision: https://reviews.llvm.org/D27294

llvm-svn: 288413
2016-12-01 20:08:47 +00:00
Alexey Bataev 2c01af5904 [SLP] Fix for PR6246: vectorization for scalar ops on vector elements.
When trying to vectorize trees that start at insertelement instructions
function tryToVectorizeList() uses vectorization factor calculated as
MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree
cost for this fixed vectorization factor is too high.
Patch tries to improve the situation. It tries different vectorization
factors from max(PowerOf2Floor(NumberOfVectorizedValues),
MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries
to choose the best one.

Differential Revision: https://reviews.llvm.org/D27215

llvm-svn: 288412
2016-12-01 20:06:53 +00:00
Dan Gohman 3ec875d212 [WebAssembly] Define more wasm binary encoding constants.
llvm-svn: 288411
2016-12-01 20:02:12 +00:00
David L Kreitzer 0e3ae305b6 Refactored X86InterleavedAccess into a class. NFCI.
Patch by Farhana Aleen

Differential Revision: https://reviews.llvm.org/D25986

llvm-svn: 288410
2016-12-01 19:56:39 +00:00
Vedant Kumar 47de8391c0 [tablegen] Delete duplicates from a vector without skipping elements
Tablegen's -gen-instr-info pass has a bug in its emitEnums() routine.
The function intends for values in a vector to be deduplicated, but it
accidentally skips over elements after performing a deletion.

I think there are smarter ways of doing this deduplication, but we can
do that in a follow-up commit if there's interest. See the thread:
[PATCH] TableGen InstrMapping Bug fix.

Patch by Tyler Kenney!

llvm-svn: 288408
2016-12-01 19:38:50 +00:00
Vedant Kumar 618d78ca40 Remove unused header, NFC.
llvm-svn: 288407
2016-12-01 19:38:48 +00:00
Matthias Braun d0ee66c2e9 Move most EH from MachineModuleInfo to MachineFunction
Recommitting r288293 with some extra fixes for GlobalISel code.

Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288405
2016-12-01 19:32:15 +00:00
Kevin Enderby 5997c9480b Fix a bug with llvm-size and the -m option with multiple files not printing the file names.
llvm-svn: 288402
2016-12-01 19:12:55 +00:00
Benjamin Kramer 6a8704c1c0 Fix unused variable warning in Release builds. NFC.
llvm-svn: 288401
2016-12-01 19:10:10 +00:00
Mehdi Amini 9676d5ed59 Fix module map to create a module for the configured header Config/abi-breaking.h
A client of a header that relies on ABI breaking should get the macro
exported there.
Before this, the unittest for Support/Error including Support/Error.h
didn't get the macro exported by the Support module, because the
latter only re-export its submodules and included module, not
textual headers.

Hopefully, it'll also fix the build with local submodule visibility,
since the LLVM_Utils contains two submodules: ADT and Support. They
both include abi-breaking.h that defines a symbol. The textual
inclusion lead to a double definition of the symbol which broke
the parent module.

Differential Revision: https://reviews.llvm.org/D27273

llvm-svn: 288400
2016-12-01 19:08:38 +00:00
Greg Clayton 35630c3357 This change removes the dependency on DwarfDebug that was used for DW_FORM_ref_addr by making a new DIEUnit class in DIE.cpp.
The DIEUnit class represents a compile or type unit and it owns the unit DIE as an instance variable. This allows anyone with a DIE, to get the unit DIE, and then get back to its DIEUnit without adding any new ivars to the DIE class. Why was this needed? The DIE class has an Offset that is always the CU relative DIE offset, not the "offset in debug info section" as was commented in the header file (the comment has been corrected). This is great for performance because most DIE references are compile unit relative and this means most code that accessed the DIE's offset didn't need to make it into a compile unit relative offset because it already was. When we needed to emit a DW_FORM_ref_addr though, we needed to find the absolute offset of the DIE by finding the DIE's compile/type unit. This class did have the absolute debug info/type offset and could be added to the CU relative offset to compute the absolute offset. With this change we can easily get back to a DIE's DIEUnit which will have this needed offset. Prior to this is required having a DwarfDebug and required calling:

DwarfCompileUnit *DwarfDebug::lookupUnit(const DIE *CU) const;
Now we can use the DIEUnit class to do so without needing DwarfDebug. All clients now use DIEUnit objects (the DwarfDebug stack and the DwarfLinker). A follow on patch for the DWARF generator will also take advantage of this.

Differential Revision: https://reviews.llvm.org/D27170

llvm-svn: 288399
2016-12-01 18:56:29 +00:00
Alexey Bataev 62af7252f1 [SLP] Fixed cost model for horizontal reduction.
Currently when cost of scalar operations is evaluated the vector type is
used for scalar operations. Patch fixes this issue and fixes evaluation
of the vector operations cost.
Several test showed that vector cost model is too optimistic. It
allowed vectorization of 8 or less add/fadd operations, though scalar
code is faster. Actually, only for 16 or more operations vector code
provides better performance.

Differential Revision: https://reviews.llvm.org/D26277

llvm-svn: 288398
2016-12-01 18:42:42 +00:00
Mandeep Singh Grang 32360071a0 [llvm] Implement support for -defsym assembler option
Summary:
Changes to llvm-mc to move common logic to separate function.

Related clang patch: https://reviews.llvm.org/D26213

Reviewers: rafael, t.p.northover, colinl, echristo, rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26214

llvm-svn: 288396
2016-12-01 18:42:04 +00:00
Simon Pilgrim 17d5b6b493 [X86][SSE] Moved shuffle mask widening/narrowing helper functions earlier in the file.
Will be necessary for a future patch.

llvm-svn: 288395
2016-12-01 18:27:19 +00:00
Kostya Serebryany 09f4fa5200 [libFuzzer] add a test for r288389 (-rss_limit_mb=0 means no limit).
llvm-svn: 288392
2016-12-01 18:02:07 +00:00
Ulrich Weigand d36b31d03f [SystemZ] Fix fallout from r288374
Avoid undefined behavior due to too-large shift count.

llvm-svn: 288391
2016-12-01 18:00:50 +00:00
Weiming Zhao cf26d56390 [AsmParser] Diagnose empty symbol for .set directive
Summary: Diagnose empty symbol to avoid hitting assertion in MCContext::getOrCreateSymbol

Reviewers: eli.friedman, rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26728

llvm-svn: 288390
2016-12-01 18:00:36 +00:00
Kostya Serebryany dc6b8ca879 [libFuzzer] treat -rss_limit_mb=0 as no limit
llvm-svn: 288389
2016-12-01 17:56:15 +00:00
Kuba Mracek 93f12aff55 Recommit r287403 (reverted in r287804): [lit] When setting SDKROOT on Darwin, use '--sdk macosx' to find the right SDK path.
This shouls now be safe and not break any more bots.  It's strictly better to use '--sdk macosx', otherwise xcrun can return weird things for example when you have Command Line Tools or the SDK installed into '/'.

llvm-svn: 288385
2016-12-01 17:45:22 +00:00
Adam Nemet 4ddb8c01b1 [GVN, OptDiag] Print the interesting instructions involved in missed load-elimination
[recommitting after the fix in r288307]

This includes the intervening store and the load/store that we're trying
to forward from in the optimization remark for the missed load
elimination.

This is hooked up under a new mode in ORE that allows for compile-time
budget for a bit more analysis to print more insightful messages.  This
mode is currently enabled for -fsave-optimization-record (-Rpass is
trickier since it is controlled in the front-end).

With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

Differential Revision: https://reviews.llvm.org/D26490

llvm-svn: 288381
2016-12-01 17:34:50 +00:00
Adam Nemet 8b5fba8081 [GVN, OptDiag] Include the value that is forwarded in load elimination
[recommitting after the fix in r288307]

This requires some changes to the opt-diag API.  Hal and I have
discussed this at the Dev Meeting and came up with a streaming delimiter
(setExtraArgs) to solve this.

Arguments after this delimiter are only included in the optimization
records and not in the remarks printed in the compiler output.  (Note,
how in the test the content of the YAML file changes but the remarks on
the compiler output don't.)

This implements the green GVN message with a bug fix at line
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

The fix is that now we properly include the constant value in the
message: "load of type i32 eliminated in favor of 7"

Differential Revision: https://reviews.llvm.org/D26489

llvm-svn: 288380
2016-12-01 17:34:44 +00:00
Alexey Bataev fc617690ab [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288377
2016-12-01 17:26:54 +00:00
Ulrich Weigand 55082cddef [SystemZ] Fix applyFixup for 12-bit fixups
Now that we have fixups that only fill parts of a byte, it turns
out we have to mask off the bits outside the fixup area when
applying them.  Failing to do so caused invalid object code to
be emitted for bprp with a negative 12-bit displacement.

llvm-svn: 288374
2016-12-01 17:10:27 +00:00
Alexey Bataev e59a8351d0 Revert "[SLP] Additional tests with the cost of vector operations."
This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e.

llvm-svn: 288371
2016-12-01 16:45:04 +00:00
Adam Nemet 4d2a6e5998 [GVN] Basic optimization remark support
[recommitting after the fix in r288307]

Follow-on patches will add more interesting cases.

The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk.  This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430

Differential Revision: https://reviews.llvm.org/D26488

llvm-svn: 288370
2016-12-01 16:40:32 +00:00
Alexey Bataev 2ff768475d [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288369
2016-12-01 16:11:48 +00:00
Simon Pilgrim 5fe6236035 [X86][SSE] Classify AND bitmasks as variable shuffle masks
They are loading the bitmasks from the constant pool so the cost is similar to loading a shuffle mask.

llvm-svn: 288367
2016-12-01 16:00:14 +00:00
Simon Pilgrim 1e4d870999 [X86][SSE] Add support for combining AND bitmasks to shuffles.
llvm-svn: 288365
2016-12-01 15:41:40 +00:00
Pavel Labath 5d9f8f914a Remove iostream include from WasmObjectFile
The file does not seems to use c++ iostreams (and is is llvm policy to avoid
that). Committing as obvious.

llvm-svn: 288364
2016-12-01 15:20:34 +00:00
Asaf Badouh 7f6968ed0a [LMT] Restrict nop length to one
not all lakemont MCU support long nop.
we can't assume we can generate long nop by default for MCU.

Differential Revision: https://reviews.llvm.org/D26895

llvm-svn: 288363
2016-12-01 15:19:10 +00:00
Simon Pilgrim 2cff28dd27 [X86][SSE] Tidied up filecheck prefixes for uitofp fast-math tests.
They should be in 'narrowing' order from common to more specific test prefixes.

llvm-svn: 288338
2016-12-01 14:56:48 +00:00
Daniel Jasper 19b9284f1d Silence GCC's -Wenum-compare after r288335 in the same way it is done
in X86FastISel.cpp.

llvm-svn: 288337
2016-12-01 14:33:50 +00:00
Nicolai Haehnle da7e4017c6 [SelectionDAG] Rename and clarify visitFMULForFMADCombine (NFC)
Summary: Suggested by @spatel in D26602.

Reviewers: spatel, hfinkel

Subscribers: spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D27260

llvm-svn: 288336
2016-12-01 14:04:13 +00:00
Simon Pilgrim 55066e5622 [X86][SSE] Add support for combining target shuffles to AND bitmasks.
llvm-svn: 288335
2016-12-01 13:47:02 +00:00
Simon Pilgrim 947650e99d [X86][SSE] Add support for combining ISD::AND with shuffles.
Attempts to convert an AND with a vector of 255 or 0 values into a shuffle (blend) mask.

llvm-svn: 288333
2016-12-01 11:52:37 +00:00
Simon Pilgrim ed4ede0c29 [X86][SSE] Added tests showing missed combines of shuffles with ANDs.
llvm-svn: 288330
2016-12-01 11:26:07 +00:00
Davide Italiano 33af6fe71e [SCCP] Switch over to DEBUG() and drop an #ifdef.
llvm-svn: 288325
2016-12-01 08:48:14 +00:00
Davide Italiano e3bdd615c1 [SCCP] Prefer `auto` when the type is obvious. NFCI.
llvm-svn: 288324
2016-12-01 08:36:12 +00:00
Eric Christopher e70b7c3dfb Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"
This apprears to have broken the global isel bot:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console

This reverts commit r288293.

llvm-svn: 288322
2016-12-01 07:50:12 +00:00
Peter Collingbourne d64ecf26e7 Object: Set SF_Indirect in ModuleSymbolTable.
This lets us remove the last use of IRObjectFile::getSymbolGV() in llvm-nm.

Differential Revision: https://reviews.llvm.org/D27076

llvm-svn: 288321
2016-12-01 07:00:35 +00:00
Peter Collingbourne e2f1b4a651 Object: Add SF_Executable symbol flag.
This allows us to remove a few uses of IRObjectFile::getSymbolGV() in
llvm-nm.

While here change host-dependent logic in llvm-nm to target-dependent
logic.

Differential Revision: https://reviews.llvm.org/D27075

llvm-svn: 288320
2016-12-01 06:53:47 +00:00
Peter Collingbourne 863cbfbeba Object: Extract a ModuleSymbolTable class from IRObjectFile.
This class represents a symbol table built from in-memory IR. It provides
access to GlobalValues and should only be used if such access is required
(e.g. in the LTO implementation). We will eventually change IRObjectFile
to read from a bitcode symbol table rather than using ModuleSymbolTable,
so it would not be able to expose the module.

Differential Revision: https://reviews.llvm.org/D27073

llvm-svn: 288319
2016-12-01 06:51:47 +00:00
Peter Collingbourne 57f9b8c5b5 Bitcode: The index used by ModuleSummaryIndexBitcodeReader is now required, so make it a reference. NFCI.
llvm-svn: 288318
2016-12-01 06:21:08 +00:00
Peter Collingbourne a46ec9f0a8 Bitcode: Introduce BitcodeModule::{has,get}Summary().
These are equivalent to hasGlobalValueSummary() and getModuleSummaryIndex().

Differential Revision: https://reviews.llvm.org/D27242

llvm-svn: 288317
2016-12-01 06:00:53 +00:00
Peter Collingbourne dac43b49bd LTO: Remove ModuleLoader, make loadModuleFromBuffer static and move into its only client, ThinLTOCodeGenerator.
This is no longer the recommended way to load modules for importing, so it should not be public API.

Differential Revision: https://reviews.llvm.org/D27292

llvm-svn: 288316
2016-12-01 05:52:32 +00:00
Peter Collingbourne cf2750a501 Bitcode: Correctly handle Fixed and VBR arrays in BitstreamCursor::skipRecord().
The assertions were wrong; we need to call getEncodingData() on the element,
not the array. While here, simplify the skipRecord() implementation for Fixed
and Char6 arrays. This is tested by the code I added to llvm-bcanalyzer
which makes sure that we can skip any record.

Differential Revision: https://reviews.llvm.org/D27241

llvm-svn: 288315
2016-12-01 05:47:58 +00:00
Philip Reames 812476b495 Revert previous whitespace change
llvm-svn: 288312
2016-12-01 04:37:35 +00:00
Philip Reames d6f7024ae3 Test commit of whitespace to check permissions.
llvm-svn: 288311
2016-12-01 04:37:09 +00:00
Adam Nemet feafcd9688 [GVN] When merging blocks update LoopInfo if it's available
If LoopInfo is available during GVN, BasicAA will use it.  However
MergeBlockIntoPredecessor does not update LI as it merges blocks.

This didn't use to cause problems because LI was freed before
GVN/BasicAA.  Now with OptimizationRemarkEmitter, the lifetime of LI is
extended so LI needs to be kept up-to-date during GVN.

Differential Revision: https://reviews.llvm.org/D27288

llvm-svn: 288307
2016-12-01 03:56:43 +00:00
Ivan Krasin 3dade419bf Use trigrams to speed up SpecialCaseList.
Summary:
it's often the case when the rules in the SpecialCaseList
are of the form hel.o*bar. That gives us a chance to build
trigram index to quickly discard 99% of inputs without
running a full regex. A similar idea was used in Google Code Search
as described in the blog post:
https://swtch.com/~rsc/regexp/regexp4.html

The check is defeated, if there's at least one regex
more complicated than that. In this case, all inputs
will go through the regex. That said, the real-world
rules are often simple or can be simplied. That considerably
speeds up compiling Chromium with CFI and UBSan.

As measured on Chromium's content_message_generator.cc:

before, CFI: 44 s
after, CFI: 23 s
after, CFI, no blacklist: 23 s (~1% slower, but 3 runs were unable to show the difference)
after, regular compilation to bitcode: 23 s

Reviewers: pcc

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D27188

llvm-svn: 288303
2016-12-01 02:54:54 +00:00
Peter Collingbourne fb8c2a4a6b LTO: Remove Symbol::getIRName().
Its only use was in the LTO implementation. Also document
Symbol::getName().

llvm-svn: 288302
2016-12-01 02:51:12 +00:00
Kostya Serebryany b66cb88c2e revert r288283 as it causes debug info (line numbers) to be lost in instrumented code. also revert r288299 which was a workaround for the problem.
llvm-svn: 288300
2016-12-01 02:06:56 +00:00
Kostya Serebryany 73f438ef9a [libFuzzer] temporary disable a part of the test broken by r288283
llvm-svn: 288299
2016-12-01 01:33:44 +00:00
Derek Schuff 7747d703e3 [WebAssembly] Emit .import_global assembler directives
Support a new assembler directive, .import_global, to declare imported
global variables (i.e. those with external linkage and no
initializer). The linker turns these into wasm imports.

Patch by Jacob Gravelle

Differential Revision: https://reviews.llvm.org/D26875

llvm-svn: 288296
2016-12-01 00:11:15 +00:00
Matthias Braun ed14cb0604 Move most EH from MachineModuleInfo to MachineFunction
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288293
2016-11-30 23:49:01 +00:00
Matthias Braun ef331eff5a Move VariableDbgInfo from MachineModuleInfo to MachineFunction
VariableDbgInfo is per function data, so it makes sense to have it with
the function instead of the module.

This is a necessary step to have machine module passes work properly.

Differential Revision: https://reviews.llvm.org/D27186

llvm-svn: 288292
2016-11-30 23:48:50 +00:00
Matthias Braun f23ef437cc Move FrameInstructions from MachineModuleInfo to MachineFunction
This is per function data so it is better kept at the function instead
of the module.

This is a necessary step to have machine module passes work properly.

Differential Revision: https://reviews.llvm.org/D27185

llvm-svn: 288291
2016-11-30 23:48:42 +00:00
Matthias Braun 39c3c89cdc MCStreamer: Use "cfi" for CFI related temp labels.
Choosing a "cfi" name makes the intend a bit clearer in an assembly dump
and more importantly the assembly dumps are slightly more stable as the
numbers don't move around anymore when unrelated code calls
createTempSymbol() more or less often.
As they are temp labels the name doesn't influence the generated object
code.

Differential Revision: https://reviews.llvm.org/D27244

llvm-svn: 288290
2016-11-30 23:48:26 +00:00
Peter Collingbourne a5b71649b3 llvm-lto2: Simpler workaround for PR30396.
Maintain the command line resolutions as a map to a list of resolutions
rather than a single resolution, and apply the resolutions in the order
observed. This is not only simpler but allows us to test the scenario where
the two symbols have different resolutions.

Differential Revision: https://reviews.llvm.org/D27285

llvm-svn: 288288
2016-11-30 23:19:05 +00:00
Paul Robinson 78a695321e [PS4] Tighten up a triple check.
llvm-svn: 288286
2016-11-30 23:14:27 +00:00
Eugene Zelenko 8069677ad2 Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 288285
2016-11-30 23:10:42 +00:00
Paul Robinson 37a13ddb4b Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions.
The LLDB tests are now ready for this patch.

DWARF specifies that "line 0" really means "no appropriate source
location" in the line table.  Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).

Differential Revision: http://reviews.llvm.org/D24180

llvm-svn: 288283
2016-11-30 22:49:55 +00:00
Kostya Serebryany 05f7791fbf [libFuzzer] extend -rss_limit_mb to crash instantly on a single malloc that exceeds the limit
llvm-svn: 288281
2016-11-30 22:39:35 +00:00
David Callahan 5cb34077e8 Only computeRelativePath() on new members
Summary:
When using thin archives, and processing the same archive multiple times, we were mangling existing entries.  The root cause is that we were calling computeRelativePath() more than once.   Here, we only call it when adding new members to an archive.

Note that D27218 changes the way thin archives are printed, and will break the new unit test included here.  Depending on which one lands first, the other will need to be slightly modified.

Reviewers: rafael, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27217

llvm-svn: 288280
2016-11-30 22:32:58 +00:00
Joel Jones 75818bc8f7 [AArch64] Refactor LSE support as feature separate from V8.1a support.
Summary:
This is preparation for ThunderX processors that have Large
System Extension (LSE) atomic instructions, but not the 
other instructions introduced by V8.1a.
This will mimic changes to GCC as described here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html

LSE instructions are: LD/ST<op>, CAS*, SWP

Reviewers: t.p.northover, echristo, jmolloy, rengolin

Subscribers: aemerson, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26621

llvm-svn: 288279
2016-11-30 22:25:24 +00:00
Evgeny Stupachenko 0c4300fac7 Fix LSR best register search algorithm.
Summary:
Fix a case when first register in a search has maximum
RegUses.getUsedByIndices(Reg).count()

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D26877

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 288278
2016-11-30 22:23:51 +00:00
Matthias Braun c52fe2961c Clarify rules for reserved regs, fix aarch64 ones.
No test case necessary as the problematic condition is checked with the
newly introduced assertAllSuperRegsMarked() function.

Differential Revision: https://reviews.llvm.org/D26648

llvm-svn: 288277
2016-11-30 22:17:10 +00:00
Kostya Serebryany 1cba0a96e7 [libFuzzer] extend -print_coverage to print the comma-separated list of covered dirs. Note: the Windows stub for DirName is left unimplemented
llvm-svn: 288276
2016-11-30 21:53:32 +00:00
Zachary Turner 5abac1769f [LibFuzzer] Add Windows implementations of some IO functions.
This patch moves some posix specific file i/o code into a new
file, FuzzerIOPosix.cpp, and provides implementations for these
functions on Windows in FuzzerIOWindows.cpp.  This is another
incremental step towards getting libfuzzer working on Windows,
although it still should not be expected to be fully working.

Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27233

llvm-svn: 288275
2016-11-30 21:44:26 +00:00
Michael Kuperstein b151a641aa [LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.

This is currently disabled by default.

Differential Revision: https://reviews.llvm.org/D25963

llvm-svn: 288274
2016-11-30 21:13:57 +00:00
Sanjay Patel aa8b28e509 [InstCombine] allow more narrowing transforms for logic ops
We had a limited version of this for scalar 'and'; this expands
the transform to 'or' and 'xor' and allows vectors types too.

llvm-svn: 288273
2016-11-30 20:48:54 +00:00
Sanjay Patel 3d78d3a2ae [InstCombine] add tests to show potentially missed logic+trunc transforms; NFC
llvm-svn: 288270
2016-11-30 20:20:49 +00:00
Quentin Colombet 09cefacf9f CODE_OWNERS: Take ownership of Loop Strenght Reduce.
llvm-svn: 288268
2016-11-30 19:55:49 +00:00
Mehdi Amini 5c289b77fa [git-llvm] Use --force-interactive when commiting to enable SVN to prompt password
When svn does not know the password and it has to prompt, it needs to query.
However it won't when invoked from the Python script and instead fails with:

svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option

Differential Revision: https://reviews.llvm.org/D27274

llvm-svn: 288266
2016-11-30 19:12:53 +00:00
Mehdi Amini cb79c079b8 Fix macro check for ABI breacking check: should use #if instead of #ifndef
llvm-svn: 288265
2016-11-30 19:08:41 +00:00
Zachary Turner 24a148b1d4 [LibFuzzer] Split up some functions among different headers.
In an effort to get libfuzzer working on Windows, we need to make
a distinction between what functions require platform specific
code (e.g. different code on Windows vs Linux) and what code
doesn't.  IO functions, for example, tend to be platform
specific.

This patch separates out some of the functions which will need
to have platform specific implementations into different headers,
so that we can then provide different implementations for each
platform.

Aside from that, this patch contains no functional change.  It
is purely a re-organization.

Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27230

llvm-svn: 288264
2016-11-30 19:06:14 +00:00
Matt Arsenault 387afc9375 AMDGPU: Move mir tests into mir test directory
llvm-svn: 288262
2016-11-30 18:50:26 +00:00
Sanjay Patel c41d0c8a9b [InstCombine] update test to use FileCheck and auto-generate checks; NFC
llvm-svn: 288261
2016-11-30 18:49:56 +00:00
Simon Pilgrim 4ae3834792 [X86][SSE] Added tests showing missed combines of ANDs with shuffles.
llvm-svn: 288259
2016-11-30 18:15:10 +00:00
Eugene Zelenko a3fe70d233 Fix some Clang-tidy and Include What You Use warnings; other minor fixes (NFC).
This preparation to remove SetVector.h dependency on SmallSet.h.

llvm-svn: 288256
2016-11-30 17:48:10 +00:00
Sanjay Patel 2419d670c2 [InstCombine] auto-generate checks for select+bitwise logic tests; NFC
llvm-svn: 288254
2016-11-30 17:07:21 +00:00
Silviu Baranga aab65b155e [AArch64] Fix useful bits detection for BFM instructions
Summary:
When computing useful bits for a BFM instruction, we need
to take into consideration the case where both operands
of the BFM are equal and provide data that we need to track.

Not doing this can cause us to miss useful bits.
    
Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138)

Reviewers: t.p.northover, jmolloy

Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27130

llvm-svn: 288253
2016-11-30 17:04:22 +00:00
Derek Schuff 2c6f75ddc5 [WebAssembly] Add llvm-objdump support for wasm file format
This is the first part of an effort to add wasm binary
support across all llvm tools.

Patch by Sam Clegg

Differential Revision: https://reviews.llvm.org/D26172

llvm-svn: 288251
2016-11-30 16:49:11 +00:00
Simon Pilgrim 288c088c17 [X86][SSE] Add support for target shuffle constant folding
Initial support for target shuffle constant folding in cases where all shuffle inputs are constant. We may be able to relax this and merge shuffles with only some constant inputs in the future.

I've added the helper function getTargetConstantBitsFromNode (based off a similar function in X86ShuffleDecodeConstantPool.cpp) that could be reused for other cases requiring constant vector extraction.

Differential Revision: https://reviews.llvm.org/D27220

llvm-svn: 288250
2016-11-30 16:33:46 +00:00
Zachary Turner c6d8b4c044 [LibFuzzer] Add macro flags for Posix and Windows.
This is the beginning of an effort to get libfuzzer working on
Windows.  This is a NFC to just add some macros for platform
detection on Windows.

Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27229

llvm-svn: 288249
2016-11-30 16:32:54 +00:00
Nicolai Haehnle 73a9a27b5a [SelectionDAG] Refactor TargetLowering::expandMUL (NFC)
Summary: Further preparation for the expansion of MUL_LOHI added in D24956.

Reviewers: efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27064

llvm-svn: 288248
2016-11-30 16:26:33 +00:00
Pavel Labath 5d92bc5bd9 [Support] Use HAVE_DLOPEN to guard dlopen(3) usage
Summary:
The usage was previously guarded by HAVE_DLFCN. This breaks on Android with
LLVM_BUILD_STATIC as the platform does not provide a static version of libdl.
Using HAVE_DLOPEN fixes it as the code will only get used if we are actually able
to link an executable using dlopen.

Reviewers: rafael, beanz

Subscribers: tberghammer, danalbert, llvm-commits

Differential Revision: https://reviews.llvm.org/D26504

llvm-svn: 288246
2016-11-30 15:34:29 +00:00
Sanjay Patel 6f52fe9c8b [AArch64] use exact checks; NFC
llvm-svn: 288245
2016-11-30 15:00:43 +00:00
Krzysztof Parzyszek 31095d2ff5 [PowerPC] Preserve machine dominator tree in PPCVSXFMAMutate
It is needed by LiveIntervalAnalysis.

llvm-svn: 288243
2016-11-30 13:31:09 +00:00
Simon Pilgrim 6c38b7548d Updated test with -verify-machineinstrs to check for PR21931
llvm-svn: 288242
2016-11-30 13:21:12 +00:00
Simon Pilgrim b099d16516 [X86][SSE] Add tests demonstrating missed opportunities to combine 64-bit element unpacks with horizontal pair ops.
llvm-svn: 288240
2016-11-30 11:30:33 +00:00
Benjamin Kramer e6ba5efa80 Apply clang-tidy's 'performance-faster-string-find' check to LLVM.
No functionality change intended.

llvm-svn: 288235
2016-11-30 10:01:11 +00:00
Jonas Devlieghere cc7eafcf42 Revert 'Test commit as per developer policy'
llvm-svn: 288233
2016-11-30 08:24:43 +00:00
Jonas Devlieghere cd7a953720 Test commit as per developer policy
llvm-svn: 288232
2016-11-30 08:06:23 +00:00
Adam Nemet d4717bd8f3 Revert "[GVN] Basic optimization remark support"
This reverts commit r288210.

The failure on the stage2 LTO build is back.

llvm-svn: 288226
2016-11-30 01:14:35 +00:00
Lang Hames 90370702c5 [RuntimeDyld] Skip undefined symbols when building the symbol table.
Storing these in the symbol table (with zero values) is just wasted space.

llvm-svn: 288225
2016-11-30 01:12:07 +00:00
Nemanja Ivanovic f9b191f135 [PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D26023

This patch adds support for converting a vector of loads into a single load if
the loads are consecutive (in either direction).

llvm-svn: 288219
2016-11-29 23:57:54 +00:00
Nemanja Ivanovic 8c11e79b17 [PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D25980

This is the 2nd patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector
of fp-to-int conversions into an fp-to-int conversion of a build vector of fp
values. For example:
Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...)
Into (fp_to_[su]i (build_vector $A, $B, ...))).
Which is a natural match for much cleaner code.

llvm-svn: 288218
2016-11-29 23:36:03 +00:00
Peter Collingbourne c893278419 Add another missing dependency.
llvm-svn: 288217
2016-11-29 23:22:55 +00:00
Paul Robinson 957ba405e8 Revert r288212 due to lldb failure.
llvm-svn: 288216
2016-11-29 23:20:35 +00:00
Jacques Pienaar fc13bdd2db [lanai] Manually match 0/-1 with R0/R1.
Summary: Previously 0 and -1 was matched via tablegen rules. But this could cause problems where a physical register was being used where a virtual register was expected (seen in optimizeSelect and TwoAddressInstructionPass). Instead follow AArch64 and match in DAGToDAGISel.

Reviewers: eliben, majnemer

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D27171

llvm-svn: 288215
2016-11-29 23:01:09 +00:00
Nemanja Ivanovic f57f150b1b Revert https://reviews.llvm.org/rL287679
This commit caused some miscompiles that did not show up on any of the bots.
Reverting until we can investigate the cause of those failures.

llvm-svn: 288214
2016-11-29 23:00:33 +00:00
Paul Robinson 96de8c778b Emit 'no line' information for interesting 'orphan' instructions.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table.  Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).

Differential Revision: http://reviews.llvm.org/D24180

llvm-svn: 288212
2016-11-29 22:41:16 +00:00
Simon Pilgrim 17062a2bf6 [X86][AVX512VL] Improved testing of vcvtpd2ps, vcvtpd2dq/vcvtpd2udq and vcvttpd2dq/vcvttpd2udq implicit zeroing of upper 64-bits of xmm result
Ensure that masked instruction doesn't assume implicit zeroing.

llvm-svn: 288211
2016-11-29 22:38:30 +00:00
Adam Nemet d5747be721 [GVN] Basic optimization remark support
[recommiting patches one-by-one to see which breaks the stage2 LTO bot]

Follow-on patches will add more interesting cases.

The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk.  This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430

Differential Revision: https://reviews.llvm.org/D26488

llvm-svn: 288210
2016-11-29 22:37:01 +00:00
Simon Pilgrim 849473ae99 [X86][AVX512DQVL] Improved testing of vcvtqq2ps/vcvtuqq2ps implicit zeroing of upper 64-bits of xmm result
Ensure that masked instruction doesn't assume implicit zeroing.

llvm-svn: 288209
2016-11-29 22:36:28 +00:00
Sanjay Patel 47f7f30df9 [AArch64] allow and-not-compare transform to form 'bics'
This target hook was added with D19087:
https://reviews.llvm.org/D19087

Differential Revision: https://reviews.llvm.org/D27221

llvm-svn: 288206
2016-11-29 22:28:58 +00:00
Zachary Turner 3f0627c5e4 Add documentation for the PDB Module Info stream.
llvm-svn: 288205
2016-11-29 22:14:56 +00:00
Peter Collingbourne 8700d91ddb Bitcode: Add a more comprehensive multi-module test now that we have both llvm-cat and llvm-modextract.
Differential Revision: https://reviews.llvm.org/D27189

llvm-svn: 288202
2016-11-29 21:55:09 +00:00
Peter Collingbourne cb6b920ba0 Add llvm-modextract tool.
This program is for testing features that rely on multi-module bitcode files.
It takes a multi-module bitcode file, extracts one of the modules and writes
it to the output file.

Differential Revision: https://reviews.llvm.org/D26778

llvm-svn: 288201
2016-11-29 21:54:33 +00:00
Justin Lebar 96e2915574 [StructurizeCFG] Fix infinite loop in rebuildSSA.
Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based
for loops", introduced a bug into rebuildSSA, wherein we were iterating
over an instruction's use list while modifying it, without taking care
to do this correctly.

llvm-svn: 288200
2016-11-29 21:49:02 +00:00
Kevin Enderby f9d60f00e5 Add to llvm-objdump the -no-leading-headers option with the use of the -macho option.
In some cases the leading headers of the file name, archive member and
architecture slice name in the output of lvm-objdump is not wanted so the
tool’s output can be directly used by scripts.  This matches the -X option
of the Apple otool(1) program.

rdar://28491674

llvm-svn: 288199
2016-11-29 21:43:40 +00:00
Peter Collingbourne 9303efb0af Add missing dependency.
llvm-svn: 288198
2016-11-29 21:02:19 +00:00
Mehdi Amini 11407ae581 Change Error unittest to use the LLVM_ENABLE_ABI_BREAKING_CHECKS instead of NDEBUG
This is consistent with the header (after r288087) and fixes the
test for the configuration:
  -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF

llvm-svn: 288196
2016-11-29 20:45:48 +00:00
Peter Collingbourne 5a0a2e648c Bitcode: Introduce BitcodeWriter interface.
This interface allows clients to write multiple modules to a single
bitcode file. Also introduce the llvm-cat utility which can be used
to create a bitcode file containing multiple modules.

Differential Revision: https://reviews.llvm.org/D26179

llvm-svn: 288195
2016-11-29 20:43:47 +00:00
Chad Rosier d34c26eb08 [AArch64] Add a basic SchedMachineModel for Falkor.
Differential Revision: https://reviews.llvm.org/D26972

llvm-svn: 288194
2016-11-29 20:00:27 +00:00
David Blaikie 831b652020 Use CallSite to simplify code
llvm-svn: 288192
2016-11-29 19:42:27 +00:00
Matt Arsenault 640c44b893 AMDGPU: Disallow exec as SMEM instruction operand
This is not in the list of valid inputs for the encoding.
When spilling, copies from exec can be folded directly
into the spill instruction which results in broken
stores.

This only fixes the operand constraints, more codegen
work is required to avoid emitting the invalid
spills.

This sort of breaks the dbg.value test. Because the
register class of the s_load_dwordx2 changes, there
is a copy to SReg_64, and the copy is the operand
of dbg_value. The copy is later dead, and removed
from the dbg_value.

llvm-svn: 288191
2016-11-29 19:39:53 +00:00
Matt Arsenault cdad316cc2 AMDGPU: Use SGPR_64 for argument lowerings
llvm-svn: 288190
2016-11-29 19:39:48 +00:00
Geoff Berry 4d66cea347 [LiveRangeEdit] Handle instructions with no defs correctly.
Summary:
The code in LiveRangeEdit::eliminateDeadDef() that computes isOrigDef
doesn't handle instructions in which operand 0 is not a def (e.g. KILL)
correctly.  Add a check that operand 0 is a def before doing the rest of
the isOrigDef computation.

Reviewers: qcolombet, MatzeB, wmi

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27174

llvm-svn: 288189
2016-11-29 19:31:35 +00:00
Matt Arsenault 97279a8ca3 AMDGPU: Rename flat operands to match mubuf
Use vaddr/vdst for the same purposes.

This also fixes a beg in SIInsertWaits for the
operand check. The stored value operand is currently called
data0 in the single offset case, not data.

llvm-svn: 288188
2016-11-29 19:30:44 +00:00
Matt Arsenault 437fd71f5b AMDGPU: Use else if
llvm-svn: 288187
2016-11-29 19:30:41 +00:00
Matt Arsenault f96eeec005 AMDGPU: Materialize frame index before add
It isn't generally safe to fold the frame index
directly into the operand since it will possibly
not be an inline immediate after it is expanded.

This surprisingly seems to produce better code, since
the FI doesn't prevent folding other immediate operands.

llvm-svn: 288185
2016-11-29 19:20:48 +00:00
Matt Arsenault ff8bb49bf4 AMDGPU: Refactor immediate folding logic
Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.

No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.

llvm-svn: 288184
2016-11-29 19:20:42 +00:00
Sanjay Patel 09c5630818 [AArch64] add tests for bics; NFC
llvm-svn: 288183
2016-11-29 19:15:27 +00:00
Sanjay Patel 183f90ad04 [AArch64] add tests to show select transforms; NFC
llvm-svn: 288180
2016-11-29 18:35:04 +00:00
Adam Nemet c2ed4b35b4 Revert "[GVN] Basic optimization remark support"
This reverts commit r288046.

Trying to see if the revert fixes a compiler crash during a stage2 LTO
build with a GVN backtrace.

llvm-svn: 288179
2016-11-29 18:32:04 +00:00
Adam Nemet 91d4d93f94 Revert "[GVN, OptDiag] Include the value that is forwarded in load elimination"
This reverts commit r288047.

Trying to see if the revert fixes a compiler crash during a stage2 LTO
build with a GVN backtrace.

llvm-svn: 288178
2016-11-29 18:32:00 +00:00
Adam Nemet a4d3d44ec2 Revert "[GVN, OptDiag] Print the interesting instructions involved in missed load-elimination"
This reverts commit r288090.

Trying to see if the revert fixes a compiler crash during a stage2 LTO
build with a GVN backtrace.

llvm-svn: 288177
2016-11-29 18:31:53 +00:00
Geoff Berry 7c078fc035 [AArch64] Fold spills of COPY of WZR/XZR
Summary:
In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the
COPY being spilled is copying from WZR/XZR, but the source register is
not in the COPY destination register's regclass.

For example, when spilling:

  %vreg0 = COPY %XZR ; %vreg0:GPR64common

without this change, the code in TargetInstrInfo::foldMemoryOperand()
and canFoldCopy() that normally handles cases like this would fail to
optimize since %XZR is not in GPR64common.  So the spill code generated
would be:

  %vreg0 = COPY %XZR
  STR %vreg

instead of the new code generated:

  STR %XZR

Reviewers: qcolombet, MatzeB

Subscribers: mcrosier, aemerson, t.p.northover, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26976

llvm-svn: 288176
2016-11-29 18:28:32 +00:00
Mehdi Amini c62b64a9e8 [docs] Typos and whitespace fixed in LTO docs.
While reading the LTO docs I fixed few small typos and whitespace issues.

Patch by: Jonas Devlieghere <jonas@devlieghere.com>

Differential Revision: https://reviews.llvm.org/D27196

llvm-svn: 288171
2016-11-29 18:00:31 +00:00
Simon Pilgrim edccc1254b Avoid repeated calls to MVT getSizeInBits and getScalarSizeInBits(). NFCI.
llvm-svn: 288170
2016-11-29 17:57:48 +00:00
NAKAMURA Takumi 121a3dabe7 Suppress abi-breaking.h on cygming, for now.
FIXME: Implement checks without weak for them.
llvm-svn: 288168
2016-11-29 17:32:58 +00:00
NAKAMURA Takumi b5cb3e5335 Fix a linefeed at eof.
llvm-svn: 288167
2016-11-29 17:32:43 +00:00
Artur Pilipenko 5365746723 [CVP] Remove use of removed flag (-cvp-dont-process-adds) from the test
The flag was removed by 288154

llvm-svn: 288161
2016-11-29 16:43:30 +00:00
Artur Pilipenko cf93b5ba9e [CVP] Remove cvp-dont-process-adds flag
The flag was introduced because the optimization controlled by the flag initially caused regressions. All the regressions were fixed some time ago and the flag has been false for quite a while. 

llvm-svn: 288154
2016-11-29 16:24:57 +00:00
Nemanja Ivanovic df1cb520df [PowerPC] Improvements for BUILD_VECTOR Vol. 1
This patch corresponds to review:
https://reviews.llvm.org/D25912

This is the first patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC.

llvm-svn: 288152
2016-11-29 16:11:34 +00:00
Alexey Bataev e951e5eb7b [SLP] Add a new test for tree vectorization starting from insertelement
instruction.

llvm-svn: 288148
2016-11-29 15:37:52 +00:00
Simon Pilgrim 001368abc8 [X86] Moved getTargetConstantFromNode function so a future patch is more understandable. NFCI.
llvm-svn: 288147
2016-11-29 15:32:58 +00:00
Aditya Kumar 314ebe05ac [GVNHoist] Rename variables.
Differential Revision: https://reviews.llvm.org/D27110

llvm-svn: 288142
2016-11-29 14:36:27 +00:00
Aditya Kumar 07cb304826 [GVNHoist] Enable aggressive hoisting when optimizing for code-size
Enable scalar hoisting at -Oz as it is safe to hoist scalars to a place
where they are partially needed.

Differential Revision: https://reviews.llvm.org/D27111

llvm-svn: 288141
2016-11-29 14:34:01 +00:00
Simon Pilgrim 35c47c494d [X86][SSE] Add initial support for combining target shuffles to (V)PMOVZX.
We can only handle 128-bit vectors until we support target shuffle inputs of different size to the output.

llvm-svn: 288140
2016-11-29 14:18:51 +00:00
Simon Pilgrim 923020a652 Avoid repeated calls to MVT::getScalarSizeInBits(). NFCI.
llvm-svn: 288138
2016-11-29 13:43:08 +00:00
Simon Pilgrim c17fb85090 [X86][SSE] Added tests showing missed combines to (V)PMOVZX
llvm-svn: 288136
2016-11-29 13:16:11 +00:00
Chandler Carruth 9ac7a059b6 [PM] Fix a bad invalid densemap iterator bug in the new invalidation
logic.

Yup, the invalidation logic has an invalid iterator bug. Can't make this
stuff up.

We can recursively insert things into the map so we can't cache the
iterator into that map across those recursive calls. We did this
differently in two places. I have an end-to-end test that triggers at
least one of them. I'm going to work on a nice minimal test case that
triggers these, but I didn't want to leave the bug in the tree while
I tried to trigger it.

Also, the dense map iterator checking stuff we have now is awesome. =D

llvm-svn: 288135
2016-11-29 12:54:34 +00:00
Malcolm Parsons ecfb4825b3 [StringRef] Use default member initializers and = default.
Summary: This makes the default constructor implicitly constexpr and noexcept.

Reviewers: zturner, beanz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27094

llvm-svn: 288131
2016-11-29 10:53:18 +00:00
Alexey Bataev 4fa063ebc9 [SLPVectorizer] Improved support of partial tree vectorization.
Currently SLP vectorizer tries to vectorize a binary operation and dies
immediately after unsuccessful the first unsuccessfull attempt. Patch
tries to improve the situation, trying to vectorize all binary
operations of all children nodes in the binop tree.

Differential Revision: https://reviews.llvm.org/D25517

llvm-svn: 288115
2016-11-29 08:21:14 +00:00
Warren Ristow d9777c1dbb Test commit. Comment changes. NFC.
llvm-svn: 288100
2016-11-29 02:37:13 +00:00
Peter Collingbourne bfcf9800b8 Bitcode: Change expected layout of module blocks.
We now expect each module's identification block to appear immediately before
the module block. Any module block that appears without an identification block
immediately before it is interpreted as if it does not have a module block.

Also change the interpretation of VST and function offsets in bitcode.
The offset is always taken as relative to the start of the identification
(or module if not present) block, minus one word. This corresponds to the
historical interpretation of offsets, i.e. relative to the start of the file.

These changes allow for bitcode modules to be concatenated by copying bytes.

Differential Revision: https://reviews.llvm.org/D27184

llvm-svn: 288098
2016-11-29 02:27:04 +00:00
Reid Kleckner 78565839c6 [asan/win] Align global registration metadata to its size
This way, when the linker adds padding between globals, we can skip over
the zero padding bytes and reliably find the start of the next metadata
global.

llvm-svn: 288096
2016-11-29 01:32:21 +00:00
Tom Stellard 0bc688116c AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches
Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23417

llvm-svn: 288095
2016-11-29 00:46:46 +00:00
Reid Kleckner c68a6c4ca9 Recognize ${:uid} escapes in intel syntax inline asm
It looks like this logic was duplicated long ago and the GCC side of
things has grown additional functionality. We need ${:uid} at least to
generate unique MS inline asm labels (PR23715), so expose these.

llvm-svn: 288092
2016-11-29 00:29:27 +00:00
Adam Nemet b9e53c9056 [GVN, OptDiag] Print the interesting instructions involved in missed load-elimination
This includes the intervening store and the load/store that we're trying
to forward from in the optimization remark for the missed load
elimination.

This is hooked up under a new mode in ORE that allows for compile-time
budget for a bit more analysis to print more insightful messages.  This
mode is currently enabled for -fsave-optimization-record (-Rpass is
trickier since it is controlled in the front-end).

With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

Differential Revision: https://reviews.llvm.org/D26490

llvm-svn: 288090
2016-11-29 00:09:22 +00:00
Sanjay Patel 2bd32b05fb [DAG] clean up foldSelectCCToShiftAnd(); NFCI
llvm-svn: 288088
2016-11-28 23:05:55 +00:00
Mehdi Amini c87de4249d Put ABI breaking test in Error checking behind LLVM_ENABLE_ABI_BREAKING_CHECKS
This macro is supposed to be the one controlling the compatibility
of ABI breaks induced when enabling or disabling assertions in LLVM.

The macro is enabled by default in assertions build, so this commit
won't disable the tests.

Differential Revision: https://reviews.llvm.org/D26700

llvm-svn: 288087
2016-11-28 22:57:11 +00:00
Kevin Enderby 4ffec859eb Add error checking for Mach-O universal files.
Add the checking for both the MachO::fat_header and the
MachO::fat_arch struct values in the constructor for
MachOUniversalBinary. Such that when the constructor
for ObjectForArch is called it can assume the values in
the MachO::fat_arch for the offset and size are contained
in the file after the MachOUniversalBinary constructor
is called for the Parent.

llvm-svn: 288084
2016-11-28 22:40:50 +00:00
Mehdi Amini 28dd54c38f Add link-time detection of LLVM_ABI_BREAKING_CHECKS mismatch
The macro LLVM_ENABLE_ABI_BREAKING_CHECKS is moved to a new header
abi-breaking.h, from llvm-config.h. Only headers that are using the
macro are including this new header.

LLVM will define a symbol, either EnableABIBreakingChecks or
DisableABIBreakingChecks depending on the configuration setting for
LLVM_ABI_BREAKING_CHECKS.

The abi-breaking.h header will add weak references to these symbols in
every clients that includes this header. This should ensure that
a mismatch triggers a link failure (or a load time failure for DSO).

On MSVC, the pragma "detect_mismatch" is used instead.

Differential Revision: https://reviews.llvm.org/D26876

llvm-svn: 288082
2016-11-28 22:23:53 +00:00
Chandler Carruth 3ab2a5a824 [PM] Extend the explicit 'invalidate' method API on analysis results to
accept an Invalidator that allows them to invalidate themselves if their
dependencies are in turn invalidated.

Rather than recording the dependency graph ahead of time when analysis
get results from other analyses, this simply lets each result trigger
the immediate invalidation of any analyses they actually depend on. They
do this in a way that has three nice properties:

1) They don't have to handle transitive dependencies because the
   infrastructure will recurse for them.
2) The invalidate methods are still called only once. We just
   dynamically discover the necessary topological ordering, everything
   is memoized nicely.
3) The infrastructure still provides a default implementation and can
   access it so that only analyses which have dependencies need to do
   anything custom.

To make this work at all, the invalidation logic also has to defer the
deletion of the result objects themselves so that they can remain alive
until we have collected the complete set of results to invalidate.

A unittest is added here that has exactly the dependency pattern we are
concerned with. It hit the use-after-free described by Sean in much
detail in the long thread about analysis invalidation before this
change, and even in an intermediate form of this change where we failed
to defer the deletion of the result objects.

There is an important problem with doing dependency invalidation that
*isn't* solved here: we don't *enforce* that results correctly
invalidate all the analyses whose results they depend on.

I actually looked at what it would take to do that, and it isn't as hard
as I had thought but the complexity it introduces seems very likely to
outweigh the benefit. The technique would be to provide a base class for
an analysis result that would be populated with other results, and
automatically provide the invalidate method which immediately does the
correct thing. This approach has some nice pros IMO:
- Handles the case we care about and nothing else: only *results*
  that depend on other analyses trigger extra invalidation.
- Localized to the result rather than centralized in the analysis
  manager.
- Ties the storage of the reference to another result to the triggering
  of the invalidation of that analysis.
- Still supports extending invalidation in customized ways.

But the down sides here are:
- Very heavy-weight meta-programming is needed to provide this base
  class.
- Requires a pretty awful API for accessing the dependencies.

Ultimately, I fear it will not pull its weight. But we can re-evaluate
this at any point if we start discovering consistent problems where the
invalidation and dependencies get out of sync. It will fit as a clean
layer on top of the facilities in this patch that we can add if and when
we need it.

Note that I'm not really thrilled with the names for these APIs... The
name "Invalidator" seems ok but not great. The method name "invalidate"
also. In review some improvements were suggested, but they really need
*other* uses of these terms to be updated as well so I'm going to do
that in a follow-up commit.

I'm working on the actual fixes to various analyses that need to use
these, but I want to try to get tests for each of them so we don't
regress. And those changes are seperable and obvious so once this goes
in I should be able to roll them out throughout LLVM.

Many thanks to Sean, Justin, and others for help reviewing here.

Differential Revision: https://reviews.llvm.org/D23738

llvm-svn: 288077
2016-11-28 22:04:31 +00:00
Peter Collingbourne b192a2067c cmake: Set rpath for loadable modules as well as shared libraries.
This fixes a regression introduced by r285714: we weren't setting the
rpath on LLVMgold.so correctly.

Spotted by mark@chromium.org!

Differential Revision: https://reviews.llvm.org/D27176

llvm-svn: 288076
2016-11-28 21:59:14 +00:00
Eli Friedman 5096775393 [SROA] Drop lifetime.start/end intrinsics when they block promotion.
Preserving lifetime markers isn't as important as allowing promotion,
so just drop the lifetime markers if necessary.

This also fixes an assertion failure where other parts of SROA assumed
that lifetime markers never block promotion.

Fixes https://llvm.org/bugs/show_bug.cgi?id=29139.

Differential Revision: https://reviews.llvm.org/D24854

llvm-svn: 288074
2016-11-28 21:50:34 +00:00
Sanjay Patel 1cf9aff659 [DAG] add helper function for selectcc --> and+shift transforms; NFC
llvm-svn: 288073
2016-11-28 21:47:41 +00:00
Mehdi Amini 3ab3fef2f1 Improve error handling in YAML parsing
Some scanner errors were not checked and reported by the parser.

Fix PR30934. Recommit r288014 after fixing unittest.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D26419

llvm-svn: 288071
2016-11-28 21:38:52 +00:00
David Blaikie ce3c8ef26e [DebugInfo] Add support for DW_AT_main_subprogram on subprograms
Patch by Tom Tromey! (for use with Rust)

llvm-svn: 288068
2016-11-28 21:32:19 +00:00
Matthias Braun 115efcd3d1 MachineScheduler: Export function to construct "default" scheduler.
This makes the createGenericSchedLive() function that constructs the
default scheduler available for the public API. This should help when
you want to get a scheduler and the default list of DAG mutations.

This also shrinks the list of default DAG mutations:
{Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer
added by default. Targets can easily add them if they need them. It also
makes it easier for targets to add alternative/custom macrofusion or
clustering mutations while staying with the default
createGenericSchedLive(). It also saves the callback back and forth in
TargetInstrInfo::enableClusterLoads()/enableClusterStores().

Differential Revision: https://reviews.llvm.org/D26986

llvm-svn: 288057
2016-11-28 20:11:54 +00:00
Artem Belevich 57b99f9bea Revert r287637 "[wasm] hack around test failure after r287553."
-cgp-freq-ratio-to-skip-merge option was removed by rollback in r288052.

llvm-svn: 288055
2016-11-28 19:55:46 +00:00
Stanislav Mekhanoshin 0ee250eee8 [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.

With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask_b32
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a propagation of source SGPR pair in place of v_cmp
is implemented. Additional side effect of this is that we may consume less VGPRs
at a cost of more SGPRs in case if holding of multiple conditions is needed, and
that is a clear win in most cases.

Differential Revision: https://reviews.llvm.org/D26114

llvm-svn: 288053
2016-11-28 18:58:49 +00:00
Joerg Sonnenberger caaa82d90d Revert r287553: [CodeGenPrep] Skip merging empty case blocks
It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line
670 ("Expected irreducible CFG").

llvm-svn: 288052
2016-11-28 18:56:54 +00:00
Justin Lebar 3aec10ca7e [StructurizeCFG] Use range-based for loops.
Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D27000

llvm-svn: 288051
2016-11-28 18:50:03 +00:00
Justin Lebar 62c20d8b3b [StructurizeCFG] Refactor NearestCommonDominator.
Summary:
As far as I can tell, doing our own computations in
NearestCommonDominator is a false optimization -- DomTree will build up
what appears to be exactly this data when it decides it's worthwhile.
Moreover, by building the cache ourselves, we cannot take advantage of
the cache that the domtree might have available.

In addition, I am not convinced of the correctness of the original code.
In particular, setting ResultIndex = 1 on the first addBlock instead of
setting it to 0 is quite fishy.  Similarly, it's not clear to me that
setting IndexMap[Node] = 0 for every node as we walk up the tree finding
a common parent is correct.  But rather than ponder over these
questions, I'd rather just make the code do the obviously-correct thing.

This patch also changes the NearestCommonDominator API a bit, improving
the names and getting rid of the boolean parameter in addBlock -- see
http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html

Reviewers: arsenm

Subscribers: aemerson, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26998

llvm-svn: 288050
2016-11-28 18:49:59 +00:00
Simon Pilgrim 2228f70a85 [X86][SSE] Add initial support for combining (V)PMOVZX with shuffles.
llvm-svn: 288049
2016-11-28 17:58:19 +00:00
Adam Nemet a415a9bde6 [GVN, OptDiag] Include the value that is forwarded in load elimination
This requires some changes to the opt-diag API.  Hal and I have
discussed this at the Dev Meeting and came up with a streaming delimiter
(setExtraArgs) to solve this.

Arguments after this delimiter are only included in the optimization
records and not in the remarks printed in the compiler output.  (Note,
how in the test the content of the YAML file changes but the remarks on
the compiler output don't.)

This implements the green GVN message with a bug fix at line
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

The fix is that now we properly include the constant value in the
message: "load of type i32 eliminated in favor of 7"

Differential Revision: https://reviews.llvm.org/D26489

llvm-svn: 288047
2016-11-28 17:45:34 +00:00
Adam Nemet e5112b14b9 [GVN] Basic optimization remark support
Follow-on patches will add more interesting cases.

The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk.  This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430

Differential Revision: https://reviews.llvm.org/D26488

llvm-svn: 288046
2016-11-28 17:45:28 +00:00
Sanjay Patel 100bc01a72 [x86] fix formatting; NFC
llvm-svn: 288045
2016-11-28 17:39:21 +00:00
Daniil Fukalov fb50133076 [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio
At the moment optimized tablegen is generated by LLVM_USE_HOST_TOOLS variable that is not set for Visual Sudio since LLVM_ENABLE_ASSERTIONS depends on CMAKE_BUILD_TYPE value that is not equal to "DEBUG" in case of Visual Studio soltion generation.

Modified to do not depend on LLVM_ENABLE_ASSERTIONS value in VS and Xcode cases

Reviewers: beanz

Subscribers: RKSimon, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D27135

llvm-svn: 288042
2016-11-28 17:12:09 +00:00
Adam Nemet 58951d3869 [LTO] Move finishOptimizationRemarks after codegen
This addresses the comment D26832.

llvm-svn: 288041
2016-11-28 16:51:49 +00:00
Simon Pilgrim 3f10e66981 [X86][SSE] Added support for combining bit-shifts with shuffles.
Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining.

Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations.

llvm-svn: 288040
2016-11-28 16:25:01 +00:00
Simon Pilgrim 3def9e11e2 [X86][SSE] Added tests showing missed combines of shifts with shuffles.
llvm-svn: 288037
2016-11-28 15:50:39 +00:00
Daniel Cederman 59168e28e0 Test commit
llvm-svn: 288036
2016-11-28 15:33:03 +00:00
Nirav Dave a413361798 Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor"
This reverts commit r287773 which caused issues with ppc64le builds.

llvm-svn: 288035
2016-11-28 14:30:29 +00:00
Ulrich Weigand a29bf16ed5 [SystemZ] Fix build bot fallout from r288030
Remove unused variable that came in due to a copy-and-paste bug
and caused build bot failures.

llvm-svn: 288033
2016-11-28 14:24:14 +00:00
Ulrich Weigand 84404f30b3 [SystemZ] Support execution hint instructions
This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P).  This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.

llvm-svn: 288031
2016-11-28 14:01:51 +00:00
Ulrich Weigand 2d9e3d9d3b [SystemZ] Support load-and-trap instructions
This adds support for the instructions provided with the
load-and-trap facility.

llvm-svn: 288030
2016-11-28 13:59:22 +00:00
Ulrich Weigand 758399131a [SystemZ] Add remaining branch instructions
This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.

The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count).  Do
use it, however, it is necessary to also introduce a hew CHIMux pseudo
to allow comparisons of a 32-bit value agains a short immediate to go
into a high register as well (implemented via CHI/CIH).

This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.

llvm-svn: 288029
2016-11-28 13:40:08 +00:00
Ulrich Weigand 524f276c74 [SystemZ] Improve use of conditional instructions
This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.

To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks.  It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.

Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.

Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa).  These are converted back to a branch sequence
after register allocation.  Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.

Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.

llvm-svn: 288028
2016-11-28 13:34:08 +00:00
James Molloy 6bed13c551 [InlineCost] Reduce inline thresholds to compensate for cost changes
In r286814, the algorithm for calculating inline costs changed. This
caused more inlining to take place which is especially apparent
in optsize and minsize modes.

As the cost calculation removed a skewed behaviour (we were inconsistent
about the cost of calls) it isn't possible to update the thresholds to
get exactly the same behaviour as before. However, this threshold change
accounts for the very common case where an inline candidate has no
calls within it. In this case, r286814 would inline around 5-6 more (IR)
instructions.

The changes to -Oz have been heavily benchmarked. The "obvious" value
for the inline threshold at -Oz is zero, but due to inaccuracies in the
inline heuristics this can actually cause code size increases due to
not inlining key thunk functions (that then disappear). Experimentally,
5 was the sweet spot for code size over the test-suite.

For -Os, this change removes the outlier results shown up by green dragon
(http://104.154.54.203/db_default/v4/nts/13248).

Fixes D26848.

llvm-svn: 288024
2016-11-28 11:07:37 +00:00
Chandler Carruth 0c6efff178 [PM] Remove weird marking of invalidated analyses as "preserved".
This never made a lot of sense. They've been invalidated for one IR unit
but they aren't really preserved in any normal sense. It seemed like it
would be an elegant way of communicating to outer IR units that pass
managers and adaptors had already handled invalidation, but we've since
ended up adding sets that model this more clearly: we're now using
the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of
"preserving" invalidated analyses didn't work.

This patch moves to rely on that technique exclusively and removes the
cumbersome API aspect of updating the preserved set when doing
invalidation. This in turn will simplify a *number* of upcoming patches.

This has a side benefit of exposing a number of places where we were
failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This
patch fixes those, and with those fixes shouldn't change any observable
behavior.

llvm-svn: 288023
2016-11-28 10:42:21 +00:00
Davide Italiano 0f0d5d8f8d [ThreadPool] Rollback recent changes until I figure out the breakage.
llvm-svn: 288018
2016-11-28 09:17:12 +00:00
Davide Italiano 3dd87dad64 [ThreadPool] Remove outdated comment after r288016.
llvm-svn: 288017
2016-11-28 08:57:05 +00:00
Davide Italiano 3ea0bfa7e0 [ThreadPool] Simplify the interface. NFCI.
The callers don't use the return value. Found by Michael
Spencer.

llvm-svn: 288016
2016-11-28 08:53:41 +00:00
Mehdi Amini 43c2428203 Revert "Improve error handling in YAML parsing"
This reverts commit r288014, the unittest isn't passing

llvm-svn: 288015
2016-11-28 04:57:04 +00:00
Mehdi Amini c54281be4f Improve error handling in YAML parsing
Some scanner errors were not checked and reported by the parser.

Fix PR30934

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D26419

llvm-svn: 288014
2016-11-28 04:44:13 +00:00
Chandler Carruth 4cf2c89883 [PM] Add an ASCII-art diagram for the call graph in the CGSCC unit test.
No functionality changed.

llvm-svn: 288013
2016-11-28 03:40:33 +00:00
Craig Topper 17786f77f0 [X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output.
llvm-svn: 288011
2016-11-27 21:37:04 +00:00
Craig Topper 13b27a2748 [X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns.
llvm-svn: 288010
2016-11-27 21:37:02 +00:00
Craig Topper ff9d45875a [X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions.
llvm-svn: 288009
2016-11-27 21:37:00 +00:00
Craig Topper b00872b983 [X86][FMA4] Add test cases to demonstrate missed folding opportunities for FMA4 scalar intrinsics.
llvm-svn: 288008
2016-11-27 21:36:58 +00:00
Craig Topper 3674f44e40 [X86] Add SHL by 1 to the load folding tables.
I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.

llvm-svn: 288007
2016-11-27 21:36:54 +00:00
Simon Pilgrim 91d6f5fbc1 [X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts
llvm-svn: 288006
2016-11-27 21:08:19 +00:00
Sanjay Patel 8ca30ab0c5 [InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
2016-11-27 21:07:28 +00:00
Craig Topper 4fab487265 [AVX-512] Add integer and fp unpck instructions to load folding tables.
llvm-svn: 288004
2016-11-27 19:51:41 +00:00
Simon Pilgrim cdb2ce661d [X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI.
Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit).

llvm-svn: 288003
2016-11-27 19:28:39 +00:00
Craig Topper 7ad961cc70 [X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size.
If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist.

I probably missed some instructions, but this should be a large portion of them.

llvm-svn: 288001
2016-11-27 18:51:13 +00:00
Simon Pilgrim 4571157d2d [X86][SSE] Added tests showing missed combines for shuffle to shifts.
llvm-svn: 288000
2016-11-27 18:25:02 +00:00
Sanjay Patel dc2917b969 add tests to show missing analysis; NFC
llvm-svn: 287998
2016-11-27 15:54:45 +00:00
Sanjay Patel da9f7bf0fc fix formatting; NFC
llvm-svn: 287997
2016-11-27 15:53:48 +00:00
Craig Topper c3b3926f8b [AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287995
2016-11-27 08:55:31 +00:00
Mohammad Shahid 2f5cb60b07 [SLP] Add new and update existing lit testfor providing more context to incoming patch for vectorization of jumbled load
Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9
llvm-svn: 287992
2016-11-27 03:35:31 +00:00
Craig Topper fb64a25ba1 [X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction.
Most of these are the SSE4.1 PMOVZX/PMOVSX instructions which all read less than 128-bits. The only other was PMOVUPD which by definition is an unaligned load.

llvm-svn: 287991
2016-11-27 01:52:51 +00:00
Craig Topper 837ff25da1 [X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold.
llvm-svn: 287987
2016-11-26 18:43:26 +00:00
Craig Topper e266e126ff [X86] Fix the zero extending load detection in X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold.
Previously we were passing the SCALAR_TO_VECTOR node.

llvm-svn: 287986
2016-11-26 18:43:24 +00:00
Craig Topper d3ab1a3905 [X86] Simplify control flow. NFCI
llvm-svn: 287985
2016-11-26 18:43:21 +00:00
Craig Topper 991d1ca3ba [X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times.
Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied.

Reviewers: RKSimon, zvi, delena, spatel

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D26790

llvm-svn: 287983
2016-11-26 17:29:25 +00:00
Sanjay Patel 12a2af447b [InstCombine] add test to show missing vector optimization; NFC
llvm-svn: 287982
2016-11-26 16:13:23 +00:00
Sanjay Patel 8bd69b7ed9 [InstCombine] don't drop metadata in FoldOpIntoSelect()
llvm-svn: 287980
2016-11-26 15:23:20 +00:00
Sanjay Patel 91e73a7bfa add optional param to copy metadata when creating selects; NFC
There are other spots where we can use this; we're currently dropping 
metadata in some places, and there are proposed changes where we will
want to propagate metadata.

IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.

llvm-svn: 287976
2016-11-26 15:01:59 +00:00
Craig Topper 10d5eec1a1 [AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287975
2016-11-26 08:21:52 +00:00
Craig Topper 97169ea5f9 [AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables.
llvm-svn: 287974
2016-11-26 08:21:48 +00:00
Craig Topper 53b33de1e3 [AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables.
llvm-svn: 287972
2016-11-26 07:21:00 +00:00
Craig Topper 6677bb4e50 [AVX-512] Teach LowerFormalArguments to use the extended register class when available. Fix the avx512vl stack folding tests to clobber more registers or otherwise they use xmm16 after this change.
llvm-svn: 287971
2016-11-26 07:20:57 +00:00
Craig Topper 39265bb1ce [AVX-512] Add VLX versions of VDIVPD/PS and VMULPD/PS to load folding tables.
llvm-svn: 287970
2016-11-26 07:20:53 +00:00
Tom Stellard 1473f07ceb AMDGPU/SI: Use float as the operand type for amdgcn.interp intrinsics
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26724

llvm-svn: 287962
2016-11-26 02:26:04 +00:00
Craig Topper 7f76c23781 [X86][XOP] Add a reversed reg/reg form for VPROT instructions.
The W bit distinquishes which operand is the memory operand. But if the mod bits are 3 then the memory operand is a register and there are two possible encodings. We already did this correctly for several other XOP instructions.

llvm-svn: 287961
2016-11-26 02:14:00 +00:00
Craig Topper 516fd7abfe [X86] Add SSE, AVX, and AVX2 version of MOVDQU to the load/store folding tables for consistency.
Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete.

llvm-svn: 287960
2016-11-26 02:13:58 +00:00
Dylan McKay 656c1fa544 Un-XFAIL an AVR CodeGen test
llvm-svn: 287958
2016-11-26 01:07:32 +00:00
Craig Topper a363d42973 [AVX-512] Put the AVX-512 sections of the load folding tables into mostly alphabetical order. This is consistent with the older sections of the table. NFC
llvm-svn: 287956
2016-11-25 23:21:34 +00:00
David Majnemer d5648c7a7d Replace some callers of setTailCall with setTailCallKind
We were a little sloppy with adding tailcall markers.  Be more
consistent by using setTailCallKind instead of setTailCall.

llvm-svn: 287955
2016-11-25 22:35:09 +00:00
Sanjay Patel 534e270ae5 [SimplifyCFG] auto-generate better checks; NFC
llvm-svn: 287954
2016-11-25 21:12:39 +00:00
Sanjay Patel d1a147f9f4 [SimplifyCFG] auto-generate better checks; NFC
llvm-svn: 287953
2016-11-25 21:07:13 +00:00
Marek Olsak 79c05871a2 AMDGPU/SI: Add back reverted SGPR spilling code, but disable it
suggested as a better solution by Matt

llvm-svn: 287942
2016-11-25 17:37:09 +00:00
Simon Pilgrim c5fb167df0 Use SDValue helpers instead of explicitly going via SDValue::getNode(). NFCI
llvm-svn: 287941
2016-11-25 17:25:21 +00:00
Simon Pilgrim 8e8ae7219f Use SDValue helper instead of explicitly going via SDValue::getNode(). NFCI
llvm-svn: 287940
2016-11-25 17:19:53 +00:00
Craig Topper 88071b37ab [AVX-512] Add support for changing VSHUFF64x2 to VSHUFF32x4 when its feeding a vselect with 32-bit element size.
Summary:
Shuffle lowering may have widened the element size of a i32 shuffle to i64 before selecting X86ISD::SHUF128. If this shuffle was used by a vselect this can prevent us from selecting masked operations.

This patch detects this and changes the element size to match the vselect.

I don't handle changing integer to floating point or vice versa as its not clear if its better to push such a bitcast to the inputs of the shuffle or to the user of the vselect. So I'm ignoring that case for now.

Reviewers: delena, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27087

llvm-svn: 287939
2016-11-25 16:48:05 +00:00
Craig Topper 1e48829747 [AVX-512] Add VPERMT2* and VPERMI2* instructions to load folding tables.
llvm-svn: 287937
2016-11-25 16:33:53 +00:00
Marek Olsak e3895bfb47 Revert "AMDGPU: Implement SGPR spilling with scalar stores"
This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e.

llvm-svn: 287936
2016-11-25 16:03:34 +00:00
Marek Olsak dad553a5cf Revert "AMDGPU: Fix MMO when splitting spill"
This reverts commit 79d4f8b8b1ce430c3d5dac4fc72a9eebaed24fe1.

llvm-svn: 287935
2016-11-25 16:03:27 +00:00
Marek Olsak 8cbbf65361 Revert "AMDGPU: Fix adding extra implicit def of register"
This reverts commit e834ce5976567575621901fb967b8018b9916d71.

llvm-svn: 287934
2016-11-25 16:03:22 +00:00
Marek Olsak 713e6fc531 Revert "AMDGPU: Fix not setting kill flag on temp reg when spilling"
This reverts commit 057bbbe4ae170247ba37f08f2e70ef185267d1bb.

llvm-svn: 287933
2016-11-25 16:03:19 +00:00
Marek Olsak a45dae458d Revert "AMDGPU: Make m0 unallocatable"
This reverts commit 124ad83dae04514f943902446520c859adee0e96.

llvm-svn: 287932
2016-11-25 16:03:15 +00:00
Marek Olsak ea848df84c Revert "AMDGPU: Remove m0 spilling code"
This reverts commit f18de36554eb22416f8ba58e094e0272523a4301.

llvm-svn: 287931
2016-11-25 16:03:06 +00:00
Marek Olsak 18a95bcb3c Revert "AMDGPU: Preserve m0 value when spilling"
This reverts commit a5a179ffd94fd4136df461ec76fb30f04afa87ce.

llvm-svn: 287930
2016-11-25 16:03:02 +00:00
Simon Pilgrim 84b6f26eca [X86][SSE] Added knownbits through bitcast test
llvm-svn: 287928
2016-11-25 15:07:15 +00:00
Abhilash Bhandari 54e5a1a4da [Loop Unswitch] Patch to selective unswitch only the reachable branch instructions.
Summary:
The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops.
Given the exponential nature of the algorithm, this is quite an overhead.
This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header.

Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao.
Subscribers: llvm-commits.

Differential Revision: http://reviews.llvm.org/D26299

llvm-svn: 287925
2016-11-25 14:07:44 +00:00
Simon Pilgrim 8424b03d00 [X86][SSE] Added v16i8 shuffle test case from PR31151
llvm-svn: 287919
2016-11-25 11:10:43 +00:00
Simon Dardis c08af6db5b [mips] Correct jal expansion for local symbols in .local directives.
This patch corrects the behaviour of code such as:

   .local foo
   jal foo
foo:
to use the correct jal expansion when writing ELF files.

Patch by: Daniel Sanders

Reviewers: zoran.jovanovic, seanbruno, vkalintiris

Differential Revision: https://reviews.llvm.org/D24722

llvm-svn: 287918
2016-11-25 11:06:43 +00:00
Craig Topper d4091494d3 [X86] Invert an 'if' and early out to fix a weird indentation. NFCI
llvm-svn: 287909
2016-11-25 02:29:24 +00:00
Craig Topper a46936185a [X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCI
llvm-svn: 287908
2016-11-25 02:29:21 +00:00
Craig Topper 8c4cdf06db [DAGCombine] Teach DAG combine that if both inputs of a vselect are the same, then the condition doesn't matter and the vselect can be removed.
Selects with scalar condition already handle this correctly.

llvm-svn: 287904
2016-11-24 21:48:52 +00:00
Craig Topper d621e3a25b [X86] Modify two tests that passed undef to both sides of a vselect to instead pass unique values.
I'd like to teach DAG combine to remove vselects where both sides are identical and these tests were in the way of that.

llvm-svn: 287903
2016-11-24 21:48:50 +00:00
Serge Rogatch a331133e6d Test commit access.
llvm-svn: 287898
2016-11-24 18:51:47 +00:00
Craig Topper 00758090ca [AVX-512] Add tests demonstrating failure to generated masked instructions for VSHUFF32x4 and VSHUFI32x4 due to shuffle lowering widening elements.
llvm-svn: 287897
2016-11-24 18:24:46 +00:00
Abhilash Bhandari e6a31c507c Test Commit, removing a blank line in CREDITS.TXT
llvm-svn: 287891
2016-11-24 15:40:19 +00:00
Simon Pilgrim f1ee930db0 Fix unused variable warning
llvm-svn: 287889
2016-11-24 15:24:47 +00:00
Benjamin Kramer fc54e35d94 [X86] Don't round trip a unique_ptr through a raw pointer for assignment.
No functional change.

llvm-svn: 287888
2016-11-24 15:17:39 +00:00
Simon Pilgrim 9c71e07276 [X86][SSE] Improve UINT_TO_FP v2i32 -> v2f64
Vectorize UINT_TO_FP v2i32 -> v2f64 instead of scalarization (albeit still on the SIMD unit).

The codegen matches that generated by legalization (and is in fact used by AVX for UINT_TO_FP v4i32 -> v4f64), but has to be done in the x86 backend to account for legalization via 4i32.

Differential Revision: https://reviews.llvm.org/D26938

llvm-svn: 287886
2016-11-24 15:12:56 +00:00
Simon Pilgrim 841d7ca463 [X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances

llvm-svn: 287882
2016-11-24 14:46:55 +00:00
Simon Pilgrim 7c26a6f9ef [X86][AVX512DQVL] Add awareness of vcvtqq2ps and vcvtuqq2ps implicit zeroing of upper 64-bits of xmm result
llvm-svn: 287878
2016-11-24 14:02:30 +00:00
Simon Pilgrim ab323ec411 [X86][AVX512DQVL] Add support for v2i64 -> v2f32 SINT_TO_FP/UINT_TO_FP lowering
llvm-svn: 287877
2016-11-24 13:38:59 +00:00
Simon Pilgrim 1e7a846d5f [X86][AVX512DQVL] Add v2i64 -> v2f32 + zero codegen tests
llvm-svn: 287876
2016-11-24 13:26:51 +00:00
Nikolai Bozhenov 3a8d108b2b [x86] Fixing PR28755 by precomputing the address used in CMPXCHG8B
The bug arises during register allocation on i686 for
CMPXCHG8B instruction when base pointer is needed. CMPXCHG8B
needs 4 implicit registers (EAX, EBX, ECX, EDX) and a memory address,
plus ESI is reserved as the base pointer. With such constraints the only
way register allocator would do its job successfully is when the addressing
mode of the instruction requires only one register. If that is not the case
- we are emitting additional LEA instruction to compute the address.

It fixes PR28755.

Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>

Differential Revision: https://reviews.llvm.org/D25088

llvm-svn: 287875
2016-11-24 13:23:35 +00:00
Nikolai Bozhenov bb64aa14a3 [x86] Minor refactoring of X86TargetLowering::EmitInstrWithCustomInserter
Move the definitions of three variables out of the switch.

Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>

Differential Revision: https://reviews.llvm.org/D25192

llvm-svn: 287874
2016-11-24 13:15:49 +00:00
Nikolai Bozhenov a2dabed3b6 [x86] Rewrite getAddressFromInstr helper function
- It does not modify the input instruction
- Second operand of any address is always an Index Register,
  make sure we actually check for that, instead of a check for
  an immediate value

Patch by Alexander Ivchenko <alexander.ivchenko@intel.com>

Differential Revision: https://reviews.llvm.org/D24938

llvm-svn: 287873
2016-11-24 13:05:43 +00:00
Dylan McKay c2de8e8ec3 [AVR] Mark the 'select-must-add-unconditional-jump' test as 'XFAIL'
llvm-svn: 287871
2016-11-24 12:38:54 +00:00
Simon Pilgrim a3af79678e [X86] Generalize CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes. NFCI
Replace the CVTTPD2DQ/CVTTPD2UDQ and CVTDQ2PD/CVTUDQ2PD opcodes with general versions.

This is an initial step towards similar FP_TO_SINT/FP_TO_UINT and SINT_TO_FP/UINT_TO_FP lowering to AVX512 CVTTPS2QQ/CVTTPS2UQQ and CVTQQ2PS/CVTUQQ2PS with illegal types.

Differential Revision: https://reviews.llvm.org/D27072

llvm-svn: 287870
2016-11-24 12:13:46 +00:00
Malcolm Parsons 1c7f07aa3e [CommandLine] Remove redundant initializers for StringRef members
Summary: The default constructor for a StringRef stores an empty string.

Reviewers: beanz, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27067

llvm-svn: 287857
2016-11-24 08:54:05 +00:00
Jacob Baungard Hansen a8cbbdc9b6 TableGen: Allow signed immediates for instruction aliases
Patch by Daniel Cederman.

Reviewers: stoklund, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D27046

llvm-svn: 287856
2016-11-24 08:53:28 +00:00
Craig Topper f23b995f78 [AVX-512] Fix some mask shuffle tests to actually test the case they were supposed to test.
llvm-svn: 287854
2016-11-24 05:36:50 +00:00
Craig Topper 993c7416d3 [AVX-512] Move a 16 x float shuffle test to the v16 test file and add an integer variant.
llvm-svn: 287853
2016-11-24 05:36:47 +00:00
Peter Collingbourne debb6f6cc1 Object: Add IRObjectFile::getTargetTriple().
This lets us remove a use of IRObjectFile::getModule() in llvm-nm.

Differential Revision: https://reviews.llvm.org/D27074

llvm-svn: 287846
2016-11-24 01:13:09 +00:00
Peter Collingbourne e32baa0c3e Object: Simplify the IRObjectFile symbol iterator implementation.
Change the IRObjectFile symbol iterator to be a pointer into a vector of
PointerUnions representing either IR symbols or asm symbols.

This change is in preparation for a future change for supporting multiple
modules in an IRObjectFile. Although it causes an increase in memory
consumption, we can deal with that issue separately by introducing a bitcode
symbol table.

Differential Revision: https://reviews.llvm.org/D26928

llvm-svn: 287845
2016-11-24 00:41:05 +00:00
Matt Arsenault 7b54dd039e AMDGPU: Preserve m0 value when spilling
llvm-svn: 287844
2016-11-24 00:26:50 +00:00
Matt Arsenault 94b32ffe8e TRI: Add hook to pass scavenger during frame elimination
The scavenger was not passed if requiresFrameIndexScavenging was
enabled. I need to be able to test for the availability of an
unallocatable register here, so I can't create a virtual register for
it.

It might be better to just always use the scavenger and stop
creating virtual registers.

llvm-svn: 287843
2016-11-24 00:26:47 +00:00
Matt Arsenault 5ee3325358 AMDGPU: Remove m0 spilling code
Since m0 isn't allocatable it should never be spilled anymore.

llvm-svn: 287842
2016-11-24 00:26:44 +00:00
Matt Arsenault 9e5c7b1031 AMDGPU: Make m0 unallocatable
m0 may need to be written for spill code, so
we don't want general code uses relying on the
value stored in it.

This introduces a few code quality regressions where copies
from m0 are not coalesced into copies of a copy of m0.

llvm-svn: 287841
2016-11-24 00:26:40 +00:00
Davide Italiano 8812f28f47 [lib/LTO] Rename few instances of Lto to LTO.
llvm-svn: 287840
2016-11-24 00:23:09 +00:00