Commit Graph

155193 Commits

Author SHA1 Message Date
Sanjay Patel 2a61a821a0 [DAG] combine assertsexts around a trunc
This was a suggested follow-up to:
D37017 / https://reviews.llvm.org/rL313577

llvm-svn: 315206
2017-10-09 15:22:20 +00:00
Amara Emerson 24ca39ce71 [AArch64] Improve codegen for inverted overflow checking intrinsics
E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an
intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the
inverted condition code for the CSEL.

rdar://28495949

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D38160

llvm-svn: 315205
2017-10-09 15:15:09 +00:00
Sanjay Patel 8557e29408 [x86] regenerate test checks; NFC
llvm-svn: 315204
2017-10-09 15:01:58 +00:00
Sanjay Patel be37ab864c [AArch64] fix typos in test assertions
llvm-svn: 315203
2017-10-09 01:29:54 +00:00
Craig Topper c88883b07d [X86] Remove a setLoadExtAction from the AVX512 section that uses an AVX512BW type and is alraedy present in the AVX512BW section.
llvm-svn: 315202
2017-10-09 01:05:16 +00:00
Craig Topper 4f8656a7af [X86] Enable extended comparison predicate support for SETUEQ/SETONE when targeting AVX instructions.
We believe that despite AMD's documentation, that they really do support all 32 comparision predicates under AVX.

Differential Revision: https://reviews.llvm.org/D38609

llvm-svn: 315201
2017-10-09 01:05:15 +00:00
Benjamin Kramer c24fb0718d Remove unused variables. No functionality change.
llvm-svn: 315196
2017-10-08 21:23:02 +00:00
Simon Pilgrim 2c742f919a [X86][SSE] Don't call combineTo inside combineX86ShufflesRecursively. NFCI.
Return the combined shuffle from combineX86ShufflesRecursively and perform the combineTo in the caller.

Makes it easier for future patches to use this in functions that aren't actually shuffles themselves.

llvm-svn: 315195
2017-10-08 20:58:14 +00:00
Simon Pilgrim 6abbd33ec0 Tidyup with clang-format. NFCI.
llvm-svn: 315187
2017-10-08 19:24:30 +00:00
Simon Pilgrim 135a2639f4 [X86][SSE] Add test case for PR27708
llvm-svn: 315186
2017-10-08 19:18:10 +00:00
Benjamin Kramer 16610028ea Remove unused variables. No functionality change.
llvm-svn: 315185
2017-10-08 19:11:02 +00:00
Craig Topper 977c546b0c [X86] Regenerate fast-isel-select-pseudo-cmov.ll to prepare for D38609.
llvm-svn: 315184
2017-10-08 17:54:50 +00:00
Javed Absar f45d0b9849 [TableGen] Simplify, add range_loop in CodeGenSchedule
llvm-svn: 315183
2017-10-08 17:23:30 +00:00
Simon Pilgrim dc32c844f9 [X86] getTargetConstantBitsFromNode - add support for decoding scalar constants
llvm-svn: 315182
2017-10-08 17:21:18 +00:00
Craig Topper c97775c03c [X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns
Summary:
We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to prefering MOVSS/MOVSD.

Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary.

Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd.

This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type.

The rest of the patch is removing all the excess scalar patterns.

Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint.

Reviewers: RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38023

llvm-svn: 315181
2017-10-08 16:57:23 +00:00
Benjamin Kramer 3f8c1189c7 Make more constructors constexpr or use =default.
This lets the compiler reason about the type more easily. No
functionality change intended.

llvm-svn: 315180
2017-10-08 15:59:35 +00:00
Amara Emerson 1f83777bd3 [AArch64][GlobalISel] Add a test case for G_PHI of p0 instruction selection.
llvm-svn: 315179
2017-10-08 15:29:35 +00:00
Amara Emerson e205cab66f [AArch64][GlobalISel] Add a test case for G_PHI of p0 regbank selection.
llvm-svn: 315178
2017-10-08 15:29:31 +00:00
Amara Emerson 1cd89ca669 [AArch64][GlobalISel] Make G_PHI of p0 types legal.
Differential Revision: https://reviews.llvm.org/D38621

llvm-svn: 315177
2017-10-08 15:29:11 +00:00
Simon Pilgrim 6410ca70aa [X86][XOP] Add XOP oddshuffles tests
XOP codegen is often different to generic AVX - thank you vpperm!

llvm-svn: 315176
2017-10-08 12:58:15 +00:00
Gadi Haber 684944b822 [X86][SKX] Adding the scheduling information for the SKX target.
Adding the scheduling information for the SkylakeServer (SKX) target.

This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.

The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613.

Please expect some performance fluctuations due to code alignment effects.

Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu
Differential Revision: https://reviews.llvm.org/D38443

Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185
llvm-svn: 315175
2017-10-08 12:52:54 +00:00
Ayman Musa 1170deb9c8 [X86] Add missing entries in 'MemoryFoldTable2Addr' to get complete form of the table.
Get the folding table 'MemoryFoldTable2Addr' to a complete state as part of the process explained in https://reviews.llvm.org/D38028

Differential Revision: https://reviews.llvm.org/D38500

llvm-svn: 315174
2017-10-08 09:46:50 +00:00
Ayman Musa 993339b941 [X86][TableGen] Recommitting the X86 memory folding tables TableGen backend while disabling it by default.
After the original commit ([[ https://reviews.llvm.org/rL304088 | rL304088 ]]) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'.
In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is:

 # Commit the tablegen backend disabled by default.

 # Proceed with an incremental updating of the manual tables - while checking the validity of each added entry.

 # Repeat previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead.

 # Schedule periodical (1 week/2 weeks/1 month) runs of the pass:

   - if changes appear (new entries):
      - make sure the entries are legal
      - If they are not, mark them as illegal to folding
   - Commit the changes (if there are any).

CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flags will run the pass and emit the X86GenFoldTables.inc file under build/lib/Target/X86/ directory which is a good reference for any developer who wants to take part in the effort of completing the current folding tables.

Differential Revision: https://reviews.llvm.org/D38028

llvm-svn: 315173
2017-10-08 09:20:32 +00:00
Craig Topper bbca2f2978 [X86] Stop LowerSIGN_EXTEND_AVX512 from creating v8i16/v16i16/v16i8 vselects with a v8i1/v16i1 condition when BWI is not available.
Some of the tests in vector-shuffle-v1.ll would get into an infinite loop without this.

llvm-svn: 315172
2017-10-08 08:50:59 +00:00
Ayman Musa 5fc6dc58d7 [X86] Add new attribute to X86 instructions to enable marking them as "not memory foldable"
This attribute will be used in a tablegen backend that generated the X86 memory folding tables which will be added in a future pass.
Instructions with this attribute unset will be excluded from the full set of X86 instructions available for the pass.

Differential Revision: https://reviews.llvm.org/D38027

llvm-svn: 315171
2017-10-08 08:32:56 +00:00
Craig Topper 9563cab961 [X86] Simplify some code in getInsertVINSERTImmediate and getExtractVEXTRACTImmediate. NFC
Replace one of the divides with a multiply.

llvm-svn: 315162
2017-10-08 01:33:42 +00:00
Craig Topper 27170fee8d [X86] If we see an insert of a bitcast into zero vector, canonicalize it to move the bitcast to the other side of the insert.
This improves detection of zeroing of upper bits during isel.

llvm-svn: 315161
2017-10-08 01:33:41 +00:00
Craig Topper f7a19db649 [X86] Remove ISD::INSERT_SUBVECTOR handling from combineBitcastForMaskedOp. Add isel patterns to make up for it.
This will allow for some flexibility in canonicalizing bitcasts around insert_subvector.

llvm-svn: 315160
2017-10-08 01:33:40 +00:00
Craig Topper 16f2044fa8 [X86] Use getConstantOperandVal to simplify some code. NFC
llvm-svn: 315159
2017-10-08 01:33:38 +00:00
Simon Pilgrim 9508fe7924 [X86][SSE] Match bitcasted BUILD_VECTOR of constants for v2i64 shifts on 64-bit targets (PR34855)
Extension to rL315155, generate constant shifts on 64-bits as well as 32-bits.

llvm-svn: 315156
2017-10-07 17:57:22 +00:00
Simon Pilgrim 70e1db78db [X86][SSE] Match bitcasted v4i32 BUILD_VECTORS for v2i64 shifts on 64-bit targets (PR34855)
We were already doing this for 32-bit targets, but we can generate these on 64-bits as well.

llvm-svn: 315155
2017-10-07 17:42:17 +00:00
Craig Topper 90b76211d3 [SelectionDAG} Use KnownBits::isUnknown and hasConflict. NFC
llvm-svn: 315154
2017-10-07 17:07:48 +00:00
Craig Topper 2f60295364 [X86] Add X86ISD::CMOV to computeKnownBitsForTargetNode and ComputeNumSignBitsForTargetNode.
Summary: Implementations based on ISD::SELECT.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38663

llvm-svn: 315153
2017-10-07 16:51:19 +00:00
Sanjay Patel 865e6c852b [InstSimplify] add tests to show we can do better at folding poison; NFC
llvm-svn: 315152
2017-10-07 15:39:06 +00:00
Simon Pilgrim b021b13aac [TableGen] Avoid repeated find calls in CodeGenDAGPatterns getters. NFCI.
The assertion tests were using count() instead of testing the find result, resulting in double the number of searches in debug/assert builds.

Instead, call find once (like the release builds do) and assert the result against end().

llvm-svn: 315151
2017-10-07 14:34:24 +00:00
Simon Pilgrim 73f143e774 [X86][SSE] Improve shuffling combining with horizontal operations
Recognise cases when we can merge the shuffles with their horizontal (HADD/HSUB/PACK) instruction inputs.

Replaces an older implementation which performed some of this during lowering, expanding an existing target shuffle combine stage instead.

Differential Revision: https://reviews.llvm.org/D38506

llvm-svn: 315150
2017-10-07 12:42:23 +00:00
Simon Pilgrim 5e030f9cdc [TableGen] Avoid unnecessary std::string creations
Avoid unnecessary std::string creations in the TreePredicateFn getters and in CodeGenDAGPatterns::getSDNodeNamed

Differential Revision: https://reviews.llvm.org/D38624

llvm-svn: 315148
2017-10-07 12:08:43 +00:00
Martin Storsjo 5e9d482b0a [X86] Update an outdated comment about SjLj
The SjLj intrinsics in the X86 backend are intended for use with
SjLj exception handling as well, since SVN r271244.

Differential Revision: https://reviews.llvm.org/D38532

llvm-svn: 315146
2017-10-07 06:00:32 +00:00
Craig Topper e79eff3bb5 [X86] Correct result type for the flag result of RDSEED and RDRAND nodes. Correct the CC type for the CMOV used with RDSEED/RDRAND.
The flag result was MVT::Glue, but should be MVT::i32. The CC type was MVT::i8, but should be MVT::i32.

llvm-svn: 315145
2017-10-07 05:11:59 +00:00
Jessica Paquette 13593843f6 [MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2.
Say that the outliner outlines A,B,C from one function, and D,E,F from another
function (where letters are instructions). Now those functions are not
identical, and cannot be deduped. Locally to M1 and M2, these outlining
choices would be good-- to the whole program, however, this might not be true!

To mitigate this, this commit makes it so that the outliner sees linkonceodr
functions as unsafe to outline from. It also adds a flag,
-enable-linkonceodr-outlining, which allows the user to specify that they
want to outline from such functions when they know what they're doing.

Changing this handles most code size regressions in the test suite caused by
competing with linker dedupe. It also doesn't have a huge impact on the code
size improvements from the outliner. There are 6 tests that regress > 5% from
outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most
tests either improve or are not impacted.

Not outlined vs outlined without linkonceodrs:
https://hastebin.com/raw/qeguxavuda

Not outlined vs outlined with linkonceodrs:
https://hastebin.com/raw/edepoqoqic

Outlined with linkonceodrs vs outlined without linkonceodrs:
https://hastebin.com/raw/awiqifiheb

Numbers generated using compare.py with -m size.__text. Tests run for AArch64
with -Oz -mllvm -enable-machine-outliner -mno-red-zone.

llvm-svn: 315136
2017-10-07 00:16:34 +00:00
Sanjay Patel 72d339abb7 [InstCombine] use correct type when propagating constant condition in simplifyDivRemOfSelectWithZeroOp (PR34856)
llvm-svn: 315130
2017-10-06 23:43:06 +00:00
Zachary Turner b12792a4c0 [llvm-rc] Fix some endianness errors.
llvm-svn: 315128
2017-10-06 23:21:43 +00:00
Sanjay Patel ae2e3a44d2 [InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, simplify code; NFCI
There's at least one bug here - this code can fail with vector types (PR34856).
It's also being called for FREM; I'm still trying to understand how that is valid.

llvm-svn: 315127
2017-10-06 23:20:16 +00:00
Cameron McInally 9d64101fe8 [AVX512] Fix TERNLOG when folding broadcast
Patch to fix ternlog instructions with a folded
broadcast. The broadcast decorator, e.g. {1toX}, was missing.

Differential Revision: https://reviews.llvm.org/D38649

llvm-svn: 315122
2017-10-06 22:31:29 +00:00
Jonas Devlieghere f2fa9ebe3f [dwarfdump] Verify that unit type matches root DIE
This patch adds two new verifiers:

  - It checks that the root DIE of a CU is actually a valid unit DIE.
    (based on its tag)
  - For DWARF5 which contains a unit type int he CU header, it checks that
    this matches the type of the unit DIE.

Differential revision: https://reviews.llvm.org/D38453

llvm-svn: 315121
2017-10-06 22:27:31 +00:00
Zachary Turner a92eb33ad5 [llvm-rc] Implement escape sequences in .rc files.
This allows the escape sequences (\a, \n, \r, \t, \\, \x[0-9a-f]*,
\[0-7]*, "") to appear in .rc scripts. These are parsed and output in
the same way as it's done in original MS implementation.

The way these sequences are processed depends on the type of the
resource it resides in, and on whether the user declared the string to
be "wide" or "narrow".

I tried to maintain the maximum compatibility with the original tool
(and fail in some erroneous situations that are accepted by .rc).
However, there are some (extremely rare) cases where Microsoft tool
outputs nonsense. I found it infeasible to detect such casses.

Patch by Marek Sokolowski

Differential Revision: https://reviews.llvm.org/D38426

llvm-svn: 315118
2017-10-06 22:05:15 +00:00
Zachary Turner 9d8b358a49 [llvm-rc] Serialize user-defined resources to .res files.
This allows rc to serialize user-defined resources, as
documented at:

msdn.microsoft.com/en-us/library/windows/desktop/aa381054.aspx

Escape sequences are yet unavailable, and are to be added in one of
child patches.

Patch by: Marek Sokolowski

Differential Revision: https://reviews.llvm.org/D38423

llvm-svn: 315117
2017-10-06 21:52:15 +00:00
Zachary Turner da366693bf [llvm-rc] Serialize STRINGTABLE statements to .res file.
This allows llvm-rc to serialize STRINGTABLE resources.

These are output in an unusual way: we locate them at the end of the
file, and strings are merged into bundles of max 16 strings, depending
on their IDs, language, and characteristics.

Ref: msdn.microsoft.com/en-us/library/windows/desktop/aa381050.aspx

Patch by: Marek Sokolowski
Differential Revision: https://reviews.llvm.org/D38420

llvm-svn: 315112
2017-10-06 21:30:55 +00:00
Zachary Turner 07bc04ff38 [llvm-rc] Serialize VERSIONINFO resources to .res files.
This is now able to dump VERSIONINFO resources.

Ref: msdn.microsoft.com/en-us/library/windows/desktop/aa381058.aspx

Differential Revision: https://reviews.llvm.org/D38410
Patch by: Marek Sokolowski

llvm-svn: 315110
2017-10-06 21:26:06 +00:00
Zachary Turner c3ab013aa1 [llvm-rc] Serialize CURSOR and ICON resources to .res
This is part 6 of llvm-rc serialization.

This adds ability to output cursors and icons as resources.

Unfortunately, we can't just copy .cur or .ico files to output - as each
file might contain multiple images, each of them needs to be unpacked
and stored as a separate resource. This forces us to parse cursor and
icon contents. (Fortunately, these formats are pretty similar and can be
processed by mostly common code).

As test files are binary, here is a short explanation of .cur and .ico
files stored:

cursor.cur, cursor-8.cur, cursor-32.cur are sample correct cursor files,
differing in their bit depth.

icon-old.ico, icon-new.ico are sample correct icon files;

icon-png.ico is a sample correct icon file in PNG format (instead of
usual BMP);

cursor-eof.cur is an incorrect cursor file - this is cursor.cur with
some of its final bytes removed.

cursor-bad-offset.cur is an incorrect cursor file - image header states
that image data begins at offset 0xFFFFFFFF.

Sample correct cursors and icons were created by Nico Weber.

Patch by Marek Sokolowski
Differential Revision: https://reviews.llvm.org/D37878

llvm-svn: 315109
2017-10-06 21:25:44 +00:00
Reid Kleckner b6b210e61f Revert "Roll forward r314928"
This appears to be miscompiling Clang, as shown on two Windows bootstrap
bots:
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870

Nothing else is in the blame list. Both emit errors on this valid code
in the Windows ucrt headers:

C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char *' and 'int')
            _Ptr = (char*)_Ptr + _ALLOCA_S_MARKER_SIZE;
                   ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~

I am attempting to reproduce this now.

This reverts r315044

llvm-svn: 315108
2017-10-06 21:17:51 +00:00
Zachary Turner 420090af89 [llvm-rc] Add optional serialization support for DIALOG(EX) resources.
This is part 5 of llvm-rc serialization support.

This allows DIALOG and DIALOGEX to serialize if dialog-specific optional
statements are provided. These are (as of now): CAPTION, FONT, and
STYLE.

Notably, FONT statement can take more than two arguments when describing
DIALOGEX resources (as in
msdn.microsoft.com/en-us/library/windows/desktop/aa381013.aspx). I made
some changes to the parser to reflect this fact.

Patch by Marek Sokolowski
Differential Revision: https://reviews.llvm.org/D37864

llvm-svn: 315104
2017-10-06 20:51:20 +00:00
Adrian Prantl 214babea60 Unify spelling.
llvm-svn: 315102
2017-10-06 20:24:35 +00:00
Adrian Prantl 59f30b8874 llvm-dwarfdump: Add an option to collect debug info quality metrics.
At the last LLVM dev meeting we had a debug info for optimized code
BoF session. In that session I presented some graphs that showed how
the quality of the debug info produced by LLVM changed over the last
couple of years. This is a cleaned up version of the patch I used to
collect the this data. It is implemented as an extension of
llvm-dwarfdump, adding a new --statistics option. The intended
use-case is to automatically run this on the debug info produced by,
e.g., our bots, to identify eyebrow-raising changes or regressions
introduced by new transformations that we could act on.

In the current form, two kinds of data are being collected:

- The number of variables that have a debug location versus the number
  of variables in total (this takes into account inlined instances of
  the same function, so if a variable is completely missing form only
  one instance it will be found).

- The PC range covered by variable location descriptions versus the PC
  range of all variables' containing lexical scopes.

The output format is versioned and extensible, so I'm looking forward
to both bug fixes and ideas for other data that would be interesting
to track.

Differential Revision: https://reviews.llvm.org/D36627

llvm-svn: 315101
2017-10-06 20:24:34 +00:00
Amara Emerson ba0f79af92 [GlobalISel] Fix legalizer trying to process a deleted instruction.
In some cases an instruction is deleted from the block during combining,
however it can still exist in the legalizer worklist.

This change modifies the combiner routines to add the given MI to the
dead instruction list rather than trying to remove it from the block
themselves. The responsibility is then on the caller to delete the instruction
from the block and any worklists.

Differential Revision: https://reviews.llvm.org/D38622

llvm-svn: 315092
2017-10-06 19:24:15 +00:00
Reid Kleckner 813c577cc2 [PEI] Remove required properties and use 'if' instead of std::function
Summary:
After r303360, we initialize UsesCalleeSaves in runOnMachineFunction,
which runs after getRequiredProperties. UsesCalleeSaves was initialized
to 'false', so getRequiredProperties would always return an empty set.
We don't have a TargetMachine available early anymore after r303360.
Just removing the requirement of NoVRegs seems to make things work, so
let's do that.

Reviewers: thegameg, dschuff, MatzeB

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38597

llvm-svn: 315089
2017-10-06 18:21:19 +00:00
Francis Ricci 85255eda91 Revert "[dsymutil] Emit valid debug locations when no symbol flags are set"
This reverts commit r315082, which fails on non-darwin buildbots.

llvm-svn: 315088
2017-10-06 18:19:52 +00:00
Saleem Abdulrasool 46a59fdab6 Bitcode: add an auto-upgrade for LTO section name
The bitcode reader looks specifically for `__DATA, __objc_catlist` as a
section name.  However, SVN r304661 removed the spaces (the two names
are functionally equivalent but do not compare equally
lexicographically).  This causes compatibility issues.  Add an
auto-upgrade path for removing the spaces as well as use the new name in
the LTO plugin.

llvm-svn: 315086
2017-10-06 18:06:59 +00:00
Zachary Turner 96b04b68ed [lit] Improve tool substitution in lit.
This addresses two sources of inconsistency in test configuration
files.

1. Substitution boundaries.  Previously you would specify a
   substitution, such as 'lli', and then additionally a set
   of characters that should fail to match before and after
   the tool.  This was used, for example, so that matches that
   are parts of full paths would not be replaced.  But not all
   tools did this, and those that did would often re-invent
   the set of characters themselves, leading to inconsistency.
   Now, every tool substitution defaults to using a sane set
   of reasonable defaults and you have to explicitly opt out
   of it.  This actually fixed a few latent bugs that were
   never being surfaced, but only on accident.

2. There was no standard way for the system to decide how to
   locate a tool.  Sometimes you have an explicit path, sometimes
   we would search for it and build up a path ourselves, and
   sometimes we would build up a full command line.  Furthermore,
   there was no standardized way to handle missing tools.  Do we
   warn, fail, ignore, etc?  All of this is now encapsulated in
   the ToolSubst class.  You either specify an exact command to
   run, or an instance of FindTool('<tool-name>') and everything
   else just works.  Furthermore, you can specify an action to
   take if the tool cannot be resolved.

Differential Revision: https://reviews.llvm.org/D38565

llvm-svn: 315085
2017-10-06 17:54:46 +00:00
Zachary Turner c981448063 Run pyformat on lit code.
llvm-svn: 315084
2017-10-06 17:54:27 +00:00
Diana Picus 57285d7a43 [ARM] GlobalISel: Make tests less strict
These are intended as integration tests, so they shouldn't be too
specific about what they're checking.

llvm-svn: 315083
2017-10-06 17:47:27 +00:00
Francis Ricci b468fd64f9 [dsymutil] Emit valid debug locations when no symbol flags are set
Summary:
swiftc emits symbols without flags set, which led dsymutil to ignore
them when searching for global symbols, causing dwarf location data
to be omitted. Xcode's dsymutil handles this case correctly, and emits
valid location data. Add this functionality to llvm-dsymutil by
allowing parsing of symbols with no flags set.

Reviewers: aprantl, friss, JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38587

llvm-svn: 315082
2017-10-06 17:43:37 +00:00
Stanislav Mekhanoshin de42c29a68 [AMDGPU] New 64 bit div/rem expansion
Old expansion was 20 VGPRs, 78 SGPRs and ~380 instructions.
This expansion is 11 VGPRs, 12 SGPRs and ~120 instructions.

Passes OpenCL conformance test_integer_ops quick_[u]long_math

Differential Revision: https://reviews.llvm.org/D38607

llvm-svn: 315081
2017-10-06 17:24:45 +00:00
Reid Kleckner 4c4422f9a5 [MC] Use unique_ptr to manage WinFrameInfos, NFC
The FrameInfo cannot be stored directly in the vector because chained
frames may refer to parent frames, so we need pointers that are stable
across a vector resize.

llvm-svn: 315080
2017-10-06 17:21:49 +00:00
Peter Collingbourne 80e31f1f84 Support: Rewrite Windows implementation of sys::fs::rename to be more POSIXy.
The current implementation of rename uses ReplaceFile if the
destination file already exists. According to the documentation for
ReplaceFile, the source file is opened without a sharing mode. This
means that there is a short interval of time between when ReplaceFile
renames the file and when it closes the file during which the
destination file cannot be opened.

This behaviour is not POSIX compliant because rename is supposed
to be atomic. It was also causing intermittent link failures when
linking with a ThinLTO cache; the ThinLTO cache implementation expects
all cache files to be openable.

This patch addresses that problem by re-implementing rename
using CreateFile and SetFileInformationByHandle. It is roughly a
reimplementation of ReplaceFile with a better sharing policy as well
as support for renaming in the case where the destination file does
not exist.

This implementation is still not fully POSIX. Specifically in the case
where the destination file is open at the point when rename is called,
there will be a short interval of time during which the destination
file will not exist. It isn't clear whether it is possible to avoid
this using the Windows API.

Differential Revision: https://reviews.llvm.org/D38570

llvm-svn: 315079
2017-10-06 17:14:36 +00:00
Dehao Chen 9bd60429e2 Directly return promoted direct call instead of rely on stripPointerCast.
Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPOintercast, the more reliable solution is to directly use the pointer to the promoted direct call.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38603

llvm-svn: 315077
2017-10-06 17:04:55 +00:00
Francis Ricci 195a66f6eb Guard xar RAII behind HAVE_LIBXAR
llvm-svn: 315072
2017-10-06 15:54:20 +00:00
Diana Picus e393bc72ee [ARM] GlobalISel: Select shifts
Unfortunately TableGen doesn't handle this yet:
Unable to deduce gMIR opcode to handle Src (which is a leaf).

Just add some temporary hand-written code to generate the proper MOVsr.

llvm-svn: 315071
2017-10-06 15:39:16 +00:00
Simon Pilgrim d0faf16d04 Strip trailing whitespace
llvm-svn: 315070
2017-10-06 15:33:55 +00:00
Francis Ricci 6f94297422 [llvm-objdump] Add RAII for xar apis
Summary:
xar_open and xar_iter_new require manual calls to close/free functions
to deallocate resources. This makes it easy to introduce memory leaks,
so add RAII struct wrappers for these resources.

Reviewers: enderby, rafael, compnerd, lhames, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38598

llvm-svn: 315069
2017-10-06 15:33:28 +00:00
Javed Absar 32e3cb7bde [TableGen] Simplify SubtargetEmitter
Remove unnecessary duplicate if-condition.

llvm-svn: 315068
2017-10-06 15:25:04 +00:00
Diana Picus a81a4b17e5 [ARM] GlobalISel: Map shift operands to GPRs
llvm-svn: 315067
2017-10-06 14:52:43 +00:00
Francis Ricci 8aedfde298 [llvm-dsymutil] Add support for __swift_ast MachO DWARF section
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.

Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).

Reviewers: aprantl, friss

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38504

llvm-svn: 315066
2017-10-06 14:49:20 +00:00
Diana Picus 2c95730450 [ARM] GlobalISel: Mark shifts as legal for s32
The new legalize combiner introduces shifts all over the place, so we
should support them sooner rather than later.

llvm-svn: 315064
2017-10-06 14:30:05 +00:00
Jonas Paulsson c63ed222b8 [SystemZ] Enable machine scheduler.
The machine scheduler (before register allocation) is enabled by default for
SystemZ.

The SelectionDAG scheduling preference now becomes source order scheduling
(was regpressure).

Review: Ulrich Weigand
https://reviews.llvm.org/D37977

llvm-svn: 315063
2017-10-06 13:59:28 +00:00
Clement Courbet 028d2eb671 [MergeICmp][NFC] Make test tuple-four-int8.ll more readable.
llvm-svn: 315062
2017-10-06 13:45:16 +00:00
Simon Pilgrim a29dbdf2ca [X86][SSE] Add SKX cpu tests to SSE/AVX scheduling tests (D38443)
llvm-svn: 315061
2017-10-06 13:40:29 +00:00
Clement Courbet d12c189e2e Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
Still a few stability issues on windows.

This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545.

llvm-svn: 315058
2017-10-06 13:02:24 +00:00
Clement Courbet 4e1bae8136 Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
(fixed unit tests by making comparisons stable)

This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583.

llvm-svn: 315056
2017-10-06 12:12:35 +00:00
Javed Absar 41705e9e09 [TableGen] : CodeGenInsrtuction modify to range loop. NFC.
llvm-svn: 315050
2017-10-06 09:32:45 +00:00
Xinliang David Li bcd36f7c5a Roll forward r314928
Fixed ThinLTO bootstrap failure : track new
bitcast per incomingVal. Added new tests.

llvm-svn: 315044
2017-10-06 05:15:25 +00:00
Davide Italiano c74ea93b8c [PM] Retire disable unit-at-a-time switch.
This is a vestige from the GCC-3 days, which disables IPO passes
when set. I don't think anybody actually uses it as there are
several IPO passes which still run with this flag set and
nobody complained/noticed. This reduces the delta between
current and new pass manager and allows us to easily review
the difference when we decide to flip the switch (or audit
which passes should run, FWIW).

llvm-svn: 315043
2017-10-06 04:39:40 +00:00
Jakub Kuderski cbe9fae99d [CodeExtractor] Fix multiple bugs under certain shape of extracted region
Summary:
If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct resotre instructions and PHI nodes won't be generated inside the exitStub. The solution is simply put the restore instructions right after the definition of output values instead of putting in exitStub.
Unittest for this bug is included.

Author: myhsu

Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar

Subscribers: dberlin, kuhar, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37902

llvm-svn: 315041
2017-10-06 03:37:06 +00:00
Daniel Berlin 08dd582ea0 NewGVN: Factor out duplicate parts of OpIsSafeForPHIOfOps
llvm-svn: 315040
2017-10-06 01:33:06 +00:00
Francis Ricci b4e77d98ed Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section"
Breaks aarch64 builders

This reverts commit r315014.

llvm-svn: 315034
2017-10-05 23:09:17 +00:00
Xin Tong 27e66fb579 [MBP] Remove an invalid assert.
The patch that this assert comes with is fixing a bug in MBP. The assert is
invalid however.

Thanks to @sergey.k.okunev for finding this

Currently this fails SPECCPU2006 LTO. I will add a test case when I do more
investigation and have one.

llvm-svn: 315032
2017-10-05 23:00:04 +00:00
Peter Collingbourne 715bcfe0c9 ModuleUtils: Stop using comdat members to generate unique module ids.
It is possible for two modules to define the same set of external
symbols without causing a duplicate symbol error at link time,
as long as each of the symbols is a comdat member. So we cannot
use them as part of a unique id for the module.

Differential Revision: https://reviews.llvm.org/D38602

llvm-svn: 315026
2017-10-05 21:54:53 +00:00
Reid Kleckner 676941909d [X86] Extract CATCHRET handling from emitEpilogue, NFC
llvm-svn: 315023
2017-10-05 21:37:39 +00:00
Derek Schuff 885dc59297 [WebAssembly] Add the rest of the atomic loads
Add extending loads and constant offset patterns
A bit more refactoring of the tablegen to make the patterns fairly nice and
uniform between the regular and atomic loads.

Differential Revision: https://reviews.llvm.org/D38523

llvm-svn: 315022
2017-10-05 21:18:42 +00:00
Sanjay Patel 7ac2db6a48 [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2
We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C'
This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold:

IC: Visiting:   %c = icmp ugt i4 %s, 1
IC: ADD:   %s = lshr i4 %x, 1
IC: ADD:   %1 = udiv i4 %x, 2
IC: Old =   %c = icmp ugt i4 %1, 1
    New =   <badref> = icmp uge i4 %x, 4
IC: ADD:   %c = icmp uge i4 %x, 4
IC: ERASE   %2 = icmp ugt i4 %1, 1
IC: Visiting:   %c = icmp uge i4 %x, 4
IC: Old =   %c = icmp uge i4 %x, 4
    New =   <badref> = icmp ugt i4 %x, 3
IC: ADD:   %c = icmp ugt i4 %x, 3
IC: ERASE   %2 = icmp uge i4 %x, 4
IC: Visiting:   %c = icmp ugt i4 %x, 3
IC: DCE:   %1 = udiv i4 %x, 2
IC: ERASE   %1 = udiv i4 %x, 2
IC: DCE:   %s = lshr i4 %x, 1
IC: ERASE   %s = lshr i4 %x, 1
IC: Visiting:   ret i1 %c

When we could go directly to canonical icmp form:

IC: Visiting:   %c = icmp ugt i4 %s, 1
IC: Old =   %c = icmp ugt i4 %s, 1
    New =   <badref> = icmp ugt i4 %x, 3
IC: ADD:   %c = icmp ugt i4 %x, 3
IC: ERASE   %1 = icmp ugt i4 %s, 1
IC: ADD:   %s = lshr i4 %x, 1
IC: DCE:   %s = lshr i4 %x, 1
IC: ERASE   %s = lshr i4 %x, 1
IC: Visiting:   %c = icmp ugt i4 %x, 3

...but then I noticed that the folds were incomplete too:
https://godbolt.org/g/aB2hLE

Here are attempts to prove the logic with Alive:
https://rise4fun.com/Alive/92o

Name: lshr_ult
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr i8 %x, C1
%r = icmp ult i8 %sh, C2
  =>
%r = icmp ult i8 %x, (C2 << C1)

Name: ashr_slt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr i8 %x, C1
%r = icmp slt i8 %sh, C2
  =>
%r = icmp slt i8 %x, (C2 << C1)

Name: lshr_ugt
Pre: (((C2+1) << C1) u>> C1) == (C2+1)
%sh = lshr i8 %x, C1
%r = icmp ugt i8 %sh, C2
  =>
%r = icmp ugt i8 %x, ((C2+1) << C1) - 1

Name: ashr_sgt
Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1)
%sh = ashr i8 %x, C1
%r = icmp sgt i8 %sh, C2
  =>
%r = icmp sgt i8 %x, ((C2+1) << C1) - 1

Name: ashr_exact_sgt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr exact i8 %x, C1
%r = icmp sgt i8 %sh, C2
  =>
%r = icmp sgt i8 %x, (C2 << C1)

Name: ashr_exact_slt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr exact i8 %x, C1
%r = icmp slt i8 %sh, C2
  =>
%r = icmp slt i8 %x, (C2 << C1)

Name: lshr_exact_ugt
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr exact i8 %x, C1
%r = icmp ugt i8 %sh, C2
  =>
%r = icmp ugt i8 %x, (C2 << C1)

Name: lshr_exact_ult
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr exact i8 %x, C1
%r = icmp ult i8 %sh, C2
  =>
%r = icmp ult i8 %x, (C2 << C1)

We did something similar for 'shl' in D28406.

Differential Revision: https://reviews.llvm.org/D38514

llvm-svn: 315021
2017-10-05 21:11:49 +00:00
Krzysztof Parzyszek a114941fa8 [Hexagon] Make PS_fi and PS_fia extendable (they both expand to A2_addi)
llvm-svn: 315019
2017-10-05 20:20:06 +00:00
Francis Ricci 6ae88262a8 [dsymutil] Fix typo in swift-ast.test
llvm-svn: 315017
2017-10-05 20:16:16 +00:00
Dehao Chen 16f01fb1db Annotate VP prof on indirect call if it is ICPed in the profiled binary.
Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38477

llvm-svn: 315016
2017-10-05 20:15:29 +00:00
Francis Ricci 2b513b5c99 [llvm-dsymutil] Add support for __swift_ast MachO DWARF section
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.

Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).

Reviewers: aprantl, friss

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38504

llvm-svn: 315014
2017-10-05 20:03:01 +00:00
Krzysztof Parzyszek 7ae3ae9ef4 [Hexagon] Give uniform names to functions changing addressing modes, NFC
The new format is changeAddrMode_xx_yy, where xx is the current mode,
and yy is the new one.

Old name:               New name:
getBaseWithImmOffset    changeAddrMode_abs_io
getAbsoluteForm         changeAddrMode_io_abs
getBaseWithRegOffset    changeAddrMode_io_rr
xformRegToImmOffset     changeAddrMode_rr_io
getBaseWithLongOffset   changeAddrMode_rr_ur
getRegShlForm           changeAddrMode_ur_rr

llvm-svn: 315013
2017-10-05 20:01:38 +00:00
Rafael Espindola 42eb1f2ba9 Added phdr upper bound checks to ElfObject.
Ensure the program_headers call will fail correctly if the program
headers are larger than the underlying buffer.

Patch by Parker Thompson!

llvm-svn: 315012
2017-10-05 20:01:32 +00:00
Francis Ricci 5f689d0db3 Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section"
This reverts commit r315004, because of a failing test on non-apple platforms

llvm-svn: 315009
2017-10-05 19:47:13 +00:00
Francis Ricci 4407767fe8 [dsymutil] Fix unused variable warning
llvm-svn: 315006
2017-10-05 19:35:55 +00:00
Francis Ricci 7767277639 [llvm-dsymutil] Add support for __swift_ast MachO DWARF section
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.

Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).

Reviewers: aprantl, friss

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38504

llvm-svn: 315004
2017-10-05 19:17:28 +00:00
Davide Italiano e070721308 [NewPassManager] Run global dead code elimination after the inliner.
This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.

llvm-svn: 315003
2017-10-05 18:36:01 +00:00