Commit Graph

162432 Commits

Author SHA1 Message Date
Lama Saba 927468309f [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346
If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.

This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence
of a load and a store.

The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies.
breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.

Differential revision: https://reviews.llvm.org/D41330

Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9
llvm-svn: 328973
2018-04-02 13:48:28 +00:00
Andrea Di Biagio 6fd62feff8 [llvm-mca] Do not assume that implicit reads cannot be associated with ReadAdvance entries.
Before, the instruction builder incorrectly assumed that only explicit reads
could have been associated with ReadAdvance entries.
This patch fixes the issue and adds a test to verify it.

llvm-svn: 328972
2018-04-02 13:46:49 +00:00
Nico Weber 62ea0c562e Attempt to fix papertrail-warnings.test on Windows bots.
llvm-svn: 328971
2018-04-02 13:45:39 +00:00
Nico Weber dce9a72d98 Assume existence of inttypes.h and stdint.h in DataTypes.h.
These should exist in all toolchains LLVM supports nowadays.

Enables making DataTypes.h a regular header instead of a .h.cmake file and
allows deleting a bunch of cmake goop (which should also speed up cmake
configure time a bit).

All the code this removes is 9+ years old.
https://reviews.llvm.org/D45155

llvm-svn: 328970
2018-04-02 13:22:26 +00:00
Hiroshi Inoue 6d48493817 [PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td
This patch adds L(D|W|H|B)XTLS instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td.

llvm-svn: 328969
2018-04-02 12:18:21 +00:00
Jonas Devlieghere 9e3e7a99e8 [dsymutil] Upstream emitting of papertrail warnings.
When running dsymutil as part of your build system, it can be desirable
for warnings to be part of the end product, rather than just being
emitted to the output stream. This patch upstreams that functionality.

Differential revision: https://reviews.llvm.org/D44639

llvm-svn: 328965
2018-04-02 10:40:43 +00:00
Simon Pilgrim 3f0bda296d Wdocumentation fix. NFCI.
llvm-svn: 328964
2018-04-02 10:34:39 +00:00
Simon Pilgrim 49a5ddfda0 Wdocumentation fixes. NFCI.
llvm-svn: 328963
2018-04-02 10:21:51 +00:00
Craig Topper 96729cd64b [X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model.
Data taken from Table 16-17 in the Intel Optimization Manual.

llvm-svn: 328962
2018-04-02 06:34:16 +00:00
Craig Topper 6a814904da [X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide.
Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/

llvm-svn: 328961
2018-04-02 05:54:34 +00:00
Craig Topper 8104f266a4 [X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models.
Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those.

llvm-svn: 328960
2018-04-02 05:33:28 +00:00
Craig Topper dc74094398 [X86] Fix the SchedRW for AVX512 shift instructions.
It was being inadvertently defaulted to an FADD scheduler class.

llvm-svn: 328959
2018-04-02 03:15:02 +00:00
Craig Topper 5fb1dc2d22 [X86] Give the AVX512 VEXTRACT instructions the same SchedRWs as the SSE/AVX versions.
llvm-svn: 328958
2018-04-02 02:44:55 +00:00
Nico Weber 7a4f647dc2 Remove a few unreferenced config.h defines.
Found by looking through the output of

  for f in $(grep -o '\bHAVE_[A-Z0-9_]*\b' llvm/cmake/config-ix.cmake); do
    echo $f $(git grep $f '*' | wc -l);
  done

in the monorepo.

llvm-svn: 328957
2018-04-02 01:46:08 +00:00
Craig Topper caec723a1a [X86] Add an itinerary to BTR64rr.
llvm-svn: 328956
2018-04-02 01:12:34 +00:00
Craig Topper 02daec00a2 [X86] Make sure all the classes declare in the Haswell scheduler model are prefixed with HW.
The tablegen files all share a namespace so we shouldn't use a generic names in a specific scheduler model.

llvm-svn: 328955
2018-04-02 01:12:32 +00:00
Craig Topper c90d906b16 [X86] Give VINSERTPS the same intinerary as INSERTPS.
llvm-svn: 328954
2018-04-02 00:48:11 +00:00
Harlan Haskins b7881bbfa2 Add C API bindings for DIBuilder 'Type' APIs
This patch adds a set of unstable C API bindings to the DIBuilder interface for
creating structure, function, and aggregate types.

This patch also removes the existing implementations of these functions from
the Go bindings and updates the Go API to fit the new C APIs.

llvm-svn: 328953
2018-04-02 00:17:40 +00:00
Craig Topper dc4a6d1ef6 [X86] Cleanup ADCX/ADOX instruction definitions.
Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW.

llvm-svn: 328952
2018-04-01 23:58:50 +00:00
Petr Hosek 934e5d5436 [AArch64] Reserve x18 register on Fuchsia
This register is reserved as a platform register on Fuchsia.

Differential Revision: https://reviews.llvm.org/D45105

llvm-svn: 328950
2018-04-01 23:44:04 +00:00
Craig Topper 8a1787ae22 [DebugCounter] Make -debug-counter cl::Hidden.
llvm-svn: 328948
2018-04-01 22:16:52 +00:00
Craig Topper f5730c38e9 [LegacyPassManager] Make 'print-module-scope' cl::Hidden like the rest of the printing options.
llvm-svn: 328947
2018-04-01 21:54:26 +00:00
Craig Topper 9f834810ea [X86] Give ADC8/16/32/64mi the same scheduling information as ADC8/16/32/64mr and SBB8/16/32/64mi.
It doesn't make a lot of sense that it would be different.

llvm-svn: 328946
2018-04-01 21:54:24 +00:00
Chandler Carruth 4244625c51 [x86] Correct the operand structure of the ADOX instruction.
This also moves to define it in the same way as ADCX which seems to use
constraints a bit better.

This is pulled out of the review for reducing the use of popf for
restoring EFLAGS, but is independent. There are still more problems with
our definitions for these instructions that Craig is going to look at
but this is at least less broken and he can start from this to improve
them more fully.

Thanks to Craig for the review here.

llvm-svn: 328945
2018-04-01 21:53:18 +00:00
Chandler Carruth 06b343c6ed [x86] Expose more of the condition conversion routines in the public API
for X86's instruction information. I've now got a second patch under
review that needs these same APIs. This bit is nicely orthogonal and
obvious, so landing it. NFC.

llvm-svn: 328944
2018-04-01 21:47:55 +00:00
Mandeep Singh Grang 8db564e033 [tools] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: JDevlieghere, zturner, echristo, dberris, friss

Reviewed By: echristo

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D45141

llvm-svn: 328943
2018-04-01 21:24:53 +00:00
Mandeep Singh Grang ba8033be4d [include] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: echristo, zturner, mzolotukhin, lhames

Reviewed By: echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45135

llvm-svn: 328940
2018-04-01 18:39:50 +00:00
Nicolai Haehnle 4254d45a79 AMDGPU: Make isIntrinsicSourceOfDivergence table-driven
Summary:
This is in preparation for the new dimension-aware image intrinsics,
which I'd rather not have to list here by hand.

Change-Id: Iaa16e3a635a11283918ce0d9e1e618591b0bf6fa

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44938

llvm-svn: 328939
2018-04-01 17:09:14 +00:00
Nicolai Haehnle 5d0d30304c AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsics
Summary:
Avoids having to list all intrinsics manually.

This is in preparation for the new dimension-aware image intrinsics,
which I'd rather not have to list here by hand.

Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44937

llvm-svn: 328938
2018-04-01 17:09:07 +00:00
Nicolai Haehnle 398c0b6701 TableGen: Support Intrinsic values in SearchableTable
Summary:
We will use this in the AMDGPU backend in a subsequent patch
in the stack to lookup target-specific per-intrinsic information.

The generic CodeGenIntrinsic machinery is used to ensure that,
even though we don't calculate actual enum values here, we do
get the intrinsics in the right order for the binary search
index.

Change-Id: If61cd5587963a4c5a1cc53df1e59c5e4dec1f9dc

Reviewers: arsenm, rampitec, b-sumner

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D44935

llvm-svn: 328937
2018-04-01 17:08:58 +00:00
Nicolai Haehnle 24e3a4d6e9 TableGen: More helpful error messages
Summary: Change-Id: I3c23f6f6597912423762780cd8c5315870412bbe

Reviewers: arsenm, rampitec, b-sumner

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D44936

Change-Id: Ie62614a3e2d7774f46e4034478b28f57100a2c92
llvm-svn: 328936
2018-04-01 17:08:49 +00:00
Mandeep Singh Grang fe1d28e83d [DebugInfo] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: echristo, zturner, samsonov

Reviewed By: echristo

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45134

llvm-svn: 328935
2018-04-01 16:18:49 +00:00
Teresa Johnson 974706ebf7 [ThinLTO] Add an import cutoff for debugging/triaging
Summary:
Adds -import-cutoff=N which will stop importing during the thin link
after N imports. Default is -1 (no  limit).

Reviewers: wmi

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D45127

llvm-svn: 328934
2018-04-01 15:54:40 +00:00
David Green f80ebc8d21 [LoopRotate] Rotate loops with loop exiting latches
If a loop has a loop exiting latch, it can be profitable
to rotate the loop if it leads to the simplification of
a phi node. Perform rotation in these cases even if loop
rotate itself didnt simplify the loop to get there.

Differential Revision: https://reviews.llvm.org/D44199

llvm-svn: 328933
2018-04-01 12:48:24 +00:00
Craig Topper 9b8cd5fe55 [X86] Don't check for folding into a store when deciding if we can promote an i16 mul.
There's no RMW mul operation.

llvm-svn: 328931
2018-04-01 06:29:32 +00:00
Craig Topper db6caabccc [X86] Check if the load and store are to the same pointer before preventing i16 RMW shifts and subtracts from being promoted.
llvm-svn: 328930
2018-04-01 06:29:28 +00:00
Craig Topper 3998041e80 [X86] Add test case to show failure to promote i16 subtract when the LHS is a load and the result is stored to a different address.
We mistakenly believe we might be able to fold this as a RMW operation, but that doesn't end up happening.

llvm-svn: 328929
2018-04-01 06:29:27 +00:00
Craig Topper ae2de57db0 [X86] Allow i16 subtracts to be promoted if the load is on the LHS and its not being stored.
llvm-svn: 328928
2018-04-01 06:29:25 +00:00
Craig Topper 280f631350 [X86] Add test case to show failure to promote i16 subtract because we mistakenly believe the load can be folded. NFC
The left hand side of the subtract is a load, but we cna't fold those unless we also have a store.

llvm-svn: 328927
2018-04-01 06:29:23 +00:00
Craig Topper 9bc0d881a3 [X86] Remove unneeded temporary variable. NFC
This Promote flag was alwasys set to true except in the default case. But in the default case we don't need to set PVT and can just return false.

llvm-svn: 328926
2018-04-01 06:29:21 +00:00
Mandeep Singh Grang 97bcade70f [Analysis] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer D44363 for a list of all the required patches.

Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon

Reviewed By: dexonsmith

Subscribers: david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D44944

llvm-svn: 328925
2018-04-01 01:46:51 +00:00
Sanjay Patel 6124cae8f7 [DAGCombine] (float)((int) f) --> ftrunc (PR36617)
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, 
so replace a pair of casts with the equivalent node. We don't have to account for 
special cases (NaN, INF) because out-of-range casts are undefined.

Differential Revision: https://reviews.llvm.org/D44909

llvm-svn: 328921
2018-03-31 17:55:44 +00:00
Lang Hames 9c755450ef [llvm-rtdyld] Fix the InputFileList cl::opt description: it accepts multiple
input files.

llvm-svn: 328920
2018-03-31 16:01:01 +00:00
Simon Pilgrim 3b8ad346f9 [X86][Btver2] Add MMX_PSHUFB to the JWritePSHUFB InstRW entries
llvm-svn: 328918
2018-03-31 09:15:54 +00:00
Simon Pilgrim 8c8ebd7945 Fix trailing whitespace. NFCI.
llvm-svn: 328917
2018-03-31 09:14:14 +00:00
Benjamin Kramer 824f36edff Unbreak the build of the go bindings after r328839.
llvm-svn: 328916
2018-03-31 07:41:25 +00:00
Puyan Lotfi 57c4f38c35 [MIR-Canon] Adding support for local idempotent instruction hoisting.
llvm-svn: 328915
2018-03-31 05:48:51 +00:00
Craig Topper 13a0f83a05 [X86] Add SchedRW for PMULLD
Summary:
It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput.

This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet.

I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs.

Reviewers: RKSimon, GGanesh, courbet

Reviewed By: RKSimon

Subscribers: gchatelet, gbedwell, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D44972

llvm-svn: 328914
2018-03-31 04:54:32 +00:00
Teresa Johnson db83aceb06 [ThinLTO] Add an option to force summary call edges cold for debugging
Summary:
Useful to selectively disable importing into specific modules for
debugging/triaging/workarounds.

Reviewers: eraman

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D45062

llvm-svn: 328909
2018-03-31 00:18:08 +00:00
Fangrui Song 956ee79795 Fix a bunch of typoes. NFC
llvm-svn: 328907
2018-03-30 22:22:31 +00:00
Ekaterina Romanova 0b01dfbba6 Prevent data races in concurrent ThinLTO processes.
Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache files corruption).

1. Place temp file to the same place where the caching directory is (instead of creating it the directory pointed to by TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to non-atomic "direct" write to the cache file.
2. if rename failed do not write to the cache file directly (direct write to the file is non-atomic and could cause data race conditions).
3. if cache file doesn't exist (e.g., because 'rename' failed or because some other reasons), bypass using the cache altogether.

Differential Revision:  https://reviews.llvm.org/D45076

llvm-svn: 328904
2018-03-30 21:35:42 +00:00
Jacob Gravelle 40926451d2 [WebAssembly] Register wasm passes with the PassRegistry
Summary:
This exposes WebAssembly passes for use on the command line (as
arguments to -print-before and the like).

Reviewers: dschuff, sunfish

Subscribers: MatzeB, jfb, sbc100, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D45103

llvm-svn: 328901
2018-03-30 20:36:58 +00:00
Krzysztof Parzyszek 526fbf8e33 [Hexagon] Fix testcase
llvm-svn: 328899
2018-03-30 19:46:28 +00:00
Krzysztof Parzyszek 74096f7258 [Hexagon] Reduce excessive indentation in .s output
llvm-svn: 328898
2018-03-30 19:30:28 +00:00
Krzysztof Parzyszek 0f983d69a4 [Hexagon] Avoid creating invalid offsets in packetizer
Two memory instructions with a dependency only on the address register
between the two (the first one of them being post-incrememnt) can be
packetized together after the offset on the second was updated to the
incremement value. Make sure that the new offset is valid for the
instruction.

llvm-svn: 328897
2018-03-30 19:28:37 +00:00
Andrea Di Biagio dc97172b2f [X86][BtVer2] Fixed the number of micro opcodes for AVX vector converts and
VSQRT instructions.

There were still a few AVX instructions with an incorrect number of opcodes.
These should be fixed now.

llvm-svn: 328892
2018-03-30 18:53:47 +00:00
Peter Collingbourne d03bf12c1b DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped
This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan.

This change resolves bug 36314.

Patch by Sam Kerner!

Differential Revision: https://reviews.llvm.org/D44784

llvm-svn: 328890
2018-03-30 18:37:55 +00:00
Puyan Lotfi 399b46c98d [MIR] Adding support for Named Virtual Registers in MIR.
llvm-svn: 328887
2018-03-30 18:15:54 +00:00
Andrea Di Biagio 3eaa26bb64 [X86][BtVer2] Fix the number of uOps for horizontal operations.
llvm-svn: 328886
2018-03-30 18:15:30 +00:00
Tim Shen 8f9f026965 [NVPTX] Enable StructuredCFG for NVPTX
Summary:
Make NVPTX require structured CFG. Added a temporary flag to
"roll back" the behavior for easy deployment.

Combined with D45008, this fixes several internal Nvidia GPU test
failures that we suspect to be ptxas miscompiles (PR27738).

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D45070

llvm-svn: 328885
2018-03-30 17:51:03 +00:00
Tim Shen 1a8c6776a3 [BlockPlacement] Disable block placement tail duplciation in structured CFG.
Summary:
Tail duplication easily breaks the structure of CFG, e.g. duplicating on
a region entry. If the structure is intended to be preserved, then we
may want to configure tail duplication, or disable it for structured
CFG. From our benchmark results disabling it doesn't cause performance
regression.

Notice that this currently affects AMDGPU backend. In the next patch, I
also plan to turn on requiresStructuredCFG for NVPTX.

All unit tests still pass.

Reviewers: jlebar, arsenm

Subscribers: jholewinski, sanjoy, wdng, tpr, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45008

llvm-svn: 328884
2018-03-30 17:51:00 +00:00
Robert Widmann 478fce9ebf [LLVM-C] Finish exception instruction bindings - Round 2
Summary:
Previous revision caused a leak in the echo test that got caught by the ASAN bots because of missing free of the handlers array and was reverted in r328759.  Resubmitting the patch with that correction.

Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.

Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits, vlad.tsyrklevich

Differential Revision: https://reviews.llvm.org/D45100

llvm-svn: 328883
2018-03-30 17:49:53 +00:00
Zachary Turner ce5b834abf Fix some signed / unsigned conversion problems.
llvm-svn: 328881
2018-03-30 17:28:35 +00:00
Zachary Turner d5cf5cf637 [llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining.
This will show more detail when using `llvm-pdbutil explain` on an
offset in the DBI or PDB streams.  Specifically, it will dig into
individual header fields and substreams to give a more precise
description of what the byte represents.

llvm-svn: 328878
2018-03-30 17:16:50 +00:00
Derek Schuff a2726e9ab6 [WebAssembly] Refactor tablegen for store instructions (NFC)
Summary: Add patterns similar to loads.

Differential Revision: https://reviews.llvm.org/D45064

llvm-svn: 328876
2018-03-30 17:02:50 +00:00
Krzysztof Parzyszek fce30c2ba3 Revert "peel loops with runtime small trip counts"
This reverts commit r328854, it breaks some Hexagon tests.

llvm-svn: 328875
2018-03-30 16:55:44 +00:00
Stanislav Mekhanoshin 74e2974ac6 [AMDGPU] Fixed some instructions latencies
Differential Revision: https://reviews.llvm.org/D45073

llvm-svn: 328874
2018-03-30 16:19:13 +00:00
Sanjay Patel e09b7dcf3d [SelectionDAG] Removing FABS folding from DAGCombiner
The code has bugs dealing with -0.0.

Since D44550 introduced FABS pattern folding in InstCombine, 
this patch removes the now-redundant code that causes 
https://bugs.llvm.org/show_bug.cgi?id=36600.

Patch by Mikhail Dvoretckii!

Differential Revision: https://reviews.llvm.org/D44683

llvm-svn: 328872
2018-03-30 15:42:52 +00:00
Krzysztof Parzyszek 4f99836a9e [Hexagon] Recognize and handle :endloop01
llvm-svn: 328870
2018-03-30 15:29:47 +00:00
Krzysztof Parzyszek 46abcb236b [Hexagon] Fix printing :mem_noshuf on compiler-generated packets
llvm-svn: 328869
2018-03-30 15:09:05 +00:00
Krzysztof Parzyszek 71731fab24 [Hexagon] Fix flags for store-related intrinsics
llvm-svn: 328868
2018-03-30 14:57:01 +00:00
Andrea Di Biagio 073a9d74ca [X86][BtVer2] Add missing ReadAfterLd to RM variants of AVX horizontal adds and
most vector logic instructions.

Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input
operand.

llvm-svn: 328867
2018-03-30 14:48:08 +00:00
Krzysztof Parzyszek 3f55ad8fae [Hexagon] Remove unused scheduling classes
llvm-svn: 328866
2018-03-30 14:34:32 +00:00
Andrea Di Biagio 42d8ea22c0 [X86][BtVer2] Add tests that show how ReadAfterLd is missing for some
instructions.

In the Btver2 model, there are a few InstRW overrides that don't specify a
ReadAfterLd for the register input operand.

As a result, a few AVX variants of horizontal operations and most vector logic
operations with a folded memory operand don't have a ReadAdvance info associated
to their input register operands.

llvm-svn: 328865
2018-03-30 14:29:33 +00:00
Krzysztof Parzyszek 1ca23d9837 [Hexagon] Pass pointer to SelectionDAG to dump functions
llvm-svn: 328864
2018-03-30 14:29:15 +00:00
Andrea Di Biagio 01043625cf [X86] Add llvm-mca tests for r328834.
Verify that the ReadAfterLd is correctly applied to FMA and 4-ops variable blend
instructions.

As Craig pointed out in D44726, some Intel models still have to be fixed.

llvm-svn: 328861
2018-03-30 13:38:37 +00:00
Andrea Di Biagio 0823090843 [X86] Add tests to verify the presence of "ReadAfterLd" after r328823.
This change adds a couple of tests to verify the change introduced by revision
328823 ([X86] Correct the placement of ReadAfterLd in BEXTR and BZHI).

llvm-svn: 328859
2018-03-30 11:44:48 +00:00
Vlad Tsyrklevich 894c028d56 Revert "[LLVM-C] Finish exception instruction bindings"
This reverts commit r328759. It was causing LSan failures on sanitizer-x86_64-linux-bootstrap

llvm-svn: 328858
2018-03-30 06:21:28 +00:00
Michael Bedy 59e5ef793c [AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.
Summary:
The phase attempts to transform operations that extract a portion of a value
into an SDWA src operand in cases where that value is used only once. It
was not prepared for this use to be the preserved portion of a value for
dst:UNUSED_PRESERVE, resulting in a crash or assert.

This change either rejects the illegal SDWA attempt, or in the case where
dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded
extract instruction.

Reviewers: arsenm, #amdgpu

Reviewed By: arsenm, #amdgpu

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44364

llvm-svn: 328856
2018-03-30 05:03:36 +00:00
Ikhlas Ajbar a7d614c3e5 [Hexagon] add missing lit config file
llvm-svn: 328855
2018-03-30 03:32:24 +00:00
Ikhlas Ajbar 66c8ba5a50 peel loops with runtime small trip counts
For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 328854
2018-03-30 03:05:34 +00:00
Eli Friedman 208fe67a78 [MachineCopyPropagation] Handle COPY with overlapping source/dest.
MachineCopyPropagation::CopyPropagateBlock has a bunch of special
handling for COPY instructions. This handling assumes that COPY
instructions do not modify the source of the copy; this is wrong if
the COPY destination overlaps the source.

To fix the bug, check explicitly for this situation, and fall back to
the generic instruction handling.

This bug can't happen for most register classes because they don't
have this sort of overlap, but there are a few register classes
where this is possible. The testcase uses the AArch64 QQQQ register
class.

Differential Revision: https://reviews.llvm.org/D44911

llvm-svn: 328851
2018-03-30 00:56:03 +00:00
Eugene Zelenko 7fb5d41e44 [IR] Fix some Clang-tidy modernize-use-auto warnings; other minor fixes (NFC).
llvm-svn: 328850
2018-03-30 00:47:31 +00:00
Rafael Espindola 4b4d85fd4d Style update. NFC.
Rename 3 functions to start with lowercase letters. Don't repeat the
name in the comments.

llvm-svn: 328848
2018-03-29 23:32:54 +00:00
David Blaikie f423062aff Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation
llvm-svn: 328844
2018-03-29 22:42:08 +00:00
David Blaikie 7883340331 Remove unused header to fix layering.
llvm-svn: 328842
2018-03-29 22:35:59 +00:00
David Blaikie 4778bb88ef Remove unused headers to fix layering
llvm-svn: 328840
2018-03-29 22:31:39 +00:00
David Blaikie c90289b5d3 llvm-c: Split Utils out of Scalar.h
To fix layering (so that Scalar.h, a libScalarOpts header, isn't
included from Utils - which libScalarOpts depends on).

llvm-svn: 328839
2018-03-29 22:31:38 +00:00
David Blaikie bd0c88078a Remove some unneeded #includes to fix layering
llvm-svn: 328838
2018-03-29 22:31:36 +00:00
Craig Topper ee3c19fd7f [X86] Add ReadAfterLds to some 3 src instructions
Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first.

I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already.

Differential Revision: https://reviews.llvm.org/D44726

llvm-svn: 328834
2018-03-29 22:03:05 +00:00
Eric Christopher dd4baff48d Typo fix: epilouge->epilogue. NFC.
llvm-svn: 328833
2018-03-29 21:59:04 +00:00
Matt Arsenault efd1b30436 AMDGPU: Fix build warning in release
llvm-svn: 328832
2018-03-29 21:44:44 +00:00
Matt Arsenault 03ae399d50 AMDGPU: Support realigning stack
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.

I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.

llvm-svn: 328831
2018-03-29 21:30:06 +00:00
Evgeniy Stepanov 50635dab26 Add msan custom mapping options.
Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan.
As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html

Patch by vit9696(at)avp.su.

Differential Revision: https://reviews.llvm.org/D44926

llvm-svn: 328830
2018-03-29 21:18:17 +00:00
Craig Topper 3f2dbec652 [X86] Remove ReadAfterLd from BMI and TBM instructions that don't have a register operand in their memory form
The memory form of these instructions only read an input from memory. They don't have any register operands.

Differential Revision: https://reviews.llvm.org/D44836

llvm-svn: 328828
2018-03-29 21:03:53 +00:00
Kevin Enderby 4129308413 Try to fix sanitizer-x86_64-linux-fast bot due to change in r328820.
llvm-svn: 328824
2018-03-29 20:49:24 +00:00
Craig Topper 89310f56c8 [X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI.
These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd

Differential Revision: https://reviews.llvm.org/D44838

llvm-svn: 328823
2018-03-29 20:41:39 +00:00
Philip Reames 5c14ed89f6 [NFC][LICM] Rearrange checks to have the cheap bail out first
llvm-svn: 328822
2018-03-29 20:32:15 +00:00
Matt Arsenault ffb132e74b AMDGPU: Increase default stack alignment
8 and 16-byte values are common, so increase the default
alignment to avoid realigning the stack in most functions.

llvm-svn: 328821
2018-03-29 20:22:04 +00:00
Kevin Enderby d9911f6f7b For llvm-nm and Mach-O files that are fully stripped, special case a redacted LC_MAIN
As a further refinement on:

r328274 - For llvm-nm and Mach-O files also use function starts info in some cases when printing symbols

we want to special case a redacted LC_MAIN so it is easier to find.

rdar://38978929

llvm-svn: 328820
2018-03-29 20:04:29 +00:00
Matt Arsenault 6c041a3cab AMDGPU: Fix selection error on constant loads with < 4 byte alignment
llvm-svn: 328818
2018-03-29 19:59:28 +00:00
Philip Reames e4b728e82b Fix an accidental circular dependence
llvm-svn: 328816
2018-03-29 19:22:12 +00:00
Mandeep Singh Grang 10d8b85570 [Mips] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: sdardis, RKSimon, dsanders, atanasyan

Reviewed By: atanasyan

Subscribers: atanasyan, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D44869

llvm-svn: 328815
2018-03-29 19:05:26 +00:00
Paul Robinson 407ff1b1cd Try to fix a couple tests for Windows.
llvm-svn: 328814
2018-03-29 18:59:33 +00:00
Dinar Temirbulatov c326c1c582 [SLPVectorizer] Add tests related to PR30787, NFCI.
llvm-svn: 328813
2018-03-29 18:57:03 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Zachary Turner f4b6dcf6af [PDB] Print some more details when explaining MSF fields.
When we determine that a field belongs to an MSF super block or
the free page map, we wouldn't print any additional information.

With this patch, we now print the value of the field (for super
block fields) or the allocation status of the specified byte (in
the case of offsets in the FPM).

llvm-svn: 328808
2018-03-29 17:45:34 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
Paul Robinson b271f31d8d Reapply "[DWARFv5] Emit file 0 to the line table."
DWARF v5 specifies that the root file (also given in the DW_AT_name
attribute of the compilation unit DIE) should be emitted explicitly to
the line table's list of files.  This makes the line table more
independent of the .debug_info section.
We emit the new syntax only for DWARF v5 and later.

Fixes the bug found by asan. Also XFAIL the new test for Darwin, which
is stuck on DWARF v2, and fix up other tests so they stop failing on
Windows.  Last but not least, don't break "clang -g" of an assembler
file that has .file directives in it.

Differential Revision: https://reviews.llvm.org/D44054

llvm-svn: 328805
2018-03-29 17:16:41 +00:00
Zachary Turner 1b20416bfa [PDB] Fix a bug in the explain subcommand.
We were trying to dig into the super block fields and print a
description of the field at the specified offset, but we were
printing the wrong field due to an off-by-one-field-error.

llvm-svn: 328804
2018-03-29 17:11:14 +00:00
David Zarzycki b458329327 [ADT] NFC: Fix bogus StringSwitch rule-of-five boilerplate
Now that 'Str' is constant, the rule-of-file logic needs updating.

Reported by: vit9696@avp.su
Reviewed by: jordan_rose@apple.com

llvm-svn: 328803
2018-03-29 16:51:28 +00:00
Zachary Turner db0f2f68b0 Remove unused function.
llvm-svn: 328802
2018-03-29 16:46:47 +00:00
Zachary Turner ea40f40e1b [PDB] Add an explain subcommand.
When investigating various things, we often have a file offset
and what to know what's in the PDB at that address.  For example
we may be doing a binary comparison of two LLD-generated PDBs
to look for sources of non-determinism, or we may wish to compare
an LLD-generated PDB with a Microsoft generated PDB for sources
of byte-for-byte incompatibility.  In these cases, we can do a
binary diff of the two files, and once we find a mismatched byte
we can use explain to figure out what that byte is, immediately
honining in on the problem.

This patch implements this by trying to narrow the meaning of
a particular file offset down as much as possible.

Differential Revision: https://reviews.llvm.org/D44959

llvm-svn: 328799
2018-03-29 16:28:20 +00:00
Haicheng Wu c7cc87922e [JumpThreading] Don't select an edge that we know we can't thread
In r312664 (D36404), JumpThreading stopped threading edges into
loop headers. Unfortunately, I observed a significant performance
regression as a result of this change. Upon further investigation,
the problematic pattern looked something like this (after
many high level optimizations):

while (true) {
    bool cond = ...;
    if (!cond) {
        <body>
    }
    if (cond)
        break;
}

Now, naturally we want jump threading to essentially eliminate the
second if check and hook up the edges appropriately. However, the
above mentioned change, prevented it from doing this because it would
have to thread an edge into the loop header.

Upon further investigation, what is happening is that since both branches
are threadable, JumpThreading picks one of them at arbitrarily. In my
case, because of the way that the IR ended up, it tended to pick
the one to the loop header, bailing out immediately after. However,
if it had picked the one to the exit block, everything would have
worked out fine (because the only remaining branch would then be folded,
not thraded which is acceptable).

Thus, to fix this problem, we can simply eliminate loop headers from
consideration as possible threading targets earlier, to make sure that
if there are multiple eligible branches, we can still thread one of
the ones that don't target a loop header.

Patch by Keno Fischer!
Differential Revision: https://reviews.llvm.org/D42260

llvm-svn: 328798
2018-03-29 16:01:26 +00:00
Pavel Labath ea0f841c3b .debug_names: Correctly align the AugmentationStringSize field
We should align the value of the field, not the overall section offset.

This distinction matters if one of the debug_names contributions is not
of size which is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.

llvm-svn: 328796
2018-03-29 15:12:45 +00:00
Andrea Di Biagio 0a837ef6b1 [llvm-mca] Correctly set the ReadAdvance information for register use operands.
The tool was passing the wrong operand index to method
MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and
not the operand index. This was found when testing X86 code where instructions
had a memory folded operand.

This patch fixes the issue and adds test read-advance-1.s to ensure that
the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used.

llvm-svn: 328790
2018-03-29 14:26:56 +00:00
Krzysztof Parzyszek dc7a557e6a [Hexagon] Add support to handle bit-reverse load intrinsics
Patch by Sumanth Gundapaneni.

llvm-svn: 328774
2018-03-29 13:52:46 +00:00
Pavel Labath 2d1fc4375f .debug_names: Parse DW_IDX_die_offset as a reference
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.

This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.

llvm-svn: 328773
2018-03-29 13:47:57 +00:00
Sjoerd Meijer 4f8f1e5115 [Kaleidoscope] Tiny typo fixes
Fixes for "lets" references which should be "let's" in the Kaleidoscope
tutorial.

Patch by: Robin Dupret

Differential Revision: https://reviews.llvm.org/D44990

llvm-svn: 328772
2018-03-29 12:31:06 +00:00
Simon Pilgrim 71c5f3fffd [X86][SSE] Don't bother re-adding combined target shuffles to the work list
We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain.

Found while investigating adding SimplifyDemandedVectorElts to target shuffles.

Differential Revision: https://reviews.llvm.org/D44942

llvm-svn: 328771
2018-03-29 11:18:41 +00:00
Sylvestre Ledru f22ebb7599 Rename llvm library from libLLVM-X.Y to libLLVM-X
Summary:
As we are only doing X.0.Z releases (not using the minor version), there is no need to keep -X.Y in the version.

Like patch https://reviews.llvm.org/D41808, I propose that we rename libLLVM-7.0svn.so to libLLVM-7svn.so 
This patch will also rename downstream libraries like liblldb-7.0 to liblldb-7

Reviewers: axw, beanz, dim, hans

Reviewed By: dim, hans

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D41869

llvm-svn: 328768
2018-03-29 09:44:09 +00:00
Simon Dardis 32a27fc77a [Mips] Remove dead code
I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code.

Patch By: Wei-Ren Chen.

Reviewers: ehostunreach, sdardis

Differential Revision: https://reviews.llvm.org/D44867

llvm-svn: 328767
2018-03-29 09:21:20 +00:00
David Green b0aa36f9c2 [LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface
The existing LoopRotation.cpp is implemented as one of loop passes instead of
being a utility. The user cannot easily perform the loop rotation selectively
(or on demand) under different optimization level. For example, the loop
rotation is needed as part of the logic to convert a loop into a loop with
bottom test for a transformation. If the loop rotation is simply added as a
loop pass before the transformation, the pass is skipped if it is compiled at
–O0 or if it is explicitly disabled by the user, causing the compiler to
generate incorrect code. Furthermore, as a loop pass it will rotate all loops
instead of just the relevant loops.

We provide a utility interface for the loop rotation so that the loop rotation
can be called on demand. The changeset is as follows:

- Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main
  implementation of class LoopRotate into this file.
- Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the
  interface LoopRotation(...).
- Original LoopRotation.cpp is changed to use the utility function LoopRotation
  in LoopRotationUtils.cpp. This is done in the same way community did for
  mem-to-reg implementation.

Patch by Jin Lin!

Differential Revision: https://reviews.llvm.org/D44595

llvm-svn: 328766
2018-03-29 08:48:15 +00:00
Benjamin Kramer 6b995a4a7e [Transforms] Make sure to include the c binding header when defining c binding functions
Otherwise the definitions can't see the extern C declarations and get
name mangled, making it impossible for users to call them. This breaks
the Go bindings.

llvm-svn: 328765
2018-03-29 07:56:53 +00:00
Max Kazantsev 18f93894db [NFC] Fix meaningless assert in SCEV
llvm-svn: 328764
2018-03-29 07:54:59 +00:00
Craig Topper a21758fa2c [X86] Don't pass getRegisterName from the InstPrinters into EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC
The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places.

llvm-svn: 328762
2018-03-29 04:14:04 +00:00
Robert Widmann 6775f52fe0 [LLVM-C] Finish exception instruction bindings
Summary:
Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.

Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.

Reviewers: whitequark, deadalnix, echristo

Reviewed By: echristo

Subscribers: llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D44496

llvm-svn: 328759
2018-03-29 03:43:15 +00:00
Craig Topper 7456af88f4 [X86] Rename RIi64_NOREX tblgen class to just Ii64. Make RIi64 inherit from it. NFC
This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place.

llvm-svn: 328757
2018-03-29 03:14:57 +00:00
Craig Topper 7441ffff84 [X86] Cleanup inheritance of the X86InstrFormats.td classes. NFC
EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V.

llvm-svn: 328756
2018-03-29 03:14:56 +00:00
George Burgess IV af0b06f4fd [MemorySSA] Turn an assert into a condition
Eli pointed out that variadic functions are totally a thing, so this
assert is incorrect.

No test-case is provided, since the only way this assert fires is if a
specific DenseMap falls back to doing `isEqual` checks, and that seems
fairly brittle (and requires a pyramid of growing
`call void (i8, ...) @varargs(i8 0)`).

llvm-svn: 328755
2018-03-29 03:12:03 +00:00
George Burgess IV 3588fd4865 [MemorySSA] Consider callsite args for hashing and equality.
We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of
prior work when optimizing uses in MemorySSA. Because we weren't
accounting for callsite arguments in either the hash code or equality
tests for `MemoryLocOrCall`s, we optimized uses too aggressively in
some rare cases.

Fix by Daniel Berlin.

Should fix PR36883.

llvm-svn: 328748
2018-03-29 00:54:39 +00:00
David Blaikie b3f471a4bd Remove some unused includes to fix layering.
llvm-svn: 328745
2018-03-29 00:29:45 +00:00
David Blaikie 3d94e9f320 Split Disassembler.h in two to fix dependencies
Support includes this header for the typedefs - but logically it's part
of the MC/Disassembler library that implements the functions. Split the
header so as not to create a circular dependency.

This is another case where probably inverting the llvm-c implementation
might be best (rather than core llvm libraries implementing the parts of
llvm-c - instead llvm-c could be its own library, depending on all the
parts of LLVM's core libraries to then implement llvm-c on top of
them... if that makes sense)

llvm-svn: 328744
2018-03-29 00:29:44 +00:00
David Blaikie 10f71304b7 Add missing dependency (headers are included from MC, so a link dependency could exist easily enough)
llvm-svn: 328743
2018-03-29 00:29:43 +00:00
David Blaikie 8ad9a97310 Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency
Thanks to echristo for the pointers on direction.

llvm-svn: 328737
2018-03-28 22:28:50 +00:00
Craig Topper aac23d7881 [X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from scheduler regexs.
Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end.

I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future.

llvm-svn: 328730
2018-03-28 20:40:24 +00:00
Jun Bum Lim f90fe701ef [PostRAMachineSink] preserve CFG
Summary: Mark CFG is preserved  since this pass do not make any change in CFG.

Reviewers: sebpop, mzolotukhin, mcrosier

Reviewed By: mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44845

llvm-svn: 328727
2018-03-28 19:56:26 +00:00
Krzysztof Parzyszek 440ba3ae5c [Hexagon] Add support for "new" circular buffer intrinsics
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" versions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.

We need to use pseudo instructions for these instructions until
after register allocation. The problem is that these instructions
allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for
the CSx set-up until after register allocation when the Mx
register has been fixed for the instruction.

There is a related clang patch.

Patch by Brendon Cahoon.

llvm-svn: 328724
2018-03-28 19:38:29 +00:00
David Blaikie eb8cc04ea2 Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back
llvm-svn: 328720
2018-03-28 18:03:25 +00:00
Jessica Paquette 4aa14dbcc2 [MachineOutliner] Simplify call outlining + require valid callee save info for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.

llvm-svn: 328719
2018-03-28 17:52:31 +00:00
David Blaikie a373d18eb7 Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717
2018-03-28 17:44:36 +00:00
Peter Collingbourne d579c31d68 [llvm-ar] Support multiple dashed options
This allows syntax like:
$ llvm-ar -c -r -u file.a file.o

This is in addition to the other formats that are already supported:
$ llvm-ar cru file.a file.o
$ llvm-ar -cru file.a file.o

Patch by Tom Anderson!

Differential Revision: https://reviews.llvm.org/D44452

llvm-svn: 328716
2018-03-28 17:21:14 +00:00
Simon Pilgrim 7237e0cf39 [X86][AVX2] Add shuffle test case from PR36933
llvm-svn: 328714
2018-03-28 16:48:48 +00:00
Dmitry Preobrazhensky 622bde8bc7 [AMDGPU][MC] Added ds_add_src2_f32
See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833

Differential Revision: https://reviews.llvm.org/D44779

Reviewers: arsenm, artem.tamazov, timcorringham
llvm-svn: 328713
2018-03-28 16:21:56 +00:00
Lang Hames da5c6acfe9 [ORC] Restore the narrower check from before r328687.
This should get the builders green again while I investigate why r328706 was
insufficient.

llvm-svn: 328711
2018-03-28 15:58:14 +00:00
Dmitry Preobrazhensky 2456ac696a [AMDGPU][MC] Added PCK variants of image load/store instructions
See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834

Differential Revision: https://reviews.llvm.org/D44795

Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle
llvm-svn: 328710
2018-03-28 15:44:16 +00:00
Daniel Neilson 45796f6be9 [PatternMatch] Add matchers for vector operations
Summary:
There aren't any matchers for the three vector operations: insertelement, extractelement, and
shufflevector. This patch adds them as well as corresponding unit tests.

llvm-svn: 328709
2018-03-28 15:39:00 +00:00
Dmitry Preobrazhensky a917e88585 [AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x
See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835

Differential Revision: https://reviews.llvm.org/D44825

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328707
2018-03-28 14:53:13 +00:00
Lang Hames ec978e2226 [ORC] Re-add the Windows check that was dropped in r328687.
This check prevents the ORC execution tests from running on Windows (which is
not supported yet).

This should fix the windows bots.

llvm-svn: 328706
2018-03-28 14:47:11 +00:00
Dmitry Preobrazhensky dd2b929ffb [AMDGPU][MC][GFX9] Added s_scratch* instructions
See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836

Differential Revision: https://reviews.llvm.org/D44832

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328704
2018-03-28 14:08:03 +00:00
Dan Liew 8ade9e75b0 Revert "[lit] Temporarily disable shtest-timeout.py on darwin"
This reverts commit 771829b640a5494ab65c810dd6b4330522bf3a33 (rr328598)

Hopefully the test will now pass on the bots.

rdar://problem/38774530

llvm-svn: 328703
2018-03-28 13:55:13 +00:00
Dan Liew 7efde3c440 [lit] Remove a timing senstive part of `shtest-timeout.py`
The `shtest-timeout.py` test was failing intermittently. It looks like
the issue is that on a resource constrained system lit is unable to run
`quick_then_slow.py` twice and print out the messages the tests expects
within the one second timeout.

The underlying issue is that the test is dependent on the performance of
the host machine is a rather fragile way. This is due to hardcoding
timeout values and having assumptions that the host machine is able to
perform a certain amount of work within the hardcoded timeout values.

We could increase the timeout values but that doesn't really fix the
underlying issue. Instead this patch removes one of fragile assumptions
in the hope that this will be enough to fix the bots.
There are other fragile assumptions in this test (e.g. `quick.py` can be
executed in less than 1 second). If the bots continue to fail we'll have
to revisit this.

rdar://problem/38774530

llvm-svn: 328702
2018-03-28 13:55:08 +00:00
Simon Pilgrim b1bc6cd96b [X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCI
Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names

llvm-svn: 328701
2018-03-28 13:53:21 +00:00
Alexander Potapenko 202f809437 Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""
This reverts commit r328676.

Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang:

$ cat t.c
void foo() {}
$ clang -no-integrated-as   -c  t.c -g
/tmp/t-dcdec5.s: Assembler messages:
/tmp/t-dcdec5.s:8: Error: file number less than one
clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation)

llvm-svn: 328699
2018-03-28 12:36:46 +00:00
Andrea Di Biagio 5076b98fb9 [X86][BtVer2] Fix the number of micro opcodes for AES[ENC|DEC] and other YMM instructions.
Similar to r328694. The number of micro opcodes should be 2 for those
instructions.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328698
2018-03-28 12:12:04 +00:00
Alexander Potapenko 4e7ad0805e [MSan] Introduce ActualFnStart. NFC
This is a step towards the upcoming KMSAN implementation patch.
KMSAN is going to prepend a special basic block containing
tool-specific calls to each function. Because we still want to
instrument the original entry block, we'll need to store it in
ActualFnStart.

For MSan this will still be F.getEntryBlock(), whereas for KMSAN
it'll contain the second BB.

llvm-svn: 328697
2018-03-28 11:35:09 +00:00
Tim Renouf cdac172e2a Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.

It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.

Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a
llvm-svn: 328695
2018-03-28 11:21:07 +00:00
Andrea Di Biagio 010924e35c [X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions.
The Jaguar backend natively supports 128-bit data types. Operations on YMM
registers are split into two COPs (complex operations). Each COP consumes a slot
in the dispatch group, and in the reorder buffer.

The scheduling model for Jaguar should mark those instructions as `let
NumMicroOps = 2`.

This was found when testing AVX code for BtVer2 using llvm-mca.

llvm-svn: 328694
2018-03-28 10:49:33 +00:00
Alexander Potapenko e1d5877847 [MSan] Add an isStore argument to getShadowOriginPtr(). NFC
This is a step towards the upcoming KMSAN implementation patch.
The isStore argument is to be used by getShadowOriginPtrKernel(),
it is ignored by getShadowOriginPtrUserspace().

Depending on whether a memory access is a load or a store, KMSAN
instruments it with different functions, __msan_metadata_ptr_for_load_X()
and __msan_metadata_ptr_for_store_X().

Those functions may return different values for a single address,
which is necessary in the case the runtime library decides to ignore
particular accesses.

llvm-svn: 328692
2018-03-28 10:17:17 +00:00
Christof Douma a1e77c0e02 [ARM] Support float literals under XO
Follow up patch of r328313 to support the UseVMOVSR constraint. Removed
some unneeded instructions from the test and removed some stray
comments.

Differential Revision: https://reviews.llvm.org/D44941

llvm-svn: 328691
2018-03-28 10:02:26 +00:00
Mikael Holmen 6c062b7641 [RegisterCoalescing] Don't move COPY if it would interfere with another value
Summary:
RegisterCoalescer::removePartialRedundancy tries to hoist B = A from
BB0/BB2 to BB1:

  BB1:
       ...
  BB0/BB2:  ----
       B = A;   |
       ...      |
       A = B;   |
         |-------
         |

It does so if a number of conditions are fulfilled. However, it failed
to check if B was used by any of the terminators in BB1. Since we must
insert B = A before the terminators (since it's not a terminator itself),
this means that we could erroneously insert a new definition of B before a
use of it.

Reviewers: wmi, qcolombet

Reviewed By: wmi

Subscribers: MatzeB, llvm-commits, sdardis

Differential Revision: https://reviews.llvm.org/D44918

llvm-svn: 328689
2018-03-28 06:01:30 +00:00
Lang Hames a95b0df5ed [ORC] Fix ORC on platforms without indirection support.
Previously this crashed because a nullptr (returned by
createLocalIndirectStubsManagerBuilder() on platforms without
indirection support) functor was unconditionally invoked.

Patch by Andres Freund. Thanks Andres!

llvm-svn: 328687
2018-03-28 03:41:45 +00:00
Sanjay Patel bb33007b25 [AArch64] add ftrunc tests; NFC
As suggested in D44909.

llvm-svn: 328683
2018-03-28 00:56:00 +00:00
Sanjay Patel 594c1546f1 [PowerPC] add ftrunc vector tests; NFC
Baseline tests for vectors as suggested in D44909.

llvm-svn: 328682
2018-03-28 00:49:12 +00:00
Heejin Ahn 37307b450f [WebAssembly] Add exception and selector intrinsics
Summary:
Since wasm EH does not use landingpad instructions, these instructions
provide exception pointer and selector values until we lower them in
WasmEHPrepare.

Reviewers: jgravelle-google

Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D44930

llvm-svn: 328678
2018-03-27 23:37:07 +00:00
Matt Arsenault bd49eccca1 AMDGPU: Really implement getFrameRegister
Currently this seems to only really be used for debug
info.

llvm-svn: 328677
2018-03-27 23:26:59 +00:00
Paul Robinson 07480bd177 Reapply "[DWARFv5] Emit file 0 to the line table."
DWARF v5 specifies that the root file (also given in the DW_AT_name
attribute of the compilation unit DIE) should be emitted explicitly to
the line table's list of files.  This makes the line table more
independent of the .debug_info section.

Fixes the bug found by asan. Also XFAIL the new test for Darwin, which
is stuck on DWARF v2, and fix up other tests so they stop failing on
Windows.  Last but not least, don't break "clang -g" of an assembler
file that has .file directives in it.

Differential Revision: https://reviews.llvm.org/D44054

llvm-svn: 328676
2018-03-27 22:40:34 +00:00
Jessica Paquette 2519ee7081 [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operands
If an ADRP appears with, say, a CPI operand, we shouldn't outline it.

This moves the check for unsafe operands so that it occurs before the special-case
for ADRPs. Also add a test for outlining ADRPs.

llvm-svn: 328674
2018-03-27 22:23:48 +00:00
Tim Renouf e4208bfa5b [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).

This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.

Reviewers: kzhuravl, nhaehnle, timcorringham

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D44468

Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 328673
2018-03-27 21:35:00 +00:00
Paul Robinson 7cb26ad2ef [DWARF] Suppress split line tables more carefully.
If a given split type unit does not have source locations, don't have
it refer to the split line table.
If no split type unit refers to the split line table, don't emit the
line table at all.

This will save a little space on rare occasions, but also refactors
things a bit to improve which class is responsible for what.

Responding to review comments on r326395.

Differential Revision: https://reviews.llvm.org/D44220

llvm-svn: 328670
2018-03-27 21:28:59 +00:00
Tony Tye 01bfd6c4e5 [AMDGPU] Define code object identification string used in AMDHSA runtimes.
Differential Revision: https://reviews.llvm.org/D44718

llvm-svn: 328669
2018-03-27 21:20:46 +00:00
Tim Renouf 4db0960420 [CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value
Summary:
Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing"
broke -print-machineinstrs for us on AMDGPU, because we have custom
pseudo source values, and MIR serialization does not implement that.

This commit at least restores the functionality of -print-machineinstrs,
even if it does not properly implement the missing MIR serialization
functionality.

Differential Revision: https://reviews.llvm.org/D44871

Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66
llvm-svn: 328668
2018-03-27 21:14:04 +00:00
Sterling Augustine 33dc01861a Initialize variable added in r328617.
llvm-svn: 328667
2018-03-27 21:11:57 +00:00
Graydon Hoare b62f86a6b7 [YAML] Remove unit test of multibyte non-printable escaping that uses C++11 escapes
llvm-svn: 328665
2018-03-27 20:46:26 +00:00
Simon Pilgrim a2f26788a3 [X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes
Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer.

Differential Revision: https://reviews.llvm.org/D44924

llvm-svn: 328664
2018-03-27 20:38:54 +00:00
Wolfgang Pieb ab068eaa57 [DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address
Reviewers: dblakie, aprantl

Differential Revision: https://reviews.llvm.org/D44811

llvm-svn: 328662
2018-03-27 20:27:36 +00:00
Graydon Hoare 926cd9b837 [YAML] Escape non-printable multibyte UTF8 in Output::scalarString.
The existing YAML Output::scalarString code path includes a partial and
incorrect implementation of YAML escaping logic. In particular, the logic put
in place in rL321283 escapes non-printable bytes only if they are not part of a
multibyte UTF8 sequence; implicitly this means that all multibyte UTF8
sequences -- printable and non -- are passed through verbatim.

The simplest solution to this is to direct the Output::scalarString method to
use the standalone yaml::escape function, and this _almost_ works, except that
the existing code in that function _over_ escapes: any multibyte UTF8 sequence
is escaped, even printable ones. While this is permitted for YAML, it is also
more aggressive (and hard to read for non-English locales) than necessary,
and the entire point of rL321283 was to back off such aggressive over-escaping.

So in this change, I have both redirected Output::scalarString to use
yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to
non-printables. This preserves behaviour of any existing clients while giving
them a path to more moderate escaping should they desire.

Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun

Reviewed By: thegameg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44863

llvm-svn: 328661
2018-03-27 19:52:45 +00:00
Xin Tong 0272cb077f 80-line wrap. NFC
llvm-svn: 328660
2018-03-27 19:43:02 +00:00
Matt Arsenault 17f3338015 AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills
Before this was not done if the function had no calls in it. This
is still a possible issue with any callable function, regardless
of calls present.

llvm-svn: 328659
2018-03-27 19:42:55 +00:00
Matt Arsenault 95329f8c53 AMDGPU: Set natural stack alignment in DataLayout
Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.

llvm-svn: 328656
2018-03-27 19:26:40 +00:00
Fangrui Song fc5dabe738 [DWARF] Simplify DWARFAddressRange::contains
This transform is valid because the ranges have been validated (LowPC <= HighPC).

Differential Revision: https://reviews.llvm.org/D44772

llvm-svn: 328655
2018-03-27 19:05:02 +00:00
Rong Xu 662f38b16f [PGO] Fix branch probability remarks assert
Fixed counter/weight overflow that leads to an assertion. Also fixed the help
string for pgo-emit-branch-prob option.

Differential Revision: https://reviews.llvm.org/D44809

llvm-svn: 328653
2018-03-27 18:55:56 +00:00
Matt Arsenault 0a0c871f60 AMDGPU: Fix crash when MachinePointerInfo invalid
The combine on a select of a load only triggers for
addrspace 0, and discards the MachinePointerInfo. The
conservative default needs to be used for this.

llvm-svn: 328652
2018-03-27 18:39:45 +00:00
Matt Arsenault 126a874952 AMDGPU: Fix register name format in tests
These were changed to match the asm output name a long time ago,
although I think the old tablegenerated names still work.

llvm-svn: 328651
2018-03-27 18:39:42 +00:00
Matt Arsenault e9f3679031 AMDGPU: Fix FP restore from being reordered with stack ops
In a function, s5 is used as the frame base SGPR. If a function
is calling another function, during the call sequence
it is copied to a preserved SGPR and restored.

Before it was possible for the scheduler to move stack operations
before the restore of s5, since there's nothing to associate
a frame index access with the restore.

Add an implicit use of s5 to the adjcallstack pseudo which ends
the call sequence to preven this from happening. I'm not 100%
satisfied with this solution, but I'm not sure what else would be
better.

llvm-svn: 328650
2018-03-27 18:38:51 +00:00
Krzysztof Parzyszek 0375cd46ef [Hexagon] Implement TTI::shouldMaximizeVectorBandwidth
llvm-svn: 328648
2018-03-27 18:10:47 +00:00
Stefan Pintilie 659f040351 [Power9] Fix the resource list for the COPY instruction.
The COPY instruction was listed as a 4 cycle instruction.
It is now listed correctly as a 2 cycle ALU instruction.

llvm-svn: 328647
2018-03-27 17:51:53 +00:00
Pirama Arumuga Nainar ddd7b06842 Remap values in PromotedFloats
Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats.

Patch by Yan Luo (Yan.Luo2@synopsys.com)

Reviewers: pirama

Reviewed By: pirama

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D44872

llvm-svn: 328644
2018-03-27 17:42:36 +00:00
Artur Pilipenko ca1d849cd6 Fix a reoccuring typo in load-combine tests
%tmp = bitcast i32* %arg to i8*
   %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
-  %tmp2 = load i8, i8* %tmp, align 1
+  %tmp2 = load i8, i8* %tmp1, align 1

This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended.

llvm-svn: 328642
2018-03-27 17:33:50 +00:00
Krzysztof Parzyszek 0a15d24134 [Hexagon] Rudimentary support for auto-vectorization for HVX
This implements a set of TTI functions that the loop vectorizer uses.
The only purpose of this is to enable testing. Auto-vectorization is
disabled by default, enabled by -hexagon-autohvx.

llvm-svn: 328639
2018-03-27 17:07:52 +00:00
Rafael Auler d058b882be [AArch64] Decorate AArch64 instrs with OPERAND_PCREL
Summary:
This is a canonical way to teach objdump to print the target
symbols for branches when disassembling AArch64 code.

Reviewers: evandro, t.p.northover, espindola

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D44851

llvm-svn: 328638
2018-03-27 16:58:01 +00:00
Fedor Sergeev 98014e433f [NFC] OptPassGate extracted from OptBisect
Summary:
This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes.

This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect.

Patch by Yevgeny Rouban.

Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D44821

llvm-svn: 328637
2018-03-27 16:57:20 +00:00
Krzysztof Parzyszek 52396bb9c5 Use .set instead of = when printing assignment in assembly output
On Hexagon "x = y" is a syntax used in most instructions, and is not
treated as a directive.

Differential Revision: https://reviews.llvm.org/D44256

llvm-svn: 328635
2018-03-27 16:44:41 +00:00
Krzysztof Parzyszek 5d93fdfa89 [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target
The default implementation returns false and keeps the current behavior.

Differential Revision: https://reviews.llvm.org/D44735

llvm-svn: 328632
2018-03-27 16:14:11 +00:00
Andrea Di Biagio 9ecb4011ca [llvm-mca] pass the correct set of used registers in checkRAT.
We were incorrectly initializing the array of used registers in method checkRAT.
As a consequence, the number of register file stalls was misreported.

Added a test to cover this case.

llvm-svn: 328629
2018-03-27 15:23:41 +00:00
Simon Pilgrim 5f7ab4fedf [X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class
llvm-svn: 328620
2018-03-27 12:26:12 +00:00
Strahinja Petrovic 06cf6a6490 [PowerPC] Secure PLT support
This patch supports secure PLT mode for PowerPC 32 architecture.

Differential Revision: https://reviews.llvm.org/D42112

llvm-svn: 328617
2018-03-27 11:23:53 +00:00
Alexander Richardson e8059b1de4 [MIPS] Add static_assert that all Fixups are handled in getFixupKind
Summary:
I recently added a new Fixup kind to our fork of LLVM but forgot to add
it to the table in MipsAsmBackend.cpp. With this static_assert the error
would have been caught instead of zero-initializing the array entries for
the new fixups.

Reviewers: sdardis, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44895

llvm-svn: 328616
2018-03-27 10:08:12 +00:00
Max Kazantsev b1ad66ff12 [LoopUnroll][NFC] Remove redundant canPeel check
We check `canPeel` twice: when evaluating the number of iterations to be peeled
and within the method `peelLoop` that performs peeling. This method is only
executed if the calculated peel count is positive. Thus, the check in `peelLoop` can
never fail. This patch replaces this check with an assert.

Differential Revision: https://reviews.llvm.org/D44919
Reviewed By: fhahn

llvm-svn: 328615
2018-03-27 09:40:51 +00:00
Sam Parker 90b7f4f72c [IRCE] Enable decreasing loops of non-const bound
As a follow-up to r328480, this updates the logic for the decreasing
safety checks in a similar manner:
- CanBeMax is replaced by CannotBeMaxInLoop which queries
  isLoopEntryGuardedByCond on the maximum value.
- SumCanReachMin is replaced by isSafeDecreasingBound which includes
  some logic from parseLoopStructure and, again, has been updated to
  use isLoopEntryGuardedByCond on the given bounds.

Differential Revision: https://reviews.llvm.org/D44776

llvm-svn: 328613
2018-03-27 08:24:53 +00:00
Max Kazantsev ee5dd8306f [NFC] Fix comments in getExact()
llvm-svn: 328612
2018-03-27 08:13:55 +00:00
Max Kazantsev 7094c8deb2 [SCEV] Make exact taken count calculation more optimistic
Currently, `getExact` fails if it sees two exit counts in different blocks. There is
no solid reason to do so, given that we only calculate exact non-taken count
for exiting blocks that dominate latch. Using this fact, we can simply take min
out of all exits of all blocks to get the exact taken count.

This patch makes the calculation more optimistic with enforcing our assumption
with asserts. It allows us to calculate exact backedge taken count in trivial loops
like

  for (int i = 0; i < 100; i++) {
    if (i > 50) break;
    . . .
  }

Differential Revision: https://reviews.llvm.org/D44676
Reviewed By: fhahn

llvm-svn: 328611
2018-03-27 07:30:38 +00:00
Max Kazantsev a63d333881 [SCEV] Add one more case in computeConstantDifference
This patch teaches `computeConstantDifference` handle calculation of constant
difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.

Differential Revision: https://reviews.llvm.org/D43759
Reviewed By: anna

llvm-svn: 328609
2018-03-27 04:54:00 +00:00
Craig Topper 44a23f4283 [MachineScheduler] Add itinerary to schedcover.py. Make default work in the command line filter
Summary:
This patch adds itinerary support to the schedcover.py script. I've been trying to use this script to figure out why SSE and AVX instructions are ending up in separate tablegen scheduler classes and sometimes its because we are using different itineraries.

Rather than using None to indicate the default scheduler model, I now use the string "default". I had to hack around the sorting a little to keep "default" at the beginning. But this also makes it so you can specify "default" on the command line to just get the defaults

I also fixed the regular expression code so that the no_default wasn't evaluated twice.

Reviewers: RKSimon, atrick, jmolloy, javed.absar

Reviewed By: javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44834

llvm-svn: 328608
2018-03-27 04:26:39 +00:00
Mircea Trofin 56ba71b2a7 Revert "Revert "[lit] Generalized /dev/null support on Windows.""
Summary:
This reverts commit r328596.

Checking if the arguments are strings before testing if they contain "/dev/null".

Reviewers: rnk

Reviewed By: rnk

Subscribers: delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44914

llvm-svn: 328603
2018-03-27 01:39:17 +00:00
Sanjay Patel 15f7df9f44 [x86] add RUN for target before roundss; NFC
llvm-svn: 328601
2018-03-27 00:32:19 +00:00
Jan Korous 1e0e0b077d [lit] Temporarily disable shtest-timeout.py on darwin
Disabled until fixed in order to avoid random failures on green dragon.

rdar://problem/38774530

llvm-svn: 328598
2018-03-27 00:16:28 +00:00
Mircea Trofin 373c445c24 Revert "[lit] Generalized /dev/null support on Windows."
This reverts commit ca7fdbb974384ce5a05528b22a41d46b1cc13e92.

llvm-svn: 328596
2018-03-26 23:59:39 +00:00
David Blaikie 60e62438d2 Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here
llvm-svn: 328595
2018-03-26 23:48:52 +00:00
David Blaikie 211c67cdb2 Move CVDebugRecord from CodeView to Object to fix layering
llvm-svn: 328593
2018-03-26 23:37:02 +00:00
Sanjay Patel 8653776367 [x86] add tests for ftrunc; NFC
llvm-svn: 328592
2018-03-26 23:18:32 +00:00
Aaron Smith f13938382c [DebugInfoPDB] Print the method name along with the variant value
Before this change, using dumpProperties() with PDBSymbolData
would look like this:

  get_locationType: 3
  1

After this change:

  get_locationType: 3
  get_value: 1

llvm-svn: 328590
2018-03-26 22:53:38 +00:00
Mircea Trofin 88911686c8 [lit] Generalized /dev/null support on Windows.
Generalized /dev/null remapping on Windows, and added test.

Reviewers: rnk

Reviewed By: rnk

Subscribers: amccarth, zturner, delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44771

llvm-svn: 328589
2018-03-26 22:41:06 +00:00
Aaron Smith 1af50bcf89 [DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData
llvm-svn: 328587
2018-03-26 22:17:12 +00:00
Aaron Smith ed81a9db29 [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA
This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

llvm-svn: 328586
2018-03-26 22:13:22 +00:00
Aaron Smith 53708a5e9e [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA
These are used in finding line numbers for PDBSymbolData

llvm-svn: 328585
2018-03-26 22:10:02 +00:00
Simon Pilgrim f6440b6fb1 Fix newlines. NFCI.
llvm-svn: 328583
2018-03-26 21:07:59 +00:00
Simon Pilgrim 28e7bcbba6 [X86] Add WriteCRC32 scheduler class
Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis.

Differential Revision: https://reviews.llvm.org/D44647

llvm-svn: 328582
2018-03-26 21:06:14 +00:00
Rafael Espindola 78fdca3cd5 Use local symbols for creating .stack-size.
llvm-svn: 328581
2018-03-26 20:40:22 +00:00
Reid Kleckner 2fe905cfee Fix go bindings test when using goma distributed build tool
Goma[1] is a distributed build system similar to distcc and icecc
primarily used to compile Chromium. The client is open source, and
hopefully soon the server will be as well. The intended usage model is
similar to most distributed build systems: prefix gomacc onto your
compiler command line, and it transparently distributes compilation.

The go lit config wants to determine the host compiler binary, so it
needs some extra logic to avoid looking at these prefixes.

[1] https://chromium.googlesource.com/infra/goma/client/

llvm-svn: 328580
2018-03-26 20:19:14 +00:00
Paul Robinson 82e4864730 Use correct format specifier.
Review comment on r328235 by James Henderson.

llvm-svn: 328578
2018-03-26 19:55:01 +00:00
Eli Friedman 88e2bac94d [MemorySSA] Fix exponential compile-time updating MemorySSA.
MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for
each block, it computes the previous definition for each predecessor,
then takes those definitions and combines them. But currently it doesn't
remember results which it already computed; this means it can visit the
same block multiple times, which adds up to exponential time overall.

To fix this, this patch adds a cache. If we computed the result for a
block already, we don't need to visit it again because we'll come up
with the same result. Well, unless we RAUW a MemoryPHI; in that case,
the TrackingVH will be updated automatically.

This matches the original source paper for this algorithm.

The testcase isn't really a test for the bug, but it adds coverage for
the case where tryRemoveTrivialPhi erases an existing PHI node. (It's
hard to write a good regression test for a performance issue.)

Differential Revision: https://reviews.llvm.org/D44715

llvm-svn: 328577
2018-03-26 19:52:54 +00:00
Krzysztof Parzyszek 4a5a80c370 [Hexagon] Assertion failure in HexagonSubtarget.cpp
In restoreLatency, replace range-for loop with std::find.

Patch by Jyotsna Verma.

llvm-svn: 328574
2018-03-26 19:04:58 +00:00
Simon Pilgrim fcf49df21c [X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

llvm-svn: 328573
2018-03-26 19:01:06 +00:00
Haicheng Wu b45f921678 [SLP] Add more checks to a test case. NFC.
llvm-svn: 328572
2018-03-26 18:59:28 +00:00
Reid Kleckner 41fb2dba9c [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Summary:
Re-lands r328386 and r328443, reverting r328482.

Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).

I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.

I also tested this by self-hosting clang and it passed tests on win64.

Reviewers: mstorsjo, hans

Subscribers: hiraditya, mstorsjo, llvm-commits

Differential Revision: https://reviews.llvm.org/D44900

llvm-svn: 328570
2018-03-26 18:49:48 +00:00
Simon Pilgrim f33d905293 [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)
Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.

These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).

Differential Revision: https://reviews.llvm.org/D44879

llvm-svn: 328566
2018-03-26 18:19:28 +00:00
David Blaikie 7c4b5d92f1 Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h
This header also wasn't self contained/modular - but with no users, it
didn't seem worth fixing because it'd break so easily again.

llvm-svn: 328565
2018-03-26 18:10:31 +00:00
Mandeep Singh Grang 1b9ff45157 [XCore] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: dblaikie, RKSimon, robertlytton

Reviewed By: robertlytton

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44875

llvm-svn: 328564
2018-03-26 18:08:26 +00:00
Reid Kleckner 8252892951 [lit] Implement 'cat' command for internal shell
Fixes PR36449

Patch by Chamal de Silva

Differential Revision: https://reviews.llvm.org/D43501

llvm-svn: 328563
2018-03-26 18:05:12 +00:00
Zachary Turner 7b84b678a9 Delete pdbutil diff mode.
This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.

llvm-svn: 328562
2018-03-26 18:01:07 +00:00
Krzysztof Parzyszek 5488deb1ab [Hexagon] Add more lit tests
llvm-svn: 328561
2018-03-26 17:53:48 +00:00
Sanjay Patel 0e3167cb30 [InstCombine] improve code comment; NFC
llvm-svn: 328560
2018-03-26 17:52:02 +00:00
Lei Huang be0afb0870 [Power9]Legalize and emit code for quad-precision convert from double-precision
Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558
2018-03-26 17:46:25 +00:00
Stefan Pintilie 26d4f923c4 [PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.
A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

llvm-svn: 328556
2018-03-26 17:39:18 +00:00
Zaara Syeda 17e4eeaa8b Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores
Disable https://reviews.llvm.org/D40196 with setting option
hoist-const-stores to false since failing s390 buildbot.

llvm-svn: 328555
2018-03-26 17:22:33 +00:00
Krzysztof Parzyszek 3ca233414b [Pipeliner] Several node-ordering fixes
First, we change the heuristic that is used to ignore the recurrent
node-sets in the node ordering. In certain cases it's not important
to focus on the recurrent node-sets.  Instead, the algorithm begins
by considering all the instructions in the node ordering step.

Second, a minor change to the bottom up traversal, which needs to
consider loop carried dependences (modeled as anti dependences).
Previously, these instructions were skipped, which caused problems
because the instruction ends up having both predecessors and
sucessors in the schedule.

Third, consider anti-dependences as a tie breaker when choosing
between instructions in the node ordering. We want to make sure
that the source of the anti-dependence does not end up with both
predecesssors and sucessors in the final node ordering.

Patch by Brendon Cahoon.

llvm-svn: 328554
2018-03-26 17:07:41 +00:00
Tim Corringham 7116e8963d [AMDGPU] Improve disassembler error handling
Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.

While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.

The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44685

llvm-svn: 328553
2018-03-26 17:06:33 +00:00
Simon Pilgrim 86ea53123d [X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs
We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....

llvm-svn: 328551
2018-03-26 17:02:02 +00:00
Krzysztof Parzyszek 8c07d0c42c [Pipeliner] Check for affine expression in isLoopCarriedOrder
The pipeliner must add a loop carried dependence between two memory
operations if the base register is not an affine (linear) exression.
The current implementation doesn't check how the base register is
defined, which allows non-affine expressions, and then the pipeliner
does not add a loop carried dependence when one is needed.

This patch adds code to isLoopCarriedOrder that checks if the base
register of the memory operations is defined by a phi, and the loop
definition for the phi is a constant increment value.  This is a very
simple check for a linear expression.

Patch by Brendon Cahoon.

llvm-svn: 328550
2018-03-26 16:58:40 +00:00
David Blaikie 535ca36e5e Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header
llvm-svn: 328549
2018-03-26 16:57:31 +00:00
David Blaikie a1b2bf4c71 Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header
llvm-svn: 328548
2018-03-26 16:52:10 +00:00
Krzysztof Parzyszek 9f041b1830 [Pipeliner] Add missing loop carried dependences
The pipeliner is not adding a dependence edge for a loop carried
dependence, and ends up scheduling a load from iteration n prior
to an aliased store in iteration n-1.

The code that adds the loop carried dependences in the pipeliner
doesn't check if the memory objects for loads and stores are
"identified" (i.e., distinct) objects. If they are not, then the
code that adds the dependences needs to be conservative. The
objects can be used to check dependences only when they are
distinct objects.

The code that checks for loop carried dependences has been updated
to classify loads and stores that are not identified as "unknown"
values. A store with an "unknown" value can potentially create
a loop carried dependence with any pending load.

Patch by Brendon Cahoon.

llvm-svn: 328547
2018-03-26 16:50:11 +00:00
Haicheng Wu 0ec1dbe417 [SLP] Add a test case. NFC.
llvm-svn: 328546
2018-03-26 16:47:37 +00:00
Krzysztof Parzyszek 16e66f5901 [Pipeliner] Fix renaming in pipeliner when eliminating phis
The phi renaming code in the pipeliner uses the wrong value when
rewriting phi uses, which results in an undefined value. In this
case, the original phi is no longer needed due to the order of
instruction in the pipelined loop. The pipeliner was assuming, in
this case, the the phi loop definition should be used to
rewrite the uses. However, the pipeliner needs to check to make
sure that the loop definition has already been scheduled. If not,
then the phi initial value needs to be used instead.

Patch by Brendon Cahoon.

llvm-svn: 328545
2018-03-26 16:41:36 +00:00
Krzysztof Parzyszek 3f72a6b7a1 [Pipeliner] Fix number of phis to generate in the epilog
The pipeliner was generating too many phis in the epilog blocks, which
caused incorrect code generation when rewriting an instruction that uses
the phi.

In this case, there 3 prolog and epilog stages. An existing phi was
scheduled at stage 1. When generating the code for the 2nd epilog an
extra new phi was generated.

To fix this, we need to update the code that calculates the maximum
number of phis that can be generated, which is based upon the current
prolog stage and the stage of the original phi. In this case, when the
prolog stage is 1 and the original phi stage is 1, the maximum number
of phis to generate is 2.

Patch by Brendon Cahoon.

llvm-svn: 328543
2018-03-26 16:37:55 +00:00
Krzysztof Parzyszek a212204453 [Pipeliner] Use latency to compute RecMII
The patch contains severals changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.

The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a 
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.

The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).

The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This was when the
last instruction is a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.

The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound. 

Patch by Brendon Cahoon.

llvm-svn: 328542
2018-03-26 16:33:16 +00:00
Simon Pilgrim 8815105cd5 [X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs
llvm-svn: 328541
2018-03-26 16:24:13 +00:00
Krzysztof Parzyszek f13bbf1d58 [Pipeliner] Fix assert caused by pipeliner serialization
The pipeliner is asserting because the serialization step that 
occurs at the end is deleting an instruction.  The assert
occurs later on because there is a use without a definition.  

The problem occurs when an instruction defines a value used 
by a REQ_SEQUENCE and that value is used by a COPY instruction.
The latencies between these instructions are zero, so they are
put in to the same packet.  The serialization code is unable to
handle this correctly, and ends up putting the REG_SEQUENCE
before its definition.

There is special code in the serialization step that attempts
to handle zero-cost instructions (phis, copy, reg_sequence)
differently than regular instructions. Unfortunately, this means
the order does not come out correct.

This patch simplifies the code by changing the seperate steps for
handling zero-cost and regular instructions. Only phis are
handled separate now, since they should occurs first. Then, this
patch adds checks to make use the MoveUse is set to the smallest
value if there are multiple uses in a cycle.

Patch by Brendon Cahoon.

llvm-svn: 328540
2018-03-26 16:23:29 +00:00
Sebastian Pop d870aea03e [InstCombine] reassociate loop invariant GEP chains to enable LICM
This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sinked down
in the GEP chain. This patch will produce the following output:

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext

llvm-svn: 328539
2018-03-26 16:19:31 +00:00