Commit Graph

167489 Commits

Author SHA1 Message Date
Craig Topper 3c869cb5e5 [X86] Add isel patterns for atomic_load+sub+atomic_sub.
Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register.

llvm-svn: 338930
2018-08-03 22:08:30 +00:00
Craig Topper 84319d1b42 [X86] Add test cases to show missed opportunity to use RMW for atomic_load+sub+atomic_store.
llvm-svn: 338929
2018-08-03 22:08:28 +00:00
Reid Kleckner 8e40702c1c [X86] Re-generate abi-isel.ll checks with update_llc_test_checks.py
These tests were clearly auto-generated when they were converted to
FileCheck back in r80019 (2009), but we didn't have a fancy script to
keep them up to date then. I've reviewed the diff, and we should be
generating the exact same code sequences we used to.

After this, I plan to commit a change that changes our output slightly,
but in a way that is still correct. It will generate a large diff, and I
want it to be clearly correct, so I am regenerating these checks in
preparation for that.

llvm-svn: 338928
2018-08-03 21:58:25 +00:00
Reid Kleckner 5578b53c92 [X86] Make abi-isel.ll like update_llc_test_checks.py output
- Remove -asm-verbose=0 from every llc command. The tests still pass.
- Reorder the RUN lines to match CHECKs.
- Use -LABEL like update_llc_test_checks.py does.

llvm-svn: 338927
2018-08-03 21:58:12 +00:00
Reid Kleckner 13a9035190 [X86] Layout tests exactly as update_llc_test_checks.py would
Put the LLVM IR at the bottom of the function instead of the top.  In my
next patch, I will run update_llc_test_checks.py on this file, and I
want to only highlight the diffs in the CHECK lines. Hopefully by doing
this change first, the patch will be more understandable.

llvm-svn: 338926
2018-08-03 21:57:59 +00:00
Craig Topper d7391eefdf [X86] Remove RELEASE_ and ACQUIRE_ pseudo instructions. Use isel patterns and the normal instructions instead
At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering.

So from what I can tell we shouldn't need additional pseudos since they aren't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them for load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions.

Differential Revision: https://reviews.llvm.org/D50212

llvm-svn: 338925
2018-08-03 21:40:44 +00:00
Craig Topper 8c41136ca3 [X86] Autogenerate complete checks. NFC
llvm-svn: 338921
2018-08-03 20:58:14 +00:00
Anastasis Grammenos 4dfe279e00 [TRE][DebugInfo] Preserve Debug Location in new branch instruction
There are two branch instructions created
so the new test covers them both.

Differential Revision: https://reviews.llvm.org/D50263

llvm-svn: 338917
2018-08-03 20:27:13 +00:00
Craig Topper c4960582ec [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store.
The mask operand is visited before the data operand so we need to be able to widen it.

Fixes PR38436.

llvm-svn: 338915
2018-08-03 20:14:18 +00:00
Fangrui Song 23310a89be [Support] Don't initialize compressed buffer allocated by zlib::compress
resize() (zeroing) makes every allocated page resident. The actual size of the compressed buffer is usually much
smaller. Making every page resident is wasteful.

When linking a test binary with ~1.9GiB uncompressed debug info with LLD, this optimization decreases max RSS by ~1.5GiB.

Differential Revision: https://reviews.llvm.org/50223

llvm-svn: 338913
2018-08-03 19:37:49 +00:00
Matt Arsenault c3dc8e65e2 DAG: Enhance isKnownNeverNaN
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.

Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.

llvm-svn: 338910
2018-08-03 18:27:52 +00:00
Artem Belevich 0a11b6366a [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").
Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D50207

llvm-svn: 338908
2018-08-03 18:05:24 +00:00
Craig Topper feb2a58860 [X86] Add a DAG combine for the __builtin_parity idiom used by clang to enable better codegen
Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag.

Even when popcnt is supported, its still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and.

I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful.

Differential Revision: https://reviews.llvm.org/D50165

llvm-svn: 338907
2018-08-03 18:00:29 +00:00
Craig Topper b0ad9b9fd7 [X86] Add test cases for the current codegen of __builtin_parity.
Will be improved in a follow commit

llvm-svn: 338906
2018-08-03 18:00:23 +00:00
Evandro Menezes 5aa217ac68 [SLC] Refactor shrinking of functions (NFC)
Merge the helper functions for shrinking unary and binary functions into a
single one, while keeping all their functionality.  Otherwise, NFC.

llvm-svn: 338905
2018-08-03 17:50:16 +00:00
Joel Galenson cfe5bc158d Fix crash in bounds checking.
In r337830 I added SCEV checks to enable us to insert fewer bounds checks.  Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues.  This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions.

Differential Revision: https://reviews.llvm.org/D49946

llvm-svn: 338902
2018-08-03 17:12:23 +00:00
Matt Davis b4588e594f [llvm-mca][docs] Move the code marker text into its own subsection. NFC.
Also fixed a few undecorated 'llvm-mca' references to be highlighted
with the 'program' emphasis.

llvm-svn: 338900
2018-08-03 15:56:07 +00:00
Graham Yiu 58dbc00559 [Partial Inlining] Fix small bug in detecting if we did something
- It's possible for 'Changed' to return as false even if we did
  partial inline something.  Fixed to accumulate return values

llvm-svn: 338896
2018-08-03 14:42:53 +00:00
Nicholas Wilson e408a89a3a [WebAssembly] Cleanup of the way globals and global flags are handled
Differential Revision: https://reviews.llvm.org/D44030

llvm-svn: 338894
2018-08-03 14:33:37 +00:00
Andrea Di Biagio 1c3bcc6ce5 [llvm-mca] Speed up the computation of the wait/ready/issued sets in the Scheduler.
This patch is a follow-up to r338702.

We don't need to use a map to model the wait/ready/issued sets. It is much more
efficient to use a vector instead.

This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained
after r338702).

llvm-svn: 338883
2018-08-03 12:55:28 +00:00
Chijun Sima 530484372b [Dominators] Make RemoveUnreachableBlocks return false if the BasicBlock is already awaiting deletion
Summary:
Previously, `removeUnreachableBlocks` still returns true (which indicates the CFG is changed) even when all the unreachable blocks found is awaiting deletion in the DDT class.
This makes code pattern like
```
// Code modified from lib/Transforms/Scalar/SimplifyCFGPass.cpp 
bool EverChanged = removeUnreachableBlocks(F, nullptr, DDT);
...
do {
    EverChanged = someMightHappenModifications();
    EverChanged |= removeUnreachableBlocks(F, nullptr, DDT);
  } while (EverChanged);
```
become a dead loop.
Fix this by detecting whether a BasicBlock is already awaiting deletion.

Reviewers: kuhar, brzycki, dmgreen, grosser, davide

Reviewed By: kuhar, brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49738

llvm-svn: 338882
2018-08-03 12:45:29 +00:00
Andrea Di Biagio eaca8ed5b8 [llvm-mca][docs] Improve the CommandLine documentation.
This patch replaces all the remaining occurrences of string "MCA" with
":program:`llvm-mca`".  Somehow I missed those strings when I committed r338394.

This patch also improves section "Instruction Dispatch".

llvm-svn: 338881
2018-08-03 12:44:56 +00:00
Nico Weber 0e81519571 convert some tabs to spaces
llvm-svn: 338880
2018-08-03 12:21:54 +00:00
Jonas Devlieghere 3a92c5c1d3 [DebugInfo/Verifier] Don't emit error for missing module in index
We don't expect module names to be present in the index. This patch adds
DW_TAG_module to the blacklist.

Differential revision: https://reviews.llvm.org/D50237

llvm-svn: 338878
2018-08-03 12:01:43 +00:00
Jonas Paulsson f107b7275c [SystemZ] Improve handling of instructions which expand to several groups
Some instructions expand to more than one decoder group.

This has been hitherto ignored, but is handled with this patch.

Review: Ulrich Weigand
https://reviews.llvm.org/D50187

llvm-svn: 338849
2018-08-03 10:43:05 +00:00
Max Kazantsev dcf6706e52 [NFC] Add missing comment
llvm-svn: 338848
2018-08-03 10:41:51 +00:00
Max Kazantsev 65cd4836d2 [NFC] Move some methods into static functions
llvm-svn: 338843
2018-08-03 10:16:40 +00:00
Jeremy Morse 019406554b [Windows FS] Allow moving files in TempFile::keep
In r338216 / D49860 TempFile::keep was extended to allow keeping across
filesystems. The aim on Windows was to have this happen in rename_internal
using the existing system API. However, to fix an issue and preserve the
idea of "renaming" not being a move, put Windows keep-across-filesystem in
TempFile::keep.

Differential Revision: https://reviews.llvm.org/D50048

llvm-svn: 338841
2018-08-03 10:13:35 +00:00
Simon Pilgrim 94112ebc75 [TargetLowering] Generalise BuildSDIV function
First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where its still used.

Differential Revision: https://reviews.llvm.org/D50185

llvm-svn: 338838
2018-08-03 10:00:54 +00:00
Guillaume Chatelet e60866a4e0 [llvm-exegesis] Renaming classes and functions.
Summary: Functional No Op.

Reviewers: gchatelet

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D50231

llvm-svn: 338836
2018-08-03 09:29:38 +00:00
Sjoerd Meijer d62c5ec2fe [ARM] FP16: support vector zip and unzip
This is addressing PR38404.

Differential Revision: https://reviews.llvm.org/D50186

llvm-svn: 338835
2018-08-03 09:24:29 +00:00
Dean Michael Berris 2c4dcf0576 [XRay][tools] Use Support/JSON.h in llvm-xray convert
Summary:
This change removes the ad-hoc implementation used by llvm-xray's
`convert` subcommand to generate JSON encoded catapult (AKA Chrome
Trace Viewer) trace output, to instead use the JSON encoder now in the
Support library.

Reviewers: kpw, zturner, eizan

Reviewed By: kpw

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50129

llvm-svn: 338834
2018-08-03 09:21:31 +00:00
Simon Pilgrim 4014fb1049 [X86] Add example of 'zero shift' guards on rotation patterns (PR34924)
Basic pattern that leaves an unnecessary select on a rotation by zero result. This variant is trivial - the more general case with a compare+branch to prevent execution of undefined shifts is more tricky.

llvm-svn: 338833
2018-08-03 09:20:02 +00:00
Sjoerd Meijer 9b30213828 [ARM] FP16: support VFMA
This is addressing PR38404.

llvm-svn: 338830
2018-08-03 09:12:56 +00:00
Dean Michael Berris 7a3bd723b4 [XRay] fixup: add one more missing std::move(...)
Follow up to D48370.

llvm-svn: 338829
2018-08-03 09:06:11 +00:00
Dean Michael Berris 5e8c5a31dd [XRay] fixup: Add missing std::move(...)
Follow up to D48370.

llvm-svn: 338827
2018-08-03 07:54:37 +00:00
Dean Michael Berris 1fa08d1132 [XRay] Fixup: remove 'noexcept' in defaulted move members
This is to appease stage1 builds using gcc.

Follow-up to D48370.

llvm-svn: 338826
2018-08-03 07:41:34 +00:00
Dean Michael Berris 13d04e766a [XRay][llvm] Load XRay Profiles
Summary:
This change implements the profile loading functionality in LLVM to
support XRay's profiling mode in compiler-rt.

We introduce a type named `llvm::xray::Profile` which allows building a
profile representation. We can load an XRay profile from a file to build
Profile instances, or do it manually through the Profile type's API.

The intent is to get the `llvm-xray` tool to generate `Profile`
instances and use that as the common abstraction through which all
conversion and analysis can be done. In the future we can generate
`Profile` instances from `Trace` instances as well, through conversion
functions.

Some of the key operations supported by the `Profile` API are:

- Path interning (`Profile::internPath(...)`) which returns a unique path
  identifier.

- Block appending (`Profile::addBlock(...)`) to add thread-associated
  profile information.

- Path ID to Path lookup (`Profile::expandPath(...)`) to look up a
  PathID and return the original interned path.

- Block iteration.

A 'Path' in this context represents the function call stack in
leaf-to-root order. This is represented as a path in an internally
managed prefix tree in the `Profile` instance. Having a handle (PathID)
to identify the unique Paths we encounter for a particular Profile
allows us to reduce the amount of memory required to associate profile
data to a particular Path.

This is the first of a series of patches to migrate the `llvm-stacks`
tool towards using a single profile representation.

Depends on D48653.

Reviewers: kpw, eizan

Reviewed By: kpw

Subscribers: mgorny, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D48370

llvm-svn: 338825
2018-08-03 07:18:39 +00:00
Craig Topper a7a12399a1 [X86] Remove all the vector NOP bitcast patterns. Use a few lines of code in the Select method in X86ISelDAGToDAG.cpp instead.
There are a lot of permutations of types here generating a lot of patterns in the isel table. It's more efficient to just ReplaceUses and RemoveDeadNode from the Select function.

The test changes are because we have a some shuffle patterns that have a bitcast as their root node. But the behavior is identical to another instruction whose pattern doesn't start with a bitcast. So this isn't a functional change.

llvm-svn: 338824
2018-08-03 07:01:10 +00:00
Hans Wennborg e566e2c335 build_llvm_package.bat: Add OpenMP back
After r338721, it builds again.

llvm-svn: 338823
2018-08-03 07:00:08 +00:00
Chijun Sima c72ff1011d [Dominators] Refine the logic of recalculate() in the DomTreeUpdater
Summary:
This patch refines the logic of `recalculate()` in the `DomTreeUpdater` in the following two aspects:
1. Previously, `recalculate()` tests whether there are pending updates/BBs awaiting deletion and then do recalculation under Lazy UpdateStrategy; and do recalculation immediately under Eager UpdateStrategy. (The former behavior is inherited from the `DeferredDominance` class). This is an inconsistency between two strategies and there is no obvious reason to do this. So the behavior is changed to always recalculate available trees when calling `recalculate()`.
2. Fix the issue of when DTU under Lazy UpdateStrategy holds nothing but with BBs awaiting deletion, after calling `recalculate()`, BBs awaiting deletion aren't flushed. An additional unittest is added to cover this case.

Reviewers: kuhar, dmgreen, brzycki, grosser, davide

Reviewed By: kuhar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50173

llvm-svn: 338822
2018-08-03 06:51:35 +00:00
Craig Topper e902b7d0b0 [X86] Support fp128 and/or/xor/load/store with VEX and EVEX encoded instructions.
Move all the patterns to X86InstrVecCompiler.td so we can keep SSE/AVX/AVX512 all in one place.

To save some patterns we'll use an existing DAG combine to convert f128 fand/for/fxor to integer when sse2 is enabled. This allows use to reuse all the existing patterns for v2i64.

I believe this now makes SHA instructions the only case where VEX/EVEX and legacy encoded instructions could be generated simultaneously.

llvm-svn: 338821
2018-08-03 06:12:56 +00:00
Hiroshi Inoue 73f8b255b6 [InstSimplify] fold extracting from std::pair (2/2)
This is the second patch of the series which intends to enable jump threading for an inlined method whose return type is std::pair<int, bool> or std::pair<bool, int>. 
The first patch is https://reviews.llvm.org/rL338485.

This patch handles code sequences that merges two values using `shl` and `or`, then extracts one value using `and`.

Differential Revision: https://reviews.llvm.org/D49981

llvm-svn: 338817
2018-08-03 05:39:48 +00:00
Chijun Sima 21a8b605a1 [Dominators] Convert existing passes and utils to use the DomTreeUpdater class
Summary:
This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].

It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class.
These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater.

Reviewers: brzycki, kuhar, dmgreen, grosser, davide

Reviewed By: brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48967

llvm-svn: 338814
2018-08-03 05:08:17 +00:00
Craig Topper a80352c04e [X86] When post-processing the DAG to remove zero extending moves for YMM/ZMM, make sure the producing instruction is VEX/XOP/EVEX encoded.
If the producing instruction is legacy encoded it doesn't implicitly zero the upper bits. This is important for the SHA instructions which don't have a VEX encoded version. We might also be able to hit this with the incomplete f128 support that hasn't been ported to VEX.

llvm-svn: 338812
2018-08-03 04:49:42 +00:00
Craig Topper ded14af7aa [X86] Autogenerate complete checks. NFC
llvm-svn: 338811
2018-08-03 04:49:41 +00:00
Craig Topper b2cc9a1d44 [X86] Add R13D to the isInefficientLEAReg in FixupLEAs.
I'm assuming the R13 restriction extends to R13D. Guessing this restriction is related to the funny encoding of this register as base always requiring a displacement to be encoded.

llvm-svn: 338806
2018-08-03 03:45:19 +00:00
Craig Topper 55697276dc [X86] Autogenerate complete checks. NFC
llvm-svn: 338802
2018-08-03 01:28:12 +00:00
Craig Topper b99281c9b8 [X86] Autogenerate complete checks. NFC
llvm-svn: 338799
2018-08-03 01:20:32 +00:00
Craig Topper 2c095444a4 [X86] Prevent promotion of i16 add/sub/and/or/xor to i32 if we can fold an atomic load and atomic store.
This makes them consistent with i8/i32/i64. Which still seems to be more aggressive on folding than icc, gcc, or MSVC.

llvm-svn: 338795
2018-08-03 00:37:34 +00:00