Commit Graph

282471 Commits

Author SHA1 Message Date
Craig Topper 98ae8f833f [X86] Change some compare patterns to use loadi8/loadi16/loadi32/loadi64 helper fragments.
This enables CMP8mi to fold zextloadi8i1 which in all tests allows us to avoid creating a TEST8rr that peephole can't fold.

llvm-svn: 324863
2018-02-12 02:48:42 +00:00
Craig Topper 27d5b6e4a6 [X86] Autogenerate complete checks. NFC
llvm-svn: 324862
2018-02-12 02:03:36 +00:00
Craig Topper 3ce035acf3 [X86] Add KADD X86ISD opcode instead of reusing ISD::ADD.
ISD::ADD implies individual vector element addition with no carries between elements. But for a vXi1 type that would be the same as XOR. And we already turn ISD::ADD into ISD::XOR for all vXi1 types during lowering. So the ISD::ADD pattern would never be able to match anyway.

KADD is different, it adds the elements but also propagates a carry between them. This just a way of doing an add in k-register without bitcasting to the scalar domain. There's still no way to match the pattern, but at least its not obviously wrong.

llvm-svn: 324861
2018-02-12 01:33:38 +00:00
Craig Topper dfc322ddf4 [X86] Allow zextload/extload i1->i8 to be folded into instructions during isel
Previously we just emitted this as a MOV8rm which would likely get folded during the peephole pass anyway. This just makes it explicit earlier.

The gpr-to-mask.ll test changed because the kaddb instruction has no memory form.

llvm-svn: 324860
2018-02-12 01:33:36 +00:00
Charles Saternos d061dd06e8 Follow on to rL324854 (Added tests)
llvm-svn: 324859
2018-02-12 00:20:16 +00:00
Craig Topper 363e099446 [X86] Remove MASK_BINOP intrinsic type. NFC
llvm-svn: 324858
2018-02-11 22:32:30 +00:00
Craig Topper 38d61c38a2 [X86] Remove dead code from getMaskNode that looked for a i64 mask with a maskVT that wasn't v64i1. NFC
llvm-svn: 324857
2018-02-11 22:32:29 +00:00
Craig Topper a7ac028a6b [X86] Remove LowerBoolVSETCC_AVX512, we get this with a target independent DAG combine now. NFC
llvm-svn: 324856
2018-02-11 22:32:27 +00:00
Dimitry Andric 35479ffb12 Add default C++ ABI libname and include paths for FreeBSD
Summary:
As noted in a discussion about testing the LLVM 6.0.0 release candidates
(with libc++) for FreeBSD, many tests turned out to fail with
"exception_ptr not yet implemented".  This was because libc++ did not
choose the correct C++ ABI library, and therefore it fell back to the
`exception_fallback.ipp` header.

Since FreeBSD 10.x, we have been using libcxxrt as our C++ ABI library,
and its headers have always been installed in /usr/include/c++/v1,
together with the (system) libc++ headers.  (Older versions of FreeBSD
used GNU libsupc++ by default, but these are now unsupported.)

Therefore, if we are building libc++ for FreeBSD, set:
* `LIBCXX_CXX_ABI_LIBNAME` to "libcxxrt"
* `LIBCXX_CXX_ABI_INCLUDE_PATHS` to "/usr/include/c++/v1"
by default.

Reviewers: emaste, EricWF, mclow.lists

Reviewed By: EricWF

Subscribers: mgorny, cfe-commits, krytarowski

Differential Revision: https://reviews.llvm.org/D43166

llvm-svn: 324855
2018-02-11 22:31:05 +00:00
Charles Saternos d3e7d19f59 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

llvm-svn: 324854
2018-02-11 22:06:20 +00:00
Eric Fiselier 2328c589f1 Fix libcxx MSVC C++17 redefinition of 'align_val_t'
Patch from 	charlieio@outlook.com

Reviewed as https://reviews.llvm.org/D42354

When the following command is used:

> clang-cl -std:c++17 -Iinclude\c++\v1 hello.cc c++.lib

An error occurred:

In file included from hello.cc:1:
In file included from include\c++\v1\iostream:38:
In file included from include\c++\v1\ios:216:
In file included from include\c++\v1\__locale:15:
In file included from include\c++\v1\string:477:
In file included from include\c++\v1\string_view:176:
In file included from include\c++\v1\__string:56:
In file included from include\c++\v1\algorithm:643:
In file included from include\c++\v1\memory:656:
include\c++\v1\new(165,29):  error: redefinition of 'align_val_t'
enum class _LIBCPP_ENUM_VIS align_val_t : size_t { };
                            ^
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\include\vcruntime_new.h(43,16):  note:
      previous definition is here
    enum class align_val_t : size_t {};
               ^
1 error generated.
vcruntime_new.h has defined align_val_t, libcxx need hide align_val_t.

This patch fixes that error.

llvm-svn: 324853
2018-02-11 22:00:19 +00:00
Eric Fiselier 6808a59244 Mark two issues as complete
llvm-svn: 324852
2018-02-11 21:57:25 +00:00
Marshall Clow 026b75f209 Fix a typo in the synopsis comment. NFC. Thanks to K-ballo for the catch
llvm-svn: 324851
2018-02-11 21:51:49 +00:00
Brock Wyma 19e17b3970 [CodeView] Allow variable names to be as long as the codeview format supports
Instead of reserving 0xF00 bytes for the fixed length portion of the CodeView
symbol name, calculate the actual length of the fixed length portion.

Differential Revision: https://reviews.llvm.org/D42125

llvm-svn: 324850
2018-02-11 21:26:46 +00:00
Kuba Mracek 9ead7bb3d4 Revert r324847, there's bot failures.
llvm-svn: 324849
2018-02-11 20:44:04 +00:00
Kuba Mracek 3eb694d01e [sanitizer] Implement NanoTime() on Darwin
Currently NanoTime() on Darwin is unimplemented and always returns 0. Looks like there's quite a few things broken because of that (TSan periodic memory flush, ASan allocator releasing pages back to the OS). Let's fix that.

Differential Revision: https://reviews.llvm.org/D40665

llvm-svn: 324847
2018-02-11 19:25:34 +00:00
Kuba Mracek 3ecf9dcaf4 [compiler-rt] Replace forkpty with posix_spawn
On Darwin, we currently use forkpty to communicate with the "atos" symbolizer. There are several problems that fork or forkpty has, e.g. that after fork, interceptors are still active and this sometimes causes crashes or hangs. This is especially problematic for TSan, which uses interceptors for OS-provided locks and mutexes, and even Libc functions use those.

This patch replaces forkpty with posix_spawn. Since posix_spawn doesn't fork (at least on Darwin), the interceptors are not a problem. Additionally, this also fixes a latent threading problem with ptsname (it's unsafe to use this function in multithreaded programs). Yet another benefit is that we'll handle post-fork failures (e.g. sandbox disallows "exec") gracefully now.

Differential Revision: https://reviews.llvm.org/D40032

llvm-svn: 324846
2018-02-11 19:23:42 +00:00
Craig Topper 3a354152dd [X86] Update some required-vector-width.ll test cases to not pass 512-bit vectors in arguments or return.
ABI for these would require 512 bits support so we don't want to test that.

llvm-svn: 324845
2018-02-11 18:52:16 +00:00
Simon Pilgrim 0d8c4bfc2a [X86][SSE] Use SplitBinaryOpsAndApply to recognise PSUBUS patterns before they're split on AVX1
This needs to be generalised further to support AVX512BW cases but I want to add non-uniform constants first.

llvm-svn: 324844
2018-02-11 17:29:42 +00:00
Sanjay Patel 510d647a4d [InstCombine] X / (X * Y) -> 1 / Y if the multiplication does not overflow
The related cases for (X * Y) / X were handled in rL124487.

https://rise4fun.com/Alive/6k9

The division in these tests is subsequently eliminated by existing instcombines
for 1/X.

llvm-svn: 324843
2018-02-11 17:20:32 +00:00
Craig Topper ca5a340171 [X86] Use min/max for vector ult/ugt compares if avoids a sign flip.
Summary:
Currently we only use min/max to help with ule/uge compares because it removes an invert of the result that would otherwise be needed. But we can also use it for ult/ugt compares if it will prevent the need for a sign bit flip needed to use pcmpgt at the cost of requiring an invert after the compare.

I also refactored the code so that the max/min code is self contained and does its own return instead of setting up a flag to manipulate the rest of the function's behavior.

Most of the test cases look ok with this. I did notice that we added instructions when one of the operands being sign flipped is a constant vector that we were able to constant fold the flip into.

I also noticed that sometimes the SSE min/max clobbers a register that is needed after the compare. This resulted in an extra move being inserted before the min/max to preserve the register. We could try to detect this and switch from min to max and change the compare operands to use the operand that gets reused in the compare.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42935

llvm-svn: 324842
2018-02-11 17:11:40 +00:00
Simon Pilgrim c2544c572a [X86][SSE] Moved SplitBinaryOpsAndApply earlier so more methods can use it. NFCI.
llvm-svn: 324841
2018-02-11 17:01:43 +00:00
Sanjay Patel aee107f30d [InstCombine] add tests for div-mul folds; NFC
The related cases for (X * Y) / X were handled in rL124487.

llvm-svn: 324840
2018-02-11 16:52:44 +00:00
Sanjay Patel eb8c408e50 [TargetLowering] try to create -1 constant operand for math ops via demanded bits
This reverses instcombine's demanded bits' transform which always tries to clear bits in constants.

As noted in PR35792 and shown in the test diffs:
https://bugs.llvm.org/show_bug.cgi?id=35792
...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity. 

I did investigate changing instcombine's behavior, but it would be more work to change 
canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, 
so we'd have to figure out how to not regress those cases.

Differential Revision: https://reviews.llvm.org/D42986

llvm-svn: 324839
2018-02-11 14:38:23 +00:00
Simon Pilgrim 7630150222 [X86] Add PR33747 test case
llvm-svn: 324838
2018-02-11 13:12:50 +00:00
Simon Pilgrim 0be5567a89 [X86][SSE] Enable SMIN/SMAX/UMIN/UMAX custom lowering for all legal types
This allows us to recognise more saturation patterns and also simplify some MINMAX codegen that was failing to combine CMPGE comparisons to a legal CMPGT.

Differential Revision: https://reviews.llvm.org/D43014

llvm-svn: 324837
2018-02-11 10:52:37 +00:00
Lama Saba 91e2b9d081 fix test/CodeGen/X86/fixup-sfb.ll test failure after commit https://reviews.llvm.org/rL324835
Change-Id: I2526c2f342654e85ce054237de03ae9db9ab4994
llvm-svn: 324836
2018-02-11 10:33:06 +00:00
Lama Saba c2ba6c387e [X86] Reduce Store Forward Block issues in HW
If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.

This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence
of a load and a store.

The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies.
breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.

Change-Id: I620b6dc91583ad9a1444591e3ddc00dd25d81748
llvm-svn: 324835
2018-02-11 09:34:12 +00:00
Craig Topper 24d3b28d93 [X86] Don't make 512-bit vectors legal when preferred vector width is 256 bits and 512 bits aren't required
This patch adds a new function attribute "required-vector-width" that can be set by the frontend to indicate the maximum vector width present in the original source code. The idea is that this would be set based on ABI requirements, intrinsics or explicit vector types being used, maybe simd pragmas, etc. The backend will then use this information to determine if its save to make 512-bit vectors illegal when the preference is for 256-bit vectors.

For code that has no vectors in it originally and only get vectors through the loop and slp vectorizers this allows us to generate code largely similar to our AVX2 only output while still enabling AVX512 features like mask registers and gather/scatter. The loop vectorizer doesn't always obey TTI and will create oversized vectors with the expectation the backend will legalize it. In order to avoid changing the vectorizer and potentially harm our AVX2 codegen this patch tries to make the legalizer behavior similar.

This is restricted to CPUs that support AVX512F and AVX512VL so that we have good fallback options to use 128 and 256-bit vectors and still get masking.

I've qualified every place I could find in X86ISelLowering.cpp and added tests cases for many of them with 2 different values for the attribute to see the codegen differences.

We still need to do frontend work for the attribute and teach the inliner how to merge it, etc. But this gets the codegen layer ready for it.

Differential Revision: https://reviews.llvm.org/D42724

llvm-svn: 324834
2018-02-11 08:06:27 +00:00
Craig Topper a4bf9b8d51 [X86] Remove setOperationAction lines for promoting vXi1 SINT_TO_FP/UINT_TO_FP.
We promote these via a DAG combine now before lowering gets the chance.

Also remove the v2i1 custom handling since it will no longer be triggered.

llvm-svn: 324833
2018-02-11 07:44:33 +00:00
Craig Topper 36f913ee80 [SelectionDAG] Remove TargetLowering::getConstTrueVal. Use SelectionDAG::getBoolConstant in the one place it was used.
SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better as it will use the source VT to make sure it can properly detect floating point if it is configured differently.

llvm-svn: 324832
2018-02-11 04:58:58 +00:00
Craig Topper ba5ad55965 [X86] Remove some redundant qualifications from the setOperationAction blocks. NFC
These were added as part of the refactoring for prefer vector width. At the time I thought the hasAVX512 here would be replaced with "allow 512 bit vectors" so that it would read "allow 512 bit vectors OR VLX". But now the plan is to only give the option of disabling 512 bit vectors when VLX is enabled. So we don't need this qualification at all

llvm-svn: 324831
2018-02-11 03:07:19 +00:00
Galina Kistanova c6cd1f0139 Fixed extra ‘;’ warning
llvm-svn: 324830
2018-02-11 02:32:21 +00:00
Simon Pilgrim d229bfd20d [X86][SSE] Add SMIN/SMAX combine test
As discussed on D43014, we need the ability to flip SMIN/SMAX to (legal) UMIN/UMAX

llvm-svn: 324829
2018-02-10 23:38:50 +00:00
Craig Topper a57d64e30f [X86] Change the signature of the AVX512 packed fp compare intrinsics to return vXi1 mask. Make bitcasts to scalar explicit in IR
Summary: This is the clang equivalent of r324827

Reviewers: zvi, delena, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43143

llvm-svn: 324828
2018-02-10 23:34:27 +00:00
Craig Topper 4dccffc84a [X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp.
Summary:
This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit and in the IR.

This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares.

Previously the lowering step of isel would turn the intrinsic into an X86 specific ISD node and a emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.

This should make any issues with gpr<->mask sequences the same between integer and fp. Meaning we only have to fix them once.

Reviewers: spatel, delena, RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43137

llvm-svn: 324827
2018-02-10 23:33:55 +00:00
Simon Pilgrim b781b2e24c [X86][SSE] Add UMIN/UMAX combine test
As discussed on D43014, we need the ability to flip UMIN/UMAX to (legal) SMIN/SMAX

llvm-svn: 324826
2018-02-10 22:27:35 +00:00
Simon Pilgrim 19495198af [InstCombine] Add constant vector support for ~(C >> Y) --> ~C >> Y
Includes adding m_NonNegative constant pattern matcher

llvm-svn: 324825
2018-02-10 21:46:09 +00:00
Aaron Smith 51a6fc6fec Fix test clang-diff-json.cpp
Summary:
This test would fail if the python path had spaces. Add a quote around the path to fix this problem and update some test values changed by the addition of quotes around the path.

Tested on Windows and Linux with Python 3.x

Reviewers: zturner, llvm-commits

Subscribers: klimek, cfe-commits

Differential Revision: https://reviews.llvm.org/D43164

llvm-svn: 324824
2018-02-10 21:28:55 +00:00
Simon Pilgrim cb9a02f60e [X86][SSE] Increase PMULLD costs to better match hardware
Until Skylake, most hardware could only issue a PMULLD op every other cycle

llvm-svn: 324823
2018-02-10 19:27:10 +00:00
Craig Topper 9121eb575e [X86] Custom legalize (v2i32 (setcc (v2f32))) so that we don't end up with a (v4i1 (setcc (v4f32)))
Undef VLX, getSetCCResultType returns v2i1/v4i1 for v2f32/v4f32 so default type legalization will end up changing the setcc result type back to vXi1 if it had been extended. The resulting extend gets messed up further by type legalization and is difficult to recombine back to (v4i32 (setcc (v4f32))) after legalization.

I went ahead and enabled this for SSE2 and later since its always the result we want and this helps type legalization get there in less steps.

llvm-svn: 324822
2018-02-10 19:12:58 +00:00
Alexander Richardson 59aaf1ae33 Use RelType instead of uint32_t in DynamicReloc. NFC
llvm-svn: 324821
2018-02-10 18:14:34 +00:00
Craig Topper 28d3a73c81 [X86] Extend inputs with elements smaller than i32 to sint_to_fp/uint_to_fp before type legalization.
This prevents extends of masks being introduced during lowering where it become difficult to combine them out.

There are a few oddities in here.

We sometimes concatenate two k-registers produced by two compares, sign_extend the combined pair, then extract two halves. This worked better previously because the sign_extend wasn't created until after the fp_to_sint was split which led to a split sign_extend being created.

We probably also need to custom type legalize (v2i32 (sext v2i1)) via widening.

llvm-svn: 324820
2018-02-10 17:58:58 +00:00
Craig Topper 0fbdaa6d81 [X86] Remove some check-prefixes from avx512-cvt.ll to prepare for an upcoming patch.
The update script sometimes has trouble when there are check-prefixes representing every possible combination of feature flags. I have a patch where the update script was generating something that didn't pass lit.

This patch just removes some check-prefixes and expands out some of the checks to workaround this.

llvm-svn: 324819
2018-02-10 17:58:56 +00:00
Simon Pilgrim 99af1e11b2 Add vector add/sub/mul/div by scalar tests (PR27085)
Ensure the scalar is correctly splatted to all lanes

llvm-svn: 324818
2018-02-10 17:55:23 +00:00
Sanjay Patel 3a7e447cd6 [x86] preserve test intent by removing undef
D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.

llvm-svn: 324817
2018-02-10 15:36:23 +00:00
Sanjay Patel 5d176da09c [x86] preserve test intent by removing undef
D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.

llvm-svn: 324816
2018-02-10 15:28:08 +00:00
Simon Pilgrim 514d8cc1de Fix Wdocumentation warning. NFCI.
llvm-svn: 324815
2018-02-10 15:14:00 +00:00
Sanjay Patel 3659e8c95c [ARM] preserve test intent by removing undef
D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.

llvm-svn: 324814
2018-02-10 15:14:00 +00:00
Simon Pilgrim dd94785848 Fix Wdocumentation warnings. NFCI.
llvm-svn: 324813
2018-02-10 15:02:07 +00:00