We want to make sure that LINKER_IS_LLD_LINK is properly set - in
this case it shouldn't be set when building for MinGW.
Then we want to make the test for it correct and finally include
the option to build with thinlto cache since the MinGW driver now
supports that.
Differential Revision: https://reviews.llvm.org/D80493
* Enables using with more variadic sized operands;
* Generate convenience accessors for attributes;
- The accessor are named the same as their name in ODS and returns attribute
type (not convenience type) and no derived attributes.
This is first step to changing adapter to support verifying argument
constraints before the op is even created. This does not change the name of
adaptor nor does it require it except for ops with variadic operands to keep this change smaller.
Considered creating separate adapter but decided against that given operands also require attributes in general (and definitely for verification of operands and attributes).
Differential Revision: https://reviews.llvm.org/D80420
Fixes a build issue with libc++ configured with _LIBCPP_RAW_ITERATORS (ADL not effective)
```
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1602:3: error: no matching function for call to 'transform'
transform(HexString.begin(), HexString.end(), HexString.begin(), tolower);
^~~~~~~~~
```
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80475
Summary:
https://bugs.llvm.org/show_bug.cgi?id=46043
Git's config is generally of the format 'key=val', but a setting
'key=true' can be written as just 'key'. The git-clang-format script
expects a value and crashes in this case; this change handles implicit
'true' values in the script.
Reviewers: MyDeveloperDay, krasimir, sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80486
This makes many scenarios simpler by not requiring the user to write
ignoringImplicit() all the time, nor to account for non-visible
cxxConstructExpr() and cxxMemberCalExpr() nodes. This is also, in part,
inclusive of the equivalent of adding a use of ignoringParenImpCasts()
between all expr()-related matchers in an expression.
The pre-existing traverse(TK_AsIs, ...) matcher can be used to explcitly
match on implicit/invisible nodes. See
http://lists.llvm.org/pipermail/cfe-dev/2019-December/064143.html
for more
Reviewers: aaron.ballman
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72534
If we're extracting an upper subvector from a broadcast we're better off extracting the lowest subvector instead as it avoids an actual extract instruction and might help SimplifyDemandedVectorElts further simplify the code.
EarlyCSE was added with D75145, but the motivating test is
not regressed by removing the extra pass now. That might be
because VectorCombine altered the way it processes instructions,
or it might be from (re)moving VectorCombine in the pipeline.
The extra round of EarlyCSE appears to cost approximately
0.26% in compile-time as discussed in D80236, so we need some
evidence to justify its inclusion here, but we do not have
that (yet).
I suspect that between SLP and VectorCombine, we are creating
patterns that InstCombine and/or codegen are not prepared for,
but we will need to reduce those examples and include them as
PhaseOrdering and/or test-suite benchmarks.
Currently we unconditionally get the first lane of the condition
operand, even if we later use the full vector condition. This can result
in some unnecessary instructions being generated.
Suggested as follow-up in D80219.
As discussed in D80236 - this test (like all PhaseOrdering tests?)
was intended to show that there is no difference with the new
pass manager, but the 'opt' command requires extra parameters
to make that happen.
This initial version only peeks through cases where we just demand the sign bit of an ashr shift, but we could generalize this further depending on how many sign bits we already have.
The pr18014.ll case is a minor annoyance - we've failed to to move the psrad/paddd after the blendvps which would have avoided the extra move, but we have still increased the ILP.
Replace TargetMachine.h include with forward declaration and CodeGen.h include in AMDGPU.h.
Exposes a couple of implicit dependencies that require additional forward declarations/includes.
Summary: Fix a potential assert in use-noexcept check if there is an issue getting the `TypeSourceInfo` as well as a small clean up.
Reviewers: aaron.ballman, alexfh, gribozavr2
Reviewed By: aaron.ballman
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80371
VPWidenSelectRecipe already contains a VPUser, but it is not used. This
patch updates the code related to VPWidenSelectRecipe to use VPUser for
its operands.
Reviewers: Ayal, gilr, rengolin
Reviewed By: gilr
Differential Revision: https://reviews.llvm.org/D80219
For the 'inverse shift', we currently always perform a subtraction of the original (masked) shift amount.
But for the case where we are handling power-of-2 type widths, we can replace:
(sub bw-1, (and amt, bw-1) ) -> (and (xor amt, bw-1), bw-1) -> (and ~amt, bw-1)
This allows x86 shifts to fold away the and-mask.
Followup to D77301 + D80466.
http://volta.cs.utah.edu:8080/z/Nod0Gr
Differential Revision: https://reviews.llvm.org/D80489
On X86 (AVX1/AVX2), non-boolean masked loads only demand the sign bit of the mask, we already do the equivalent for masked stores.
Annoyingly I can't easily handle this inside TargetLowering::SimplifyDemandedBits as this is an x86 specific case for a generic node.
Differential Revision: https://reviews.llvm.org/D80478
This adds the family/model returned by CPUID for some Intel
Comet Lake CPUs. Instruction set and tuning wise these are
the same as "skylake".
These are not in the Intel SDM yet, but these should be correct.
This is a preliminary patch before I deal with the xor+and issue raised in D77301.
We get much better code for i8/i16 funnel shifts by concatenating the operands together and performing the shift as a double width type, it avoids repeated use of the shift amount and partial registers.
fshl(x,y,z) -> (((zext(x) << bw) | zext(y)) << (z & (bw-1))) >> bw.
fshr(x,y,z) -> (((zext(x) << bw) | zext(y)) >> (z & (bw-1))) >> bw.
Alive2: http://volta.cs.utah.edu:8080/z/CZx7Cn
This doesn't do as well for i32 cases on x86_64 (the xor+and followup patch is much better) so I haven't bothered with that.
Cases with constant amounts are more dubious as well so I haven't currently bothered with those - its these kind of 'edge' cases that put me off trying to put this in TargetLowering::expandFunnelShift.
Differential Revision: https://reviews.llvm.org/D80466
Although writing to wzr/xzr is correct since we don't care about the result
of the sub, only the flags, doing so causes tail merge blocks to fail.
Writing to an unused virtual register instead allows the optimization to fire,
improving performance significantly on 256.bzip2.
Differential Revision: https://reviews.llvm.org/D80460
Summary:
The test demonstrates the current state of the compiler and
I am going to resolve FIXME in followup patches.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80039
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.
Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.
This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.
Differential Revision: https://reviews.llvm.org/D78271
__test_has_construct.
In C++17 some tests started failing after a521532aa1. This fixes those errors by suppressing the deprecation warning when calling `construct` in `__test_has_construct`. This is the same solution as `__has_destroy_test` already uses.
Reviewers: ldionne, #libc!
Subscribers: dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D80481
Summary:
Libcxx only supports compilers with variadics. We can safely remove all "fake" variadic overloads of allocator_traits::construct.
This also provides the correct behavior if anything other than exactly one argument is supplied to allocator_traits::construct in C++03 mode.
Reviewers: ldionne, #libc!
Subscribers: dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D80067