While, indeed, we may end up pushing less updates that we'd reserve space
for, self-dominating updates aren't often enough for that to matter.
But this should matter for normal updates.
Improve AVX512 mask inversion, rG38c799bce801 exposed some missing opportunities to move scalar not() back onto the boolvector types for folding with setcc etc.
In the final SIMD spec, there is only a single v128.any_true instruction, rather
than one for each lane interpretation because the semantics do not depend on the
lane interpretation.
Differential Revision: https://reviews.llvm.org/D100241
Followup to D100177, handle an similar (demorgan inverse style) case from PR47797 as well
The AVX512 test cases could be further improved if we folded not(iX bitcast(vXi1)) -> (iX bitcast(not(vXi1)))
Alive2: https://alive2.llvm.org/ce/z/AnA_-W
The first source has the same EEW as the destination and the other
source is a scalar so the overlap constraints don't apply to
the unmasked version.
For the masked version we have a constraint that the destination
can't be V0 so that covers the only overlap issue there.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D100217
These require the input to be zero or sign extended. If we have
sext.b, sext.h or zext.h instructions we can use them. Otherwise
we need to use a pair of shifts to accomplish the zero/sign extend
and the final shift.
We don't currently use zext.h when it is available.
This adds a CI job validating that the output of
utils/generate_feature_test_macro_components.py,
libcxx/utils/generate_header_inclusion_tests.py, and
utils/generate_header_tests.py are up to date.
The validation method has been copied from the Format job.
Differential Revision: https://reviews.llvm.org/D99862
It breaks up the function pass manager in the codegen pipeline.
With empty parameters, it looks at the -mllvm flag -rewrite-map-file.
This is likely not in use.
Add a check that we only have one function pass manager in the codegen
pipeline.
This required reverting commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c:
"[AsmPrinter] Delete dead takeDeletedSymbsForFunction()".
This was not NFC as initially thought. By coalescing two function
psas managers, this exposed the reverted code as necessary.
addr-label.ll was crashing due to an emitted blockaddress's block being
removed but the label not emitted.
Some tests relied on the fact that we had a module pass somewhere in the
codegen pipeline.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D99707
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring the following methods of class IslAstInfo
- isParallel() isExecutedInParallel() isReductionParallel() getSchedule() getMinimalDependenceDistance() getBrokenReductions()
- Refactoring the following methods of class IslNodeBuilder
- getReferencesInSubtree() getScheduleForAstNode()
- Refactoring function getBrokenReductionsStr()
- Fixed the mismatching function declaration for getScheduleForAstNode()
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D99971
Check the cache before calling isLoopSimplifyForm(). Otherwise we'd
always perform the check for the innermost loop and only skip it
for dominating loops.
This patch fixed the following issues along side with some refactoring:
1. Fix bugs where StringRef for context string out live the underlying std::string. We now keep string table in profile generator to hold std::strings. We also do the same for bracketed context strings in profile writer.
2. Make sure profile output strictly follow (total sample, name) order. Previously, there's inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent.
3. Enhanced error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when string table is not populated correctly for extended binary profile.
4. Keep all internal context representation bracket free. This avoids creating new strings for context trimming, merging and preinline. getNameWithContext API is now simplied accordingly.
5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch.
Differential Revision: https://reviews.llvm.org/D100090
The default is likely wrong.
Out of all the callees, only a single one needs to pass-in false (JumpThread),
everything else either already passes true, or should pass true.
Until the default is flipped, at least make it harder to unintentionally
add new callees with UseBlockValue=false.
F18 supports the standard intrinsic function SELECTED_REAL_KIND
but not its synonym in the standard module IEEE_ARITHMETIC
named IEEE_SELECTED_REAL_KIND until this patch.
Differential Revision: https://reviews.llvm.org/D100066
There was an off-by-one issue with calculating the *exact* end location
of token ranges (as given by SomeDecl->getSourceRange()) which resulted in:
xxx(something)
^~~~~~~~ // Note the missing ~ under the last character.
In addition, a test is added to keep the behaviour in check in the future.
This patch hotfixes commit 3b677b81ce.
"Does the predicate hold between two ranges?"
Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
"Does the predicate hold between two ranges?"
Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
Fixes bug http://bugs.llvm.org/show_bug.cgi?id=49000.
This patch allows Clang-Tidy checks to do
diag(X->getLocation(), "text") << Y->getSourceRange();
and get the highlight of `Y` as expected:
warning: text [blah-blah]
xxx(something)
^ ~~~~~~~~~
Reviewed-By: aaron.ballman, njames93
Differential Revision: http://reviews.llvm.org/D98635
This implements C-style type conversions for matrix types, as specified
in clang/docs/MatrixTypes.rst.
Fixes PR47141.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D99037
Added cost estimation for switch instruction, updated costs of branches, fixed
phi cost.
Had to increase `-amdgpu-unroll-threshold-if` default value since conditional
branch cost (size) was corrected to higher value.
Test renamed to "control-flow.ll".
Removed redundant code in `X86TTIImpl::getCFInstrCost()` and
`PPCTTIImpl::getCFInstrCost()`.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D96805
Update llvm::sys::fs::mapped_file_region to have a move constructor and
a move assignment operator, allowing it to be used as an Optional. Also,
update FileOutputBuffer's OnDiskBuffer to take advantage of this,
avoiding an extra allocation from the unique_ptr.
A nice follow-up would be to make the mapped_file_region constructor
private and replace its use with a factory function, such as
mapped_file_region::create(), that returns an Expected (or ErrorOr). I
don't plan on doing that immediately, but I might swing back later.
No functionality change, besides the saved allocation in OnDiskBuffer.
Differential Revision: https://reviews.llvm.org/D100159