This is partly in preparation for an upcoming change that can change the
order in which DeclContext lookup results are presented.
In passing, fix some obvious errors where name lookup's notion of a
"static member function" missed static member function templates, and
where its notion of "same set of declarations" was confused by the same
declarations appearing in a different order.
In https://reviews.llvm.org/D89072 I added static const data members
to the debug subsection for globals. It skipped emitting an S_CONSTANT if it
didn't have a value, which meant the subsection could be empty.
This patch fixes the empty subsection issue.
Differential Revision: https://reviews.llvm.org/D92049
Revert "[compiler-rt] [builtins] Support conversion between fp16 and fp128" & dependency
Revert "[compiler-rt] [builtins] Use _Float16 on extendhfsf2, truncdfhf2 __truncsfhf2 if available"
This reverts commit 7a94829881.
This reverts commit 1fb91fcf9c.
Start with an assumption that FMA is faster than Fmul+FAdd. If thats not true
on some particular implementation we can add a tuning parameter in the future.
I've update the fmuladd test cases and added new test cases for fast math flag
based contraction.
Differential Revision: https://reviews.llvm.org/D91987
We currently don't match this which limits the effectiveness of D91120 until
InstCombine starts canonicalizing to llvm.abs. This should be easy to remove
if/when we remove the SPF_ABS handling.
Differential Revision: https://reviews.llvm.org/D92118
In text-item contexts, %expr expands to a string containing the results of evaluating `expr`.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89736
This is the logically correct thing to do. But it generates worse
code for i32 umin/umax on the rv64 due to type legalize requesting
zext even though the arguments are sext. Maybe we can teach type
legalizer to use sext for umin/umax for RISCV.
It's also producing possibly worse code on i64 on RV32 since we
still end up with selects that become branches. But this seems
like something we could improve in type legalization or DAG combine.
Hopefully this makes D92095 work for RISCV with Zbb.
There were a couple of places where we needed to call the underlying
platform's aligned allocation/deallocation function. Instead of having
the same logic all over the place, extract the logic into a pair of
helper functions __libcpp_aligned_alloc and __libcpp_aligned_free.
The code in libcxxabi/src/fallback_malloc.cpp looks like it could be
simplified after this change -- I purposefully did not simplify it
further to keep this change as straightforward as possible, since it
is touching very important parts of the library.
Also, the changes in libcxx/src/new.cpp and libcxxabi/src/stdlib_new_delete.cpp
are basically the same -- I just kept both source files in sync.
The underlying reason for this refactoring is to make it easier to support
platforms that provide aligned allocation through C11's aligned_alloc
function instead of posix_memalign. After this change, we'll only have
to add support for that in a single place.
Differential Revision: https://reviews.llvm.org/D91379
Adding missing custom builders for AffineVectorLoadOp & AffineVectorStoreOp. In practice, it is difficult to correctly construct these ops without these builders (because the AffineMap is not included at construction time).
Differential Revision: https://reviews.llvm.org/D86380
This code got quite twisted because we consider some MSVC builtins to be
target agnostic, and some to be target specific. Target specific
intrinsics have a pattern of doing up-front argument evaluation, while
general intrinsics do not evaluate their arguments up front. As we tried
to share codepaths between the target-specific and target-agnostic
handling, we ended up doing double evaluation.
Instead, have each target handle MSVC intrinsics consistently before up
front argument evaluation. This requires passing less data around and is
more consistent with target independent intrinsic handling.
See D50979 for past examples of this bug. I noticed this while looking
into adding some more intrinsics.
Differential Revision: https://reviews.llvm.org/D92061
This should handle the basic integer min/max handling - the HVX ops are still TODO.
This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876.
Differential Revision: https://reviews.llvm.org/D92112
Update costs now that D92095 and D92102 have tweaked the SSE2 implementation
The SSE42 BLENDVPD cost can actually be used on SSE41 as we don't attempt to generate PCMPGT anymore
Add scalar i16/i32/i64 costs as we can do this cheaply with CMOV
Added some new ClangTidyOptionsProvider like classes designed for clangd work flow.
These providers are designed to source the options on the worker thread but in a thread safe manner.
This is done through making the options getter take a pointer to the filesystem used by the worker thread which natuarally is from a ThreadsafeFS.
Internal caching in the providers is also guarded.
The providers don't inherit from `ClangTidyOptionsProvider` instead they share a base class which is able to create a provider for the `ClangTidyContext` using a specific FileSystem.
This approach means one provider can be used for multiple contexts even though `ClangTidyContext` owns its provider.
Depends on D90531
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D91029
Local values are constants or addresses that can't be folded into
the instruction that uses them. FastISel materializes these in a
"local value" area that always dominates the current insertion
point, to try to avoid materializing these values more than once
(per block).
https://reviews.llvm.org/D43093 added code to sink these local
value instructions to their first use, which has two beneficial
effects. One, it is likely to avoid some unnecessary spills and
reloads; two, it allows us to attach the debug location of the
user to the local value instruction. The latter effect can
improve the debugging experience for debuggers with a "set next
statement" feature, such as the Visual Studio debugger and PS4
debugger, because instructions to set up constants for a given
statement will be associated with the appropriate source line.
There are also some constants (primarily addresses) that could be
produced by no-op casts or GEP instructions; the main difference
from "local value" instructions is that these are values from
separate IR instructions, and therefore could have multiple users
across multiple basic blocks. D43093 avoided sinking these, even
though they were emitted to the same "local value" area as the
other instructions. The patch comment for D43093 states:
Local values may also be used by no-op casts, which adds the
register to the RegFixups table. Without reversing the RegFixups
map direction, we don't have enough information to sink these
instructions.
This patch undoes most of D43093, and instead flushes the local
value map after(*) every IR instruction, using that instruction's
debug location. This avoids sometimes incorrect locations used
previously, and emits instructions in a more natural order.
This does mean materialized values are not re-used across IR
instruction boundaries; however, only about 5% of those values
were reused in an experimental self-build of clang.
(*) Actually, just prior to the next instruction. It seems like
it would be cleaner the other way, but I was having trouble
getting that to work.
Differential Revision: https://reviews.llvm.org/D91734
This adds custom opcodes for FSLW/FSRW so we can type legalize
fshl/fshr without needing to match a sign_extend_inreg.
I've used the operand order from fshl/fshr to make the isel
pattern similar to the non-W form. It was also hard to decide
another order since the register instruction has the shift amount
as the second operand, but the immediate instruction has it as
the third operand.
Differential Revision: https://reviews.llvm.org/D91479
This is an alternative approach to address inconsistencies pointed out in: D90078
This patch makes sure that the return address is reset, when leaving the scope.
In some cases, I had to move the macro out of an if-statement to have it in the
right scope, in some cases I added an additional block to restrict the scope.
This patch does not handle inconsistencies, which might occur if the return
address is still set when we call into the application.
Test case (repeated_calls.c) provided by @hbae
Differential Revision: https://reviews.llvm.org/D91692
OpenMP 5.1 introduces the new env variable
OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose
loading and initialization of OMPT tools.
This env variable helps to understand the cause when loading of a tool fails
(e.g., undefined symbols or dependency not in LD_LIBRARY_PATH)
Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV
Tests for this patch are integrated into the different existing tool loading
tests, making these tests more verbose. An Archer specific verbose test is
integrated into an existing Archer test.
Patch prepared by: Isabel Thärigen
Differential Revision: https://reviews.llvm.org/D91464
The TypeSize warning would occur because RuntimePointerChecking::insert
was not scalable vector aware. The fix is to use
ScalarEvolution::getSizeOfExpr to grab the size of types.
Differential Revision: https://reviews.llvm.org/D90171
With this change, `TargetInfo::adjustRelaxExpr` is only related to TLS
relaxations and a subsequent clean-up can delete the `data` parameter.
Differential Revision: https://reviews.llvm.org/D92079
The indirect function table, synthesized by the linker, is needed if and
only if there are TABLE_INDEX relocs.
Differential Revision: https://reviews.llvm.org/D91637
Currently, `-indvars` runs first, and then immediately after `-loop-idiom` does.
I'm not really sure if `-loop-idiom` requires `-indvars` to run beforehand,
but i'm *very* sure that `-indvars` requires `-loop-idiom` to run afterwards,
as it can be seen in the phase-ordering test.
LoopIdiom runs on two types of loops: countable ones, and uncountable ones.
For uncountable ones, IndVars obviously didn't make any change to them,
since they are uncountable, so for them the order should be irrelevant.
For countable ones, well, they should have been countable before IndVars
for IndVars to make any change to them, and since SCEV is used on them,
it shouldn't matter if IndVars have already canonicalized them.
So i don't really see why we'd want the current ordering.
Should this cause issues, it will give us a reproducer test case
that shows flaws in this logic, and we then could adjust accordingly.
While this is quite likely beneficial in-the-wild already,
it's a required part for the full motivational pattern
behind `left-shift-until-bittest` loop idiom (D91038).
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D91800
This commit factors out a WasmTableType definition from WasmTable, as is
the case for WasmGlobal and other data types. Also add support for
extracting the SymbolName for a table from the linking section's symbol
table.
Differential Revision: https://reviews.llvm.org/D91849
Add .shader_functions to pal metadata, which contains the stack frame
size for all non-entry-point functions.
Differential Revision: https://reviews.llvm.org/D90036
Add semantic check for the cache directive. According to section 2.10 from the specification:
A var in a cache directive must be a single array element or a simple subarray.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90184
If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method.
This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering).
Alive2: https://alive2.llvm.org/ce/z/xRk3cD
Differential Revision: https://reviews.llvm.org/D92095
If the enum is a dependent type, we would crash somewhere in
getIntWidth(). -Wswitch diagnostic doesn't work on dependent enums
either.
Differential Revision: https://reviews.llvm.org/D92051
MaxSafeRegisterWidth is a misnomer since it actually returns the maximum
safe vector width. Register suggests it relates directly to a physical
register where it could be a vector spanning one or more physical
registers.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D91727