Use BFI if it is available and BPI otherwise.
This is a promised follow-up after D89541.
Reviewers: ebrevnov, mkazantsev
Reviewed By: ebrevnov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89773
Split out of D89086 as suggested.
Change the default of the log_path flag to nullptr, and the code
consuming that flag (ReportFile::SetReportPath), to treat nullptr as
stderr (so no change to the behavior of existing users). This allows
code to distinguish between the log_path being specified explicitly as
stderr vs the default.
This is so the flag can be used to override the new report path variable
that will be encoded in the binary for memprof for runtime testing.
Differential Revision: https://reviews.llvm.org/D89629
Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes.
There are still problems with unsigned i32 conversions too which I'll try to fix in another patch.
Fixes part of PR47393
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87115
Seems users have enough different uses of the symbolizer where they
might have unknown binaries and offsets such that "best effort" behavior
is all that's expected of llvm-symbolizer - so even erroring on unknown
executables and out of bounds offsets might not be suitable.
This reverts commit 1de0199748.
This reverts commit a7b209a6d4.
This reverts commit 338dd138ea.
Prior to this patch, computeKnownBits would only try to deduce trailing zeros
bits for getelementptrs. This patch adds the logic to treat geps as a series
of add * scaling factor.
Thanks to this patch, using a gep or performing an address computation
directly "by hand" (ptrtoint followed by adds and mul followed by inttoptr)
offers the same computeKnownBits information.
Previously, the "by hand" approach would have given more information.
This is related to https://llvm.org/PR47241.
Differential Revision: https://reviews.llvm.org/D86364
Since its call operator is const but can modify the state of its underlying
functor we cannot tell whether the copy is necessary or not.
This avoids false positives.
Reviewed-by: aaron.ballman, gribozavr2
Differential Revision: https://reviews.llvm.org/D89332
This requires that we track enough information to determine the original
type of the parameter in a substituted non-type template parameter, to
distinguish the reference-to-class case from the class case.
The changes made in D88594 caused the test OpenMP/driver.c to fail on a 32-bit host becuase it was offloading to a 64-bit architecture by default. The offloading test was moved to a new file and a feature was added to the lit config to check for a 64-bit host.
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D89904
This commit should really be named "Workaround external projects depending
on libc++ build system implementation details". It seems that the compiler-rt
build (and perhaps other projects) is relying on the fact that we copy libc++
and libc++abi headers to `<build-root>/include/c++/v1`. This was changed
by 5d796645, which moved the headers to `<build-root>/projects/libcxx/include/c++/v1`
and broke the compiler-rt build.
I'm committing this workaround to fix the compiler-rt build, but we should
remove reliance on implementation details like that. The correct way to
setup the compiler-rt build would be to "link" against the `cxx-headers`
target in CMake, or to run `install-cxx-headers` using an appropriate
installation prefix, and then manually add a `-I` path to that location.
Forward missing attributes when creating the new transfer op otherwise the
builder would use default values.
Differential Revision: https://reviews.llvm.org/D89907
non-type template parameters.
Create a unique TemplateParamObjectDecl instance for each such value,
representing the globally unique template parameter object to which the
template parameter refers.
No IR generation support yet; that will follow in a separate patch.
While running this test on a bare metal target, I got an error as 'sleep' was not available on that system. As 'sleep' call is not doing anything useful for cases when _LIBCXXABI_HAS_NO_THREADS is defined. This patch puts it under this check.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D89871
As discussed in the review for D87120 (specifically at
https://reviews.llvm.org/D87120#inline-831939), clean up PrintModuleMap
and DumpProcessMap usage differences. The former is only implemented for
Mac OSX, whereas the latter is implemented for all OSes. The former is
called by asan and tsan, and the latter by hwasan and now memprof, under
the same option. Simply rename the PrintModuleMap implementation for Mac
to DumpProcessMap, remove other empty PrintModuleMap implementations,
and convert asan/tsan to new name. The existing posix DumpProcessMap is
disabled for SANITIZER_MAC.
Differential Revision: https://reviews.llvm.org/D89630
PreservedCFGCheckerInstrumentation was saying that LowerMatrixIntrinsics
didn't properly preserve CFG even though it claimed to. The legacy pass
says it doesn't. Match the legacy pass's preserved analyses.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89175
For GC parseable element atomic memcpy/memmove we'll need to
shuffle statepoint arguments. Make it possible by storing the
arguments as Value *, not Use *.
* Adds a new MlirOpPrintingFlags type and supporting accessors.
* Adds a new mlirOperationPrintWithFlags function.
* Adds a full featured python Operation.print method with all options and the ability to print directly to files/stdout in text or binary.
* Adds an Operation.get_asm which delegates to print and returns a str or bytes.
* Reworks Operation.__str__ to be based on get_asm.
Differential Revision: https://reviews.llvm.org/D89848
A "structural" type conversion is one where the underlying ops are
completely agnostic to the actual types involved and simply need to update
their types. An example of this is shape.assuming -- the shape.assuming op
and the corresponding shape.assuming_yield op need to update their types
accordingly to the TypeConverter, but otherwise don't care what type
conversions are happening.
Also, the previous conversion code would not correctly materialize
conversions for the shape.assuming_yield op. This should have caused a
verification failure, but shape.assuming's verifier wasn't calling
RegionBranchOpInterface::verifyTypes (which for reasons can't be called
automatically as part of the trait verification, and requires being
called manually). This patch also adds that verification.
Differential Revision: https://reviews.llvm.org/D89833
A "structural" type conversion is one where the underlying ops are
completely agnostic to the actual types involved and simply need to update
their types. An example of this is scf.if -- the scf.if op and the
corresponding scf.yield ops need to update their types accordingly to the
TypeConverter, but otherwise don't care what type conversions are happening.
To test the structural type conversions, it is convenient to define a
bufferize pass for a dialect, which exercises them nicely.
Differential Revision: https://reviews.llvm.org/D89757
This is similar in spirit to 01ea93d85d (memcpy) except that
here the underlying caller assumptions were created for vectorizer
use (throughput) rather than other passes.
That meant ARM could have an enormous throughput cost with no
corresponding size, latency, or blended cost increase. X86 has
the same throughput restriction as the basic implementation, so
it is still unchanged.
Paraphrasing from the previous commit:
This may not make sense for some callers, but at least now the
costs will be consistently wrong instead of mysteriously wrong.
Targets should provide better overrides if the current modeling
is not accurate.
Update comments describing how OMPGridValues enums will be used in
clang, deviceRTLs, and hsa and cuda plugins.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86232
37c030f81a made it so that depending on //libcxx/include
automatically added the copied header dir to the include search path.
For some reason, clang can't build against the copied libcxx headers
(it complains about ldiv_t not being a type). I don't have a mac
to debug right now, but for the clang target this change was
unintentional anyways -- only depend on the copies target, instead
of on the target that also adjusts the include path.