Extracted from D117246. This reflects the march value used by the
compile back into the toolchain arguments, letting downstream processes
such as LTO rely on it being present. Subsequent patches should also be able
to remove the two other calls to checkSystemForAMDGPU.
Reviewed By: jonchesterfield
Differential Revision: https://reviews.llvm.org/D117706
This patch fixes the removal of unreachable uncondtional branch located
after return instruction.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D117677
The bulk of the implementation is common between 'release' mode (==AOT-ed
model) and 'development' mode (for training), the main difference is
that in development mode, we may also log features (for training logs),
inject scoring information (currently after the Virtual Register
Rewriter) and then produce the log file.
This patch also introduces the score injection pass, 'Register
Allocation Pass Scoring', which is trivially just logging the score in
development mode.
Differential Revision: https://reviews.llvm.org/D117147
Fixing ICE when partial inline tries to deal with blockaddress uses of function which is typical for asm-goto/callbr. We ran into this with PGO multi-region partial inline.
Differential Revision: https://reviews.llvm.org/D117509
We may not be allowed to use vXiXLen vectors. Consult ELEN to
determine what is allowed. This will become even more important
when Zve32 is added.
Reviewed By: frasercrmck, arcbbb
Differential Revision: https://reviews.llvm.org/D117518
The warning was printing the timestamp from the debug map twice rather
than both the file system modification time and debug map modification
time.
rdar://86036385
Differential revision: https://reviews.llvm.org/D117331
UBOUND, SIZE, and SHAPE folding was still creating expressions that are
invalid on the caller side without the call expression context.
A previous patch intended to deal with this situation (https://reviews.llvm.org/D116933)
but it assumed the return expression would be a descriptor inquiry to
the result symbol, which is not the case if the extent expression is
"scope invariant" inside the called subroutine (e.g., referring to
intent(in) dummy arguments). Simply prevent folding from inlining non
constant extent expression on the caller side.
Folding could be later improved by having ad-hoc folding for UBOUND, SIZE, and
SHAPE on function references where it could try replacing the dummy symbols
by the actual expression, but this is left as a possible later improvement.
Differential Revision: https://reviews.llvm.org/D117686
In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is
given archive semantics. --end-lib closes the previous --start-lib. A build
system can use this feature as an alternative to archives. This patch ports
the feature to lld-macho.
--start-lib and --end-lib are positional, unlike usual ld64 options.
I think the slight drawback does not matter as (a) reusing option names
make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more
information than an alternative design: `-objlib a.o -objlib b.o` because
--start-lib makes it clear which objects are in the same conceptual archive.
This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird.
Close https://github.com/llvm/llvm-project/issues/52931
Reviewed By: #lld-macho, Jez Ng, oontvoo
Differential Revision: https://reviews.llvm.org/D116913
This diff adds support for relative roots to VFS overlays. The directory root
will be made absolute from the current working directory and will be used to
determine the path style to use. This supports the use of VFS overlays with
remote build systems that might use a different working directory for each
compilation.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D116174
https://reviews.llvm.org/D116179 introduced some changes to
`InstrProfData.inc` which broke some downstream builds. This commit
reverts those changes since they only changes two field names.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D117631
This prevents the debuginfod client from dumping files directly into the
default cache directory (e.g., ~/.cache). Instead, these files are
placed in a subdirectory (e.g., ~/.cache/llvm-debuginfod/client).
Behavior is unaffected if the cache directory is provided by the
DEBUGINFO_CACHE_PATH environment variable.
Patch By: mysterymath
Differential Revision: https://reviews.llvm.org/D117619
Subclause 7.5.2.4 lists conditions under which two distinct derived
types are to be considered the same type for purposes of argument
association, assignment, and so on. These conditions are implemented
in evaluate::IsTkCompatibleWith(), but assignment semantics doesn't
use it for testing for intrinsic assignment compatibility. Fix that.
Differential Revision: https://reviews.llvm.org/D117621
Disable promote alloca to LDS when HIP-style dynamic LDS since the size
is unknown at compile time.
Patch by: Siu Chi Chan
Reviewed by: Matt Arsenault, Yaxun Liu
Differential Revision: https://reviews.llvm.org/D117494
When an archive with an empty index contains only bitcode files, it is
handled as a group of lazy (--start-lib) object files. If there is a
non-bitcode file, there will be a diagnostic a la GNU ld.
For some programs, the archive member extraction ratio is high (e.g. for chrome,
79% archive members are extracted according to --print-archive-stats=). Because
symbol interning is cached for ObjFile::parseLazy but not for ArchiveFile,
parsing an archive as a group of --start-lib object files may be faster.
If the linker speculatively creates section representations for archive members,
the archive index will not be used.
If we take the above view, the archive index is essentially useless. If a user
wants a fast build without using --start-lib, they may just build thin archives
without index (`ar rcS --thin`).
Therefore, I suggest that we no longer treat the code as a hack, instead as a
supported feature. I believe we will do this anyway if we add parallel symbol
interning (parallel symbol interning for lazy object files is simpler than that
for archives).
Ecosystem issues:
* parseLazy actually has nearly the same behavior as ArchiveFile::parse, but the symbol order may be different.
* users may get addicted to the behavior and build archives not working with GNU ld and gold. I think it is easy to rebuild archives to be compatible.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D117284
Benchmarks show that memref::CopyOp is curently up to 200x slower than
tiled and vectorized versions of linalg::Copy.
Add a temporary flag to allow comprehensive bufferize to generate a
linalg::GenericOp that implements a copy until this performance bug is
resolved.
Differential Revision: https://reviews.llvm.org/D117696
We got a few crash reports that showed LLDB initializing Python on two
separate threads. Make sure Python is initialized exactly once.
rdar://87287005
Differential revision: https://reviews.llvm.org/D117601
HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.
Patch by: Praveen Velliengiri
Reviewed by: Yaxun Liu, Roman Lebedev
Differential Revision: https://reviews.llvm.org/D116216
The constructor function was being defined without indicating its "__init__"
name, which made it interpret it as a regular fuction rather than a
constructor. When overload resolution failed, Pybind would attempt to print the
arguments actually passed to the function, including "self", which is not
initialized since the constructor couldn't be called. This would result in
"__repr__" being called with "self" referencing an uninitialized MLIR C API
object, which in turn would cause undefined behavior when attempting to print
in C++. Even if the correct name is provided, the mechanism used by
PybindAdaptors.h to bind constructors directly as "__init__" functions taking
"self" is deprecated by Pybind. The new mechanism does not seem to have access
to a fully-constructed "self" object (i.e., the constructor in C++ takes a
`pybind11::detail::value_and_holder` that cannot be forwarded back to Python).
Instead, redefine "__new__" to perform the required checks (there are no
additional initialization needed for attributes and types as they are all
wrappers around a C++ pointer). "__new__" can call its equivalent on a
superclass without needing "self".
Bump pybind11 dependency to 3.8.0, which is the first version that allows one
to redefine "__new__".
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D117646
I tried to look over the file and didn't see any other non-use of *Len variables.
Reviewed By: deadalnix
Differential Revision: https://reviews.llvm.org/D116482
When doing the float to int conversion the strict conversion also needs to
retun a chain. This patch fixes that.
Reviewed By: nemanjai, #powerpc, qiucf
Differential Revision: https://reviews.llvm.org/D117464
This patch adds OMPIRBuilder support for the simd directive (without any clause). This will be a first step towards lowering simd directive in LLVM_Flang. The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support. Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114379
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117667
Upstream gtest now prints 'Running main() from FILE' instead of
plain 'Running main() from gtest_main.cc'. Thus, all such tests
ended-up being mistakenly marked as UNRESOLVED.
Patch by @lzaoral
Differential Revision: https://reviews.llvm.org/D100043
std::chrono::duration types are not thread-safe, and they cannot be
concurrently updated from multiple threads. Currently, we were doing
such a thing (only) in the DWARF indexing code
(DWARFUnit::ExtractDIEsRWLocked), but I think it can easily happen that
someone else tries to update another statistic like this without
bothering to check for thread safety.
This patch changes the StatsDuration type from a simple typedef into a
class in its own right. The class stores the duration internally as
std::atomic<uint64_t> (so it can be updated atomically), but presents it
to its users as the usual chrono type (duration<float>).
Differential Revision: https://reviews.llvm.org/D117474
Arbitrary stack pointers are accessed using MUBUF instructions with
the voffset field, which is interpreted as the swizzled address. We
want to fold fold into the MUBUF form to use the SP in the SGPR
offset, and previously we were special casing the interpretation of
the pointer value if the access memory operand said it was relative to
the stack pointer.
690f5b7a01 removed this check, and moved
the DAG path to special casing copies from SGPRs. This is not an
entirely sound approach, since it's still changing the interpretation
of pointer values based the context.
Introduce a new pseudo which corresponds to the wave-to-vector address
transform. This way the memory instruction has consistent semantics
where the incoming pointer is always interpreted as a vector address,
and we're not obligated to optimize into the MUBUF offset-only
addressing mode. The DAG should probably have an equivalent pseudo.
This should fix some correctness issues, and folding this into
addressing modes will be a future optimization patch.
It will make the output more versbose, but I found that these are useful
information when debugging selection tree.
Differential Revision: https://reviews.llvm.org/D117475