The hsa library must be initialized before any calls into it and
destructed after the last call into it. There have been a number of bugs in
this area related to member variables which would like to use raii to manage
resources acquired from hsa.
This patch moves the init/shutdown of hsa into a class, such that when used as
the first member variable (could be a base), the lifetime of other member
variables are reliably scoped within it. This will allow other classes to use
raii reliably when used as member variables within the global.
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D109512
llvm::errs() is unbuffered. On a POSIX platform, composing a diagnostic
string may invoke the ::write syscall multiple times, which can be slow.
Buffer writes to a temporary SmallString when composing a single diagnostic to
reduce the number of ::write syscalls to one (also easier to read under
strace/truss).
For an invocation of ld.lld with 62000+ lines of
`ld.lld: warning: symbol ordering file: no such symbol: ` warnings (D87121),
the buffering decreases the write time from 1s to 0.4s (for /dev/tty) and
from 0.4s to 0.1s (for a tmpfs file). This can speed up
`relocation R_X86_64_PC32 out of range` diagnostic printing as well
with `--noinhibit-exec --no-fatal-warnings`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D87272
This patch adds class SystemZFrameLowering which is a SystemZ-specific class
detailing special registers used by calling conventions on the target.
SystemZELFFrameLowering and SystemZXPLINKFrameLowering implement this class
for ELF and XPLINK64 respectively. Previous functionality in SystemZFrameLowering
is moved to SystemZELFFrameLowering. SystemZXPLINKFrameLowering can then be
implemented in future patches.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108777
Given D109057, change test runner to use the libomptarget-x-bc-path
argument instead of the LIBRARY_PATH environment variable to find the device
library.
Also drop the use of LIBRARY_PATH environment variable as it is far
too easy to pull in the device library from an unrelated toolchain by accident
with the current setup. No loss in flexibility to developers as the clang
commandline used here is still available.
Reviewed By: jdoerfert, tianshilei1992
Differential Revision: https://reviews.llvm.org/D109061
module lookup by name alone
This removes the need to create a fake source file that imports a
module.
rdar://64538073
Differential Revision: https://reviews.llvm.org/D109485
Further enhance the set of operations that can be handled by the sparse compiler
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D109413
Extends handling of list initialization of bounded array parameters.
This adds the missing checks on converting each initializer for both
std::initializer_list and arrays. And extends
CompareImplicitConversionSequence to compares array size, for two
conversions to array type.
As noted in this patch, there's a defect in the std concerning the
partial orderability of conversion sequences. DR2492 has a suggested
direction that will be simple to add once it (hopefully) is accepted.
Differential Revision: https://reviews.llvm.org/D103088
Conversion to the LLVM dialect is being refactored to be more progressive and
is now performed as a series of independent passes converting different
dialects. These passes may produce `unrealized_conversion_cast` operations that
represent pending conversions between built-in and LLVM dialect types.
Historically, a more monolithic Standard-to-LLVM conversion pass did not need
these casts as all operations were converted in one shot. Previous refactorings
have led to the requirement of running the Standard-to-LLVM conversion pass to
clean up `unrealized_conversion_cast`s even though the IR had no standard
operations in it. The pass must have been also run the last among all to-LLVM
passes, in contradiction with the partial conversion logic. Additionally, the
way it was set up could produce invalid operations by removing casts between
LLVM and built-in types even when the consumer did not accept the uncasted
type, or could lead to cryptic conversion errors (recursive application of the
rewrite pattern on `unrealized_conversion_cast` as a means to indicate failure
to eliminate casts).
In fact, the need to eliminate A->B->A `unrealized_conversion_cast`s is not
specific to to-LLVM conversions and can be factored out into a separate type
reconciliation pass, which is achieved in this commit. While the cast operation
itself has a folder pattern, it is insufficient in most conversion passes as
the folder only applies to the second cast. Without complex legality setup in
the conversion target, the conversion infra will either consider the cast
operations valid and not fold them (a separate canonicalization would be
necessary to trigger the folding), or consider the first cast invalid upon
generation and stop with error. The pattern provided by the reconciliation pass
applies to the first cast operation instead. Furthermore, having a separate
pass makes it clear when `unrealized_conversion_cast`s could not have been
eliminated since it is the only reason why this pass can fail.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D109507
Fix extra space print for llvm global op when the 'unamed_addr'
attribute was empty. This led to two spaces being printed in the custom
form between non-whitespace chars. A round trip would add an extra space
to a typical spaced form. NFC.
Differential Revision: https://reviews.llvm.org/D109502
Once all the bots are passing with from-scratch configs, we can attempt
to make the from-scratch config the default configuration.
Differential Revision: https://reviews.llvm.org/D103417
The motivating case is an infinite loop shown with a reduced test from:
https://llvm.org/PR51762
To solve this, I'm proposing we delete the most obviously broken part of this code.
The bug example shows a fundamental problem: we ask computeKnownBits if a transform
will be profitable, alter the code by creating new instructions, then rely on
computeKnownBits to return the same answer to actually eliminate instructions.
But there's no guarantee that the results will be the same between the 1st and 2nd
calls. In the infinite loop example, we get different answers, so we add
instructions that conflict with some other transform, and we're stuck.
There's at least one other problem visible in the test diff for
`@zext_or_masked_bit_test_uses`: the code doesn't check uses properly, so we can
end up with extra instructions created.
Last, it's not clear if this set of transforms actually improves analysis or
codegen. I spot-checked a few targets and don't see a clear win:
https://godbolt.org/z/x87EWovso
If we do see a regression from this change, codegen seems like the right place to
add a cmp -> bit-hack fold.
If this is too big of a step, we could limit the computeKnownBits calls by not
passing a context instruction and/or limiting the recursion. I checked that those
would stop the infinite loop for PR51762, but that won't guarantee that some other
example does not fall into the same loop.
Differential Revision: https://reviews.llvm.org/D109440
It was added after we changed the way the CI jobs are run, in particular
how they are pinned down to Linux instances only. As a result, the job
would sometimes run on Mac machines, which we're trying to keep only for
jobs that absolutely need it due to capacity concerns.
These paths are needed when building with per-target runtime directories.
(It's possible to fix this by manually setting these when invoking
cmake, but one isn't supposed to need to do that.)
Also set LLVM_TOOLS_BINARY_DIR while touching this area (as it's
also unset in this case) even if it isn't specifically needed by the
per-target runtime configuration.
Fixed since previous attempt: Don't check if the runtimes directory
is the root of the CMake invocation; when the main LLVM CMake
build builds runtimes, it does invoke a sub-CMake with this directory
as the root too, just as if manually invoking CMake at the runtimes
directory. Instead check whether LLVM_TOOLS_BINARY_DIR was set and
whether find_package(LLVM) succeeded or not.
Differential Revision: https://reviews.llvm.org/D107895
This feature doesn't seem to have any dedicated test. Instead some random tests
(e.g. the bitfield tests) are declaring function-local classes for some reason.
This adds a dedicated test so we can clean up those other tests.
Also add FIXME's for some basic stuff that doesn't work. The first FIXME is a
good beginner bug which just requires prepending the function name (in case we
decide to fix it instead of documenting this behaviour). The second FIXME is
caused by LLDB searching for definitions by name (which also seems to miss the
function name so there is a conflict with the outer type).
Some more things that should be tested (and might not work):
* Local classes with member functions with local classes.
* Classes in different functions with same name.
* Classes with the same name in different TUs with internal linkage functions of
the same name.
* Empty classes are parsed by the DWARF parser in a fast path, so that requires
dedicated tests.
* Repeat some of the tested logic for C.
Finds base classes and structs whose destructor is neither public and
virtual nor protected and non-virtual.
A base class's destructor should be specified in one of these ways to
prevent undefined behaviour.
Fixes are available for user-declared and implicit destructors that are
either public and non-virtual or protected and virtual.
This check implements C.35 [1] from the CppCoreGuidelines.
Reviewed By: aaron.ballman, njames93
Differential Revision: http://reviews.llvm.org/D102325
[1]: http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rc-dtor-virtual
As discussed on the ticket, I'm intending to add additional 128->256 patterns when we have test coverage, but this addresses a known crash.
Differential Revision: https://reviews.llvm.org/D109434
This patch fixes register save/restore on expression call to also include SVE registers.
This will fix expression calls like:
re re p1
<Register Value P1 before expression>
p <var-name or function call>
re re p1
<Register Value P1 after expression>
In above example register P1 should remain the same before and after the expression evaluation.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D108739
OpenMP reductions need a neutral element, so we match some known reduction
kinds (integer add/mul/or/and/xor, float add/mul, integer and float min/max) to
define the neutral element and the atomic version when possible to express
using atomicrmw (everything except float mul). The SCF-to-OpenMP pass becomes a
module pass because it now needs to introduce new symbols for reduction
declarations in the module.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D107549
Allow variable number of directories, as allowed by the
specification. NumberOfRvaAndSize will default to 16 if not specified,
as in the past.
Reviewed by: jhenderson
Differential Revision: https://reviews.llvm.org/D108825
I can't seem to wrap my head around the proper fix here,
we should be fine without this requirement, iff we can form this form,
but the naive attempt (https://reviews.llvm.org/D106317) has failed.
So just to unblock the release, put up a restriction.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51125
https://reviews.llvm.org/D109156 did not properly update the case where
the equivalence symbol appearing in the common statement is the
"base symbol of an equivalence group" (this was the only case that previously
worked ok, and the patch broke it).
Fix this and add a test that actually uses this code path.
Differential Revision: https://reviews.llvm.org/D109439
For sve_fp_3op_p_zds_zx we have zero patterns downstream but the
intrinsic args can be added again if/when the patterns are implemented.
Identified in D109359.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D109429
Don't outline machine instructions which are using jump table indexes
since they are materialized as local labels (like the already handled
case of constant pools).
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D109436