Currently, we have support for SYCL 1.2.1 (also known as SYCL 2017).
This patch introduces the start of support for SYCL 2020 mode, which is
the latest SYCL standard available at (https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html).
This sets the default SYCL to be 2020 in the driver, and introduces the
notion of a "default" version (set to 2020) when cc1 is in SYCL mode
but there was no explicit -sycl-std= specified on the command line.
This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).
The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specifically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).
Recommitted with a fix for unwind info when i386 pc-rel thunks are
adjacent to a prologue.
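The record layout described above can be sketched as follows. This is only an illustration of the [AsyncCtx, %rbp, ReturnAddr] layout and the bit-60 marker; the struct and helper names are hypothetical and not part of the patch, and clearing the marker to recover the plain pointer value is an assumption about how an unwinder might consume the record.

```cpp
#include <cassert>
#include <cstdint>

// Bit 60 of the stored %rbp signals that the frame record is extended.
const std::uint64_t kAsyncFrameBit = std::uint64_t{1} << 60;

// Hypothetical view of the extended record, in memory order.
// %rbp points at the savedRbp slot, so asyncCtx sits just below it.
struct FrameRecord {
  std::uint64_t asyncCtx;   // swiftasync context pointer
  std::uint64_t savedRbp;   // caller's frame pointer, possibly tagged
  std::uint64_t returnAddr; // return address
};

// True if the stored frame pointer carries the async marker.
inline bool hasAsyncContext(std::uint64_t savedRbp) {
  return (savedRbp & kAsyncFrameBit) != 0;
}

// Assumption: a consumer clears the marker before following the chain.
inline std::uint64_t stripAsyncMarker(std::uint64_t savedRbp) {
  return savedRbp & ~kAsyncFrameBit;
}

// Reads the context slot; only meaningful when the marker is set.
inline std::uint64_t asyncContextOf(const FrameRecord &record) {
  assert(hasAsyncContext(record.savedRbp) && "no async context in record");
  return record.asyncCtx;
}
```

Because the marker lives in the stored value rather than in a separate slot, a consumer that does not know about it still finds the saved %rbp and return address at the usual offsets.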
We use `CHECK-LABEL: define` to divide the input stream into functions,
which works well on most platforms.
But some platforms (e.g. AIX) may generate different code, especially
for global constructors and destructors.
On AIX, the codegen has two more functions, __dtor_b and
__finalize_b, which make the test fail.
The fix is to use specific function names so that we can safely ignore
those unrelated codegen differences.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102654
A long time ago LLDB wanted to start using StringRef instead of
C-Strings/ConstString but was blocked by the StringRef(const char *) ctor
asserting that the C-string isn't a nullptr. To work around this, D24697
introduced a special function called withNullAsEmpty and that's what LLDB (and
only LLDB) started to use to build StringRefs from C-strings.
A bit later it seems that withNullAsEmpty was declared too awkward to use and
instead the assert in the StringRef constructor got removed (see D24904). The
rest of LLDB was then converted to StringRef by just calling the now perfectly
usable implicit constructor.
However, it seems that the original approach with withNullAsEmpty was never
touched again since then and now just exists as a function in StringRef that
is only used in a few places in LLDB.
I removed the few uses of withNullAsEmpty in D102597 and this patch removes
the function itself. Calling the implicit StringRef(const char *) constructor
is the preferred way of doing this today.
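As a rough illustration of the idea, here is a null-tolerant conversion written against std::string_view (C++17) rather than StringRef. The helper name viewOrEmpty is hypothetical; it only mimics the behaviour that made withNullAsEmpty redundant and is not LLVM's code.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: treat a null C-string as the empty string.
// Constructing std::string_view directly from a null pointer is undefined,
// which is the same kind of hazard the old StringRef assert guarded against.
inline std::string_view viewOrEmpty(const char *str) {
  return str ? std::string_view(str) : std::string_view();
}

// Usage: both calls are well-defined.
inline void viewOrEmptyDemo() {
  assert(viewOrEmpty(nullptr).empty());
  assert(viewOrEmpty("lldb") == "lldb");
}
```

Once the constructor itself handles the null case, a separate named function adds nothing, which is the motivation for removing it.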
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D102599
While I would like to keep the right value here,
I would also like to be able to actually compile
e.g. the vanilla test-suite.
256 is a pretty random guess; it should be good enough
for serious loops, but small enough to result in tolerable
compile times for certain edge cases.
https://bugs.llvm.org/show_bug.cgi?id=50384
The problem with debug mode tests is that it isn't known which particular
_LIBCPP_ASSERT causes the test to exit, and as shown by
https://reviews.llvm.org/D100029 and 2908eb20ba it might not be the
expected one.
The patch adds a TEST_LIBCPP_ASSERT_FAILURE macro that allows checking the
_LIBCPP_ASSERT message to ensure we caught the expected failure.
Reviewed By: Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D100595
The MinGW driver passed a hardcoded true to this parameter
since 6f4e255219, but when the MinGW driver got the
canExitEarly parameter for consistency in b11386f9be, this
call was missed so it wasn't passed on properly.
Noticed while investigating PR50364, the truncation costs for v4i64->v4i16/v4i8 and v8i32->v8i8 were way too optimistic for a shuffle sequence that usually matches the AVX1 codegen (they matched AVX512 numbers which have actual truncation instructions!).
That way, it's done only once instead of every time shouldExportSymbol() is
called.
Possibly a bit faster:
% ministat at_main at_symtodo
x at_main
+ at_symtodo
N Min Max Median Avg Stddev
x 30 3.9732189 4.114846 4.024621 4.0304692 0.037022865
+ 30 3.93766 4.0510042 3.9973931 3.991469 0.028472565
Difference at 95.0% confidence
-0.0390002 +/- 0.0170714
-0.967635% +/- 0.423559%
(Student's t, pooled s = 0.0330256)
In other runs with n=30 it makes no perf difference, so maybe it's just noise.
But being able to quickly and conveniently answer "is this symbol exported?"
is useful for fixing PR50373 and for implementing -dead_strip, so this seems
like a good change regardless.
No behavior change.
Differential Revision: https://reviews.llvm.org/D102661
This fixes the initialization of objects in the __constant
address space that occurs when declaring the object.
Fixes part of PR42566
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102248
This allows tests to detect whether to run or not, dependent on which
LLD version is required for the test.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D101997
This patch stops lit from looking on the PATH for clang, lld and other
users of use_llvm_tool (currently only the debuginfo-tests) unless the
call explicitly requests to opt into using the PATH. When not opting in,
tests will only look in the build directory.
See the mailing list thread starting from
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150421.html.
See the review for details of why decisions were made about when still
to use the PATH.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D102630
Passing template parameter packs to std::map doesn't work in VS 2017/2019, so this updates the preprocessor version check to use an alternate version in VS2019, as well.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D102260
Where the RVV specification writes `vs2, vs1`, our TableGen patterns use
`rs1, rs2`. These differences can easily cause confusion. The VMANDNOT
instruction performs `LHS && !RHS`, and similarly for VMORNOT.
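A minimal sketch of the two mask operations, modelling a mask register as a 64-bit integer with one bit per lane. The function names are hypothetical and operand naming follows the LHS/RHS wording above rather than either register-naming scheme.

```cpp
#include <cassert>
#include <cstdint>

// Each bit is one mask lane. Result lane = LHS && !RHS.
inline std::uint64_t maskAndNot(std::uint64_t lhs, std::uint64_t rhs) {
  return lhs & ~rhs;
}

// Result lane = LHS || !RHS.
inline std::uint64_t maskOrNot(std::uint64_t lhs, std::uint64_t rhs) {
  return lhs | ~rhs;
}

// Usage on four lanes: LHS = 1100, RHS = 1010.
inline void maskDemo() {
  assert(maskAndNot(0xC, 0xA) == 0x4);        // 1100 & 0101 = 0100
  assert((maskOrNot(0xC, 0xA) & 0xF) == 0xD); // 1100 | 0101 = 1101
}
```

Neither operation is commutative, which is why a swapped operand order in a pattern silently changes the result.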
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102606
This allows cast/dyn_cast'ing from VPUser to recipes. This is needed
because there are VPUsers that are not recipes.
Reviewed By: gilr, a.elovikov
Differential Revision: https://reviews.llvm.org/D100257
In the case of TypedefDecls we set the DeclContext after we imported it.
It turns out, it could lead to null pointer dereferences during the
cleanup part of a failed import.
This patch demonstrates this issue and fixes it by checking if the
DeclContext is available or not.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D102640
A long time ago LLDB wanted to start using StringRef instead of
C-Strings/ConstString but was blocked by the fact that the StringRef constructor
that takes a C-string was asserting that the C-string isn't a nullptr. To
work around this, D24697 introduced a special function called `withNullAsEmpty`
and that's what LLDB (and only LLDB) started to use to build StringRefs from
C-strings.
A bit later it seems that `withNullAsEmpty` was declared too awkward to use and
instead the assert in the StringRef constructor got removed (see D24904). The
rest of LLDB was then converted to StringRef by just calling the now perfectly
usable implicit constructor.
However, all the calls to `withNullAsEmpty` remained and are now just
strange artefacts in the code base that look out of place. It's also
curiously an LLDB-exclusive function and no other project has called it since
its introduction half a decade ago.
This patch removes all uses of `withNullAsEmpty`. The follow up will be to
remove the function from StringRef.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D102597
This patch introduces a new class, MaxVFCandidates, that holds the
maximum vectorization factors that have been computed for both scalable
and fixed-width vectors.
This patch is intended to be NFC for fixed-width vectors, although
considering a scalable max VF (which is disabled by default) pessimises
tail-loop elimination, since it can no longer determine if any chosen VF
(less than fixed/scalable MaxVFs) is guaranteed to handle all vector
iterations if the trip-count is known. This issue will be addressed in
a future patch.
Reviewed By: fhahn, david-arm
Differential Revision: https://reviews.llvm.org/D98721
Override __cxa_atexit and ignore callbacks.
This prevents crashes in a configuration when the symbolizer
is built into sanitizer runtime and consequently into the test process.
LLVM libraries have some global objects destroyed during exit,
so if the test process triggers any bugs after that, the symbolizer crashes.
An example stack trace of such a crash:
For the standalone llvm-symbolizer this does not hurt,
we just don't destroy a few global objects on exit.
Reviewed By: kda
Differential Revision: https://reviews.llvm.org/D102470
This patch makes g_executables a private member of the Runtime class
and renames it to HSAExecutables, following the LLVM naming convention.
This move required making Runtime::Initialize and Runtime::Finalize
non-static. Verified the correctness of this change by running
libomptarget tests on gfx906.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102600
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds support for Hardware Exceptions for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only the X86_64 target is enabled in this patch.
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE: Without -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of a _try region, i.e., no "potential
faulty instruction" can be moved across a _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
outside of _try region must be updated in memory (not just in register)
before the subsequent exception occurs.
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on the C++
side. When a C++ function (in the same compilation unit with option -EHa) is
called by a SEH C function, a hardware exception that occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as for a C++ synchronous exception during the
unwinding process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
to be added on memory/computation instructions (the previous iload/istore idea) so
that the exception path is modeled precisely in the flow graph. However, tracking
every single memory instruction and potentially faulty instruction can create many
Invokes, complicate the flow graph and possibly result in a negative performance
impact on downstream optimization and code generation. Making all
optimizations aware of the new semantics is also a substantial effort.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block inherits different states from its predecessors, the lowest
State wins over the others. If some exits flow to unreachable, propagation on those
paths terminates, not affecting the remaining blocks.
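The propagation rule can be sketched with a small worklist over a toy CFG. This is a simplified illustration, not the WinEHPrepare implementation: the Block type, the exitState field (standing in for a side exit that has already unwound to a parent state) and the function name are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical CFG block. If exitState is set, the block's outgoing edges
// carry that state instead of the block's own state; this models a side exit
// that has unwound to a parent scope.
struct Block {
  std::vector<std::size_t> succs;
  std::optional<int> exitState;
};

// Propagates state numbers from the single entry through control flow.
// When predecessors disagree, the lowest state wins. Blocks that are never
// reached keep no state. States only ever decrease, so the loop terminates.
inline std::vector<std::optional<int>>
propagateStates(const std::vector<Block> &cfg, std::size_t entry,
                int entryState) {
  assert(entry < cfg.size() && "entry block out of range");
  std::vector<std::optional<int>> state(cfg.size());
  std::deque<std::size_t> worklist;
  state[entry] = entryState;
  worklist.push_back(entry);
  while (!worklist.empty()) {
    std::size_t block = worklist.front();
    worklist.pop_front();
    int outgoing = cfg[block].exitState.value_or(*state[block]);
    for (std::size_t succ : cfg[block].succs) {
      if (!state[succ] || outgoing < *state[succ]) {
        state[succ] = outgoing; // lower state replaces a higher one
        worklist.push_back(succ);
      }
    }
  }
  return state;
}
```

A block with no successors, such as one ending in unreachable, simply adds nothing to the worklist, so propagation along that path stops without affecting other blocks.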
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation described below.
Two intrinsics are created to track CPP object scopes: eha_scope_begin() and eha_scope_end().
_scope_begin() is added immediately after ctor() is called and EHStack is pushed,
so it must be an invoke, not a call. With that it's also guaranteed that an
EH-cleanup-pad is created regardless of whether there is a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in the downstream code gen pass, even in the presence of
ctor/dtor inlining.
Two intrinsics, seh_try_begin() and seh_try_end(), are added for C-code to mark
the _try boundary and to prevent exceptions from being moved across it.
All memory instructions inside a _try are considered 'volatile' to ensure the
2nd and 3rd rules for C-code above. This is slightly suboptimal, but it's
acceptable as the amount of code directly under _try is very small.
Part-2 (will be in Part-2 patch): LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
relies on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate the LLVM Windows EH
implementation, in which the address in the IPToState table is offset by +1
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address).
The handler for catch(...) for -EHa must handle HW exceptions, so its
'adjective' flag is reset (it cannot be IsStdDotDot (0x40), which only catches
C++ exceptions).
Suppress the push/popTerminate() scope (from noexcept/nothrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D80344/new/
This initial patch removes some unused variables from the global namespace.
There will be more incoming patches for moving global variables to classes
or static members.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102598
This diff changes the type of the argument of isCodeSection to const InputSection *.
NFC.
Test plan: make check-lld-macho
Differential revision: https://reviews.llvm.org/D102664
This change tries to handle multiple dominating users of the pointer operand
by choosing the most immediately dominating one, if possible. While making
this change I also found that the previous implementation had a missing break
statement, making all loads with an odd number of dominating users emit an
OtherAccess value, so that has also been fixed.
Patch by Henrik G Olsson!
Differential Revision: https://reviews.llvm.org/D79097
The main motivation for this refactor is to remove the subclass
relationship between the InputSegment and MergeInputSegment and
SyntheticMergedInputSegment so that we can use the merging classes for
debug sections which are not data segments.
In the process of refactoring I also remove all the virtual functions
from the class hierarchy and try to reuse techniques used in the ELF
linker (see `lld/ELF/InputSections.h`).
Differential Revision: https://reviews.llvm.org/D102546
This reverts commit 6d3e3ae8a9.
Still seeing PPC build bot failures, and one arm self host bot failing. I'm officially stumped, and need help from a bot owner to reduce.
During inlining of a call site with a deoptimize intrinsic callee, we miss
attributes set on this call site. As a result, attributes like deopt-lowering
disappear, resulting in inefficient behavior of the register allocator in codegen.
Just copy attributes for the deoptimize call like we do for other calls.
Reviewers: reames, apilipenko
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D102602
Only supported with -polly-position=early. Unfortunately, the
extension point callback for VectorizerStart only passes a
FunctionPassManager, making it impossible to add a module pass.