The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.
Differential Revision: https://reviews.llvm.org/D128830
This test simply checks whether we can print an optimized
function argument. With recent changes to Clang the assumption
that we don't generate a `DW_AT_location` attribute for the
unused funciton parameter breaks.
This patch tries harder to get Clang to drop the location
from DWARF by making it generate an `undef` for `unused1`.
Drop the check for `unused2` since it adds no benefit.
Differential Revision: https://reviews.llvm.org/D132635
Match Clang's sorting, so that longer (more specific) prefix paths will match
before less specific paths.
Reviewed By: MaskRay, raj.khem, #debug-info
Differential Revision: https://reviews.llvm.org/D132390
The invocation is only ever used to serialize cc1 arguments from, so
instead serialize the arguments inside the dep scanner to simplify the
interface.
Differential Revision: https://reviews.llvm.org/D132616
ToolInvocation is useful even if you already have a -cc1 invocation,
since it provides a standard way to setup diagnostics, parse arguments,
and handoff to a ToolAction. So teach it to support -cc1 commands by
skipping the driver bits.
Differential Revision: https://reviews.llvm.org/D132615
This patch complete TODO left in D66965, and achieve
related pattern for bitreverse.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D132431
D132080 introduced a bug leading to `RegisterClassInfo` caches not
getting invalidated when there was exactly one more CSR register added.
Differential Revision: https://reviews.llvm.org/D132606
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132051
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
The goal is to separate collecting items for post-processing
and processing them. Post processing also outlined as
dedicated method.
Differential Revision: https://reviews.llvm.org/D132603
The element type enum is not needed to differentiate dense array kinds
because the element type of the shaped type can be used instead.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D132535
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049
TLite is a lightweight, statically linkable[1], model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.
We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.
Using TFLite solves the following problems compared to using the full TF
C API:
- a compiler-friendly implementation for runtime-loadable (as opposed
to AOT-embedded) models: it's statically linked; it can be built via
cmake;
- solves an issue we had when building the compiler with both AOT and
full TF C API support, whereby, due to a packaging issue on the TF
side, we needed to have the pip package and the TF C API library at
the same version. We have no such constraints now.
The main liability is it supporting a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert back to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).
Details:
This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.
The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.
The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.
[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
I'd used the wrong signatures for these as I had not remembered we'd added the second boolean argument. We appear to still parse the old form just fine, but we should use the two argument form in test since that's what the LangRef actually describes.
that can lead to security vulnerabilities
Also, fix a few places that were causing -Wshadow and
-Wformat-nonliteral warnings to be emitted.
This reapplies the patch that was reverted in 0d66dc57e8 because it
broke a few bots.
I made changes so that cmake checks whether some of the flags are
supported by the compiler that is used before adding them to the list.
Also, I moved function add_security_warnings to CompilerRTUtils.cmake so
that it is defined before it's used.
Differential Revision: https://reviews.llvm.org/D131714
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Relands 67504c9549 with a fix for
32-bit builds.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
If the input to a tosa.slice operation is a splat we can just replace with
another splat. If the result is a single element, replacing with a splat
is universally useful.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132499
The runtime function was implemented and tested and the
specifier was handled, but it seems lowering was never
specifically added.
Differential Revision: https://reviews.llvm.org/D131814
Added folders for tosa.add that handles bypassing add-zero,
fold additions of two splat tensors, and additions between
two tensors with small values.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D132272
This test helped me catch an issue in my implementation of chained
fixups (D132560). This commit makes the test more robust by checking if
the bind opcodes are emitted for the correct address.
Differential Revision: https://reviews.llvm.org/D132367
Compile fix for Microsoft Visual Studio 17.3 (msvc 14.33.31629) which regressed from 17.1.
The compile error is:
```
llvm-project\flang\lib\Evaluate\fold-integer.cpp(802,41): error C2672: 'invoke': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include\type_traits(1552,19): message : could be 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
llvm-project\flang\lib\Evaluate\fold-integer.cpp(802,41): message : Failed to specialize function template 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include\type_traits(1552): message : see declaration of 'std::invoke'
llvm-project\flang\lib\Evaluate\fold-integer.cpp(490,1): message : With the following template arguments:
llvm-project\flang\lib\Evaluate\fold-integer.cpp(490,1): message : '_Callable=int (__cdecl Fortran::evaluate::value::Integer<16,true,16,unsigned short,unsigned int>::* &)(void) const'
llvm-project\flang\lib\Evaluate\fold-integer.cpp(490,1): message : '_Ty1=const Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short> &'
llvm-project\flang\lib\Evaluate\fold-integer.cpp(490,1): message : '_Types2={}'
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include\type_traits(1546,19): message : or 'unknown-type std::invoke(_Callable &&) noexcept(<expr>)'
C:\Users\meinersbur\src\llvm-project\flang\lib\Evaluate\fold-integer.cpp(802,41): message : 'unknown-type std::invoke(_Callable &&) noexcept(<expr>)': expects 1 arguments - 2 provided
```
For some reason, msvc thinks that the lambda argument is `Scalar<T>` instead of `Scalar<TI>` as declared. This only happens in nested closures, using a lambda without `[]` arguments makes the problem disappear. Using `auto` instead to automatically derive the type fixes the problem.
I recently updated the version of Visual Studio on [[ https://lab.llvm.org/buildbot/#/builders/172 | flang-x86_64-windows buildbot ]] which since then fails because of this problem. It occasionally failed with 'fatal error C1001: Internal compiler error." internal error which I hoped to fix with an update.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D132594
Bionic's <sys/procfs.h> defines the necessary symbols. Remove the
specialization for Android and the now-unnecessary include of
<sys/ptrace.h>. This also helps resolve issues when building the
x86/x86_64 lldb-server for Android.
Curiously, the default branch to include <sys/procfs.h> doesn't seem
necessary on Linux. I'll remove it and add it back if it breaks other
builders.
Differential Revision: https://reviews.llvm.org/D132514
This patch adds a formatter for `std::coroutine_handle`, both for libc++
and libstdc++. For the type-erased `coroutine_handle<>`, it shows the
`resume` and `destroy` function pointers. For a non-type-erased
`coroutine_handle<promise_type>` it also shows the `promise` value.
With this change, executing the `v t` command on the example from
https://clang.llvm.org/docs/DebuggingCoroutines.html now outputs
```
(task) t = {
handle = coro frame = 0x55555555b2a0 {
resume = 0x0000555555555a10 (a.out`coro_task(int, int) at llvm-example.cpp:36)
destroy = 0x0000555555556090 (a.out`coro_task(int, int) at llvm-example.cpp:36)
}
}
```
instead of just
```
(task) t = {
handle = {
__handle_ = 0x55555555b2a0
}
}
```
Note, how the symbols for the `resume` and `destroy` function pointer
reveal which coroutine is stored inside the `std::coroutine_handle`.
A follow-up commit will use this fact to infer the coroutine's promise
type and the representation of its internal coroutine state based on
the `resume` and `destroy` pointers.
The same formatter is used for both libc++ and libstdc++. It would
also work for MSVC's standard library, however it is not registered
for MSVC, given that lldb does not provide pretty printers for other
MSVC types, either.
The formatter is in a newly added `Coroutines.{h,cpp}` file because there
does not seem to be an already existing place where we could share
formatters across libc++ and libstdc++. Also, I expect this code to grow
as we improve debugging experience for coroutines further.
**Testing**
* Added API test
Differential Revision: https://reviews.llvm.org/D132415
When Clang encounters `@import M.Private` during implicit build, it precompiles module `M` and looks through its submodules. If the `Private` submodule is not found, Clang assumes `@import M_Private`. In the dependency scanner, we don't capture the dependency on `M`, since it's not imported. It's an affecting module, though: compilation of the import statement will fail when implicit modules are disabled and `M` is not precompiled and explicitly provided. This patch fixes that.
Depends on D132430.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D132502
Rename/generalize getEnclosingAffineForIfOps -> getEnclosingAffineOps.
The utility was originally written only for affine.for ops and then
extended for affine.if as well. It wasn't however updated when
affine.parallel was introduced -- in most cases, analysis has been used
in the presence of affine.for/if but not post parallelization. Extend
utility to also support affine.parallel ops; this allows future changes
to enable affine analysis and opts in the presence of affine.parallel
ops. Fix related stale comments.
This is NFC for all use cases in tree.
Change an associated assert to a utility failure.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D132326
Add the remaining pieces to support IO for noncontigous formats.
This is done by passing an array descriptor to IO calls. Scalar
formats continue to pass string and length arguments. IO calls
with formats are modified to place the new format descriptor
argument directly after the original string and length arguments.
The patch adds support for OpenCL and SPIR-V built-in functions.
Their detection and properties are implemented using TableGen.
Five tests are added to demonstrate the improvement.
Differential Revision: https://reviews.llvm.org/D132024
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>