This is useful when building a complete toolchain to ensure that CRT
is built after builtins but before the rest of the compiler-rt.
Differential Revision: https://reviews.llvm.org/D120682
Add support of parsing .csky_attribute directive and emit related target attributes in .csky.attribute section.
It does not emit attribute directive in assembly code, so only emit target attributes in ELF streamer.
In ELF streamer, it handles the header EFlag and the csky_attribute section which contains some attribute items.
The EFlag and attribute items are calculated from feature bits based on Subtarget.
This relands commit 7313474319.
It failed on Windows/Mac because `-fjmc` is only checked for ELF targets.
Check the flag unconditionally instead and issue a warning for non-ELF targets.
If the user disables de-globalization we did not seed the AAHeapToShared
and AAHeapToStack but we still could end up with them through in-flight
lookups. With this patch we disable AAHeapToShared completely if the
user disabled de-globalization. Heap-2-stack is still run though.
Differential Revision: https://reviews.llvm.org/D121059
An event pool, similar to the stream pool, needs to be kept per device.
For one, events are associated with cuda contexts which means we cannot
destroy the former after the latter. Also, CUDA documentation states
streams and events need to be associated with the same context, which
we did not ensure at all.
Differential Revision: https://reviews.llvm.org/D120142
There are two problems this patch tries to address:
1) We currently free resources in a random order wrt. plugin and
libomptarget destruction. This patch should ensure the CUDA plugin
is less fragile if something during the deinitialization goes wrong.
2) We need to support (hard) pause runtime calls eventually. This patch
allows us to free all associated resources, though we cannot
reinitialize the device yet.
Follow up patch will associate one event pool per device/context.
Differential Revision: https://reviews.llvm.org/D120089
Rewrite isInnermostAffineForOp utility to make it more direct/efficient.
Drop unnecessary check. NFC.
Differential Revision: https://reviews.llvm.org/D121170
This patch removes the unintended resolution of locally scoped absolute symbols
(which was causing unexpected definition errors).
It stops using the JITSymbolFlags::Absolute flag (it isn't set or used elsewhere,
and causes mismatch-flags asserts), and adds JITSymbolFlags::Exported to default
scoped absolute symbols.
Finally, we now set the scope of absolute symbols correctly in
MachOLinkGraphBuilder.
Use TII::getRegClass to return a valid regclass or a nullptr
if the RC is unknown for a given OpIdx. This fixes a potential
crash occurred while getting the RC from a variadic instruction.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D120813
When pre-initializing fields in the environment, the code assumed that all
fields of a struct would be initialized. However, given limits on value
construction, that assumption is incorrect. This patch changes the code to drop
that assumption and thereby avoid dereferencing a nullptr.
Differential Revision: https://reviews.llvm.org/D121158
The old command wrote to CWD, which doesn't always work, and if it
didn't, there was no workaround (and it crashed on failure). This
patch changed the setting to provide a directory to save the objects
to.
Differential Revision: https://reviews.llvm.org/D121036
When a module uses a derived type that is shadowed by a generic
interface, the module file was missing a USE statement for the
name. Detect and handle this situation.
Differential Revision: https://reviews.llvm.org/D121160
The paramemter of hint clause in OpenMP critical hint should be
non-negative. The omp_lock_hint_none is 0 in omp.h.
Reviewed By: Alexey Bataev
Differential Revision: https://reviews.llvm.org/D121101
This is to align with the PyTACO API better.
Modify an existing unit test to test the new routines.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121083
This adds JSON output to llvm-remark-size-diff.
The goal here is to make it easy for external tools to consume output from
llvm-remark-size-diff. These tools could be used for automated size analysis.
(E.g. in CI).
To specify JSON output, use `--report_style=json`. JSON output can be
pretty-printed via `--pretty`.
With automation in mind, the schema looks like this:
```
"Files": {
"A": <filename_a>
"B": <filename_b>
},
"InBoth": [
{
"FunctionName": <function name>,
"InstCount": [
<count_in_a>,
<count_in_b>
],
"StackSize": [
<count_in_a>,
<count_in_b>
]
},
...
]
"OnlyInA": [
{
"FunctionName": <function name>,
"InstCount": [
<count_in_a>,
0
],
"StackSize": [
<count_in_a>,
0
]
},
...
]
"OnlyInB": [
{
"FunctionName": <function name>,
"InstCount": [
0,
<count_in_b>
],
"StackSize": [
0,
<count_in_b>
]
},
...
]
```
A few notes:
- Filenames are included, because tools may want to combine many outputs
together in some way (a big JSON file, a big CSV, or something.)
- Counts are represented as [a, b] so that a diff can be calculated via b - a.
The original counts may be useful for size analysis (e.g. was this function
extremely large before?) and so both are preserved.
- `OnlyInA` and `OnlyInB` have a 0 for one of the counts always. This is to
make it easier for tools to share code between `OnlyInA`, `OnlyInB`, and
`InBoth`.
Differential Revision: https://reviews.llvm.org/D121173
When a structure constructor does not initialize an allocatable component,
ensure that the typed expression representation contains an explicit
NULL() for the component. Expression semantics already copies default
initialized expressions for nonallocatable components into structure
constructors. This change is expected to simplify lowering.
Differential Revision: https://reviews.llvm.org/D121162
This patch cleans up the interface to PresburgerSet. At a high level it does
the following changes:
- Move member functions around to have constructors at top and print/dump
at end.
- Move a private function to be a static function instead.
- Change member functions of type "getAllIntegerPolyhedron" to "getAllPolys"
instead.
- Improve documentation for PresburgerSet.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D121027
Rather than reading default character variables in formatted
input one byte at a time via NextInField(), skip and read
them via blocks of available buffer data. This eliminates
a bottleneck that affected reads of large character values.
(It also exposed a problem with sequential reads with RECL=
set on the OPEN statement, so that's fixed too.)
Differential Revision: https://reviews.llvm.org/D121144
Register constraint switched to "=q" which means very specifically (from
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
> Any register accessible as rl. In 32-bit mode, a, b, c, and d; in 64-bit
mode, any integer register.
Older gcc versions (8.x and below) were trying to use esi or edi for the
8 bit flag variable, but it wound up displaying this error in the end:
kmp_lock.cpp: In function ‘void __kmp_spin_backoff(kmp_backoff_t*)’:
kmp_lock.cpp:2684:1: error: unsupported size for integer register
Hence the correct restriction is "=q" instead of "=r".
Fixes: https://github.com/llvm/llvm-project/issues/53309
Differential Revision: https://reviews.llvm.org/D120519
In quantized comutation, there are casting ops around computation ops.
Reorder the ops to make reduce-to-contract actually work.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120760
This patch add the lowering for the allocate
and the deallocate statements.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D121146
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Saves one move in each case, which is basically nothing perf-wise;
this is more about simplifying the code.
Differential Revision: https://reviews.llvm.org/D121130
Currently, when instrumenting indirect calls, this uses
CallBase::getCalledFunction to determine whether a given callsite is
eligible.
However, that returns null if:
this is an indirect function invocation or the function signature
does not match the call signature.
So, we end up instrumenting direct calls where the callee is a bitcast
ConstantExpr, even though we presumably don't need to.
Use isIndirectCall to ignore those funky direct calls.
Differential Revision: https://reviews.llvm.org/D119594
If LIBCXX_ENABLE_SHARED isn't explicitly set on the cmake command
line, isn't set in the cache, and the libcxxabi project is configured
before libcxx, then LIBCXX_ENABLE_SHARED isn't defined yet. Once
the libcxx cmake project has been parsed, LIBCXX_ENABLE_SHARED would
have been set to its default value of ON.
This makes sure that the symbols are properly dllexported in such
a configuration scenario.
Differential Revision: https://reviews.llvm.org/D120982
This extension is a portability trap for users, since no other standard
library supports it. Furthermore, the Standard explicitly allows
implementations to reject std::allocator<cv T>, so allowing it is
really going against the current.
This was discovered in D120684: this extension required `const_cast`ing
in `__construct_range_forward`, a fishy bit of code that can be removed
if we don't support the extension anymore.
Differential Revision: https://reviews.llvm.org/D120996
A recent patch made it possible to emit more localized error messages
pertaining to actual arguments in non-intrinsic procedure references.
Use these new powers for good and make intrinsic error messages more
precise, too.
Differential Revision: https://reviews.llvm.org/D121126
Support ANSI escape codes for bright colors variants. Most modern
terminals support them. LLDB is not using them in any of its defaults,
but they're useful for people who want to modify their preferred ANSI
prefix/suffix.
Differential revision: https://reviews.llvm.org/D121131