This allows for other implementations to define their own version of `Thread::Init`.
This will be the case for Fuchsia where much of the thread initialization can be
broken up between different thread hooks (`__sanitizer_before_thread_create_hook`,
`__sanitizer_thread_create_hook`, `__sanitizer_thread_start_hook`). Namely, setting
up the heap ring buffer and stack info and can be setup before thread creation.
The stack ring buffer can also be setup before thread creation, but storing it into
`__hwasan_tls` can only be done on the thread start hook since it's only then we
can access `__hwasan_tls` for that thread correctly.
Differential Revision: https://reviews.llvm.org/D104248
This is *not* user-defined derived type I/O, but rather Fortran's
built-in capabilities for using derived type data in I/O lists
and NAMELIST groups.
This feature depends on having the derived type description tables
that are created by Semantics available, passed through compilation
as initialized static objects to which pointers can be targeted
in the descriptors of I/O list items and NAMELIST groups.
NAMELIST processing now handles component references on input
(e.g., "&GROUP x%component = 123 /").
The C++ perspectives of the derived type information records
were transformed into proper classes when it was necessary to add
member functions to them.
The code in Semantics that generates derived type information
was changed to emit derived type components in component order,
not alphabetic order.
Differential Revision: https://reviews.llvm.org/D104485
The `icf` command-line option is not present in ld64, so it should use the LLD option syntax, which begins with double dashes and separates primary option from any suboption with the equal sign.
Differential Revision: https://reviews.llvm.org/D104548
The output of the object file is unimportant and entirely discarded.
Simply redirect the output to `/dev/null` or `NUL` as the case may be.
Additionally, the space between the labels is unimportant. There is no
need to add space between the labels. Two labels at the same address
are sufficient to generate the difference expression and should still
test the same behaviour.
The default callback instrumentation in x86 LAM mode uses ASLR bits
to randomly choose a tag, and thus has a 1/64 chance of choosing a
stack tag of 0, causing stack tests to fail intermittently. By using
__hwasan_generate_tag to pick tags, we guarantee non-zero tags and
eliminate the test flakiness.
aarch64 doesn't seem to have this problem using thread-local addresses
to pick tags, so perhaps we can remove this workaround once we implement
a similar mechanism for LAM.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104470
This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().
Reviewed By: vsavchenko
Differential Revision: https://reviews.llvm.org/D103025
Given an invalid SourceLocation, translateSourceLocation will call
clang_getNullLocation, and then do nothing with the result. But
clang_getNullLocation has no side effects: it just constructs and
returns a null CXSourceLocation value.
Surely the intention was to //return// that null CXSourceLocation to
the caller, instead of throwing it away and pressing on anyway.
Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D104442
This partially reverts 838490de7e, which broke some Solaris bots. Apparently
Solaris defines int8_t as char rather than signed char, which made the
SerializationTypeName<char> specialization a redefinition.
This partial revert isolates use of uint8_t buffers to ORC-RPC handling of
wrapper functions only. The TargetProcessControl::runWrapper method will
continue to use char buffers.
This implements a more comprehensive fix than was done at D95409.
Instead of excluding just function pointer subobjects, we also
exclude any user-defined function pointer conversion operators.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103855
Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.
An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.
This patch handles one particular case of one-iteration loops for which SCEV
cannot straightforwardly prove BECount = 1. The idea of the optimization is to
symbolically execute conditional branches on the 1st iteration, moving in topoligical
order, and only visiting blocks that may be reached on the first iteration. If we find out
that we never reach header via the latch, then the backedge can be broken.
This implementation uses InstSimplify. SCEV version was rejected due to high
compile time impact.
Differential Revision: https://reviews.llvm.org/D102615
Reviewed By: nikic
Introduce the execute_region op that is able to hold a region which it
executes exactly once. The op encapsulates a CFG within itself while
isolating it from the surrounding control flow. Proposal discussed here:
https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282
execute_region enables one to inline a function without lowering out all
other higher level control flow constructs (affine.for/if, scf.for/if)
to the flat list of blocks / CFG form. It thus allows the benefit of
transforms on higher level control flow ops available in the presence of
the inlined calls. The inlined calls continue to benefit from
propagation of SSA values across their top boundary. Functions won’t
have to remain outlined until later than desired. Abstractions like
affine execute_regions, lambdas with implicit captures could be lowered
to this without first lowering out structured loops/ifs or outlining.
But two potential early use cases are of: (1) an early inliner (which
can inline functions by introducing execute_region ops), (2) lowering of
an affine.execute_region, which cleanly maps to an scf.execute_region
when going from the affine dialect to the scf dialect.
Differential Revision: https://reviews.llvm.org/D75837
InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.
Differential Revision: https://reviews.llvm.org/D104193
Don't rely on volatile writes to keep the CPU busy - it seems MSVC
optimizes them out, so we don't get different values for 'start' and
'end' on Windows. Rewrite the test to loop until we get a different
value for 'end'.
Fix suggested by Michael Kruse in
https://reviews.llvm.org/rG57e85622bbdb2eb18cc03df2ea457019c58f6912#inline-6002
Committing to fix the Windows buildbot, post-commit comments welcome!
This patch adds an optional PriorityInlineOrder, which uses the heap to order inlining.
The callsite which size is smaller would have a higher priority.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D104028
Polly uses algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring the method `IslAstInfo::getBuild()`
- `IslAstInfo::IslAstUserPayload.Build` now uses C++ types instead of C types
- Removing destructor of `IslAstInfo::IslAstUserPayload`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104370
This looks like not a practical pattern in our codebase (it could fail
in some sandbox environement).
Instead we print it via standard output, and it is controled by the
-attributor-print-call-graph, this follows a similiar pattern of attributor-print-dep.
Allow mangled names to include an arbitrary dot suffix, akin to vendor
specific suffix in Itanium mangling.
Primary motivation is a support for symbols renamed during ThinLTO
import / promotion (ThinLTO is the default configuration for optimized
builds in rustc).
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104358
InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.
Differential Revision: https://reviews.llvm.org/D104193
Commit 4c7f820b2b changed the llvm.powi intrinsic to support
different 'int' sizes for the exponent. That happened to break
the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which
obviously should have been updated as part of commit 4c7f820b2b
(https://reviews.llvm.org/D99439).
The shortcoming was found by buildbots that use
-DPOLLY_ENABLE_GPGPU_CODEGEN=ON
This patch should fixup the problem.
This reverts commit 76d0747e08.
If a group has `__llvm_prf_vals` due to static value profiler counters
(`NS!=0`), we cannot make `__llvm_prf_data` private, because a prevailing text
section may reference `__llvm_prf_data` and will cause a `relocation refers to a
discarded section` linker error.
Note: while a `__profc_` group is non-prevailing, it may be referenced by a
prevailing text section due to inlining.
```
group section [ 66] `.group' [__profc__ZN5clang20EmitClangDeclContextERN4llvm12RecordKeeperERNS0_11raw_ostreamE] contains 4 sections:
[Index] Name
[ 67] __llvm_prf_cnts
[ 68] __llvm_prf_vals
[ 69] __llvm_prf_data
[ 70] .rela__llvm_prf_data
```
This should fix PR50683. The wrong assumption was that we
could always know what the callee is when we replace a call site
argument with undef. We wanted to know that to remove the `noundef`
that might be attached to the argument. Since no callee means we
did the propagation on the caller site, there is no need to remove
an attribute. It is only needed if we replace all uses and therefore
pass `undef` instead of the value that was passed in otherwise.
Users might want to run initialize for a set of AAs without an
intermediate update step. Running update eagerly is not a requirement
anyway so we make it optional.
To allow outside AAs that simplify values we need to ensure all value
simplification goes through the Attributor, not AAValueSimplify (or any
of the other AAs we have already like AAPotentialValues). This patch
also introduces an interface for the outside AAs to register
simplification callbacks for an IRPosition. To make this work as
expected we have to pass IRPositions instead of Values in
AAValueSimplify, which makes sense by itself.
If we simplify values we sometimes end up with type mismatches. If the
value is a constant we can often cast it though to still allow
propagation. The logic is now put into a helper and it replaces some
ad hoc things we did before.
This also introduces the AA namespace for abstract attribute related
functions and types.
Differential Revision: https://reviews.llvm.org/D103856
If the target stack is not accessible between different running
"threads" we have to make sure not to create allocas for mallocs
that might be used by multiple "threads". The "use check" is
sufficient to prevent this but if we apply the "free check" we have
to make sure the pointer is not communicated to others before
the free is reached.
Differential Revision: https://reviews.llvm.org/D98608
The initial use for AAExecutionDomain was to determine if a single
thread executes a block. While this is sometimes informative most
of the time, and for other reasons, we actually want to know if it
is the "initial thread". Thus, the thread that started execution on
the current device. The deduction needs to be adjusted in a follow
up as the methods we use right not are looking for the OpenMP thread
id which is resets whenever a thread enters a parallel region. What
we basically want is to look for `llvm.nvvm.read.ptx.sreg.ntid.x` and
equivalent functions.
We invalidated AAReachabilityImpl directly which is not helpful and
confusing as we still used it regardless. We now avoid invalidating it
(not needed anyway) and add checks for the state. This has by itself no
actual effect but prepares for later extensions.
The current naming scheme adds the `dfs$` prefix to all
DFSan-instrumented functions. This breaks mangling and prevents stack
trace printers and other tools from automatically demangling function
names.
This new naming scheme is mangling-compatible, with the `.dfsan`
suffix being a vendor-specific suffix:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-structure
With this fix, demangling utils would work out-of-the-box.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104494