This is a patch that allows isGuaranteedNotToBeUndefOrPoison to return more precise result
when an argument is given, by looking through its uses at the entry block (and following blocks as well, if it is checking poison only).
This is useful when there is a function call with noundef arguments at the entry block.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D88207
Recently https://reviews.llvm.org/D88103 introduced a nice API for
converting a JSON object into C++ types, which include nice error
messaging.
I'm using that new functioniality to perform the parsing in a much more
elegant way. As a result, the code looks simpler and more maintainable,
as we aren't parsing anymore individual fields manually.
I updated the test cases accordingly.
Differential Revision: https://reviews.llvm.org/D88264
Passing them directly is likely to be non-conforming, since it usually
involves copying the bytes of the record. For unknown architectures, we
don't know what MSVC does or will do, but we should at least try to
conform as well as we can.
Regardless of the target architecture, we should always use the C rules
(RAA_Default) for records that "canBePassedInRegisters". Those are
trivially copyable things, and things marked with [[trivial_abi]].
This should be NFC, although it changes where the final decision about
x86_32 overaligned records is made. The current x86_32 C rules say that
overaligned things are passed indirectly, so there is no functional
difference.
This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -
* Empirically shown to improve icache and itlb metrics by 3-5%
(absolute) compared to placing split parts in .text.unlikely.
* Mitigates against poor profiles, eg samplePGO profiles used with the
machine function splitter. Optimizations such as hugepage remapping can
make different decisions at the section granularity.
* Enables section granularity hotness monitoring (checking on the
decisions made during compilation vs sample data from production).
Differential Revision: https://reviews.llvm.org/D87813
Summary:
This patch add support for printing analysis messages relating to data
globalization on the GPU. This occurs when data is shared between the
threads in a GPU context and must be pushed to global or shared memory.
Reviewers: jdoerfert
Subscribers: guansong hiraditya llvm-commits ormris sstefan1 yaxunl
Tags: #OpenMP #LLVM
Differential Revision: https://reviews.llvm.org/D88243
".text.split." holds symbols which are split out from functions in
other input sections. For example, with -fsplit-machine-functions,
placing the cold parts in .text.split instead of .text.unlikely mitigates
against poor profile inaccuracy. Techniques such as hugepage remapping can
make conservative decisions at the section granularity.
Differential Revision: https://reviews.llvm.org/D87840
Introduce a helper which can be used to update the debug location of an
Instruction after the instruction is hoisted. This can be used to safely
drop a source location as recommended by the docs.
For more context, see the discussion in https://reviews.llvm.org/D60913.
Differential Revision: https://reviews.llvm.org/D85670
Emscripten's longjump and exception mechanism depends on two global variables,
`__THREW__` and `__threwValue`, which are changed to be defined as thread-local
in https://github.com/emscripten-core/emscripten/pull/12056. This patch updates
the corresponding code in the WebAssembly backend to properly declare these
globals as thread-local as well.
Differential Revision: https://reviews.llvm.org/D88262
constructors.
This changes the code to avoid using constructor homing for aggregate
classes and classes with trivial default constructors, instead of trying
to loop through the constructors.
Differential Revision: https://reviews.llvm.org/D87808
Until then, this one line fix removes the assert fail with basic block sections
with debug info. Bug tracking this: #47549
This fix does not generate loc list or DW_AT_const_value if the argument is
mentioned in a different section than the start of the function.
Temporarily fixes bugzilla : https://bugs.llvm.org/show_bug.cgi?id=47549
Differential Revision: https://reviews.llvm.org/D87787
I have long complained that while we have exhaustive tests
for ConstantRange, they are, uh, not good.
The approach of groking our own constant range
via exhaustive enumeration is, mysterious.
It neither tells us without doubt that the result is
conservatively correct, nor the precise match to the ConstantRange
result tells us that the result is precise.
But yeah, it's fast, i give it that.
In short, there are three things that we need to check:
1. That ConstantRange result is conservatively correct
2. That ConstantRange range is reasonable
3. That ConstantRange result is reasonably precise
So let's not just check the middle one, but all three.
This provides precision test coverage for D88178.
When processing PHI nodes after a callbr, we need to make sure that the
PHI nodes on the default branch are resolved after the callbr
(inserted after INLINEASM_BR). The PHI node values on the indirect
branches are processed before the INLINEASM_BR.
Differential Revision: https://reviews.llvm.org/D86260
This is needed to compile some headers in CUDA-11 that assume that threadIdx is
implicitly convertible to dim3. With NVCC, threadIdx is uint3 and there's
dim3(uint3) constructor. Clang uses a special type for the builtin variables, so
that path does not work. Instead, this patch adds conversion function to the
builtin variable classes. that will allow them to be converted to dim3 and uint3.
Differential Revision: https://reviews.llvm.org/D88250
This change adds the support for __builtin_return_address
for ARMv8.3A Pointer Authentication.
Location of the authentication code in the pointer depends on
the system configuration, therefore a dedicated instruction is used for
effectively removing the authentication code without
authenticating the pointer.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D75044
This patch moves the memory space field from MemRefType and UnrankedMemRefType
to their base class BaseMemRefType so that it can be retrieved from it without
downcasting it to the specific memref.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D87649
This is triggered during serialization. The test is for modules, but
will occur for any serialization effort using asm goto.
Reviewed By: nickdesaulniers, jyknight
Differential Revision: https://reviews.llvm.org/D88195
The following crashes on my system before this patch, but not after:
void foo(int i) {
switch (i) {
case 1:
case 2:
... 100000 cases ...
;
}
}
clang-query -c="match stmt(hasAncestor(stmt()))" deep.c
I'm not sure it's actually a sane testcase to run though, it's pretty slow :-)
Differential Revision: https://reviews.llvm.org/D88222
AIX by default usually folds 32-bit & 64-bit arch libraries into a single
archive, a behaviour we may want for the runtime libraries in the future,
so we don't necessarily want to opt into the multlib layout introduce in
D45604, which is currently the default for runtime builds.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D88169
Need to fix a check for the variable if it is declared in the inner
OpenMP region to be able to firstprivatize it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D88240
`flang-new` depends on libclangFrontend (it uses DiagnosticConsumer
classes from there). This patch adds the missing dependency in CMake.
clang::TextDiagnosticBuffer is only reported as missing when compiling
`flang-new` with BUILD_SHARED_LIBS=ON. This symbol is linked in
statically with libflangFrontend when BUILD_SHARED_LIBS=OFF.
This introduces an analysis pass that wraps IRSimilarityIdentifier,
and adds a printer pass to examine in what function similarities are
being found.
Test for what the printer pass can find are in
test/Analysis/IRSimilarityIdentifier.
Reviewed by: paquette, jroelofs
Differential Revision: https://reviews.llvm.org/D86973
This reverts commit c4bacc3c9b.
Test "LLVM :: ThinLTO/X86/funcimport-stats.ll" is failing. Reverting now
and will recommit after making the test not fail with the added stats.
This pass converts shape.cstr_* ops to eager (side-effecting)
error-handling code. After that conversion is done, the witnesses are
trivially satisfied and are replaced with `shape.const_witness true`.
Differential Revision: https://reviews.llvm.org/D87941
Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like `stat` calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.
Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.
rdar://problem/55715134
Reviewed By: aprantl, bruno
Differential Revision: https://reviews.llvm.org/D86895
This allows to point to an executable that isn't named exactly
"llvm-symbolizer" and not necessarily in the current PATH.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D88192
As mentioned in the bug report, tryEmitPrivate chokes on the
MaterializeTemporaryExpr in the reproducers, since it assumes that if
there are elements, than it must be a ConstantArrayType. However, the
MaterializeTemporaryExpr (which matches exactly the AST when it is NOT a
global/static) has an incomplete array type.
This changes the section where the number-of-elements is non-zero to
properly handle non-CAT types by just extracting it as an array type
(since all we needed was the element type out of it).
In lit tests, we run each LLD invocation twice (LLD_IN_TEST=2), without shutting down the process in-between. This ensures a full cleanup is properly done between runs.
Only active for the COFF driver for now. Other drivers still use LLD_IN_TEST=1 which executes just one iteration with full cleanup, like before.
When the environment variable LLD_IN_TEST is unset, a shortcut is taken, only one iteration is executed, no cleanup for faster exit, like before.
A public API, lld::safeLldMain(), is also available when using LLD as a library.
Differential Revision: https://reviews.llvm.org/D70378
Before this patch, these two tests were emitting both a .DLL and .LIB. The output .LIB file name also happens to be an input .LIB file name. This prevented the test from executing a second time when LLD is re-entrant (LLD_IN_TEST=2).
This is a support patch for https://reviews.llvm.org/D70378.