This patch adds !nosanitize metadata to FixedMetadataKinds.def, !nosanitize indicates that LLVM should not insert any sanitizer instrumentation.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D126294
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).
This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.
The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.
Differential Revision: https://reviews.llvm.org/D122930
All callers pass true.
select-unfold-freeze.ll is now a subset of select.ll so delete it.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D126501
treated as Copy instruction in MCP.
This is then used in AArch64 to remove copy instructions after taildup
ran in machine block placement
Differential Revision: https://reviews.llvm.org/D125335
C++ generated code with huge amount of switch cases chokes badly while emitting
coverage mapping, in our specific testcase (~72k cases), it won't stop after hours.
After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_`
by 288k when compared to disabling simplification altogether.
There's probably no good way to create a testcase for this, but it's easy to
reproduce, just add thousands of cases in the below switch, and build with
`-fprofile-instr-generate -fcoverage-mapping`.
```
enum type : int {
FEATURE_INVALID = 0,
FEATURE_A = 1,
...
};
const char *to_string(type e) {
switch (e) {
case type::FEATURE_INVALID: return "FEATURE_INVALID";
case type::FEATURE_A: return "FEATURE_A";}
...
}
```
Differential Revision: https://reviews.llvm.org/D126345
The default implementations will perform a shallow copy instead of a deep
copy, causing some internal data structures to be shared between different
objects. Disable these operations so they don't get accidentally used.
Differential Revision: https://reviews.llvm.org/D126401
reapply 62a9b36fcf and fix module build
failue:
1: remove MachineCycleInfoWrapperPass in MachinePassRegistry.def
MachineCycleInfoWrapperPass is a anylysis pass, should not be there.
2: move the definition for MachineCycleInfoPrinterPass to cpp file.
Otherwise, there are module conflicit for MachineCycleInfoWrapperPass
in MachinePassRegistry.def and MachineCycleAnalysis.h after
62a9b36fcf.
MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().
This patch tries to use MachineCycle so that we can handle
irreducible loop better.
Reviewed By: sameerds, MatzeB
Differential Revision: https://reviews.llvm.org/D123995
The purpose of the custom linked list was to optimize for the case
of a single-element list. It turns out that TinyPtrVector handles
the same basic scenario even better, reducing the size of
LeaderTableEntry by 33%, and requiring only log2(N) allocations
as the size of the list grows. The only downside is that we have
to store the Value's and BasicBlock's in separate vectors, which
is slightly awkward in a few cases. Fortunately that ends up being
entirely encapsulated inside helper functions.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D125205
Extend the Frame struct to hold the symbol name if requested
when a RawMemProfReader object is constructed. This change updates the
tests and removes the need to pass --debug to obtain the mapping from
GUID to symbol names.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D126344
Need to use all ReductionOps when propagating flags for the reduction
ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw
flags.
Differential Revision: https://reviews.llvm.org/D126371
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.
For X86, the new interface allows to fix a couple of issues:
* Correctly adjust the value of PC-relative operands.
* Set operand size to zero when the operand is specified implicitly.
Differential Revision: https://reviews.llvm.org/D126101
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.
Differential Revision: https://reviews.llvm.org/D124728
This is a support for " #pragma omp atomic compare fail ". It has Parser & AST support for now.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D123235
MachineCycle can handle irreducible loop. Natural loop
analysis (MachineLoop) can not return correct loop depth if
the loop is irreducible loop. And MachineSink is sensitive
to the loop depth, see MachineSinking::isProfitableToSinkTo().
This patch tries to use MachineCycle so that we can handle
irreducible loop better.
Reviewed By: sameerds, MatzeB
Differential Revision: https://reviews.llvm.org/D123995
This patch adds basic support for `omp task` to the OpenMPIRBuilder.
The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation.
This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D71989
This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically:
* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90006
It turns out we were already allocating static address space for TLS
data along with the non-TLS static data, but this space was going
unused/ignored.
With this change, we include the TLS segment in `__wasm_init_memory`
(which does the work of loading the passive segments into memory when a
module is first loaded). We also set the `__tls_base` global to point
to the start of this segment.
This means that the runtime can use this static copy of the TLS data for
the first/primary thread if it chooses, rather than doing a runtime
allocation prior to calling `__wasm_init_tls`.
Practically speaking, this will allow emscripten to avoid dynamic
allocation of TLS region on the main thread.
Differential Revision: https://reviews.llvm.org/D126107
A new hidden option -print-on-crash that prints the IR as it was upon entering
the last pass when there is a crash.
The IR is saved in its print form before each pass is started and a
signal handler is registered. If the compilation crashes, the signal
handler will print the saved IR to dbgs(). This option
can be modified using -print-module-scope to get the IR for the complete
module. Note that this option only works with the new pass manager.
Reviewed By: yrouban
Differential Revision: https://reviews.llvm.org/D86657
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.
This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D123538
Currently added versions are from v1.0 to v1.5, other versions
can be added as needed.
This change also adds documentation about SPIR-V target support
in LLVM.
Differential Revision: https://reviews.llvm.org/D124776
Previously, `getRegUsageForType` was implemented using
`getTypeLegalizationCost`. `getRegUsageForType` is used by the loop
vectorizer to estimate the register pressure caused by using a vector
type. However, `getTypeLegalizationCost` currently only appears to
understand splitting and not scalarization, so significantly
underestimates the register requirements.
Instead, use `getNumRegisters`, which understands when scalarization
can occur (via computeRegisterProperties).
This was discovered while investigating D118979 (Set maximum VF with
shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the
loop vectorizer previously ends up costing an v128i1 as 2 v64i*
registers where it actually occupies 128 i32 registers.
I'm sending this patch early for comment, I'm still doing some sanity checking
with LNT. I note that getRegisterClassForType appears to return VectorRC even
though the type in question (large vNi1 types) end up occupying scalar
registers. That might be worth fixing too.
Differential Revision: https://reviews.llvm.org/D125918
Without the change llvm build fails on this week's gcc-13 snapshot as:
[ 91%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/Base64Test.cpp.o
In file included from llvm/unittests/Support/Base64Test.cpp:14:
llvm/include/llvm/Support/Base64.h: In function 'std::string llvm::encodeBase64(const InputBytes&)':
llvm/include/llvm/Support/Base64.h:29:5: error: 'uint32_t' was not declared in this scope
29 | uint32_t x = ((unsigned char)Bytes[i] << 16) |
| ^~~~~~~~
Without the change llvm build fails on this week's gcc-13 snapshot as:
[ 0%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.o
In file included from llvm/lib/Support/Signals.cpp:14:
llvm/include/llvm/Support/Signals.h:119:8: error: variable or field 'CleanupOnSignal' declared void
119 | void CleanupOnSignal(uintptr_t Context);
| ^~~~~~~~~~~~~~~
validateTree() is instantiated with __FILE__.
It will be pruned at link time due to -ffunction-sections but be left in
object files.
Its user is only GenericCycleInfo::compute() with assert(validateTree());
Therefore I think validateTree() may be hidden with NDEBUG.
This is a fixup for https://reviews.llvm.org/D112696
Idiomatic llvm::Error usage can result in a FailedToMaterialize error tearing
down an ExecutionSession instance. Since the FailedToMaterialize error holds
SymbolStringPtrs and JITDylib references this leads to crashes when accessing
or logging the error.
This patch modifies FailedToMaterialize to retain the SymbolStringPool and
JITDylibs involved in the failure so that we can safely report an error message
to the client, even if the error tears down the session.
The contract for JITDylibs allows the getName method to be used even after the
session has been torn down, but no other JITDylib fields should be accessed via
the FailedToMaterialize error if the ssesion has been torn down. Logging the
error is guaranteed to be safe in all cases.
Clients are required to call ExecutionSession::endSession before destroying the
ExecutionSession. Failure to do so can lead to memory leaks and other difficult
to debug issues. Enforcing this requirement by assertion makes it easy to spot
or debug situations where the contract was not followed.
This would be ambigious with itself when C++20 tries to lookup the
reversed form. I didn't find a use in LLVM, but MLIR does a lot of
comparisons of ranges of different types.
JumpThreading may convert selects into branch instructions,
in which case the condition needs to be frozen (as branch on
poison is immediate undefined behavior, unlike select on poison).
The necessary code for this is already in place, this just enables
the option.
Differential Revision: https://reviews.llvm.org/D125869
In certain use-cases, these can be emitted by old compilers, but the
operand is now always required. These are only used for optimizations,
so it's safe to drop them if they happen to have the now-invalid format.
The semantically-required call is already a separate instruction.
Differential Revision: https://reviews.llvm.org/D123811
Currently for atomic load, store, and rmw instructions, as long as the
operand is floating-point value, they are casted to integer. Nowadays many
targets can actually support part of atomic operations with floating-point
operands. For example, NVPTX supports atomic load and store of floating-point
values. This patch adds a series interface functions `shouldCastAtomicXXXInIR`,
and the default implementations are same as what we currently do. Later for
targets can have their specialization.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125652
It is already marked as having side effects, at least in MIR. It does
not interact with anything else that is modelled as a memory access
either in IR or MachineIR.
Differential Revision: https://reviews.llvm.org/D125985
s_getreg does not interact with anything else that is modelled as a
memory access either in IR or MachineIR.
Differential Revision: https://reviews.llvm.org/D125968