Summary:
This should produce slightly better error messages in case of failures.
Only slightly, because this code was pretty careful about that to begin
with -- I've seen code which does much worse.
Reviewers: jhenderson, dblaikie
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74899
Summary:
We already have a "Failed" matcher, which can be used to check any
property of the Error object. However, most frequently one just wants to
check the error message, and while this is possible with the "Failed"
matcher, it is also very convoluted
(Failed<ErrorInfoBase>(testing::Property(&ErrorInfoBase::message, "the
message"))).
Now, one can just write: FailedWithMessage("the message"). I expect that
most of the usages will remain this simple, but the argument of the
matcher is not limited to simple strings -- the argument of the matcher
can be any other matcher, so one can write more complicated assertions
if needed (FailedWithMessage(ContainsRegex("foo|bar"))). If one wants to
match multiple error messages, he can pass multiple arguments to the
matcher.
If one wants to match the message list as a whole (perhaps to check the
message count), I've also included a FailedWithMessageArray matcher,
which takes a single matcher receiving a vector of error message
strings.
Reviewers: sammccall, dblaikie, jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74898
This change is relevant when embedding the llvm cmake project into
another project. It should not change the build behavior of a normal
llvm build.
In the case where llvm is embedded as a cmake subproject,
CMAKE_SOURCE_DIR does not point to the expected directory and building
the tests fails.
Using CMAKE_CURRENT_SOURCE_DIR fixes this problem, as it will always
point to the same directory.
Differential Revision: https://reviews.llvm.org/D73466
When analyzing PHIs, we gather the known bits for every operand and
merge them together to get the known bits of the result of the PHI.
It is not unusual that merging the information leads to know nothing
on the result (e.g., phi a: i8 3, b: i8 unknown, ..., after looking at the
second argument we know we will know nothing on the result), thus, as
soon as we reach that state, stop analyzing the following operand (i.e.,
on the previous example, we won't process anything after looking at `b`).
This improves compile time in particular with PHIs with a large number
of operands.
NFC.
Summary:
The Offset provides the offset within the function in a SourceLocation struct. This allows us to show the byte offset within a function. We also track offsets within inline functions as well. Updated the lookup tests to verify the offset for functions and inline functions.
0x1000: main + 32 @ /tmp/main.cpp:45
Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74680
Initializers and deinitializers are used to implement C++ static constructors
and destructors, runtime registration for some languages (e.g. with the
Objective-C runtime for Objective-C/C++ code) and other tasks that would
typically be performed when a shared-object/dylib is loaded or unloaded by a
statically compiled program.
MCJIT and ORC have historically provided limited support for discovering and
running initializers/deinitializers by scanning the llvm.global_ctors and
llvm.global_dtors variables and recording the functions to be run. This approach
suffers from several drawbacks: (1) It only works for IR inputs, not for object
files (including cached JIT'd objects). (2) It only works for initializers
described by llvm.global_ctors and llvm.global_dtors, however not all
initializers are described in this way (Objective-C, for example, describes
initializers via specially named metadata sections). (3) To make the
initializer/deinitializer functions described by llvm.global_ctors and
llvm.global_dtors searchable they must be promoted to extern linkage, polluting
the JIT symbol table (extra care must be taken to ensure this promotion does
not result in symbol name clashes).
This patch introduces several interdependent changes to ORCv2 to support the
construction of new initialization schemes, and includes an implementation of a
backwards-compatible llvm.global_ctor/llvm.global_dtor scanning scheme, and a
MachO specific scheme that handles Objective-C runtime registration (if the
Objective-C runtime is available) enabling execution of LLVM IR compiled from
Objective-C and Swift.
The major changes included in this patch are:
(1) The MaterializationUnit and MaterializationResponsibility classes are
extended to describe an optional "initializer" symbol for the module (see the
getInitializerSymbol method on each class). The presence or absence of this
symbol indicates whether the module contains any initializers or
deinitializers. The initializer symbol otherwise behaves like any other:
searching for it triggers materialization.
(2) A new Platform interface is introduced in llvm/ExecutionEngine/Orc/Core.h
which provides the following callback interface:
- Error setupJITDylib(JITDylib &JD): Can be used to install standard symbols
in JITDylibs upon creation. E.g. __dso_handle.
- Error notifyAdding(JITDylib &JD, const MaterializationUnit &MU): Generally
used to record initializer symbols.
- Error notifyRemoving(JITDylib &JD, VModuleKey K): Used to notify a platform
that a module is being removed.
Platform implementations can use these callbacks to track outstanding
initializers and implement a platform-specific approach for executing them. For
example, the MachOPlatform installs a plugin in the JIT linker to scan for both
__mod_inits sections (for C++ static constructors) and ObjC metadata sections.
If discovered, these are processed in the usual platform order: Objective-C
registration is carried out first, then static initializers are executed,
ensuring that calls to Objective-C from static initializers will be safe.
This patch updates LLJIT to use the new scheme for initialization. Two
LLJIT::PlatformSupport classes are implemented: A GenericIR platform and a MachO
platform. The GenericIR platform implements a modified version of the previous
llvm.global-ctor scraping scheme to provide support for Windows and
Linux. LLJIT's MachO platform uses the MachOPlatform class to provide MachO
specific initialization as described above.
Reviewers: sgraenitz, dblaikie
Subscribers: mgorny, hiraditya, mgrang, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74300
Summary: Commit 63bb9fee52 was reverted in
7603bfb4b0 because it broke builds that treat
warnings as errors.
This commit updates the calls to `assembleToStream()` in tests to check that
the return value is valid.
Original commit message:
Followup to D74084.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.
Differential Revision: https://reviews.llvm.org/D74325
This test was getting a bit long. Before adding more checks, group the
existing checks according to the matcher used, and break it up into
smaller tests.
After having committed https://reviews.llvm.org/D72226, 2 buildbots
running GCC 5.4.0 began failing. The cause was the order in which those
compilers evaluated the left- and right-hand sides of the expression
`RC.SCCIndices[C] = RC.SCCIndices.size();`. This commit splits the
expression into multiple statements to avoid ambiguity, and adds a test
case that exercises the code that caused the test failures on those
older compilers (which was originally included in the reviewed patch,
https://reviews.llvm.org/D72226).
Essentially, fold OrderedBasicBlock into BasicBlock, and make it
auto-invalidate the instruction ordering when new instructions are
added. Notably, we don't need to invalidate it when removing
instructions, which is helpful when a pass mostly delete dead
instructions rather than transforming them.
The downside is that Instruction grows from 56 bytes to 64 bytes. The
resulting LLVM code is substantially simpler and automatically handles
invalidation, which makes me think that this is the right speed and size
tradeoff.
The important change is in SymbolTableTraitsImpl.h, where the numbering
is invalidated. Everything else should be straightforward.
We probably want to implement a fancier re-numbering scheme so that
local updates don't invalidate the ordering, but I plan for that to be
future work, maybe for someone else.
Reviewed By: lattner, vsk, fhahn, dexonsmith
Differential Revision: https://reviews.llvm.org/D51664
This patch upstreams support for the AArch64 Armv8-A cpu Cortex-A34.
In detail adding support for:
- mcpu option in clang
- AArch64 Target Features in clang
- llvm AArch64 TargetParser definitions
details of the cpu can be found here:
https://developer.arm.com/ip-products/processors/cortex-a/cortex-a34
Reviewers: SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, kristof.beyls, hiraditya, cfe-commits,
llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74483
Change-Id: Ida101fc544ca183a0a0e61a1277c8957855fde0b
Summary:
I noticed a small regression in a toy project of mine after applying
D73835, in which instruction names weren't being set properly. In the
example test case included with this patch,
`llvm::IRBuilderBase::CreateAdd` returns an `llvm::Value *` that is then
passed as an argument to `llvm::IRBuilderBase::Insert`. The overloaded
function that is selected for that call then ignores the `Name`
parameter that is given. This patch addresses that issue.
Reviewers: nikic, Meinersbur, nhaehnle, fhahn, thakis, teemperor
Reviewed By: nikic, fhahn
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74754
Summary:
Depends on https://reviews.llvm.org/D70927.
`LazyCallGraph::addNewFunctionIntoSCC` allows users to insert a new
function node into a call graph, into a specific, existing SCC.
Extend this interface such that functions can be added even when they do
not belong in any existing SCC, but instead in a new SCC within an
existing RefSCC.
The ability to insert new functions as part of a RefSCC is necessary for
outlined functions that do not form a strongly connected cycle with the
function they are outlined from. An example of such a function would be the
coroutine funclets 'f.resume', etc., which are outlined from a coroutine 'f'.
Coroutine 'f' only references the funclets' addresses, it does not call
them directly.
Reviewers: jdoerfert, chandlerc, wenlei, hfinkel
Reviewed By: jdoerfert
Subscribers: hfinkel, JonChesterfield, mehdi_amini, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72226
Currently we always return true, when markConstantRange is used on an
object already containing a constant range. If NewR is equal to the
existing constant range however, nothing changes and we should return
false.
I also went ahead and added a clarifying comment and improved the
assertion.
Reviewers: efriedma, davide, nikic
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D73240
As noted on D74621, the bswap intrinsic has a self imposed limitation that the type's bitwidth must be divisible by 16, but there's no reason that APInt::byteSwap must have the same limitation, given that it can already handle any byte width.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.
All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.
The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).
The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.
The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.
The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.
---
Test changes:
The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71830
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This patch adds DenseMapInfo<> support for BitVector and SmallBitVector.
This is part of https://reviews.llvm.org/D71775, where a BitVector is used as a thread affinity mask.
Prior to this patch, if a DW_LNE_set_address opcode was parsed with an
address size (i.e. with a length after the opcode) of anything other 1,
2, 4, or 8, an llvm_unreachable would be hit, as the data extractor does
not support other values. This patch introduces a new error check that
verifies the address size is one of the supported sizes, in common with
other places within the DWARF parsing.
This patch also fixes calculation of a generated line table's size in
unit tests. One of the tests in this patch highlighted a bug introduced
in 1271cde474, when non-byte operands were used as arguments for
extended or standard opcodes.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D73962
replaceDbgDeclare is used to update the descriptions of stack variables
when they are moved (e.g. by ASan or SafeStack). A side effect of
replaceDbgDeclare is that it moves dbg.declares around in the
instruction stream (typically by hoisting them into the entry block).
This behavior was introduced in llvm/r227544 to fix an assertion failure
(llvm.org/PR22386), but no longer appears to be necessary.
Hoisting a dbg.declare generally does not create problems. Usually,
dbg.declare either describes an argument or an alloca in the entry
block, and backends have special handling to emit locations for these.
In optimized builds, LowerDbgDeclare places dbg.values in the right
spots regardless of where the dbg.declare is. And no one uses
replaceDbgDeclare to handle things like VLAs.
However, there doesn't seem to be a positive case for moving
dbg.declares around anymore, and this reordering can get in the way of
understanding other bugs. I propose getting rid of it.
Testing: stage2 RelWithDebInfo sanitized build, check-llvm
rdar://59397340
Differential Revision: https://reviews.llvm.org/D74517
Same as D73328 but for TBD_V4. One notable tidbit is that the swift abi
version for swift 1 & 2 is emitted as a float which is considered
invalid input.
Differential revision: https://reviews.llvm.org/D73330
Summary:
The DWARF transformer is added as a class so it can be unit tested fully.
The DWARF is converted to GSYM format and handles many special cases for functions:
- omit functions in compile units with 4 byte addresses whose address is UINT32_MAX (dead stripped)
- omit functions in compile units with 8 byte addresses whose address is UINT64_MAX (dead stripped)
- omit any functions whose high PC is <= low PC (dead stripped)
- StringTable builder doesn't copy strings, so we need to make backing copies of strings but only when needed. Many strings come from sections in object files and won't need to have backing copies, but some do.
- When a function doesn't have a mangled name, store the fully qualified name by creating a string by traversing the parent decl context DIEs and then. If we don't do this, we end up having cases where some function might appear in the GSYM as "erase" instead of "std::vector<int>::erase".
- omit any functions whose address isn't in the optional TextRanges member variable of DwarfTransformer. This allows object file to register address ranges that are known valid code ranges and can help omit functions that should have been dead stripped, but just had their low PC values set to zero. In this case we have many functions that all appear at address zero and can omit these functions by making sure they fall into good address ranges on the object file. Many compilers do this when the DWARF has a DW_AT_low_pc with a DW_FORM_addr, and a DW_AT_high_pc with a DW_FORM_data4 as the offset from the low PC. In this case the linker can't write the same address to both the high and low PC since there is only a relocation for the DW_AT_low_pc, so many linkers tend to just zero it out.
Reviewers: aprantl, dblaikie, probinson
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74450
Finalization can introduce new blocks we need to outline as well so it
makes sense to identify the blocks that need to be outlined after
finalization happened. There was also a minor unit test adjustment to
account for the fact that we have a single outlined exit block now.
Reapply 8a56d64d76 with minor fixes.
The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.
This reverts commit 3aac953afa.
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74372
function_ref is non-owning, so if we get it as a parameter in constructor,
our reference goes out-of-scope as soon as constructor returns.
Instead, let's just take it as a parameter to the actual `generate()` call
Summary:
Currently, we only have nice exploration for LEA instruction,
while for the rest, we rely on `randomizeUnsetVariables()`
to sometimes generate something interesting.
While that works, it isn't very reliable in coverage :)
Here, i'm making an assumption that while we may want to explore
multi-instruction configs, we are most interested in the
characteristics of the main instruction we were asked about.
Which we can do, by taking the existing `randomizeMCOperand()`,
and turning it on it's head - instead of relying on it to randomly fill
one of the interesting values, let's pregenerate all the possible interesting
values for the variable, and then generate as much `InstructionTemplate`
combinations of these possible values for variables as needed/possible.
Of course, that requires invasive changes to no longer pass just the
naked `Instruction`, but sometimes partially filled `InstructionTemplate`.
As it can be seen from the test, this allows us to explore
`X86::OperandType::OPERAND_COND_CODE` for instructions
that take such an operand.
I'm hoping this will greatly simplify exploration.
Reviewers: courbet, gchatelet
Reviewed By: gchatelet
Subscribers: orodley, mgorny, sdardis, tschuett, jrtc27, atanasyan, mstojanovic, andreadb, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74156
The DWARFv2-4 specification for the line table header states that the
include directories and file name tables both end with a single null
byte. Prior to this change, the parser did not detect if this byte was
missing, because it also stopped reading the tables once it reached the
prologue end, as claimed by the header_length field. This change adds a
check that the terminator has been seen at the end of each table.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74413
Also remove some test duplication and add a test case that shows the
maximum version is rejected (this also shows that the value in the error
message is actually in decimal, and not just missing an 0x prefix).
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74403
Summary:
Dwarf stores source-file names the three parts:
<compilation_directory><include_directory><filename>
Prior to this change, the code only allowed retrieving either all
three as the absolute path, or just the filename. But many
compile-command lines--especially those in hermetic build systems
don't specify an absolute path, nor just the filename, but rather the
path relative to the compilation directory. This features allows
retrieving them in that style.
Add tests for path printing styles.
Modify createBasicPrologue to handle include directories.
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73383
Summary:
* for <= tbd_v3, simulator platforms appear the same as the real
platform and we distinct the difference from the architecture.
fixes: rdar://problem/59161559
Reviewers: ributzka, steven_wu
Reviewed By: ributzka
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74416
Summary:
Simplifies the C++11-style "-> decltype(...)" return-type deduction.
Note that you have to be careful about whether the function return type
is `auto` or `decltype(auto)`. The difference is that bare `auto`
strips const and reference, just like lambda return type deduction. In
some cases that's what we want (or more likely, we know that the return
type is a value type), but whenever we're wrapping a templated function
which might return a reference, we need to be sure that the return type
is decltype(auto).
No functional change.
Subscribers: dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74383
Summary: It attempts to devirtualize a call on alloca through vtable loads.
Reviewers: davidxl
Subscribers: mgorny, Prazek, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71308
The plugin expects to have undefined references to symbols exported
by the loading process, which isn't supported by shared libraries
on windows.
Differential Revision: https://reviews.llvm.org/D74042
If a debug line section with version of greater than 5 is encountered,
prior to this change the parser would accept it and treat it as version
5. This might work to some extent, but then it might not at all, as it
really depends on the format of the unspecified future version, which
will be different (otherwise there would be no point in changing the
version number). Any information we could provide has a good chance of
being invalid, so we should just refuse to parse such tables.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74204
The DebugInfo/dwarfdump-invalid-line-table test used a pre-canned binary
generated by a fuzzer to demonstrate a bug fix. Unfortunately, the
binary is rigid and requires hand-editing if we change behaviour, such
as rejecting certain properties within it (as I plan on doing in another
change).
Rather than hand-edit the binary, I have replaced it with two tests. The
first tests the high-level code path from the debug line parser that
produces the same error as this test previously did, and the second is a
set of unit test cases that comprehensively cover the
FormValue::skipValue method, which in turn covers the area that the
original bug fix touched.
Reviewed by: MaskRay, dblaikie
Differential Revision: https://reviews.llvm.org/D74202
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running an CGSCC pass.
The uses are contained in different commits, e.g. D70767.
More functionality is added as we need it.
Reviewed By: modocache, hfinkel
Differential Revision: https://reviews.llvm.org/D70927
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.
Differential Revision: https://reviews.llvm.org/D74079
Allows more flexible use of buildMerge in places where
use operands are available as SrcOp since it does not
require explicit conversion to Register.
Simplify code with new buildMerge.
Differential Revision: https://reviews.llvm.org/D74223
The type passed to lower was invalid, so I'm not sure how this was
even working before. The source and destination type also do not have
to match, so make sure to use the right ones.
The legalizer produces a lot of these, and they make reading legalized
MIR annoying. For some reason, this does seem to sometimes introduce
copies of implicit def, which is dumb.
The problem was noticed by the Chrome OS toolchain folks
(crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would
insert the wrong checksum when processing a binary larger than 4 GB.
That use case regressed in 1e1e3ba252 when we started using
llvm::crc32() in more places.
Differential revision: https://reviews.llvm.org/D74039
I was debug stepping through an x86 shuffle lowering and
noticed we were doing an N^2 search for splat index. I
didn't find the equivalent functionality anywhere else in
LLVM, so here's a helper that takes an array of int and
returns a splatted index while ignoring undefs (any
negative value).
This might also be used inside existing
ShuffleVectorInst/ShuffleVectorSDNode functions and/or
help with D72467.
Differential Revision: https://reviews.llvm.org/D74064
Removed some #ifdefs specific to Windows handling of VFS paths. This
eliminates most of the differences between the Windows and non-Windows
code paths.
Making this work required some changes to account for the fact that VFS
file paths can be Posix style or Windows style, so you cannot just assume
that they use the host's native path style. In one case, this means
implementing our own version of make_absolute, since the filesystem code
in Support doesn't have styles in the sense that the path code does.
Differential Review: https://reviews.llvm.org/D71092
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
Summary:
This patch changes the underlying type of the ARM::ArchExtKind
enumeration to uint64_t and adjusts the related code.
The goal of the patch is to prepare the code base for a new
architecture extension.
Reviewers: simon_tatham, eli.friedman, ostannard, dmgreen
Reviewed By: dmgreen
Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits, pbarrio
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73906
ClangBuildAnalyzer results show that a lot of time is spent
instantiating AnalysisManager::getResultImpl across the code base:
**** Templates that took longest to instantiate:
50445 ms: llvm::AnalysisManager<llvm::Function>::getResultImpl (412 times, avg 122 ms)
47797 ms: llvm::AnalysisManager<llvm::Function>::getResult<llvm::TargetLibraryAnalysis> (389 times, avg 122 ms)
46894 ms: std::tie<const unsigned long long, const bool> (2452 times, avg 19 ms)
43851 ms: llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096, 4096>::Allocate (3228 times, avg 13 ms)
33911 ms: std::tie<const unsigned int, const unsigned int, const unsigned int, const unsigned int> (897 times, avg 37 ms)
33854 ms: std::tie<const unsigned long long, const unsigned long long> (1897 times, avg 17 ms)
27886 ms: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (11156 times, avg 2 ms)
I mentioned this result to @chandlerc, and he suggested this direction.
AnalysisManager is already explicitly instantiated, and getResultImpl
doesn't need to be inlined. Move the definition to an Impl header, and
include that header in files that explicitly instantiate
AnalysisManager. There are only four (real) IR units:
- function
- module
- loop
- cgscc
Looking at a specific transform (ArgumentPromotion.cpp), here are three
compilations before & after this change:
BEFORE:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 258.15MB
real: 0m6.297s
peak memory: 257.54MB
real: 0m5.906s
peak memory: 257.47MB
real: 0m6.219s
AFTER:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 235.35MB
real: 0m5.454s
peak memory: 234.72MB
real: 0m5.235s
peak memory: 234.39MB
real: 0m5.469s
The 20MB of memory saved seems real, and the time improvement seems like
it is there.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D73817
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
Previously, if a debug line Prologue was created via
createBasicPrologue, its TotalLength field did not account for any
contents in the table itself. This change fixes this issue.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D73772
Disable the red zone in the unit test allocator to fix the test errors in sanitizer builds.
The red zone changed the amount of allocated bytes which made the test fail as it
checked the number of allocated bytes of the allocator.
This reverts commit b848b510a8 as the unit tests
fail on the sanitizer bots:
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Support/AllocatorTest.cpp:145: Failure
Expected: SlabSize
Which is: 4096
To be equal to: Alloc.getTotalMemory()
Which is: 4097
Summary:
In D68549 we noticed that our BumpPtrAllocator we use for LLDB's ConstString implementation is growing its slabs at
a rate that is too slow for our use case. It causes that we spend a lot of time calling `malloc` for all the tiny slabs that our
ConstString BumpPtrAllocators create. We also can't just increase the slab size in the ConstString implementation
(which is what D68549 originally did) as this really increased the amount of (mostly unused) allocated memory
in any process using ConstString.
This patch adds a template argument for the BumpPtrAllocatorImpl that allows specifying a faster rate at which the
BumpPtrAllocator increases the slab size. This allows LLDB to specify a faster rate at which the slabs grow which
should keep both memory consumption and time spent calling malloc low.
Reviewers: george.karpenkov, chandlerc, NoQ
Subscribers: NoQ, llvm-commits, llunak
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71654
With this patch new trivial edges can be added to an SCC in a CGSCC
pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares
almost all the code with the existing
updateCGAndAnalysisManagerForFunctionPass method but it implements the
first step towards the TODOs.
This was initially part of D70927.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D72025
This is the first of multiple parts to make OpenMP context/trait
handling reusable and generic. This patch was originally part of D71830
but with the unit tests it can be tested independently.
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.
All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in llvm/lib/Frontend/OpenMP/OMPKinds.def. From
these definitions we generate the enum classes TraitSet,
TraitSelector, and TraitProperty as well as conversion and helper
functions in llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}.
The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the VariantMatchInfo, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D71847
We want to allow splat value transforms to improve PR44588 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=44588
...but to do that, we need to know if values are splatted from the same,
specific index (lane) rather than splatted from an arbitrary index.
We can improve the undef handling with 1-liner follow-ups because the
Constant API optionally allow undefs now.
Differential Revision: https://reviews.llvm.org/D73549
Summary: With the new pass manager, it is not possible to obtain a pointer to the pass.
Reviewers: jfb, rinon, yln
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73390
Summary: With the new pass manager, it is not possible to obtain a pointer to the pass.
Reviewers: jfb, rinon, yln
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73390
The function a) returned 32-bits when in DWARF64, the PrologueLength
field is 64-bits in size, and b) didn't work for DWARF version 5.
Also deleted some related dead code. With this deletion, getLength is
itself dead, but another change is about to make use of it.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D73626
Summary:
This patch makes sure that the field VFShape.VF is greater than zero
when demangling the vector function name of scalable vector functions
encoded in the "vector-function-abi-variant" attribute.
This change is required to be able to provide instances of VFShape
that can be used to query the VFDatabase for the vectorization passes,
as such passes always require a positive value for the Vectorization Factor (VF)
needed by the vectorization process.
It is not possible to extract the value of VFShape.VF from the mangled
name of scalable vector functions, because it is encoded as
`x`. Therefore, the VFABI demangling function has been modified to
extract such information from the IR declaration of the vector
function, under the assumption that _all_ vectors in the signature of
the vector function have the same number of lanes. Such assumption is
valid because it is also assumed by the Vector Function ABI
specifications supported by the demangling function (x86, AArch64, and
LLVM internal one).
The unit tests that demangle scalable names have been modified by
adding the IR module that carries the declaration of the vector
function name being demangled.
In particular, the demangling function fails in the following cases:
1. When the declaration of the scalable vector function is not
present in the module.
2. When the value of VFSHape.VF is not greater than 0.
Reviewers: jdoerfert, sdesmalen, andwar
Reviewed By: jdoerfert
Subscribers: mgorny, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73286
Summary:
This will help with devirtualization (store forwarding with vtable pointers in
the presence of other stores into members in the constructor.) During inlining,
we don't have AA.
Reviewers: davidxl
Subscribers: mgorny, Prazek, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71307
With the conversion between StringRef and std::string now being
explicit, converting SmallStrings becomes more tedious. This patch adds
an explicit operator so you can write std::string(Str) instead of
Str.str().str().
Differential revision: https://reviews.llvm.org/D73640
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "assume
stated length is correct" is taken which means the offset might need
adjusting.
This is a relanding of b94191fe, fixing an LLD test and the LLDB build.
Reviewed by: dblaikie, labath
Differential Revision: https://reviews.llvm.org/D72158
During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:
1) CodeExtractor unregisters assumptions in the blocks that are to be
extracted.
2) Extraction happens. There are now two functions: f1 and f1.extracted.
3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
blocks to be extracted) now have affected-value llvm.assume handles in
f1.extracted.
When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.
Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.
Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled
rdar://58460728
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,
if (cond)
A
if (cond)
B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
This makes the types almost seamlessly interchangeable in C++17
codebases. Eventually we want to replace StringRef with the standard
type, but that requires C++17 being the default and a huge refactoring
job as StringRef has a lot more functionality.
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "the
claimed length is correct" is taken to be consistent with other
instances such as the SectionParser, which ignores the read length.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72158
TAPI currently lacks a way to emit the macCatalyst platform. For TBD_V3
is does support zippered frameworks given that both macOS and
macCatalyst are part of the PlatformSet.
Differential revision: https://reviews.llvm.org/D73325
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72155
StringMap.h is very popular (4K uses), and it doesn't need to see
BumpPtrAllocator, which is relatively expensive according to
ClangBuildAnalyzer. StringMap only needs MallocAllocator, so split that
into AllocatorBase.h and use it instead.
Here is the change in header uses:
$ diff -u thedeps-before.txt thedeps-after.txt | \
grep '^[-+] ' | sort | uniq -c | sort -nr
3993 + ../llvm/include/llvm/Support/AllocatorBase.h
758 - ../llvm/include/llvm/Support/Allocator.h
270 - ../llvm/include/llvm/Support/Alignment.h
13 - ../llvm/include/llvm/Support/Host.h
6 - ../llvm/include/llvm/ADT/StringMap.h
4 - ../llvm/include/llvm/Support/SwapByteOrder.h
4 - ../llvm/include/llvm/Support/MathExtras.h
4 - ../llvm/include/llvm/Support/AlignOf.h
4 - ../llvm/include/llvm/ADT/SmallVector.h
1 - ../llvm/include/llvm/Support/PointerLikeTypeTraits.h
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D73392
Teach the GISelKnowBits analysis how to deal with PHI operations.
PHIs are essentially COPYs happening on edges, so we can just reuse
the code for COPY.
This is NFC COPY-wise has we leave Depth untouched when calling
computeKnownBitsImpl for COPYs, like it was before this patch.
Increasing Depth is however required for PHIs as they may loop back to
themselves and we would end up in an infinite loop if we were not
increasing Depth.
Differential Revision: https://reviews.llvm.org/D73317
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support for selecting a
matching format to match a numeric value against (ie. decimal, hex lower
case letters or hex upper case letters).
This commit allows to select what format a numeric value should be
matched against. The following formats are supported: decimal value,
lower case hex value and upper case hex value. Matching formats impact
both the format of numeric value to be matched as well as the format of
accepted numbers in a definition with empty numeric expression
constraint.
Default for absence of format is decimal value unless the numeric
expression constraint is non null and use a variable in which case the
format is the one used to define that variable. Conclict of format in
case of several variable being used is diagnosed and forces the user to
select a matching format explicitely.
This commit also enables immediates in numeric expressions to be in any
radix known to StringRef's GetAsInteger method, except for legacy
numeric expressions (ie [[@LINE+<offset>]] which only support decimal
immediates.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson
Reviewed By: jhenderson, arichardson
Subscribers: daltenty, MaskRay, hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, kristina, hfinkel, rogfer01, JonChesterfield
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60389
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
The existing unit tests cover a wide variety of reading TBD files but
lack coverages on the writing side. Case in point is the macCatalyst
case which we're able to read, but not write.
This patch extends the unit test dealing with valid input to write their
content again to verify the writer.
Differential revision: https://reviews.llvm.org/D73328
In case of loops with multiple exit where all-but-one exit are deoptimizing
it might happen that the first rotation will end up with latch having a deoptimizing
exit. This makes the loop unsuitable for trip-count analysis (say, getLoopEstimatedTripCount)
as well as for loop transformations that know how to handle multple deoptimizing exits.
It pretty much means that canonical form in multple-deoptimizing-exits case should be
with non-deoptimizing exit at latch.
Teach loop-rotation to reach this canonical form by repeating rotation.
-loop-rotate-multi option introduced to control this behavior, currently disabled by default.
Reviewers: skatkov, asbirlea, reames, fhahn
Reviewed By: skatkov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73058
Summary:
In NEON, the immediate forms of VBIC and VORR are each represented as
a single MC instruction, which takes its immediate operand already
encoded in a NEON-friendly format: 8 data bits, plus some control bits
indicating how to expand them into a full vector.
In MVE, we represented immediate VBIC and VORR as four separate MC
instructions each, for an 8-bit immediate shifted left by 0, 8, 16 or
24 bits. For each one, the value of the immediate operand is in the
'natural' form, i.e. the numerical value that would actually be BICed
or ORRed into each vector lane (and also the same value shown in
assembly). For example, MVE_VBICIZ16v4i32 takes an operand such as
0xab0000, which NEON would represent as 0xab | (control bits << 8).
The MVE approach is superficially nice (it makes assembly input and
output easy, and it's also nice if you're manually constructing
immediate VBICs). But it turns out that it's better for isel if we
make the NEON and MVE instructions work the same, because the
ARMISD::VBICIMM and VORRIMM node types already encode their immediate
into the NEON format, so it's easier if we can just use it.
Also, this commit reduces the total amount of code rather than
increasing it, which is surely an indication that it really is simpler
to do it this way!
Reviewers: dmgreen, ostannard, miyuki, MarkMurrayARM
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73205
Summary:
This commit adds error checking beyond UndefVarError and fix a number of
Error/Expected related idioms:
- use (EXPECT|ASSERT)_THAT_(ERROR|EXPECTED) instead of errorToBool or
boolean operator
- ASSERT when a further check require the check to be successful to give
a correct result
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
Reviewed By: jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72914
Summary:
... instead of crashing.
On typical exmaple is when there are no available registers.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73196
This helps to detect and report parsing errors better.
The patch follows the ideas of LLDB's patches D59370 and D59381.
It adds tests for valid and some invalid cases. More checks and
tests to come. Note that the patch fixes validation of the Length
field because the value does not include the field itself.
The existing users are updated to show the error messages.
Differential Revision: https://reviews.llvm.org/D71875
The current m_APInt() and m_APFloat() matchers do not accept splats
that include undefs (unlike m_Zero() and other matchers for specific
values). We can't simply change the default behavior, as there are
existing transforms that would not be safe with undefs.
For this reason, I'm introducing new m_APIntAllowUndef() and
m_APFloatAllowUndef() matchers, that allow splats with undefs.
Additionally, m_APIntForbidUndef() and m_APFloatForbidUndef() are
added. These have the same behavior as the existing m_APInt() and
m_APFloat(), but serve as an explicit indication that undefs were
considered and found unsound for this transform. This helps
distinguish them from existing uses of m_APInt() where we do not
know whether undefs can or cannot be allowed without additional review.
Differential Revision: https://reviews.llvm.org/D72975
The hasSideEffect parameter is usually automatically inferred from
instruction patterns. For some of our MVE instructions, we do not have
patterns though, such as for the pre/post inc loads and stores. This
instead specifies the flag manually on the base MVE_VLDRSTR_base
tablegen class, making sure we get this correct.
This can help with scheduling multiple loads more optimally. Here I've
added a unittest as a more direct form of testing.
Differential Revision: https://reviews.llvm.org/D73117
We previously had to guard against older MSVC and GCC versions which had rvalue
references but not support for marking functions with ref qualifiers. However,
having bumped our minimum required version to MSVC 2017 and GCC 5.1 mean we can
unconditionally enable this feature. Rather than keeping the macro around, this
replaces use of the macro with the actual ref qualifier.
Summary:
Right now when picking a back-to-back instruction at random, we might select
instructions that we do not know how to handle.
Add a ExegesisTarget hook to possibly filter instructions.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73161
This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces
IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The
ManglingOptions struct defines the emulated-TLS state (via a bool member,
EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The
IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the
CompileFunction typedef used to), but adds a method to return the
IRCompileLayer::ManglingOptions that the compiler will use.
These changes allow us to correctly determine the symbols that will be produced
when a thread local global variable defined at the IR level is compiled with or
without emulated TLS. This is required for ORCv2, where MaterializationUnits
must declare their interface up-front.
Most ORCv2 clients should not require any changes. Clients writing custom IR
compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler,
rather than an IRCompileLayer::CompileFunction, however this should be a
straightforward change (see modifications to CompileUtils.* in this patch for an
example).
Add support for converting Signaling NaN, and a NaN Payload from string.
The NaNs (the string "nan" or "NaN") may be prefixed with 's' or 'S' for defining a Signaling NaN.
A payload for a NaN can be specified as a suffix.
It may be a octal/decimal/hexadecimal number in parentheses or without.
Differential Revision: https://reviews.llvm.org/D69773
Summary:
FileCheck's Match unittest needs updating whenever some call to
initNextPattern() is inserted before its final block of checks. This
commit change usage of LineNumber inside the Tester object so that the
line number of the current pattern can be queries, thereby making the
Match test more solid.
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
Reviewed By: jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72913
Summary:
Clean redundant unit test checks (codepath already tested elsewhere) and
add a few missing checks for existing numeric substitution and match
logic.
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
Reviewed By: jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72912
The addition of `inverse_throughput` mode highlighted the disjointedness
of snippet generators and benchmark runners because it used the
`UopsSnippetGenerator` with the `LatencyBenchmarkRunner`.
To keep the code consistent tie the snippet generators to
parallelization/serialization rather than their benchmark runners.
Renaming `LatencySnippetGenerator` -> `SerialSnippetGenerator`.
Renaming `UopsSnippetGenerator` -> `ParallelSnippetGenerator`.
Differential Revision: https://reviews.llvm.org/D72928
It's been an empty target since r360498 and friends
(`git log --grep='Move InstPrinter files to MCTargetDesc.' llvm/lib/Target`),
but due to hwo the way these targets are structured it was silently
an empty target without anyone noticing.
No behavior change.
[this re-applies c0176916a4
with the correct commit message and phabricator link]
This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213
The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.
This should have no effect on DWARF consumers other than LLDB.
Differential Revision: https://reviews.llvm.org/D71732