This is required by the Standard and makes it possible to add full
128-bit support to format.
The patch also fixes 128-bit from_chars "support". One unit test
required a too large value, this failed on 128-bit; the fix was to add
more characters to the input.
Note only base 10 has been optimized. Other bases can be optimized.
Note the 128-bit lookup table could be made smaller. This will be done later. I
really want to get 128-bit working in to_chars and format in the upcomming
LLVM 15 release, these optimizations aren't critical.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D128929
The general idea of these tests is elimination of signed and unsigned
comparison of the same values through proving non-negativity of them.
Here are some examples where SCEV is not smart enough to prove it.
SPIR-V module typically contains some global entities that were not
global before made it to SPIR-V, e.g. types and constants are not usually
declared globally in LLVM. By design SPIR-V requires such stuff to be declared
once and in the module's global section. Since MIR is not able to represent
such things properly they were generated per-function, and then at the very end
of the backend's pipeline hoisted into some 'meta' function minding possible
duplicates.
New SPIRVDuplicatesTracker keeps mapping of the original LLVM entities such
as types, constant, global variables, etc to their MIR counterparts -
(MachineFunction, Register). Later SPIRVModuleAnalysis (apart from other
thing it's responsible for) performs topological sorting of the
tracker's entries to ensure proper ordering before the hoisting,
and actually performs the hoisting in a duplicates-free manner
by the tracker's nature.
Differential Revision: https://reviews.llvm.org/D128471
These three subtarget features are meant to control where MVE
instructions take 1 vs 2 vs 4 architectural beats. The mve1beat feature
is described as "Model MVE instructions as a 1 beat per tick
architecture", meaning MVE instruction will execute over 4 cycles.
mve4beat is the opposite where the entire 4 beats of the MVE instruction
execute in a single cycle. The costs for the two were backwards though,
not matching the cycle counts like they should. This patch switches the
costs on the two to bring them in-line with expectations.
Differential Revision: https://reviews.llvm.org/D129141
In Fuchsia, all tests in a directory, ie stdlib, are linked
into one executable, this causes problems for multiple
definitions of the vtables of the div tests because their
class has the same name. This patch just trivially changes
their name to be unique between all div tests.
Differential revision: https://reviews.llvm.org/D129248
Because the buffer descriptor structure (the V#) has no backwards-compatibility
guarentees, and since said guarantees have been violated in practice
(see https://github.com/llvm/llvm-project/issues/56323 ), and since
the `targetIsRDNA` attribute isn't something that higher-level clients can set
in general, make the lowering of the amdgpu dialect to rocdl take a --chipset
option.
Note that this option is a string because adding a parser for the Chipset
struct to llvm::cl wasn't working out.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D129228
Move user specified inputs to the linking group in case
they and the stardard libraries have mutual reference.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D127501
Perform a major refactoring of vCont-threads tests in order to attempt
to improve their stability and performance.
Split test_vCont_run_subset_of_threads() into smaller test cases,
and split the whole suite into two files: one for signal-related tests,
the running-subset-of tests.
Eliminate output_match checks entirely, as they are fragile to
fragmentation of output. Instead, for the initial thread list capture
raise an explicit SIGSTOP from inside the test program, and for
the remaining output let the test program run until exit, and check all
the captured output afterwards.
For resume tests, capture the LLDB's thread view before and after
starting new threads in order to determine the IDs corresponding
to subthreads rather than relying on program output for that.
Add a mutex for output to guarantee serialization. A barrier is used
to guarantee that all threads start before SIGSTOP, and an atomic bool
is used to delay prints from happening until after SIGSTOP.
Call std::this_thread::yield() to reduce the risk of one of the threads
not being run.
This fixes the test hangs on FreeBSD. Hopefully, it will also fix all
the flakiness on buildbots.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D129012
This revision revisits the implementation of applyToOne and its handling
of recoverable errors as well as propagation of null handles.
The implementation is simplified to always require passing a vector<Operation*>
in which the results are returned, resulting in less template instantiation magic.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D129185
ml.exe and ml64.exe support the use of anonymous labels, with @B and @F referring to the previous and next anonymous label respectively.
We add similar support to llvm-ml, with the exception that we are unable to emit an error message for an @F expression not followed by a @@ label.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D128944
With opaque pointers address of pointer variable and its value have
same type (`ptr`). As a result, cmpxchg is printed without values
types in LLVM assembly and cannot be read back. Add AtomicCmpXchg
to the list of instructions which always have operand types printed.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129276
This patch refactors existing implementations of division representation storage
into a new class, DivisionRepr. This refactoring is done so that the common
division utilities can be shared in an upcoming patch.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D129146
Previously we recorded AllocationBase as the base address of the region
we get from VirtualQueryEx. However, this is the base of the allocation,
which can later be split into more regions.
So you got stuff like:
[0x00007fff377c0000-0x00007fff377c1000) r-- PECOFF header
[0x00007fff377c0000-0x00007fff37840000) r-x .text
[0x00007fff377c0000-0x00007fff37870000) r-- .rdata
Where all the base addresses were the same.
Instead, use BaseAddress as the base of the region. So we get:
[0x00007fff377c0000-0x00007fff377c1000) r-- PECOFF header
[0x00007fff377c1000-0x00007fff37840000) r-x .text
[0x00007fff37840000-0x00007fff37870000) r-- .rdata
https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-memory_basic_information
The added test checks for any overlapping regions which means
if we get the base or size wrong it'll fail. This logic
applies to any OS so the test isn't restricted to Windows.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D129272
After 82ba3f4, we (again) need to call lldb_enable_attach to be able to
attach to processes on linux. This was a new test, so it does not have
the necessary boilerplate.
It might be an oversight that pass OrcAArch64 as template parameter to stubAndPointerRangesOk on MIps.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D129076
This recommits b15b1421, which reverted in was reverted in f51c47d98 due to
failures on apple systems. The problem was that the patch introduced a race
where the debug server could start the attach process before the first process
(which isn't supposed to be attached to) was set up. This caused us to attach
to the wrong process.
The new version introduces additional synchronization to ensure that does not
happen.
Original commit message was:
As the documentation states, using this is not safe in multithreaded
programs, and I have traced it to a rare deadlock in some of the tests.
The reason this was introduced was to be able to attach to a program
from the very first instruction, where our usual mechanism of
synchronization -- waiting for a file to appear -- does not work.
However, this is only needed for a single test
(TestGdbRemoteAttachWait) so instead of doing this everywhere, I create
a bespoke solution for that single test. The solution basically
consists of outsourcing the preexec_fn code to a separate (and
single-threaded) shim process, which enables attaching and then executes
the real program.
This pattern could be generalized in case we needed to use it for other
tests, but I suspect that we will not be having many tests like this.
This effectively reverts commit
a997a1d7fb.
Not deleting the loose instruction with metadata associated to it causes
an assertion when the LLVMContext is destroyed. This was previously
hidden by the fact that llvm-c-test does not call LLVMShutdown. The
planned removal of ManagedStatic exposed this issue.
Differential Revision: https://reviews.llvm.org/D129114
The `unknownTypeConversion` bufferization option (enum) is now a type converter function option. Some logic of `getMemRefType` is now handled by that function.
This change makes type conversion more controllable. Previously, there were only two options when generating memref types for non-bufferizable ops: Static identity layout or fully dynamic layout. With this change, users of One-Shot Bufferize can provide a function with custom logic.
Differential Revision: https://reviews.llvm.org/D129273
The newline is used by Disassembler.cpp (`emitComments`) to work out how to
format them properly, and if there's no newline it goes into an infinite loop.
Unfortunately I couldn't get llvm-objdump to be affected, only the MacOS otool
utility which dlopens libLTO.
`__bolt_instr_data_dump()` does not lock the hash tables when iterating
over them, so the iteration can happen concurrently with a modification
done in another thread, when the table is in an inconsistent state. This
also has been observed in practice, when it caused a segmentation fault.
We fix this by locking hash tables during iteration. This is done by taking
the lock in `forEachElement()`.
The only other site of iteration, `resetCounters()`, has been correctly
locking the table even before this patch. This patch removes its `Lock`
because the lock is now taken in the inner `forEachElement()`.
Reviewed By: maksfb, yota9
Differential Revision: https://reviews.llvm.org/D129089
This Transform dialect op allows one to merge the lists of Payload IR
operations pointed to by several handles into a single list associated with one
handle. This is an important Transform dialect usability improvement for cases
where transformations may temporarily diverge for different groups of Payload
IR ops before converging back to the same script. Without this op, several
copies of the trailing transformations would have to be present in the
transformation script.
Depends On D129090
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129110
Introduce a new transformation on structured ops that splits the iteration
space into two parts along the specified dimension. The index at which the
splitting happens may be static or dynamic. This transformation can be seen as
a rudimentary form of index-set splitting that only supports the splitting
along hyperplanes parallel to the iteration space hyperplanes, and is therefore
decomposable into per-dimension application.
It is a key low-level transformation that enables independent scheduling for
different parts of the iteration space of the same op, which hasn't been
possible previously. It may be used to implement, e.g., multi-sized tiling. In
future, peeling can be implemented as a combination of split-off amount
computation and splitting.
The transformation is conceptually close to tiling in its separation of the
iteration and data spaces, but cannot be currently implemented on top of
TilingInterface as the latter does not properly support `linalg.index`
offsetting.
Note that the transformation intentionally bypasses folding of
`tensor.extract_slice` operations when creating them as this folding was found
to prevent repeated splitting of the same operation because due to internal
assumptions about extract/insert_slice combination in dialect utilities.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129090
The constructor does `Saver(Alloc)`, so `Alloc` should be
initialized first. Move `Alloc` up in the declaration order.
Fixes a -Wuninitialized warning when building with GCC 12.1.
Reported-by: Mihail Atanassov <mihail.atanassov@arm.com>
D127209 fixed LLVM to bring it in line with the AAPCS. The fix affects
functions where the first SVE parameter appears in the 9th or later
arguments, and the function does not return an SVE type.
Differential Revision: https://reviews.llvm.org/D129135
This patch makes use of TypeInterface implementing Type to remove instances of Type that simply checked whether a type implemented a given interface.
As part of this refactoring, some changes had to be done in the OpenMP Dialect files. In particular, they assumed that OpenMPOps.td to only ever include OpenMP TypeInterfaces, which did not hold anymore with a moved include in LLVMOpBase.td. For that reason, the type interface defintions were moved into a new file as is convention.
Differential Revision: https://reviews.llvm.org/D129210
By making TypeInterfaces and AttrInterfaces, Types and Attrs respectively it'd then be possible to use them anywhere where a Type or Attr may go. That is within the arguments and results of an Op definition, in a RewritePattern etc.
Prior to this change users had to separately define a Type or Attr, with a predicate to check whether a type or attribute implements a given interface. Such code will be redundant now.
Removing such occurrences in upstream dialects will be part of a separate patch.
As part of implementing this patch, slight refactoring had to be done. In particular, Interfaces cppClassName field was renamed to cppInterfaceName as it "clashed" with TypeConstraints cppClassName. In particular Interfaces cppClassName expected just the class name, without any namespaces, while TypeConstraints cppClassName expected a fully qualified class name.
Differential Revision: https://reviews.llvm.org/D129209
This is done during type legalization since the target representation of
these nodes may not be valid until after type legalization, and after
type legalization the fact that these are dealing with i1 types may be
lost.
Differential Revision: https://reviews.llvm.org/D128996
BasicAA will already call getModRefBehavior() on the Function of
the CallBase if there are no operand bundles. This happens through
getBestAAResults(), i.e. it is a recursive call that will query
other AA providers, not just the BasicAA implementation.
As such, there is no need to reimplement the same functionality
in GlobalsModRef, a combination of BasicAA and GlobalsModRef already
handles it. This does mean that this no longer works under
-disable-basic-aa, but that's a testing only option.
Instead of using <vscale x 16 x i1> for all the loads/stores, we now use the appropriate
predicate type according to the element size, e.g.
ld1b uses <vscale x 16 x i1>
ld1w uses <vscale x 4 x i1>
ld1q uses <vscale x 1 x i1>
Reviewed By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D129083
Truncates and compares require some changes to generic legalisation functions
to use ElementCount instead of getNumElements.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D129082
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.
The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.
I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.
Differential Revision: https://reviews.llvm.org/D129205
This removes a part of the now obsolete formater code.
The removal also removes the _v2 suffix where it's no longer needed.
Depends on D128785
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D128846
This changes the implementation of the formatter. Instead of inheriting
from a specialized parser all formatters will use the same generic
parser. This reduces the binary size.
The new parser contains some additional fields only used in the chrono
formatting. Since this doesn't change the size of the parser the fields
are in the generic parser. The parser is designed to fit in 128-bit,
making it cheap to pass by value.
The new format function is a const member function. This isn't required
by the Standard yet, but it will be after LWG-3636 is accepted.
Additionally P2286 adds a formattable concept which requires the member
function to be const qualified in C++23. This paper is likely to be
accepted in the 2022 July plenary.
This is based on D125606. That commit did the groundwork and did similar
changes for the string formatters.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D128785