While attempting to get the 64-bit lsan allocator working for Fuchsia, I
noticed this function would incorrectly return false for pointers returned
by the 64-bit allocator. On AArch64, this function attempts to get the VMA
size dynamically by counting the number of leading zeros from the function
frame address. This will fail if the frame address is significantly below an
allocated pointer (that is, the frame address has more leading zeros than an
allocated pointer). This is possible on Fuchsia and linux (when not called
from the initial thread stack).
It seems the intended use of this function is to speed up pointer scanning by
filtering out addresses that user code might not be able to access. Other
platforms this check is done on seem to hardcode the VMA size/shift, so it
seems appropriate to do this for aarch64 as well. This implies pointers on
aarch64 where the VMA size is <64 will pass through, but bad pointers will
still be caught by subsequent scan checks.
This patch also renames the function to something more fitting of what it's
trying to do.
Differential Revision: https://reviews.llvm.org/D123814
Even though single address image instructions only use a single VGPR
HW accesses 4 or 5 which creates alignment requirement.
Fixes: SWDEV-316648
Differential Revision: https://reviews.llvm.org/D126009
This commit builds upon recently added indexing support for C++ concepts
from https://reviews.llvm.org/D124441 by extending libclang to
support indexing and visiting concepts, constraints and requires
expressions as well.
Differential Revision: https://reviews.llvm.org/D126031
Whether a unit number in an inquire-by-unit statement is valid or not,
it should be the value to which the NUMBER= variable is set, not -1.
-1 should be returned to NUMBER= only for an inquire-by-file statement
when the FILE= is not connected to any unit.
Differential Revision: https://reviews.llvm.org/D126145
This reverts commit dfe513ae1b.
Tests have been changed to avoid the type legalization bug being
fixed in D126036.
Original commit message:
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
When a Hollerith (or short character) literal is presented as an actual
argument that corresponds to a dummy argument for which a BOZ literal
would be acceptable, treat the Hollerith as if it had been a BOZ
literal in the same way -- and with the same code -- as f18 already
does for the similar extension in DATA statements.
Differential Revision: https://reviews.llvm.org/D126144
Clang has recently started diagnosing prototype redeclaration errors like [rG385e7df33046](https://reviews.llvm.org/rG385e7df33046d7292612ee1e3ac00a59d8bc0441)
This flagged legitimate issues in a codebase but was confusing to resolve because it actually conflicted with a function declaration from a system header and not from the one emitted with "note: ".
This patch updates the error handling to use the canonical declaration's source location instead to avoid misleading errors like the one described.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D126258
This commit breaks down diagnostic string literals so that the attribute
name and enumurator names can be shared with the stringify utility
function and the "expected ", " to be one of ", and ", " can be shared
between different enum-related diagnostic.
Differential Revision: https://reviews.llvm.org/D125938
Now that the requirements and implementation of asynchronous I/O are
better understood, adjust their I/O runtime APIs. In particular:
1) Remove the BeginAsynchronousOutput/Input APIs; they're not needed,
since any data transfer statement might have ASYNCHRONOUS= and
(if ASYNCHRONOUS='YES') ID= control list specifiers that need to
at least be checked.
2) Add implementations for BeginWait(All) to check for the error
case of a bad unit number and nonzero ID=.
3) Rearrange and comment SetAsynchronous so that it's clear that
it can be called for READ/WRITE as well as for OPEN.
The implementation remains completely synchronous, but should be conforming.
Where opportunities make sense for true asynchronous implementations of
some big block transfers without SIZE= in the future, we'll need to add
a GetAsynchronousId API to capture ID= on a READ or WRITE; add sourceFile
and sourceLine arguments to BeginWait(All) for good error reporting;
track pending operations in unit.h; and add code to force synchronization
to non-asynchronous I/O operations.
Lowering should call SetAsynchronous when ASYNCHRONOUS= appears as
a control list specifier. It should also set ID=x variables to 0
until such time as we support asynchronous operations, if ever.
This patch only removes the removed APIs from lowering.
Differential Revision: https://reviews.llvm.org/D126143
The second argument to is_fp_class specifies the set of floating-point
class to test against. It can be zero, in this case the intrinsic is
expected to return zero value.
Differential Revision: https://reviews.llvm.org/D112025
Followup to D125988 - FPOW is similar to FREM and will most likely scalarize to libcalls, so unroll before widening to prevent use making additional libcalls with UNDEF args.
Any zext 'sink' should already have an operand that is in the legal
value, so avoid using a trunc and just use the trunc operand instead.
Differential Revision: https://reviews.llvm.org/D118905
Add lowering for cases where the reduction dimension is fully unrolled.
It is common to unroll the reduction dimension, therefore we would want
to lower the contractions to an elementwise vector op in this case.
Differential Revision: https://reviews.llvm.org/D126120
We were using the default getScalarizationOverhead expansion for extraction costs, which adds up all the individual element extraction costs.
This is fine for 128-bit vectors, but for 256/512-bit vectors each element extraction also has to account for extracting the upper 128-bit subvector extraction before it can handle the element. For scalarization costs we only need to extract each demanded subvector once.
Differential Revision: https://reviews.llvm.org/D125527
Bitcasts were stripped in one case, but not the other. Of course,
this no longer really matters with opaque pointers, but as I went
through the trouble of tracking this down, we may as well remove
one typed vs opaque pointer optimization discrepancy.
Use IRBuilder so that the newly created freeze instructions
automatically gets inserted back into the IC worklist.
The changed worklist processing order leads to some cosmetic
differences in tests.
Fixes https://github.com/llvm/llvm-project/issues/55619.
Error-tolerant bracket matching enables our error-tolerant parsing strategies.
The implementation here is *not* yet error tolerant: this patch sets up the APIs
and plumbing, and describes the planned approach.
Differential Revision: https://reviews.llvm.org/D125911
We use the clang-linker-wrapper to perform device linking of embedded
offloading object files. This is done by generating those jobs inside of
the linker-wrapper itself. This patch adds an argument in Clang and the
linker-wrapper that allows users to forward input to the device linking
phase. This can either be done for every device linker, or for a
specific target triple. We use the `-Xoffload-linker <arg>` and the
`-Xoffload-linker-<triple> <arg>` syntax to accomplish this.
Reviewed By: markdewing, tra
Differential Revision: https://reviews.llvm.org/D126226
To be used correctly in a sort-like function, isFirstInsertElement
function must follow weak strict ordering rule, i.e.
isFirstInsertElement(IE1, IE1) should return false.
Currently, two element vectors produced as the result of a binary op are
widened to four element vectors on x86 by
DAGTypeLegalizer::WidenVecRes_BinaryCanTrap. If the op still isn't legal
after widening it is unrolled into 4 scalar ops in SelectionDAG before
being converted into a libcall. This way we end up with 4 libcalls (two of
them on known undef elements) instead of the original two libcalls.
This patch modifies DAGTypeLegalizer::WidenVectorResult to ensure that if
it is known that a binary op will be tunred into a libcall, it is unrolled
instead of being widened. This prevents the creation of the extra scalar
instructions on known undef elements and (eventually) libacalls with known
undef parameters which would otherwise be created when the op gets expanded
post widening.
Differential Revision: https://reviews.llvm.org/D125988
It is breaking the build with:
/build/llvm-toolchain-snapshot-15~++20220524114008+96323c9f4c10/llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp:478:10: fatal error: 'M68kGenMCCodeBeads.inc' file not found
^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Remove the #include causes:
error: undefined reference to 'llvm::M68k::getMCInstrBeads(unsigned int)'
This reverts commit f50be3d218.
These were left over from when we converted to the update_llc_test_checks script and were just matching some asm/cfi directives - we have CHECK-LABEL to do this properly now.
For generic targets such as SPIR-V clang sets all OpenCL
extensions/features as supported by default. However
concrete targets are unlikely to support all extensions
features, which creates a problem when such generic SPIR-V
binary is compiled for a specific target later on.
To allow compile time diagnostics for unsupported features
this flag is now being exposed in the clang driver.
Differential Revision: https://reviews.llvm.org/D125243
This NFC patch is splited from D111617.
Using llvm::ArrayRef rather than llvm::SmallVector, ArrayRef is more generic
interface that could accept both llvm::ArrayRef and llvm::SmallVector.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D125893
Most of the folds implemented in this function work fine with
logical operations. We only need to be careful for the cases that
work on non-constant masks, where the RHS operand shouldn't be
poison.
This is a conservative implementation that bails out of illegal
transforms, but we could also change these to insert freeze instead.