LIBCLANG_INCLUDE_CLANG_TOOLS_EXTRA causes clang-tools-extra tools
to be included in libclang, which caused a dependency cycle. The option
has been off by default for two releases now, and (based on a web search
and mailing list feedback) nobody seems to turn it on. Remove it, like
planned on https://reviews.llvm.org/D79599
Differential Revision: https://reviews.llvm.org/D97693
This seems to be more of a Clang thing rather than a generic LLVM thing,
so this moves it out of LLVM pipelines and as Clang extension hooks into
LLVM pipelines.
Move the post-inline EEInstrumentation out of the backend pipeline and
into a late pass, similar to other sanitizer passes. It doesn't fit
into the codegen pipeline.
Also fix up EntryExitInstrumentation not running at -O0 under the new
PM. PR49143
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D97608
In `LateEHPrepare::addCatchAlls`, the current code tries to get the
iterator's debug info even when it is `MachineBasicBlock::end()`. This
fixes the bug by adding empty debug info instead in that case.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D97679
The interface served a purpose, but since the ability to emit diagnostics when parsing configuration was added, its become mostly redundant. Emitting the diagnostic and removing the boilerplate is much cleaner.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97614
The code previously used two BUILD_PAIRs to concatenate the two UMULO
results with 0s in the lower bits to match original VT. Then it created
an ADD and a UADDO with the original bit width. Each of those operations
need to be expanded since they have illegal types.
Since we put 0s in the lower bits before the ADD, the lower half of the
ADD result will be 0. So the lower half of the UADDO result is
solely determined by the other operand. Since the UADDO need to
be split in half, we don't really needd an operation for the lower
bits. Unfortunately, we don't see that in type legalization and end up
creating something more complicated and DAG combine or
lowering aren't always able to recover it.
This patch directly generates the narrower ADD and UADDO to avoid
needing to legalize them. Now only the MUL is done on the original
type.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97440
- Categories will now show up as `MyClass(Category)` instead of
`Category` and `MyCategory()` instead of `(anonymous)` in document
symbols
- Methods will now be shown as `-selector:` or `+selector:`
instead of `selector:` to differentiate between instance and class
methods in document symbols
Differential Revision: https://reviews.llvm.org/D96612
This adds IntrWillReturn to the gfx90a mfma intrinsics, to match all the
other mfma intrinsics, and llvm.amdgcn.live.mask, to match
llvm.amdgcn.ps.live.
Differential Revision: https://reviews.llvm.org/D97675
Move the results in line with the op instead. This results in each
operation having its own types recorded vs single tuple type, but comes
at benefit that every mutation doesn't incurs uniquing. Ran into cases
where updating result type of operation led to very large memory usage.
Differential Revision: https://reviews.llvm.org/D97652
The new Darwin backend for LLD is now able to link reasonably large
real-world programs on x86_64. For instance, we have achieved
self-hosting for the X86_64 target, where all LLD tests pass when
building lld with itself on macOS. As such, we would like to make it the
default back-end.
The new port is now named `ld64.lld`, and the old port remains
accessible as `ld64.lld.darwinold`
This [annoucement email][1] has some context. (But note that, unlike
what the email says, we are no longer doing this as part of the LLVM 12
branch cut -- instead we will go into LLVM 13.)
Numerous mechanical test changes were required to make this change; in
the interest of creating something that's reviewable on Phabricator,
I've split out the boring changes into a separate diff (D95905). I plan to
merge its contents with those in this diff before landing.
(@gkm made the original draft of this diff, and he has agreed to let me
take over.)
[1]: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147665.html
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D95204
There was initially some concern around the correct handling of pcrel
section relocations with r_length != 2. But it looks like there are no such
relocations in practice -- x86_64's pcrel section relocs all have r_length == 2,
and ARM64 doesn't even have pcrel section relocs. So we can replace the TODO
with an assert.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D97576
This is a patch that updates the cost of `select i1 a, b, false` to be equivalent to that of `and i1 a, b`
as well as the cost of `select i1 a, true, b` equivalent to `or i1 a, b`.
Until now, these selects were folded into and/or i1 by InstCombine, but the transformation is poison-unsafe.
This is a step towards removing the unsafe transformation. D93065 has relevant transformations linked.
These selects should be translated into the assemblies as and/or i1 do in the same manner. The cost should be equivalent.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97360
Before this patch, we could only link against the back-deployment libc++abi
dylib. This patch allows linking against the just-built libc++abi, but
running against the back-deployment one -- just like we do for libc++.
Also, add XFAIL markup to flag expected errors.
Differential Revision: https://reviews.llvm.org/D91069
On clang emits the compiler version string into debug information
by default for both dwarf and codeview. That makes compiler output
needlessly compiler-version-dependent which makes e.g. comparing
object file outputs during a bisect hard. So it's nice if there's
an easy way to turn this off.
(On ELF, this flag also controls the .comment section, but that
part is ELF-only. The debug-info bit isn't.)
Differential Revision: https://reviews.llvm.org/D97695
Update the deletion order when destroying VPBasicBlocks. This ensures
recipes that depend on earlier ones in the block are removed first.
Otherwise this may cause issues when recipes have remaining users later
in the block.
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via TABLE_NUMBER
relocations against a table symbol.
Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.
Differential Revision: https://reviews.llvm.org/D90948
Under -O3 and -Ofast, the MASSV conversion prevents the sqrt call to be inlined.
Inline sqrt is faster than MASSV call on leppc.
Differential Revision: https://reviews.llvm.org/D97487
Skip the AVX-related lldb-server test on non-x86 architectures, as they
do not support AVX. While technically the test worked on Linux because
the AVX check would simply return false, other platforms do not provide
such a straightforward way of checking for AVX (especially remotely),
and the results of such check may need to be interpreted specially
for the platform in question.
Differential Revision: https://reviews.llvm.org/D97450
Use realpath() when spawning the executable create_after_attach
to workaround a FreeBSD plugin (and possibly others) problem.
If the executable is started via a path containing a symlink, it is
added to the module list twice -- via the real and apparent path.
This in turn cases the requested breakpoint to resolve twice.
Use realpath() for main program path in lldb-vscode breakpoint tests
to workaround a similar problem. If the passed path does not match
the realpath, lldb-vscode does not report the breakpoints as verified
and causes tests to fail.
Since the underlying problems are non-trivial to fix and the purpose
of these tests is not to reproduce symlink problems, let's apply
trivial workarounds to make them pass.
Differential Revision: https://reviews.llvm.org/D97230
The expected use case is for frontends to insert this into
shaders that are to be run under a debugger. The shader can
then be resumed or single stepped from the point of the call
under debugger control.
Differential Revision: https://reviews.llvm.org/D97670
Semantic checks for the following OpenMP 4.5 clauses.
1. 2.15.4.2 - Copyprivate clause
2. 2.15.3.4 - Firstprivate clause
3. 2.15.3.5 - Lastprivate clause
Add related test cases and resolve test cases marked as XFAIL.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D91920
I copied the nearly identical function from AArch64 into AMDGPU, so
fix this duplication.
Mips and X86 have their own more exotic versions which should be
removed. However replacing those is better left for a separate patch
since it requires other changes to avoid regressions.
This was reusing the parent function calling convention instead of the
callee. I'm not sure if there's a case where there's an observable
difference.
I previously missed this in b72a23650f
Moved some of the `sve-getIntrinsicCost-<..>` into a single sve-intrinsics.ll
file, and simplified the tests a bit by bundling all the intrinsics in one
function (instead of testing one intrinsic per function). That makes it easier
to see the cost of the intrinsics.
For ops that produces tensor types and implement the shaped type component interface, the type inference interface can be used. Create a grouping of these together to make it easier to specify (it cannot be added into a list of traits, but must rather be appended/concated to one as it isn't a trait but a list of traits).
Differential Revision: https://reviews.llvm.org/D97636
Given a zero input for a udot, an add can be folded in to take the place
of the input, using thte addition that the instruction naturally
performs.
Differential Revision: https://reviews.llvm.org/D97188
See original comment in 560ce2c70f
Baiscally the default seed value results in less collision, but changes the
iteration order, which matters for a few test cases.
Differential Revision: https://reviews.llvm.org/D97396
Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem
for vector masks as RVV isn't able to slide mask types around. We choose
instead to bitcast to equivalently-sized i8 types where we can, else we
zero-extend, perform the operation, and truncate back down.
One test was left disabled due to a crash in the legalizer.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97559
This patch fixes a bug where the lowering for INSERT_SUBVECTOR and
EXTRACT_SUBVECTOR would insist on first extracting a register-aligned
LMUL1 vector type before perfoming the slide up/down. This was even if
the vector was a fractional LMUL type, in which case the aligned
EXTRACT_SUBVECTOR was invalid.
This issue only occurred for scalable vector types, but a variety of
tests for both scalable and fixed-length vectors have been added to
ensure this does not regress in the future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97556
This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR
operations under one roof. Consequently, with this patch it is possible to
support any fixed-length subvector insertion, not just "cast-like" ones.
As before, support for the insertion of mask vectors will come in a
separate patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97543
This patch adds support for extracting subvectors from vector masks.
This can be either extracting a scalable vector from another, or a fixed-length
vector from a fixed-length or scalable vector.
Since RVV lacks a way to slide vector masks down on an element-wise
basis and we don't know the true length of the vector registers, in many
cases we must resort to using equivalently-sized i8 vectors to perform
the operation. When this is not possible we fall back and extend to a
suitable i8 vector.
Support was also added for fixed-length truncation to mask types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97475
Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue
are correctly initialized to the recovered value.
For lambda capture, we also need to make sure to fill the LambdaCaptureFields
Differential Revision: https://reviews.llvm.org/D97534
This patch updates LV to generate the runtime checks just after cost
modeling, to allow a more precise estimate of the actual cost of the
checks. This information will be used in future patches to generate
larger runtime checks in cases where the checks only make up a small
fraction of the expected scalar loop execution time.
The runtime checks are created up-front in a temporary block to allow better
estimating the cost and un-linked from the existing IR. After deciding to
vectorize, the checks are moved backed. If deciding not to vectorize, the
temporary block is completely removed.
This patch is similar in spirit to D71053, but explores a different
direction: instead of delaying the decision on whether to vectorize in
the presence of runtime checks it instead optimistically creates the
runtime checks early and discards them later if decided to not
vectorize. This has the advantage that the cost-modeling decisions
can be kept together and can be done up-front and thus preserving the
general code structure. I think delaying (part) of the decision to
vectorize would also make the VPlan migration a bit harder.
One potential drawback of this patch is that we speculatively
generate IR which we might have to clean up later. However it seems like
the code required to do so is quite manageable.
Reviewed By: lebedev.ri, ebrevnov
Differential Revision: https://reviews.llvm.org/D75980
This reverts commit 07de0846a5.
The original patch has caused 6 out 8 of Flang's public buildbots to
fail. As I'm not sure what the fix should be, I'm reverting this for
now. Please see https://reviews.llvm.org/D97201 for more context and
discussion.
This patch addresses issues arising from the fact that the index type
used for subvector insertion/extraction is inconsistent between the
intrinsics and SDNodes. The intrinsic forms require i64 whereas the
SDNodes use the type returned by SelectionDAG::getVectorIdxTy.
Rather than update the intrinsic definitions to use an overloaded index
type, this patch fixes the issue by transforming the index to the
correct type as required. Any loss of index bits going from i64 to a
smaller type is unexpected, and will be caught by an assertion in
SelectionDAG::getVectorIdxConstant.
The patch also updates the documentation for INSERT_SUBVECTOR and adds
an assertion to its creation to bring it in line with EXTRACT_SUBVECTOR.
This necessitated changes to AArch64 which was using i64 for
EXTRACT_SUBVECTOR but i32 for INSERT_SUBVECTOR. Only one test changed
its codegen after updating the backend accordingly.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D97459
Currently dead gc value mentioned in the deopt section are not listed in gc section
and so are processed separately.
With this CL all deopt gc values are considered as base pointers and processed in the
same way as other gc values.
The fact that deopt gc pointer is a base pointer was used all the time but
it is explicitly documented here by putting the value in SI.Base.
The idea of the patch comes from Philip Reames.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D97554