Environments are optional and a missing environment is distinct from
the default "unknown" environment enumerator. The test is negative,
because the function uses the host triple and is unpredictable.
rdar://91007207
https://reviews.llvm.org/D122946
Differential Revision: https://reviews.llvm.org/D122946
I noticed that when --update-section was added to llvm-objcopy it was
not added to the command guide, see
25bcd94234. This change adds it to the
docs and updates the help text.
Differential Revision: https://reviews.llvm.org/D122907
The actual argument shall have deferred the same type parameters as
the dummy argument if the argument is allocatable or pointer variable.
Currently programs not following this get one crash during execution.
Reviewed By: Jean Perier
Differential Revision: https://reviews.llvm.org/D122779
@thakis believes the problem was the lack of -n on my llvm-cxxfilt call,
so hopefully this is the only problem. Committing to see if this makes
all the buildbots happy.
This is a no-op in these files since the symlinks array is never empty
and the dependency to the base binary is added through the loop in these
cases.
But adding them doesn't hurt either, and it:
1. Makes all symlinks targets look the same, independent of symlinks
are created always or just conditionally based on gn args
2. Makes it less likely that bugs like the one fixed by b0abada8fe
are introduced by copy-pasting an existing symlink target and then
not being careful enough when tweaking it.
No behavior change.
AND the followups that fixed builds.
I attempted to get 'cute' and use llvm-cxxfilt to make the test look
nicer, but apparently some of the bots have a version of llvm-cxxfilt
that is not the in-tree one, so it fails to properly demangle the stuff.
I've disabled this "RUN" line.
This reverts commit 50186b63d1.
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.
Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.
This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is
complete and self-contained, i.e. CSR restore instructions (and the
corresponding CFI instructions are not split across two or more
blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, or
has not executed any epilogue
- "does not have a call frame", if the function has not executed the
prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO
traversal.
In order to accommodate backends which do not generate unwind info in
epilogues we compute an additional property "strong no call frame on
entry" which is set for the entry point of the function and for every
block reachable from the entry along a path that does not execute the
prologue. If this property holds, it takes precedence over the "has a
call frame" property.
From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions,
which come in two flavours:
- CFI instructions, which reset the unwind table state to the
initial one. This is done by a target specific hook and is
expected to be trivial to implement, for example it could be:
```
.cfi_def_cfa <sp>, 0
.cfi_same_value <rN>
.cfi_same_value <rN-1>
...
```
where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one
created by the function prologue. These are the sequence:
```
.cfi_restore_state
.cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
Both > and >> expressions need to be parenthesized inside template
argument lists.
Reviewed By: dblaikie, rjmccall
Differential Revision: https://reviews.llvm.org/D122474
This fixes a regression from 69cde915e923d: If llvm_install_cctools_symlinks
is false, depending llvm-lipo:symlinks didn't actually depend on llvm-lipo
and the binary didn't get built as dependency of `check-lld` (because the
`symlinks` array ended up empty).
Note that `generate_assertion_tests.py` will be renamed to
`generate_header_tests.py` separately to facilitate change tracking.
Differential Revision: https://reviews.llvm.org/D123000
This patch adds basic modeling of `__builtin_expect`, just to propagate the
(first) argument, making the call transparent.
Driveby: adds tests for proper handling of other builtins.
Differential Revision: https://reviews.llvm.org/D122908
Few times in different methods of the EmitAssemblyHelper class the following
code snippet is used to get the TargetTriple and then use it's single method
to check some conditions:
TargetTriple(TheModule->getTargetTriple())
The parsing of a target triple string is not a trivial operation and it takes
time to repeat the parsing many times in different methods of the class and
even numerous times in one method just to call a getter
(llvm::Triple(TheModule->getTargetTriple()).getVendor()), for example.
The patch extracts the TargetTriple member of the EmitAssemblyHelper class to
parse the triple only once in the class' constructor.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D122587
A vector mul(sext, sext) or mul(zext, zext) will be code generated as a
single smull or umull instruction. This most notably effects v2i64
multiplies, which are otherwise not legal and need to be expanded.
The oneuse check has also been slightly changed, as it is already
checked from the use of isWideningInstruction in getCastInstrCost.
Differential Revision: https://reviews.llvm.org/D123006
This updates the disassembler to enable every optional extension.
Previously we had added things that we added "support" for in lldb.
(where support means significant work like new registers, fault types, etc.)
Something like TME (transactional memory) wasn't added because
there are no new lldb features for it. However we should still be
disassembling the instructions.
So I went through the AArch64 extensions and added all the missing
ones. The new test won't prevent us missing a new extension but it
does at least document our current settings.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D121999
With opaque pointers, we can eliminate zero-index GEPs even if
they have multiple indices, as this no longer impacts the result
type of the GEP.
This optimization is already done for instructions in InstSimplify,
but we were missing the corresponding constant expression handling.
The constexpr transform is a bit more powerful, because it can
produce a vector splat constant and also handles undef values --
it is an extension of an existing single-index transform.
These are mostly small changes to make the code a bit clearer and more
consistent. Summary of changes:
* add missing namespace qualifiers (that's the preference in Flang)
* replace const member methods with static methods (to avoid passing
the *this pointer unnecessarily)
* rename `currentObjTy` (current object type) as `cpnTy` (component
type) - the latter feels more fitting
* remove redundant `return failure();` calls (` return
mlir::emitError` gives the same result)
* updated a few comments
Differential Revision: https://reviews.llvm.org/D122799
Update VPInterleavedAccessInfo to use the generic getVectorLoopRegion
helper instead of relying on the entry block being the top-most vector
loop region.
If both the character and string are known, but the length
potentially isn't, we can optimize the memchr() call to a select
of either the known position of the character or null.
Split off from https://reviews.llvm.org/D122836.
Handle the simple constant char case before the bitmask optimization.
This will allow extending the code to handle a non-constant size
argument in a followup change.
Split out from https://reviews.llvm.org/D122836.
This patch adds tests for the array-value-copy pass with array assignment
involving Fortran pointers.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D122878
If the memchr() size is 1, then we can convert the call into a
single-byte comparison. This works even if both the string and the
character are unknown.
Split off from https://reviews.llvm.org/D122836.
As discussed on https://github.com/llvm/llvm-project/issues/54682,
MemorySSA currently has a bug when computing the clobber of calls
that access loop-varying locations. I think a "proper" fix for this
on the MemorySSA side might be non-trivial, but we can easily work
around this in MemCpyOpt:
Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess()
call to find a clobber on either the src or dest location, and then
refines it for the src and dest clobber. This was intended as an
optimization, as the location-less API is cached, while the
location-affected APIs are not.
However, I don't think this really makes a difference in practice,
because I don't think anything will use the cached clobbers on
those calls later anyway. On CTMark, this patch seems to be very
mildly positive actually.
So I think this is a reasonable way to avoid the problem for now,
though MemorySSA should also get a fix.
Differential Revision: https://reviews.llvm.org/D122911
The range calculation in walkForwards() assumes that the ranges of
the operands have already been calculated. With the used visit
order, this is not necessarily the case when there are multiple
roots. (There is nothing guaranteeing that instructions are visited
in topological order.)
Fix this by queuing instructions for reprocessing if the operand
ranges haven't been calculated yet.
Fixes https://github.com/llvm/llvm-project/issues/54669.
Differential Revision: https://reviews.llvm.org/D122817
In case a character component PDT length only depends on kind parameters,
fold it while instantiating the PDT. This is especially important if the
component has an initializer because later semantic phases (offset
computation or runtime type info generation) might get confused and
generate offset/type info that will lead to crashes in lowering.
Differential Revision: https://reviews.llvm.org/D122938
This patch adds FIR to LLVM test for fir.address_of.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D122889
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>