I've also made a stab at imposing some more order on where and how we add
attributes; this part should be NFC. I wasn't sure whether the CUDA use
case for libdevice should propagate CPU/features attributes, so there's a
bit of unnecessary duplication.
This is split off from D79799 - where I was proposing to fully iterate
over a function until there are no more transforms. I suspect we are
still going to want to do something like that eventually.
But we can achieve the same gains much more efficiently on the current
set of regression tests just by reversing the order that we visit the
instructions.
This may also reduce the motivation for D79078, but we are still not
getting the optimal pattern for a reduction.
Given a VQMOVN(VSHR), we can fold that into a VQSHRN simply enough using
a few tablegen patterns.
Differential Revision: https://reviews.llvm.org/D77720
This is basically the same patch as D63233, but converted to
funnel shifts rather than regular shifts. I did not see a
way to effectively share code for these 2 cases though.
This follows D79718 and D79827 to re-fix PR37426 because
that gets canonicalized to funnel shift intrinsics in IR.
I did draft an alternative patch as an enhancement to
"shouldSinkOperands()", but that was awkward because
we have to key the transform from the select, but then
look at both its users and its operands.
This adds two combines for VMOVN, one to fold
VMOVN[tb](c, VQMOVNb(a, b)) => VQMOVN[tb](c, b)
The other to perform demand bits analysis on the lanes of a VMOVN. We
know that only the bottom lanes of the second operand and the top or
bottom lanes of the Qd operand are needed in the result, depending on if
the VMOVN is bottom or top.
Differential Revision: https://reviews.llvm.org/D77718
This adds some custom lowering for VQMOVN, an instruction that can be
used to perform saturating truncates from a pair of min(max(X, -0x8000),
0x7fff), providing those constants are correct. This leaves a VQMOVNBs
which saturates the value and inserts that into the bottom lanes of an
existing vector. We then need to do something with the other lanes,
extending the value using a vmovlb.
Ideally, as will often be the case, only the bottom lane of what remains
will be demanded, allowing the vmovlb to be removed. Which should mean
the instruction is either equal or a win most of the time, and allows
some extra follow-up folding to happen.
Differential Revision: https://reviews.llvm.org/D77590
computeKnownBitsFromAssume() currently asserts if m_V matches a
ptrtoint that changes the bitwidth. Because InstCombine
canonicalizes ptrtoint instructions to use explicit zext/trunc,
we never ran into the issue in practice. I'm adding unit tests,
as I don't know if this can be triggered via IR anywhere.
Fix this by calling anyextOrTrunc(BitWidth) on the computed
KnownBits. Note that we are going from the KnownBits of the
ptrtoint result to the KnownBits of the ptrtoint operand,
so we need to truncate if the ptrtoint zexted and anyext if
the ptrtoint truncated.
Differential Revision: https://reviews.llvm.org/D79234
Like other uses of ALLOW_RETRIES, this test tried to verify that an API
returned "quickly" but quick is not safe to define given slow and/or
busy machines.
Instead, we now verify that these "wait" APIs actually wait, which the
old test did not.
The code was calculating an offset from a stack pointer SDValue.
This is exactly what getMemBasePlusOffset does. I also replaced
sizeof(int) with a hardcoded 4. We know the type we're operating
on is 4 bytes. But the size of int that the source code is being
compiled with isn't guaranteed to be 4 bytes.
While here replace another use of getMemBasePlusOffset that was
proceeded with a call to getConstant with the other signature
that call getConstant internally.
This bug is exposed by Test7 of ehthrow.cxx in MSVC EH suite where
a rethrow occurs in a try-catch inside a catch (i.e., a nested Catch
handlers). See the test code in
https://github.com/microsoft/compiler-tests/blob/master/eh/ehthrow.cxx#L346
When an object is rethrown in a Catch handler, the copy-ctor of this
object must be executed after the destructions of live objects, but
BEFORE the dtors of live objects in parent handlers.
Today Windows 64-bit runtime (__CxxFrameHandler3 & 4) expects nested Catch
handers
are stored in pre-order (outer first, inner next) in $tryMap$ table, so
that given a State, its Catch's beginning State can be properly
retrieved. The Catch beginning state (which is also the ending State) is
the State where rethrown object's copy-ctor must take place.
LLVM currently stores nested catch handlers in post-ordering because
it's the natural way to compute the highest State in Catch.
The fix is to simply store TryCatch handler in pre-order, but update
Catch's highest State after child Catches are all processed.
Differential Revision: https://reviews.llvm.org/D79474?id=263919
This reverts commit bca347508c.
This broke clang/test/Misc/warning-flags.c, because the newly added
warning option in this commit didn't have a matching flag.
Summary:
Wasm currently does not fully handle exception specifications. Rather
than crashing, this treats `throw()` in the same way as `noexcept`, and
ignores and prints a warning for `throw(type, ..)`, for a temporary
measure.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79655
Summary:
When spilling in the entry function we should be able to borrow
StackPtrOffsetReg as a last resort. This restores behaviour
removed in D75138, and fixes failures when shaders use all
SGPRs, VGPRs and spill in the entry function.
Reviewers: scott.linder, arsenm, tpr
Reviewed By: scott.linder, arsenm
Subscribers: qcolombet, foad, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79776
Summary:
Many of these were already implemented, and I just annotated the tests and/or
the code.
C752 was a simple check to verify that CONTIGUOUS components are arrays with
C754 proved to be virtually identical to C750 that I implemented previously.
This caused me to remove the distinction between specification expressions for
type parameters and bounds expressions that I'd previously created.
the POINTER attribute.
I also changed the error messages to specify that errors in specification
expressions could arise from either bad derived type components or type
parameters.
In cases where we detect a type param that was not declared, I created a symbol
marked as erroneous. That avoids subsequent semantic process for expressions
containing the symbol. This change caused me to adjust tests resolve33.f90 and
resolve34.f90. Also, I avoided putting out error messages for erroneous type
param symbols in `OkToAddComponent()` in resolve-names.cpp and in
`EvaluateParameters()`, type.cpp.
C756 checks that procedure components have the POINTER attribute.
Reviewers: tskeith, klausler, DavidTruby
Subscribers: llvm-commits
Tags: #llvm, #flang
Differential Revision: https://reviews.llvm.org/D79798
Summary:
In the the given example, a stack slot pointer is merged
between a setjmp and longjmp. This pointer is spilled,
so it does not get correctly restored, addinga undefined
behaviour where it shouldn't.
Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb
Reviewers: eli.friedman, thanm, efriedma
Reviewed By: efriedma
Subscribers: MatzeB, qcolombet, tpr, rnk, efriedma, hiraditya, llvm-commits, chill
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77767
Summary:
Various improvement for FileCheck's numeric-expression.txt test:
- remove unused values in USE DEF FMT IMPL MATCH section
- replace 14 literal for 0xe and 0xE to have example of hex literals
- rename variable to be more self-descriptive
- move CHECK as comment of the values being matched to help readability
- add conversion tests
- simplify test for use of several numeric variables by using existing
variable
- adjust position of error message check to match the alignment of the
error message wrt. the output matched by the previous check
Reviewed By: jhenderson, jdenny
Differential Revision: https://reviews.llvm.org/D79820
* improve coverage in `span`'s "conversion from `std::array`" test, while eliminating MSVC diagnostics about `testConstructorArray<T>() && testConstructorArray<const T, T>()` being redundant when `T` is already `const`.
* Remove use of `is_assignable` that triggers UB due to an insufficiently-complete type argument in `std::function`'s assignment operator test.
* Don't test that `shared_ptr` initialization from an rvalue triggers the lvalue aliasing constructor on non-libc++; this is not the case for Standard Libraries that implement LWG-2996. (Ditto, I'd simply remove this but it's your library ;).)
Differential Revision: https://reviews.llvm.org/D80030
Add a missing guard for `_LIBUNWIND_NO_HEAP` around code dealing with the
`.cfi_remember_state` and `.cfi_restore_state` instructions.
Patch by Amanieu d'Antras!
The JitRunner library is logically very close to the execution engine,
and shares similar dependencies.
find -name "*.cpp" -exec sed -i "s/Support\/JitRunner/ExecutionEngine\/JitRunner/" "{}" \;
Differential Revision: https://reviews.llvm.org/D79899
Summary:
The `-bcdtors:mbr` option causes processing for constructors and
destructors to omit otherwise-unreferenced members of static libraries,
matching the processing done on Linux, where `--whole-archive` is not
the default. Applying this option is desirable for reducing the
footprint of an installation.
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D79749
Summary:
The `arm_cmse.h` header includes standard headers, but some tests that
include this header explicitly specify a target. The standard headers
found via the standard include paths need not be compatible with the
explicitly-specified target from the tests. In order to avoid test
failures caused by such incompatibility, this patch uses `%clang_cc1`,
which doesn't pick up the host system headers.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D79693
Summary:
If `DEFAULT_SYSROOT` is configured to some path, some tests would fail.
This patch overrides `sysroot` to be the empty string in the style of
D66834 so that the tests will pass even when the build is configured
with a `DEFAULT_SYSROOT`.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D79694
Omitting comments can make the output much smaller. Size/time impact on
my machine:
* lib/Target/AArch64/AArch64GenDAGISel.inc, 10MiB (8.89s) -> 5MiB (3.20s)
* lib/Target/X86/X86GenDAGISel.inc, 20MiB (6.48s) -> 8.5MiB (4.18s)
In total, this change decreases lib/Target/*/*GenDAGISel.inc from
71.4MiB to 30.1MiB.
As rnk suggested, we can consider an option next to LLVM_OPTIMIZED_TABLEGEN
once we have more needs like this.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D78884
In the CMake build, the HAVE_ vars are set based on system inspection,
and LLVM_ENABLE_ZLIB is set to false if neither's found. The GN build
doesn't do autodetection like this.
With this change, people can set llvm_enable_zlib=true on Windows
and as long as they provide a zlib.lib things should actually work.
(https://reviews.llvm.org/D79219 will remove 2 of the 3 config.h
values, hopefully soon. This change here just makes things a tiny
bit easier until that change is in.)
This patch introduces the `(-h|--host)` option to the `platform shell`
command. It allows the user to run shell commands from the host platform
(always available) without putting lldb in the background.
Since the default behaviour of `platform shell` is to run the command of
the selected platform, having such a choice can be quite handy when
debugging remote targets, for instances.
This patch also introduces a `shell` alias, to improve the command
discoverability and make it more convenient to use for the user.
rdar://62856024
Differential Revision: https://reviews.llvm.org/D79659
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch improves data formatting for CFDictionaryRef and CFSetRef.
It uses the same data-formatter as NSCFDictionaries and NSCFSets introduced
previously but did require some adjustments in Core::ValueObject.
Since the "Ref" types are opaque pointers to the actual CF containers, if the
value object has a synthetic value, lldb will use the opaque pointer's pointee
type to create the new ValueObjectChild needed to dereference the ValueObject.
This allows the "Ref" types to behaves the same as CF containers when used with
the `frame variable` command, the SBAPI or in Xcode's variable inspector.
This patch also adds support for incomplete types in ValueObject.
rdar://53104287
Differential Revision: https://reviews.llvm.org/D79554
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>