This adds a DAG combine for converting sext_inreg of VGetLaneu into
VGetLanes, providing the types match correctly.
Differential Revision: https://reviews.llvm.org/D95073
We can only legally extract from the lowest 128-bit subvector, so extract the correct subvector to allow us to handle 256/512-bit vector element extracts.
Under SoftFP calling conventions, we can be left with
extract(bitcast(BUILD_VECTOR(VMOVDRR(a, b), ..))) patterns that can
simplify to a or b, depending on the extract lane.
Differential Revision: https://reviews.llvm.org/D94990
Correct integer constants like `1UL << 63` to `UINT64_C(1) << 63` in
order to make them work on 32-bit machines. Tested on both an i386
and x86_64 machines.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D95724
If we determine that the invariant path through the loop has no effects,
we can directly branch to the exit block, instead to unswitching first.
Besides avoiding some extra work (unswitching first, then deleting the
loop again) this allows to be more aggressive than regular unswitching
with respect to cost-modeling. This approach should always be be
desirable.
This is similar in spirit to D93734, just that it uses the previously
added checks for loop-unswitching.
I tried to add the required no-op checks from scratch, as we only check
a subset of the loop. There is potential to unify the checks with
LoopDeletion, at the cost of adding a predicate whether a block should
be considered.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D95468
The reduction of a sanitizer build failure when enabling the dominance check (D95335) showed that loop peeling also needs to take care of scope duplication, just like loop unrolling (D92887).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D95544
For unknown reasons the alabaster theme on the docs server is always generating
a GitHub link in the side bar. Beside the privacy problems of having an iframe
to some third-party service, we never configured any GitHub integration so
this button just links to the GitHub main site.
The button generation should be disabled by default, but as that's apparently
not true in the alabaster theme on the server, this patch tries working around
the issue by just explicitly turning off the GitHub integration.
This reverts commit d9b953d84b.
This commit resulted in build bot failures and the author is away from a
computer, so I am reverting on their behalf until they have a chance to
look into this.
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95671
This primarily occurs with isel patterns using vnot. This reduces
the number of variants in the isel tables.
We generally canonicalize build_vectors of constants to the RHS. I think
we might fail if there is a bitcast on the build_vector, but that
should be easy to fix if we can find a case. Usually the
bitcast is introduced by type legalization or lowering. It's
likely canonicalization would have already occured.
To set non-default rounding mode user usually calls function 'fesetround'
from standard C library. This way has some disadvantages.
* It creates unnecessary dependency on libc. On the other hand, setting
rounding mode requires few instructions and could be made by compiler.
Sometimes standard C library even is not available, like in the case of
GPU or AI cores that execute small kernels.
* Compiler could generate more effective code if it knows that a particular
call just sets rounding mode.
This change introduces new IR intrinsic, namely 'llvm.set.rounding', which
sets current rounding mode, similar to 'fesetround'. It however differs
from the latter, because it is a lower level facility:
* 'llvm.set.rounding' does not return any value, whereas 'fesetround'
returns non-zero value in the case of failure. In glibc 'fesetround'
reports failure if its argument is invalid or unsupported or if floating
point operations are unavailable on the hardware. Compiler usually knows
what core it generates code for and it can validate arguments in many
cases.
* Rounding mode is specified in 'fesetround' using constants like
'FE_TONEAREST', which are target dependent. It is inconvenient to work
with such constants at IR level.
C standard provides a target-independent way to specify rounding mode, it
is used in FLT_ROUNDS, however it does not define standard way to set
rounding mode using this encoding.
This change implements only IR intrinsic. Lowering it to machine code is
target-specific and will be implemented latter. Mapping of 'fesetround'
to 'llvm.set.rounding' is also not implemented here.
Differential Revision: https://reviews.llvm.org/D74729
A couple patterns used bitconvert on the immAllOnesV, but
the isel matching uses ISD::isBuildVectorAllOnes which
is able to look through bitcasts. So isel patterns don't need
to do it explicitly.
This reverts commit 6e58539659.
This failed in http://lab.llvm.org:8011/#/builders/123/builds/2676. I guess
were're still missing some symbols, but unfortunately the specific error is
masked by a bug in python/lit that hides stderr. This test will have to remain
disabled on Windows until I can get help to debug it further.
We need to add a mask to the shift amount for these operations
to use the FSR/FSL instructions. We were previously doing this
in isel patterns, but custom lowering will make the mask
visible to optimizations earlier.
This testcase was failing on windows due to missing definitions. This commit
adds definitions of the missing symbols (as absolute symbols) to eliminate the
errors.
Fixes the `FastUnwindTest` unit test for RISC-V.
These changes reflect the different stack organization commonly used for
that architecture.
Differential Revision: https://reviews.llvm.org/D90574
This seems to be a safe way to ensure that the Compiler-RT test compiler
flags are properly set in all cross-compilation scenarios. Without this
when `BUILTINS_TEST_TARGET_CFLAGS` is set in
`compiler-rt/test/builtins/CMakeLists.txt` the other flags are cleared.
Differential Revision: https://reviews.llvm.org/D92124
If we're going to end up expanding anyway, we should do it early
so we don't create extra operations to handle the bytes added by
promotion.
This is helfpul on RISCV where we might have to promote i16 all
the way to i64.
Differential Revision: https://reviews.llvm.org/D95756
* Fixing missing `type` keyword in alias print
* Add test for large tuple type alias & rerun output to verify printed
form can be parsed (which caught the above).
This patch removes some options that have been duplicated in
LTOCodeGenerator and instead use lto::Config directly to manage the
options.
This is a cleanup after 6a59f05606.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D95738
Introduce a NativeRegisterContextFreeBSD for 32-bit ARM platform.
This includes support for GPR + VFP registers as exposed by FreeBSD's
ptrace(2) API. Hardware breakpoints or watchpoints are not supported
due to missing kernel support. The code is roughly based on the arm64
context.
It also includes an override for GetSoftwareBreakpointTrapOpcode() based
on the matching code in the PlatformFreeBSD plugin.
Differential Revision: https://reviews.llvm.org/D95696
Introduce arm64 support in the FreeBSDRemote plugin. The code
is roughly based on Linux and reuses the same POSIX RegisterInfos
(but the buffers need to be a few bytes larger due to stricter struct
member alignment in FreeBSD structures -- luckily, they do not affect
the actual member offsets). It supports reading and writing
general-purpose and FPU registers. SVE and hardware watchpoint support
is missing due to the limitations of FreeBSD ptrace(2) API.
Differential Revision: https://reviews.llvm.org/D95297
With a context instruction, this would produce a context
error. However, it would continue on and do an out of bounds access of
the empty allocation order array.
Current dsymutil implementation of hasLiveMemoryLocation()/hasLiveAddressRange()
and applyValidRelocs() assume that calls should be done in certain order
(from first Dies to last). Multi-thread implementation might call these methods
in other order(it might process compilation units in order other than they are physically
located), so we remove restriction that searching for relocations should be done
in ascending order. This change does not introduce noticable performance degradation.
The testing results for clang binary:
golden-dsymutil/dsymutil 23787992
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec
build-Release/bin/dsymutil 23855616
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec
Differential Revision: https://reviews.llvm.org/D93106
After committing D92214 it was noticed libc++ no longer builds with
C++17. For now reenable building with C++17. This is intended to be a
temporary measure in the future a C++20 capable compiler will be
required.
The result values of vp2intersect are vectors of bits, i.e.,
vector<8xi1> or vector<16xi8> (instead of i8 or i16).
Differential Revision: https://reviews.llvm.org/D95678
Analyze the shape of the result of TRANSFER(ptr,array) correctly
when "ptr" is an array of deferred shape. Fixing this bug led to
some refactoring and concentration of common code in TypeAndShape
member functions with code in general shape and character length
analysis, and this led to some regression test failures that have
all been cleaned up.
Differential Revision: https://reviews.llvm.org/D95744