This adds sign/zero extending scalar loads/stores to the MVE
instructions added in D77813, allowing us to create up more post-inc
instructions. These are comparatively simple, compared to LDR/STR (which
may be better turned into an LDRD/LDM), but still require some additions
over MVE instructions. Because there are i12 and i8 variants of the
offset loads/stores dealing with different signs, we may need to convert
an i12 address to a i8 negative instruction. t2LDRBi12 can also be
shrunk to a tLDRi under the right conditions, so we need to be careful
with codesize too.
Differential Revision: https://reviews.llvm.org/D78625
abs() should be rare enough that using value tracking is not going
to be a compile-time cost burden, so use it to reduce a variety of
potential patterns. We do this in DAGCombiner too.
Differential Revision: https://reviews.llvm.org/D85043
When checking for the style of a decl that isn't in the main file, the check will now search for the configuration that the included files uses to gather the style for its decls.
This can be useful to silence warnings in header files that follow a different naming convention without using header-filter to silence all warnings(even from other checks) in the header file.
Reviewed By: aaron.ballman, gribozavr2
Differential Revision: https://reviews.llvm.org/D84814
Now we try to load and broadcast together for operand 1. Followed
by load and broadcast for operand 1. Previously we tried load
operand 1, load operand 1, broadcast operand 0, broadcast operand 1.
Now we have a single helper that tries load and broadcast for
one operand that we can just call twice.
querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`,
because incomplete phis may be simplified to same value before get SCEV expression.
Reviewed By: lebedev.ri, mkazantsev
Differential Revision: https://reviews.llvm.org/D77560
SPE doesn't have a fsel instruction, so don't try to lower to it.
This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error.
Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D77773
The patterns were incorrect copies from the FPU code, and are
unnecessary, since there's no extended load for SPE. Just let LLVM
itself do the work by marking it expand.
Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D78670
Change to expand all arguments and return values to i64 to follow ABI.
Update regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D84581
These methods abstract away Error handling when trying to read options that can't be parsed by logging the error automatically and returning None.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D84812
Refer to LangRef http://llvm.org/docs/LangRef.html#llvm-masked-load-intrinsics
'llvm.masked.load/store.*’ intrinsics are overloaded intrinsic, which allow the
load/store data to be a vector of any integer, floating-point or pointer data type.
Therefore, allow pointer data type when checking 'isLegalMaskedLoadStore()'.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D85045
Rather than hardcoding immediate values for 12 different combinations
in a nested pair of switches, we can perform the matched logic
operation on 3 magic constants to calculate the immediate.
Special thanks to this tweet https://twitter.com/rygorous/status/1187034321992871936
for making me realize I could do this.
A recent change broke `ninja check-asan` on Darwin by causing an error
during linking of ASan unit tests [1].
Move the addition of `-ObjC` compiler flag outside of the new
`if(COMPILER_RT_STANDALONE_BUILD)` block. It doesn't add any global
flags (e.g, `${CMAKE_CXX_FLAGS}`) and the decision to add is based
solely on source paths (`${source_rpath}`).
[1] 8b2fcc42b8, https://reviews.llvm.org/D84466
Differential Revision: https://reviews.llvm.org/D85057
Add the optimizations we have in the SelectionDAG version.
Known non-negative copies all known bits. Any known one other than
the sign bit makes result non-negative.
Differential Revision: https://reviews.llvm.org/D85000
This patch fixed the issue that target memory might be deallocated when
they're still being used or before they're used.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D84996
This is an operation that can returns a new ValueShape with a different shape. Useful for composing shape function calls and reusing existing shape transfer functions.
Just adding the op in this change.
Differential Revision: https://reviews.llvm.org/D84217
The current output is a bit clunky and requires including files+macros everywhere, or manually wrapping the file inclusion in a registration function. This revision refactors the pass backend to automatically generate `registerFooPass`/`registerFooPasses` functions that wrap the pass registration. `gen-pass-decls` now takes a `-name` input that specifies a tag name for the group of passes that are being generated. For each pass, the generator now produces a `registerFooPass` where `Foo` is the name of the definition specified in tablegen. It also generates a `registerGroupPasses`, where `Group` is the tag provided via the `-name` input parameter, that registers all of the passes present.
Differential Revision: https://reviews.llvm.org/D84983
As is common with curses apps, this allows to redraw everything
in case something corrupts the screen. Apparently key modifiers
are difficult with curses (curses FAQ it "doesn't do that"),
thankfully Ctrl+key are simply control characters, so it's
(ascii & 037) => 12.
Differential Revision: https://reviews.llvm.org/D84972
D68049 created options for basic block sections: -fbasic-block-sections=,
-funique-basic-block-section-names. Rename options in llc and lld (--lto-)
to be consistent. Specifically,
+ Rename basicblock-sections to basic-block-sections
+ Rename unique-bb-section-names to unique-basic-block-section-names
Differential Revision: https://reviews.llvm.org/D84462
`TARGET_OS_IOS` and `TARGET_OS_WATCH` are not mutually exclusive.
`SANITIZER_IOS` is defined for all embedded platforms. So the branch
for watchOS is never taken. We could fix this by switching the order
of the branches (but the reason for doing so is non-obvious). Instead,
lets use the Darwin-specific `TARGET_OS_*` macros which are mutually
exclusive.
Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling.
The reason for this change is that Loop Peeling is no longer only being used by
loop unrolling; Patch D82927 introduces loop peeling with fusion, such that
loops can be modified to have to same trip count, making them legal to be
peeled.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D83056
This patch lower `!OMP PARALLEL` construct from PFT to OpenMPDialect operations.
This is first patch in this direction(lowering parallel construct).
OpenMP parallel construct can have multiple clauses and parameters. This patch
only implements lowering of an empty(contains no code in body) parallel construct
without any clauses or parameters.
Patch is carved out of following approved PR:
https://github.com/flang-compiler/f18-llvm-project/pull/322
Reviewed By: kiranchandramohan, DavidTruby
Differential Revision: https://reviews.llvm.org/D84965
Block.h is a pretty common name, which can lead to nasty collisions with
user provided headers. Since we're only getting a few simple declarations
from the header, it's better to declare them manually than to include the
header.
rdar://66384326
Differential Revision: https://reviews.llvm.org/D85035
computeHostNumPhysicalCores() is designed to respect CPU affinity.
D84764 used sysconf(_SC_NPROCESSORS_ONLN) which does not respect
affinity.
SupportTests Threading.PhysicalConcurrency may fail if taskset -c is specified.
1. Annotate the sources with constraint numbers.
2. Add tests for
*C7107 (R765) digit shall have one of the values 0 or 1.
*C7108 (R766) digit shall have one of the values 0 through 7.
*C7109 (R764) A boz-literal-constant shall appear only as a data-stmt-constant in a DATA statement, or where explicitly allowed in 16.9 as an actual argument of an intrinsic procedure.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D84504
This fixes the ExecutionEngine/MCJIT/stubs-sm-pic.ll test in no-asserts
builds which is set to XFAIL on some platforms like 32-bit x86. More
importantly, we probably don't want to silently error in these cases.
Differential revision: https://reviews.llvm.org/D84390