Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D129034
This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is
an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI,
both of which are enabled by default.
Parts have been coauthored by by Mark Murray, Alexandros Lamprineas and
David Green.
Differential Revision: https://reviews.llvm.org/D128415
These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.
Differential Revision: https://reviews.llvm.org/D128436
In D95959, the improve analysis for "C >> X" broken the fold
((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.
It simplifies C, but fails to satisfy the fold condition.
This patch try to restore C before the fold.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D128790
This patch adds new the following SME intrinsics:
@llvm.aarch64.sme.addva
@llvm.aarch64.sme.addha
Differential Revision: https://reviews.llvm.org/D127861
Add FORCE_LLD_DIAGNOSTICS_CRASH inspired by the existing
FORCE_CLANG_DIAGNOSTICS_CRASH.
This is particularly useful for people customizing LLD as they may
want to modify the crash reporting behavior.
Differential Revision: https://reviews.llvm.org/D128195
For scalable VFs, the minimum assumed vscale needs to be included in the
cost-computation, otherwise a smaller VF may be used for RT check cost
computation than was used for earlier cost computations.
Fixes a RISCV test failing with UBSan due to both scalar and vector
loops having the same cost.
ParallelInsertSlice behaves similarly to tensor::InsertSliceOp in its
rank-reducing properties.
This revision extends rank-reducing rewrite behavior and reuses most of the
existing implementation.
Differential Revision: https://reviews.llvm.org/D129091
When addressing a substring of a character array, codegen emits two
GEPs: one for to compute the address of the base element, and a second
one to address the first characters from that element.
The first GEP still returns the LLVM array type (if the FIR array type could be
translated to an array type. Therefore) so zero
indexes must be added to the second GEP in this case to cover for the
Fortran array dimensions before inserting the susbtring offset index.
Surprisingly, the previous code worked ok when MLIR emits none opaque
pointers. But with opaque pointers, the two GEPs are folded in an
invalid GEP where the substring offset becomes an offset for the outer
array dimension.
Note that I tried to fix the issue by modifying the first GEP to return the
element type, but this still gave bad results (here something might be
wrong with opaque pointer in MLIR or LLVM).
Differential Revision: https://reviews.llvm.org/D129079
Properly checks enclosing DeclContext, and add the related test case.
It would be great to be able to use Sema to check conflicting scopes, but that's
not something clang-tidy seems to be able to do :-/
Fix#56221
Differential Revision: https://reviews.llvm.org/D128715
...that had temporarily regressed with (since reverted)
<886715af96>
"[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible
arrays", and had then been seen to cause issues in the wild:
For one, the HarfBuzz project has various "fake" flexible array members of the
form
> Type arrayZ[HB_VAR_ARRAY];
in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-open-type.hh>, where
HB_VAR_ARRAY is a macro defined as
> #ifndef HB_VAR_ARRAY
> #define HB_VAR_ARRAY 1
> #endif
in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-machinery.hh>.
For another, the Firebird project in
<https://github.com/FirebirdSQL/firebird/blob/master/src/lock/lock_proto.h> uses
a trailing member
> srq lhb_hash[1]; // Hash table
as a "fake" flexible array, but declared in a
> struct lhb : public Firebird::MemoryHeader
that is not a standard-layout class (because the Firebird::MemoryHeader base
class also declares non-static data members).
(The second case is specific to C++. Extend the test setup so that all the
other tests are now run for both C and C++, just in case the behavior could ever
start to diverge for those two languages.)
A third case where -fsanitize=array-bounds differs from -Warray-bounds (and
which is also specific to C++, but which doesn't appear to have been encountered
in the wild) is when the "fake" flexible array member's size results from
template argument substitution.
Differential Revision: https://reviews.llvm.org/D128783
This hint instructs the linker to perform the AdrpLdr or AdrpAdd
transformation depending on whether the GOT load has been relaxed to
load a local symbol's address.
Differential Revision: https://reviews.llvm.org/D129059
Infers block/grid dimensions/indices or ranges of such dimensions/indices.
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D129036
For a SHT_NOBITS section like .bss, its sh_offset is typically not
aligned by sh_addralign. If it is converted to SHT_PROGBITS by
`--set-section-flags .bss=alloc,contents`, we should conceptually align
it when computing the output size for -O binary. Otherwise the output
size may be smaller than GNU objcopy produced output.
* binary-no-paddr.test has a case with non-sensical p_paddr=1 which has
a changed behavior. Update it.
Close https://github.com/llvm/llvm-project/issues/55246
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D128961
Note that this is just enough for simple function call examples to
generate working code.
A good portion of this patch is the extra functions that needed to be
implemented to support the test case. e.g. storeRegToStackSlot,
loadRegFromStackSlot, eliminateFrameIndex.
Differential Revision: https://reviews.llvm.org/D128429
RVV C intrinsics use pointers to scalar for base address and their corresponding
IR intrinsics but use pointers to vector. It makes some vector load intrinsics
need specific ManualCodegen and MaskedManualCodegen to just add bitcast for
transforming to IR.
For simplifying riscv_vector.td, the patch make RISCVEmitter detect pointer
operands and bitcast them.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D129043
I think that these conditions are unnecessary because in VisitClassTemplateDecl we import the definition via the templated CXXRecordDecl and in VisitVarTemplateDecl via the templated VarDecl. These are named ToTemplted and DTemplated respectively.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D128608
The test diffs are cosmetic -- but improvements -- because we
let instcombine handle replacement. Instead of dropping the
old value name, it propagates to the new instruction.
Summary:
Currently we just check the extension to set the image kind. This
incorrectly labels the `.o` files created during LTO as object files.
This patch simply adds a check for the bitcode magic bytes instead.
RuntimeDyld does not support RISC-V, so it makes sense to enable
JITLink by default. This also makes relocations work without support
for a large code model.
Differential Revision: https://reviews.llvm.org/D129092
This fixes an UBSan failure after 644a965c1e. When using
user-provided VFs/ICs (via the force-vector-width /
force-vector-interleave options) the scalar cost is zero, which would
cause divide-by-zero.
When forcing vectorization using the options, the cost of the runtime
checks should not block vectorization.
- Update `clang-format --help` output after b1f0efc06a.
- Update `clang-format-diff.py` help text, which apparently hasn't
been updated in a while. Since git and svn examples are now part
of the help text, remove them in the text following the help text.
Differential Revision: https://reviews.llvm.org/D129050
Break after a constructor initializer colon only if it's not followed by a
comment on the same line.
Fixes#41128.
Fixes#43246.
Differential Revision: https://reviews.llvm.org/D129057
This patch just make the code more similar
in each conversion.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D129071
The actions table is very compact but the binary search to find the
correct action is relatively expensive.
A hashtable is faster but pretty large (64 bits per value, plus empty
slots, and lookup is constant time but not trivial due to collisions).
The structure in this patch uses 1.25 bits per entry (whether present or absent)
plus the size of the values, and lookup is trivial.
The Shift table is 119KB = 27KB values + 92KB keys.
The Goto table is 86KB = 30KB values + 57KB keys.
(Goto has a smaller keyspace as #nonterminals < #terminals, and more entries).
This patch improves glrParse speed by 28%: 4.69 => 5.99 MB/s
Overall the table grows by 60%: 142 => 228KB.
By comparison, DenseMap<unsigned, StateID> is "only" 16% faster (5.43 MB/s),
and results in a 285% larger table (547 KB) vs the baseline.
Differential Revision: https://reviews.llvm.org/D128485