When doing CTU analysis setup you pre-compile .cpp to .ast and then
you run clang-extdef-mapping on the .cpp file as well. This is a
pretty slow process since we have to recompile the file each time.
With this patch you can now run clang-extdef-mapping directly on
the .ast file. That saves a lot of time.
I tried this on llvm/lib/AsmParser/Parser.cpp and running
extdef-mapping on the .cpp file took 5.4s on my machine.
While running it on the .ast file it took 2s.
This can save a lot of time for the setup phase of CTU analysis.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D128704
This patch extends the affine parser to allow affine constraints with `<=`.
This is useful in writing unittests for Presburger library and test in general.
The internal storage and printing of IntegerSet is still in the original format.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D129046
This test has some race condition which is making it hang on LLDB
Arm/AArch64 Linux buildbot. I am marking it as skipped until we
investigate whats going wrong.
Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D129034
Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D129034
This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is
an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI,
both of which are enabled by default.
Parts have been coauthored by by Mark Murray, Alexandros Lamprineas and
David Green.
Differential Revision: https://reviews.llvm.org/D128415
These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.
Differential Revision: https://reviews.llvm.org/D128436
In D95959, the improve analysis for "C >> X" broken the fold
((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.
It simplifies C, but fails to satisfy the fold condition.
This patch try to restore C before the fold.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D128790
This patch adds new the following SME intrinsics:
@llvm.aarch64.sme.addva
@llvm.aarch64.sme.addha
Differential Revision: https://reviews.llvm.org/D127861
Add FORCE_LLD_DIAGNOSTICS_CRASH inspired by the existing
FORCE_CLANG_DIAGNOSTICS_CRASH.
This is particularly useful for people customizing LLD as they may
want to modify the crash reporting behavior.
Differential Revision: https://reviews.llvm.org/D128195
For scalable VFs, the minimum assumed vscale needs to be included in the
cost-computation, otherwise a smaller VF may be used for RT check cost
computation than was used for earlier cost computations.
Fixes a RISCV test failing with UBSan due to both scalar and vector
loops having the same cost.
ParallelInsertSlice behaves similarly to tensor::InsertSliceOp in its
rank-reducing properties.
This revision extends rank-reducing rewrite behavior and reuses most of the
existing implementation.
Differential Revision: https://reviews.llvm.org/D129091
When addressing a substring of a character array, codegen emits two
GEPs: one for to compute the address of the base element, and a second
one to address the first characters from that element.
The first GEP still returns the LLVM array type (if the FIR array type could be
translated to an array type. Therefore) so zero
indexes must be added to the second GEP in this case to cover for the
Fortran array dimensions before inserting the susbtring offset index.
Surprisingly, the previous code worked ok when MLIR emits none opaque
pointers. But with opaque pointers, the two GEPs are folded in an
invalid GEP where the substring offset becomes an offset for the outer
array dimension.
Note that I tried to fix the issue by modifying the first GEP to return the
element type, but this still gave bad results (here something might be
wrong with opaque pointer in MLIR or LLVM).
Differential Revision: https://reviews.llvm.org/D129079
Properly checks enclosing DeclContext, and add the related test case.
It would be great to be able to use Sema to check conflicting scopes, but that's
not something clang-tidy seems to be able to do :-/
Fix#56221
Differential Revision: https://reviews.llvm.org/D128715
...that had temporarily regressed with (since reverted)
<886715af96>
"[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible
arrays", and had then been seen to cause issues in the wild:
For one, the HarfBuzz project has various "fake" flexible array members of the
form
> Type arrayZ[HB_VAR_ARRAY];
in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-open-type.hh>, where
HB_VAR_ARRAY is a macro defined as
> #ifndef HB_VAR_ARRAY
> #define HB_VAR_ARRAY 1
> #endif
in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-machinery.hh>.
For another, the Firebird project in
<https://github.com/FirebirdSQL/firebird/blob/master/src/lock/lock_proto.h> uses
a trailing member
> srq lhb_hash[1]; // Hash table
as a "fake" flexible array, but declared in a
> struct lhb : public Firebird::MemoryHeader
that is not a standard-layout class (because the Firebird::MemoryHeader base
class also declares non-static data members).
(The second case is specific to C++. Extend the test setup so that all the
other tests are now run for both C and C++, just in case the behavior could ever
start to diverge for those two languages.)
A third case where -fsanitize=array-bounds differs from -Warray-bounds (and
which is also specific to C++, but which doesn't appear to have been encountered
in the wild) is when the "fake" flexible array member's size results from
template argument substitution.
Differential Revision: https://reviews.llvm.org/D128783
This hint instructs the linker to perform the AdrpLdr or AdrpAdd
transformation depending on whether the GOT load has been relaxed to
load a local symbol's address.
Differential Revision: https://reviews.llvm.org/D129059
Infers block/grid dimensions/indices or ranges of such dimensions/indices.
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D129036
For a SHT_NOBITS section like .bss, its sh_offset is typically not
aligned by sh_addralign. If it is converted to SHT_PROGBITS by
`--set-section-flags .bss=alloc,contents`, we should conceptually align
it when computing the output size for -O binary. Otherwise the output
size may be smaller than GNU objcopy produced output.
* binary-no-paddr.test has a case with non-sensical p_paddr=1 which has
a changed behavior. Update it.
Close https://github.com/llvm/llvm-project/issues/55246
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D128961
Note that this is just enough for simple function call examples to
generate working code.
A good portion of this patch is the extra functions that needed to be
implemented to support the test case. e.g. storeRegToStackSlot,
loadRegFromStackSlot, eliminateFrameIndex.
Differential Revision: https://reviews.llvm.org/D128429
RVV C intrinsics use pointers to scalar for base address and their corresponding
IR intrinsics but use pointers to vector. It makes some vector load intrinsics
need specific ManualCodegen and MaskedManualCodegen to just add bitcast for
transforming to IR.
For simplifying riscv_vector.td, the patch make RISCVEmitter detect pointer
operands and bitcast them.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D129043
I think that these conditions are unnecessary because in VisitClassTemplateDecl we import the definition via the templated CXXRecordDecl and in VisitVarTemplateDecl via the templated VarDecl. These are named ToTemplted and DTemplated respectively.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D128608
The test diffs are cosmetic -- but improvements -- because we
let instcombine handle replacement. Instead of dropping the
old value name, it propagates to the new instruction.