When there is a mix of affine load/store and non-affine operations (e.g. std.load, std.store),
affine-loop-fusion ignores the present of non-affine ops, thus changing the program semantics.
E.g. we have a program of three affine loops operating on the same memref in which one of them uses std.load and std.store, as follows.
```
affine.for
affine.store %1
affine.for
std.load %1
std.store %1
affine.for
affine.load %1
affine.store %1
```
affine-loop-fusion will produce the following result which changed the program semantics:
```
affine.for
std.load %1
std.store %1
affine.for
affine.store %1
affine.load %1
affine.store %1
```
This patch is to fix the above problem by checking non-affine users of the memref that are between the source and destination nodes of interest.
Differential Revision: https://reviews.llvm.org/D82158
Both `AArch64TargetParser.h` and `ARMTargetParser.h` refer to
`SmallVectorImpl` without directly including the header that defines
it, which works fine until nothing else happens to include it anyway.
When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4.
The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it.
That is, I cannot use `fp` register name from the C code because Clang does not accept it (exactly like GCC). But the accepted name `r4` is not recognised by `llc` (it can be used in listings passed to `llvm-mc` and even `fp` is replace to `r4` by `llvm-mc`). So I can specify any of `fp` or `r4` for the string literal of `asm(...)` but nothing in the clobber list.
This patch replaces `MSP430::FP` with `MSP430::R4` in the backend code (even [MSP430 EABI](http://www.ti.com/lit/an/slaa534/slaa534.pdf) doesn't mention FP as a register name). The R0 - R3 registers, on the other hand, are left as is in the backend code (after all, they have some special meaning on the ISA level). It is just ensured clang is renaming them as expected by the downstream tools. There is probably not much sense in **marking them clobbered** but rename them //just in case// for use at potentially different contexts.
Differential Revision: https://reviews.llvm.org/D82184
Some parts of existing codebase assume the default `int` type to be (at least) 32 bit wide. On 16 bit targets such as MSP430 this may cause Undefined Behavior or results being defined but incorrect.
Differential Revision: https://reviews.llvm.org/D81408
When calling on-the-fly passes from the legacy pass manager, the modification
status is not reported, which is a problem in case we depend on an acutal
transformation pass, and not only analyse.
Update the Legacy PM API to optionally report the changed status, assert if a
change is detected but this change is lost.
Related to https://reviews.llvm.org/D80916
Differential Revision: https://reviews.llvm.org/D81236
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
Summary:
Assignment and comma operators for fixed-point types were being constevaled as other
binary operators, but they need special treatment.
Reviewers: rjmccall, leonardchan, bjope
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73189
Summary:
Diagnostics for overflow were not being produced for fixed-point
evaluation. This patch refactors a bit of the evaluator and adds
a proper diagnostic for these cases.
Reviewers: rjmccall, leonardchan, bjope
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73188
This patch helps teach yaml2obj emit correct DWARF64 unit header of the .debug_info section.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82621
Summary:
`svwhilerw_bf16` and `svwhilewr_bf16` intrinsics use the scalar
`bfloat16_t`
type which is predicated on `__ARM_FEATURE_BF16_SCALAR_ARITHMETIC`. This
patch changes the feature guard from `__ARM_FEATURE_SVE_BF16` to the
scalar bfloat feature macro.
The verify tests for `+bf16` are also removed in this patch. The purpose
of these checks was to match the SVE2 ACLE tests that look for an
implicit declaration warning if the feature isn't set. They worked when
the intrinsics were guarded on `__ARM_FEATURE_SVE_BF16` as the
`bfloat16_t`
was guarded on a different macro, but with both the type and intrinsic
guarded on the same macro an earlier error is triggered in the ACLE
regarding the type and we don't get a warning as we do for SVE2.
Reviewers: sdesmalen, fpetrogalli, kmclaughlin, rengolin, efriedma
Reviewed By: sdesmalen, fpetrogalli
Differential Revision: https://reviews.llvm.org/D82578
Summary:
This fixes a bug in the logic for choosing the unwind plan. Based on the
comment in UnwindAssembly-x86, the intention was that a plan which
describes the function epilogue correctly does not need to be augmented
(and it should be used directly). However, the way this was implemented
(by returning false) meant that the higher level code
(FuncUnwinders::GetEHFrameAugmentedUnwindPlan) interpreted this as a
failure to produce _any_ plan and proceeded with other fallback options.
The fallback usually chosed for "asynchronous" plans was the
"instruction emulation" plan, which tended to fall over on certain
functions with multiple epilogues (that's a separate bug).
This patch simply changes the function to return true, which signals the
caller that the unmodified plan is ready to be used.
The attached test case demonstrates the case where we would previously
fall back to the instruction emulation plan, and unwind incorrectly --
the test asserts that the "augmented" eh_frame plan is used, and that
the unwind is correct.
Reviewers: jasonmolenda, jankratochvil
Subscribers: davide, echristo, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D82378
This function was implementing c-like promotion rules by switching on
the both types. C promotion rules are complicated, but they are not
*that* complicated -- they basically boil down to:
- wider types trump narrower ones
- unsigned trump signed
- floating point trumps integral
With a couple of helper functions, we can rewrite the function in terms
of these rules and greatly reduce the size and complexity of this
function.
Summary:
Permutation and selection bfloat16 intrinsic patterns should be guarded
on the feature flag `+bf16`. Missed in D82182 and D80850.
Reviewers: sdesmalen, fpetrogalli, kmclaughlin, efriedma
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D82492
Similar to the recent patch for fpext, this adds vcvtb and vcvtt with
insert into vector instruction selection patterns for fptruncs. This
helps clear up a lot of register shuffling that we would otherwise do.
Differential Revision: https://reviews.llvm.org/D81637
Summary:
Currently, clangd always completes `typename` as `typename qualifier::name`, I think the current behavior is not useful when the code completion is triggered in `template <>`. So I tweak it to `typename identifier`.
Patch by @lh123 !
Reviewers: sammccall, kadircet
Reviewed By: kadircet
Subscribers: ilya-biryukov, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82373
We current extract and convert from a top lane of a f16 vector using a
VMOVX;VCVTB pair. We can simplify that to use a single VCVTT. The
pattern is mostly copied from a vector extract pattern, but produces a
VCVTTHS f32 directly.
This had to move some code around so that ARMInstrVFP had access to the
required pattern frags that were previously part of ARMInstrNEON.
Differential Revision: https://reviews.llvm.org/D81556
These features implicitly enabled XSAVE in the frontend, but not
the backend. Disabling XSAVE in the frontend disabled XSAVEOPT, but
not the other 2. Nothing happened in the backend.
Fixed an issue in DataLayout::getIntPtrType where we were assuming
the input type was always a fixed vector type, which isn't true.
Added a test that exposed the problem to:
Transforms/InstCombine/vector_gep1.ll
Differential Revision: https://reviews.llvm.org/D82294
This lowers intrinsic @llvm.get.active.lane.mask to a setcc node, i.e. an icmp
ule, and creates vectors for its 2 arguments on which the comparison is
performed.
Differential Revision: https://reviews.llvm.org/D82292
The runtimes build includes libcxx/include/CMakeLists.txt directly instead
of going through the top-level CMake file. This not-very-hygienic inclusion
caused some variables like LIBCXX_BINARY_DIR not to be defined properly,
and the config_site generation logic to fail after landing 53623d4aa7.
This patch works around this issue by defining the missing variables.
However, the proper fix for this would be for the runtimes build to
always go through libc++'s top-level CMakeLists.txt. Doing otherwise
is unsupported.
Before this patch, the __config_site header was only generated when at
least one __config_site macro needed to be defined. This lead to two
different code paths in how libc++ is configured, depending on whether
a __config_site header was generated or not. After this patch, the
__config_site is always generated, but it can be empty in case there
are no macros to define in it.
More context on why this change is important
--------------------------------------------
In addition to being confusing, this double-code-path situation lead to
broken code being checked in undetected in 2405bd6898, which introduced
the LIBCXX_HAS_MERGED_TYPEINFO_NAMES_DEFAULT CMake setting. Specifically,
the _LIBCPP_HAS_MERGED_TYPEINFO_NAMES_DEFAULT <__config_site> macro was
supposed NOT to be defined unless LIBCXX_HAS_MERGED_TYPEINFO_NAMES_DEFAULT
was specified explicitly on the CMake command line. Instead, what happened
is that it was defined to 0 if it wasn't specified explicitly and a
<__config_site> header was generated. And defining that macro to 0 had
the important effect of using the non-unique RTTI comparison implementation,
which changes the ABI.
This change in behavior wasn't noticed because the <__config_site> header
is not generated by default. However, the Apple configuration does cause
a <__config_site> header to be generated, which lead to the wrong RTTI
implementation being used, and to https://llvm.org/PR45549. We came close
to an ABI break in the dylib, but were saved due to a downstream-only
change that overrode the decision of the <__config_site> for the purpose
of RTTI comparisons in libc++abi. This is an incredible luck that we should
not rely on ever again.
While the problem itself was fixed with 2464d8135e by setting
LIBCXX_HAS_MERGED_TYPEINFO_NAMES_DEFAULT explicitly in the Apple
CMake cache and then in d0fcdcd28f by making the setting less
brittle, the point still is that we should have had a single code
path from the beginning. Unlike most normal libraries, the macros
that configure libc++ are really complex, there's a lot of them and
they control important properties of the C++ runtime. There must be
a single code path for that, and it must be simple and robust.
Differential Revision: https://reviews.llvm.org/D80927