Summary:
In the patch D78849, it uses llvm::any_of to instead of for loop to
simplify the function addRequired().
It's obvious that above code is not a NFC conversion. Because any_of
will return if any addRequired(Reg) is true immediately, but we want
every element to call addRequired(Reg).
This patch uses for_range loop to fix above any_of bug.
Reviewed By: MaskRay, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D79872
Summary:
If the value in the collapse close is negative f18 abort without the correct error message. This PR change the size_t in name resolution to a int64_t and check appropriately for negative or zero before the privatization of induction variable.
The correct error is then catch by the OpenMP structure check.
This diff is migrated from the GitHub pull request https://github.com/flang-compiler/f18/pull/1098
Reviewers: ichoyjx, jdoerfert, sscalpone, DavidTruby
Reviewed By: ichoyjx, sscalpone, DavidTruby
Subscribers: sscalpone, klausler, yaxunl, guansong, llvm-commits
Tags: #llvm, #flang
Differential Revision: https://reviews.llvm.org/D77821
Skip tests or parts thereof that aren't expected to work when run from a
reproducer. Also improve the doc comments in configuration.py to prevent
mistakes in the future.
D79276 caused the following builder to fail:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23489
Specifically, FileCheck dumped stack in the following tests:
LLVM :: MC/Mips/micromips-jump-pc-region.s
LLVM :: MC/Mips/mips-jump-pc-region.s
Those tests contained characters encoded as 160 but that render (at
least for me in vim) like a single space (32). Those characters
appeared between the `#` and `RUN:` on several lines, and D79276
caused FileCheck to process those lines differently: `RUN:` is a
comment directive. As a result, D79276 caused FileCheck to start
calling is `isalnum` on those characters.
The problem is that FileCheck calls `isalnum` on type `char` without
casting to `unsigned char` first, so it sign-extends 160 beyond what
`unsigned char` or `EOF` can represent. C says that has undefined
behavior. This problem is general to FileCheck's prefix parsing and
so exists independently of D79276.
524457edbc fixed the above tests. This patch changes FileCheck to
use LLVM's replacements for `ctype.h` functions, and it adds tests for
cases that are representative with or without D79276.
Reviewed By: jhenderson, thopre, efriedma
Differential Revision: https://reviews.llvm.org/D79810
It's really almost going to be misleading, see the example in
https://bugs.llvm.org/show_bug.cgi?id=45820
Maybe at some point we can do something fancier, but at least
this will fix a bug where we step on dead code while debugging.
gLinux started shipping incompatible versions of Z3, which can lead to a
missing `z3.h` header when building the Z3 solver locally. This patch
disables the Z3 solver when building a clang toolchain for Fuchsia.
Differential Revision: https://reviews.llvm.org/D79974
Just like on X86, we're getting reports of bogus leaks from
CoreFoundation in:
[_CFXPreferences(SourceAdditions) withSourceForIdentifier:user:byHost:container:cloud:perform:]
rdar://63238710
This reverts commit 7d5bb94d78.
Reverting since this leads to a linker error we're seeing on Fuchsia.
The underlying issue seems to be that inlining is run after sanitizers
and causes different comdat groups instrumented by Sancov to reference
non-key symbols defined in other comdat groups.
Will re-land this patch after a fix for that is landed.
Summary:
Predefined allocators should not be mapped at all (they are just enumeric
constants). FOr user-defined allocators need to map the traits only as
firstprivates, the allocator itself is private.
At the beginning of the target region the user-defined allocatores must
be created and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
Summary:
D78319 introduced basic support for inline asm input operands in GlobalISel.
However, that patch did not handle the case where a memory input operand still needs to
be indirectified. Later code asserts that the memory operand is already indirect.
This patch adds an early return false to trigger the SelectionDAG fallback for now.
Reviewers: arsenm, paquette
Reviewed By: arsenm
Subscribers: wdng, rovka, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79955
We need to use it to handle <16 x double> indirect indexes
in the AMDGPU BE.
The only visible change from adding it is in ARM cost model.
To me it looks reasonable. With doubling a vector size it
quadruples the cost up to the size 8 and then it did only
double it. Now it also quadruples, which seems a logical
progression to me.
Actual AMDGPU code is to follow, this is a common part, plus
load/store legalization in the AMDGPU BE not to break what
works now.
Differential Revision: https://reviews.llvm.org/D79952
The commit 3c28a2dc6b introduced the check that checks if we're
trying to re-enter a main file when building a preamble. Unfortunately this slowed down the preamble
compilation by 80-90% in some test cases, as translateFile is really slow. This change checks
to see if the FileEntry is the main file without calling translateFile, but by using the new
isMainFile check instead. This speeds up preamble building by 1.5-2x for certain test cases that we have.
rdar://59361291
Differential Revision: https://reviews.llvm.org/D79834
This patch adds `affine.vector_load` and `affine.vector_store` ops to
the Affine dialect and lowers them to `vector.transfer_read` and
`vector.transfer_write`, respectively, in the Vector dialect.
Reviewed By: bondhugula, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D79658
The fact that loads and stores can have the alignment missing is a
constant source of confusion: code that usually works can break down in
rare cases. So fix the LoadInst API so the alignment is never missing.
To reduce the number of changes required to make this work, IRBuilder
and certain LoadInst constructors will grab the module's datalayout and
compute the alignment automatically. This is the same alignment
instcombine would eventually apply anyway; we're just doing it earlier.
There's a minor risk that the way we're retrieving the datalayout
could break out-of-tree code, but I don't think that's likely.
This is the last in a series of patches, so most of the necessary
changes have already been merged.
Differential Revision: https://reviews.llvm.org/D77454
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.
Reviewed By: int3, smeenai
Differential Revision: https://reviews.llvm.org/D78342
This isn't really a new invariant; it effectively already existed due to
existing DataLayout queries. But this makes it explicit.
This is technically not backward-compatible with the existing bitcode
reader, but it's backward-compatible with the output of the bitcode
writer, which is what matters in practice.
No testcase because I don't know a good way to write one: there are no
existing tools that can generate a bitcode file that would trigger the
error.
Split off from D78403.
Differential Revision: https://reviews.llvm.org/D79900
Commit 73152a2ec2 fixed type checking for
blocks with qualified id parameters. But there are existing APIs in
Apple SDKs relying on the old type checking behavior. Specifically,
these are APIs using NSItemProviderCompletionHandler in
Foundation/NSItemProvider.h. To keep existing code working and to allow
developers to use affected APIs introduce a compatibility mode that
enables the previous and the fixed type checking. This mode is enabled
only on Darwin platforms.
Reviewed By: jyknight, ahatanak
Differential Revision: https://reviews.llvm.org/D79511
We already form PMULH when the shift is truncated. But we can
also do it from just a shift by extending the result.
Unfortunately, I get regressions if I try to replace the truncate
combine with this as we turn the truncate into a more complicated
sequence first. Then we are unable to combine that sequence with
the extend produced at the end of this combine.
Differential Revision: https://reviews.llvm.org/D79682
Enable clausing of memory loads on gfx10 by adding a new pass to insert
the s_clause instructions that mark the start of each hard clause.
Differential Revision: https://reviews.llvm.org/D79792
Fixes PR42445 (compiler driver options -Os -Oz translate to
-plugin-opt=Os (Oz) which are not recognized by LLVMgold.so or LLD).
The optimization level mapping matches
CompilerInvocation.cpp:getOptimizationLevel() and SpeedLevel of
PassBuilder::OptimizationLevel::O*.
-plugin-opt=O* affects the way we construct regular LTO/ThinLTO pass
manager pipeline.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D79919
When I moved combineLoopMAddPattern to an IR pass. I didn't match the behavior of canReduceVMulWidth that was used in the SelectionDAG version. canReduceVMulWidth just calls computeSignBits and assumes a truncate is always profitable. The version I put in IR just looks for constants and zext/sext. Though I neglected to check the number of bits in input of the zext/sext.
This patch adds a check for the number of input bits to the sext/zext. And it adds a special case for add/sub with zext/sext inputs which can be handled by combineTruncatedArithmetic. Match the original SelectionDAG behavior appears to be a regression in some cases if the truncate isn't removed and becomes pack and permq. So enabling only this specific case is the conservative approach.
Differential Revision: https://reviews.llvm.org/D79909