The clang-tidy check bugprone-branch-clone has a false positive if some
symbols are undefined. This patch silences the warning when the two
sides of a branch are invalid.
Fixes#56057
Differential Revision: https://reviews.llvm.org/D128402
(-(X & 1)) & Y --> (X & 1) == 0 ? 0 : Y
https://alive2.llvm.org/ce/z/rhpH3i
This is noted as a missing IR canonicalization in issue #55618.
We already managed to fix codegen to the expected form.
Currently, the parser for IntegerSet, only allows constraints like:
```
affine-constraint ::= affine-expr `>=` `0`
| affine-expr `==` `0`
```
This form is sometimes unreadable and painful to use when writing unittests
for Presburger library and tests in general.
This patch extends the parser to allow affine constraints with affine-expr on
the RHS:
```
affine-constraint ::= affine-expr `>=` `affine-expr`
| affine-expr `==` `affine-expr`
```
The internal storage and printing of IntegerSet is still in the original format.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D128915
Computing scalable offset needs up to two scrach registers. We add
scavenge spill slots according to the result of `RISCV::isRVVSpill`
and `RVVStackSize`. Since ADDI is not included in `RISCV::isRVVSpill`,
PEI doesn't add scavenge spill slots for scrach registers when using
ADDI to get scalable stack offsets.
The ADDI instruction has a destination register which can be used as
a scrach register. So one scavenge spil slot is sufficient for
computing scalable stack offsets.
Differential Revision: https://reviews.llvm.org/D128188
This reverts 3668d1264e
As far as we know, `__attribute__((weak))` support has been really bad
in runtimeldyld, so we just disable it in Windows at this moment. This
should fix the angry Windows buildbot.
Differential Revision: https://reviews.llvm.org/D129042
This is a recommit of b822efc740,
reverted in dc34d8df4c. The commit caused
fails because the test ast-print-fp-pragmas.c did not specify particular
target, and it failed on targets which do not support constrained
intrinsics. The original commit message is below.
AST does not have special nodes for pragmas. Instead a pragma modifies
some state variables of Sema, which in turn results in modified
attributes of AST nodes. This technique applies to floating point
operations as well. Every AST node that can depend on FP options keeps
current set of them.
This technique works well for options like exception behavior or fast
math options. They represent instructions to the compiler how to modify
code generation for the affected nodes. However treatment of FP control
modes has problems with this technique. Modifying FP control mode
(like rounding direction) usually requires operations on hardware, like
writing to control registers. It must be done prior to the first
operation that depends on the control mode. In particular, such
operations are required for implementation of `pragma STDC FENV_ROUND`,
compiler should set up necessary rounding direction at the beginning of
compound statement where the pragma occurs. As there is no representation
for pragmas in AST, the code generation becomes a complicated task in
this case.
To solve this issue FP options are kept inside CompoundStmt. Unlike to FP
options in expressions, these does not affect any operation on FP values,
but only inform the codegen about the FP options that act in the body of
the statement. As all pragmas that modify FP environment may occurs only
at the start of compound statement or at global level, such solution
works for all relevant pragmas. The options are kept as a difference
from the options in the enclosing compound statement or default options,
it helps codegen to set only changed control modes.
Differential Revision: https://reviews.llvm.org/D123952
At the moment, two files are not installed by CMake.
- `lib/Headers/openmp_wrappers/time.h`
- `lib/Headers/ppc_wrappers/nmmintrin.h`
`builtin_headers_gen` is available as the source of rules_pkg.
The difference of the layout of installed headers makes cache hit harder.
This handles the code we get for
int foo(int* x, unsigned y) {
return x[y >> 1];
}
The shift right and the shl will get DAG combined into
(shl (and X, 0xfffffffe), 1). We have custom isel to match the
shl+and, but with Zba the (add (shl X, 1), Y) part will get
matched and leave the and to be iseled by itself. This commit
adds a larger pattern that includes the and.
We can canonicalize consecutive complex.exp and complex.log which are inverse functions each other.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D128966
This allows us to fold global and constant pool addresses into
load/store during isel instead of in the post-isel peephole. I
did not copy the alignment check for ConsantPoolSDNode because it
wasn't tested.
This is a step towards being able to remove the post-isel
peephole.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D128738
The moved helpers are only used for codegen. It will allow moving the
remaining ::execute implementations out of LoopVectorize.cpp.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D128657
where C2 has 32 leading zeros and C3 trailing zeros.
When the shl is used by an add C is 1,2 or 3, we end up matching
(add (shl X, C), Y) first. This leaves an and with a constant that
is harder to materialize.
Similar for SH2ADD and SH3ADD.
This is what we get from
int foo(int* x, unsigned y) {
return x[y >> 1];
}
This allows us to avoid materializing 0xFFFFFFFE into a register.
Summary:
A previous patch added support for dumping offloading sections. The
tests for this feature added dummy input to the required section using
`llvm-objcopy`. This binary format has a required alignment of `8` which
was not being respected by the file copied with llvm-objcopy and would
cause failures on architectures sensitive to alignment problems or with
sanitizers. This patch adds the proper alignemnt and adds an error check
at least for the binary format so it's not completely opaque. This
should be improvbed so users actually get a helpful message.
This reverts commit 7af3d4ab3d.
RISC-V reverted the shrink wrap patch for bug 53662. Since the bug is fixed
by D123679, the commit re-enable it.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D128965
If we are certainly not in a loop we can directly emit the heap2stack
allocas in the function entry block. This will help to get rid of them
(SROA) and avoid stacksave/restore intrinsics when the function is
inlined.
I used a script to reuse existing check lines rather than creating new
ones. There are more opportunities to reduce the line count but the
"check generated functions" logic makes that somewhat tricky.
FWIW, we really should redo the update script with all these use cases
in mind...
Differential Revision: https://reviews.llvm.org/D128686
In Clang/LLVM we are moving towards a new binary format to store many
embedded object files to create a fatbinary. This patch adds support for
dumping these embedded images in the `llvm-objdump` tool. This will
allow users to query information about what is stored inside the binary.
This has very similar functionality to the `cuobjdump` tool for thoe familiar
with the Nvidia utilities. The proposed use is as follows:
```
$ clang input.c -fopenmp --offload-arch=sm_70 --offload-arch=sm_52 -c
$ llvm-objdump -O input.o
input.o: file format elf64-x86-64
OFFLOADIND IMAGE [0]:
kind cubin
arch sm_52
triple nvptx64-nvidia-cuda
producer openmp
OFFLOADIND IMAGE [1]:
kind cubin
arch sm_70
triple nvptx64-nvidia-cuda
producer openmp
```
This will be expanded further once we start embedding more information
into these offloading images. Right now we are planning on adding
flags and entries for debug level, optimization, LTO usage, target
features, among others.
This patch only supports printing these sections, later we will want to
support dumping files the user may be interested in via another flag. I
am unsure if this should go here in `llvm-objdump` or `llvm-objcopy`.
Reviewed By: MaskRay, tra, jhenderson, JonChesterfield
Differential Revision: https://reviews.llvm.org/D126904
This patchs adds the necessary code for inspecting or creating offloading
binaries using the standing `obj2yaml` and `yaml2obj` features in LLVM.
Depends on D127774
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D127776
region with implicit default inside the member function.
This is to fix assert when field is referenced in OpenMP region with
default (first|private) clause inside member function.
The problem of assert is that the capture is not generated for the field.
This patch is to generate capture when the field is used with implicit
default, use it in the code, and save the capture off to make sure it is
considered from that point and add first/private clauses.
1> Add new field ImplicitDefaultFirstprivateFDs in SharingMapTy, used to
store generated capture fields info.
2> In function isOpenMPCaptureDecl: the caputer is generated and saved
in ImplicitDefaultFirstprivateFDs.
3> Add new help functions:
getImplicitFDCapExprDecl
isImplicitDefaultFirstprivateFD
addImplicitDefaultFirstprivateFD
4> Add addition argument in hasDSA to check default attribute for
default(first|private).
5> The isImplicitDefaultFirstprivateFD is used in VisitDeclRefExpr to
build the implicit clause.
6> Add new parameter "Context" for buildCaptureDecl, due to when capture
field, the parent context is needed to be used.
7> Change in isOpenMPPrivateDecl where stop propagate the capture from
the enclosing region for private variable.
8> In ActOnOpenMPFirstprivate/ActOnOpenMPPrivate, using captured info
to generate first|private clause.
9> Add new function isOpenMPRebuildMemberExpr: use to determine if field
needs to be rebuild during template instantiation.
Differential Revision: https://reviews.llvm.org/D127803
If BOLT instrumentation runtime uses XMM registers, it can interfere
with the user program causing crashes and unexpected behavior. This
happens as the instrumentation code preserves general purpose registers
only.
Build BOLT instrumentation runtime with "-mno-sse".
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128960
This is a preprocessor callback focused on the lexed file changing, without conflating effects of line number directives and other pragmas.
A client that only cares about what files the lexer processes, like dependency generation, can use this more straightforward
callback instead of `PPCallbacks::FileChanged()`. Clients that want the pragma directive effects as well can keep using `FileChanged()`.
A use case where `PPCallbacks::LexedFileChanged()` is particularly simpler to use than `FileChanged()` is in a situation
where a client wants to keep track of lexed file changes that include changes from/to the predefines buffer, where it becomes
unnecessary complicated trying to use `FileChanged()` while filtering out the pragma directives effects callbacks.
Also take the opportunity to provide information about the prior `FileID` the `Lexer` moved from, even when entering a new file.
Differential Revision: https://reviews.llvm.org/D128947