The LLVM Libc project is no longer just a proposal and should have
a webpage tracking the status of the project. This changes
puts the pieces into the right place so that the webpage can be
created.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117436
This patch builds on the change in D117634 that expanded the short
triples when passed in by the user. This patch adds the same
functionality for the `-Xopenmp-target=` flag. Previously it was
unintuitive that passing `-fopenmp-targets=nvptx64
-Xopenmp-target=nvptx64 <arg>` would not forward the arg because the
triples did not match on account of `nvptx64` being expanded to
`nvptx64-nvidia-cuda`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118495
Inside a switch the caseStmt() and defaultStmt() have a nested statement
associated with them. Similarly, labelStmt() has a nested statement.
These statements were being missed when looking for a compound-if of the
form "if (x) return true; return false;" when the if is nested under one
of these labelling constructs.
Enhance the matchers to look for these nested statements using some
private matcher hasSubstatement() traversal matcher on case, default
and label statements. Add the private matcher hasSubstatementSequence()
to match the compound "if (x) return true; return false;" pattern.
- Add unit tests for private matchers and corresponding test
infrastructure
- Add corresponding test file readability-simplify-bool-expr-case.cpp.
- Fix variable name copy/paste error in readability-simplify-bool-expr.cpp.
- Drop the asserts, which were used only for debugging matchers.
- Run clang-format on the whole check.
- Move local functions out of anonymous namespace and declare state, per
LLVM style guide
- Declare labels constexpr
- Declare visitor arguments as pointer to const
- Drop braces around simple control statements per LLVM style guide
- Prefer explicit arguments over default arguments to methods
Differential Revision: https://reviews.llvm.org/D56303Fixes#27078
to give users a final warning that they need to migrate away. They could still
use -flegacy-pass-manager for Clang 14.0.0, but the functionality may not work
for 15.0.0.
-fexperimental-new-pass-manager is a no-op for default builds, so not urgent to
be removed for 14.0.0.
clang/test/Frontend/optimization-remark-with-hotness.c is removed because its
new PM replacement optimization-remark-with-hotness-new-pm.c exists.
Reviewed By: aeubanks, nikic
Differential Revision: https://reviews.llvm.org/D118313
llvm/Support/Path.h was likely previously implicitly included, and a
refactoring removed that inclusion, breaking the pass.
Differential Revision: https://reviews.llvm.org/D118508
Type constraints such as BoolLike and SignlessIntegerLike appear to
have been defined by copy-paste and all share an underlying TypesLike
structure that can be factored out.
This also allows for defining additional constraints of a similar
form, such as F32Like.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118382
The --offload option was added in D110622 to "override the default
device target". When it landed it supported only HIP. This patch
extends that option to support SPIR-V targets for CUDA.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D117137
Extend scalar evolution to handle >= and <= if a loop is known to be finite and the induction variable guards the condition. Specifically, with these assumptions lhs <= rhs is equivalent to lhs < rhs + 1 and lhs >= rhs to lhs > rhs -1.
In the case of lhs <= rhs, this is true since the only case these are not equivalent
is when rhs == unsigned/signed intmax, which would have resulted in an infinite loop.
In the case of lhs >= rhs, this is true since the only case these are not equivalent
is when rhs == unsigned/signed intmin, which would again have resulted in an infinite loop.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D118090
Enhance the various update_*_test_checks.py tools to allow filtering the tool
output with regular expressions. The --filter option will emit only tool output
lines matching the given regular expression while the --filter-out option will
emit only tools output lines not matching the given regular expression. Filters
are applied in order of appearance on the command line (or in UTC_ARGS) and the
first matching filter terminates the search.
This allows test authors to create more focused tests by removing irrelevant
tool output and checking only the pieces of output necessary to test the desired
functionality.
Differential Revision: https://reviews.llvm.org/D117694
This patch enable lowering from Fortran to FIR for a basic empty
program. It brings all the infrastructure needed for that. As discussed
previously, this is the first patch for lowering and follow up patches
should be smaller.
With this patch we can lower the following code:
```
program basic
end program
```
To a the FIR equivalent:
```
func @_QQmain() {
return
}
```
Follow up patch will add lowering of more complex constructs.
Reviewed By: kiranchandramohan, schweitz, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D118436
Remove obsolete code that has moved to the
`flang/Optimizer/Builder/Intrinsic` directory.
`genMin` is inlined in the code since it's not available
in the builder.
Reviewed By: kiranchandramohan, schweitz
Differential Revision: https://reviews.llvm.org/D118465
Fixes https://github.com/llvm/llvm-project/issues/53441.
Expected code:
```
/**/ //
int a; //
```
was before misformatted to:
```
/**/ //
int a; //
```
Because the "remaining length" (after the starting `/*`) of an empty block comment `/**/` was computed to be 0 instead of 2.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D118475
ConstStructBuilder::Finalize in CGExprConstant.ccp assumes that the
passed in QualType is a RecordType. In some instances, the type is a
reference to a RecordType and the reference needs to be removed first.
Differential Revision: https://reviews.llvm.org/D117376
Currently clang treats host var address as constant in device compilation,
which causes const vars initialized with host var address promoted to
device variables incorrectly and results in undefined symbols.
This patch fixes that.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D118153
Fixes: SWDEV-309881
Change-Id: I0a69357063c6f8539ef259c96c250d04615f4473
A `-DBUILD_SHARED_LIBS=ON` build on Solaris/amd64 failed with
Undefined first referenced
symbol in file
_ZNK4llvm3cfg6UpdateIPNS_10BasicBlockEE4dumpEv tools/polly/unittests/DeLICM/CMakeFiles/DeLICMTests.dir/DeLICMTest.cpp.o (symbol belongs to implicit dependency /var/llvm/local-amd64-release-stage2-shared-A/bin/../lib/libLLVMCore.so.14git)
ld: fatal: symbol referencing errors
Solaris `ld` requires to directly link with dependant libraries, so this
patch explicitly adds `libLLVMCore`.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D118452
Currently, the clang.arc.attachedcall bundle takes an optional function
argument. Depending on whether the argument is present, calls with this
bundle have the following semantics:
- on x86, with the argument present, the call is lowered to:
call _target
mov rax, rdi
call _objc_retainAutoreleasedReturnValue
- on AArch64, without the argument, the call is lowered to:
bl _target
mov x29, x29
and the objc runtime call is expected to be emitted separately.
That's because, on x86, the objc runtime checks for both the mov and
the call on x86, and treats the combination as the ARC autorelease elision
marker.
But on AArch64, it only checks for the dedicated NOP marker, as that's
historically been sufficiently unique. Thanks to that, the runtime call
wasn't required to be adjacent to the NOP marker, so it wasn't emitted
as part of the bundle sequence.
This patch unifies both architectures: on AArch64, we now emit all
3 instructions for the bundle. This guarantees that the runtime call
is adjacent to the marker in the sequence, and that's information the
runtime can use to further optimize this.
This helps simplify some of the handling, in particular
BundledRetainClaimRVs, which no longer needs to know whether the bundle
is sufficient or not: it now always should be.
Note that this does not include an AutoUpgrade for the nullary bundles,
as they are only produced in ObjCContract as part of the obj/asm emission
pipeline, and are not expected to be in bitcode.
Differential Revision: https://reviews.llvm.org/D118214
Commit history in chronological order:
[BOLT] llvm-bolt-wrapper: added wrapper for bolt binary matching
Summary:
Wrapper to compare two versions of BOLT to see if they produce the same output
binary given the same input.
(cherry picked from FBD26626137)
[BOLT] llvm-bolt-wrapper: support for no-output tests and heatmap mode
Summary:
- Added an option `skip_binary_cmp` to support invocations that don't output
a binary
- Minor fixes for heatmap mode, timeout, log comparison
- Rearranged in-line config example to be copy-pasteable
(cherry picked from FBD26822016)
[BOLT] llvm-bolt-wrapper: merge stdout/stderr, search for config in script dir
(cherry picked from FBD27529335)
[BOLT] llvm-bolt-wrapper: handle /dev/null
Summary:
Fixed the wrapper to preserve `-o /dev/null` and skip binary matching for such
invocations.
(cherry picked from FBD28013747)
[BOLT] llvm-bolt-wrapper: handle cases where output binary doesn't exist
Summary:
Handle invocations where output binary is not generated (e.g. due to an expected
assertion or exit with BOLT-ERROR) and skip binary comparison in such cases.
(cherry picked from FBD28080158)
[BOLT] llvm-bolt-wrapper: handle boltdiff mode
Summary:
Handle `llvm-boltdiff` invocation similarly to `perf2bolt`
(cherry picked from FBD28080157)
[BOLT] llvm-bolt-wrapper: find section with mismatch
Summary:
For mismatching ELF files, find section with mismatch and print sections table
with highlighted mismatch section.
(cherry picked from FBD28087231)
[BOLT] llvm-bolt-wrapper: ignore-build-id in perf2bolt mode
Summary:
When perf2bolt fails to match build-id from perf output for cmp binary, we need
to use -ignore-build-id option to override the strict checking behavior.
(cherry picked from FBD28087232)
[BOLT] llvm-bolt-wrapper: suppress -bolt-info=0 in heatmap mode
Summary:
Heatmap mode is incompatible with `-bolt-info=0` used to suppress binary
differences. Remove it.
(cherry picked from FBD28087230)
[BOLT] llvm-bolt-wrapper: add config-generator mode
Summary:
llvm-bolt-wrapper config can be generated by the script itself.
It makes the workflow more reliable compared to preparing the config manually.
(cherry picked from FBD28358939)
[BOLT] llvm-bolt-wrapper: fix mismatch reporting
Summary:
1. Fixed header comparison issue where headers were skipped due to
`skip_end == 0` (`lst[:-n]` does not work if n==0).
2. Detect color support while printing mismatching section:
- use bold color if terminal supports ANSI escape codes,
- otherwise print ">" at mismatching section.
3. Remove extra 0x before mismatching offset.
(cherry picked from FBD28691979)
[BOLT] llvm-bolt-wrapper: handle perf2bolt tests with ignore-build-id
Summary:
`ignore-build-id` must be passed not more than once. Account for that.
(cherry picked from FBD29830266)
[BOLT] llvm-bolt-wrapper: fix running subprocesses in parallel
Summary:
The commands were running sequentially due to the use of blocking `communicate`
call, which is needed when stdout/stderr are directed to a pipe.
Fix this behavior by directing the output to a file.
(cherry picked from FBD29951863)
This reverts commit db49a78900. The reasoning in the patch applied to a downstream branch, and I got myself confused when trying to split apart pieces. Thankfully, the assert was simply weaker than the actual invariant currently upstream which is that ReadyInsts is not empty.
This patch adds the vector.scan op which computes the
scan for a given n-d vector. It requires specifying the operator,
the identity element and whether the scan is inclusive or
exclusive.
TEST: Added test in ops.mlir
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D117171
Part of the _BitInt feature in C2x
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf) is a new
macro in limits.h named BITINT_MAXWIDTH that can be used to determine
the maximum width of a bit-precise integer type. This macro must expand
to a value that is at least as large as ULLONG_WIDTH.
This adds an implementation-defined macro named __BITINT_MAXWIDTH__ to
specify that value, which is used by limits.h for the standard macro.
This also limits the maximum bit width to 128 bits because backends do
not currently support all mathematical operations (such as division) on
wider types yet. This maximum is expected to be increased in the future.
Functionality present in `flang/include/flang/Lower/ComplexExpr.h` are
available in `flang/include/flang/Optimizer/Builder/Complex.h`. This patch removes
the obsolete files.
Reviewed By: kiranchandramohan, schweitz
Differential Revision: https://reviews.llvm.org/D118462
Due to some complications with lifetime, and assume-like intrinsics, intrinsics were not included as outlinable instructions. This patch opens up most intrinsics, excluding lifetime and assume-like intrinsics, to be outlined. For similarity, it is required that the intrinsic IDs, and the intrinsics names match exactly, as well as the function type. This puts intrinsics in a different class than normal call instructions (https://reviews.llvm.org/D109448), where the name will no longer have to match.
This also adds an additional command line flag debug option to disable outlining intrinsics.
Recommit of: 8de76bd569
Adds extra checking of intrinsic function calls names to avoid taking the address of intrinsic calls when extracting function calls.
Reviewers: paquette, jroelofs
Differential Revision: https://reviews.llvm.org/D109450
Close#52781: for LTO, the inline asm diagnostic uses `<inline asm>` as the file
name (lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp) and it is unclear which
module has the issue.
With this patch, we will see the module name (say `asm.o`) before `<inline asm>` with ThinLTO.
```
% clang -flto=thin -c asm.c && myld.lld asm.o -e f
ld.lld: error: asm.o <inline asm>:1:2: invalid instruction mnemonic 'invalid'
invalid
^~~~~~~
```
For regular LTO, unfortunately the original module name is lost and we only get
ld-temp.o.
Reviewed By: #lld-macho, ychen, Jez Ng
Differential Revision: https://reviews.llvm.org/D118434
This file was added in https://reviews.llvm.org/D74415. There was no
justification as to why it was added, and after about a year of being
in-tree, it's still unused, so this removes it.
The fact we could have a block with a valid scheduling window, but nothing to schedule was surprising to me. After digging through the code, this can only happen if we don't find anything to directly vectorize. However, the reduction handling code relies on this mode, so we can't simply consider such trees unvectorizeable. The assert conveys both that this situation can happen, but also that it can *only* happen for an immediate gather.
Context: We built the bundle before deciding that vectorization of a bundle is possible. A side effect of bundle construction is manipulating the scheduling window, so a bundle which isn't vectorizable can cause the creation or expansion of a scheduling window.
This matches GCC: https://godbolt.org/z/sM5q95PGY
I realize this is an API break for clang+clang - so I'm totally open to
discussing how we should deal with that. If Apple wants to keep the
Clang layout indefinitely, if we want to put a flag on this so non-Apple
folks can opt out of this fix/new behavior.
Differential Revision: https://reviews.llvm.org/D117616
The unit tests for PyTACO hasn't been upstreamed yet. A unit test for this
change will be added when we upstream all the unit tests for PyTACO.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118417
AIX provides additional definitions in the system libc float.h that we
would like to be available to users, so we need to include_next through,
similar to what is done on some other platforms.
We also adjust the guards for some definitions which are restricted
based on language level to also be provide with the _ALL_SOURCE feature
test macro on AIX, similar to what is done by the platform float.h
header, so we don't run into cases where we don't provide the compiler
macro but still have a different definition from the system.
Differential Revision: https://reviews.llvm.org/D117935