The SwapAndRestore class is over engineered. Nothing makes use of the
early restoration machinery. Let's just remove that cognative burdon.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D120673
This new flag enables `__has_feature(cxx_unstable)` that would replace libc++ macros for individual unstable/experimental features, e.g. `_LIBCPP_HAS_NO_INCOMPLETE_RANGES` or `_LIBCPP_HAS_NO_INCOMPLETE_FORMAT`.
This would make it easier and more convenient to opt-in into all libc++ unstable features at once.
Differential Revision: https://reviews.llvm.org/D120160
This takes care of normalizing newlines back to single LF instead
of CRLF.
This on itself breaks on a couple tests that accidentally seem to
be writing binary data to stdout; make sure those cases are piped
to /dev/null instead of actually written to a terminal.
Differential Revision: https://reviews.llvm.org/D120623
A function is basically dead when:
* it has no uses
* it has only self-referencing uses (it's recursive)
Differential Revision: https://reviews.llvm.org/D119878
Current objcopy implementation has a possibility to add or update sections.
The incoming section is specified as a pair: section name and name of the file
containing section data. The interface does not allow to specify incoming
section as a memory buffer. This patch adds possibility to specify incoming
section as a memory buffer.
Differential Revision: https://reviews.llvm.org/D120486
Commit 416e689ecd introduced a fix to
`readability-suspicious-call-argument` which added an entry to the
Release Notes, but the same change was backported to **14.0** in commit
e89602b7b2ec12f20f2618cefb864c2b22d0048a.
Hence, this change is not a new thing of the to-be 15.0 release.
For unreachable loops, any BECount is legal, and since D98706 SCEV
can make use of this for loops that are unreachable due to constant
branches. To avoid false positives, adjust SCEV verification to only
check BECounts in reachable loops.
Fixes https://github.com/llvm/llvm-project/issues/50523.
Differential Revision: https://reviews.llvm.org/D120651
The added tests highlight an unnecessary cross-iteration dependency when
lowering reductions to faddp. This dependency can negatively impact
performance.
NVVM IR specification defines them with i32 return type:
declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
...
The i32 return value is a 32-bit mask where bit position in mask corresponds
to thread’s laneid.
as well as PTX ISA:
9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync
match.any.sync.type d, a, membermask;
match.all.sync.type d[|p], a, membermask;
...
Destination d is a 32-bit mask where bit position in mask corresponds
to thread’s laneid.
Additionally, ptxas doesn't accept intructions, produced by NVPTX backend.
After this patch, it compiles with no issues.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D120499
Add a check on run lines to pick up isel options in llc commands and allow
generating check lines of isel final output other than assembly. If llc command
line contains -debug-only=isel, update_llc_test_checks.py will try to scrub isel
output, otherwise, the script will fall back on default behaviour, which is try to
scrub assembly output instead.
The motivation of this change is to allow usage of update_llc_test_checks.py to
autogenerate checks of instruction selection results. In this way, we can detect
errors at an earlier stage before the compilation goes all the way to assembly.
It is an example of having some transparency for the stages between IR and
assembly. These generated tests are almost like "unit tests" of isel stage.
This patch only implements the initial change to differentiate isel output from
assembly output for Lanai. Other targets will not be supported for isel check
generation at the moment. Although adding support for it will only require
implementing the function regex and scrubber for corresponding targets.
The Lanai implementation was chosen mainly for the simplicity of demonstrating
the difference between isel checks and asm checks.
This patch also do not include the implementation of function prefix, which is
required for the generated isel checks to pass. I will put up a follow up revision
for the function prefix change to complete isel support.
Reviewed By: Flakebi
Differential Revision: https://reviews.llvm.org/D119368
The previous code used an unbounded sprintf, which in theory can
overflow, writing either the null terminator or the last digits
into the next struct member.
In practice, in LLD, all long section names are written sequentially
first at the start of the string table, followed by all the long
symbol names. Due to this, even if the total string table would
end up large, the long section names have fairly short offsets,
which is why this hasn't been an issue in practice.
I don't think it's worth trying to write a test that produces an
executable with enough long section names to make the section names
themselves exceed 10^6 bytes, which is currently necessary to trigger
faults with the previous form.
Differential Revision: https://reviews.llvm.org/D120676
Implementation partitions bring two extra cases where we have
visibility of module-private data.
1) When we import a module implementation partition.
2) When a partition implementation imports the primary module intertace.
We maintain a record of direct imports into the current module since
partition decls from direct imports (but not trasitive ones) are visible.
The rules on decl-reachability are much more relaxed (with the standard
giving permission for an implementation to load dependent modules and for
the decls there to be reachable, but not visible).
Differential Revision: https://reviews.llvm.org/D118599
The revision renames the following OpDSL functions:
```
TypeFn.cast -> TypeFn.cast_signed
BinaryFn.min -> BinaryFn.min_signed
BinaryFn.max -> BinaryFn.max_signed
```
The corresponding enum values on the C++ side are renamed accordingly:
```
#linalg.type_fn<cast> -> #linalg.type_fn<cast_signed>
#linalg.binary_fn<min> -> #linalg.binary_fn<min_signed>
#linalg.binary_fn<max> -> #linalg.binary_fn<max_signed>
```
Depends On D120110
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120562
The revision extends OpDSL with unary and binary function attributes. A function attribute, makes the operations used in the body of a structured operation configurable. For example, a pooling operation may take an aggregation function attribute that specifies if the op shall implement a min or a max pooling. The goal of this revision is to define less and more flexible operations.
We may thus for example define an element wise op:
```
linalg.elem(lhs, rhs, outs=[out], op=BinaryFn.mul)
```
If the op argument is not set the default operation is used.
Depends On D120109
Reviewed By: nicolasvasilache, aartbik
Differential Revision: https://reviews.llvm.org/D120110
Add a checker to maintain the system-defined value 'errno'.
The value is supposed to be set in the future by existing or
new checkers that evaluate errno-modifying function calls.
Reviewed By: NoQ, steakhal
Differential Revision: https://reviews.llvm.org/D120310
WithColor has an "auto detection mode" which looks whether the
corresponding whether the corresponding cl::opt is enabled or not. While
this is great when opting into cl::opt, it's not so great for downstream
users of this utility, which might have their own competing options to
enable or disable colors. The WithColor constructor takes a color mode,
but the big benefit of the class are its static error and warning
helpers and default error handlers.
In order to allow users of this utility to enable or disable colors
globally, this patch adds the ability to specify a global auto detection
function. By default, the auto detection function behaves the way that
it does today. The benefit of this patch lies in that it can be
overwritten. In addition to a ability to change the auto detection
function, I've also made it possible to get your hands on the default
auto detection function, so you swap it back if if you so desire.
This patch allow downstream users (like LLDB) to globally disable colors
with its own command line flag.
Differential revision: https://reviews.llvm.org/D120593
Reassigning the operand didn't update the operand type which resulted in an
assertion (`Assertion `isReg() && "This is not a register operand!"' failed.`)
Reset the instruction instead.
Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.ReplaceRegWithImm/0 (90 of 136)
```
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120263
Given a dependent `T` (maybe an undeduced `auto`),
Before:
new T(z) --> new T((z)) # changes meaning with more args
new T{z} --> new T{z}
T(z) --> T(z)
T{z} --> T({z}) # forbidden if T is auto
After:
new T(z) --> new T(z)
new T{z} --> new T{z}
T(z) --> T(z)
T{z} --> T{z}
Depends on D113393
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D120608
Trying to invoke an x64 binary on ARM64 Windows 10 won't work, and will
print an obscure error message. Choose the 32-bit linker instead, which
will run under emulation.
The x64 linker should in theory run under ARM64 Windows 11. We could
detect this using IsWow64GuestMachineSupported(), but I don't have a
setup to test that with at the moment.
Differential Revision: https://reviews.llvm.org/D120681
This is a clean-up patch. The functional pass was rolled into the module pass in D112732.
Reviewed By: vitalybuka, aeubanks
Differential Revision: https://reviews.llvm.org/D120674
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.
This patch includes the following related changes:
* Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
* Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
* Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
* Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.
Differential Revision: https://reviews.llvm.org/D114413
Add a header-only implementation of Briggs & Torczon's fast small
integer set data structure to flang/include/flang/Common, and use
it in the runtime to manage a pool of Fortran unit numbers with
recycling. This replaces the bit set previously used for that
purpose. The set is initialized on demand with the negations of
all the NEWUNIT= unit numbers that can be returned to any kind
of integer variable.
For programs that require more concurrently open NEWUNIT= unit
numbers than the pool can hold, they are now allocated with a
non-recycling counter. This allows as many open units as the
operating system provides.
Many of the top-line comments in flang/unittests/Runtime had the
wrong path name. I noticed this while adding a unit test for the
fast integer set data structure, and cleaned them up.
Differential Revision: https://reviews.llvm.org/D120685
Only the methods currently required by the libc have been added.
Most of the existing uses of atomic operations have been switched over
to this new class. A future change will clean up the rest of uses.
This change now allows building mutex and condition variable code with a
C++ compiler which does not have stdatomic.h, for example g++.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D120642
This patch includes some changes which brings the code in line with
llvm coding guidelines.
-> Remove curlies for one line if statements.
-> Remove else after return.
-> Removes a few usage of auto.
-> Add Doxygen comments
Addresses post review comments in D120403 by @schweitz.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120657
Derived types with allocatable and pointer components cannot
be used in I/O data transfer statements unless they have defined
I/O procedures available (as type-bound or regular generics).
These cases are caught as errors by the I/O runtime library,
but it would be better if they were flagged during compilation.
(Address comment in review: don't use explicit name string lengths.)
Differential Revision: https://reviews.llvm.org/D120675