This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).
FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.
Others require commuting operands or multiple instructions.
STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.
STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.
There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D116694
This avoids the InlineAdvisor carrying the responsibility of deleting
Function objects. We use LazyCallGraph::Node objects instead, which are
stable in memory for the duration of the Module-wide performance of CGSCC
passes started under the same ModuleToPostOrderCGSCCPassAdaptor (which
is the case here)
Differential Revision: https://reviews.llvm.org/D116964
This fixes bug52896.
Simply, some symmetric transfer optimization chances get invalided due
to we delete some inlined optimization passes in 822b92a. This would
cause stack-overflow in some situations which should be avoided by the
design of coroutine. This patch tries to fix this by transforming the
constant CmpInst instruction which was done in the deleted passes.
Reviewed By: rjmccall, junparser
Differential Revision: https://reviews.llvm.org/D116327
Extract the `readNativeFile()` loop from
`MemoryBuffer::getMemoryBufferForStream()` into `readNativeFileToEOF()`
to allow reuse. The chunk size is configurable; the default of `4*4096`
is exposed as `sys::fs::DefaultReadChunkSize` to allow sizing of
SmallVectors.
There's somewhere I'd like to read a usually-small file without overhead
of a MemoryBuffer; extracting existing logic rather than duplicating it.
Differential Revision: https://reviews.llvm.org/D115397
Replace a `reserve()`/`set_size()` pair with `resize_for_overwrite()`
and `truncate()`. The out parameter also needs a `clear()` call on the
error path.
Differential Revision: https://reviews.llvm.org/D115389
Replace a few `reserve()` / `set_size()` pairs with
`resize_for_overwrite()` / `truncate()` in the platform-specific
code for Windows.
Differential Revision: https://reviews.llvm.org/D115390
Update UnresolvedSet to use (and expose) `SmallVector::truncate()` instead
of `SmallVector::set_size()`. The latter is going to made private in a
future commit to avoid misuse.
Differential Revision: https://reviews.llvm.org/D115386
After {D115416}, the "Write map file" event no longer shows up
in the time trace. Each time trace profiler instance is thread-local,
but we had neglected to initialize a separate instance for the mapfile
worker thread.
Reviewed By: keith
Differential Revision: https://reviews.llvm.org/D117069
Update `SmallString::append()` to use `resize_for_overwrite()` instead
of `reserve()` followed by `set_size()`.
Differential Revision: https://reviews.llvm.org/D115382
Update `variadicMatcherDescriptor` to assert on reserved capacity and
to call `emplace_back()` instead of calling `set_size()` and constructing
the element in-place.
Differential Revision: https://reviews.llvm.org/D115379
Move `iter_swap.pass.cpp` into a new subdirectory: `iterator.cust.swap`
for symmetry with the neighboring subdirectory `iterator.cust.move`.
Differential Revision: https://reviews.llvm.org/D116992
D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion
and mirror ELF.
Reviewed By: #lld-macho, Jez Ng
Differential Revision: https://reviews.llvm.org/D116914
This happens in e.g. regalloc, where we trace decisions per function,
but wouldn't want to spew N log files (i.e. one per function). So we
output a key-value association, where the key is an ID for the
sub-module object, and the value is the tensorflow::SequenceExample.
The current relation with protobuf is tenuous, so we're avoiding a
custom message type in favor of using the `Struct` message, but that
requires the values be wire-able strings, hence base64 encoding.
We plan on resolving the protobuf situation shortly, and improve the
encoding of such logs, but this is sufficient for now for setting up
regalloc training.
Differential Revision: https://reviews.llvm.org/D116985
The code generation for the UBSan VLA size check was qualified by a con-
dition that the parameter must be a signed integer, however the C spec
does not make any distinction that only signed integer parameters can be
used to declare a VLA, only qualifying that it must be greater than zero
if it is not a constant.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D116048
This adds inline function support to NativePDB by parsing S_INLINESITE records
to retrieve inlinee line info and add them into line table at `ParseLineTable`.
Differential Revision: https://reviews.llvm.org/D116845
Given a while loop whose condition is given by a cmp, don't recomputed the comparison (or its inverse) in the after region, instead use a constant since the original condition must be true if we branched to the after region.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D117047
These tests are now green:
```
Trace.MultiPart
Trace.RestoreAccess
Trace.RestoreMutexLock
TraceAlloc.FinishedThreadReuse
TraceAlloc.FinishedThreadReuse2
TraceAlloc.SingleThread
```
rdar://82107856
D114294 <https://reviews.llvm.org/D114294> broke the Solaris buildbots:
/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:613:29: error: use of undeclared identifier 'NT_GNU_BUILD_ID'
if (nhdr->n_type == NT_GNU_BUILD_ID && nhdr->n_namesz == kGnuNamesz) {
^
Like D107556 <https://reviews.llvm.org/D107556>, it forgot that
`NT_GNU_BUILD_ID` is an unportable GNU extension.
Fixed by making the code conditional on the definition of the macro.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D117051
Print a warning if `llvm-libtool-darwin` if any of the object
files provided by the user have the same file name.
The tool will now print a warning if there is a name collision across:
* Two object files
* An object file and an object file from within a static library
* Two object files from different static libraries
Here is an example of the error:
```
$ llvm-libtool-darwin -static -o archive.a out.o out.o
error: file 'out.o' was specified multiple times.
in: out.o
in: out.o
$ llvm-libtool-darwin -static -o archive.a out.o
$ llvm-libtool-darwin -static -o combined.a archive.a out.o
error: file 'out.o' was specified multiple times.
in: archive.a
in: out.o
```
This change mimics apple's cctools libtool's behavior which always shows a warning in such cases.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D113130
Here we are refactoring the code to encapsulate data into classes. This allows
us to avoid passing the same objects through many functions that don't directly
use them. Now, functions that need to access data can do so from the class
state.
Reviewed By: jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D113127
This creates a way to configure MSAN to for eager checks that will be leveraged
by the introduction of a clang flag (-fsanitize-memory-param-retval).
This is redundant with the existing flag: -mllvm -msan-eager-checks.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D116855
Introduce `__fits_in_sso()` to put the constexpr tests into a central place.
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D116487
There are a lot of
```
#if _LIBCPP_DEBUG_LEVEL == 2
__get_db()->__insert_c(this);
#endif
```
This patch introduces `__debug_db_insert_c()` to put the `#if` in one central place.
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D116947
Prior to this patch, clang might suggest a deprecated name of a declaration over another name as the only mechanism for resolving two typo corrections referring to the same underlying declaration has previously been an alphabetical sort.
This patch adjusts this resolve by also taking into account whether one of two declarations are deprecated. If the new one is deprecated it may not replace a previous correction with a non-deprecated correction and a previous deprecated correction always gets replaced by a non-deprecated new correction.
Fixes https://github.com/llvm/llvm-project/issues/47272
Differential Revision: https://reviews.llvm.org/D116775
memory-barrier instructions to providing targets and developers a convenient
way to explicitly declare which instructions are memory-barriers.
Differential Revision: https://reviews.llvm.org/D116779
Most convolution operations need explicit padding of the input to
ensure all accesses are inbounds. In such cases, having a pad
operation can be a significant overhead. One way to reduce that
overhead is to try to fuse the pad operation with the producer of its
source.
A sequence
```
linalg.generic -> linalg.pad_tensor
```
can be replaced with
```
linalg.fill -> tensor.extract_slice -> linalg.generic ->
tensor.insert_slice.
```
if the `linalg.generic` has all parallel iterator types.
Differential Revision: https://reviews.llvm.org/D116418
This reverts commit 640beb38e7.
That commit caused performance degradtion in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort).
Reverting until we have a better solution to s_cselect_b64 codegen cleanup
Change-Id: Ifc167b3c2dae7a65920676f22a97ba76485f3456
Reviewed By: kzhuravl
Differential Revision: https://reviews.llvm.org/D116686
Change-Id: I1abf49b74a7e2ba0e0205f747a4154a468b9d7f2
I noticed that the following case would compile in Clang but not GCC:
void *x(void) {
void *p = &&foo;
asm goto ("# %0\n\t# %l1":"+r"(p):::foo);
foo:;
return p;
}
Changing the output template above from %l2 would compile in GCC but not
Clang.
This demonstrates that when using tied outputs (say via the "+r" output
constraint), the hidden inputs occur or are numbered BEFORE the labels,
at least with GCC.
In fact, GCC does denote this in its documentation:
https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels
> Output operand with constraint modifier ‘+’ is counted as two operands
> because it is considered as one output and one input operand.
For the sake of compatibility, I think it's worthwhile to just make this
change.
It's better to use symbolic names for compatibility (especially now
between released version of Clang that support asm goto with outputs).
ie. %l1 from the above would be %l[foo]. The GCC docs also make this
recommendation.
Also, I cleaned up some cruft in GCCAsmStmt::getNamedOperand. AFAICT,
NumPlusOperands was no longer used, though I couldn't find which commit
didn't clean that up correctly.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98096
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103640
Link: https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D115471