Simplify the test case to make it easier to look at. Change from auto-generated
checks to targeted manual checks to reduce sensitivity to register allocation
and scheduling changes.
Differential Revision: https://reviews.llvm.org/D111333
Average pool assumed the same input/output type. Result type for integers
is always an i32, should be updated appropriately.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D111590
Noticed in code review
4318028cd2 (commitcomment-57738034)
But the issue had already been fixed in
943b304848 due to a code checking tool
(PVS studio) identification, but that lacked test coverage.
Refactor this test a little bit too by using more CHECK-SAME to help the
checks fail sooner (rather than, if the addrx or sizes are wrong, having
that check bind to a much later output line - and then fail due to the
implicit-check-nots, which don't provide a lot of information about
where the intended check was likely to land) & more informatively.
Also modify the test to be more robust (current IR generation doesn't
include call sites for callees that are only declared but not defined -
so the test case couldn't be regenerated - add a function definition (&
optnone attribute) so it doesn't depend on call sites for
declared-but-not-defined functions)
This patch fixes another crash revealed by PR51614:
when *deciding* to vectorize with masked interleave groups, check if the access
is reverse (which is currently not supported).
Differential Revision: https://reviews.llvm.org/D108900
This adds strncat to llvm libc. In addition, an error was found with
strcat and that was fixed.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D111583
Adapt CodegenStartegy to used the vector transfer lowering patterns by default.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111649
unlock corrupted backets by using s set by loop to nullptr.
Also StackDepot supports iterating without locking.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D111599
This patch fixes a couple of small oversights in the documentation for
the datalayout specification:
* The v and f specifications are subject to the same constraints on <size>
as i is.
* The p[n] specification didn't mark <idx> as optional, despite
being documented and parsed as such.
* Similarly, none of the alignment specifications require <pref>.
If another inlining session came after a ModuleInlinerWrapperPass, the
advisor alanysis would still be cached, but its Result would be cleared.
We need to clear both.
This addresses PR52118
Differential Revision: https://reviews.llvm.org/D111586
Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.
Differential Revision: https://reviews.llvm.org/D111655
CFGBuilder::addStmt() implicitly passes AddStmtChoice::AlwaysAdd
to Visit() already, so this should have no behavior change.
Differential Revision: https://reviews.llvm.org/D111570
If I remember correctly this wasn't done previously because dim used to
be in the memref dialect.
Differential Revision: https://reviews.llvm.org/D111651
The new pass walks kernel's pointer arguments, then loads from them.
If a loaded value is a pointer and loaded pointer is unmodified in
the kernel before the load, then promote loaded pointer to global.
Then recursively continue.
Differential Revision: https://reviews.llvm.org/D111464
This may not be obvious, but Alive2 agrees:
https://alive2.llvm.org/ce/z/Ld9qNT
If the mul has "nsw", then -1 * INT_MIN is poison, so the
negate can also have "nsw" because 0 - INT_MIN is poison.
If the mul has "nuw", then that means the "OtherOp" can only
be 0 or 1 (anything else multiplied by 0xfff... would wrap).
So the replacement negate must be "nsw" because it is either
"0-0" or "0-1".
This is another regression noticed with a planned follow-up
to D111410.
As noted in https://github.com/halide/Halide/pull/6302,
we hilariously fail to match PAVG if we even as much
as look at it the wrong way.
In this particular case, the problem stems from the fact that
`PAVG` root (def) is a `trunc`, and leafs (uses) are `zext`'s,
and InstCombine really loves to get rid of both of these,
for example replace them with a bit mask. So we may not have
said `zext`.
Instead of checking for that + type match,
i think we should rely on the actual active type,
as per the knownbits.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111571
As reported in https://bugs.llvm.org/show_bug.cgi?id=48145, name resolution for omp critical construct was failing. This patch adds functionality to help that name resolution as well as implementation to catch name mismatches.
The following semantic restrictions are therefore handled here:
- If a name is specified on a critical directive, the same name must also be specified on the end critical directive
- If no name appears on the critical directive, no name can appear on the end critical directive
- If a name appears on either the start critical directive or the end critical directive
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110502
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.
Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110847
To get proper wrap-around behavior for the various kind parameter
values of the optional COUNT= and COUNT_MAX= dummy arguments to
the intrinsic subroutine SYSTEM_CLOCK, add an extra argument to
the APIs for lowering to pass the integer kind of the actual argument.
Avoid confusion by requiring that both actual arguments have the same
kind when both are present. The results of the runtime functions
remain std::int64_t and lowering should still convert them before
storing to the actual argument variables.
Rework the implementation a bit to accomodate the dynamic
specification of the kind parameter, and to clean up some coding
issues with preprocessing and templates.
Use the kind of the COUNT=/COUNT_MAX= actual arguments to determine
the clock's resolution, where possible, in conformance with other
Fortran implementations.
Differential Revision: https://reviews.llvm.org/D111281
The operand of the second any_of in EnforceSmallerThan should be
B not S like the FP code in the if below.
Unfortunately, fixing that causes an infinite loop in the build
of RISCV. So I've added a workaround for that as well.
Fixes PR44768.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111502
Added support of a "--nvlink-path" option in clang-nvlink-wrapper which
takes the path of nvlink binary.
Static Device Library support for OpenMP (D105191) now searches for
nvlink binary and passes its location via this option. In absence
of this option, nvlink binary is searched in locations in PATH.
Differential Revision: https://reviews.llvm.org/D111488
Fixes the tests added in D110852 for the debug iterators.
Similar issues with hijacking `operator&` still exist, they will be
addressed separately.
Reviewed By: #libc, ldionne, Quuxplusone
Differential Revision: https://reviews.llvm.org/D111564
These "dump" methods call into MachineOperand::dump, which doesn't exist
with NDEBUG, thus we croak. Disable LiveDebugValues dump methods when
NDEBUG is turned on to avoid this.
This header was transitively included to provide the definition of
__lc_ctype_ptr that we use on AIX, but that is fragile as it depends on
the settings of compatibility macros, so we explicitly include it here
to avoid that scenario.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D111239
Some random changes that were hanging around in my workspace. Also,
a tiny step towards creating a header file for the sparse utils lib.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111589
While looking at LWG-2988 and P0558 it seems the issues were already
implemented, but the synopsis wasn't updated. Some of the tests didn't
validate the `noexcept` status. A few tests were missing completely:
- `atomic_wait_explicit`
- `atomic_notify_one`
- `atomic_notify_all`
Mark P0558 as complete, didn't investigate which version of libc++ first
includes this. It seems the paper has been retroactively applied. I
couldn't find whether this is correct, but looking at cppreference it
seems intended.
Completes
- LWG-2988 Clause 32 cleanup missed one typename
- P0558 Resolving atomic<T> named base class inconsistencies
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D103765
This allows picking up on mingw triples that often use 'w64' instead
of 'pc' as the vendor part.
Differential Revision: https://reviews.llvm.org/D111297
With the -early-live-intervals command line flag,
TwoAddressInstructionPass::runOnMachineFunction would call
MachineFunction::verify before returning to check the live intervals.
But there was not much benefit to doing this since -verify-machineinstrs
and LLVM_ENABLE_EXPENSIVE_CHECKS provide a more general way of
scheduling machine verification after every pass.
Also it caused problems on targets like Lanai which are marked as "not
machine verifier clean", since verification would fail for known
target-specific problems which are nothing to do with LiveIntervals.
Differential Revision: https://reviews.llvm.org/D111618
This has a couple of benefits:
1. It can sometimes fix clusters that got broken apart when the register
allocator inserted a copy.
2. Post-RA scheduling does not have to worry about increasing register
pressure, which in some cases gives it more freedom to reorder
instructions.
Testing on a collection of 10,000 graphics shaders compiled for gfx1010
showed:
- The average length of each run of one or more load instructions
increased by about 1%.
- The number of runs of two or more load instructions increased by
about 4%.
This patch shifts the InstrRefBasedLDV class declaration to a header.
Partially because it's already massive, but mostly so that I can start
writing some unit tests for it. This patch also adds the boilerplate for
said unit tests.
Differential Revision: https://reviews.llvm.org/D110165
Add a switch to code gen strategy to disable/enable the vector transfer lowering and disable it by default.
Differential Revision: https://reviews.llvm.org/D111647