Fix test_platform_file_fstat to correctly truncate/max out the expected
value when GDB Remote Serial Protocol specifies a value as an unsigned
integer but the underlying platform type uses a signed integer.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128042
Make the AVX/MPX register tests more robust by checking for the presence
of actual registers rather than register sets. Account for the option
that the respective registers are defined but not available, as is
the case on FreeBSD and NetBSD. This fixes test regression on these
platforms.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128041
Refactor GDBRemoteCommunicationServerLLGS::SendStopReasonForState()
to accept process as an argument rather than hardcoding
m_current_process, in order to make it work correctly for multiprocess
scenarios.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127497
Refactor SendStopReplyPacketForThread() to accept process instance
as a parameter rather than use m_current_process. This future-proofs
it for multiprocess support.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127289
This change removes an explicit scalable vector bailout for fshl and fshr. This bailout was added in 60e4698b9a, when sinking a unconditional bailout for all intrinsics into selected cases. Its not clear if the bailout was originally unneeded, or if our cost model infrastructure has simply matured in the meantime. Either way, the generic code appears to handle scalable vectors without issue.
Note that the RISC-V cost model changes here aren't particularly interesting. They do probably better match the current lowering, but the main point is to have coverage of the BasicTTI path and simply show lack of crashing.
AArch64 costing was changed to preserve legacy behavior. There will most likely be an upcoming change to use the generic costs there too, but I didn't want to make that change not being particularly familiar with the target.
Differential Revision: https://reviews.llvm.org/D127680
The code being removed is technically correct; if we end up with two VL=0 instructions next to each other, we can avoid a state transition if the second is a scalar move. However, since both ops are also nops, we should simply delete them instead. As such, this compatibility rule simply complicates the code for no purpose.
Depending on the environment, a floating point instruction should
treat denormal inputs as zero, and/or flush a denormal output to zero.
Denormals are not currently accounted for when an instruction gets
folded to a constant, which can lead to differences in output between
a folded and a unfolded instruction when running on the target. The
denormal handling mode can be set by the function level attribute
denormal-fp-math, which this patch uses to determine whether any
denormal inputs to or outputs from folding should be zero, and that
the sign is set appropriately.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D116952
The example wouldn't compile, and used an invalid case style for a
function.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D128176
In order to support newer hardware, define wrappers around MFMA
intrinsics that have not previously been exposed in the ROCDL dialect.
A `amdgpu.mfma` wrapper around these instructions is in development
and will provide a more user-friendly interface to them.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D128079
When working through correctness issues in this pass, I moved a number of transforms which were phrased as mutating prior vsetvli instructions out of the main data flow because mutating prior instructions can invalidate the running dataflow results in subtle ways. We ended up creating both a prepass and a post-pass.
After consideration, I believe the prepass to be redundant, and this change removes it by folding it back into the data flow via a key conceptual change. Instead of phrasing the mutations on instructions, we can phrase them on abstract states. This avoids the dataflow inconsistency problem mentioned above by simply propagating the potential change forward, and thus reflecting its results in the dataflow. Critically, we do so without modifying existing VSETVLI instructions; some of the data flow steps include non-local IR analysis.
Compile time wise, this removes a linear pass, but has the potential to increase the number of iterations for the data flow to converge. That's not a algorithmic complexity change, the needVSETVLI mechanism has the same effect. In practice, I don't see this triggering more iterations, so I think it's likely to be a net win overall. (I didn't do any careful analysis here; just an impression from glancing at a couple tests.)
This has the potential to produce better results, so this isn't strictly speaking NFC.
Differential Revision: https://reviews.llvm.org/D127870
This is to fix the following error on https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto:
BranchProbability.h:236:34: error: declaration of 'distance' must be imported from module 'std.iterator.__iterator.distance' before it is required
In D127983, I had flipped from using the computed EEW to using the SEW value pulled from the VSETVLI when checking compatibility. This wasn't intentional, though thankfully it appears to be a non-functional difference. The new code does make a unchecked assumption that the initial SEW operand on the load/store is the EEW. This patch clarifies the assumption, and adds an assert to make sure this remains true.
Differential Revision: https://reviews.llvm.org/D128085
These tests demonstrate cases where the constant produced by folding
a floating point instruction should differ based on the denormal
handling mode set in function attributes.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D125807
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128186
Co-authored-by: Peter Steinfeld <psteinfeld@nvidia.com>
Make libomptarget.device.a built when using -DLLVM_ENABLE_PROJECTS=openmp
Use add_custom_command.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D128130
The SME zero instruction takes a mask as an input declaring which
64-bit element tiles should be zeroed. There is a 1:1 mapping
between the zero intrinsic and the instruction, however we also
want to make the register allocator aware that some tile
registers are being written to.
We can actually just use the custom inserter for a pseudo instruction
to correctly mark all the appropriate registers in the mask as
implicitly defined by the operation.
Differential Revision: https://reviews.llvm.org/D127843
Without this change, code such as "f(/*param=*/1_op)" will check the
comment twice, once for the parameter of f (correct) and once for
the parameter of operator""_op (likely incorrect). The change removes
only the second check.
Reviewed By: njames93, LegalizeAdulthood
Differential Revision: https://reviews.llvm.org/D125885
Include the process identifier in the `T` stop responses when
multiprocess extension is enabled (i.e. prepend it to the thread
identifier). Use the exposed identifier to simplify the fork-and-follow
tests.
The LLDB client accounts for the possible PID since the multiprocess
extension support was added in b601c67192.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127192
Include the process identifier in W/X stop reasons when multiprocess
extensions are enabled.
The LLDB client does not support process identifiers there at the moment
but it parses packets in such a way that their presence does not cause
any problems.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127191
Currently the backtrace emitted on windows when llvm-symbolizer is not
available includes addresses which cannot be easily decoded because
the addresses have the containing module's run-time base address added
into them, but we don't know what those base addresses are. This
change emits a module offset rather than an address.
There are a couple of related changes which were included as a result
of the review discussion for this patch:
- I have also removed the parameter printing as it adds noise to the
dump and doesn't seem useful.
- I have added the exception code to the backtrace.
Differential Review: https://reviews.llvm.org/D127915
LoopPeel add new incoming values to exit phi nodes which can change the
SCEV for the phi after 20d798bd47.
Forget SCEVs for such phis.
Fixes#56044.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D128164