request_irq() is preferred over setup_irq(). Invocations of setup_irq()
occur after memory allocators are ready.
Per tglx[1], setup_irq() existed in olden days when allocators were not
ready by the time early interrupts were initialized.
Hence replace setup_irq() by request_irq().
[1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos
Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com>
Message-Id: <20200304005049.5291-1-afzal.mohd.ma@gmail.com>
[heiko.carstens@de.ibm.com: replace pr_err with panic]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Once the call to qdio_establish() has completed, qdio is free to deliver
data IRQs to the device driver's IRQ poll handler.
For qeth (the only qdio driver that currently uses IRQ polling) this is
problematic, since the IRQs can arrive before its NAPI instance is
even registered. Calling napi_schedule() from qeth_qdio_start_poll()
then crashes in various nasty ways.
Until recently qeth checked for IFF_UP to drop such early interrupts,
but that's fragile as well since it doesn't enforce any ordering.
Fix this properly by bringing up the qdio device in IRQS_DISABLED mode,
and have the driver explicitly opt-in to receive data IRQs.
qeth does so from qeth_open(), which kick-starts a NAPI poll and then
calls qdio_start_irq() from qeth_poll().
Also add a matching qdio_stop_irq() in qeth_stop() to switch the qdio
dataplane back into a disabled state.
Fixes: 3d35dbe622 ("s390/qeth: don't check for IFF_UP when scheduling napi")
CC: Qian Cai <cai@lca.pw>
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While we print out various SSQD fields at initialization time, having
raw & full access to the current SSQD can help with debugging.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Link: https://lkml.kernel.org/r/20200221150612.GA9717@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
There's no need for error handling, the debugfs core is smart enough to
deal with IS_ERR() internally.
This will also keep us from creating the debugfs files if the device
directory doesn't exist. Currently (because irq_ptr->debugfs_dev gets set
to NULL on error) the files would be placed into the debugfs root - without
any association to their parent device.
On teardown, use the debugfs_remove_recursive() helper to avoid keeping
track of each created file/directory.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Don't rely on the numeric value of enum constants.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Remove all usage of cdev->private->qdio_data that's buried deep in
internal code. This should only be used by the exported driver API,
which can then pass around a proper qdio_irq pointer.
Also trivially merge some initializations with their definitions.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Some parts use init_data->cdev, others use irq_ptr->cdev. In the end
it's all the same, but unnecessarily confusing.
Use a single reference instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
As the comment says, sl->sbal holds an absolute address. qeth currently
solves this through wild casting, while zfcp doesn't care.
Handle this properly in the code that actually builds the SL.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com> [for qdio]
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The only way to reach this allocation is via
qdio_establish()
qdio_detect_hsicq()
qdio_enable_async_operation()
and since qdio_establish() uses wait_event_*() just a few lines ealier,
we can trust that it certainly is never called from atomic context.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Current code uses a 'polling' flag to keep track of whether an Input
Queue has any ACKed SBALs. QEBSM devices might have multiple ACKed
SBALs, and those are tracked separately with 'ack_count'.
By also setting ack_count for non-QEBSM devices (to a fixed value of 1),
we can use 'ack_count != 0' as replacement for the polling flag.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This patch corrects the SPDX License Identifier style in
header file related to S/390 common i/o drivers.
It assigns explicit block comment to the SPDX License
Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Fixes: 3cd90214b7 ("vfio: ccw: add tracepoints for interesting error paths")
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Message-Id: <20191225122054.GA4598@nishad>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The max data count (mdc) is an unsigned 16-bit integer value as per AR
documentation and is received via ccw_device_get_mdc() for a specific
path mask from the CIO layer. The function itself also always returns a
positive mdc value or 0 in case mdc isn't supported or couldn't be
determined.
Though, the comment for this function describes a negative return value
to indicate failures.
As a result, the DASD device driver interprets the return value of
ccw_device_get_mdc() incorrectly. The error case is essentially a dead
code path.
To fix this behaviour, check explicitly for a return value of 0 and
change the comment for ccw_device_get_mdc() accordingly.
This fix merely enables the error code path in the DASD functions
get_fcx_max_data() and verify_fcx_max_data(). The actual functionality
stays the same and is still correct.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull networking updates from David Miller:
"Another merge window, another pull full of stuff:
1) Support alternative names for network devices, from Jiri Pirko.
2) Introduce per-netns netdev notifiers, also from Jiri Pirko.
3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara
Larsen.
4) Allow compiling out the TLS TOE code, from Jakub Kicinski.
5) Add several new tracepoints to the kTLS code, also from Jakub.
6) Support set channels ethtool callback in ena driver, from Sameeh
Jubran.
7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED,
SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long.
8) Add XDP support to mvneta driver, from Lorenzo Bianconi.
9) Lots of netfilter hw offload fixes, cleanups and enhancements,
from Pablo Neira Ayuso.
10) PTP support for aquantia chips, from Egor Pomozov.
11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From
Josh Hunt.
12) Add smart nagle to tipc, from Jon Maloy.
13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat
Duvvuru.
14) Add a flow mask cache to OVS, from Tonghao Zhang.
15) Add XDP support to ice driver, from Maciej Fijalkowski.
16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak.
17) Support UDP GSO offload in atlantic driver, from Igor Russkikh.
18) Support it in stmmac driver too, from Jose Abreu.
19) Support TIPC encryption and auth, from Tuong Lien.
20) Introduce BPF trampolines, from Alexei Starovoitov.
21) Make page_pool API more numa friendly, from Saeed Mahameed.
22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni.
23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits)
libbpf: Fix usage of u32 in userspace code
mm: Implement no-MMU variant of vmalloc_user_node_flags
slip: Fix use-after-free Read in slip_open
net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
macvlan: schedule bc_work even if error
enetc: add support Credit Based Shaper(CBS) for hardware offload
net: phy: add helpers phy_(un)lock_mdio_bus
mdio_bus: don't use managed reset-controller
ax88179_178a: add ethtool_op_get_ts_info()
mlxsw: spectrum_router: Fix use of uninitialized adjacency index
mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels
bpf: Simplify __bpf_arch_text_poke poke type handling
bpf: Introduce BPF_TRACE_x helper for the tracing tests
bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT
bpf, testing: Add various tail call test cases
bpf, x86: Emit patchable direct jump as tail call
bpf: Constant map key tracking for prog array pokes
bpf: Add poke dependency tracking for prog array maps
bpf: Add initial poke descriptor table for jit images
bpf: Move owner type, jited info into array auxiliary data
...
- Adjust PMU device drivers registration to avoid WARN_ON and few other
perf improvements.
- Enhance tracing in vfio-ccw.
- Few stack unwinder fixes and improvements, convert get_wchan custom
stack unwinding to generic api usage.
- Fixes for mm helpers issues uncovered with tests validating architecture
page table helpers.
- Fix noexec bit handling when hardware doesn't support it.
- Fix memleak and unsigned value compared with zero bugs in crypto
code. Minor code simplification.
- Fix crash during kdump with kasan enabled kernel.
- Switch bug and alternatives from asm to asm_inline to improve inlining
decisions.
- Use 'depends on cc-option' for MARCH and TUNE options in Kconfig,
add z13s and z14 ZR1 to TUNE descriptions.
- Minor head64.S simplification.
- Fix physical to logical CPU map for SMT.
- Several cleanups in qdio code.
- Other minor cleanups and fixes all over the code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3ahGYACgkQjYWKoQLX
FBguwAgAig+FNos8zkd7Sr2wg4DPL2IlYERVP40fOLXfGuVUOnMLg8OTO6yDWDpH
5+cKAQS1wWgyvlfjWRUJ6anXLBAsgKRD1nyFIZTpn/wArGk/duCbnl/VFriDgrST
8KTQDJpZ9w9nXtQ7lA2QWaw5U2WG8I2T2JuQJCdLXze7RXi0bDVe8e6131NMaJ42
LLxqOqm8d8XDnd8oDVP04LT5IfhuI2cILoGBP/GyI2fqQk9Ems6M2gxuISq1COmy
WORDLfwWyCLeF7gWKKjxf8Vo1HYcyoFvdXnxWiHb0TDZesQZJr/LLELTP03fbCW9
U4jbXncnnPA7kT4tlC95jT5M69yK5w==
=+FxG
-----END PGP SIGNATURE-----
Merge tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Adjust PMU device drivers registration to avoid WARN_ON and few other
perf improvements.
- Enhance tracing in vfio-ccw.
- Few stack unwinder fixes and improvements, convert get_wchan custom
stack unwinding to generic api usage.
- Fixes for mm helpers issues uncovered with tests validating
architecture page table helpers.
- Fix noexec bit handling when hardware doesn't support it.
- Fix memleak and unsigned value compared with zero bugs in crypto
code. Minor code simplification.
- Fix crash during kdump with kasan enabled kernel.
- Switch bug and alternatives from asm to asm_inline to improve
inlining decisions.
- Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add
z13s and z14 ZR1 to TUNE descriptions.
- Minor head64.S simplification.
- Fix physical to logical CPU map for SMT.
- Several cleanups in qdio code.
- Other minor cleanups and fixes all over the code.
* tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
s390/cpumf: Adjust registration of s390 PMU device drivers
s390/smp: fix physical to logical CPU map for SMT
s390/early: move access registers setup in C code
s390/head64: remove unnecessary vdso_per_cpu_data setup
s390/early: move control registers setup in C code
s390/kasan: support memcpy_real with TRACE_IRQFLAGS
s390/crypto: Fix unsigned variable compared with zero
s390/pkey: use memdup_user() to simplify code
s390/pkey: fix memory leak within _copy_apqns_from_user()
s390/disassembler: don't hide instruction addresses
s390/cpum_sf: Assign error value to err variable
s390/cpum_sf: Replace function name in debug statements
s390/cpum_sf: Use consistant debug print format for sampling
s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr()
s390: add error handling to perf_callchain_kernel
s390: always inline current_stack_pointer()
s390/mm: add mm_pxd_folded() checks to pxd_free()
s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
s390/mm: simplify page table helpers for large entries
s390/mm: make pmd/pud_bad() report large entries as bad
...
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAl3JYRgSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vwqMP/202cRtkkJ8Z21XPoCXAnA+pB08x79gE
buVYpPfb6GzKNYyaeDZ0X8DZzfVLaaWz4kr1B1FUGLJKVdtSXEMjKyj4ho8fkMle
aV7EZnswWI0ly+ckNy6UTm36MaRha6YK8lLQavuFcZpzhK2342xFxer6TZuc2yuK
qg+2veuxJH1s14pQCwLSrAaNSnoxheT0snDpb+70FZB4L+J4lEjCx/KqG+wP/sdw
l7ybfwSLu29yD9HDh6SrAjydZLTIsMIIY/1nTG5Gk/pYl70vIXZ7PjiMIxSxnPEe
U+RZ84ytbtVHK1SBo284AJziXaAdwASNQbOovishYAKQW3o3EsI7OomTtPSIFypU
KdMGWqJIxpFeEQ7s5ZEjE4gRz9hB16aveOLRd4A8sCVNP+vXJnILNyrxKtL4UP+a
+kP/UxIntJsvBKC9t83zxAlI07a+BcHgVw9bMtWcJaA6Q9WXwoeesE5gwYLpsQSs
tLlRrLKFIs8Y+mJmHiaBKXyipH51n+SvxWK4BZs9tfH2q3hjO7Tm5GIuJ852NTKq
XOk4CRRh16JsW9xEO9RndZr+b32KICAPov3IxpgxFKUVUr9/KV/TUw3NY0rECo8R
f+qraNETGDPsdMZ/tPbYuhdFyMd4eqRkOC2CPC9W7A/lhMYUNyM+nROVdeT3r5rm
t69OVIUYhfTD
=ZQte
-----END PGP SIGNATURE-----
Merge tag 'vfio-ccw-20191111' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
enhance tracing in vfio-ccw
* tag 'vfio-ccw-20191111' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw:
vfio-ccw: Rework the io_fctl trace
vfio-ccw: Add a trace for asynchronous requests
vfio-ccw: Trace the FSM jumptable
vfio-ccw: Refactor how the traces are built
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This allows IQD drivers to send out multiple SBALs with a single SIGA
instruction.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Output interrupts are not subject to SLSB-based avoidance, so remove the
gratuitous SLSB updates for Output SBALs in ERROR state.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
On an interrupt, tiqdio_thinint_handler() walks a list of all objects
that might require attention, and checks their DSCI. This list is
awkwardly built from Input Queues, even though the IRQs are per-device
and the queue is then only used to dereference its qdio_irq parent.
To simplify the logic, change the code so that tiq_list contains
qdio_irq entries.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
qperf_inc() takes a queue as input, but actually updates the statistics
in its qdio_irq parent.
In some contexts we already have access to the qdio_irq struct, and can
avoid the additional dereference.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Shift the definition of tiqdio_airq around, so that it doesn't require a
forward declaration for tiqdio_thinint_handler().
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Partial EQBS completion is no significant event, and the WARN ends up
spamming the debug logs for no good reason.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
qdio.h recently gained a new helper macro that handles wrap-around on a
QDIO queue, use it.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Using __field_struct for the schib is convenient, but it doesn't
appear to let us filter based on any of the schib elements.
Specifying the full schid or any element within it results
in various errors by the parser. So, expand that out to its
component elements, so we can limit the trace to a single device.
While we are at it, rename this trace to the function name, so we
remember what is being traced instead of an abstract reference to the
function control bit of the SCSW.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-5-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Since the asynchronous requests are typically associated with
error recovery, let's add a simple trace when one of those is
issued to a device.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20191016142040.14132-4-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It would be nice if we could track the sequence of events within
vfio-ccw, based on the state of the device/FSM and our calling
sequence within it. So let's add a simple trace here so we can
watch the states change as things go, and allow it to be folded
into the rest of the other cio traces.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-3-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Commit 3cd90214b7 ("vfio: ccw: add tracepoints for interesting error
paths") added a quick trace point to determine where a channel program
failed while being processed. It's a great addition, but adding more
traces to vfio-ccw is more cumbersome than it needs to be.
Let's refactor how this is done, so that additional traces are easier
to add and can exist outside of the FSM if we ever desire.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-2-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Commit 37db8985b2 ("s390/cio: add basic protected virtualization
support") breaks virtio-ccw devices with VIRTIO_F_IOMMU_PLATFORM for non
Protected Virtualization (PV) guests. The problem is that the dma_mask
of the ccw device, which is used by virtio core, gets changed from 64 to
31 bit, because some of the DMA allocations do require 31 bit
addressable memory. For PV the only drawback is that some of the virtio
structures must end up in ZONE_DMA because we have the bounce the
buffers mapped via DMA API anyway.
But for non PV guests we have a problem: because of the 31 bit mask
guests bigger than 2G are likely to try bouncing buffers. The swiotlb
however is only initialized for PV guests, because we don't want to
bounce anything for non PV guests. The first such map kills the guest.
Since the DMA API won't allow us to specify for each allocation whether
we need memory from ZONE_DMA (31 bit addressable) or any DMA capable
memory will do, let us use coherent_dma_mask (which is used for
allocations) to force allocating form ZONE_DMA while changing dma_mask
to DMA_BIT_MASK(64) so that at least the streaming API will regard
the whole memory DMA capable.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Fixes: 37db8985b2 ("s390/cio: add basic protected virtualization support")
Link: https://lore.kernel.org/lkml/20190930153803.7958-1-pasic@linux.ibm.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The QIB parm area is 128 bytes long. Current code consistently misuses
an _entirely unrelated_ QDIO constant, merely because it has the same
value. Stop doing so.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
- Fix 3 kasan findings.
- Add PERF_EVENT_IOC_PERIOD ioctl support.
- Add Crypto Express7S support and extend sysfs attributes for pkey.
- Minor common I/O layer documentation corrections.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl2Mj18ACgkQjYWKoQLX
FBiiowf/RnU7K+VACpZ9D1SAELvkz6z3aht4Xr22gJtzy1nHsMz2dILNPxrt1NgT
56gO8iYprQ7Mjl2/D6Mk2HbgI5cKVcyEr8hPvRA2NsUaVNOqH6HWGjn0NV4LRWGm
rGXkFvWz0639qSGiQ4KgRdJSFfMQiKWstKdKwgWwnnwmxpa7QC7P42SA9YBO3h9f
17y9JqLN1w9iKgvnGdeJlmPebi15I9jIMHaU+ebGd6EJ4AxNWOED7s1iIhAgjtNZ
jbsVzVu8luM0QNSBcK5h+4YDaYflt3zpuQg+DJcLvokVNGGaTi/RBzeJ+L81Fpgh
5HAPlaIV/xkgnqE9bG9Tr6L3NyRgug==
=Wc/P
-----END PGP SIGNATURE-----
Merge tag 's390-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:
- Fix three kasan findings
- Add PERF_EVENT_IOC_PERIOD ioctl support
- Add Crypto Express7S support and extend sysfs attributes for pkey
- Minor common I/O layer documentation corrections
* tag 's390-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: exclude subchannels with no parent from pseudo check
s390/cio: avoid calling strlen on null pointer
s390/topology: avoid firing events before kobjs are created
s390/cpumf: Remove mixed white space
s390/cpum_sf: Support ioctl PERF_EVENT_IOC_PERIOD
s390/zcrypt: CEX7S exploitation support
s390/cio: fix intparm documentation
s390/pkey: Add sysfs attributes to emit AES CIPHER key blobs
ccw console is created early in start_kernel and used before css is
initialized or ccw console subchannel is registered. Until then console
subchannel does not have a parent. For that reason assume subchannels
with no parent are not pseudo subchannels. This fixes the following
kasan finding:
BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98
Read of size 8 at addr 00000000000005e8 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 #2
Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
Call Trace:
([<000000000012cd76>] show_stack+0x14e/0x1e0)
[<0000000001f7fb44>] dump_stack+0x1a4/0x1f8
[<00000000007d7afc>] print_address_description+0x64/0x3c8
[<00000000007d75f6>] __kasan_report+0x14e/0x180
[<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98
[<000000000189b950>] cio_enable_subchannel+0x1d0/0x510
[<00000000018cac7c>] ccw_device_recognition+0x12c/0x188
[<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340
[<0000000002cf1cbe>] con3215_init+0x25e/0x300
[<0000000002c8770a>] console_init+0x68a/0x9b8
[<0000000002c6a3d6>] start_kernel+0x4fe/0x728
[<0000000000100070>] startup_continue+0x70/0xd0
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Fix the following kasan finding:
BUG: KASAN: global-out-of-bounds in ccwgroup_create_dev+0x850/0x1140
Read of size 1 at addr 0000000000000000 by task systemd-udevd.r/561
CPU: 30 PID: 561 Comm: systemd-udevd.r Tainted: G B
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
([<0000000231b3db7e>] show_stack+0x14e/0x1a8)
[<0000000233826410>] dump_stack+0x1d0/0x218
[<000000023216fac4>] print_address_description+0x64/0x380
[<000000023216f5a8>] __kasan_report+0x138/0x168
[<00000002331b8378>] ccwgroup_create_dev+0x850/0x1140
[<00000002332b618a>] group_store+0x3a/0x50
[<00000002323ac706>] kernfs_fop_write+0x246/0x3b8
[<00000002321d409a>] vfs_write+0x132/0x450
[<00000002321d47da>] ksys_write+0x122/0x208
[<0000000233877102>] system_call+0x2a6/0x2c8
Triggered by:
openat(AT_FDCWD, "/sys/bus/ccwgroup/drivers/qeth/group",
O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 16
write(16, "0.0.bd00,0.0.bd01,0.0.bd02", 26) = 26
The problem is that __get_next_id in ccwgroup_create_dev might set "buf"
buffer pointer to NULL and explicit check for that is required.
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The common I/O layer is maintaining an "intparm" inspired by
the hardware intparm for driver usage. This "intparm" is not
only applicaple for ssch, but also for hsch/csch. The kerneldoc
states that it is only updated for hsch/csch if no prior request
is pending; however, this is not what the code does (whether
that would actually desireable is a different issue.)
Let's at least fix the kerneldoc for now.
Fixes: b2ffd8e9a7 ("[S390] cio: Add docbook comments.")
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Pull networking updates from David Miller:
1) Support IPV6 RA Captive Portal Identifier, from Maciej Żenczykowski.
2) Use bio_vec in the networking instead of custom skb_frag_t, from
Matthew Wilcox.
3) Make use of xmit_more in r8169 driver, from Heiner Kallweit.
4) Add devmap_hash to xdp, from Toke Høiland-Jørgensen.
5) Support all variants of 5750X bnxt_en chips, from Michael Chan.
6) More RTNL avoidance work in the core and mlx5 driver, from Vlad
Buslov.
7) Add TCP syn cookies bpf helper, from Petar Penkov.
8) Add 'nettest' to selftests and use it, from David Ahern.
9) Add extack support to drop_monitor, add packet alert mode and
support for HW drops, from Ido Schimmel.
10) Add VLAN offload to stmmac, from Jose Abreu.
11) Lots of devm_platform_ioremap_resource() conversions, from
YueHaibing.
12) Add IONIC driver, from Shannon Nelson.
13) Several kTLS cleanups, from Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1930 commits)
mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer
mlxsw: spectrum: Register CPU port with devlink
mlxsw: spectrum_buffers: Prevent changing CPU port's configuration
net: ena: fix incorrect update of intr_delay_resolution
net: ena: fix retrieval of nonadaptive interrupt moderation intervals
net: ena: fix update of interrupt moderation register
net: ena: remove all old adaptive rx interrupt moderation code from ena_com
net: ena: remove ena_restore_ethtool_params() and relevant fields
net: ena: remove old adaptive interrupt moderation code from ena_netdev
net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*()
net: ena: enable the interrupt_moderation in driver_supported_features
net: ena: reimplement set/get_coalesce()
net: ena: switch to dim algorithm for rx adaptive interrupt moderation
net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it
net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable
ethtool: implement Energy Detect Powerdown support via phy-tunable
xen-netfront: do not assume sk_buff_head list is empty in error handling
s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb”
net: ena: don't wake up tx queue when down
drop_monitor: Better sanitize notified packets
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQUwxxKyE5l/npt8ARiEGxRG/Sl2wUCXYAIeQAKCRBiEGxRG/Sl
2/SzAQDEnoNxzV/R5kWFd+2kmFeY3cll0d99KMrWJ8om+kje6QD/cXxZHzFm+T1L
UPF66k76oOODV7cyndjXnTnRXbeCRAM=
=Szby
-----END PGP SIGNATURE-----
Merge tag 'leds-for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
"In this cycle we've finally managed to contribute the patch set
sorting out LED naming issues. Besides that there are many changes
scattered among various LED class drivers and triggers.
LED naming related improvements:
- add new 'function' and 'color' fwnode properties and deprecate
'label' property which has been frequently abused for conveying
vendor specific names that have been available in sysfs anyway
- introduce a set of standard LED_FUNCTION* definitions
- introduce a set of standard LED_COLOR_ID* definitions
- add a new {devm_}led_classdev_register_ext() API with the
capability of automatic LED name composition basing on the
properties available in the passed fwnode; the function is
backwards compatible in a sense that it uses 'label' data, if
present in the fwnode, for creating LED name
- add tools/leds/get_led_device_info.sh script for retrieving LED
vendor, product and bus names, if applicable; it also performs
basic validation of an LED name
- update following drivers and their DT bindings to use the new LED
registration API:
- leds-an30259a, leds-gpio, leds-as3645a, leds-aat1290, leds-cr0014114,
leds-lm3601x, leds-lm3692x, leds-lp8860, leds-lt3593, leds-sc27xx-blt
Other LED class improvements:
- replace {devm_}led_classdev_register() macros with inlines
- allow to call led_classdev_unregister() unconditionally
- switch to use fwnode instead of be stuck with OF one
LED triggers improvements:
- led-triggers:
- fix dereferencing of null pointer
- fix a memory leak bug
- ledtrig-gpio:
- GPIO 0 is valid
Drop superseeded apu2/3 support from leds-apu since for apu2+ a newer,
more complete driver exists, based on a generic driver for the AMD
SOCs gpio-controller, supporting LEDs as well other devices:
- drop profile field from priv data
- drop iosize field from priv data
- drop enum_apu_led_platform_types
- drop superseeded apu2/3 led support
- add pr_fmt prefix for better log output
- fix error message on probing failure
Other misc fixes and improvements to existing LED class drivers:
- leds-ns2, leds-max77650:
- add of_node_put() before return
- leds-pwm, leds-is31fl32xx:
- use struct_size() helper
- leds-lm3697, leds-lm36274, leds-lm3532:
- switch to use fwnode_property_count_uXX()
- leds-lm3532:
- fix brightness control for i2c mode
- change the define for the fs current register
- fixes for the driver for stability
- add full scale current configuration
- dt: Add property for full scale current.
- avoid potentially unpaired regulator calls
- move static keyword to the front of declarations
- fix optional led-max-microamp prop error handling
- leds-max77650:
- add of_node_put() before return
- add MODULE_ALIAS()
- Switch to fwnode property API
- leds-as3645a:
- fix misuse of strlcpy
- leds-netxbig:
- add of_node_put() in netxbig_leds_get_of_pdata()
- remove legacy board-file support
- leds-is31fl319x:
- simplify getting the adapter of a client
- leds-ti-lmu-common:
- fix coccinelle issue
- move static keyword to the front of declaration
- leds-syscon:
- use resource managed variant of device register
- leds-ktd2692:
- fix a typo in the name of a constant
- leds-lp5562:
- allow firmware files up to the maximum length
- leds-an30259a:
- fix typo
- leds-pca953x:
- include the right header"
* tag 'leds-for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: (72 commits)
leds: lm3532: Fix optional led-max-microamp prop error handling
led: triggers: Fix dereferencing of null pointer
leds: ti-lmu-common: Move static keyword to the front of declaration
leds: lm3532: Move static keyword to the front of declarations
leds: trigger: gpio: GPIO 0 is valid
leds: pwm: Use struct_size() helper
leds: is31fl32xx: Use struct_size() helper
leds: ti-lmu-common: Fix coccinelle issue in TI LMU
leds: lm3532: Avoid potentially unpaired regulator calls
leds: syscon: Use resource managed variant of device register
leds: Replace {devm_}led_classdev_register() macros with inlines
leds: Allow to call led_classdev_unregister() unconditionally
leds: lm3532: Add full scale current configuration
dt: lm3532: Add property for full scale current.
leds: lm3532: Fixes for the driver for stability
leds: lm3532: Change the define for the fs current register
leds: lm3532: Fix brightness control for i2c mode
leds: Switch to use fwnode instead of be stuck with OF one
leds: max77650: Switch to fwnode property API
led: triggers: Fix a memory leak bug
...
- Add support for IBM z15 machines.
- Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring.
- Move to arch_stack_walk infrastructure for the stack unwinder.
- Various kasan fixes and improvements.
- Various command line parsing fixes.
- Improve decompressor phase debuggability.
- Lift no bss usage restriction for the early code.
- Use refcount_t for reference counters for couple of places in
mm code.
- Logging improvements and return code fix in vfio-ccw code.
- Couple of zpci fixes and minor refactoring.
- Remove some outdated documentation.
- Fix secure boot detection.
- Other various minor code clean ups.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl1/pRoACgkQjYWKoQLX
FBjxLQf/Y1nlmoc8URLqaqfNTczIvUzdfXuahI7L75RoIIiqHtcHBrVwauSr7Lma
XVRzK/+6q0UPISrOIZEEtQKsMMM7rGuUv/+XTyrOB/Tsc31kN2EIRXltfXI/lkb8
BZdgch4Xs2rOD7y6TvqpYJsXYXsnLMWwCk8V+48V/pok4sEgMDgh0bTQRHPHYmZ6
1cv8ZQ0AeuVxC6ChM30LhajGRPkYd8RQ82K7fU7jxT0Tjzu66SyrW3pTwA5empBD
RI2yBZJ8EXwJyTCpvN8NKiBgihDs9oUZl61Dyq3j64Mb1OuNUhxXA/8jmtnGn0ok
O9vtImCWzExhjSMkvotuhHEC05nEEQ==
=LCgE
-----END PGP SIGNATURE-----
Merge tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Add support for IBM z15 machines.
- Add SHA3 and CCA AES cipher key support in zcrypt and pkey
refactoring.
- Move to arch_stack_walk infrastructure for the stack unwinder.
- Various kasan fixes and improvements.
- Various command line parsing fixes.
- Improve decompressor phase debuggability.
- Lift no bss usage restriction for the early code.
- Use refcount_t for reference counters for couple of places in mm
code.
- Logging improvements and return code fix in vfio-ccw code.
- Couple of zpci fixes and minor refactoring.
- Remove some outdated documentation.
- Fix secure boot detection.
- Other various minor code clean ups.
* tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
s390: remove pointless drivers-y in drivers/s390/Makefile
s390/cpum_sf: Fix line length and format string
s390/pci: fix MSI message data
s390: add support for IBM z15 machines
s390/crypto: Support for SHA3 via CPACF (MSA6)
s390/startup: add pgm check info printing
s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
vfio-ccw: fix error return code in vfio_ccw_sch_init()
s390: vfio-ap: fix warning reset not completed
s390/base: remove unused s390_base_mcck_handler
s390/sclp: Fix bit checked for has_sipl
s390/zcrypt: fix wrong handling of cca cipher keygenflags
s390/kasan: add kdump support
s390/setup: avoid using strncmp with hardcoded length
s390/sclp: avoid using strncmp with hardcoded length
s390/module: avoid using strncmp with hardcoded length
s390/pci: avoid using strncmp with hardcoded length
s390/kaslr: reserve memory for kasan usage
s390/mem_detect: provide single get_mem_detect_end
s390/cmma: reuse kstrtobool for option value parsing
...
Fix to return negative error code -ENOMEM from the memory alloc failed
error handling case instead of 0, as done elsewhere in this function.
Fixes: 60e05d1cf0 ("vfio-ccw: add some logging")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link https://lore.kernel.org/kvm/20190904083315.105600-1-weiyongjun1@huawei.com/
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
If a driver wants to use the new Output Queue poll code, then the qdio
layer must disable its internal Queue scanning. Let the driver select
this mode by passing a special scan_threshold of 0.
As the scan_threshold is the same for all Output Queues, also move it
into the main qdio_irq struct. This allows for fast opt-out checking, a
driver is expected to operate either _all_ or none of its Output Queues
in polling mode.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While commit d36deae750 ("qdio: extend API to allow polling") enhanced
the qdio layer so that drivers can poll their Input Queues, we don't
have the corresponding infrastructure for Output Queues yet.
Factor out a helper that scans a single QDIO Queue, so that qeth can
implement TX NAPI on top of it.
While doing so, remove the duplicated tracking of the next-to-scan index
(q->first_to_check vs q->first_to_kick) in this code path.
qdio_handle_aobs() needs to move slightly upwards in the code hierarchy,
so that it's still called from the polling path.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Usually, the common I/O layer logs various things into the s390
cio debug feature, which has been very helpful in the past when
looking at crash dumps. As vfio-ccw devices unbind from the
standard I/O subchannel driver, we lose some information there.
Let's introduce some vfio-ccw debug features and log some things
there. (Unfortunately we cannot reuse the cio debug feature from
a module.)
Message-Id: <20190816151505.9853-2-cohuck@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Add a generic helper to match any/all devices. Using this
introduce new wrappers {bus/driver/class}_find_next_device().
Cc: Elie Morisse <syniurge@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Nehal Shah <nehal-bakulchandra.shah@amd.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI
Link: https://lore.kernel.org/r/20190723221838.12024-7-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a helper to match the device name for device lookup. Also
reuse this generic exported helper for the existing bus_find_device_by_name().
and add similar variants for driver/class.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Murphy <dmurphy@ti.com>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: linux-leds@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wpan@vger.kernel.org
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Stefan Schmidt <stefan@datenfreihafen.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20190723221838.12024-2-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since vfio_ccw_async_region_ops is not exported and has no reason to be
globally visible make it static to avoid the following sparse warning:
drivers/s390/cio/vfio_ccw_async.c:73:30: warning: symbol 'vfio_ccw_async_region_ops' was not declared. Should it be static?
Fixes: d5afd5d135 ("vfio-ccw: add handling for async channel instructions")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The IQD mcast queue doesn't support QAOB mode, so skip the
qdio_enable_async_operation() setup call for this queue. This avoids
the allocation of an unneeded QAOB pointer array, and sets up q->use_cq
properly so that drivers are prohibited from using QAOBs for mcast
traffic.
Take this opportunity to streamline the q->use_cq and aob != 0 checks.
The path to qdio_siga_output() is straight-forward, we don't need to
worry about being called with bad operands.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If the device driver were to send out a full queue's worth of SBALs,
current code would end up discovering the last of those SBALs as PRIMED
and erroneously skip the SIGA-w. This immediately stalls the queue.
Add a check to not attempt fast-requeue in this case. While at it also
make sure that the state of the previous SBAL was successfully extracted
before inspecting it.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
There is a small window where it's possible that we could be working
on an interrupt (queued in the workqueue) and setting up a channel
program (i.e allocating memory, pinning pages, translating address).
This can lead to allocating and freeing the channel program at the
same time and can cause memory corruption.
Let's not call cp_free if we are currently processing a channel program.
The only way we know for sure that we don't have a thread setting
up a channel program is when the state is set to VFIO_CCW_STATE_CP_PENDING.
Fixes: d5afd5d135 ("vfio-ccw: add handling for async channel instructions")
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We don't set cp->initialized to true so calling cp_free
will just return and not do anything.
Also fix a memory leak where we fail to free a ccwchain
on an error.
Fixes: 812271b910 ("s390/cio: Squash cp_free() and cp_unpin_free()")
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <3173c4216f4555d9765eb6e4922534982bc820e4.1562854091.git.alifm@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The comment is misleading because it tells us that
we should set orb.cmd.c64 before calling ccwchain_calc_length,
otherwise the function ccwchain_calc_length would return an
error. This is not completely accurate.
We want to allow an orb without cmd.c64, and this is fine
as long as the channel program does not use IDALs. But we do
want to reject any channel program that uses IDALs and does
not set the flag, which is what we do in ccwchain_calc_length.
After we have done the ccw processing, we need to set cmd.c64,
as we use IDALs for all translated channel programs.
Also for better code readability let's move the setting of
cmd.c64 within the non error path.
Fixes: fb9e7880af ("vfio: ccw: push down unsupported IDA check")
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <f68636106aef0faeb6ce9712584d102d1b315ff8.1562854091.git.alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups. Because of this, there is going
to be some merge issues with your tree at the moment, I'll follow up
with the expected resolutions to make it easier for you.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups (will cause build warnings
with s390 and coresight drivers in your tree)
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse
easier due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of merge
issues that Stephen has been patient with me for. Other than the merge
issues, functionality is working properly in linux-next :)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
LpRyb3zX29oChFaZkc5a
=XrEZ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
- Improve stop_machine wait logic: replace cpu_relax_yield call in generic
stop_machine function with a weak stop_machine_yield function. This is
overridden on s390, which yields the current cpu to the neighbouring cpu
after a couple of retries, instead of blindly giving up the cpu to the
hipervisor. This significantly improves stop_machine performance on s390 in
overcommitted scenarios.
This includes common code changes which have been Acked by Peter Zijlstra
and Thomas Gleixner.
- Improve jump label transformation speed: transform jump labels without
using stop_machine.
- Refactoring of the vfio-ccw cp handling, simplifying the code and
avoiding unneeded allocating/copying.
- Various vfio-ccw fixes (ccw translation, state machine).
- Add support for vfio-ap queue interrupt control in the guest.
This includes s390 kvm changes which have been Acked by Christian
Borntraeger.
- Add protected virtualization support for virtio-ccw.
- Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some
code which most likely isn't working at all, besides that s390 didn't even
compile for !CONFIG_SMP.
- Support for special flagged EP11 CPRBs for zcrypt.
- Handle PCI devices with no support for new MIO instructions.
- Avoid KASAN false positives in reworked stack unwinder.
- Couple of fixes for the QDIO layer.
- Convert s390 specific documentation to ReST format.
- Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is
missing. This way our modules behave like most other modules and which is
also what systemd's systemd-modules-load.service expects.
- Replace defconfig with performance_defconfig, so there is one config file
less to maintain.
- Remove the SCLP call home device driver, which was never useful.
- Cleanups all over the place.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl0iEpcACgkQjYWKoQLX
FBgtZwf8DOJ6COUG91jKP0RSDlc2YvIMBxopQ38ql1lIsTj5t6DvJ2z3X5uct1wy
6mMiF01VuyD4V4UXbTJQrihzNx7D4dUh47s2sS+diGHxJyXacVxlmjS5k+6pLIUO
AyLvtCcoqDPPiThqnSTZFRm/TcfO/25fCG/IdjrFGj1MD09wHpUCh16tmRPTGFlC
BWZeilDT77fVXnh7Ggn3JB0mQay5PAw2ODOxELHTUBaLmYF8RJPPVKBPmXGl9P1W
84ESm2p+iALGGWDiTOUad9eu8wyQci/V/R+hFgs0Bz/HRcjznNH5EVvfQNCD4VNF
g/PET10nIQYZv2BNdi0cwRjR9jCFbw==
=jp0i
-----END PGP SIGNATURE-----
Merge tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Improve stop_machine wait logic: replace cpu_relax_yield call in
generic stop_machine function with a weak stop_machine_yield
function. This is overridden on s390, which yields the current cpu to
the neighbouring cpu after a couple of retries, instead of blindly
giving up the cpu to the hipervisor. This significantly improves
stop_machine performance on s390 in overcommitted scenarios.
This includes common code changes which have been Acked by Peter
Zijlstra and Thomas Gleixner.
- Improve jump label transformation speed: transform jump labels
without using stop_machine.
- Refactoring of the vfio-ccw cp handling, simplifying the code and
avoiding unneeded allocating/copying.
- Various vfio-ccw fixes (ccw translation, state machine).
- Add support for vfio-ap queue interrupt control in the guest. This
includes s390 kvm changes which have been Acked by Christian
Borntraeger.
- Add protected virtualization support for virtio-ccw.
- Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to
remove some code which most likely isn't working at all, besides that
s390 didn't even compile for !CONFIG_SMP.
- Support for special flagged EP11 CPRBs for zcrypt.
- Handle PCI devices with no support for new MIO instructions.
- Avoid KASAN false positives in reworked stack unwinder.
- Couple of fixes for the QDIO layer.
- Convert s390 specific documentation to ReST format.
- Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if
hardware is missing. This way our modules behave like most other
modules and which is also what systemd's systemd-modules-load.service
expects.
- Replace defconfig with performance_defconfig, so there is one config
file less to maintain.
- Remove the SCLP call home device driver, which was never useful.
- Cleanups all over the place.
* tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
docs: s390: s390dbf: typos and formatting, update crash command
docs: s390: unify and update s390dbf kdocs at debug.c
docs: s390: restore important non-kdoc parts of s390dbf.rst
vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1
s390/pci: correctly handle MIO opt-out
s390/pci: deal with devices that have no support for MIO instructions
s390: ap: kvm: Enable PQAP/AQIC facility for the guest
s390: ap: implement PAPQ AQIC interception in kernel
vfio: ap: register IOMMU VFIO notifier
s390: ap: kvm: add PQAP interception for AQIC
s390/unwind: cleanup unused READ_ONCE_TASK_STACK
s390/kasan: avoid false positives during stack unwind
s390/qdio: don't touch the dsci in tiqdio_add_input_queues()
s390/qdio: (re-)initialize tiqdio list entries
s390/dasd: Fix a precision vs width bug in dasd_feature_list()
s390/cio: introduce driver_override on the css bus
vfio-ccw: make convert_ccw0_to_ccw1 static
vfio-ccw: Remove copy_ccw_from_iova()
vfio-ccw: Factor out the ccw0-to-ccw1 transition
vfio-ccw: Copy CCW data outside length calculation
...
When processing Format-0 CCWs, we use the "len" variable as the
number of CCWs to convert to Format-1. But that variable
contains zero here, and is not a meaningful CCW count until
ccwchain_calc_length() returns. Since that routine requires and
expects Format-1 CCWs to identify the chaining behavior, the
format conversion must be done first.
Convert the 2KB we copied even if it's more than we need.
Fixes: 7f8e89a8f2 ("vfio-ccw: Factor out the ccw0-to-ccw1 transition")
Reported-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190702180928.18113-1-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Current code sets the dsci to 0x00000080. Which doesn't make any sense,
as the indicator area is located in the _left-most_ byte.
Worse: if the dsci is the _shared_ indicator, this potentially clears
the indication of activity for a _different_ device.
tiqdio_thinint_handler() will then have no reason to call that device's
IRQ handler, and the device ends up stalling.
Fixes: d0c9d4a89f ("[S390] qdio: set correct bit in dsci")
Cc: <stable@vger.kernel.org>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
When tiqdio_remove_input_queues() removes a queue from the tiq_list as
part of qdio_shutdown(), it doesn't re-initialize the queue's list entry
and the prev/next pointers go stale.
If a subsequent qdio_establish() fails while sending the ESTABLISH cmd,
it calls qdio_shutdown() again in QDIO_IRQ_STATE_ERR state and
tiqdio_remove_input_queues() will attempt to remove the queue entry a
second time. This dereferences the stale pointers, and bad things ensue.
Fix this by re-initializing the list entry after removing it from the
list.
For good practice also initialize the list entry when the queue is first
allocated, and remove the quirky checks that papered over this omission.
Note that prior to
commit e521813468 ("s390/qdio: fix access to uninitialized qdio_q fields"),
these checks were bogus anyway.
setup_queues_misc() clears the whole queue struct, and thus needs to
re-init the prev/next pointers as well.
Fixes: 779e6e1c72 ("[S390] qdio: new qdio driver.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Sometimes, we want to control which of the matching drivers
binds to a subchannel device (e.g. for subchannels we want to
handle via vfio-ccw).
For pci devices, a mechanism to do so has been introduced in
782a985d7a ("PCI: Introduce new device binding path using
pci_dev.driver_override"). It makes sense to introduce the
driver_override attribute for subchannel devices as well, so
that we can easily extend the 'driverctl' tool (which makes
use of the driver_override attribute for pci).
Note that unlike pci we still require a driver override to
match the subchannel type; matching more than one subchannel
type is probably not useful anyway.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Update __ccwdev_check_busid() and __ccwgroupdev_check_busid() to use
"const" qualifiers to fix the compiler warning.
Reported-by: kbuild test robot <lkp@intel.com>
Cc: gregkh@linuxfoundation.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
code and avoiding unneeded allocating/copying.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAl0M6ZgSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vEAMP/RwxJwLihv8n/nSsC/QaeGWprEra+4sD
GQA/WWhoEulWN9FAJGqOqv1IpnGZvyOheHgXq48YUHPrvhGyzraGpI3zfF9czqTT
6U7fNuORovJD9Vym/ZugVlaNM15n0ANFlXLJsnVVrHMx49V0NrlVkF+BlUARfY5u
tqDYZKyiJGKW/k4Kkulh54BYbtTTwea/+fmBust7olRAQDP6BipPRHW7TWAAg1Hz
5TuQ6W4iMNyXHIs0rNQms9dy4a274jPipmcWZRncfahpGMXHzdXgJ0DLctbaY2on
92OLwmeEB43VpLWV0fZX6+QaHuzPhoBxtZchrzrRwC9/pRnwLGPUYXAYIIEAsAhC
4wUbvYIMzHy8+Z8L30oxfemd77HV7AvA1ijxjJY6MUBzd617n/Ti650xUejSPt33
Xbr8CpuuucuR1aMhRt9FTdLsOT7JE4us4sqgQ39jh1QwgMU/A+vByJwBVsSB/l4x
yFmjTnkh1itWImTsPmjBZ8za9Cnx+WtPPAMlZKNWv6JS+MNpsRWYtJS22+UUE9OY
m65yhiv+xvAMZCGhCZHPj0xk93acNKLy/p6+kNO5NDAimRf4La/Pd9L7AVF9xZpE
ZRXKVg80Iq0rGfI07tj9gouQdo/Ls+bhoIJJIaq81zX9cwC7R4rNgdUg2s4U2AXY
vl/clegCeztY
=FcTg
-----END PGP SIGNATURE-----
Merge tag 'vfio-ccw-20190621' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
Refactoring of the vfio-ccw cp handling, simplifying the
code and avoiding unneeded allocating/copying.
* tag 'vfio-ccw-20190621' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw:
vfio-ccw: Remove copy_ccw_from_iova()
vfio-ccw: Factor out the ccw0-to-ccw1 transition
vfio-ccw: Copy CCW data outside length calculation
vfio-ccw: Skip second copy of guest cp to host
vfio-ccw: Move guest_cp storage into common struct
s390/cio: Combine direct and indirect CCW paths
vfio-ccw: Rearrange IDAL allocation in direct CCW
vfio-ccw: Remove pfn_array_table
vfio-ccw: Adjust the first IDAW outside of the nested loops
vfio-ccw: Rearrange pfn_array and pfn_array_table arrays
s390/cio: Use generalized CCW handler in cp_init()
s390/cio: Generalize the TIC handler
s390/cio: Refactor the routine that handles TIC CCWs
s390/cio: Squash cp_free() and cp_unpin_free()
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The driver_find_device() accepts a match function pointer to
filter the devices for lookup, similar to bus/class_find_device().
However, there is a minor difference in the prototype for the
match parameter for driver_find_device() with the now unified
version accepted by {bus/class}_find_device(), where it doesn't
accept a "const" qualifier for the data argument. This prevents
us from reusing the generic match functions for driver_find_device().
For this reason, change the prototype of the driver_find_device() to
make the "match" parameter in line with {bus/class}_find_device()
and adjust its callers to use the const qualifier. Also, we could
now promote the "data" parameter to const as we pass it down
as a const parameter to the match functions.
Cc: Corey Minyard <minyard@acm.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Sebastian Ott <sebott@linux.ibm.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Nehal Shah <nehal-bakulchandra.shah@amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is an arbitrary difference between the prototypes of
bus_find_device() and class_find_device() preventing their callers
from passing the same pair of data and match() arguments to both of
them, which is the const qualifier used in the prototype of
class_find_device(). If that qualifier is also used in the
bus_find_device() prototype, it will be possible to pass the same
match() callback function to both bus_find_device() and
class_find_device(), which will allow some optimizations to be made in
order to avoid code duplication going forward. Also with that, constify
the "data" parameter as it is passed as a const to the match function.
For this reason, change the prototype of bus_find_device() to match
the prototype of class_find_device() and adjust its callers to use the
const qualifier in accordance with the new prototype of it.
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Kershner <david.kershner@unisys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Hartmut Knaack <knaack.h@gmx.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Sebastian Ott <sebott@linux.ibm.com>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Yehezkel Bernat <YehezkelShB@gmail.com>
Cc: rafael@kernel.org
Acked-by: Corey Minyard <minyard@acm.org>
Acked-by: David Kershner <david.kershner@unisys.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de> # for the I2C parts
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just to keep things tidy.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-6-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a really useful function, but it's buried in the
copy_ccw_from_iova() routine so that ccwchain_calc_length()
can just work with Format-1 CCWs while doing its counting.
But it means we're translating a full 2K of "CCWs" to Format-1,
when in reality there's probably far fewer in that space.
Let's factor it out, so maybe we can do something with it later.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-5-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It doesn't make much sense to "hide" the copy to the channel_program
struct inside a routine that calculates the length of the chain.
Let's move it to the calling routine, which will later copy from
channel_program to the memory it allocated itself.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-4-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We already pinned/copied/unpinned 2K (256 CCWs) of guest memory
to the host space anchored off vfio_ccw_private. There's no need
to do that again once we have the length calculated, when we could
just copy the section we need to the "permanent" space for the I/O.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-3-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Rather than allocating/freeing a piece of memory every time
we try to figure out how long a CCW chain is, let's use a piece
of memory allocated for each device.
The io_mutex added with commit 4f76617378 ("vfio-ccw: protect
the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
event that accesses/uses this space, so there should be no race
concerns with another CPU attempting an (unexpected) SSCH for the
same device.
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-2-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This allows device drivers (eg. qeth) to use the struct when processing
information retrieved via RCD.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
With both the direct-addressed and indirect-addressed CCW paths
simplified to this point, the amount of shared code between them is
(hopefully) more easily visible. Move the processing of IDA-specific
bits into the direct-addressed path, and add some useful commentary of
what the individual pieces are doing. This allows us to remove the
entire ccwchain_fetch_idal() routine and maintain a single function
for any non-TIC CCW.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-10-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is purely deck furniture, to help understand the merge of the
direct and indirect handlers.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-9-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Now that both CCW codepaths build this nested array:
ccwchain->pfn_array_table[1]->pfn_array[#idaws/#pages]
We can collapse this into simply:
ccwchain->pfn_array[#idaws/#pages]
Let's do that, so that we don't have to continually navigate two
nested arrays when the first array always has a count of one.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-8-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Now that pfn_array_table[] is always an array of 1, it seems silly to
check for the very first entry in an array in the middle of two nested
loops, since we know it'll only ever happen once.
Let's move this outside the loops to simplify things, even though
the "k" variable is still necessary.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-7-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
While processing a channel program, we currently have two nested
arrays that carry a slightly different structure. The direct CCW
path creates this:
ccwchain->pfn_array_table[1]->pfn_array[#pages]
while an IDA CCW creates:
ccwchain->pfn_array_table[#idaws]->pfn_array[1]
The distinction appears to state that each pfn_array_table entry
points to an array of contiguous pages, represented by a pfn_array,
um, array. Since the direct-addressed scenario can ONLY represent
contiguous pages, it makes the intermediate array necessary but
difficult to recognize. Meanwhile, since an IDAL can contain
non-contiguous pages and there is no logic in vfio-ccw to detect
adjacent IDAWs, it is the second array that is necessary but appearing
to be superfluous.
I am not aware of any documentation that states the pfn_array[] needs
to be of contiguous pages; it is just what the code does today.
I don't see any reason for this either, let's just flip the IDA
codepath around so that it generates:
ch_pat->pfn_array_table[1]->pfn_array[#idaws]
This will bring it in line with the direct-addressed codepath,
so that we can understand the behavior of this memory regardless
of what type of CCW is being processed. And it means the casual
observer does not need to know/care whether the pfn_array[]
represents contiguous pages or not.
NB: The existing vfio-ccw code only supports 4K-block Format-2 IDAs,
so that "#pages" == "#idaws" in this area. This means that we will
have difficulty with this overlap in terminology if support for
Format-1 or 2K-block Format-2 IDAs is ever added. I don't think that
this patch changes our ability to make that distinction.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-6-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It is now pretty apparent that ccwchain_handle_ccw()
(nee ccwchain_handle_tic()) does everything that cp_init()
wants to do.
Let's remove that duplicated code from cp_init() and let
ccwchain_handle_ccw() handle it itself.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-5-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Refactor ccwchain_handle_tic() into a routine that handles a channel
program address (which itself is a CCW pointer), rather than a CCW pointer
that is only a TIC CCW. This will make it easier to reuse this code for
other CCW commands.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-4-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Extract the "does the target of this TIC already exist?" check from
ccwchain_handle_tic(), so that it's easier to refactor that function
into one that cp_init() is able to use.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-3-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The routine cp_free() does nothing but call cp_unpin_free(), and while
most places call cp_free() there is one caller of cp_unpin_free() used
when the cp is guaranteed to have not been marked initialized.
This seems like a dubious way to make a distinction, so let's combine
these routines and make cp_free() do all the work.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because the hypervisor needs to write these bits.
Let us make sure we allocate DMA memory for the notifier bit vectors by
replacing the kmem_cache with a dma_cache and kalloc() with
cio_dma_zalloc().
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
As virtio-ccw devices are channel devices, we need to use the
dma area within the common I/O layer for any communication with
the hypervisor.
Note that we do not need to use that area for control blocks
directly referenced by instructions, e.g. the orb.
It handles neither QDIO in the common code, nor any device type specific
stuff (like channel programs constructed by the DASD driver).
An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
To support protected virtualization cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.
Let us introduce one global pool for cio.
Our DMA pools are implemented as a gen_pool backed with DMA pages. The
idea is to avoid each allocation effectively wasting a page, as we
typically allocate much less than PAGE_SIZE.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When a CQ-enabled device uses QEBSM for SBAL state inspection,
get_buf_states() can return the PENDING state for an Output Queue.
get_outbound_buffer_frontier() isn't prepared for this, and any PENDING
buffer will permanently stall all further completion processing on this
Queue.
This isn't a concern for non-QEBSM devices, as get_buf_states() for such
devices will manually turn PENDING buffers into EMPTY ones.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add missing parameter description to fix the following warning:
drivers/s390/cio/qdio_thinint.c:183: warning:
Function parameter or member 'floating' not described in 'tiqdio_thinint_handler'
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If the CCW being processed is a No-Operation, then by definition no
data is being transferred. Let's fold those checks into the normal
CCW processors, rather than skipping out early.
Likewise, if the CCW being processed is a "test" (a category defined
here as an opcode that contains zero in the lowest four bits) then no
special processing is necessary as far as vfio-ccw is concerned.
These command codes have not been valid since the S/370 days, meaning
they are invalid in the same way as one that ends in an eight [1] or
an otherwise valid command code that is undefined for the device type
in question. Considering that, let's just process "test" CCWs like
any other CCW, and send everything to the hardware.
[1] POPS states that a x08 is a TIC CCW, and that having any high-order
bits enabled is invalid for format-1 CCWs. For format-0 CCWs, the
high-order bits are ignored.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-4-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It is possible that a guest might issue a CCW with a length of zero,
and will expect a particular response. Consider this chain:
Address Format-1 CCW
-------- -----------------
0 33110EC0 346022CC 33177468
1 33110EC8 CF200000 3318300C
CCW[0] moves a little more than two pages, but also has the
Suppress Length Indication (SLI) bit set to handle the expectation
that considerably less data will be moved. CCW[1] also has the SLI
bit set, and has a length of zero. Once vfio-ccw does its magic,
the kernel issues a start subchannel on behalf of the guest with this:
Address Format-1 CCW
-------- -----------------
0 021EDED0 346422CC 021F0000
1 021EDED8 CF240000 3318300C
Both CCWs were converted to an IDAL and have the corresponding flags
set (which is by design), but only the address of the first data
address is converted to something the host is aware of. The second
CCW still has the address used by the guest, which happens to be (A)
(probably) an invalid address for the host, and (B) an invalid IDAW
address (doubleword boundary, etc.).
While the I/O fails, it doesn't fail correctly. In this example, we
would receive a program check for an invalid IDAW address, instead of
a unit check for an invalid command.
To fix this, revert commit 4cebc5d6a6 ("vfio: ccw: validate the
count field of a ccw before pinning") and allow the individual fetch
routines to process them like anything else. We'll make a slight
adjustment to our allocation of the pfn_array (for direct CCWs) or
IDAL (for IDAL CCWs) memory, so that we have room for at least one
address even though no guest memory will be pinned and thus the
IDAW will not be populated with a host address.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-3-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The skip flag of a CCW offers the possibility of data not being
transferred, but is only meaningful for certain commands.
Specifically, it is only applicable for a read, read backward, sense,
or sense ID CCW and will be ignored for any other command code
(SA22-7832-11 page 15-64, and figure 15-30 on page 15-75).
(A sense ID is xE4, while a sense is x04 with possible modifiers in the
upper four bits. So we will cover the whole "family" of sense CCWs.)
For those scenarios, since there is no requirement for the target
address to be valid, we should skip the call to vfio_pin_pages() and
rely on the IDAL address we have allocated/built for the channel
program. The fact that the individual IDAWs within the IDAL are
invalid is fine, since they aren't actually checked in these cases.
Set pa_nr to zero when skipping the pfn_array_pin() call, since it is
defined as the number of pages pinned and is used to determine
whether to call vfio_unpin_pages() upon cleanup.
The pfn_array_pin() routine returns the number of pages that were
pinned, but now might be skipped for some CCWs. Thus we need to
calculate the expected number of pages ourselves such that we are
guaranteed to allocate a reasonable number of IDAWs, which will
provide a valid address in CCW.CDA regardless of whether the IDAWs
are filled in with pinned/translated addresses or not.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-2-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's initialize the host address to something that is invalid,
rather than letting it default to zero. This just makes it easier
to notice when a pin operation has failed or been skipped.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190514234248.36203-5-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The pfn_array_alloc_pin routine is doing too much. Today, it does the
alloc of the pfn_array struct and its member arrays, builds the iova
address lists out of a contiguous piece of guest memory, and asks vfio
to pin the resulting pages.
Let's effectively revert a significant portion of commit 5c1cfb1c39
("vfio: ccw: refactor and improve pfn_array_alloc_pin()") such that we
break pfn_array_alloc_pin() into its component pieces, and have one
routine that allocates/populates the pfn_array structs, and another
that actually pins the memory. In the future, we will be able to
handle scenarios where pinning memory isn't actually appropriate.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190514234248.36203-4-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Otherwise, the guest can believe it's okay to start another I/O
and bump into the non-idle state. This results in a cc=2 (with
the asynchronous CSCH/HSCH code) returned to the guest, which is
unfortunate since everything is otherwise working normally.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190514234248.36203-3-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Per the POPs [1], when processing an interrupt the SCSW.CPA field of an
IRB generally points to 8 bytes after the last CCW that was executed
(there are exceptions, but this is the most common behavior).
In the case of an error, this points us to the first un-executed CCW
in the chain. But in the case of normal I/O, the address points beyond
the end of the chain. While the guest generally only cares about this
when possibly restarting a channel program after error recovery, we
should convert the address even in the good scenario so that we provide
a consistent, valid, response upon I/O completion.
[1] Figure 16-6 in SA22-7832-11. The footnotes in that table also state
that this is true even if the resulting address is invalid or protected,
but moving to the end of the guest chain should not be a surprise.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190514234248.36203-2-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When get_buf_states() gets called with count > 1, it scans the
corresponding number of SBAL states until it encounters a mismatch.
But when these SBALs are in a HW-owned state, the callers don't actually
care _how many_ such SBALs are on the queue. If we can't process the
first SBAL, we can't process any of the following SBALs either. So when
the first SBAL is HW-owned, skip the scan of the remaining SBALs and
thus save some CPU time.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For a 1-SBAL state inspection, use the corresponding helper.
No functional change, just reducing the number of immediate callers to
get_buf_states().
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Old code restricted the number of inspected SBALs to
QDIO_MAX_BUFFERS_PER_Q - 1, as otherwise the first_to_check and
first_to_kick cursors could overlap. Subsequent code would then assume that
there was no progress on the queue, when in fact _all_ SBALs on the queue
were ready-to-process.
This limitation no longer applies, so allow the queue-scan code to inspect
all SBALs on the queue. Note that qeth requires an additional patch
("s390/qeth: stop/wake TX queues based on their fill level"), to avoid
potential queue stalls when all 128 SBALs are reported as ready-to-process.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since commit d485235b00 "s390: assume diag308 set always works",
the kernel does not use the rchp instruction anymore. So let's
remove the tracing for it.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Improve /proc/interrupts on s390 to show statistics for individual
MSI interrupts.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide the ability to create cachesize aligned interrupt vectors.
These will be used for per-CPU interrupt vectors.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add an extra parameter for airq handlers to recognize
floating vs. directed interrupts.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The quiesce function calls cio_cancel_halt_clear() and if we
get an -EBUSY we go into a loop where we:
- wait for any interrupts
- flush all I/O in the workqueue
- retry cio_cancel_halt_clear
During the period where we are waiting for interrupts or
flushing all I/O, the channel subsystem could have completed
a halt/clear action and turned off the corresponding activity
control bits in the subchannel status word. This means the next
time we call cio_cancel_halt_clear(), we will again start by
calling cancel subchannel and so we can be stuck between calling
cancel and halt forever.
Rather than calling cio_cancel_halt_clear() immediately after
waiting, let's try to disable the subchannel. If we succeed in
disabling the subchannel then we know nothing else can happen
with the device.
Suggested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When releasing the vfio-ccw mdev, we currently do not release
any existing channel program and its pinned pages. This can
lead to the following warning:
[1038876.561565] WARNING: CPU: 2 PID: 144727 at drivers/vfio/vfio_iommu_type1.c:1494 vfio_sanity_check_pfn_list+0x40/0x70 [vfio_iommu_type1]
....
1038876.561921] Call Trace:
[1038876.561935] ([<00000009897fb870>] 0x9897fb870)
[1038876.561949] [<000003ff8013bf62>] vfio_iommu_type1_detach_group+0xda/0x2f0 [vfio_iommu_type1]
[1038876.561965] [<000003ff8007b634>] __vfio_group_unset_container+0x64/0x190 [vfio]
[1038876.561978] [<000003ff8007b87e>] vfio_group_put_external_user+0x26/0x38 [vfio]
[1038876.562024] [<000003ff806fc608>] kvm_vfio_group_put_external_user+0x40/0x60 [kvm]
[1038876.562045] [<000003ff806fcb9e>] kvm_vfio_destroy+0x5e/0xd0 [kvm]
[1038876.562065] [<000003ff806f63fc>] kvm_put_kvm+0x2a4/0x3d0 [kvm]
[1038876.562083] [<000003ff806f655e>] kvm_vm_release+0x36/0x48 [kvm]
[1038876.562098] [<00000000003c2dc4>] __fput+0x144/0x228
[1038876.562113] [<000000000016ee82>] task_work_run+0x8a/0xd8
[1038876.562125] [<000000000014c7a8>] do_exit+0x5d8/0xd90
[1038876.562140] [<000000000014d084>] do_group_exit+0xc4/0xc8
[1038876.562155] [<000000000015c046>] get_signal+0x9ae/0xa68
[1038876.562169] [<0000000000108d66>] do_signal+0x66/0x768
[1038876.562185] [<0000000000b9e37e>] system_call+0x1ea/0x2d8
[1038876.562195] 2 locks held by qemu-system-s39/144727:
[1038876.562205] #0: 00000000537abaf9 (&container->group_lock){++++}, at: __vfio_group_unset_container+0x3c/0x190 [vfio]
[1038876.562230] #1: 00000000670008b5 (&iommu->lock){+.+.}, at: vfio_iommu_type1_detach_group+0x36/0x2f0 [vfio_iommu_type1]
[1038876.562250] Last Breaking-Event-Address:
[1038876.562262] [<000003ff8013aa24>] vfio_sanity_check_pfn_list+0x3c/0x70 [vfio_iommu_type1]
[1038876.562272] irq event stamp: 4236481
[1038876.562287] hardirqs last enabled at (4236489): [<00000000001cee7a>] console_unlock+0x6d2/0x740
[1038876.562299] hardirqs last disabled at (4236496): [<00000000001ce87e>] console_unlock+0xd6/0x740
[1038876.562311] softirqs last enabled at (4234162): [<0000000000b9fa1e>] __do_softirq+0x556/0x598
[1038876.562325] softirqs last disabled at (4234153): [<000000000014e4cc>] irq_exit+0xac/0x108
[1038876.562337] ---[ end trace 6c96d467b1c3ca06 ]---
Similarly we do not free the channel program when we are removing
the vfio-ccw device. Let's fix this by resetting the device and freeing
the channel program and pinned pages in the release path. For the remove
path we can just quiesce the device, since in the remove path the mediated
device is going away for good and so we don't need to do a full reset.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <ae9f20dc8873f2027f7b3c5d2aaa0bdfe06850b8.1554756534.git.alifm@linux.ibm.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.
Interrupt status continues to be reported through the same channels
as for ssch.
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The vfio-ccw code will need this, and it matches treatment of ssch
and csch.
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Allow to extend the regions used by vfio-ccw. The first user will be
handling of halt and clear subchannel.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Introduce a mutex to disallow concurrent reads or writes to the
I/O region. This makes sure that the data the kernel or user
space see is always consistent.
The same mutex will be used to protect the async region as well.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The flow for processing ssch requests can be improved by splitting
the BUSY state:
- CP_PROCESSING: We reject any user space requests while we are in
the process of translating a channel program and submitting it to
the hardware. Use -EAGAIN to signal user space that it should
retry the request.
- CP_PENDING: We have successfully submitted a request with ssch and
are now expecting an interrupt. As we can't handle more than one
channel program being processed, reject any further requests with
-EBUSY. A final interrupt will move us out of this state.
By making this a separate state, we make it possible to issue a
halt or a clear while we're still waiting for the final interrupt
for the ssch (in a follow-on patch).
It also makes a lot of sense not to preemptively filter out writes to
the io_region if we're in an incorrect state: the state machine will
handle this correctly.
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When we get a solicited interrupt, the start function may have
been cleared by a csch, but we still have a channel program
structure allocated. Make it safe to call the cp accessors in
any case, so we can call them unconditionally.
While at it, also make sure that functions called from other parts
of the code return gracefully if the channel program structure
has not been initialized (even though that is a bug in the caller).
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
qdio.ko offers a small number of high-level functions to drive the
scanning of a QDIO queue for ready-to-process SBALs:
qdio_get_next_buffers(), __[ti]qdio_inbound_processing() and
__qdio_outbound_processing().
Let each of those functions maintain the 'start' index for their current
scan, and pass it to lower-level helpers as needed. This improves the
code's overall layering, and allows us to eliminate the additional
first_to_kick cursor with a follow-on patch.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Refactor all the low-level helpers to take the first_to_check cursor as
parameter, rather than accessing it directly.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
clang points out that the declaration of cio_irb does not match the
definition exactly, it is missing the alignment attribute:
../drivers/s390/cio/cio.c:50:1: warning: section does not match previous declaration [-Wsection]
DEFINE_PER_CPU_ALIGNED(struct irb, cio_irb);
^
../include/linux/percpu-defs.h:150:2: note: expanded from macro 'DEFINE_PER_CPU_ALIGNED'
DEFINE_PER_CPU_SECTION(type, name, PER_CPU_ALIGNED_SECTION) \
^
../include/linux/percpu-defs.h:93:9: note: expanded from macro 'DEFINE_PER_CPU_SECTION'
extern __PCPU_ATTRS(sec) __typeof__(type) name; \
^
../include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS'
__percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \
^
../drivers/s390/cio/cio.h:118:1: note: previous attribute is here
DECLARE_PER_CPU(struct irb, cio_irb);
^
../include/linux/percpu-defs.h:111:2: note: expanded from macro 'DECLARE_PER_CPU'
DECLARE_PER_CPU_SECTION(type, name, "")
^
../include/linux/percpu-defs.h:87:9: note: expanded from macro 'DECLARE_PER_CPU_SECTION'
extern __PCPU_ATTRS(sec) __typeof__(type) name
^
../include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS'
__percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \
^
Use DECLARE_PER_CPU_ALIGNED() here, to make the two match.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This cursor is used for debugging only. But since
commit "s390/qdio: pass up count of ready-to-process SBALs" it effectively
duplicates the first_to_check cursor, diverging for just a short moment
when get_*_buffer_frontier() updates q->first_to_check.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When passing a range of ready-to-process SBALs to the upper-layer
driver, use the available 'count' instead of calculating the distance
between the first_to_check and first_to_kick cursors.
This simplifies the logic of the queue-scan path, and opens up the
possibility of scanning all 128 SBALs in one go (as determining the
reported count no longer requires wrap-around safe arithmetic on the
queue's cursors).
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When qdio_{in,out}bound_q_moved() scans a queue for pending work, it
currently only returns a boolean to its caller. The interface to the
upper-layer-drivers (qdio_kick_handler() and qdio_get_next_buffers())
then re-calculates the number of pending SBALs from the
q->first_to_check and q->first_to_kick cursors.
Refactor this so that whenever get_{in,out}bound_buffer_frontier()
adjusted the queue's first_to_check cursor, it also returns the
corresponding count of ready-to-process SBALs (and 0 else).
A subsequent patch will then make use of this additional information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The DSCI is a 1-byte field, placed at the start of an u32. So when
printing it to a queue's debug state, limit the output to the part
that's actually occupied by the DSCI.
When the DSCI is set this gives us the expected output of '1', rather
than the current (obscure) value of '16777216'.
Suggested-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This helper is not thinint-specific, qdio_get_next_buffers() also calls it
for non-thinint devices. So give it a more fitting name, and while at it
adjust its parameter.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
pci_out_supported() currently takes a single queue as parameter, even
though Output IRQ support is a per-device feature. Adjust the parameter,
so that the macro can also be used in code paths with no access to a queue
struct. This allows us to remove the remaining open-coded checks for
QIB_AC_OUTBOUND_PCI_SUPPORTED.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Fix early free of the channel program in vfio
- On AP device removal make sure that all messages are flushed
with the driver still attached that queued the message
- Limit brk randomization to 32MB to reduce the chance that the
heap of ld.so is placed after the main stack
- Add a rolling average for the steal time of a CPU, this will be
needed for KVM to decide when to do busy waiting
- Fix a warning in the CPU-MF code
- Add a notification handler for AP configuration change to react
faster to new AP devices
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJcnIq7AAoJEDjwexyKj9rgddUH/3VQP6BMvq2fwAsLqx8JeYgT
082xzP2nHli3tO6m8fFHmtqrSg5KTEDfuQVafqp92LeEMKUNWQI6kRu7rXeAVBct
M6hx21mqkm9VNjAlAjSq8IAUXP2K6/K0BMD5mYInYYYVRvJm3on4sHnkEj0kvXbm
OGxwnNBd9UnH5g6ti2vW4cyDvs0aqj1eDbSudy5KedumQz5J2XdFPn4f4Ej6p2+t
nuvlZFDnZ2Z4rliE3RFCuKExZR+YFZgS1urm6pcklncfvbJRsqFJ+nvhurskDUI3
4gOp1Yv1tvGNv/cNVEtnz8g/Kg8/sI7evjQBtxhtEsV/W0sbZPnjCt+28Cf1DN4=
=4nL7
-----END PGP SIGNATURE-----
Merge tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Improvements and bug fixes for 5.1-rc2:
- Fix early free of the channel program in vfio
- On AP device removal make sure that all messages are flushed with
the driver still attached that queued the message
- Limit brk randomization to 32MB to reduce the chance that the heap
of ld.so is placed after the main stack
- Add a rolling average for the steal time of a CPU, this will be
needed for KVM to decide when to do busy waiting
- Fix a warning in the CPU-MF code
- Add a notification handler for AP configuration change to react
faster to new AP devices"
* tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpumf: Fix warning from check_processor_id
zcrypt: handle AP Info notification from CHSC SEI command
vfio: ccw: only free cp on final interrupt
s390/vtime: steal time exponential moving average
s390/zcrypt: revisit ap device remove procedure
s390: limit brk randomization to 32MB
for 32-bit guests
s390: interrupt cleanup, introduction of the Guest Information Block,
preparation for processor subfunctions in cpu models
PPC: bug fixes and improvements, especially related to machine checks
and protection keys
x86: many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations; plus AVIC fixes.
Generic: memcg accounting
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJci+7XAAoJEL/70l94x66DUMkIAKvEefhceySHYiTpfefjLjIC
16RewgHa+9CO4Oo5iXiWd90fKxtXLXmxDQOS4VGzN0rxvLGRw/fyXIxL1MDOkaAO
l8SLSNuewY4XBUgISL3PMz123r18DAGOuy9mEcYU/IMesYD2F+wy5lJ17HIGq6X2
RpoF1p3qO1jfkPTKOob6Ixd4H5beJNPKpdth7LY3PJaVhDxgouj32fxnLnATVSnN
gENQ10fnt8BCjshRYW6Z2/9bF15JCkUFR1xdBW2/xh1oj+kvPqqqk2bEN1eVQzUy
2hT/XkwtpthqjSbX8NNavWRSFnOnbMLTRKQyIXmFVsM5VoSrwtiGsCFzBgcT++I=
=XIzU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- some cleanups
- direct physical timer assignment
- cache sanitization for 32-bit guests
s390:
- interrupt cleanup
- introduction of the Guest Information Block
- preparation for processor subfunctions in cpu models
PPC:
- bug fixes and improvements, especially related to machine checks
and protection keys
x86:
- many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations
- AVIC fixes
Generic:
- memcg accounting"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
kvm: vmx: fix formatting of a comment
KVM: doc: Document the life cycle of a VM and its resources
MAINTAINERS: Add KVM selftests to existing KVM entry
Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
KVM: PPC: Fix compilation when KVM is not enabled
KVM: Minor cleanups for kvm_main.c
KVM: s390: add debug logging for cpu model subfunctions
KVM: s390: implement subfunction processor calls
arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
KVM: arm/arm64: Remove unused timer variable
KVM: PPC: Book3S: Improve KVM reference counting
KVM: PPC: Book3S HV: Fix build failure without IOMMU support
Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
x86: kvmguest: use TSC clocksource if invariant TSC is exposed
KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
...
The current AP bus implementation periodically polls the AP configuration
to detect changes. When the AP configuration is dynamically changed via the
SE or an SCLP instruction, the changes will not be reflected to sysfs until
the next time the AP configuration is polled. The CHSC architecture
provides a Store Event Information (SEI) command to make notification of an
AP configuration change. This patch introduces a handler to process
notification from the CHSC SEI command by immediately kicking off an AP bus
scan-after-event.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Harald Freudenberger <FREUDE@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When we get an interrupt for a channel program, it is not
necessarily the final interrupt; for example, the issuing
guest may request an intermediate interrupt by specifying
the program-controlled-interrupt flag on a ccw.
We must not switch the state to idle if the interrupt is not
yet final; even more importantly, we must not free the translated
channel program if the interrupt is not yet final, or the host
can crash during cp rewind.
Fixes: e5f84dbaea ("vfio: ccw: return I/O results asynchronously")
Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Since we have a little function to see whether a channel
program address falls within a range of CCWs, let's use
it in the other places of code that make these checks.
(Why isn't ccw_head fully removed? Well, because this
way some longs lines don't have to be reflowed.)
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190222183941.29596-3-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The routine ccwchain_calc_length() is tasked with looking at a
channel program, seeing how many CCWs are chained together by
the presence of the Chain-Command flag, and returning a count
to the caller.
Previously, it also considered a Transfer-in-Channel CCW as being
an appropriate mechanism for chaining. The problem at the time
was that the TIC CCW will almost certainly not go to the next CCW
in memory (because the CC flag would be sufficient), and so
advancing to the next 8 bytes will cause us to read potentially
invalid memory. So that comparison was removed, and the target
of the TIC is processed as a new chain.
This is fine when a TIC goes to a new chain (consider a NOP+TIC to
a channel program that is being redriven), but there is another
scenario where this falls apart. A TIC can be used to "rewind"
a channel program, for example to find a particular record on a
disk with various orientation CCWs. In this case, we DO want to
consider the memory after the TIC since the TIC will be skipped
once the requested criteria is met. This is due to the Status
Modifier presented by the device, though software doesn't need to
operate on it beyond understanding the behavior change of how the
channel program is executed.
So to handle this, we will re-introduce the check for a TIC CCW
but limit it by examining the target of the TIC. If the TIC
doesn't go back into the current chain, then current behavior
applies; we should stop counting CCWs and let the target of the
TIC be handled as a new chain. But, if the TIC DOES go back into
the current chain, then we need to keep looking at the memory after
the TIC for when the channel breaks out of the TIC loop. We can't
use tic_target_chain_exists() because the chain in question hasn't
been built yet, so we will redefine that comparison with some small
functions to make it more readable and to permit refactoring later.
Fixes: 405d566f98 ("vfio-ccw: Don't assume there are more ccws after a TIC")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190222183941.29596-2-farman@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAlxYboYSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vNa0P/0XfHrBmrbfUUDLKiHje20fv8uGFiLNy
OxiCeXCqpjmYJD69UQyBtBgHksViYCgbQ1TA8Ia7BEbGrXDt8yixr2jZXQ0LMFfS
ZX1A2uvHnKhEY/yTt6gvH3bX8qB43EEZJBbrXmJGRNcZhV6TVoy51hMpaLaETeLO
3fJ9G2mzV/n/wBrPlyWcbrNW4iRzGzIdA6bWTYSgeKE3S/srMqLnCv8EPwDPhQV0
jfg0RYRgwphPi9OJ2bBuqoWExin2ScVFjW2ld+q3Lxuz5TlIu7zCVttIKqV57ubL
MXDabPREMUI/lkp3TAb7q265Bv6iOkwXyFWDQIkBkwxqlK0tgU1mzXeYlTKCVDnz
BOI1zPdU7ilzHClm9LJ9JZ5QKJCtSIwrxXlshdrItoF0dIpmhoyvQU2JltCTXumM
yCF5iUAvK0NQQPzK1pR/EwGd7ZzqFAWglKoTjN8wHgVxusLpJCCd8j/N9UZ0sjT0
djIdgbredP58xKEOGY1Tpa3C/P6lQTmy5a4xI8e8AmTHW2LuvtzKU9AO4xCswzsb
WRVDhTFtsj5LDBZmpKRHOwMbN755mSDXpUnDKGhXsm63WhaGoezTgq9XaW4UpCb7
FcowLgtw2xlWPil8rhNc6738BKWi5KCtVyOT2jlgH3vhRK/H0QGM5pPRCNY2Mtx7
VtQyUVc70JRH
=8LtS
-----END PGP SIGNATURE-----
Merge tag 'vfio-ccw-20190204' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
Pull vfio-ccw from Cornelia Huck with the following changes:
- A fix in ccw chain processing.
There is no need to use void pointers, all drivers are in agreement
about the underlying data structure of the SBAL arrays.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch implements the Set Guest Information Block operation
to request association or disassociation of a Guest Information
Block (GIB) with the Adapter Interruption Facility. The operation
is required to receive GIB alert interrupts for guest adapters
in conjunction with AIV and GISA.
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190131085247.13826-9-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When trying to calculate the length of a ccw chain, we assume
there are ccws after a TIC. This can lead to overcounting and
copying garbage data from guest memory.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Remove write permissions for fops without a write callback.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
VFIO_CCW_STATE_BOXED and VFIO_CCW_STATE_BUSY have
identical actions for the same events.
Let's merge both into a single state to simplify the code.
We choose to keep VFIO_CCW_STATE_BUSY.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <1539767923-10539-2-git-send-email-pmorel@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Direct returns from within a loop are rude, but it doesn't mean it gets
to avoid releasing the memory acquired beforehand.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20181109023937.96105-3-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If pfn_array_alloc fails somehow, we need to release the pfn_array_table
that was malloc'd earlier.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20181109023937.96105-2-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's register the mediated device when all the data structures
which could be used are initialized.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <1540487720-11634-3-git-send-email-pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fix the following sparse warning:
drivers/s390/cio/vfio_ccw_drv.c:25:19: warning: symbol 'vfio_ccw_io_region'
was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Message-Id: <alpine.LFD.2.21.1810151328570.1636@schleppi.aag-de.ibmmobiledemo.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.
The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>
@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>
[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Improved access control for the zcrypt driver, multiple device nodes
can now be created with different access control lists
- Extend the pkey API to provide random protected keys, this is useful
for encrypted swap device with ephemeral protected keys
- Add support for virtually mapped kernel stacks
- Rework the early boot code, this moves the memory detection into the
boot code that runs prior to decompression.
- Add KASAN support
- Bug fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJbzXKpAAoJEDjwexyKj9rg98YH/jZ5/kEYV44JsACroTNBC782
6QLCvoCvSgXUAqRwnIfnxrcjrUVNW2aK6rOSsI/I8rQDsSA3boJ7FimoEI2BsUZG
dcMy0hC47AYB7yKREQX3gdDEj8f0bn8v2ize5F6gwLkIx0A+aBUSivRQeYMaF8sn
N/5OkSJwjCb+ZkNmDa3SHif+hC5+iL+q1hfuBdQkeCBok9pAqhyosRkgLe8CQgUV
HGrvaWJ4FudIpg4tu2jL2OsNoZFX2pK5d+Up886+KGKQEUfiXKYtdmzX17Vd7PIk
Vkf7EWUipzIA7UtrJ6pljoFsrNa+83jm4j5Dgy0ohadCVUBYLORte3yEl4P1EoM=
=MMf0
-----END PGP SIGNATURE-----
Merge tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- Improved access control for the zcrypt driver, multiple device nodes
can now be created with different access control lists
- Extend the pkey API to provide random protected keys, this is useful
for encrypted swap device with ephemeral protected keys
- Add support for virtually mapped kernel stacks
- Rework the early boot code, this moves the memory detection into the
boot code that runs prior to decompression.
- Add KASAN support
- Bug fixes and cleanups
* tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
s390/pkey: move pckmo subfunction available checks away from module init
s390/kasan: support preemptible kernel build
s390/pkey: Load pkey kernel module automatically
s390/perf: Return error when debug_register fails
s390/sthyi: Fix machine name validity indication
s390/zcrypt: fix broken zcrypt_send_cprb in-kernel api function
s390/vmalloc: fix VMALLOC_START calculation
s390/mem_detect: add missing include
s390/dumpstack: print psw mask and address again
s390/crypto: Enhance paes cipher to accept variable length key material
s390/pkey: Introduce new API for transforming key blobs
s390/pkey: Introduce new API for random protected key verification
s390/pkey: Add sysfs attributes to emit secure key blobs
s390/pkey: Add sysfs attributes to emit protected key blobs
s390/pkey: Define protected key blob format
s390/pkey: Introduce new API for random protected key generation
s390/zcrypt: add ap_adapter_mask sysfs attribute
s390/zcrypt: provide apfs failure code on type 86 error reply
s390/zcrypt: zcrypt device driver cleanup
s390/kasan: add support for mem= kernel parameter
...
Provide function to find a ccwgroup device by its busid.
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We have two nested loops to check the entries within the pfn_array_table
arrays. But we mistakenly use the outer array as an index in our check,
and completely ignore the indexing performed by the inner loop.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20181002010235.42483-1-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If I attach a vfio-ccw device to my guest, I get the following warning
on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y
[250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)!
[250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8
[250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod
[250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2
[250757.595426] Hardware name: IBM 3906 M05 780 (LPAR)
...snip regs...
[250757.595523] Call Trace:
[250757.595529] ([<0000000000349210>] usercopy_warn+0xa8/0xd8)
[250757.595535] [<000000000032daaa>] __check_heap_object+0xfa/0x160
[250757.595540] [<0000000000349396>] __check_object_size+0x156/0x1d0
[250757.595547] [<000003ff80332d04>] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw]
[250757.595552] [<000000000034ed12>] __vfs_write+0x3a/0x188
[250757.595556] [<000000000034f040>] vfs_write+0xa8/0x1b8
[250757.595559] [<000000000034f4e6>] ksys_pwrite64+0x86/0xc0
[250757.595568] [<00000000008959a0>] system_call+0xdc/0x2b0
[250757.595570] Last Breaking-Event-Address:
[250757.595573] [<0000000000349210>] usercopy_warn+0xa8/0xd8
While vfio_ccw_mdev_{write|read} validates that the input position/count
does not run over the ccw_io_region struct, the usercopy code that does
copy_{to|from}_user doesn't necessarily know this. It sees the variable
length and gets worried that it's affecting a normal kmalloc'd struct,
and generates the above warning.
Adjust how the ccw_io_region is alloc'd with a whitelist to remove this
warning. The boundary checking will continue to do its thing.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-3-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In the event that we want to change the layout of the ccw_io_region in the
future[1], it might be easier to work with it as a pointer within the
vfio_ccw_private struct rather than an embedded struct.
[1] https://patchwork.kernel.org/comment/22228541/
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
I've stumbled over this too many times now... AOBs are only ever used on
Output Queues. So in qdio_kick_handler(), move the call to their handler
into the Output-only path, and get rid of the convoluted contains_aobs()
helper. No functional change.
While at it, also remove
1. the unused sbal_state->aob field. For processing an async completion,
upper-layer drivers get their AOB pointer from the CQ buffer.
2. an unused EXPORT for qdio_allocate_aob(). External users would have
no way of passing an allocated AOB back into qdio.ko anyways...
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
__field macro for the necessary members.
Keep the __fieldstruct macro to extract the complete
structure when dumps are analysed.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
__field macro for the necessary members.
Keep the __fieldstruct macro to extract the complete
structure when dumps are analysed.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
__field macro for the necessary members.
Keep the __fieldstruct macro to extract the complete
structure when dumps are analysed.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
the __field macro for the missing members.
Keep the __fieldstruct macro to extract the complete
structure when dumps are analysed.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
__field macro for the members adapter_IO, isc and type
of struct tpi_info.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tools like 'perf stat' parse the trace point format files defined
in /sys/kernel/debug/tracing/events/s390/.../format to handle
the print fmt: statement. The kernel provides a library in
directory linux/tools/lib/traceevent/* for this reason.
This library can not handle structures or unions defined in
the TRACE_EVENT/TP_STRUCT__entry macros with __field_struct macro.
There is no possibility to extract a structure member
(which might be a bit field) since there is no packing
information nor bit field offset by parsing the printf fmt line.
Therefore rewrite the TRACE_EVENT macro and add the
__field macro for the necessary fields.
Keep the __fieldstruct macro to extract the complete
structure when dumps are analysed.
Note that the same information is displayed, this is no
interface change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove attribute packed where possible failing this add proper alignment
information to fix warnings like the one below:
drivers/s390/cio/chsc.c: In function 'chsc_siosl':
drivers/s390/cio/chsc.c:1287:2: warning: alignment 1 of 'struct <anonymous>' is less than 4 [-Wpacked-not-aligned]
} __attribute__ ((packed)) *siosl_area;
Note: this patch should be a nop since non of these structs use auto
storage but allocated pages. However there are changes to the generated
code because of additional padding at the end of some of the structs due
to alignment when memset(foo, 0, sizeof(*foo)) is used.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>