OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Cédric Le Goater	c468bc4e84	KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting When a CPU is hot-unplugged, the EQ is deconfigured using a zero size and a zero address. In this case, there is no need to check the flag and queue size validity. Move the checks after the queue reset code section to fix CPU hot-unplug. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:37 +10:00
Cédric Le Goater	d47aacdb8e	KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released Improve the release of the XIVE KVM device by clearing the file address_space, which is used to unmap the interrupt ESB pages when a device is passed-through. Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:37 +10:00
Paul Mackerras	5a3f49364c	KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu Currently the HV KVM code takes the kvm->lock around calls to kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call kvm_for_each_vcpu() internally). However, that leads to a lock order inversion problem, because these are called in contexts where the vcpu mutex is held, but the vcpu mutexes nest within kvm->lock according to Documentation/virtual/kvm/locking.txt. Hence there is a possibility of deadlock. To fix this, we simply don't take the kvm->lock mutex around these calls. This is safe because the implementations of kvm_for_each_vcpu() and kvm_get_vcpu_by_id() have been designed to be able to be called locklessly. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:36 +10:00
Paul Mackerras	1659e27d2b	KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list Currently the Book 3S KVM code uses kvm->lock to synchronize access to the kvm->arch.rtas_tokens list. Because this list is scanned inside kvmppc_rtas_hcall(), which is called with the vcpu mutex held, taking kvm->lock cause a lock inversion problem, which could lead to a deadlock. To fix this, we add a new mutex, kvm->arch.rtas_token_lock, which nests inside the vcpu mutexes, and use that instead of kvm->lock when accessing the rtas token list. This removes the lockdep_assert_held() in kvmppc_rtas_tokens_free(). At this point we don't hold the new mutex, but that is OK because kvmppc_rtas_tokens_free() is only called when the whole VM is being destroyed, and at that point nothing can be looking up a token in the list. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:36 +10:00
Paul Mackerras	0d4ee88d92	KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup Currently the HV KVM code uses kvm->lock in conjunction with a flag, kvm->arch.mmu_ready, to synchronize MMU setup and hold off vcpu execution until the MMU-related data structures are ready. However, this means that kvm->lock is being taken inside vcpu->mutex, which is contrary to Documentation/virtual/kvm/locking.txt and results in lockdep warnings. To fix this, we add a new mutex, kvm->arch.mmu_setup_lock, which nests inside the vcpu mutexes, and is taken in the places where kvm->lock was taken that are related to MMU setup. Additionally we take the new mutex in the vcpu creation code at the point where we are creating a new vcore, in order to provide mutual exclusion with kvmppc_update_lpcr() and ensure that an update to kvm->arch.lpcr doesn't get missed, which could otherwise lead to a stale vcore->lpcr value. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:36 +10:00
Paul Mackerras	c395fe1d8e	KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions Currently, kvmppc_xive_release() and kvmppc_xive_native_release() clear kvm->arch.mmu_ready and call kick_all_cpus_sync() as a way of ensuring that no vcpus are executing in the guest. However, future patches will change the mutex associated with kvm->arch.mmu_ready to a new mutex that nests inside the vcpu mutexes, making it difficult to continue to use this method. In fact, taking the vcpu mutex for a vcpu excludes execution of that vcpu, and we already take the vcpu mutex around the call to kvmppc_xive_[native_]cleanup_vcpu(). Once the cleanup function is done and we release the vcpu mutex, the vcpu can execute once again, but because we have cleared vcpu->arch.xive_vcpu, vcpu->arch.irq_type, vcpu->arch.xive_esc_vaddr and vcpu->arch.xive_esc_raddr, that vcpu will not be going into XIVE code any more. Thus, once we have cleaned up all of the vcpus, we are safe to clean up the rest of the XIVE state, and we don't need to use kvm->arch.mmu_ready to hold off vcpu execution. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2019-05-29 13:44:36 +10:00
Eduardo Valentin	ca657468a0	Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled" This reverts commit `3e6a8fb330`. Cc: Andy Gross <agross@kernel.org> Cc: David Brown <david.brown@linaro.org> Cc: Amit Kucheria <amit.kucheria@linaro.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Suggested-by: Amit Kucheria <amit.kucheria@linaro.org> Reported-by: Andy Gross <andygro@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>	2019-05-28 19:30:33 -07:00
Saeed Mahameed	c0194e2d0e	net/mlx5e: Disable rxhash when CQE compress is enabled When CQE compression is enabled (Multi-host systems), compressed CQEs might arrive to the driver rx, compressed CQEs don't have a valid hash offload and the driver already reports a hash value of 0 and invalid hash type on the skb for compressed CQEs, but this is not good enough. On a congested PCIe, where CQE compression will kick in aggressively, gro will deliver lots of out of order packets due to the invalid hash and this might cause a serious performance drop. The only valid solution, is to disable rxhash offload at all when CQE compression is favorable (Multi-host systems). Fixes: `7219ab34f1` ("net/mlx5e: CQE compression") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:42 -07:00
wenxu	24bcd210e2	net/mlx5e: restrict the real_dev of vlan device is the same as uplink device When register indr block for vlan device, it should check the real_dev of vlan device is same as uplink device. Or it will set offload rule to mlx5e which will never hit. Fixes: `35a605db16` ("net/mlx5e: Offload TC e-switch rules with ingress VLAN device") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:42 -07:00
Parav Pandit	25fa506b70	net/mlx5: Allocate root ns memory using kzalloc to match kfree root ns is yet another fs core node which is freed using kfree() by tree_put_node(). Rest of the other fs core objects are also allocated using kmalloc variants. However, root ns memory is allocated using kvzalloc(). Hence allocate root ns memory using kzalloc(). Fixes: `2530236303` ("net/mlx5_core: Flow steering tree initialization") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:42 -07:00
Parav Pandit	9414277a5d	net/mlx5: Avoid double free in fs init error unwinding path In below code flow, for ingress acl table root ns memory leads to double free. mlx5_init_fs init_ingress_acls_root_ns() init_ingress_acl_root_ns kfree(steering->esw_ingress_root_ns); /* steering->esw_ingress_root_ns is not marked NULL / mlx5_cleanup_fs cleanup_ingress_acls_root_ns steering->esw_ingress_root_ns non NULL check passes. kfree(steering->esw_ingress_root_ns); / double free */ Similar issue exist for other tables. Hence zero out the pointers to not process the table again. Fixes: `9b93ab981e` ("net/mlx5: Separate ingress/egress namespaces for each vport") Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:41 -07:00
Parav Pandit	905f6bd30b	net/mlx5: Avoid double free of root ns in the error flow path When root ns setup for rdma, sniffer tx and sniffer rx fails, such root ns cleanup is done by the error unwinding path of mlx5_cleanup_fs(). Below call graph shows an example for sniffer_rx_root_ns. mlx5_init_fs() init_sniffer_rx_root_ns() cleanup_root_ns(steering->sniffer_rx_root_ns); mlx5_cleanup_fs() cleanup_root_ns(steering->sniffer_rx_root_ns); /* double free of sniffer_rx_root_ns */ Hence, use the existing cleanup_fs to cleanup. Fixes: `d83eb50e29` ("net/mlx5: Add support in RDMA RX steering") Fixes: `87d22483ce` ("net/mlx5: Add sniffer namespaces") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:41 -07:00
Saeed Mahameed	8788392995	net/mlx5: Fix error handling in mlx5_load() In case mlx5_core_set_hca_defaults fails, it should jump to mlx5_cleanup_fs, fix that. Fixes: `c85023e153` ("IB/mlx5: Add raw ethernet local loopback support") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-05-28 18:25:41 -07:00
Florian Fainelli	a6cd0d2d49	Documentation: net-sysfs: Remove duplicate PHY device documentation Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same duplication information. There is not currently any MDIO bus specific attribute, but there are PHY device (struct phy_device) specific attributes. Use the more precise description from sysfs-bus-mdio and carry that over to sysfs-class-net-phydev. Fixes: `86f22d04df` ("net: sysfs: Document PHY device sysfs attributes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 17:26:44 -07:00
Eric Dumazet	8fb44d60d4	llc: fix skb leak in llc_build_and_send_ui_pkt() If llc_mac_hdr_init() returns an error, we must drop the skb since no llc_build_and_send_ui_pkt() caller will take care of this. BUG: memory leak unreferenced object 0xffff8881202b6800 (size 2048): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline] [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline] [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669 [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline] [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608 [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662 [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950 [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173 [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430 [<000000008bdec225>] sock_create net/socket.c:1481 [inline] [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523 [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline] [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline] [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 BUG: memory leak unreferenced object 0xffff88811d750d00 (size 224): comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff ...$.....h+ .... backtrace: [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline] [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline] [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline] [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972 [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 17:25:23 -07:00
Stefano Brivio	73f51d151e	selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu In the pmtu_vti6_link_change_mtu test, both local and remote addresses for the vti6 tunnel are assigned to the same address given to the dummy interface that we use as encapsulating device with a known MTU. This works as long as the dummy interface is actually selected, via rt6_lookup(), as encapsulating device. But if the remote address of the tunnel is a local address too, the loopback interface could also be selected, and there's nothing wrong with it. This is what some older -stable kernels do (3.18.z, at least), and nothing prevents us from subtly changing FIB implementation to revert back to that behaviour in the future. Define an IPv6 prefix instead, and use two separate addresses as local and remote for vti6, so that the encapsulating device can't be a loopback interface. Reported-by: Xiumei Mu <xmu@redhat.com> Fixes: `1fad59ea1c` ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 17:15:06 -07:00
Gen Zhang	50fbc13dc1	dfs_cache: fix a wrong use of kfree in flush_cache_ent() In flush_cache_ent(), 'ce->ce_path' is allocated by kstrdup_const(). It should be freed by kfree_const(), rather than kfree(). Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-05-28 19:13:58 -05:00
Murphy Zhou	6457c20e33	fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free The 2nd buffer could be NULL even if iov_len is not zero. This can trigger a panic when handling symlinks. It's easy to reproduce with LTP fs_racer scripts[1] which are randomly craete/delete/link files and dirs. Fix this panic by checking if the 2nd buffer is padding before kfree, like what we do in SMB2_open_free. [1] https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/fs/racer Fixes: `2c87d6a94d` ("cifs: Allocate memory for all iovs in smb2_ioctl") Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>	2019-05-28 19:11:35 -05:00
Colin Ian King	210782038b	cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case Currently in the case where SMB2_ioctl returns the -EOPNOTSUPP error there is a memory leak of pneg_inbuf. Fix this by returning via the out_free_inbuf exit path that will perform the relevant kfree. Addresses-Coverity: ("Resource leak") Fixes: `969ae8e8d4` ("cifs: Accept validate negotiate if server return NT_STATUS_NOT_SUPPORTED") CC: Stable <stable@vger.kernel.org> # v5.1+ Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2019-05-28 19:11:35 -05:00
Ross Lagerwall	d10e0cc113	xenbus: Avoid deadlock during suspend due to open transactions During a suspend/resume, the xenwatch thread waits for all outstanding xenstore requests and transactions to complete. This does not work correctly for transactions started by userspace because it waits for them to complete after freezing userspace threads which means the transactions have no way of completing, resulting in a deadlock. This is trivial to reproduce by running this script and then suspending the VM: import pyxs, time c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus") c.connect() c.transaction() time.sleep(3600) Even if this deadlock were resolved, misbehaving userspace should not prevent a VM from being migrated. So, instead of waiting for these transactions to complete before suspending, store the current generation id for each transaction when it is started. The global generation id is incremented during resume. If the caller commits the transaction and the generation id does not match the current generation id, return EAGAIN so that they try again. If the transaction was instead discarded, return OK since no changes were made anyway. This only affects users of the xenbus file interface. In-kernel users of xenbus are assumed to be well-behaved and complete all transactions before freezing. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2019-05-28 17:32:15 -04:00
YueHaibing	41349672e3	xen/pvcalls: Remove set but not used variable Fixes gcc '-Wunused-but-set-variable' warning: drivers/xen/pvcalls-front.c: In function pvcalls_front_sendmsg: drivers/xen/pvcalls-front.c:543:25: warning: variable bedata set but not used [-Wunused-but-set-variable] drivers/xen/pvcalls-front.c: In function pvcalls_front_recvmsg: drivers/xen/pvcalls-front.c:638:25: warning: variable bedata set but not used [-Wunused-but-set-variable] They are never used since introduction. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2019-05-28 17:32:10 -04:00
Ingo Molnar	849e96f300	perf/urgent fixes: BPF: Jiri Olsa: - Fixup determination of end of kernel map, to avoid having BPF programs, that are after the kernel headers and just before module texts mixed up in the kernel map. tools UAPI header copies: Arnaldo Carvalho de Melo: - Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls. - Sync cpufeatures.h, sched.h, fs.h, drm.h, i915_drm.h and kvm.h headers. Namespaces: Namhyung Kim: - Add missing byte swap ops for namespace events when processing records from perf.data files that could have been recorded in a arch with a different endianness. - Fix access to the thread namespaces list by using the namespaces_lock. perf data: Shawn Landden: - Fix 'strncat may truncate' build failure with recent gcc. s/390 Thomas Richter: - Fix s390 missing module symbol and warning for non-root users in 'perf record'. arm64: Vitaly Chikunov: - Fix mksyscalltbl when system kernel headers are ahead of the kernel. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXO1vsQAKCRCyPKLppCJ+ J5MrAQCrxsTz1Lc6GrStrMMX72BqmoEPzoCkmONCukVJCcXeEQEAzdz4I4/CNG3g phtc030+Njnc8X5qpkR9kqSQuaPjWAk= =1Fbq -----END PGP SIGNATURE----- Merge tag 'perf-urgent-for-mingo-5.2-20190528' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes: BPF: Jiri Olsa: - Fixup determination of end of kernel map, to avoid having BPF programs, that are after the kernel headers and just before module texts mixed up in the kernel map. tools UAPI header copies: Arnaldo Carvalho de Melo: - Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls. - Sync cpufeatures.h, sched.h, fs.h, drm.h, i915_drm.h and kvm.h headers. Namespaces: Namhyung Kim: - Add missing byte swap ops for namespace events when processing records from perf.data files that could have been recorded in a arch with a different endianness. - Fix access to the thread namespaces list by using the namespaces_lock. perf data: Shawn Landden: - Fix 'strncat may truncate' build failure with recent gcc. s/390 Thomas Richter: - Fix s390 missing module symbol and warning for non-root users in 'perf record'. arm64: Vitaly Chikunov: - Fix mksyscalltbl when system kernel headers are ahead of the kernel. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-05-28 23:16:22 +02:00
Tomas Bortoli	dfb4a6f219	tracing: Avoid memory leak in predicate_parse() In case of errors, predicate_parse() goes to the out_free label to free memory and to return an error code. However, predicate_parse() does not free the predicates of the temporary prog_stack array, thence leaking them. Link: http://lkml.kernel.org/r/20190528154338.29976-1-tomasbortoli@gmail.com Cc: stable@vger.kernel.org Fixes: `80765597bc` ("tracing: Rewrite filter logic to be simpler and faster") Reported-by: syzbot+6b8e0fb820e570c59e19@syzkaller.appspotmail.com Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> [ Added protection around freeing prog_stack[i].pred ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-05-28 16:27:58 -04:00
Madalin Bucur	7aae703f80	dpaa_eth: use only online CPU portals Make sure only the portals for the online CPUs are used. Without this change, there are issues when someone boots with maxcpus=n, with n < actual number of cores available as frames either received or corresponding to the transmit confirmation path would be offered for dequeue to the offline CPU portals, getting lost. Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 11:19:55 -07:00
Jisheng Zhang	d484e06e25	net: mvneta: Fix err code path of probe Fix below issues in err code path of probe: 1. we don't need to unregister_netdev() because the netdev isn't registered. 2. when register_netdev() fails, we also need to destroy bm pool for HWBM case. Fixes: `dc35a10f68` ("net: mvneta: bm: add support for hardware buffer management") Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 11:05:00 -07:00
Thierry Reding	54ed6fd2e0	net: stmmac: Do not output error on deferred probe If the subdriver defers probe, do not show an error message. It's perfectly fine for this error to occur since the driver will get another chance to probe after some time and will usually succeed after all of the resources that it requires have been registered. Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-28 11:00:09 -07:00
Filipe Manana	06989c799f	Btrfs: fix race updating log root item during fsync When syncing the log, the final phase of a fsync operation, we need to either create a log root's item or update the existing item in the log tree of log roots, and that depends on the current value of the log root's log_transid - if it's 1 we need to create the log root item, otherwise it must exist already and we update it. Since there is no synchronization between updating the log_transid and checking it for deciding whether the log root's item needs to be created or updated, we end up with a tiny race window that results in attempts to update the item to fail because the item was not yet created: CPU 1 CPU 2 btrfs_sync_log() lock root->log_mutex set log root's log_transid to 1 unlock root->log_mutex btrfs_sync_log() lock root->log_mutex sets log root's log_transid to 2 unlock root->log_mutex update_log_root() sees log root's log_transid with a value of 2 calls btrfs_update_root(), which fails with -EUCLEAN and causes transaction abort Until recently the race lead to a BUG_ON at btrfs_update_root(), but after the recent commit `7ac1e464c4` ("btrfs: Don't panic when we can't find a root key") we just abort the current transaction. A sample trace of the BUG_ON() on a SLE12 kernel: ------------[ cut here ]------------ kernel BUG at ../fs/btrfs/root-tree.c:157! Oops: Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=2048 NUMA pSeries (...) Supported: Yes, External CPU: 78 PID: 76303 Comm: rtas_errd Tainted: G X 4.4.156-94.57-default #1 task: c00000ffa906d010 ti: c00000ff42b08000 task.ti: c00000ff42b08000 NIP: d000000036ae5cdc LR: d000000036ae5cd8 CTR: 0000000000000000 REGS: c00000ff42b0b860 TRAP: 0700 Tainted: G X (4.4.156-94.57-default) MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 22444484 XER: 20000000 CFAR: d000000036aba66c SOFTE: 1 GPR00: d000000036ae5cd8 c00000ff42b0bae0 d000000036bda220 0000000000000054 GPR04: 0000000000000001 0000000000000000 c00007ffff8d37c8 0000000000000000 GPR08: c000000000e19c00 0000000000000000 0000000000000000 3736343438312079 GPR12: 3930373337303434 c000000007a3a800 00000000007fffff 0000000000000023 GPR16: c00000ffa9d26028 c00000ffa9d261f8 0000000000000010 c00000ffa9d2ab28 GPR20: c00000ff42b0bc48 0000000000000001 c00000ff9f0d9888 0000000000000001 GPR24: c00000ffa9d26000 c00000ffa9d261e8 c00000ffa9d2a800 c00000ff9f0d9888 GPR28: c00000ffa9d26028 c00000ffa9d2aa98 0000000000000001 c00000ffa98f5b20 NIP [d000000036ae5cdc] btrfs_update_root+0x25c/0x4e0 [btrfs] LR [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] Call Trace: [c00000ff42b0bae0] [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] (unreliable) [c00000ff42b0bba0] [d000000036b53610] btrfs_sync_log+0x2d0/0xc60 [btrfs] [c00000ff42b0bce0] [d000000036b1785c] btrfs_sync_file+0x44c/0x4e0 [btrfs] [c00000ff42b0bd80] [c00000000032e300] vfs_fsync_range+0x70/0x120 [c00000ff42b0bdd0] [c00000000032e44c] do_fsync+0x5c/0xb0 [c00000ff42b0be10] [c00000000032e8dc] SyS_fdatasync+0x2c/0x40 [c00000ff42b0be30] [c000000000009488] system_call+0x3c/0x100 Instruction dump: 7f43d378 4bffebb9 60000000 88d90008 3d220000 e8b90000 3b390009 e87a01f0 e8898e08 e8f90000 4bfd48e5 60000000 <0fe00000> e95b0060 39200004 394a0ea0 ---[ end trace 8f2dc8f919cabab8 ]--- So fix this by doing the check of log_transid and updating or creating the log root's item while holding the root's log_mutex. Fixes: `7237f18336` ("Btrfs: fix tree logs parallel sync") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 19:26:46 +02:00
Filipe Manana	5338e43abb	Btrfs: fix wrong ctime and mtime of a directory after log replay When replaying a log that contains a new file or directory name that needs to be added to its parent directory, we end up updating the mtime and the ctime of the parent directory to the current time after we have set their values to the correct ones (set at fsync time), efectivelly losing them. Sample reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ touch /mnt/dir/file # fsync of the directory is optional, not needed $ xfs_io -c fsync /mnt/dir $ xfs_io -c fsync /mnt/dir/file $ stat -c %Y /mnt/dir 1557856079 <power failure> $ sleep 3 $ mount /dev/sdb /mnt $ stat -c %Y /mnt/dir 1557856082 --> should have been 1557856079, the mtime is updated to the current time when replaying the log Fix this by not updating the mtime and ctime to the current time at btrfs_add_link() when we are replaying a log tree. This could be triggered by my recent fsync fuzz tester for fstests, for which an fstests patch exists titled "fstests: generic, fsync fuzz tester with fsstress". Fixes: `e02119d5a7` ("Btrfs: Add a write ahead tree log to optimize synchronous operations") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 19:16:16 +02:00
Filipe Manana	60d9f50308	Btrfs: fix fsync not persisting changed attributes of a directory While logging an inode we follow its ancestors and for each one we mark it as logged in the current transaction, even if we have not logged it. As a consequence if we change an attribute of an ancestor, such as the UID or GID for example, and then explicitly fsync it, we end up not logging the inode at all despite returning success to user space, which results in the attribute being lost if a power failure happens after the fsync. Sample reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ chown 6007:6007 /mnt/dir $ sync $ chown 9003:9003 /mnt/dir $ touch /mnt/dir/file $ xfs_io -c fsync /mnt/dir/file # fsync our directory after fsync'ing the new file, should persist the # new values for the uid and gid. $ xfs_io -c fsync /mnt/dir <power failure> $ mount /dev/sdb /mnt $ stat -c %u:%g /mnt/dir 6007:6007 --> should be 9003:9003, the uid and gid were not persisted, despite the explicit fsync on the directory prior to the power failure Fix this by not updating the logged_trans field of ancestor inodes when logging an inode, since we have not logged them. Let only future calls to btrfs_log_inode() to mark inodes as logged. This could be triggered by my recent fsync fuzz tester for fstests, for which an fstests patch exists titled "fstests: generic, fsync fuzz tester with fsstress". Fixes: `12fcfd22fe` ("Btrfs: tree logging unlink/rename fixes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:56:50 +02:00
Qu Wenruo	57949d033a	btrfs: qgroup: Check bg while resuming relocation to avoid NULL pointer dereference [BUG] When mounting a fs with reloc tree and has qgroup enabled, it can cause NULL pointer dereference at mount time: BUG: kernel NULL pointer dereference, address: 00000000000000a8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:btrfs_qgroup_add_swapped_blocks+0x186/0x300 [btrfs] Call Trace: replace_path.isra.23+0x685/0x900 [btrfs] merge_reloc_root+0x26e/0x5f0 [btrfs] merge_reloc_roots+0x10a/0x1a0 [btrfs] btrfs_recover_relocation+0x3cd/0x420 [btrfs] open_ctree+0x1bc8/0x1ed0 [btrfs] btrfs_mount_root+0x544/0x680 [btrfs] legacy_get_tree+0x34/0x60 vfs_get_tree+0x2d/0xf0 fc_mount+0x12/0x40 vfs_kern_mount.part.12+0x61/0xa0 vfs_kern_mount+0x13/0x20 btrfs_mount+0x16f/0x860 [btrfs] legacy_get_tree+0x34/0x60 vfs_get_tree+0x2d/0xf0 do_mount+0x81f/0xac0 ksys_mount+0xbf/0xe0 __x64_sys_mount+0x25/0x30 do_syscall_64+0x65/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe [CAUSE] In btrfs_recover_relocation(), we don't have enough info to determine which block group we're relocating, but only to merge existing reloc trees. Thus in btrfs_recover_relocation(), rc->block_group is NULL. btrfs_qgroup_add_swapped_blocks() hasn't taken this into consideration, and causes a NULL pointer dereference. The bug is introduced by commit `3d0174f78e` ("btrfs: qgroup: Only trace data extents in leaves if we're relocating data block group"), and later qgroup refactoring still keeps this optimization. [FIX] Thankfully in the context of btrfs_recover_relocation(), there is no other progress can modify tree blocks, thus those swapped tree blocks pair will never affect qgroup numbers, no matter whatever we set for block->trace_leaf. So we only need to check if @bg is NULL before accessing @bg->flags. Reported-by: Juan Erbes <jerbes@gmail.com> Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1134806 Fixes: `3d0174f78e` ("btrfs: qgroup: Only trace data extents in leaves if we're relocating data block group") CC: stable@vger.kernel.org # 4.20+ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:10 +02:00
Qu Wenruo	30d40577e3	btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid BUG_ON() [BUG] When a fs has orphan reloc tree along with unfinished balance: ... item 16 key (TREE_RELOC ROOT_ITEM FS_TREE) itemoff 12090 itemsize 439 generation 12 root_dirid 256 bytenr 300400640 level 1 refs 0 <<< lastsnap 8 byte_limit 0 bytes_used 1359872 flags 0x0(none) uuid 7c48d938-33a3-4aae-ab19-6e5c9d406e46 item 17 key (BALANCE TEMPORARY_ITEM 0) itemoff 11642 itemsize 448 temporary item objectid BALANCE offset 0 balance status flags 14 Then at mount time, we can hit the following kernel BUG_ON(): BTRFS info (device dm-3): relocating block group 298844160 flags metadata\|dup ------------[ cut here ]------------ kernel BUG at fs/btrfs/relocation.c:1413! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 897 Comm: btrfs-balance Tainted: G O 5.2.0-rc1-custom #15 RIP: 0010:create_reloc_root+0x1eb/0x200 [btrfs] Call Trace: btrfs_init_reloc_root+0x96/0xb0 [btrfs] record_root_in_trans+0xb2/0xe0 [btrfs] btrfs_record_root_in_trans+0x55/0x70 [btrfs] select_reloc_root+0x7e/0x230 [btrfs] do_relocation+0xc4/0x620 [btrfs] relocate_tree_blocks+0x592/0x6a0 [btrfs] relocate_block_group+0x47b/0x5d0 [btrfs] btrfs_relocate_block_group+0x183/0x2f0 [btrfs] btrfs_relocate_chunk+0x4e/0xe0 [btrfs] btrfs_balance+0x864/0xfa0 [btrfs] balance_kthread+0x3b/0x50 [btrfs] kthread+0x123/0x140 ret_from_fork+0x27/0x50 [CAUSE] In btrfs, reloc trees are used to record swapped tree blocks during balance. Reloc tree either get merged (replace old tree blocks of its parent subvolume) in next transaction if its ref is 1 (fresh). Or is already merged and will be cleaned up if its ref is 0 (orphan). After commit `d2311e6985` ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots"), reloc tree cleanup is delayed until one block group is balanced. Since fresh reloc roots are recorded during merge, as long as there is no power loss, those orphan reloc roots converted from fresh ones are handled without problem. However when power loss happens, orphan reloc roots can be recorded on-disk, thus at next mount time, we will have orphan reloc roots from on-disk data directly, and ignored by clean_dirty_subvols() routine. Then when background balance starts to balance another block group, and needs to create new reloc root for the same root, btrfs_insert_item() returns -EEXIST, and trigger that BUG_ON(). [FIX] For orphan reloc roots, also queue them to rc->dirty_subvol_roots, so all reloc roots no matter orphan or not, can be cleaned up properly and avoid above BUG_ON(). And to cooperate with above change, clean_dirty_subvols() will check if the queued root is a reloc root or a subvol root. For a subvol root, do the old work, and for a orphan reloc root, clean it up. Fixes: `d2311e6985` ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots") CC: stable@vger.kernel.org # 5.1 Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:10 +02:00
Filipe Manana	3c850b4511	Btrfs: incremental send, fix emission of invalid clone operations When doing an incremental send we can now issue clone operations with a source range that ends at the source's file eof and with a destination range that ends at an offset smaller then the destination's file eof. If the eof of the source file is not aligned to the sector size of the filesystem, the receiver will get a -EINVAL error when trying to do the operation or, on older kernels, silently corrupt the destination file. The corruption happens on kernels without commit `ac765f83f1` ("Btrfs: fix data corruption due to cloning of eof block"), while the failure to clone happens on kernels with that commit. Example reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt/sdb $ xfs_io -f -c "pwrite -S 0xb1 0 2M" /mnt/sdb/foo $ xfs_io -f -c "pwrite -S 0xc7 0 2M" /mnt/sdb/bar $ xfs_io -f -c "pwrite -S 0x4d 0 2M" /mnt/sdb/baz $ xfs_io -f -c "pwrite -S 0xe2 0 2M" /mnt/sdb/zoo $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base $ btrfs send -f /tmp/base.send /mnt/sdb/base $ xfs_io -c "reflink /mnt/sdb/bar 1560K 500K 100K" /mnt/sdb/bar $ xfs_io -c "reflink /mnt/sdb/bar 1560K 0 100K" /mnt/sdb/zoo $ xfs_io -c "truncate 550K" /mnt/sdb/bar $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr $ btrfs send -f /tmp/incr.send -p /mnt/sdb/base /mnt/sdb/incr $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt/sdc $ btrfs receive -f /tmp/base.send /mnt/sdc $ btrfs receive -vv -f /tmp/incr.send /mnt/sdc (...) truncate bar size=563200 utimes bar clone zoo - source=bar source offset=512000 offset=0 length=51200 ERROR: failed to clone extents to zoo Invalid argument The failure happens because the clone source range ends at the eof of file bar, 563200, which is not aligned to the filesystems sector size (4Kb in this case), and the destination range ends at offset 0 + 51200, which is less then the size of the file zoo (2Mb). So fix this by detecting such case and instead of issuing a clone operation for the whole range, do a clone operation for smaller range that is sector size aligned followed by a write operation for the block containing the eof. Here we will always be pessimistic and assume the destination filesystem of the send stream has the largest possible sector size (64Kb), since we have no way of determining it. This fixes a recent regression introduced in kernel 5.2-rc1. Fixes: `040ee6120c` ("Btrfs: send, improve clone range") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:10 +02:00
Filipe Manana	6b1f72e5b8	Btrfs: incremental send, fix file corruption when no-holes feature is enabled When using the no-holes feature, if we have a file with prealloc extents with a start offset beyond the file's eof, doing an incremental send can cause corruption of the file due to incorrect hole detection. Such case requires that the prealloc extent(s) exist in both the parent and send snapshots, and that a hole is punched into the file that covers all its extents that do not cross the eof boundary. Example reproducer: $ mkfs.btrfs -f -O no-holes /dev/sdb $ mount /dev/sdb /mnt/sdb $ xfs_io -f -c "pwrite -S 0xab 0 500K" /mnt/sdb/foobar $ xfs_io -c "falloc -k 1200K 800K" /mnt/sdb/foobar $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base $ btrfs send -f /tmp/base.snap /mnt/sdb/base $ xfs_io -c "fpunch 0 500K" /mnt/sdb/foobar $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr $ btrfs send -p /mnt/sdb/base -f /tmp/incr.snap /mnt/sdb/incr $ md5sum /mnt/sdb/incr/foobar 816df6f64deba63b029ca19d880ee10a /mnt/sdb/incr/foobar $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt/sdc $ btrfs receive -f /tmp/base.snap /mnt/sdc $ btrfs receive -f /tmp/incr.snap /mnt/sdc $ md5sum /mnt/sdc/incr/foobar cf2ef71f4a9e90c2f6013ba3b2257ed2 /mnt/sdc/incr/foobar --> Different checksum, because the prealloc extent beyond the file's eof confused the hole detection code and it assumed a hole starting at offset 0 and ending at the offset of the prealloc extent (1200Kb) instead of ending at the offset 500Kb (the file's size). Fix this by ensuring we never cross the file's size when issuing the write operations for a hole. Fixes: `16e7549f04` ("Btrfs: incompatible format change to remove hole extents") CC: stable@vger.kernel.org # 3.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:10 +02:00
Dennis Zhou	fee13fe965	btrfs: correct zstd workspace manager lock to use spin_lock_bh() The btrfs zstd workspace manager uses a background timer to reclaim not recently used workspaces. I used spin_lock() from this context which should have been caught with lockdep, but was not. This deadlock was reported in bugzilla. The fix is to switch the zstd wsm lock to use spin_lock_bh() from the softirq context. This happened quite relibably on ppc64, unlike on other architectures. [ 313.402874] ================================ [ 313.402875] WARNING: inconsistent lock state [ 313.402879] 5.1.0-rc7 #1 Not tainted [ 313.402880] -------------------------------- [ 313.402882] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 313.402885] swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 313.402888] 0000000080d1120c (&(&wsm.lock)->rlock){+.?.}, at: .zstd_reclaim_timer_fn+0x40/0x230 [ 313.402895] {SOFTIRQ-ON-W} state was registered at: [ 313.402899] .lock_acquire+0xd0/0x240 [ 313.402903] ._raw_spin_lock+0x34/0x60 [ 313.402906] .zstd_get_workspace+0xd0/0x360 [ 313.402908] .end_compressed_bio_read+0x3b8/0x540 [ 313.402911] .bio_endio+0x174/0x2c0 [ 313.402914] .end_workqueue_fn+0x4c/0x70 [ 313.402917] .normal_work_helper+0x138/0x7e0 [ 313.402920] .process_one_work+0x324/0x790 [ 313.402922] .worker_thread+0x68/0x570 [ 313.402925] .kthread+0x19c/0x1b0 [ 313.402928] .ret_from_kernel_thread+0x58/0x78 [ 313.402930] irq event stamp: 2629216 [ 313.402933] hardirqs last enabled at (2629216): [<c0000000009da738>] ._raw_spin_unlock_irq+0x38/0x60 [ 313.402936] hardirqs last disabled at (2629215): [<c0000000009da4c4>] ._raw_spin_lock_irq+0x24/0x70 [ 313.402939] softirqs last enabled at (2629212): [<c0000000000af9fc>] .irq_enter+0x8c/0xd0 [ 313.402942] softirqs last disabled at (2629213): [<c0000000000afb58>] .irq_exit+0x118/0x170 [ 313.402944] other info that might help us debug this: [ 313.402945] Possible unsafe locking scenario: [ 313.402947] CPU0 [ 313.402948] ---- [ 313.402949] lock(&(&wsm.lock)->rlock); [ 313.402951] <Interrupt> [ 313.402952] lock(&(&wsm.lock)->rlock); [ 313.402954] * DEADLOCK * [ 313.402957] 1 lock held by swapper/5/0: [ 313.402958] #0: 000000004b612042 ((&wsm.timer)){+.-.}, at: .call_timer_fn+0x0/0x3c0 [ 313.402963] stack backtrace: [ 313.402967] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.1.0-rc7 #1 [ 313.402968] Call Trace: [ 313.402972] [c0000007fa262e70] [c0000000009b3294] .dump_stack+0xe0/0x15c (unreliable) [ 313.402975] [c0000007fa262f10] [c000000000125548] .print_usage_bug+0x348/0x390 [ 313.402978] [c0000007fa262fd0] [c000000000125cb4] .mark_lock+0x724/0x930 [ 313.402981] [c0000007fa263080] [c000000000126c20] .__lock_acquire+0xc90/0x16a0 [ 313.402984] [c0000007fa2631b0] [c000000000128040] .lock_acquire+0xd0/0x240 [ 313.402987] [c0000007fa263280] [c0000000009da2b4] ._raw_spin_lock+0x34/0x60 [ 313.402990] [c0000007fa263300] [c00000000054b0b0] .zstd_reclaim_timer_fn+0x40/0x230 [ 313.402993] [c0000007fa2633d0] [c000000000158b38] .call_timer_fn+0xc8/0x3c0 [ 313.402996] [c0000007fa2634a0] [c000000000158f74] .expire_timers+0x144/0x260 [ 313.402999] [c0000007fa263550] [c000000000159178] .run_timer_softirq+0xe8/0x230 [ 313.403002] [c0000007fa263680] [c0000000009db288] .__do_softirq+0x188/0x5d4 [ 313.403004] [c0000007fa263790] [c0000000000afb58] .irq_exit+0x118/0x170 [ 313.403008] [c0000007fa263800] [c000000000028d88] .timer_interrupt+0x158/0x430 [ 313.403012] [c0000007fa2638b0] [c0000000000091d4] decrementer_common+0x134/0x140 [ 313.403017] --- interrupt: 901 at replay_interrupt_return+0x0/0x4 LR = .arch_local_irq_restore.part.0+0x68/0x80 [ 313.403020] [c0000007fa263bb0] [c00000000001a3ac] .arch_local_irq_restore.part.0+0x2c/0x80 (unreliable) [ 313.403024] [c0000007fa263c30] [c0000000007bbbcc] .cpuidle_enter_state+0xec/0x670 [ 313.403027] [c0000007fa263d00] [c0000000000f5130] .call_cpuidle+0x40/0x90 [ 313.403031] [c0000007fa263d70] [c0000000000f554c] .do_idle+0x2dc/0x3a0 [ 313.403034] [c0000007fa263e30] [c0000000000f59ac] .cpu_startup_entry+0x2c/0x30 [ 313.403037] [c0000007fa263ea0] [c000000000045674] .start_secondary+0x644/0x650 [ 313.403041] [c0000007fa263f90] [c00000000000ad5c] start_secondary_prolog+0x10/0x14 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203517 Fixes: `3f93aef535` ("btrfs: add zstd compression level support") CC: stable@vger.kernel.org # 5.1+ Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:09 +02:00
Nikolay Borisov	debd1c065d	btrfs: Ensure replaced device doesn't have pending chunk allocation Recent FITRIM work, namely `bbbf7243d6` ("btrfs: combine device update operations during transaction commit") combined the way certain operations are recoded in a transaction. As a result an ASSERT was added in dev_replace_finish to ensure the new code works correctly. Unfortunately I got reports that it's possible to trigger the assert, meaning that during a device replace it's possible to have an unfinished chunk allocation on the source device. This is supposed to be prevented by the fact that a transaction is committed before finishing the replace oepration and alter acquiring the chunk mutex. This is not sufficient since by the time the transaction is committed and the chunk mutex acquired it's possible to allocate a chunk depending on the workload being executed on the replaced device. This bug has been present ever since device replace was introduced but there was never code which checks for it. The correct way to fix is to ensure that there is no pending device modification operation when the chunk mutex is acquire and if there is repeat transaction commit. Unfortunately it's not possible to just exclude the source device from btrfs_fs_devices::dev_alloc_list since this causes ENOSPC to be hit in transaction commit. Fixing that in another way would need to add special cases to handle the last writes and forbid new ones. The looped transaction fix is more obvious, and can be easily backported. The runtime of dev-replace is long so there's no noticeable delay caused by that. Reported-by: David Sterba <dsterba@suse.com> Fixes: `391cd9df81` ("Btrfs: fix unprotected alloc list insertion during the finishing procedure of replace") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-28 18:54:00 +02:00
Linus Torvalds	9fb67d643f	Pin control fixes for v5.2: - Interrupt clearing fix for the Intel pin controllers affecting touchpads on some laptops. - Compile Kconfig fix for the STMFX expander pin controller. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJc7POsAAoJEEEQszewGV1zh1sP+gLYet9hCIOdinJFznOIT6P9 ncl6clClf+mhKpNX8DRhXGF1OWxx2Us2uXjtanUA/K7GmdazUDK3b80OX/xYFMnX RumR2VcABGk8vs6hh6qD92xm5PXc/IOelM1i4k29o9TpKXffaqME7BgTCsR9gjSL G4X3JH5kPxLjy41iKskO8ipyhtocUXhL43WrXh4tGilVKYFMtvJhwUpw2/UDj4Sw ex6ZjUwlj5sjYHpB52jb4NeHqPMOgPDu7St6YakoMtfu5aR0kaTOvqRwnQWjmY6m 6/Sna3/x8oly9k77uw/ruZrmGoseYeGjrEpAEdYt2bcNM0TsudULc24vnArVG/n3 ucv4RZ4Km2M1G/EOskSyTuxomjAcyIbIJ+ZdllGsAOcyraqd2kHvVIwReOdDyLta FQWDedA1KZRRnCC7GOIVtlDIoyMS3JnxXCAvVuFW2ybqoJy4HRgCot7ZaAtpeosm Bk7Yb4zd4wIyQvxHruaaT7MLfG1mRShkeV8c4NCot/ILO5m2uJ6L55x7ta+OU/vy sd29+ww8BN1UR6x06faQpvXv6KXKJmSWFGENGq1aW3WmkmOJnkit8gvAzhTbtW1S 17ezYJLwzD0tUzj/laAzqV3gXc9XwvA1nVciwWPzm+yEaa2VlsATam8+THPQF3aU UJODDGHn9EY5yM1Q5Eyt =XtxG -----END PGP SIGNATURE----- Merge tag 'pinctrl-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "The commits that stand out are the Intel fixes that arrived during the merge window and I got relayed by pull request from Andy. Apart from that a minor Kconfig noise. - Interrupt clearing fix for the Intel pin controllers affecting touchpads on some laptops. - Compile Kconfig fix for the STMFX expander pin controller" * tag 'pinctrl-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: stmfx: Fix compile issue when CONFIG_OF_GPIO is not defined pinctrl: intel: Clear interrupt status in mask/unmask callback pinctrl: intel: Use GENMASK() consistently	2019-05-28 09:35:04 -07:00
Linus Torvalds	ca6584a331	GPIO fixes for v5.2: - Fix a build error in gpio-adp5588 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJc7PG1AAoJEEEQszewGV1zsqMP/3t7ajMMuDSNO8EaBeYD7XiK 3hqbyKjh81r83Zc3xQh7S204LQnuasSrkAzsmxcqGEX9a/UVoH4yOk0PTdeS53OH tCfmzg0sGpYj7bLaJtOXDKQDQW1op4OqhHFg8nE6U583EDsafn+oKLO9T4IOTTyz fyecVD/8QWs9fyiSoOSMjXWYLHJMCMzdeiV+84r8CdWK7MWI4RmfRe0ThhPA6ARM DhvqFwO4rwc3JoxLh85Ag+m3X0JkWTboLeaadRP2SWzXcfmuguwhUnkReWh3LPNX qqHPkMCiNST3Hag27oxg5ygioVoCz8SwoptzOiyer00JVQNsCsoFNCxlJS3kBTov AHJFr452Bu7kOCu2R2XMnFzHR0NVY0WIYoimRaTd6dMlIarsK4zHSbX4XKy0jvc2 JpVUCz6SCdzrwc1G/udaj1JEbDEdLWgUEdr3QcX2yMFXpjfXCixCH6hxxCNPxQdO Nkk2rFHJ/I5TCpZovKbzYYUg9gjVAKnQKVchjjRPu/QPvcr4KbNwABRrRtvoWHgz aA2jMCNMohm6uDq6dnkLIHs7V+i5SwX3lHEVJWJRL17yjeppXaLjOjn0dapXjRMd FVHezaHwxVSLDFEI9E5Eh6aF93j5PwyoF2VyeZyNjbg+AHmDX0fQBJg4lM6KeehL eqjSQQypf2lIAzdxfs8q =eby+ -----END PGP SIGNATURE----- Merge tag 'gpio-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "Fix a build error in gpio-adp5588" * tag 'gpio-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: fix gpio-adp5588 build errors	2019-05-28 09:33:05 -07:00
Randy Dunlap	9a626c4a63	ia64: fix build errors by exporting paddr_to_nid() Fix build errors on ia64 when DISCONTIGMEM=y and NUMA=y by exporting paddr_to_nid(). Fixes these build errors: ERROR: "paddr_to_nid" [sound/core/snd-pcm.ko] undefined! ERROR: "paddr_to_nid" [net/sunrpc/sunrpc.ko] undefined! ERROR: "paddr_to_nid" [fs/cifs/cifs.ko] undefined! ERROR: "paddr_to_nid" [drivers/video/fbdev/core/fb.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/mon/usbmon.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/core/usbcore.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/raid1.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-crypt.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-bufio.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-core.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-cd_mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/gpu/drm/drm.ko] undefined! ERROR: "paddr_to_nid" [drivers/char/agp/agpgart.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/nbd.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/loop.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/brd.ko] undefined! ERROR: "paddr_to_nid" [crypto/ccm.ko] undefined! Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-05-28 09:17:03 -07:00
Ard Biesheuvel	3fd00beb14	arm64/module: revert to unsigned interpretation of ABS16/32 relocations Commit `1cf24a2cc3` ("arm64/module: deal with ambiguity in PRELxx relocation ranges") updated the overflow checking logic in the relocation handling code to ensure that PREL16/32 relocations don't overflow signed quantities. However, the same code path is used for absolute relocations, where the interpretation is the opposite: the only current use case for absolute relocations operating on non-native word size quantities is the CRC32 handling in the CONFIG_MODVERSIONS code, and these CRCs are unsigned 32-bit quantities, which are now being rejected by the module loader if bit 31 happens to be set. So let's use different ranges for quanties subject to absolute vs. relative relocations: - ABS16/32 relocations should be in the range [0, Uxx_MAX) - PREL16/32 relocations should be in the range [Sxx_MIN, Sxx_MAX) - otherwise, print an error since no other 16 or 32 bit wide data relocations are currently supported. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2019-05-28 15:15:53 +01:00
Thomas Huth	a86cb413f4	KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID KVM_CAP_MAX_VCPU_ID is currently always reporting KVM_MAX_VCPU_ID on all architectures. However, on s390x, the amount of usable CPUs is determined during runtime - it is depending on the features of the machine the code is running on. Since we are using the vcpu_id as an index into the SCA structures that are defined by the hardware (see e.g. the sca_add_vcpu() function), it is not only the amount of CPUs that is limited by the hard- ware, but also the range of IDs that we can use. Thus KVM_CAP_MAX_VCPU_ID must be determined during runtime on s390x, too. So the handling of KVM_CAP_MAX_VCPU_ID has to be moved from the common code into the architecture specific code, and on s390x we have to return the same value here as for KVM_CAP_MAX_VCPUS. This problem has been discovered with the kvm_create_max_vcpus selftest. With this change applied, the selftest now passes on s390x, too. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190523164309.13345-9-thuth@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-05-28 15:52:19 +02:00
Christian Borntraeger	eb1f2f387d	kvm: fix compile on s390 part 2 We also need to fence the memunmap part. Fixes: `e45adf665a` ("KVM: Introduce a new guest mapping API") Fixes: `d30b214d1d` (kvm: fix compilation on s390) Cc: Michal Kubecek <mkubecek@suse.cz> Cc: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-05-28 15:51:52 +02:00
Arnaldo Carvalho de Melo	a7350998a2	tools headers UAPI: Sync kvm.h headers with the kernel sources `dd53f6102c` ("Merge tag 'kvmarm-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD") `59c5c58c5b` ("Merge tag 'kvm-ppc-next-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD") `d7547c55cb` ("KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2") `6520ca64cd` ("KVM: PPC: Book3S HV: XIVE: Add a mapping for the source ESB pages") `39e9af3de5` ("KVM: PPC: Book3S HV: XIVE: Add a TIMA mapping") `e4945b9da5` ("KVM: PPC: Book3S HV: XIVE: Add get/set accessors for the VP XIVE state") `e6714bd167` ("KVM: PPC: Book3S HV: XIVE: Add a control to dirty the XIVE EQ pages") `7b46b6169a` ("KVM: PPC: Book3S HV: XIVE: Add a control to sync the sources") `5ca8064748` ("KVM: PPC: Book3S HV: XIVE: Add a global reset control") `13ce3297c5` ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration") `e8676ce50e` ("KVM: PPC: Book3S HV: XIVE: Add a control to configure a source") `4131f83c3d` ("KVM: PPC: Book3S HV: XIVE: add a control to initialize a source") `eacc56bb9d` ("KVM: PPC: Book3S HV: XIVE: Introduce a new capability KVM_CAP_PPC_IRQ_XIVE") `90c73795af` ("KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation mode") `4f45b90e1c` ("KVM: s390: add deflate conversion facilty to cpu model") `a243c16d18` ("KVM: arm64: Add capability to advertise ptrauth for guest") `a22fa321d1` ("KVM: arm64: Add userspace flag to enable pointer authentication") `4bd774e57b` ("KVM: arm64/sve: Simplify KVM_REG_ARM64_SVE_VLS array sizing") `8ae6efdde4` ("KVM: arm64/sve: Clean up UAPI register ID definitions") `173aec2d5a` ("KVM: s390: add enhanced sort facilty to cpu model") `555f3d03e7` ("KVM: arm64: Add a capability to advertise SVE support") `9033bba4b5` ("KVM: arm64/sve: Add pseudo-register for the guest's vector lengths") `7dd32a0d01` ("KVM: arm/arm64: Add KVM_ARM_VCPU_FINALIZE ioctl") `e1c9c98345` ("KVM: arm64/sve: Add SVE support to register access ioctl interface") `2b953ea348` ("KVM: Allow 2048-bit register access via ioctl interface") None entails changes in tooling, the closest to that were some new arch specific ioctls, that are still not handled by the tools/perf/trace/beauty/ library, that needs to create per-arch tables to convert ioctl cmd->string (and back). From a quick look the arch specific kvm-stat.c files at: $ ls -1 tools/perf/arch/*/util/kvm-stat.c tools/perf/arch/powerpc/util/kvm-stat.c tools/perf/arch/s390/util/kvm-stat.c tools/perf/arch/x86/util/kvm-stat.c $ Are not affected. This silences these perf building warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/kvm.h' differs from latest version at 'arch/powerpc/include/uapi/asm/kvm.h' diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h Warning: Kernel ABI header at 'tools/arch/s390/include/uapi/asm/kvm.h' differs from latest version at 'arch/s390/include/uapi/asm/kvm.h' diff -u tools/arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm.h Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h' diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cédric Le Goater <clg@kaod.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Peter Xu <peterx@redhat.com> Link: https://lkml.kernel.org/n/tip-3msmqjenlmb7eygcdnmlqaq1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Thomas Richter	6738028dd5	perf record: Fix s390 missing module symbol and warning for non-root users Command 'perf record' and 'perf report' on a system without kernel debuginfo packages uses /proc/kallsyms and /proc/modules to find addresses for kernel and module symbols. On x86 this works for root and non-root users. On s390, when invoked as non-root user, many of the following warnings are shown and module symbols are missing: proc/{kallsyms,modules} inconsistency while looking for "[sha1_s390]" module! Command 'perf record' creates a list of module start addresses by parsing the output of /proc/modules and creates a PERF_RECORD_MMAP record for the kernel and each module. The following function call sequence is executed: machine__create_kernel_maps machine__create_module modules__parse machine__create_module --> for each line in /proc/modules arch__fix_module_text_start Function arch__fix_module_text_start() is s390 specific. It opens file /sys/module/<name>/sections/.text to extract the module's .text section start address. On s390 the module loader prepends a header before the first section, whereas on x86 the module's text section address is identical the the module's load address. However module section files are root readable only. For non-root the read operation fails and machine__create_module() returns an error. Command perf record does not generate any PERF_RECORD_MMAP record for loaded modules. Later command perf report complains about missing module maps. To fix this function arch__fix_module_text_start() always returns success. For root users there is no change, for non-root users the module's load address is used as module's text start address (the prepended header then counts as part of the text section). This enable non-root users to use module symbols and avoid the warning when perf report is executed. Output before: [tmricht@m83lp54 perf]$ ./perf report -D \| fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text Output after: [tmricht@m83lp54 perf]$ ./perf report -D \| fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text 0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz 0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz 0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Jiri Olsa	ed9adb2035	perf machine: Read also the end of the kernel We mark the end of kernel based on the first module, but that could cover some bpf program maps. Reading _etext symbol if it's present to get precise kernel map end. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo	93f678b9ae	perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms No need to search for aliases for the symbol that marks the end of the kernel text segment, the following patch will make such symbols not to be found when searching in the kallsyms maps causing this test to fail. So as a prep patch to avoid breaking bisection, ignore such symbols. Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lkml.kernel.org/n/tip-qfwuih8cvmk9doh7k5k244eq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Namhyung Kim	acd244b84b	perf session: Add missing swap ops for namespace events In case it's recorded in a different arch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Fixes: `f3b3614a28` ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: http://lkml.kernel.org/r/20190522053250.207156-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Namhyung Kim	6584140ba9	perf namespace: Protect reading thread's namespace It seems that the current code lacks holding the namespace lock in thread__namespaces(). Otherwise it can see inconsistent results. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:23 -03:00
Arnaldo Carvalho de Melo	9903c64f0f	tools headers UAPI: Sync drm/drm.h with the kernel To pick up the changes in these csets: `060cebb20c` ("drm: introduce a capability flag for syncobj timeline support") `50d1ebef79` ("drm/syncobj: add timeline signal ioctl for syncobj v5") `ea569910cb` ("drm/syncobj: add transition iotcls between binary and timeline v2") `27b575a9aa` ("drm/syncobj: add timeline payload query ioctl v6") `01d6c35783` ("drm/syncobj: add support for timeline point wait v8") `783195ec1c` ("drm/syncobj: disable the timeline UAPI for now v2") `48197bc564` ("drm: add syncobj timeline support v9") Which automagically results in the following new ioctls being recognized by the 'perf trace' ioctl cmd arg beautifier: $ tools/perf/trace/beauty/drm_ioctl.sh > /tmp/before $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h $ tools/perf/trace/beauty/drm_ioctl.sh > /tmp/after $ diff -u /tmp/before /tmp/after --- /tmp/before 2019-05-22 10:25:31.443151182 -0300 +++ /tmp/after 2019-05-22 10:25:46.449354819 -0300 @@ -103,6 +103,10 @@ [0xC7] = "MODE_LIST_LESSEES", [0xC8] = "MODE_GET_LEASE", [0xC9] = "MODE_REVOKE_LEASE", + [0xCA] = "SYNCOBJ_TIMELINE_WAIT", + [0xCB] = "SYNCOBJ_QUERY", + [0xCC] = "SYNCOBJ_TRANSFER", + [0xCD] = "SYNCOBJ_TIMELINE_SIGNAL", [DRM_COMMAND_BASE + 0x00] = "I915_INIT", [DRM_COMMAND_BASE + 0x01] = "I915_FLUSH", [DRM_COMMAND_BASE + 0x02] = "I915_FLIP", $ I.e. the strace like raw_tracepoint:sys_enter handler in 'perf trace' will get the cmd integer value and map it to the string. At some point it should be possible to translate from string to integer and use to filter using expressions such as: # perf trace -e ioctl/cmd==DRM_IOCTL_SYNCOBJ*/ Or some more suitable syntax to express that only these ioctls when acting on DRM fds should be shown. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Christian König <christian.koenig@amd.com> Cc: Chunming Zhou <david1.zhou@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-jrc9ogw33w4zgqc3pu7o1l3g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-28 09:52:11 -03:00
Harald Freudenberger	bef9f0ba30	s390/crypto: fix gcm-aes-s390 selftest failures The current kernel uses improved crypto selftests. These tests showed that the current implementation of gcm-aes-s390 is not able to deal with chunks of output buffers which are not a multiple of 16 bytes. This patch introduces a rework of the gcm aes s390 scatter walk handling which now is able to handle any input and output scatter list chunk sizes correctly. Code has been verified by the crypto selftests, the tcrypt kernel module and additional tests ran via the af_alg interface. Cc: <stable@vger.kernel.org> Reported-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Patrick Steuer <steuer@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-05-28 14:49:50 +02:00
Harald Freudenberger	7379e65279	s390/zcrypt: Fix wrong dispatching for control domain CPRBs The zcrypt device driver does not handle CPRBs which address a control domain correctly. This fix introduces a workaround: The domain field of the request CPRB is checked if there is a valid domain value in there. If this is true and the value is a control only domain (a domain which is enabled in the crypto config ADM mask but disabled in the AQM mask) the CPRB is forwarded to the default usage domain. If there is no default domain, the request is rejected with an ENODEV. This fix is important for maintaining crypto adapters. For example one LPAR can use a crypto adapter domain ('Control and Usage') but another LPAR needs to be able to maintain this adapter domain ('Control'). Scenarios like this did not work properly and the patch enables this. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-05-28 14:49:38 +02:00

... 4 5 6 7 8 ...

840488 Commits All Branches Search

840488 Commits

All Branches