linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Dongli Zhang	00b368502d	xen-netfront: do not assume sk_buff_head list is empty in error handling When skb_shinfo(skb) is not able to cache extra fragment (that is, skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS), xennet_fill_frags() assumes the sk_buff_head list is already empty. As a result, cons is increased only by 1 and returns to error handling path in xennet_poll(). However, if the sk_buff_head list is not empty, queue->rx.rsp_cons may be set incorrectly. That is, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. Below is how xennet_poll() does error handling. All remaining entries in tmpq are accounted to queue->rx.rsp_cons without assuming how many outstanding skbs are remained in the list. 985 static int xennet_poll(struct napi_struct *napi, int budget) ... ... 1032 if (unlikely(xennet_set_skb_gso(skb, gso))) { 1033 __skb_queue_head(&tmpq, skb); 1034 queue->rx.rsp_cons += skb_queue_len(&tmpq); 1035 goto err; 1036 } It is better to always have the error handling in the same way. Fixes: `ad4f15dc2c` ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:46:22 +02:00
Sameeh Jubran	a53651ec93	net: ena: don't wake up tx queue when down There is a race condition that can occur when calling ena_down(). The ena_clean_tx_irq() - which is a part of the napi handler - function might wake up the tx queue when the queue is supposed to be down (during recovery or changing the size of the queues for example) This causes the ena_start_xmit() function to trigger and possibly try to access the destroyed queues. The race is illustrated below: Flow A: Flow B(napi handler) ena_down() netif_carrier_off() netif_tx_disable() ena_clean_tx_irq() netif_tx_wake_queue() ena_napi_disable_all() ena_destroy_all_io_queues() After these flows the tx queue is active and ena_start_xmit() accesses the destroyed queue which leads to a kernel panic. fixes: `1738cd3ed3` (net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)) Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 21:40:18 +02:00
Rain River	81e09359b4	MAINTAINERS: update FORCEDETH MAINTAINERS info Many FORCEDETH NICs are used in our hosts. Several bugs are fixed and some features are developed for FORCEDETH NICs. And I have been reviewing patches for FORCEDETH NIC for several months. Mark me as the FORCEDETH NIC maintainer. I will send out the patches and maintain FORCEDETH NIC. Signed-off-by: Rain River <rain.1986.08.12@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 09:16:08 +02:00
Paul Durrant	655e023ed4	MAINTAINERS: xen-netback: update my email address My Citrix email address will expire shortly. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wl@xen.org> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 09:13:41 +02:00
Jose Abreu	19e13cb27b	net: stmmac: Hold rtnl lock in suspend/resume callbacks We need to hold rnl lock in suspend and resume callbacks because phylink requires it. Otherwise we will get a WARN() in suspend and resume. Also, move phylink start and stop callbacks to inside device's internal lock so that we prevent concurrent HW accesses. Fixes: `74371272f9` ("net: stmmac: Convert to phylink and remove phylib logic") Reported-by: Christophe ROULLIER <christophe.roullier@st.com> Tested-by: Christophe ROULLIER <christophe.roullier@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 09:11:09 +02:00
Xin Long	28e4860377	ip6_gre: fix a dst leak in ip6erspan_tunnel_xmit In ip6erspan_tunnel_xmit(), if the skb will not be sent out, it has to be freed on the tx_err path. Otherwise when deleting a netns, it would cause dst/dev to leak, and dmesg shows: unregister_netdevice: waiting for lo to become free. Usage count = 1 Fixes: `ef7baf5e08` ("ip6_gre: add ip6 erspan collect_md mode") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 09:08:18 +02:00
Willem de Bruijn	acdcecc612	udp: correct reuseport selection with connected sockets UDP reuseport groups can hold a mix unconnected and connected sockets. Ensure that connections only receive all traffic to their 4-tuple. Fast reuseport returns on the first reuseport match on the assumption that all matches are equal. Only if connections are present, return to the previous behavior of scoring all sockets. Record if connections are present and if so (1) treat such connected sockets as an independent match from the group, (2) only return 2-tuple matches from reuseport and (3) do not return on the first 2-tuple reuseport match to allow for a higher scoring match later. New field has_conns is set without locks. No other fields in the bitmap are modified at runtime and the field is only ever set unconditionally, so an RMW cannot miss a change. Fixes: `e32ea7e747` ("soreuseport: fast reuseport UDP socket selection") Link: http://lkml.kernel.org/r/CA+FuTSfRP09aJNYRt04SS6qj22ViiOEWaWmLAwX0psk8-PGNxw@mail.gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-16 09:02:18 +02:00
Gerd Rausch	05a82481a3	net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names' All entries in 'rds_ib_stat_names' are stringified versions of the corresponding "struct rds_ib_statistics" element without the "s_"-prefix. Fix entry 'ib_evt_handler_call' to do the same. Fixes: `f4f943c958` ("RDS: IB: ack more receive completions to improve performance") Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-15 20:56:19 +02:00
Cong Wang	6efb971ba8	net_sched: let qdisc_put() accept NULL pointer When tcf_block_get() fails in sfb_init(), q->qdisc is still a NULL pointer which leads to a crash in sfb_destroy(). Similar for sch_dsmark. Instead of fixing each separately, Linus suggested to just accept NULL pointer in qdisc_put(), which would make callers easier. (For sch_dsmark, the bug probably exists long before commit 6529eaba33f0.) Fixes: `6529eaba33` ("net: sched: introduce tcf block infractructure") Reported-by: syzbot+d5870a903591faaca4ae@syzkaller.appspotmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-15 20:54:14 +02:00
Andrew Lunn	23426a25e5	net: dsa: Fix load order between DSA drivers and taggers The DSA core, DSA taggers and DSA drivers all make use of module_init(). Hence they get initialised at device_initcall() time. The ordering is non-deterministic. It can be a DSA driver is bound to a device before the needed tag driver has been initialised, resulting in the message: No tagger for this switch Rather than have this be fatal, return -EPROBE_DEFER so that it is tried again later once all the needed drivers have been loaded. Fixes: `d3b8c04988` ("dsa: Add boilerplate helper to register DSA tag driver modules") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-15 20:49:19 +02:00
Paolo Abeni	d518d2ed86	net/sched: fix race between deactivation and dequeue for NOLOCK qdisc The test implemented by some_qdisc_is_busy() is somewhat loosy for NOLOCK qdisc, as we may hit the following scenario: CPU1 CPU2 // in net_tx_action() clear_bit(__QDISC_STATE_SCHED...); // in some_qdisc_is_busy() val = (qdisc_is_running(q) \|\| test_bit(__QDISC_STATE_SCHED, &q->state)); // here val is 0 but... qdisc_run(q) // ... CPU1 is going to run the qdisc next As a conseguence qdisc_run() in net_tx_action() can race with qdisc_reset() in dev_qdisc_reset(). Such race is not possible for !NOLOCK qdisc as both the above bit operations are under the root qdisc lock(). After commit `021a17ed79` ("pfifo_fast: drop unneeded additional lock on dequeue") the race can cause use after free and/or null ptr dereference, but the root cause is likely older. This patch addresses the issue explicitly checking for deactivation under the seqlock for NOLOCK qdisc, so that the qdisc_run() in the critical scenario becomes a no-op. Note that the enqueue() op can still execute concurrently with dev_qdisc_reset(), but that is safe due to the skb_array() locking, and we can't avoid that for NOLOCK qdiscs. Fixes: `021a17ed79` ("pfifo_fast: drop unneeded additional lock on dequeue") Reported-by: Li Shuang <shuali@redhat.com> Reported-and-tested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-15 20:35:55 +02:00
Linus Torvalds	1609d7604b	The main change here is a revert of reverts. We recently simplified some code that was thought unnecessary; however, since then KVM has grown quite a few cond_resched()s and for that reason the simplified code is prone to livelocks---one CPUs tries to empty a list of guest page tables while the others keep adding to them. This adds back the generation-based zapping of guest page tables, which was not unnecessary after all. On top of this, there is a fix for a kernel memory leak and a couple of s390 fixlets as well. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJdfJcxAAoJEL/70l94x66D8ocH/jhz+95+TBNF5j0xGmBWiDbH zQlWmEMkAQ8o66J+503bc2ltQhM8078Ohumgtmq8HFaRgctDiRdjLBcce6aOr4tH 09gBdlWgkVWLhN8AhydSbHh+SLcCWdZQSAWQrvt1aM2BRdz5WECkUdauSL2oHsTP P58318EL0JrqQaQZtK4qI4eNDiYWdKq2luMu9BTYm3f1Hnk28gorErgBehScYuxE LlnM4RYedH54UR8opdUDmhHxO7bGTW4SVz4obq0sSOJs190DuVbCxbaJjhP+tSwk 7hPac3tHoYFUIhVtgGD3N/LrsFgvcShmjawLP0szCrR2sWrtTqmb0R63DLxpmyg= =yuZ9 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "The main change here is a revert of reverts. We recently simplified some code that was thought unnecessary; however, since then KVM has grown quite a few cond_resched()s and for that reason the simplified code is prone to livelocks---one CPUs tries to empty a list of guest page tables while the others keep adding to them. This adds back the generation-based zapping of guest page tables, which was not unnecessary after all. On top of this, there is a fix for a kernel memory leak and a couple of s390 fixlets as well" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot KVM: x86: work around leak of uninitialized stack contents KVM: nVMX: handle page fault in vmread KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()	2019-09-14 16:07:40 -07:00
Linus Torvalds	1f9c632cde	virtio: a last minute revert 32 bit build got broken by the latest defence in depth patch. Revert and we'll try again in the next cycle. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJdfT6AAAoJECgfDbjSjVRpOrUH/09N3JI/VSIvH5y75UYAU1pc nDrekWI7TLYYIKJCUwfcxIzxskw1EOENxSFNRAtkKyRKZtq7HmhWOBec4NdnHkvy ze1Cbt2aqQiMfbxJYStri/AYNSpC+aTttFSDAMm4TfE+QxfEjumO3J/HSRk0RYdt leGziB4H2BjsZM/2JpMD9UFIq/D9SeGZTwd2sZTCTQ+RIvYEN2hGUXoG5hYl/xv6 +g6/Rkj/eAHqilpvyAt2PWFxslqLsQhWt/S/PHa61HpQLuPBCBYAu38O6X0vD93F 8vNQfcedz4qpyyHdfaGB+jquZ800BxqaUvdLdyAdQJXXGaytopAYzncEa1iPyyg= =s4iF -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fix from Michael Tsirkin: "A last minute revert The 32-bit build got broken by the latest defence in depth patch. Revert and we'll try again in the next cycle" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: Revert "vhost: block speculation of translated descriptors"	2019-09-14 16:02:49 -07:00
Linus Torvalds	b03c036e6f	Urgent RISC-V fix for v5.3 Last week, Palmer and I learned that there was an error in the RISC-V kernel image header format that could make it less compatible with the ARM64 kernel image header format. I had missed this error during my original reviews of the patch. The kernel image header format is an interface that impacts bootloaders, QEMU, and other user tools. Those packages must be updated to align with whatever is merged in the kernel. We would like to avoid proliferating these image formats by keeping the RISC-V header as close as possible to the existing ARM64 header. Since the arch/riscv patch that adds support for the image header was merged with our v5.3-rc1 pull request as commit `0f327f2aaa` ("RISC-V: Add an Image header that boot loader can parse."), we think it wise to try to fix this error before v5.3 is released. The fix itself should be backwards-compatible with any project that has already merged support for premature versions of this interface. It primarily involves ensuring that the RISC-V image header has something useful in the same field as the ARM64 image header. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl1870sACgkQx4+xDQu9 Kkt33A//SG4fTIyz0rIGTZpJPKV3nXacBq6XvOFxFsHRHlEvD2f/JkSK1Ab+hV5R vmTkVGCSCVz1C/OEA+KWsjuiJEglII6eOLIRqST1Wm6KumwAwLc78xdgEb1Sm/SC E7OTYtSqbUjCqzzD1BFcXfbP4mGF/9IBjWI3OcCnb1UcLuL29Mt35gvxI9fF1FB6 +EU96MBQbk4gVUYjKXObTvaAZwWIYrMkOFQmdRgb4jqk42i0hLmKx//WkI1Ajlp8 FDjE2nIo2NAt0N7pImJ/QtqxkOsQjMtOyOscoTyhB4eGJW0+fTyVrt6FpUdYQDQq vZI/WS2RFYUi2wfj+JNQ959MgsWZZ8z21KbFWwR0HC4k2xRZaxCO48g/VweJA/QW 3f6+CMxYgwF5KzToHvUjlo0wNMW2Xo/FX9bky3gb8rJPWnSx9uu9lfoh17FUD4Ty cEknaLtmMALA8Lgr8hwTKbZLg7J1ih5r1SPj0UvjpjEmwDUl2doA0EONuuBroEHM KDerGitg6D0g4B4VlGsHuLMd6Gj/5r2teno97tPoaf5J9mCZ1v2/Q5OL0QwBYd84 5cp+Ox1aQTY6SJq8gftBOD3MmW2lKCC5tT6H0bJvKBAE7tJaLPv5YIj6dp1jfXKB klzJUdGRsL60EwlL/cbFOurDfhBeQlq8akdzG5Cg5e8q+mISSTE= =Jt6U -----END PGP SIGNATURE----- Merge tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Paul Walmsley: "Last week, Palmer and I learned that there was an error in the RISC-V kernel image header format that could make it less compatible with the ARM64 kernel image header format. I had missed this error during my original reviews of the patch. The kernel image header format is an interface that impacts bootloaders, QEMU, and other user tools. Those packages must be updated to align with whatever is merged in the kernel. We would like to avoid proliferating these image formats by keeping the RISC-V header as close as possible to the existing ARM64 header. Since the arch/riscv patch that adds support for the image header was merged with our v5.3-rc1 pull request as commit `0f327f2aaa` ("RISC-V: Add an Image header that boot loader can parse."), we think it wise to try to fix this error before v5.3 is released. The fix itself should be backwards-compatible with any project that has already merged support for premature versions of this interface. It primarily involves ensuring that the RISC-V image header has something useful in the same field as the ARM64 image header" * tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: modify the Image header to improve compatibility with the ARM64 header	2019-09-14 15:58:02 -07:00
Michael S. Tsirkin	0d4a3f2abb	Revert "vhost: block speculation of translated descriptors" This reverts commit `a89db445fb`. I was hasty to include this patch, and it breaks the build on 32 bit. Defence in depth is good but let's do it properly. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-09-14 15:21:51 -04:00
Linus Torvalds	36024fcf8d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Don't corrupt xfrm_interface parms before validation, from Nicolas Dichtel. 2) Revert use of usb-wakeup in btusb, from Mario Limonciello. 3) Block ipv6 packets in bridge netfilter if ipv6 is disabled, from Leonardo Bras. 4) IPS_OFFLOAD not honored in ctnetlink, from Pablo Neira Ayuso. 5) Missing ULP check in sock_map, from John Fastabend. 6) Fix receive statistic handling in forcedeth, from Zhu Yanjun. 7) Fix length of SKB allocated in 6pack driver, from Christophe JAILLET. 8) ip6_route_info_create() returns an error pointer, not NULL. From Maciej Żenczykowski. 9) Only add RDS sock to the hashes after rs_transport is set, from Ka-Cheong Poon. 10) Don't double clean TX descriptors in ixgbe, from Ilya Maximets. 11) Presence of transmit IPSEC offload in an SKB is not tested for correctly in ixgbe and ixgbevf. From Steffen Klassert and Jeff Kirsher. 12) Need rcu_barrier() when register_netdevice() takes one of the notifier based failure paths, from Subash Abhinov Kasiviswanathan. 13) Fix leak in sctp_do_bind(), from Mao Wenan. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) cdc_ether: fix rndis support for Mediatek based smartphones sctp: destroy bucket if failed to bind addr sctp: remove redundant assignment when call sctp_get_port_local sctp: change return type of sctp_get_port_local ixgbevf: Fix secpath usage for IPsec Tx offload sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()' ixgbe: Fix secpath usage for IPsec TX offload. net: qrtr: fix memort leak in qrtr_tun_write_iter net: Fix null de-reference of device refcount ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()' tun: fix use-after-free when register netdev failed tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR ixgbe: fix double clean of Tx descriptors with xdp ixgbe: Prevent u8 wrapping of ITR value to something less than 10us mlx4: fix spelling mistake "veify" -> "verify" net: hns3: fix spelling mistake "undeflow" -> "underflow" net: lmc: fix spelling mistake "runnin" -> "running" NFC: st95hf: fix spelling mistake "receieve" -> "receive" net/rds: An rds_sock is added too early to the hash table mac80211: Do not send Layer 2 Update frame before authorization ...	2019-09-14 12:20:38 -07:00
Linus Torvalds	1c4c5e2528	MMC host: - tmio: Fixup runtime PM management during probe and remove - sdhci-pci-o2micro: Fix eMMC initialization for an AMD SoC - bcm2835: Prevent lockups when terminating work -----BEGIN PGP SIGNATURE----- iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAl17g5AXHHVsZi5oYW5z c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmNeBAAzuHgSqF6V2xhALjFBJaf2Flz ydwYv3m05lhF5n7GHfzg8CFhVRGJfMNTag4OzamzjIcf+MwtD43JODvQy/2fEP14 7xKUTf4cRyR3beGVIxvdl1teUK1aCSLnCQMG2b4iRmfBqQiJbfqSAq98oDndhNY6 Q50sJNBPUYwUsGUVRY+fR5D+m3F5cd3mpDAmKQr+KRI2vHYY1lXRI1PuYoXBM8mX BaRuE8gVPcVejpuhjS1EBgzDWxquYVOAzswYier4CywtC32+HMH9gg1+oGf9/Oyv 9WvEZBfLQtDpzTAhUaPA7BFjCV2vk7zLaM4E9zcUPdBsUWMcc+jDGzdoZ4T1FFfA dzPyoZew2+96vq0oeFLIwJhTmzxCP00GSEjfPxk3i1CCCY4Zh6u6TUhlttShqeei huBoBZTMI17fzi2wC6BQKFg2O95h771d6L98LEl71yNxoklZ6kk5jR5xhZjF6X1V xcC8agnzEc6jsF7yRJOqiZIn9a2qlNAEW53wwhHFanj+260RNgdLZMLt/0Y+8Ikd 5twYXJsMLI/Imlj4+rkxqAfXu6+nb0bD9vSKmwLPJIPC4/QztTs4VHGCBtcpS46k nZ7PMI1avOmN2mYQehTC7ociSXqUgMOzvfJ1XJZsaTfrSkO8/q9x/P1oKc3vOfvN mP57+5nGvLvfC0drK9g= =ENtL -----END PGP SIGNATURE----- Merge tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - tmio: Fixup runtime PM management during probe and remove - sdhci-pci-o2micro: Fix eMMC initialization for an AMD SoC - bcm2835: Prevent lockups when terminating work * tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: tmio: Fixup runtime PM management during remove mmc: tmio: Fixup runtime PM management during probe Revert "mmc: tmio: move runtime PM enablement to the driver implementations" Revert "mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host controller" Revert "mmc: bcm2835: Terminate timeout work synchronously"	2019-09-14 12:08:19 -07:00
Linus Torvalds	592b8d8759	drm fixes for 5.3-rc8 lima: - fix gem_wait ioctl core: - constify modes list i915: - DP MST high color depth regression - GPU hangs on vulkan compute workloads -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdemkOAAoJEAx081l5xIa+4ooP/RGV1E+pHsdC1SzsZYLZ+GDp TLeg0KnNMijxfP8TCqjssPwJFbYVO3RxZHCDKbCQdESRG6TKS5wqJ9bgR2y3jdXZ rjbhGY8YkxNkZbpAYPsYkzNcRqIrZFOYOnKDeafXP02KnHoiIckf8W3DKuCNr5XF /htgh4rQnKZnkiFJm1OzGwxdTTI+yv4aqi50cLR5pGAnSUKpxeHm2M+S9P2BmMgL BvsjDFFz+WGQllQ7nKUiMZ9oPbAGiQjWmICfucBniBUnhdleMk8qiNTuD8YQ1Cwh m5Ix2wa8vPZBeg1Hb55iJGhC+l4ePGdLPtJiPUUUMM+d8A0QlcMLrFpXM8jvtgA5 h+cZFR9Po8QS1bb6Jic/i9OLzG+KhFAQLSREcBNqGcyqfCZHqKArPSlTqqjUkRRj KW4OhdGxPrpmIzUHrHInSZKXzZtdWXKF35YFDWbjy0WNo91RW/X4I/qvCsML99uH WDoPumEG4eolPLK5JCcTnMMhRYULRopulVc8Tw/yTIxlpUEVhWk7fGhc8XAK/F0D S5bpxx8OQMfSYAaln5tzGhf4vUWVmdFNv0kyIdozZLRCQzYNuECg6bB16gZ9utvN np8k9x6CKFWTlNerAhO2jpvmh/pVupQ4GylGY5uIc2Z4P9LPMzORltrNU8cgaZB/ 8g9NDRvzCpfqW/UWXwFD =wL1R -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "From the maintainer summit, just some last minute fixes for final: lima: - fix gem_wait ioctl core: - constify modes list i915: - DP MST high color depth regression - GPU hangs on vulkan compute workloads" * tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm: drm/lima: fix lima_gem_wait() return value drm/i915: Restore relaxed padding (OCL_OOB_SUPPRES_ENABLE) for skl+ drm/i915: Limit MST to <= 8bpc once again drm/modes: Make the whitelist more const	2019-09-14 11:54:57 -07:00
Paolo Bonzini	a9c20bb020	KVM: s390: Fixes for 5.3 - prevent a user triggerable oops in the migration code - do not leak kernel stack content -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJdejosAAoJEBF7vIC1phx8ZcYP/09WMmcbOexGvopqyMzIWgAv xpSHAW0+mGriu9b41OwkxBsMG3MxUzk86b3zL0r5eaigWXSuE2NU0OhScqF9ehMX pTtoeSzFJsPFwGQrOKIhpgcNzOJ+YfVqTDlf5dxq9uSNYF32suuz0Dw4P9PdFJOg k8prJXiKu+bL21TcbhWsAAP7Gb5/DA26p4d5KM3wJe351Af9lrLrDF2z+pKe9fbY v0vMcH3tJoBOOTYUSJeptEWU9OlYljMrJN7kkmXCEC8yklwoXPDNgAC8Yg2SfqYM xNKVkX/rY97cn1Dq0LpAvEjMDYvu7KbOM1qQE9A67gRLIjuGJnDyEa+j/iB/tOrz BMmTdut44XRaVZVdDL+d2pg3LKI+1+UV4XTwpD4g1tSpYLar3dJVb9mq00OzdCAg TsK+pQYTSZig+H4ubtikgm9pFGKOB2Jsp2+FoC7jYxhYQWyj4syBkSoaaUdY0LvE /Du3NY3RaG4yi2K2XV0yjBVAjpXxYMWqvzJYTC9XlrEQJ5nAmiefTgxZmcg4ZCMw 0YVRigG7vz8oKpVRl/6smGd/U+qTNZN4cXnFgUr71yONiIxsSndUZ/Yledtf+KQR uzPfvIwYpRzwqVnXkkFb+PNxvJVftCbe2rRI4D549VsbmEJmSadjiB5aW1Rj3fMN 47ZjXZmmGETR8BtQEM37 =LxGy -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: Fixes for 5.3 - prevent a user triggerable oops in the migration code - do not leak kernel stack content	2019-09-14 09:25:30 +02:00
Sean Christopherson	002c5f73c5	KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot James Harvey reported a livelock that was introduced by commit `d012a06ab1` ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot""). The livelock occurs because kvm_mmu_zap_all() as it exists today will voluntarily reschedule and drop KVM's mmu_lock, which allows other vCPUs to add shadow pages. With enough vCPUs, kvm_mmu_zap_all() can get stuck in an infinite loop as it can never zap all pages before observing lock contention or the need to reschedule. The equivalent of kvm_mmu_zap_all() that was in use at the time of the reverted commit (`4e103134b8`, "KVM: x86/mmu: Zap only the relevant pages when removing a memslot") employed a fast invalidate mechanism and was not susceptible to the above livelock. There are three ways to fix the livelock: - Reverting the revert (commit `d012a06ab1`) is not a viable option as the revert is needed to fix a regression that occurs when the guest has one or more assigned devices. It's unlikely we'll root cause the device assignment regression soon enough to fix the regression timely. - Remove the conditional reschedule from kvm_mmu_zap_all(). However, although removing the reschedule would be a smaller code change, it's less safe in the sense that the resulting kvm_mmu_zap_all() hasn't been used in the wild for flushing memslots since the fast invalidate mechanism was introduced by commit `6ca18b6950` ("KVM: x86: use the fast way to invalidate all pages"), back in 2013. - Reintroduce the fast invalidate mechanism and use it when zapping shadow pages in response to a memslot being deleted/moved, which is what this patch does. For all intents and purposes, this is a revert of commit `ea145aacf4` ("Revert "KVM: MMU: fast invalidate all pages"") and a partial revert of commit `7390de1e99` ("Revert "KVM: x86: use the fast way to invalidate all pages""), i.e. restores the behavior of commit `5304b8d37c` ("KVM: MMU: fast invalidate all pages") and commit `6ca18b6950` ("KVM: x86: use the fast way to invalidate all pages") respectively. Fixes: `d012a06ab1` ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"") Reported-by: James Harvey <jamespharvey20@gmail.com> Cc: Alex Willamson <alex.williamson@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-14 09:25:11 +02:00
Fuqian Huang	541ab2aeb2	KVM: x86: work around leak of uninitialized stack contents Emulation of VMPTRST can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Cc: stable@vger.kernel.org [add comment] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-14 09:25:11 +02:00
Paolo Bonzini	f7eea636c3	KVM: nVMX: handle page fault in vmread The implementation of vmread to memory is still incomplete, as it lacks the ability to do vmread to I/O memory just like vmptrst. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-14 09:25:02 +02:00
Paul Walmsley	474efecb65	riscv: modify the Image header to improve compatibility with the ARM64 header Part of the intention during the definition of the RISC-V kernel image header was to lay the groundwork for a future merge with the ARM64 image header. One error during my original review was not noticing that the RISC-V header's "magic" field was at a different size and position than the ARM64's "magic" field. If the existing ARM64 Image header parsing code were to attempt to parse an existing RISC-V kernel image header format, it would see a magic number 0. This is undesirable, since it's our intention to align as closely as possible with the ARM64 header format. Another problem was that the original "res3" field was not being initialized correctly to zero. Address these issues by creating a 32-bit "magic2" field in the RISC-V header which matches the ARM64 "magic" field. RISC-V binaries will store "RSC\x05" in this field. The intention is that the use of the existing 64-bit "magic" field in the RISC-V header will be deprecated over time. Increment the minor version number of the file format to indicate this change, and update the documentation accordingly. Fix the assembler directives in head.S to ensure that reserved fields are properly zero-initialized. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Reported-by: Palmer Dabbelt <palmer@sifive.com> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Cc: Atish Patra <atish.patra@wdc.com> Cc: Karsten Merker <merker@debian.org> Link: https://lore.kernel.org/linux-riscv/194c2f10c9806720623430dbf0cc59a965e50448.camel@wdc.com/T/#u Link: https://lore.kernel.org/linux-riscv/mhng-755b14c4-8f35-4079-a7ff-e421fd1b02bc@palmer-si-x1e/T/#t	2019-09-13 19:03:52 -07:00
Bjørn Mork	4d7ffcf3bf	cdc_ether: fix rndis support for Mediatek based smartphones A Mediatek based smartphone owner reports problems with USB tethering in Linux. The verbose USB listing shows a rndis_host interface pair (e0/01/03 + 10/00/00), but the driver fails to bind with [ 355.960428] usb 1-4: bad CDC descriptors The problem is a failsafe test intended to filter out ACM serial functions using the same 02/02/ff class/subclass/protocol as RNDIS. The serial functions are recognized by their non-zero bmCapabilities. No RNDIS function with non-zero bmCapabilities were known at the time this failsafe was added. But it turns out that some Wireless class RNDIS functions are using the bmCapabilities field. These functions are uniquely identified as RNDIS by their class/subclass/protocol, so the failing test can safely be disabled. The same applies to the two types of Misc class RNDIS functions. Applying the failsafe to Communication class functions only retains the original functionality, and fixes the problem for the Mediatek based smartphone. Tow examples of CDC functional descriptors with non-zero bmCapabilities from Wireless class RNDIS functions are: 0e8d:000a Mediatek Crosscall Spider X5 3G Phone CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x0f connection notifications sends break line coding and serial state get/set/clear comm features CDC Union: bMasterInterface 0 bSlaveInterface 1 CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 1 and 19d2:1023 ZTE K4201-z CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 1 CDC Union: bMasterInterface 0 bSlaveInterface 1 The Mediatek example is believed to apply to most smartphones with Mediatek firmware. The ZTE example is most likely also part of a larger family of devices/firmwares. Suggested-by: Lars Melin <larsm17@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 22:08:13 +02:00
David S. Miller	ae3b06ed55	Merge branch 'sctp_do_bind-leak' Mao Wenan says: ==================== fix memory leak for sctp_do_bind First two patches are to do cleanup, remove redundant assignment, and change return type of sctp_get_port_local. Third patch is to fix memory leak for sctp_do_bind if failed to bind address. v2: add one patch to change return type of sctp_get_port_local. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 22:06:20 +02:00
Mao Wenan	29b99f54a8	sctp: destroy bucket if failed to bind addr There is one memory leak bug report: BUG: memory leak unreferenced object 0xffff8881dc4c5ec0 (size 40): comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s) hex dump (first 32 bytes): 02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00 ................ f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00 .c=............. backtrace: [<0000000072006339>] sctp_get_port_local+0x2a1/0xa00 [sctp] [<00000000c7b379ec>] sctp_do_bind+0x176/0x2c0 [sctp] [<000000005be274a2>] sctp_bind+0x5a/0x80 [sctp] [<00000000b66b4044>] inet6_bind+0x59/0xd0 [ipv6] [<00000000c68c7f42>] __sys_bind+0x120/0x1f0 net/socket.c:1647 [<000000004513635b>] __do_sys_bind net/socket.c:1658 [inline] [<000000004513635b>] __se_sys_bind net/socket.c:1656 [inline] [<000000004513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656 [<0000000061f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296 [<0000000003d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe This is because in sctp_do_bind, if sctp_get_port_local is to create hash bucket successfully, and sctp_add_bind_addr failed to bind address, e.g return -ENOMEM, so memory leak found, it needs to destroy allocated bucket. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 22:06:20 +02:00
Mao Wenan	e0e4b8de10	sctp: remove redundant assignment when call sctp_get_port_local There are more parentheses in if clause when call sctp_get_port_local in sctp_do_bind, and redundant assignment to 'ret'. This patch is to do cleanup. Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 22:06:20 +02:00
Mao Wenan	8e2ef6abd4	sctp: change return type of sctp_get_port_local Currently sctp_get_port_local() returns a long which is either 0,1 or a pointer casted to long. It's neither of the callers use the return value since commit `62208f1245` ("net: sctp: simplify sctp_get_port"). Now two callers are sctp_get_port and sctp_do_bind, they actually assumend a casted to an int was the same as a pointer casted to a long, and they don't save the return value just check whether it is zero or non-zero, so it would better change return type from long to int for sctp_get_port_local. Signed-off-by: Mao Wenan <maowenan@huawei.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 22:06:20 +02:00
Jeff Kirsher	8f6617badc	ixgbevf: Fix secpath usage for IPsec Tx offload Port the same fix for ixgbe to ixgbevf. The ixgbevf driver currently does IPsec Tx offloading based on an existing secpath. However, the secpath can also come from the Rx side, in this case it is misinterpreted for Tx offload and the packets are dropped with a "bad sa_idx" error. Fix this by using the xfrm_offload() function to test for Tx offload. CC: Shannon Nelson <snelson@pensando.io> Fixes: `7f68d43067` ("ixgbevf: enable VF IPsec offload operations") Reported-by: Jonathan Tooker <jonathan@reliablehosting.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-13 15:52:10 +02:00
Ulf Hansson	87b5d602a1	mmc: tmio: Fixup runtime PM management during remove Accessing the device when it may be runtime suspended is a bug, which is the case in tmio_mmc_host_remove(). Let's fix the behaviour. Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>	2019-09-13 13:49:09 +02:00
Ulf Hansson	aa86f1a388	mmc: tmio: Fixup runtime PM management during probe The tmio_mmc_host_probe() calls pm_runtime_set_active() to update the runtime PM status of the device, as to make it reflect the current status of the HW. This works fine for most cases, but unfortunate not for all. Especially, there is a generic problem when the device has a genpd attached and that genpd have the ->start\|stop() callbacks assigned. More precisely, if the driver calls pm_runtime_set_active() during ->probe(), genpd does not get to invoke the ->start() callback for it, which means the HW isn't really fully powered on. Furthermore, in the next phase, when the device becomes runtime suspended, genpd will invoke the ->stop() callback for it, potentially leading to usage count imbalance problems, depending on what's implemented behind the callbacks of course. To fix this problem, convert to call pm_runtime_get_sync() from tmio_mmc_host_probe() rather than pm_runtime_set_active(). Additionally, to avoid bumping usage counters and unnecessary re-initializing the HW the first time the tmio driver's ->runtime_resume() callback is called, introduce a state flag to keeping track of this. Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>	2019-09-13 13:49:04 +02:00
Ulf Hansson	8861474a10	Revert "mmc: tmio: move runtime PM enablement to the driver implementations" This reverts commit `7ff2131933`. It turns out that the above commit introduces other problems. For example, calling pm_runtime_set_active() must not be done prior calling pm_runtime_enable() as that makes it fail. This leads to additional problems, such as clock enables being wrongly balanced. Rather than fixing the problem on top, let's start over by doing a revert. Fixes: `7ff2131933` ("mmc: tmio: move runtime PM enablement to the driver implementations") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>	2019-09-13 13:48:35 +02:00
Linus Torvalds	a7f89616b7	Merge branch 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "Roman found and fixed a bug in the cgroup2 freezer which allows new child cgroup to escape frozen state" * 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: freezer: fix frozen state inheritance kselftests: cgroup: add freezer mkdir test	2019-09-13 09:52:01 +01:00
Linus Torvalds	1b304a1ae4	for-5.3-rc8-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl16aTEACgkQxWXV+ddt WDvICg//cSn5w+g6EnxbrAZ6IYQJ4GA7lZSk2i6Dc/lI3BTfs7Wj0SPRKd01pBjT N+wbqoOgubsS1jkNfJsGCN80XzSa0tvyQdbezj5ncgSPXp4FRlT0K24EUQNPaqbg SsvvxAOCerVN3Yj2qrHNWIS5qZ5/8/NjLXca1DJ/OYmrkKfhe+Z6/b9EuKffPnco erMnaeSvQ27hYkkcdM0DGcWDoHHAQrefGNjQzp5vncJNN1F7+EGLbcH31UwApk1K /hvOQ6Q6SoR/NKbQu3AitrR9u7v9uhWP9jHJZT46q1m89CzI4S5FjK2wKZFjPE6r 0PGRqnpdaGAERaTo3s6jIqv/X2gzJkhhhzGMiPgPJCQbAH39f/fFGEX22TjG33Yq 2CiGSIPnmKQ7HE494YLuSyHD/89SutMMCkbF0sFBoKuTnu2HQMn9r5Pk6bEKtvIY iTk75/WTXR02qWCVhTyNDa9QnxewQGJC1d1KNQ6MwbzBiYyG9S/DDZnjLJPNx7DF KAAANCDdyPpraLcmw2sD/qd1o10HfQmn9z1L2v3YvJBfjMe76SQFCP5WwaJRcjOm c3ScAX9bXeXJgH+E98kWc7T6p49IPdMDGAtArQmtjO4V8pFRuqG+2Ibg6Za/y5XZ fkaS5UY+XIk3TUpEqkWKMPMigM9a3jgHskyMgdRLQfVnoOc6Z+k= =KXB8 -----END PGP SIGNATURE----- Merge tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Here are two fixes, one of them urgent fixing a bug introduced in 5.2 and reported by many users. It took time to identify the root cause, catching the 5.3 release is higly desired also to push the fix to 5.2 stable tree. The bug is a mess up of return values after adding proper error handling and honestly the kind of bug that can cause sleeping disorders until it's caught. My appologies to everybody who was affected. Summary of what could happen: 1) either a hang when committing a transaction, if this happens there's no risk of corruption, still the hang is very inconvenient and can't be resolved without a reboot 2) writeback for some btree nodes may never be started and we end up committing a transaction without noticing that, this is really serious and that will lead to the "parent transid verify failed" messages" * tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix unwritten extent buffers and hangs on future writeback attempts Btrfs: fix assertion failure during fsync and use of stale transaction	2019-09-13 09:48:47 +01:00
Roman Gushchin	97a6136983	cgroup: freezer: fix frozen state inheritance If a new child cgroup is created in the frozen cgroup hierarchy (one or more of ancestor cgroups is frozen), the CGRP_FREEZE cgroup flag should be set. Otherwise if a process will be attached to the child cgroup, it won't become frozen. The problem can be reproduced with the test_cgfreezer_mkdir test. This is the output before this patch: ~/test_freezer ok 1 test_cgfreezer_simple ok 2 test_cgfreezer_tree ok 3 test_cgfreezer_forkbomb Cgroup /sys/fs/cgroup/cg_test_mkdir_A/cg_test_mkdir_B isn't frozen not ok 4 test_cgfreezer_mkdir ok 5 test_cgfreezer_rmdir ok 6 test_cgfreezer_migrate ok 7 test_cgfreezer_ptrace ok 8 test_cgfreezer_stopped ok 9 test_cgfreezer_ptraced ok 10 test_cgfreezer_vfork And with this patch: ~/test_freezer ok 1 test_cgfreezer_simple ok 2 test_cgfreezer_tree ok 3 test_cgfreezer_forkbomb ok 4 test_cgfreezer_mkdir ok 5 test_cgfreezer_rmdir ok 6 test_cgfreezer_migrate ok 7 test_cgfreezer_ptrace ok 8 test_cgfreezer_stopped ok 9 test_cgfreezer_ptraced ok 10 test_cgfreezer_vfork Reported-by: Mark Crossen <mcrossen@fb.com> Signed-off-by: Roman Gushchin <guro@fb.com> Fixes: `76f969e894` ("cgroup: cgroup v2 freezer") Cc: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Tejun Heo <tj@kernel.org>	2019-09-12 14:04:45 -07:00
Roman Gushchin	44e9d308a5	kselftests: cgroup: add freezer mkdir test Add a new cgroup freezer selftest, which checks that if a cgroup is frozen, their new child cgroups will properly inherit the frozen state. It creates a parent cgroup, freezes it, creates a child cgroup and populates it with a dummy process. Then it checks that both parent and child cgroup are frozen. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-09-12 14:04:40 -07:00
Chris Wilson	505a8ec7e1	Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()" The userptr put_pages can be called from inside try_to_unmap, and so enters with the page lock held on one of the object's backing pages. We cannot take the page lock ourselves for fear of recursion. Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Martin Wilck <Martin.Wilck@suse.com> Reported-by: Leo Kraav <leho@kraav.com> Fixes: `aa56a292ce` ("drm/i915/userptr: Acquire the page lock around set_page_dirty()") References: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-09-12 14:55:03 +01:00
Linus Torvalds	98dcb386e5	for-linus-20190912 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXXpIKgAKCRCRxhvAZXjc om8yAQCIPJp2HWNsJRPRl9KVKRmR6MxItG1Hpj0MvgwzLEjufwD/SF9VAPgl2AmD oaAXrOYH4yhr9aaFlVuMpe2aWAZdxgU= =Tjvf -----END PGP SIGNATURE----- Merge tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull clone3 fix from Christian Brauner: "This is a last-minute bugfix for clone3() that should go in before we release 5.3 with clone3(). clone3() did not verify that the exit_signal argument was set to a valid signal. This can be used to cause a crash by specifying a signal greater than NSIG. e.g. -1. The commit from Eugene adds a check to copy_clone_args_from_user() to verify that the exit signal is limited by CSIGNAL as with legacy clone() and that the signal is valid. With this we don't get the legacy clone behavior were an invalid signal could be handed down and would only be detected and then ignored in do_notify_parent(). Users of clone3() will now get a proper error right when they pass an invalid exit signal. Note, that this is not a change in user-visible behavior since no kernel with clone3() has been released yet" * tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: fork: block invalid exit signals with clone3()	2019-09-12 14:50:14 +01:00
Linus Torvalds	95217783b7	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A KVM guest fix, and a kdump kernel relocation errors fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/timer: Force PIT initialization when !X86_FEATURE_ARAT x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large to fix kexec relocation errors	2019-09-12 14:47:35 +01:00
Dave Airlie	e6bb711600	drm-misc-fixes for v5.3 final: - Constify modes whitelist harder. - Fix lima driver gem_wait ioctl. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl158BIACgkQ/lWMcqZw E8MjGg//bFNdAdk0rdgZVS2TB2GcRBkQU2+S+UHKBz6DaDDCaD+5wkGhI8OqNDqy O97q3IW+cy+yEQyen+KVLgnEuQ6QsFCwWGItDjg4s4HyCApI/J31HX0uXSJyag88 SKQEH9sOcSh+sS88Yr5eBAXpQKIryf/uXUmZN1SaHY8M8kt//m/k3A5ex6i+GMdQ 3rl0K+VsBCF4eliW0dOsmsEi6PJrY/qHv2bCN8geAEKg6+NcNVzruZm+2rkrwlna nvKKjD0RCfFn9OyHMY6a0OZIZDDW+am3vI4geuJbwSTUGc1SgbMR/5YIBm63Ml64 qqOFM5f6OdSL/ZGQV+FDx1NRNS0Lez/jNnOZqo5CwFel1RbzBh/NZ6QoRw8KPONU ahpk8DCk1dUFADQdZ7o+C3bveM/Se0oLRVov9L9mUKAlSsCH+qE9/uYPH88M2oBe EnCs/HnUg4W/NvlVrHSxGAfla3pFrclbUuF7pU2hi5tsYm2VKsPeY7g6h4lMsG28 IMLQ+TlGJjpyfe08cPb9Q/SfJFs1o0KEEh58Y91ZGQB1ledDn+lvvIuxE1b8spR2 AELWKHJ+Mt94Yoxh+2vX6T+ATjgI1f0fYqUsheB1aBgu7IG7m04UuOIjenmxe+2E cGnfpAfdWVlIDWfQxVHglg1QqXvMGGhmm6nPQAfOjBU29TCV4rI= =LVcc -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2019-09-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.3 final: - Constify modes whitelist harder. - Fix lima driver gem_wait ioctl. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/99e52e7a-d4ce-6a2c-0501-bc559a710955@linux.intel.com	2019-09-12 23:14:35 +10:00
Dave Airlie	911ad0b611	Merge tag 'drm-intel-fixes-2019-09-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Final drm/i915 fixes for v5.3: - Fox DP MST high color depth regression - Fix GPU hangs on Vulkan compute workloads Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/877e6e27qm.fsf@intel.com	2019-09-12 23:11:36 +10:00
Eugene Syromiatnikov	a0eb9abd8a	fork: block invalid exit signals with clone3() Previously, higher 32 bits of exit_signal fields were lost when copied to the kernel args structure (that uses int as a type for the respective field). Moreover, as Oleg has noted, exit_signal is used unchecked, so it has to be checked for sanity before use; for the legacy syscalls, applying CSIGNAL mask guarantees that it is at least non-negative; however, there's no such thing is done in clone3() code path, and that can break at least thread_group_leader. This commit adds a check to copy_clone_args_from_user() to verify that the exit signal is limited by CSIGNAL as with legacy clone() and that the signal is valid. With this we don't get the legacy clone behavior were an invalid signal could be handed down and would only be detected and ignored in do_notify_parent(). Users of clone3() will now get a proper error when they pass an invalid exit signal. Note, that this is not user-visible behavior since no kernel with clone3() has been released yet. The following program will cause a splat on a non-fixed clone3() version and will fail correctly on a fixed version: #define _GNU_SOURCE #include <linux/sched.h> #include <linux/types.h> #include <sched.h> #include <stdio.h> #include <stdlib.h> #include <sys/syscall.h> #include <sys/wait.h> #include <unistd.h> int main(int argc, char *argv[]) { pid_t pid = -1; struct clone_args args = {0}; args.exit_signal = -1; pid = syscall(__NR_clone3, &args, sizeof(struct clone_args)); if (pid < 0) exit(EXIT_FAILURE); if (pid == 0) exit(EXIT_SUCCESS); wait(NULL); exit(EXIT_SUCCESS); } Fixes: `7f192e3cd3` ("fork: add clone3") Reported-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Link: https://lore.kernel.org/r/4b38fa4ce420b119a4c6345f42fe3cec2de9b0b5.1568223594.git.esyr@redhat.com [christian.brauner@ubuntu.com: simplify check and rework commit message] Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2019-09-12 14:56:33 +02:00
Thomas Huth	53936b5bf3	KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject an interrupt, we convert them from the legacy struct kvm_s390_interrupt to the new struct kvm_s390_irq via the s390int_to_s390irq() function. However, this function does not take care of all types of interrupts that we can inject into the guest later (see do_inject_vcpu()). Since we do not clear out the s390irq values before calling s390int_to_s390irq(), there is a chance that we copy random data from the kernel stack which could be leaked to the userspace later. Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT interrupt: s390int_to_s390irq() does not handle it, and the function __inject_pfault_init() later copies irq->u.ext which contains the random kernel stack data. This data can then be leaked either to the guest memory in __deliver_pfault_init(), or the userspace might retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl. Fix it by handling that interrupt type in s390int_to_s390irq(), too, and by making sure that the s390irq struct is properly pre-initialized. And while we're at it, make sure that s390int_to_s390irq() now directly returns -EINVAL for unknown interrupt types, so that we immediately get a proper error code in case we add more interrupt types to do_inject_vcpu() without updating s390int_to_s390irq() sometime in the future. Cc: stable@vger.kernel.org Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-09-12 14:12:21 +02:00
Christophe JAILLET	b456d72412	sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()' The '.exit' functions from 'pernet_operations' structure should be marked as __net_exit, not __net_init. Fixes: `8e2d61e0ae` ("sctp: fix race on protocol/netns initialization") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-12 12:55:28 +01:00
Steffen Klassert	f39b683d35	ixgbe: Fix secpath usage for IPsec TX offload. The ixgbe driver currently does IPsec TX offloading based on an existing secpath. However, the secpath can also come from the RX side, in this case it is misinterpreted for TX offload and the packets are dropped with a "bad sa_idx" error. Fix this by using the xfrm_offload() function to test for TX offload. Fixes: `5925947047` ("ixgbe: process the Tx ipsec offload") Reported-by: Michael Marley <michael@michaelmarley.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-12 12:43:14 +01:00
Filipe Manana	18dfa7117a	Btrfs: fix unwritten extent buffers and hangs on future writeback attempts The lock_extent_buffer_io() returns 1 to the caller to tell it everything went fine and the callers needs to start writeback for the extent buffer (submit a bio, etc), 0 to tell the caller everything went fine but it does not need to start writeback for the extent buffer, and a negative value if some error happened. When it's about to return 1 it tries to lock all pages, and if a try lock on a page fails, and we didn't flush any existing bio in our "epd", it calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or an error. The page might have been locked elsewhere, not with the goal of starting writeback of the extent buffer, and even by some code other than btrfs, like page migration for example, so it does not mean the writeback of the extent buffer was already started by some other task, so returning a 0 tells the caller (btree_write_cache_pages()) to not start writeback for the extent buffer. Note that epd might currently have either no bio, so flush_write_bio() returns 0 (success) or it might have a bio for another extent buffer with a lower index (logical address). Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the extent buffer and writeback is never started for the extent buffer, future attempts to writeback the extent buffer will hang forever waiting on that bit to be cleared, since it can only be cleared after writeback completes. Such hang is reported with a trace like the following: [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds. [49887.347059] Not tainted 5.2.13-gentoo #2 [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [49887.347062] btrfs-transacti D 0 1752 2 0x80004000 [49887.347064] Call Trace: [49887.347069] ? __schedule+0x265/0x830 [49887.347071] ? bit_wait+0x50/0x50 [49887.347072] ? bit_wait+0x50/0x50 [49887.347074] schedule+0x24/0x90 [49887.347075] io_schedule+0x3c/0x60 [49887.347077] bit_wait_io+0x8/0x50 [49887.347079] __wait_on_bit+0x6c/0x80 [49887.347081] ? __lock_release.isra.29+0x155/0x2d0 [49887.347083] out_of_line_wait_on_bit+0x7b/0x80 [49887.347084] ? var_wake_function+0x20/0x20 [49887.347087] lock_extent_buffer_for_io+0x28c/0x390 [49887.347089] btree_write_cache_pages+0x18e/0x340 [49887.347091] do_writepages+0x29/0xb0 [49887.347093] ? kmem_cache_free+0x132/0x160 [49887.347095] ? convert_extent_bit+0x544/0x680 [49887.347097] filemap_fdatawrite_range+0x70/0x90 [49887.347099] btrfs_write_marked_extents+0x53/0x120 [49887.347100] btrfs_write_and_wait_transaction.isra.4+0x38/0xa0 [49887.347102] btrfs_commit_transaction+0x6bb/0x990 [49887.347103] ? start_transaction+0x33e/0x500 [49887.347105] transaction_kthread+0x139/0x15c So fix this by not overwriting the return value (ret) with the result from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK bit in case flush_write_bio() returns an error, otherwise it will hang any future attempts to writeback the extent buffer, and undo all work done before (set back EXTENT_BUFFER_DIRTY, etc). This is a regression introduced in the 5.2 kernel. Fixes: `2e3c25136a` ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()") Fixes: `f4340622e0` ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up") Reported-by: Zdenek Sojka <zsojka@seznam.cz> Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u Reported-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag> Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t Reported-by: Drazen Kacar <drazen.kacar@oradian.com> Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/ Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-09-12 13:37:25 +02:00
Filipe Manana	410f954cb1	Btrfs: fix assertion failure during fsync and use of stale transaction Sometimes when fsync'ing a file we need to log that other inodes exist and when we need to do that we acquire a reference on the inodes and then drop that reference using iput() after logging them. That generally is not a problem except if we end up doing the final iput() (dropping the last reference) on the inode and that inode has a link count of 0, which can happen in a very short time window if the logging path gets a reference on the inode while it's being unlinked. In that case we end up getting the eviction callback, btrfs_evict_inode(), invoked through the iput() call chain which needs to drop all of the inode's items from its subvolume btree, and in order to do that, it needs to join a transaction at the helper function evict_refill_and_join(). However because the task previously started a transaction at the fsync handler, btrfs_sync_file(), it has current->journal_info already pointing to a transaction handle and therefore evict_refill_and_join() will get that transaction handle from btrfs_join_transaction(). From this point on, two different problems can happen: 1) evict_refill_and_join() will often change the transaction handle's block reserve (->block_rsv) and set its ->bytes_reserved field to a value greater than 0. If evict_refill_and_join() never commits the transaction, the eviction handler ends up decreasing the reference count (->use_count) of the transaction handle through the call to btrfs_end_transaction(), and after that point we have a transaction handle with a NULL ->block_rsv (which is the value prior to the transaction join from evict_refill_and_join()) and a ->bytes_reserved value greater than 0. If after the eviction/iput completes the inode logging path hits an error or it decides that it must fallback to a transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a non-zero value from btrfs_log_dentry_safe(), and because of that non-zero value it tries to commit the transaction using a handle with a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes the transaction commit hit an assertion failure at btrfs_trans_release_metadata() because ->bytes_reserved is not zero but the ->block_rsv is NULL. The produced stack trace for that is like the following: [192922.917158] assertion failed: !trans->bytes_reserved, file: fs/btrfs/transaction.c, line: 816 [192922.917553] ------------[ cut here ]------------ [192922.917922] kernel BUG at fs/btrfs/ctree.h:3532! [192922.918310] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: G W 5.1.4-btrfs-next-47 #1 [192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs] (...) [192922.920925] RSP: 0018:ffffaebdc8a27da8 EFLAGS: 00010286 [192922.921315] RAX: 0000000000000051 RBX: ffff95c9c16a41c0 RCX: 0000000000000000 [192922.921692] RDX: 0000000000000000 RSI: ffff95cab6b16838 RDI: ffff95cab6b16838 [192922.922066] RBP: ffff95c9c16a41c0 R08: 0000000000000000 R09: 0000000000000000 [192922.922442] R10: ffffaebdc8a27e70 R11: 0000000000000000 R12: ffff95ca731a0980 [192922.922820] R13: 0000000000000000 R14: ffff95ca84c73338 R15: ffff95ca731a0ea8 [192922.923200] FS: 00007f337eda4e80(0000) GS:ffff95cab6b00000(0000) knlGS:0000000000000000 [192922.923579] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [192922.923948] CR2: 00007f337edad000 CR3: 00000001e00f6002 CR4: 00000000003606e0 [192922.924329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [192922.924711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [192922.925105] Call Trace: [192922.925505] btrfs_trans_release_metadata+0x10c/0x170 [btrfs] [192922.925911] btrfs_commit_transaction+0x3e/0xaf0 [btrfs] [192922.926324] btrfs_sync_file+0x44c/0x490 [btrfs] [192922.926731] do_fsync+0x38/0x60 [192922.927138] __x64_sys_fdatasync+0x13/0x20 [192922.927543] do_syscall_64+0x60/0x1c0 [192922.927939] entry_SYSCALL_64_after_hwframe+0x49/0xbe (...) [192922.934077] ---[ end trace f00808b12068168f ]--- 2) If evict_refill_and_join() decides to commit the transaction, it will be able to do it, since the nested transaction join only increments the transaction handle's ->use_count reference counter and it does not prevent the transaction from getting committed. This means that after eviction completes, the fsync logging path will be using a transaction handle that refers to an already committed transaction. What happens when using such a stale transaction can be unpredictable, we are at least having a use-after-free on the transaction handle itself, since the transaction commit will call kmem_cache_free() against the handle regardless of its ->use_count value, or we can end up silently losing all the updates to the log tree after that iput() in the logging path, or using a transaction handle that in the meanwhile was allocated to another task for a new transaction, etc, pretty much unpredictable what can happen. In order to fix both of them, instead of using iput() during logging, use btrfs_add_delayed_iput(), so that the logging path of fsync never drops the last reference on an inode, that step is offloaded to a safe context (usually the cleaner kthread). The assertion failure issue was sporadically triggered by the test case generic/475 from fstests, which loads the dm error target while fsstress is running, which lead to fsync failing while logging inodes with -EIO errors and then trying later to commit the transaction, triggering the assertion failure. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-09-12 13:37:19 +02:00
Igor Mammedov	13a17cc052	KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset() If userspace doesn't set KVM_MEM_LOG_DIRTY_PAGES on memslot before calling kvm_s390_vm_start_migration(), kernel will oops with: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:0000000002a2000b R2:00000001bff8c00b R3:00000001bff88007 S:00000001bff91000 P:000000000000003d Oops: 0004 ilc:2 [#1] SMP ... Call Trace: ([<001fffff804ec552>] kvm_s390_vm_set_attr+0x347a/0x3828 [kvm]) [<001fffff804ecfc0>] kvm_arch_vm_ioctl+0x6c0/0x1998 [kvm] [<001fffff804b67e4>] kvm_vm_ioctl+0x51c/0x11a8 [kvm] [<00000000008ba572>] do_vfs_ioctl+0x1d2/0xe58 [<00000000008bb284>] ksys_ioctl+0x8c/0xb8 [<00000000008bb2e2>] sys_ioctl+0x32/0x40 [<000000000175552c>] system_call+0x2b8/0x2d8 INFO: lockdep is turned off. Last Breaking-Event-Address: [<0000000000dbaf60>] __memset+0xc/0xa0 due to ms->dirty_bitmap being NULL, which might crash the host. Make sure that ms->dirty_bitmap is set before using it or return -EINVAL otherwise. Cc: <stable@vger.kernel.org> Fixes: `afdad61615` ("KVM: s390: Fix storage attributes migration with memory slots") Signed-off-by: Igor Mammedov <imammedo@redhat.com> Link: https://lore.kernel.org/kvm/20190911075218.29153-1-imammedo@redhat.com/ Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-09-12 13:09:17 +02:00
Navid Emamdoost	a21b7f0cff	net: qrtr: fix memort leak in qrtr_tun_write_iter In qrtr_tun_write_iter the allocated kbuf should be release in case of error or success return. v2 Update: Thanks to David Miller for pointing out the release on success path as well. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-12 11:58:44 +01:00
Subash Abhinov Kasiviswanathan	10cc514f45	net: Fix null de-reference of device refcount In event of failure during register_netdevice, free_netdev is invoked immediately. free_netdev assumes that all the netdevice refcounts have been dropped prior to it being called and as a result frees and clears out the refcount pointer. However, this is not necessarily true as some of the operations in the NETDEV_UNREGISTER notifier handlers queue RCU callbacks for invocation after a grace period. The IPv4 callback in_dev_rcu_put tries to access the refcount after free_netdev is called which leads to a null de-reference- 44837.761523: <6> Unable to handle kernel paging request at virtual address 0000004a88287000 44837.761651: <2> pc : in_dev_finish_destroy+0x4c/0xc8 44837.761654: <2> lr : in_dev_finish_destroy+0x2c/0xc8 44837.762393: <2> Call trace: 44837.762398: <2> in_dev_finish_destroy+0x4c/0xc8 44837.762404: <2> in_dev_rcu_put+0x24/0x30 44837.762412: <2> rcu_nocb_kthread+0x43c/0x468 44837.762418: <2> kthread+0x118/0x128 44837.762424: <2> ret_from_fork+0x10/0x1c Fix this by waiting for the completion of the call_rcu() in case of register_netdevice errors. Fixes: `93ee31f14f` ("[NET]: Fix free_netdev on register_netdev failure.") Cc: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-12 11:55:34 +01:00

1 2 3 4 5 ...

857686 Commits All Branches Search

857686 Commits

All Branches