OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Sebastian Andrzej Siewior	84918a89d6	usb: dwc3: gadget: Let the interrupt handler disable bottom halves. The interrupt service routine registered for the gadget is a primary handler which mask the interrupt source and a threaded handler which handles the source of the interrupt. Since the threaded handler is voluntary threaded, the IRQ-core does not disable bottom halves before invoke the handler like it does for the forced-threaded handler. Due to changes in networking it became visible that a network gadget's completions handler may schedule a softirq which remains unprocessed. The gadget's completion handler is usually invoked either in hard-IRQ or soft-IRQ context. In this context it is enough to just raise the softirq because the softirq itself will be handled once that context is left. In the case of the voluntary threaded handler, there is nothing that will process pending softirqs. Which means it remain queued until another random interrupt (on this CPU) fires and handles it on its exit path or another thread locks and unlocks a lock with the bh suffix. Worst case is that the CPU goes idle and the NOHZ complains about unhandled softirqs. Disable bottom halves before acquiring the lock (and disabling interrupts) and enable them after dropping the lock. This ensures that any pending softirqs will handled right away. Link: https://lkml.kernel.org/r/c2a64979-73d1-2c22-e048-c275c9f81558@samsung.com Fixes: `e5f68b4a3e` ("Revert "usb: dwc3: gadget: remove unnecessary _irqsave()"") Cc: stable <stable@kernel.org> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/Yg/YPejVQH3KkRVd@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-24 11:16:38 +01:00
Szymon Heidrich	7f14c7227f	USB: gadget: validate endpoint index for xilinx udc Assure that host may not manipulate the index to point past endpoint array. Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-24 11:00:07 +01:00
Jakub Kicinski	5facf49702	mlx5-fixes-2022-02-23 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmIWzHMACgkQSD+KveBX +j7LwwgAj1T1YcCWcY5sBKbBUU08YM/7fMDewo5KZ+dMI25NBA2spRJ4B5gsFR+K e7QrGRSX43HPeGlAS7xBikkzyhKckqm05GNvUCqSiIR5BB0ddpV01JAF6U/zHxTG dN/h/k9STUZBKOMANwTNt9lM1q5rfqSyDkFEspeumzfpiIfnoYea7gX5iLRxhvG1 sfn7uBem9hRIhywShGRgyh2+huzrsVm0K8rGksunqSqvxVGtIF1XBLWqzfd3mnlY TYBXBZlJGawqavHF9fgjTLvd9PECUggv6mJzkLbueRDcA1qzyVw3VDE6iGfIqOR4 R2ROzFV5SbVP5Q/E6wLfLhZjqImTDw== =/0bX -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2022-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-02-22 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2022-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix VF min/max rate parameters interchange mistake net/mlx5e: Add missing increment of count net/mlx5e: MPLSoUDP decap, fix check for unsupported matches net/mlx5e: Fix MPLSoUDP encap to use MPLS action information net/mlx5e: Add feature check for set fec counters net/mlx5e: TC, Skip redundant ct clear actions net/mlx5e: TC, Reject rules with forward and drop actions net/mlx5e: TC, Reject rules with drop and modify hdr action net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets net/mlx5e: Fix wrong return value on ioctl EEPROM query failure net/mlx5: Fix possible deadlock on rule deletion net/mlx5: Fix tc max supported prio for nic mode net/mlx5: Fix wrong limitation of metadata match on ecpf net/mlx5: Update log_max_qp value to be 17 at most net/mlx5: DR, Fix the threshold that defines when pool sync is initiated net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte net/mlx5: DR, Cache STE shadow memory net/mlx5: Update the list of the PCI supported devices ==================== Link: https://lore.kernel.org/r/20220224001123.365265-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-23 20:30:01 -08:00
Dave Airlie	7c17b3d37f	Merge tag 'amd-drm-fixes-5.17-2022-02-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.17-2022-02-23: amdgpu: - Display FP fix - PCO powergating fix - RDNA2 OEM SKU stability fixes - Display PSR fix - PCI ASPM fix - Display link encoder fix for TEST_COMMIT - Raven2 suspend/resume fix - Fix a regression in virtual display support - GPUVM eviction fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220223214623.28823-1-alexander.deucher@amd.com	2022-02-24 14:27:36 +10:00
Dave Airlie	0c3127933c	drm/tegra: Fixes for v5.17-rc6 Contains a couple of fixes for Tegra186 suspend/resume, syncpoint waiting, a build warning and eDP on older Tegra devices. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmIWL34THHRyZWRpbmdA bnZpZGlhLmNvbQAKCRDdI6zXfz6zob5bD/wOh2XS1U2/VPZtodJE4PbJm9MQmVyU E6wvMx4ZophB3zXtd25ki4uvPkTcEOcp52kSLeMtzyVIcqkxuSMbYJ1Px3YnB6/5 a3MYwHAq7O3/DLzF23NgOHI4AQlwYRIymSvMVKMwTj9ycYmRpEVK/kAkIAxUocIo xJXDwDfsSHXqoeJga4dE4J3DihD0KUAOzuS+Ya8ORTfKwKtypYCvghO5TybtPUNL L9dzJBdUBaBioqkca4ObtQWWxlyTZ4nfjIW0Ji1EEbQAeKA3BinLzeq+3U2lS1DN OOV7CTvs2RAwjGCRTGKhodgOrt0slrkOw2ZI0L36E3lznJiptlv0O3Qp4UsungzM bvsgGBzuD66V50hx7dp7DMthgVqVKMrPJuzrOUsf4yu7rnBHOR4fE4YHk0jSKOaS ZHzEFxzeRsyhADyoLgFhJzSm9uNugEyPu5170EZLz3GQ1f8U6rlfUqCYtPcWH+F2 uNd7i4Zd8xhZTzQX+WENz2AEKlOuWIDZJ7qrJQXcn/Mccupy8E43ITjGStXuiS7L VUY49YnIF5QrCNBkvzpgFYlb5N6LBNxv+WlBjz/hbOIOX6O/QJ6RuuIrGgOE1hwQ ja4CQpFuqrfIZHraUCwWWaiaZX+KNQfnrOJbtpkXgZQqDEjmHoSv48pjCc1E5MpN 917feFEZokbpsg== =O3hK -----END PGP SIGNATURE----- Merge tag 'drm/tegra/for-5.17-rc6' of https://gitlab.freedesktop.org/drm/tegra into drm-fixes drm/tegra: Fixes for v5.17-rc6 Contains a couple of fixes for Tegra186 suspend/resume, syncpoint waiting, a build warning and eDP on older Tegra devices. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220223161903.293392-1-thierry.reding@gmail.com	2022-02-24 14:22:12 +10:00
Dave Airlie	753a64c779	* edid: Always set RGB444 * imx/dcss: Select GEM CMA helpers * radeon: Fix some variables's type * vc4: Fix codec cleanup; Fix PM reference counting -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmIWimcACgkQaA3BHVML eiNLjwgAoFAvwg1yq9W0XCyi/RIPli6U9QIUemcUVu22pyDcv4mLw1ky9t1SbCGB GfV20hHXK3HhZ9wWgdgq08yhIlzVLVWN3G6hL+EwsJ4+RiI9gL/xh4qKePSgtgpj JL5Di/6brHibDa7GgyhYspVQBIVbrjBaBvR7rHKTGd+6AaddXnBBA2vFcVWuIXMJ KK3te19d4xmjjqtLlbIVLppO6EJqTSHq9QXiRxywTO6ahqzL8mKbpGuNXz1HKsW9 CcYGD3jA97uBZBD0kq4v49y2dLDJ8f75IGrPlg1JBi4xdvYOvGTlB470m+aWK3Mp uktakijXfDU+pbVa/6djD3H28aD5ZQ== =HjTE -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes * edid: Always set RGB444 * imx/dcss: Select GEM CMA helpers * radeon: Fix some variables's type * vc4: Fix codec cleanup; Fix PM reference counting Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YhaKj4zWJ42YWRts@linux-uq9g.fritz.box	2022-02-24 13:53:13 +10:00
Arnaldo Carvalho de Melo	7414db4119	rtla: Fix systme -> system typo on man page Link: https://lkml.kernel.org/r/YhZsZxqk+IaFxorj@kernel.org Fixes: `496082df01` ("rtla: Add rtla osnoise man page") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-02-23 21:52:18 -05:00
Linus Torvalds	91318b29a8	Devicetree fixes for v5.17, take 2: - Update some maintainers email addresses - Fix handling of elfcorehdr reservation for crash dump kernel - Fix unittest expected warnings text -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmIWaVwQHHJvYmhAa2Vy bmVsLm9yZwAKCRD6+121jbxhwzXVD/90aZPFQlDsenfCumOjixDejfBUGbN7mYzH s2ExCo7cam1/dVXKHWBEcTxF7IV+NRd2CHso8K52zjsMTibdj4/nk0B0IZD1yVsb TU625m7jvdlFoRaOaVsb9gRj1x5sKFvEfb8Q2humHMqWuYAqW4GQjtCBl8wUgF79 fnj7CTnnL2fuqWw9a5yohCHH6TL1F5KYVYuo4jUpfCBIdFi9GY5SYlcyRbpyBNx7 aMmtRW4G95RJfrHJ9lRFC7bhZ1E/pIBzg3xZGGd4W+VNDTfM5j155JZEEOxaKXFF dqKvHMUg0tdpZU6ukQSvBCq+f4qYW0mQipwCJcOg+sZX4rOmPjJdoNQDr9L4Jysl pdCtryS+sbnSKeY2qUvTZJ70L714gsRu/JuEYmAmXJoJbEr3ANlY5tN7dMXV7mIb YmofkjuoEpnhUBPp8sJ7zy577yCZC0BbXewD+j9tdNGNIjndDPVtntb5bFluNmH7 PkPFa3hdVyNbOi4MHs/2yaqpqWBcQ/rK55cdeZ9Qdy+5/upZKSMWCro6cQJf+Btz yx9HRnoZK06Xxl6MZYXGcX7BTDlgcWrCTYKPCokDlWSjyaKkMCM4s2cdp+4TaPBC cDHnhn+0lRdpxb1m3KQSgD2ZU2pryXKlo2pOJQO76TPgLsBMIRMIBqng8uIcpsPj do1wSK09WA== =BPT6 -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes-for-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Update some maintainers email addresses - Fix handling of elfcorehdr reservation for crash dump kernel - Fix unittest expected warnings text * tag 'devicetree-fixes-for-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: update Roger Quadros email MAINTAINERS: sifive: drop Yash Shah of/fdt: move elfcorehdr reservation early for crash dump kernel of: unittest: update text of expected warnings	2022-02-23 17:25:22 -08:00
Linus Torvalds	54134be658	selinux/stable-5.17 PR 20220223 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmIWcysUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXM9NhAAh3qHMPzlnUr4O/C1BPguUmZY6Evm yM+qWE0as6awhlaDQd+4cI7UT/9iUVc0TQ/kUMyf2g0+fUH9+UYhudQHSBSuAvEW Cbl6t3c8TybzilUvItrwwcaEBzXx7+K3uyuqpfdi2qUucIoLPO7zkSX4hCnik+no vkn/ULDUTEVszXbdiCfnLlLbVl/8cy9diRI2ZxtaCSHlgTkgYFTBXwqsJg512Qgq 4utKRoP1ZMCL/NcKoV1z4vvylzJCXWlNQF+KbVIm+KS2+mKDDEZead48j7Izot8X kwOIdp5SgfIUnCbFsfbuZf1THKmjDRz1IS8rD5eGUKpz7upfmcAhenls2bGMf0VL IqqcOLVM5r1xXrlsrjLqwTwTYubxCOSXAmS1SExRbmEDhDBWfbobRqEUaXoiN+gd sSFhGsOLMOqaK9oTz0vRG0/kCIyfovC6CqxDPxLZffAMZN6GuaWcN3qi9ua64kD9 Ro1LxJiryTOgxX9CgbvYGjR2h9f91zbUP5FtTsmi//+ty2e8mtJLwRd4BL45xLQ1 khV/JYi69AIJ2IWrnOHBgjF8U48F1WzmG1pkZ6JrDzbMn52Ydvs00eXVbHIM85X8 8qL5duZFmm9C0rCWZtqYB47tOE6dJ1ugb0OqzJFhF2OaqV7KRdtJn/NA83V4Lh5m Z8UlL9BVar/5DsQ= =B7Cx -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20220223' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "A second small SELinux fix which addresses an incorrect mutex_is_locked() check" * tag 'selinux-pr-20220223' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix misuse of mutex_is_locked()	2022-02-23 17:19:55 -08:00
Gal Pressman	ca49df96f9	net/mlx5e: Fix VF min/max rate parameters interchange mistake The VF min and max rate were passed incorrectly and resulted in wrongly interchanging them. Fix the order of parameters in mlx5_esw_qos_set_vport_rate(). Fixes: `d7df09f5e7` ("net/mlx5: E-switch, Enable vport QoS on demand") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:19 -08:00
Lama Kayal	5ee02b7a80	net/mlx5e: Add missing increment of count Add mistakenly missing increment of count variable when looping over output buffer in mlx5e_self_test(). This resolves the issue of garbage values output when querying with self test via ethtool. before: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 1768697188 Health Test 758528120 Loopback Test 3288687 after: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 0 Health Test 0 Loopback Test 0 Fixes: `7990b1b5e8` ("net/mlx5e: loopback test is not supported in switchdev mode") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:19 -08:00
Maor Dickman	fdc18e4e4b	net/mlx5e: MPLSoUDP decap, fix check for unsupported matches Currently offload of rule on bareudp device require tunnel key in order to match on mpls fields and without it the mpls fields are ignored, this is incorrect due to the fact udp tunnel doesn't have key to match on. Fix by returning error in case flow is matching on tunnel key. Fixes: `72046a91d1` ("net/mlx5e: Allow to match on mpls parameters") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:19 -08:00
Maor Dickman	c63741b426	net/mlx5e: Fix MPLSoUDP encap to use MPLS action information Currently the MPLSoUDP encap builds the MPLS header using encap action information (tunnel id, ttl and tos) instead of the MPLS action information (label, ttl, tc and bos) which is wrong. Fix by storing the MPLS action information during the flow action parse and later using it to create the encap MPLS header. Fixes: `f828ca6a2f` ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:18 -08:00
Lama Kayal	7fac052903	net/mlx5e: Add feature check for set fec counters Fec counters support is checked via the PCAM feature_cap_mask, bit 0: PPCNT_counter_group_Phy_statistical_counter_group. Add feature check to avoid faulty behavior. Fixes: `0a1498ebfa` ("net/mlx5e: Expose FEC counters via ethtool") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:18 -08:00
Roi Dayan	fb7e76ea3f	net/mlx5e: TC, Skip redundant ct clear actions Offload of ct clear action is just resetting the reg_c register. It's done by allocating modify hdr resources which is limited. Doing it multiple times is redundant and wasting modify hdr resources and if resources depleted the driver will fail offloading the rule. Ignore redundant ct clear actions after the first one. Fixes: `806401c20a` ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:18 -08:00
Roi Dayan	3d65492a86	net/mlx5e: TC, Reject rules with forward and drop actions Such rules are redundant but allowed and passed to the driver. The driver does not support offloading such rules so return an error. Fixes: `03a9d11e6e` ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:17 -08:00
Roi Dayan	23216d387c	net/mlx5e: TC, Reject rules with drop and modify hdr action This kind of action is not supported by firmware and generates a syndrome. kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3) Fixes: `d7e75a325c` ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:17 -08:00
Tariq Toukan	7eaf1f37b8	net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets For RX TLS device-offloaded packets, the HW spec guarantees checksum validation for the offloaded packets, but does not define whether the CQE.checksum field matches the original packet (ciphertext) or the decrypted one (plaintext). This latitude allows architetctural improvements between generations of chips, resulting in different decisions regarding the value type of CQE.checksum. Hence, for these packets, the device driver should not make use of this CQE field. Here we block CHECKSUM_COMPLETE usage for RX TLS device-offloaded packets, and use CHECKSUM_UNNECESSARY instead. Value of the packet's tcp_hdr.csum is not modified by the HW, and it always matches the original ciphertext. Fixes: `1182f36593` ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:16 -08:00
Gal Pressman	0b89429722	net/mlx5e: Fix wrong return value on ioctl EEPROM query failure The ioctl EEPROM query wrongly returns success on read failures, fix that by returning the appropriate error code. Fixes: `bb64143eee` ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:15 -08:00
Maor Gottlieb	b645e57deb	net/mlx5: Fix possible deadlock on rule deletion Add missing call to up_write_ref_node() which releases the semaphore in case the FTE doesn't have destinations, such in drop rule case. Fixes: `465e7baab6` ("net/mlx5: Fix deletion of duplicate rules") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:14 -08:00
Chris Mi	be7f4b0ab1	net/mlx5: Fix tc max supported prio for nic mode Only prio 1 is supported if firmware doesn't support ignore flow level for nic mode. The offending commit removed the check wrongly. Add it back. Fixes: `9a99c8f125` ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:13 -08:00
Ariel Levkovich	07666c75ad	net/mlx5: Fix wrong limitation of metadata match on ecpf Match metadata support check returns false for ecpf device. However, this support does exist for ecpf and therefore this limitation should be removed to allow feature such as stacked devices and internal port offloaded to be supported. Fixes: `92ab1eb392` ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:13 -08:00
Maher Sanalla	7f839965b2	net/mlx5: Update log_max_qp value to be 17 at most Currently, log_max_qp value is dependent on what FW reports as its max capability. In reality, due to a bug, some FWs report a value greater than 17, even though they don't support log_max_qp > 17. This FW issue led the driver to exhaust memory on startup. Thus, log_max_qp value is set to be no more than 17 regardless of what FW reports, as it was before the cited commit. Fixes: `f79a609ea6` ("net/mlx5: Update log_max_qp value to FW max capability") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:12 -08:00
Yevgeny Kliteynik	ecd9c5cd46	net/mlx5: DR, Fix the threshold that defines when pool sync is initiated When deciding whether to start syncing and actually free all the "hot" ICM chunks, we need to consider the type of the ICM chunks that we're dealing with. For instance, the amount of available ICM for MODIFY_ACTION is significantly lower than the usual STE ICM, so the threshold should account for that - otherwise we can deplete MODIFY_ACTION memory just by creating and deleting the same modify header action in a continuous loop. This patch replaces the hard-coded threshold with a dynamic value. Fixes: `1c58651412` ("net/mlx5: DR, ICM memory pools sync optimization") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:11 -08:00
Yevgeny Kliteynik	ffb0753b95	net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version Currently SMFS allows adding rule with matching on src/dst IP w/o matching on full ethertype or ip_version, which is not supported by HW. This patch fixes this issue and adds the check as it is done in DMFS. Fixes: `26d688e33f` ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:10 -08:00
Yevgeny Kliteynik	0aec12d97b	net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte When adding a rule with 32 destinations, we hit the following out-of-band access issue: BUG: KASAN: slab-out-of-bounds in mlx5_cmd_dr_create_fte+0x18ee/0x1e70 This patch fixes the issue by both increasing the allocated buffers to accommodate for the needed actions and by checking the number of actions to prevent this issue when a rule with too many actions is provided. Fixes: `1ffd498901` ("net/mlx5: DR, Increase supported num of actions to 32") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:10 -08:00
Yevgeny Kliteynik	e5b2bc30c2	net/mlx5: DR, Cache STE shadow memory During rule insertion on each ICM memory chunk we also allocate shadow memory used for management. This includes the hw_ste, dr_ste and miss list per entry. Since the scale of these allocations is large we noticed a performance hiccup that happens once malloc and free are stressed. In extreme usecases when ~1M chunks are freed at once, it might take up to 40 seconds to complete this, up to the point the kernel sees this as self-detected stall on CPU: rcu: INFO: rcu_sched self-detected stall on CPU To resolve this we will increase the reuse of shadow memory. Doing this we see that a time in the aforementioned usecase dropped from ~40 seconds to ~8-10 seconds. Fixes: `29cf8febd1` ("net/mlx5: DR, ICM pool memory allocator") Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:09 -08:00
Meir Lichtinger	f908a35b22	net/mlx5: Update the list of the PCI supported devices Add the upcoming BlueField-4 and ConnectX-8 device IDs. Fixes: `2e9d3e83ab` ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-23 16:08:08 -08:00
Qiang Yu	c1a66c3bc4	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] ERROR Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Guchun Chen	e2b993302f	drm/amdgpu: bypass tiling flag check in virtual display case (v2) vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Guchun Chen	97c61e0b7c	Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" This reverts commit `4046afcebf`. No need to support modifier in virtual kms, otherwise, in SRIOV mode, when lanuching X server, set crtc will fail due to mismatch between primary plane modifier and framebuffer modifier. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Chen Gong	1e2be869c8	drm/amdgpu: do not enable asic reset for raven2 The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <curry.gong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Nicholas Kazlauskas	3743e7f6fc	drm/amd/display: Fix stream->link_enc unassigned during stream removal [Why] Found when running igt@kms_atomic. Userspace attempts to do a TEST_COMMIT when 0 streams which calls dc_remove_stream_from_ctx. This in turn calls link_enc_unassign which ends up modifying stream->link = NULL directly, causing the global link_enc to be removed preventing further link activity and future link validation from passing. [How] We take care of link_enc unassignment at the start of link_enc_cfg_link_encs_assign so this call is no longer necessary. Fixes global state from being modified while unlocked. Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Mario Limonciello	7294863a6f	drm/amd: Check if ASPM is enabled from PCIe subsystem commit `0064b0ce85` ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: koba.ko@canonical.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Shreeya Patel	ae42f92888	gpio: Return EPROBE_DEFER if gc->to_irq is NULL We are racing the registering of .to_irq when probing the i2c driver. This results in random failure of touchscreen devices. Following explains the race condition better. [gpio driver] gpio driver registers gpio chip [gpio consumer] gpio is acquired [gpio consumer] gpiod_to_irq() fails with -ENXIO [gpio driver] gpio driver registers irqchip gpiod_to_irq works at this point, but -ENXIO is fatal We could see the following errors in dmesg logs when gc->to_irq is NULL [2.101857] i2c_hid i2c-FTS3528:00: HID over i2c has not been provided an Int IRQ [2.101953] i2c_hid: probe of i2c-FTS3528:00 failed with error -22 To avoid this situation, defer probing until to_irq is registered. Returning -EPROBE_DEFER would be the first step towards avoiding the failure of devices due to the race in registration of .to_irq. Final solution to this issue would be to avoid using gc irq members until they are fully initialized. This issue has been reported many times in past and people have been using workarounds like changing the pinctrl_amd to built-in instead of loading it as a module or by adding a softdep for pinctrl_amd into the config file. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209413 Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>	2022-02-23 22:30:56 +01:00
Linus Torvalds	23d0432844	parisc unaligned handler fixes Two patches which fix a few bugs in the unalignment handlers. The fldd and fstd instructions weren't handled at all on 32-bit kernels, the stw instruction didn't checked for fault errors and the fldw_l and ldw_m were handled wrongly as integer vs. floating point instructions. Both patches are tagged for stable series. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYhZtYAAKCRD3ErUQojoP X95qAP4umgbso0RkZrxClYVCON6J/Ndo3PHEe3B992/Jv+R7ZAEA2O5ZOs7nUjdv rN27wpJU/BBhbKBqs3rYCw4UgQ2zTQI= =H3yW -----END PGP SIGNATURE----- Merge tag 'for-5.17/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc unaligned handler fixes from Helge Deller: "Two patches which fix a few bugs in the unalignment handlers. The fldd and fstd instructions weren't handled at all on 32-bit kernels, the stw instruction didn't check for fault errors and the fldw_l and ldw_m were handled wrongly as integer vs floating point instructions. Both patches are tagged for stable series" * tag 'for-5.17/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc/unaligned: Fix ldw() and stw() unalignment handlers parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel	2022-02-23 12:06:23 -08:00
Linus Torvalds	6f5738db96	hwmon fixes for v5.17-rc6 Fix two old bugs and one new bug in hwmon subsystem. - In pmbus core, clear pmbus fault/warning status bits after read to follow PMBus standard - In hwmon core, handle failure to register sensor with thermal zone correctly - In ntc_thermal driver, use valid thermistor names for Samsung thermistors -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAmIVr/UACgkQyx8mb86f mYHEig//UZe83C/FMxYH/tHp/71+IEafZraTe0IOLPZSKW3OXgtOMymiJtgOokzl eD60/GwRSu9wlmYeemQowlXwGd6dLsiCy1SgZVzE532Zm0VEx/hFTPrxifsZl2TK PDlYCbWJ11xaGpnhXN7oCAuYxtY2z3VK6OJFh9nsLpWSVhFECsQJSpt6TjwXD+uD tIdhLEuqENmBonLP4nxBRBctNjwiPBHppGAvjUty7HkChTFA6b3cd9ydR7V/xgwb +mfzWVG9d0DbgVJtS33/r6/K927mzZNz9Her7MKBfAaDoRifRUspT+UEuM/2GTb8 ukigFVCQg6pzf53FrzY3fgH98ENl6rQ/Fmkt2nAAbIpq+FDy1T9LqiSaUF2qfQbR CmeYiADlCsPWTFR5MO+TWxGuBx4YfxkMJQ4+eAHO6ms1/OnSHVdpzgYQ+fZG7L4D NSlYFuDq8Q4Jv52J/vffaYiZf3U4QTsE3KnopDuwITYKNKIht1pj57wlk2dIKwMg gkH8jX9crCI2/TXZPxTRgFNlIBGr1D9SiYyTgVzmfqqWl1S/zQMzg9JYjmB29aZC Q2YO9Kji+XTMCbNwPat+yOxe01Tnt3msgVt9honA5HT/a6tGMQ0zt45sIcl0RAC5 rEbe6HnLmob8IH731B4+6lQNBg4WoAw7pUN2vfLQhfYTnWxQLXk= =hqZA -----END PGP SIGNATURE----- Merge tag 'hwmon-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fix two old bugs and one new bug in the hwmon subsystem: - In pmbus core, clear pmbus fault/warning status bits after read to follow PMBus standard - In hwmon core, handle failure to register sensor with thermal zone correctly - In ntc_thermal driver, use valid thermistor names for Samsung thermistors" * tag 'hwmon-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pmbus) Clear pmbus fault/warning bits after read hwmon: Handle failure to register sensor with thermal zone correctly hwmon: (ntc_thermistor) Underscore Samsung thermistor	2022-02-23 11:51:35 -08:00
Linus Torvalds	4eb0a7c8e1	slab fixes for 5.17-rc6 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmIV/fQACgkQ4CHKc/GJ qRBhaAgAoz81fjhlkcCaHdgxVTEx6L93iJQJiZWoE3gTKk2jruun3sIYmPSOiY+b bWR1datDnvaS/Xv04rZ6pm6XPjCT+LmrOQCOlZMjptc6HKoKuDZTvcQ0u5CYOfQS I9ZRtPaHjSmhntS8BErxGes5+PF1hz/q2rGuODt4/DQCNPZNdHXMdym9w4Z4xHXm TuH2VXzv5JXhYlUEDz2HP8LXmbvxA9rGaMgngpX92pCL8uTLqANoZCT+zHEj3cKw db6/A8S7Y4PsfF0JphNup+wcsWj+yfIrfAQwnTgNXR4hlhbUxOHTJqXlQGK/NW7C tg2nXxQQn14MwPlkatdxFYzg1TbMYg== =U6Fu -----END PGP SIGNATURE----- Merge tag 'slab-for-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Build fix (workaround) for clang. - Fix a /proc/kcore based slabinfo script broken by struct slab changes in 5.17-rc1. * tag 'slab-for-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: tools/cgroup/slabinfo: update to work with struct slab slab: remove __alloc_size attribute from __kmalloc_track_caller	2022-02-23 11:33:12 -08:00
Bart Van Assche	081bdc9fe0	RDMA/ib_srp: Fix a deadlock Remove the flush_workqueue(system_long_wq) call since flushing system_long_wq is deadlock-prone and since that call is redundant with a preceding cancel_work_sync() Link: https://lore.kernel.org/r/20220215210511.28303-3-bvanassche@acm.org Fixes: `ef6c49d87c` ("IB/srp: Eliminate state SRP_TARGET_DEAD") Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 15:03:58 -04:00
Maxime Ripard	515415d316	ARM: boot: dts: bcm2711: Fix HVS register range While the HVS has the same context memory size in the BCM2711 than in the previous SoCs, the range allocated to the registers doubled and it now takes 16k + 16k, compared to 8k + 16k before. The KMS driver will use the whole context RAM though, eventually resulting in a pointer dereference error when we access the higher half of the context memory since it hasn't been mapped. Fixes: `4564363351` ("ARM: dts: bcm2711: Enable the display pipeline") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2022-02-23 11:03:29 -08:00
Alex Deucher	3f1271b54e	PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken There are enough VBIOS escapes without the proper workaround that some users still hit this. Microsoft never productized ATS on Windows so OEM platforms that were Windows-only didn't always validate ATS. The advantages of ATS are not worth it compared to the potential instabilities on harvested boards. Disable ATS on all Navi10 and Navi14 boards. Symptoms include: amdgpu 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000] AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000] [drm:amdgpu_job_timedout [amdgpu]] ERROR ring sdma0 timeout, signaled seq=6047, emitted seq=6049 amdgpu 0000:07:00.0: amdgpu: GPU reset begin! amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring sdma0 test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] ERROR resume of IP block <sdma_v4_0> failed -110 amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed Related commits: `e8946a53e2` ("PCI: Mark AMD Navi14 GPU ATS as broken") `a2da5d8cc0` ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms") `45beb31d3a` ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken") `5e89cd303e` ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken") `d28ca864c4` ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken") `9b44b0b09d` ("PCI: Mark AMD Stoney GPU ATS as broken") [bhelgaas: add symptoms and related commits] Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1760 Link: https://lore.kernel.org/r/20220222160801.841643-1-alexander.deucher@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com>	2022-02-23 12:33:32 -06:00
Helge Deller	a972798368	parisc/unaligned: Fix ldw() and stw() unalignment handlers Fix 3 bugs: a) emulate_stw() doesn't return the error code value, so faulting instructions are not reported and aborted. b) Tell emulate_ldw() to handle fldw_l as floating point instruction c) Tell emulate_ldw() to handle ldw_m as integer instruction Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org	2022-02-23 18:01:06 +01:00
Helge Deller	dd2288f4a0	parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel Usually the kernel provides fixup routines to emulate the fldd and fstd floating-point instructions if they load or store 8-byte from/to a not natuarally aligned memory location. On a 32-bit kernel I noticed that those unaligned handlers didn't worked and instead the application got a SEGV. While checking the code I found two problems: First, the OPCODE_FLDD_L and OPCODE_FSTD_L cases were ifdef'ed out by the CONFIG_PA20 option, and as such those weren't built on a pure 32-bit kernel. This is now fixed by moving the CONFIG_PA20 #ifdef to prevent the compilation of OPCODE_LDD_L and OPCODE_FSTD_L only, and handling the fldd and fstd instructions. The second problem are two bugs in the 32-bit inline assembly code, where the wrong registers where used. The calculation of the natural alignment used %2 (vall) instead of %3 (ior), and the first word was stored back to address %1 (valh) instead of %3 (ior). Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org	2022-02-23 18:01:06 +01:00
Qu Wenruo	26fbac2517	btrfs: autodefrag: only scan one inode once Although we have btrfs_requeue_inode_defrag(), for autodefrag we are still just exhausting all inode_defrag items in the tree. This means, it doesn't make much difference to requeue an inode_defrag, other than scan the inode from the beginning till its end. Change the behaviour to always scan from offset 0 of an inode, and till the end. By this we get the following benefit: - Straight-forward code - No more re-queue related check - Fewer members in inode_defrag We still keep the same btrfs_get_fs_root() and btrfs_iget() check for each loop, and added extra should_auto_defrag() check per-loop. Note: the patch needs to be backported and is intentionally written to minimize the diff size, code will be cleaned up later. CC: stable@vger.kernel.org # 5.16 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:55:01 +01:00
Qu Wenruo	199257a78b	btrfs: defrag: don't use merged extent map for their generation check For extent maps, if they are not compressed extents and are adjacent by logical addresses and file offsets, they can be merged into one larger extent map. Such merged extent map will have the higher generation of all the original ones. But this brings a problem for autodefrag, as it relies on accurate extent_map::generation to determine if one extent should be defragged. For merged extent maps, their higher generation can mark some older extents to be defragged while the original extent map doesn't meet the minimal generation threshold. Thus this will cause extra IO. So solve the problem, here we introduce a new flag, EXTENT_FLAG_MERGED, to indicate if the extent map is merged from one or more ems. And for autodefrag, if we find a merged extent map, and its generation meets the generation requirement, we just don't use this one, and go back to defrag_get_extent() to read extent maps from subvolume trees. This could cause more read IO, but should result less defrag data write, so in the long run it should be a win for autodefrag. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:43:13 +01:00
Qu Wenruo	d5633b0dee	btrfs: defrag: bring back the old file extent search behavior For defrag, we don't really want to use btrfs_get_extent() to iterate all extent maps of an inode. The reasons are: - btrfs_get_extent() can merge extent maps And the result em has the higher generation of the two, causing defrag to mark unnecessary part of such merged large extent map. This in fact can result extra IO for autodefrag in v5.16+ kernels. However this patch is not going to completely solve the problem, as one can still using read() to trigger extent map reading, and got them merged. The completely solution for the extent map merging generation problem will come as an standalone fix. - btrfs_get_extent() caches the extent map result Normally it's fine, but for defrag the target range may not get another read/write for a long long time. Such cache would only increase the memory usage. - btrfs_get_extent() doesn't skip older extent map Unlike the old find_new_extent() which uses btrfs_search_forward() to skip the older subtree, thus it will pick up unnecessary extent maps. This patch will fix the regression by introducing defrag_get_extent() to replace the btrfs_get_extent() call. This helper will: - Not cache the file extent we found It will search the file extent and manually convert it to em. - Use btrfs_search_forward() to skip entire ranges which is modified in the past This should reduce the IO for autodefrag. Reported-by: Filipe Manana <fdmanana@suse.com> Fixes: `7b508037d4` ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:43:07 +01:00
Qu Wenruo	550f133f69	btrfs: defrag: remove an ambiguous condition for rejection From the very beginning of btrfs defrag, there is a check to reject extents which meet both conditions: - Physically adjacent We may want to defrag physically adjacent extents to reduce the number of extents or the size of subvolume tree. - Larger than 128K This may be there for compressed extents, but unfortunately 128K is exactly the max capacity for compressed extents. And the check is > 128K, thus it never rejects compressed extents. Furthermore, the compressed extent capacity bug is fixed by previous patch, there is no reason for that check anymore. The original check has a very small ranges to reject (the target extent size is > 128K, and default extent threshold is 256K), and for compressed extent it doesn't work at all. So it's better just to remove the rejection, and allow us to defrag physically adjacent extents. CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:42:55 +01:00
Qu Wenruo	979b25c300	btrfs: defrag: don't defrag extents which are already at max capacity [BUG] For compressed extents, defrag ioctl will always try to defrag any compressed extents, wasting not only IO but also CPU time to compress/decompress: mkfs.btrfs -f $DEV mount -o compress $DEV $MNT xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar sync xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar sync echo "=== before ===" xfs_io -c "fiemap -v" $MNT/foobar btrfs filesystem defrag $MNT/foobar sync echo "=== after ===" xfs_io -c "fiemap -v" $MNT/foobar Then it shows the 2 128K extents just get COW for no extra benefit, with extra IO/CPU spent: === before === /mnt/btrfs/file1: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..255]: 26624..26879 256 0x8 1: [256..511]: 26632..26887 256 0x9 === after === /mnt/btrfs/file1: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..255]: 26640..26895 256 0x8 1: [256..511]: 26648..26903 256 0x9 This affects not only v5.16 (after the defrag rework), but also v5.15 (before the defrag rework). [CAUSE] From the very beginning, btrfs defrag never checks if one extent is already at its max capacity (128K for compressed extents, 128M otherwise). And the default extent size threshold is 256K, which is already beyond the compressed extent max size. This means, by default btrfs defrag ioctl will mark all compressed extent which is not adjacent to a hole/preallocated range for defrag. [FIX] Introduce a helper to grab the maximum extent size, and then in defrag_collect_targets() and defrag_check_next_extent(), reject extents which are already at their max capacity. Reported-by: Filipe Manana <fdmanana@suse.com> CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:42:53 +01:00
Qu Wenruo	7093f15291	btrfs: defrag: don't try to merge regular extents with preallocated extents [BUG] With older kernels (before v5.16), btrfs will defrag preallocated extents. While with newer kernels (v5.16 and newer) btrfs will not defrag preallocated extents, but it will defrag the extent just before the preallocated extent, even it's just a single sector. This can be exposed by the following small script: mkfs.btrfs -f $dev > /dev/null mount $dev $mnt xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file xfs_io -c "fiemap -v" $mnt/file btrfs fi defrag $mnt/file sync xfs_io -c "fiemap -v" $mnt/file The output looks like this on older kernels: /mnt/btrfs/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..7]: 26624..26631 8 0x0 1: [8..39]: 26632..26663 32 0x801 /mnt/btrfs/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..39]: 26664..26703 40 0x1 Which defrags the single sector along with the preallocated extent, and replace them with an regular extent into a new location (caused by data COW). This wastes most of the data IO just for the preallocated range. On the other hand, v5.16 is slightly better: /mnt/btrfs/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..7]: 26624..26631 8 0x0 1: [8..39]: 26632..26663 32 0x801 /mnt/btrfs/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..7]: 26664..26671 8 0x0 1: [8..39]: 26632..26663 32 0x801 The preallocated range is not defragged, but the sector before it still gets defragged, which has no need for it. [CAUSE] One of the function reused by the old and new behavior is defrag_check_next_extent(), it will determine if we should defrag current extent by checking the next one. It only checks if the next extent is a hole or inlined, but it doesn't check if it's preallocated. On the other hand, out of the function, both old and new kernel will reject preallocated extents. Such inconsistent behavior causes above behavior. [FIX] - Also check if next extent is preallocated If so, don't defrag current extent. - Add comments for each branch why we reject the extent This will reduce the IO caused by defrag ioctl and autodefrag. CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-02-23 17:42:52 +01:00
Varun Prakash	c2700d2886	nvme-tcp: send H2CData PDUs based on MAXH2CDATA As per NVMe/TCP specification (revision 1.0a, section 3.6.2.3) Maximum Host to Controller Data length (MAXH2CDATA): Specifies the maximum number of PDU-Data bytes per H2CData PDU in bytes. This value is a multiple of dwords and should be no less than 4,096. Current code sets H2CData PDU data_length to r2t_length, it does not check MAXH2CDATA value. Fix this by setting H2CData PDU data_length to min(req->h2cdata_left, queue->maxh2cdata). Also validate MAXH2CDATA value returned by target in ICResp PDU, if it is not a multiple of dword or if it is less than 4096 return -EINVAL from nvme_tcp_init_connection(). Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2022-02-23 14:43:11 +01:00

1 2 3 4 5 ...

1073926 Commits All Branches Search

1073926 Commits

All Branches