Commit Graph

5997 Commits

Author SHA1 Message Date
Abel Vesa 1ce91d45ba misc: fastrpc: Add reserved mem support
The reserved mem support is needed for CMA heap support, which will be
used by AUDIOPD.

Co-developed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20221125071405.148786-4-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-25 18:45:33 +01:00
Abel Vesa 1959ab9edc misc: fastrpc: Rename audio protection domain to root
The AUDIO_PD will be done via static pd, so the proper name here is
actually ROOT_PD.

Co-developed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20221125071405.148786-3-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-25 18:45:33 +01:00
Greg Kroah-Hartman ff62b8e658 driver core: make struct class.devnode() take a const *
The devnode() in struct class should not be modifying the device that is
passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Justin Sanders <justin@coraid.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: John Stultz <jstultz@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Sean Young <sean@mess.org>
Cc: Frank Haverkamp <haver@linux.ibm.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Xie Yongji <xieyongji@bytedance.com>
Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Cc: Dan Carpenter <error27@gmail.com>
Cc: Eli Cohen <elic@nvidia.com>
Cc: Parav Pandit <parav@nvidia.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: alsa-devel@alsa-project.org
Cc: dri-devel@lists.freedesktop.org
Cc: kvm@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-block@vger.kernel.org
Cc: linux-input@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20221123122523.1332370-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-24 17:12:27 +01:00
Yang Yingliang 5f58cad1e4 ocxl: fix pci device refcount leak when calling get_function_0()
get_function_0() calls pci_get_domain_bus_and_slot(), as comment
says, it returns a pci device with refcount increment, so after
using it, pci_dev_put() needs be called.

Get the device reference when get_function_0() is not called, so
pci_dev_put() can be called in the error path and callers
unconditionally. And add comment above get_dvsec_vendor0() to tell
callers to call pci_dev_put().

Fixes: 87db7579eb ("ocxl: control via sysfs whether the FPGA is reloaded on a link reset")
Suggested-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221121154339.4088935-1-yangyingliang@huawei.com
2022-11-24 23:31:46 +11:00
Yang Yingliang 295faa1772 ocxl: fix possible name leak in ocxl_file_register_afu()
If device_register() returns error in ocxl_file_register_afu(),
the name allocated by dev_set_name() need be freed. As comment
of device_register() says, it should use put_device() to give
up the reference in the error path. So fix this by calling
put_device(), then the name can be freed in kobject_cleanup(),
and info is freed in info_release().

Fixes: 75ca758adb ("ocxl: Create a clear delineation between ocxl backend & frontend")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221111145929.2429271-1-yangyingliang@huawei.com
2022-11-24 23:31:44 +11:00
Yang Yingliang 8bf03f557d cxl: fix possible null-ptr-deref in cxl_pci_init_afu|adapter()
If device_register() fails in cxl_pci_afu|adapter(), the device
is not added, device_unregister() can not be called in the error
path, otherwise it will cause a null-ptr-deref because of removing
not added device.

As comment of device_register() says, it should use put_device() to give
up the reference in the error path. So split device_unregister() into
device_del() and put_device(), then goes to put dev when register fails.

Fixes: f204e0b8ce ("cxl: Driver code for powernv PCIe based cards for userspace access")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221111145440.2426970-2-yangyingliang@huawei.com
2022-11-24 23:31:25 +11:00
Yang Yingliang f949ccee1d cxl: fix possible null-ptr-deref in cxl_guest_init_afu|adapter()
If device_register() fails in cxl_register_afu|adapter(), the device
is not added, device_unregister() can not be called in the error path,
otherwise it will cause a null-ptr-deref because of removing not added
device.

As comment of device_register() says, it should use put_device() to give
up the reference in the error path. So split device_unregister() into
device_del() and put_device(), then goes to put dev when register fails.

Fixes: 14baf4d9c7 ("cxl: Add guest-specific code")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221111145440.2426970-1-yangyingliang@huawei.com
2022-11-24 23:31:24 +11:00
Miaoqian Lin 1d09697ff2 cxl: Fix refcount leak in cxl_calc_capp_routing
of_get_next_parent() returns a node pointer with refcount incremented,
we should use of_node_put() on it when not need anymore.
This function only calls of_node_put() in normal path,
missing it in the error path.
Add missing of_node_put() to avoid refcount leak.

Fixes: f24be42aab ("cxl: Add psl9 specific code")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220605060038.62217-1-linmq006@gmail.com
2022-11-24 23:12:19 +11:00
Dave Airlie d47f958083 Linux 6.1-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmN6wAgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG0EYH/3/RO90NbrFItraN
 Lzr+d3VdbGjTu8xd1M+PRTmwh3zxLpB+Jwqr0T0A2gzL9B/D+AUPUJdrCVbv9DqS
 FLJAVqoeV20dNBAHSffOOLPsgCZ+Eu+LzlNN7Iqde0e8cyZICFMNktitui84Xm/i
 1NgFVgz9OZ6+aieYvUj3FrFq0p8GTIaC/oybDZrxYKcO8ZzKVMJ11swRw10wwq0g
 qOOECvV3w7wlQ8upQZkzFxItKFc7EexZI6R4elXeGSJJ9Hlc092dv/zsKB9dwV+k
 WcwkJrZRoezYXzgGBFxUcQtzi+ethjrPjuJuM1rYLUSIcfIW/0lkaSLgRoBu8D+I
 1GfXkXs=
 =gt6P
 -----END PGP SIGNATURE-----

Backmerge tag 'v6.1-rc6' into drm-next

Linux 6.1-rc6

This is needed for drm-misc-next and tegra.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-11-24 11:05:43 +10:00
Yang Yingliang 02cd3032b1 cxl: fix possible null-ptr-deref in cxl_pci_init_afu|adapter()
If device_register() fails in cxl_pci_afu|adapter(), the device
is not added, device_unregister() can not be called in the error
path, otherwise it will cause a null-ptr-deref because of removing
not added device.

As comment of device_register() says, it should use put_device() to give
up the reference in the error path. So split device_unregister() into
device_del() and put_device(), then goes to put dev when register fails.

Fixes: f204e0b8ce ("cxl: Driver code for powernv PCIe based cards for userspace access")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111145440.2426970-2-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 20:03:31 +01:00
Yang Yingliang 61c80d1c38 cxl: fix possible null-ptr-deref in cxl_guest_init_afu|adapter()
If device_register() fails in cxl_register_afu|adapter(), the device
is not added, device_unregister() can not be called in the error path,
otherwise it will cause a null-ptr-deref because of removing not added
device.

As comment of device_register() says, it should use put_device() to give
up the reference in the error path. So split device_unregister() into
device_del() and put_device(), then goes to put dev when register fails.

Fixes: 14baf4d9c7 ("cxl: Add guest-specific code")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111145440.2426970-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 20:03:31 +01:00
Uwe Kleine-König 3127a86a37 misc: ds1682: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-487-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 781edb0530 misc: bh1770glc: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-486-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 9f28b675c1 misc: apds9802als: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-484-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 6757c6480d misc: apds990x: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-485-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 244179dbe1 misc: eeprom/idt_89hpesx: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-489-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König db687ce718 misc: isl29003: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-493-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 9c18dad44d misc: ics932s401: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-492-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:39 +01:00
Uwe Kleine-König 654700c9fc misc: hmc6352: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-491-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:38 +01:00
Uwe Kleine-König 99b0cb3f5f misc: eeprom/max6875: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-490-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:08 +01:00
Uwe Kleine-König 327e1ad186 misc: isl29020: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-494-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:07 +01:00
Uwe Kleine-König 8427bd8bde misc: tsl2550: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-496-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:05 +01:00
Uwe Kleine-König 59ee8ca4ee misc: eeprom/eeprom: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-488-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:03 +01:00
Uwe Kleine-König 7198cf0f1c misc: lis3lv02d/lis3lv02d_i2c: Convert to i2c's .probe_new()
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221118224540.619276-495-uwe@kleine-koenig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:56:01 +01:00
Zheng Wang 643a16a0eb misc: sgi-gru: fix use-after-free error in gru_set_context_option, gru_fault and gru_handle_user_call_os
In some bad situation, the gts may be freed gru_check_chiplet_assignment.
The call chain can be gru_unload_context->gru_free_gru_context->gts_drop
and kfree finally. However, the caller didn't know if the gts is freed
or not and use it afterwards. This will trigger a Use after Free bug.

Fix it by introducing a return value to see if it's in error path or not.
Free the gts in caller if gru_check_chiplet_assignment check failed.

Fixes: 55484c45db ("gru: allow users to specify gru chiplet 2")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Acked-by: Dimitri Sivanich <sivanich@hpe.com>
Link: https://lore.kernel.org/r/20221110035033.19498-1-zyytlz.wz@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:55:48 +01:00
ruanjinjie fd2c930cf6 misc: tifm: fix possible memory leak in tifm_7xx1_switch_media()
If device_register() returns error in tifm_7xx1_switch_media(),
name of kobject which is allocated in dev_set_name() called in device_add()
is leaked.

Never directly free @dev after calling device_register(), even
if it returned an error! Always use put_device() to give up the
reference initialized.

Fixes: 2428a8fe22 ("tifm: move common device management tasks from tifm_7xx1 to tifm_core")
Signed-off-by: ruanjinjie <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20221117064725.3478402-1-ruanjinjie@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:55:26 +01:00
Yang Yingliang 27158c7267 ocxl: fix pci device refcount leak when calling get_function_0()
get_function_0() calls pci_get_domain_bus_and_slot(), as comment
says, it returns a pci device with refcount increment, so after
using it, pci_dev_put() needs be called.

Get the device reference when get_function_0() is not called, so
pci_dev_put() can be called in the error path and callers
unconditionally. And add comment above get_dvsec_vendor0() to tell
callers to call pci_dev_put().

Fixes: 87db7579eb ("ocxl: control via sysfs whether the FPGA is reloaded on a link reset")
Suggested-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Link: https://lore.kernel.org/r/20221121154339.4088935-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:49:22 +01:00
Yang Yingliang a4cb1004ae misc: ocxl: fix possible name leak in ocxl_file_register_afu()
If device_register() returns error in ocxl_file_register_afu(),
the name allocated by dev_set_name() need be freed. As comment
of device_register() says, it should use put_device() to give
up the reference in the error path. So fix this by calling
put_device(), then the name can be freed in kobject_cleanup(),
and info is freed in info_release().

Fixes: 75ca758adb ("ocxl: Create a clear delineation between ocxl backend & frontend")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111145929.2429271-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:49:17 +01:00
Alexander Usyskin 0ef77698b8 mei: bus-fixup: change pxp mode only if message was sent
Move PXP mode state machine to SETUP mode only if
memory ready message sent successfully to the firmware.
Leave it in INIT mode otherwise to allow try to send message later.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Link: https://lore.kernel.org/r/20221116124735.2493847-3-alexander.usyskin@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:43:33 +01:00
Alexander Usyskin 83f47eea74 mei: add timeout to send
When driver wakes up the firmware from the low power state,
it is sending a memory ready message.
The send is done via synchronous/blocking function to ensure
that firmware is in ready state. However, in case of firmware
undergoing reset send might be block forever.
To address this issue a timeout is added to blocking
write command on the internal bus.

Introduce the __mei_cl_send_timeout function to use instead of
__mei_cl_send in cases where timeout is required.
The mei_cl_write has only two callers and there is no need to split
it into two functions.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Link: https://lore.kernel.org/r/20221116124735.2493847-2-alexander.usyskin@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 19:43:33 +01:00
Ohad Sharabi 19a17a9fb4 habanalabs: fix VA range calculation
Current implementation is fixing the page size to PAGE_SIZE whereas the
input page size may be different.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:54:10 +02:00
Ofir Bitton 5354a2a001 habanalabs: fail driver load if EEPROM errors detected
In case EEPROM is not burned, firmware sets default EEPROM values.
As this is not valid in production, driver should fail load upon any
EEPROM error reported by firmware.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:54:10 +02:00
Tomer Tayar 1b18cf33d6 habanalabs: make print of engines idle mask more readable
The engines idle mask was increased to be an array of 4 u64 entries.
To make the print of this mask more readable, remove the "0x" prefix,
and zero-pad each u64 to 16 bytes if either it isn't zero or if any of
the higher-order u64's is not zero.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:49:10 +02:00
Tomer Tayar 893afb248c habanalabs: clear non-released encapsulated signals
Reserved encapsulated signals which were not released hold the context
refcount, leading to a failure when killing the user process on device
reset or device fini.
Add the release of these left signals in the CS roll-back process.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:47:42 +02:00
Tomer Tayar 1f615120fc habanalabs: don't put context in hl_encaps_handle_do_release_sob()
hl_encaps_handle_do_release_sob() can be called only when the last
reference to the context object is released and hl_ctx_do_release() is
initiated, and therefore it shouldn't call hl_ctx_put().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:46:29 +02:00
Tomer Tayar 408c46bd6e habanalabs: print context refcount value if hard reset fails
Failing to kill a user process during a hard reset can be due to a
reference to the user context which isn't released.
To make it easier to understand if this the reason for the failure and
not something else, add a print of the context refcount value.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:45:23 +02:00
Dafna Hirschfeld 0abcae8b48 habanalabs: add RMWREG32_SHIFTED to set a val within a mask
This is similar to RMWREG32, but the given 'val' is already shifted
according to the mask.
This allows several 'ORed' vals and masks to be set at once
The patch also fixes wrong usage of RMWREG32 by replacing
it with RMWREG32_SHIFTED

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:44:42 +02:00
Tomer Tayar 56fb517775 habanalabs: fix rc when new CPUCP opcodes are not supported
When the new CPUCP opcodes are not supported and a CPUCP packet fails,
the return value is the F/W error resposone which is a positive value.
If this packet is sent from IOCTL and the positive value is used, the
ICOTL will not be considered as unsuccessful.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:44:02 +02:00
Marco Pagani 6825b5f81f habanalabs/gaudi2: added memset for the cq_size register
The clang-analyzer reported a warning: "Value stored to
'cq_size_addr' is never read".

The cq_size register of dcore0 is not being zeroed using
gaudi2_memset_device_lbw(), along with the other cq_* registers,
even though the corresponding cq_size_addr variable is set.

Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:22:38 +02:00
Marco Pagani 5908560a7f habanalabs: added return value check for hl_fw_dynamic_send_clear_cmd()
The clang-analyzer reported a warning: "Value stored to 'rc' is never
read".

The return value check for the first hl_fw_dynamic_send_clear_cmd() call
in hl_fw_dynamic_send_protocol_cmd() appears to be missing.

Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:48 +02:00
Tomer Tayar 01907ba525 habanalabs: increase the size of busy engines mask
Increase the size of the busy engines mask in 'struct hl_info_hw_idle',
for future ASICs with more than 128 engines.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:48 +02:00
farah kassabri 18cd948204 habanalabs/gaudi2: change memory scrub mechanism
Currently the scrubbing mechanism used the EDMA engines by directly
setting the engine core registers to scrub a chunk of memory.
Due to a sporadic failure with this mechanism, it was decided to
initiate the engines via its QMAN using LIN-DMA packets.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:48 +02:00
Oded Gabbay b585daa89d habanalabs: extend process wait timeout in device fine
Processes that use our device are likely to use at the same time other
devices such as remote storage.

In case our device is removed and a user process is still using the
device, we need to kill the user process. However, if that process
has a thread waiting for i/o to complete on remote storage, for example,
the process won't terminate.

Let's give it enough time to terminate before giving up.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2022-11-23 16:13:48 +02:00
Oded Gabbay f69c3e460a habanalabs: check schedule_hard_reset correctly
schedule_hard_reset can be true only if we didn't do hard-reset.
Therefore, no point of checking it in case hard_reset is true.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2022-11-23 16:13:48 +02:00
Tomer Tayar bc8e4bae70 habanalabs: reset device if still in use when released
If the device file is released while a context is still held, it won't
be possible to reopen it until the context is eventually released.
If that doesn't happen, only a device reset will revert it back to an
operational state, i.e. need to wait for a CS timeout or an error, or to
wait for an external intervention of injecting a reset via sysfs.

At this stage, after the device was released by user, context is held
either because of CS which were left running on the device and are not
relevant anymore, or due to missing cleanup steps from user side.

All of this is in any case handled in the device reset flow, so initiate
the reset at this point instead of waiting for it.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:48 +02:00
Tomer Tayar 9c604af0c9 habanalabs/gaudi2: return to reset upon SM SEI BRESP error
Due to a H/W issue in the LBW path to the PCIE_DBI MSI-X doorbell, there
were false sporadic error responses in SM when it was configured to
write to there, and hence no reset was done as part of handling the
relevant event.
Now that the virtual MSI-X doorbell is used, such errors in SM are not
expected and reset shouldn't be skipped.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
Tomer Tayar 2c77ec14c2 habanalabs/gaudi2: don't enable entries in the MSIX_GW table
User should use the virtual MSI-X doorbell to generate interrupts from
the device, so there is no need to enable entries in the MSIX_GW table.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
farah kassabri 24c983c88f habanalabs/gaudi2: remove redundant firmware version check
Firmware 1.7 is the first official firmware, so no need to check
if we are running a version below it.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
Tomer Tayar fe3e88c947 habanalabs/gaudi: fix print for firmware-alive event
Add missing le{32,64}_to_cpu conversions.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
Tomer Tayar 5f8981d699 habanalabs: fix print for out-of-sync and pkt-failure events
Add missing le32_to_cpu() conversions, and use %d for the value
returned from atomic_read().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
Dani Liberman d3027f4a62 habanalabs/gaudi2: add page fault notify event
Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:47 +02:00
Ofir Bitton a63de89bee habanalabs/gaudi2: classify power/thermal events as info
As power and thermal envelope events are pure informative and not
indicating an error, we reduce the print level to info only.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Ohad Sharabi b829e01025 habanalabs: skip events info ioctl if not supported
Some ASICs haven't yet implemented this functionality and so the
ioctl call should fail and the user should be notified of the reason.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
farah kassabri 3daa64eea1 habanalabs: fix firmware descriptor copy operation
This is needed to allow adding more data to the lkd_fw_comms_desc
structure.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Dani Liberman 413bdb176e habanalabs/gaudi2: add razwi notify event
Each time razwi (read-only zero, write ignored) event happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Ofir Bitton 91bd822448 habanalabs/gaudi2: implement fp32 not supported event
Due to binning, Gaudi2 does not always support fp32.
We add support for such an event in case fp32 is used by the user
in such a device.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Dani Liberman aff6354afd habanalabs/gaudi: add page fault notify event
Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Dani Liberman cd21701cde habanalabs: use single threaded WQ for event handling
Creating event queue workqueue using alloc_workqueue made it run in
multi threaded mode, which caused parallel dumping of events as well as
parallel events notifying to user, causing logs with multiple
events to be out of order.

Fixed by creating event queue workqueue as single threaded work queue.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Dani Liberman cb5fb665f3 habanalabs/gaudi: add razwi notify event
Each time razwi (read-only zero, write ignore) happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Ofir Bitton 841cd2d765 habanalabs/gaudi2: add PCI revision 2 support
Add support for Gaudi2 Device with PCI revision 2.
Functionality is exactly the same as revision 1, the only difference
is device name exposed to user.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:46 +02:00
Ofir Bitton 306206985a habanalabs: remove redundant gaudi2_sec asic type
As Gaudi2 has a single PCI id, the secured asic type is redundant.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:45 +02:00
Ofir Bitton bdfef91e7c habanalabs: add warning print upon a PCI error
In order to know if driver catches PCI errors correctly, we need to
print a warning per each error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:45 +02:00
Tomer Tayar fc69aa8640 habanalabs: fix PCIe access to SRAM via debugfs
hl_access_sram_dram_region() uses a region base which is set within the
hl_set_dram_bar() function. However, for SRAM access this function is
not called, and we end up with a wrong value of region base and with a
bad calculated address.
Fix it by initializing the region base value independently of whether
hl_set_dram_bar() is called or not.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:45 +02:00
farah kassabri 679e968908 habanalabs: zero ts registration buff when allocated
To avoid memory corruption in kernel memory while using timestamp
registration nodes, zero the kernel buff memory when its allocated.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:45 +02:00
Tal Cohen 4a9c6e2cdf habanalabs: no consecutive err when user context is enabled
Consecutive error protects a device reset loop from being triggered
due to h/w issues and enters the device into an unavailable state.
When user may cause the error, an unavailable state
will prevent the user from running its workloads.

The commit prevents entering consecutive state when a user context
is enabled.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:44 +02:00
Tomer Tayar 1b363adc7f habanalabs: use graceful hard reset for CS timeouts
Use graceful hard reset when detecting a CS timeout that requires a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:44 +02:00
Tomer Tayar d1ce7e5ea1 habanalabs/gaudi2: use graceful hard reset for F/W events
Use graceful hard reset for F/W events on Gaudi2 device that require a
device reset.

While at it, do a small refactor of the checks and function calls,
to simplify it and to avoid code duplication.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Tomer Tayar 5b8873b39c habanalabs/gaudi: use graceful hard reset for F/W events
Use graceful hard reset for F/W events on Gaudi device that require a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Tomer Tayar 11669b58fa habanalabs: add an option to control watchdog timeout via debugfs
Add an option to control the timeout value for the driver's watchdog
of the reset process. The timeout represents the amount of the user
has to close his process once he gets a device reset notification from
the driver.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Tomer Tayar a88a6f5f5c habanalabs: add support for graceful hard reset
Calling hl_device_reset() for a hard reset will lead to a quite
immediate device reset and to killing user process.
For resets that follow errors, it disables the option to debug the
errors on both the device side and the user application side.

This patch adds a 'graceful hard reset' option and a new
hl_device_cond_reset() function.
Under some conditions, mainly if there is no user process or if he is
not registered to driver notifications, this function will execute hard
reset as usual.
Otherwise, the reset will be postponed and a notification will be sent
to user, to let him perform post-error actions and then to release the
device, after which reset will take place.

If device is not released by user in some defined time, a watchdog work
will execute the reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Ohad Sharabi d1e0ac37ed habanalabs: avoid divide by zero in device utilization
Currently there is no verification whether the divisor is legal.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Dani Liberman 6bcb2d05a5 habanalabs: fix user mappings calculation in case of page fault
As there are 2 types of user mappings, pmmu and hmmu, calculate
only the relevant mappings for the requested type.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Tomer Tayar 5ad06bb1d2 habanalabs/gaudi2: remove configurations to access the MSI-X doorbell
The virtual MSI-X doorbell is supported now in F/W, so all
configurations to access the PCIE_DBI MSI-X doorbell can be removed.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Ohad Sharabi e325d5dbf3 habanalabs: allow setting HBM BAR to other regions
Up until now the use-case in the driver was that the HBM is accessed
using the HBM BAR, yet the BAR sometimes cannot cover the whole HBM and
so we needed to set the BAR to other HBM offset.
Now we are facing the need to access other PCI memory regions that can
be covered by the HBM BAR.
To answer that we are allowing the caller to determine if the HBM BAR
need to be set or not regardless of the PCI memory region.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Ohad Sharabi 24fdfb359c habanalabs: fix using freed pointer
The code uses the pointer for trace purpose (without actually
dereference it) but still get static analysis warning.
This patch eliminate the warning.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:43 +02:00
Dilip Puri dc8d243cae habanalabs/gaudi2: unsecure CBU_EARLY_BRESP registers
NIC ARCs need to have access to CBU_EARLY_BRESP, hence we unsecure
those registers.

Signed-off-by: Dilip Puri <dilipp@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:42 +02:00
Tal Cohen 27cd39afde habanalabs: verify no zero event is sent
The event notifier mechanism should not raise an empty
event (event equals zero).

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:42 +02:00
Dani Liberman 4f11694f27 habanalabs/gaudi2: capture page fault data
Capture page fault data when it happens.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:42 +02:00
Dani Liberman 15ac503cdc habanalabs/gaudi2: capture RAZWI information
Added function to calculate possible engines which caused
RAZWI (read-only zero, write ignored), from a given router id or
module index.

When getting RAZWI via PSOC IP, first the router id is calculated
and then the possible engines that caused the RAZWI are calculated.

There is a possibility that the RAZWI initiator is not an engine. In
that case, it will not be included in possible engines as it
doesn't have an engine id.

RAZWI information is captured when receiving event from engine or via
PSOC IP.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:41 +02:00
Dani Liberman 17f3f42af2 habanalabs: handle HBM MMU when capturing page fault data
In case of HBM MMU page fault, capture its relevant mappings.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Tomer Tayar 1eebb25929 habanalabs: move reset workqueue to be under hl_device
'struct hl_device_reset_work' is used as a wrapper for the reset work
and its parameters, including the reset workqueue on which it runs.
In a future commit, another reset related work with similar parameters
is going to be added, but it won't use the reset workqueue.

As in any case there is a single reset workqueue, and to allow the resue
of this structure, move the reset workqueue to 'struct hl_device'.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Tomer Tayar 51236cd95e habanalabs: allow unregistering eventfd when device non-operational
Unregistering eventfd is for releasing host resources and doesn't
involve an access to the device. As such, there is no reason to disallow
it when device isn't operational.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Tomer Tayar 3a83ebc521 habanalabs: skip idle status check if reset on device release
If reset upon device release is enabled, there is no need to check the
device idle status in hpriv_release(), because device is going to be
reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Tal Cohen 5731b6e6f0 habanalabs/gaudi2: add device unavailable notification
Device unavailable notifies the user that there isn't an option to
retrieve debug information from the device.
When a critical device error occurs and the f/w performs the device
reset, a device unavailable notification shall be sent to the user
process.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Koby Elbaz 16448d6444 habanalabs/gaudi2: remove privileged MME clock configuration
Privileged MME clock configuration is removed as it is done by the f/w.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Dafna Hirschfeld 189b203ebb habanalabs: replace 'pf' to 'prefetch'
pf was an abbreviation for prefetch but because pf already stands
for 'physical function', we decided to change it to 'prefetch'.

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Dani Liberman dd600db47b habanalabs: add page fault info uapi
Only the first page fault will be saved.
Besides the address which caused the page fault, the driver captures
all of the mmu user mappings.
User can retrieve this data via the new uapi (new opcode in INFO ioctl).

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Tomer Tayar 6d1c567f2a habanalabs/gaudi2: fix module ID for RAZWI handling
RAZWI is optionally handled as part of the generic QM SEI error
handling, but it always uses PDMA as the module ID.
Fix it to use the suitable module ID according to the specific event.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Bharat Jauhari 0502df9bbe habanalabs: use lower_32_bits()
This fixes sparse warning on doing cast to 32-bits

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:40 +02:00
Dani Liberman 52d5e54695 habanalabs: refactor razwi event notification
This event notification was compatible only with gaudi, where razwi
and page fault happens together.

To make it compatible with all ASICs, this refactor contains:

1. Razwi notification will only notify about razwi info.
   New notification will be added in future patch, to retrieve data
   about page fault error.

2. Changed razwi info structure to support all ASICs.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:39 +02:00
Oded Gabbay ea73ef14dd habanalabs: Use simplified API for p2p dist calc
Use the simplified API that calculates distance between two devices.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:39 +02:00
Ofir Bitton a925d90b36 habanalabs: allow control device open during reset
Monitoring apps would like to query device state at any time so we
should allow it also during reset because it doesn't involve
accessing the h/w.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:39 +02:00
Yang Yingliang 8749c27895 habanalabs: fix return value check in hl_fw_get_sec_attest_data()
If hl_cpu_accessible_dma_pool_alloc() fails, we should check
'req_cpu_addr', fix it.

Fixes: 0c88760f8f ("habanalabs/gaudi2: add secured attestation info uapi")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23 16:13:39 +02:00
Greg Kroah-Hartman 210a671cc3 Linux 6.1-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmN6wAgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG0EYH/3/RO90NbrFItraN
 Lzr+d3VdbGjTu8xd1M+PRTmwh3zxLpB+Jwqr0T0A2gzL9B/D+AUPUJdrCVbv9DqS
 FLJAVqoeV20dNBAHSffOOLPsgCZ+Eu+LzlNN7Iqde0e8cyZICFMNktitui84Xm/i
 1NgFVgz9OZ6+aieYvUj3FrFq0p8GTIaC/oybDZrxYKcO8ZzKVMJ11swRw10wwq0g
 qOOECvV3w7wlQ8upQZkzFxItKFc7EexZI6R4elXeGSJJ9Hlc092dv/zsKB9dwV+k
 WcwkJrZRoezYXzgGBFxUcQtzi+ethjrPjuJuM1rYLUSIcfIW/0lkaSLgRoBu8D+I
 1GfXkXs=
 =gt6P
 -----END PGP SIGNATURE-----

Merge 6.1-rc6 into char-misc-next

We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-21 10:05:34 +01:00
Dmitry Osipenko 265751a513 fastrpc: Assert held reservation lock for dma-buf mmapping
When userspace mmaps dma-buf's fd, the dma-buf reservation lock must be
held. Add locking sanity check to the dma-buf mmaping callback to ensure
that the locking assumption won't regress in the future.

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221110201349.351294-7-dmitry.osipenko@collabora.com
2022-11-11 23:49:54 +03:00
Quan Nguyen 763dc90e9a misc: smpro-misc: Add Ampere's Altra SMpro misc driver
Add driver support for accessing various information reported by
Ampere's SMpro co-processor such as Boot Progress and other
miscellaneous data.

Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20221031024442.2490881-4-quan@os.amperecomputing.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-10 19:03:03 +01:00
Quan Nguyen 4a4a4e9eba misc: smpro-errmon: Add Ampere's SMpro error monitor driver
Add Ampere's SMpro error monitor driver for monitoring and reporting
RAS-related errors as reported by SMpro co-processor found on Ampere's
Altra processor family.

Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20221031024442.2490881-3-quan@os.amperecomputing.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-10 19:02:43 +01:00
Bo Liu 982a84455e misc: genwqe: card_base: Fix some kernel-doc warnings
Fixes the following W=1 kernel build warning(s):
  drivers/misc/genwqe/card_base.c:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: Bo Liu <liubo03@inspur.com>
Link: https://lore.kernel.org/r/20221031063557.2710-1-liubo03@inspur.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-10 18:38:40 +01:00
Alexander Potapenko e5b0d06d9b misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram()
`struct vmci_event_qp` allocated by qp_notify_peer() contains padding,
which may carry uninitialized data to the userspace, as observed by
KMSAN:

  BUG: KMSAN: kernel-infoleak in instrument_copy_to_user ./include/linux/instrumented.h:121
   instrument_copy_to_user ./include/linux/instrumented.h:121
   _copy_to_user+0x5f/0xb0 lib/usercopy.c:33
   copy_to_user ./include/linux/uaccess.h:169
   vmci_host_do_receive_datagram drivers/misc/vmw_vmci/vmci_host.c:431
   vmci_host_unlocked_ioctl+0x33d/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:925
   vfs_ioctl fs/ioctl.c:51
  ...

  Uninit was stored to memory at:
   kmemdup+0x74/0xb0 mm/util.c:131
   dg_dispatch_as_host drivers/misc/vmw_vmci/vmci_datagram.c:271
   vmci_datagram_dispatch+0x4f8/0xfc0 drivers/misc/vmw_vmci/vmci_datagram.c:339
   qp_notify_peer+0x19a/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1479
   qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
   qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750
   vmci_qp_broker_alloc+0x96/0xd0 drivers/misc/vmw_vmci/vmci_queue_pair.c:1940
   vmci_host_do_alloc_queuepair drivers/misc/vmw_vmci/vmci_host.c:488
   vmci_host_unlocked_ioctl+0x24fd/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:927
  ...

  Local variable ev created at:
   qp_notify_peer+0x54/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1456
   qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
   qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750

  Bytes 28-31 of 48 are uninitialized
  Memory access of size 48 starts at ffff888035155e00
  Data copied to user address 0000000020000100

Use memset() to prevent the infoleaks.

Also speculatively fix qp_notify_peer_local(), which may suffer from the
same problem.

Reported-by: syzbot+39be4da489ed2493ba25@syzkaller.appspotmail.com
Cc: stable <stable@kernel.org>
Fixes: 06164d2b72 ("VMCI: queue pairs implementation.")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Link: https://lore.kernel.org/r/20221104175849.2782567-1-glider@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-09 15:40:03 +01:00
Dave Airlie 60ba8c5bd9 Merge tag 'drm-intel-gt-next-2022-11-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:

- Fix for #7306: [Arc A380] white flickering when using arc as a
  secondary gpu (Matt A)
- Add Wa_18017747507 for DG2 (Wayne)
- Avoid spurious WARN on DG1 due to incorrect cache_dirty flag
  (Niranjana, Matt A)
- Corrections to CS timestamp support for Gen5 and earlier (Ville)

- Fix a build error used with clang compiler on hwmon (GG)
- Improvements to LMEM handling with RPM (Anshuman, Matt A)
- Cleanups in dmabuf code (Mike)

- Selftest improvements (Matt A)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com
2022-11-04 17:33:34 +10:00
Lu Baolu 942fd5435d iommu: Remove SVM_FLAG_SUPERVISOR_MODE support
The current kernel DMA with PASID support is based on the SVA with a flag
SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address
space to a PASID of the device. The device driver programs the device with
kernel virtual address (KVA) for DMA access. There have been security and
functional issues with this approach:

- The lack of IOTLB synchronization upon kernel page table updates.
  (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.)
- Other than slight more protection, using kernel virtual address (KVA)
  has little advantage over physical address. There are also no use
  cases yet where DMA engines need kernel virtual addresses for in-kernel
  DMA.

This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface.
The device drivers are suggested to handle kernel DMA with PASID through
the kernel DMA APIs.

The drvdata parameter in iommu_sva_bind_device() and all callbacks is not
needed anymore. Cleanup them as well.

Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Link: https://lore.kernel.org/r/20221031005917.45690-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03 15:47:45 +01:00
Jason A. Donenfeld 6770473832 misc: sgi-gru: use explicitly signed char
With char becoming unsigned by default, and with `char` alone being
ambiguous and based on architecture, signed chars need to be marked
explicitly as such. This fixes warnings like:

drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned

Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20221025025223.573543-1-Jason@zx2c4.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-25 19:12:10 +02:00
Maxime Ripard a140a6a2d5
Merge drm/drm-next into drm-misc-next
Let's kick-off this release cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-10-18 15:00:03 +02:00
Dmitry Osipenko 791da5c7fe misc: fastrpc: Prepare to dynamic dma-buf locking specification
Prepare fastrpc to the common dynamic dma-buf locking convention by
starting to use the unlocked versions of dma-buf API functions.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-12-dmitry.osipenko@collabora.com
2022-10-18 01:21:46 +03:00
Jason A. Donenfeld a251c17aa5 treewide: use get_random_u32() when possible
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Linus Torvalds 27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds a09476668e Char/Misc and other driver changes for 6.1-rc1
Here is the large set of char/misc and other small driver subsystem
 changes for 6.1-rc1.  Loads of different things in here:
   - IIO driver updates, additions, and changes.  Probably the largest
     part of the diffstat
   - habanalabs driver update with support for new hardware and features,
     the second largest part of the diff.
   - fpga subsystem driver updates and additions
   - mhi subsystem updates
   - Coresight driver updates
   - gnss subsystem updates
   - extcon driver updates
   - icc subsystem updates
   - fsi subsystem updates
   - nvmem subsystem and driver updates
   - misc driver updates
   - speakup driver additions for new features
   - lots of tiny driver updates and cleanups
 
 All of these have been in the linux-next tree for a while with no
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0GQmA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylyVQCeNJjZ3hy+Wz8WkPSY+NkehuIhyCIAnjXMOJP8
 5G/JQ+rpcclr7VOXlS66
 =zVkU
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc and other driver updates from Greg KH:
 "Here is the large set of char/misc and other small driver subsystem
  changes for 6.1-rc1. Loads of different things in here:

   - IIO driver updates, additions, and changes. Probably the largest
     part of the diffstat

   - habanalabs driver update with support for new hardware and
     features, the second largest part of the diff.

   - fpga subsystem driver updates and additions

   - mhi subsystem updates

   - Coresight driver updates

   - gnss subsystem updates

   - extcon driver updates

   - icc subsystem updates

   - fsi subsystem updates

   - nvmem subsystem and driver updates

   - misc driver updates

   - speakup driver additions for new features

   - lots of tiny driver updates and cleanups

  All of these have been in the linux-next tree for a while with no
  reported issues"

* tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
  w1: Split memcpy() of struct cn_msg flexible array
  spmi: pmic-arb: increase SPMI transaction timeout delay
  spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
  spmi: pmic-arb: correct duplicate APID to PPID mapping logic
  spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
  spmi: pmic-arb: check apid against limits before calling irq handler
  spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
  spmi: pmic-arb: handle spurious interrupt
  spmi: pmic-arb: add a print in cleanup_irq
  drivers: spmi: Directly use ida_alloc()/free()
  MAINTAINERS: add TI ECAP driver info
  counter: ti-ecap-capture: capture driver support for ECAP
  Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
  dt-bindings: counter: add ti,am62-ecap-capture.yaml
  counter: Introduce the COUNTER_COMP_ARRAY component type
  counter: Consolidate Counter extension sysfs attribute creation
  counter: Introduce the Count capture component
  counter: 104-quad-8: Add Signal polarity component
  counter: Introduce the Signal polarity component
  counter: interrupt-cnt: Implement watch_validate callback
  ...
2022-10-08 08:56:37 -07:00
Linus Torvalds ab29622157 whack-a-mole: cropped up open-coded file_inode() uses...
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxj0gAKCRBZ7Krx/gZQ
 66/1AQC/KfIAINNOPxozsZaxOaOKo0ouVJ7sJV4ZGsPKpU69gwD/UodJZCtyZ52h
 wwkmfzTDjAgGt1QCKj96zk2XFqg4swE=
 =u0pv
 -----END PGP SIGNATURE-----

Merge tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file_inode() updates from Al Vrio:
 "whack-a-mole: cropped up open-coded file_inode() uses..."

* tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: use ->f_mapping
  _nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping
  dma_buf: no need to bother with file_inode()->i_mapping
  nfs_finish_open(): don't open-code file_inode()
  bprm_fill_uid(): don't open-code file_inode()
  sgx: use ->f_mapping...
  exfat_iterate(): don't open-code file_inode(file)
  ibmvmc: don't open-code file_inode()
2022-10-06 17:22:11 -07:00
Linus Torvalds 7e6739b933 drm pull for 6.1-rc1
core:
 - convert selftests to kunit
 - managed init for more objects
 - move to idr_init_base
 - rename fb and gem cma helpers to dma
 - hide unregistered connectors from getconnector ioctl
 - DSC passthrough aux support
 - backlight handling improvements
 - add dma_resv_assert_held to vmap/vunmap
 
 edid:
 - move luminance calculation to core
 
 fbdev:
 - fix aperture helper usage
 
 fourcc:
 - add more format helpers
 - add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx, DRM_FORMAT_Dxx
 - add packed AYUV8888, XYUV8888
 - add some kunit tests
 
 ttm:
 - allow bos without backing store
 - rewrite placement to use intersect/compatible functions
 
 dma-buf:
 - docs update
 - improve signalling when debugging
 
 udmabuf:
 - fix failure path GPF
 
 dp:
 - drop dp/mst legacy code
 - atomic mst state support
 - audio infoframe packing
 
 panel:
 - Samsung LTL101AL01
 - B120XAN01.0
 - R140NWF5 RH
 - DMT028VGHMCMI-1A T
 - AUO B133UAN02.1
 - IVO M133NW4J-R3
 - Innolux N120ACA-EA1
 
 amdgpu:
 - Gang submit support
 - Mode2 reset for RDNA2
 - New IP support:
   DCN 3.1.4, 3.2
   SMU 13.x
   NBIO 7.7
   GC 11.x
   PSP 13.x
   SDMA 6.x
   GMC 11.x
 - DSC passthrough support
 - PSP fixes for TA support
 - vangogh GFXOFF stats
 - clang fixes
 - gang submit CS cleanup prep work
 - fix VRAM eviction issues
 
 amdkfd:
 - GC 10.3 IP ISA fixes
 - fix CRIU regression
 - CPU fault on COW mapping fixes
 
 i915:
 - align fw versioning with kernel practices
 - add display substruct to i915 private
 - add initial runtime info to driver info
 - split out HDCP and backlight registers
 - MEI XeHP SDV GSC support
 - add per-gt sysfs defaults
 - TLB invalidation improvements
 - Disable PCI BAR resize on 32-bit
 - GuC firmware updates and compat changes
 - GuC log timestamp translation
 - DG2 preemption workaround changes
 - DG2 improved HDMI pixel clocks support
 - PCI BAR sanity checks
 - Enable DC5 on DG2
 - DG2 DMC fw bumped
 - ADL-S PCI ID added
 - Meteorlake enablement
 - Rename ggtt_view to gtt_view
 - host RPS fixes
 - release mmaps on rpm suspend on discrete
 - clocking and dpll refactoring
 - VBT definitions and parsing updates
 - SKL watermark code extracted to separate file
 - allow seamless M/N changes on eDP panels
 - BUG_ON removal and cleanups
 
 msm:
 - DPU: simplified VBIF configuration
 -      cleanup CTL interfaces
 - DSI: removed unused msm_display_dsc_config struct
 -      switch regulator calls to new API
 -      switched to PANEL_BRIDGE for direct attached panels
 - DSI_PHY: convert drivers to parent_hws
 - DP: cleanup pixel_rate handling
 - HDMI: turned hdmi-phy-8996 into OF clk provider
 - misc dt-bindings fixes
 - choose eDP as primary display if it's available
 - support getting interconnects from either the mdss or the mdp5/dpu
   device nodes
 - gem: Shrinker + LRU re-work:
 - adds a shared GEM LRU+shrinker helper and moves msm over to that
 - reduces lock contention between retire and submit by avoiding the
   need to acquire obj lock in retire path (and instead using resv
   seeing obj's busyness in the shrinker
 - fix reclaim vs submit issues
 - GEM fault injection for triggering userspace error paths
 - Map/unmap optimization
 - Improved robustness for a6xx GPU recovery
 
 virtio:
 - Improve error and edge conditions handling
 - Convert to use managed helpers
 - stop exposing LINEAR modifier
 
 mgag200:
 - split modeset handling per model
 
 udl:
 - suspend/disconnect handling improvements
 
 vc4:
 - rework HDMI power up
 - depend on PM
 - better unplugging support
 
 ast:
 - resolution handling improvements
 
 ingenic:
 - Add JZ4760(B) support
 - avoid a modeset when sharpness property is unchanged
 - use the new PM ops
 
 it6505:
 - power seq and clock updates
 
 ssd130x:
 - regmap bulk write
 - use atomic helpers instead of simple helpers
 
 via:
 - rename via_drv to via_dri1, consolidate all code.
 
 radeon:
 - drop DP MST experimental support
 - delayed work flush fix
 - use time_after
 
 ti-sn65dsi86:
 - DP support
 
 mediatek:
 - MT8195 DP support
 - drop of_gpio header
 - remove unneeded result
 - small DP code improvements
 
 vkms:
 - RGB565, XRGB64 and ARGB64 support
 
 sun4i:
 - tv: convert to atomic
 
 rcar-du:
 - Synopsys DW HDMI bridge DT bindings update
 
 exynos:
 - use drm_display_info.is_hdmi
 - correct return of mixer_mode_valid and hdmi_mode_valid
 
 omap:
 - refcounting fix
 
 rockchip:
 - RK3568 support
 - RK3399 gamma support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmM894sACgkQDHTzWXnE
 hr7EYw//WdVe69TNauCAQiOYdmPp1twmr2o5gDOFLoo4IZw5v+qK0HL/nTrDkBq6
 xIu1GLTScOh0AItW1rhFmrtKhO/u/QPQ15P6cO7x8AzlUIhVOYqM79+OA0X6zIV8
 IZjpc6EEWPSKJTCRud9HdzsV06DIa+QlwShLCaOFxRiGSuUqsxzacIHUqnFekRnV
 PBG7RzcmdWwe6Gy/7T2wegsFjw1mh4S4FypEGs53emru3PGvcau5dwXcE5Jro7Br
 k4BFFknuXahVJ2ynVfIFn3QUQRMLgAKRWflqxo7McLeKVQEt4gfB6+PaMwGpSiRQ
 iC9QPy69TWEx6X015q2DvvlQDewnCbPOlzyoj9O991QDGLPIim8srPblr8DPeeOz
 Y7IW1PRVnPdKReMJvTyrIVED/XT9fUoR7N+F9sfPnEee5HsvjXNGumEHbOE8avFf
 rB6CFdby+Ecd9cSeINXowFy4ss0d5zCHMiKEVyQWTZOJysp29vLyKezNqU5m37FK
 LAQHtsRdn1+V3o22H5y1PJyqssbOMImMV1ffqW/urRLLefPVHIKCKI8Ycgh0qxqc
 B+gebHMgF8j6RR0DHAcQby+PIVi/Pn36TAMI3lPsVjFWGS5s5EQwpKlMNj46H0Cr
 yE2Vr4w29+Nsv0b4Uz16AZ0mHcauqx4bvWMyT0frJfNcE86x8MI=
 =u/MJ
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-10-05' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Lots of stuff all over, some new AMD IP support and gang submit
  support. i915 has further DG2 and Meteorlake pieces, and a bunch of
  i915 display refactoring. msm has a shrinker rework. There are also a
  bunch of conversions to use kunit.

  This has two external pieces, some MEI changes needed for future Intel
  discrete GPUs. These should be acked by Greg. There is also a cross
  maintainer shared tree with some backlight rework from Hans in here.

  Core:
   - convert selftests to kunit
   - managed init for more objects
   - move to idr_init_base
   - rename fb and gem cma helpers to dma
   - hide unregistered connectors from getconnector ioctl
   - DSC passthrough aux support
   - backlight handling improvements
   - add dma_resv_assert_held to vmap/vunmap

  edid:
   - move luminance calculation to core

  fbdev:
   - fix aperture helper usage

  fourcc:
   - add more format helpers
   - add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx, DRM_FORMAT_Dxx
   - add packed AYUV8888, XYUV8888
   - add some kunit tests

  ttm:
   - allow bos without backing store
   - rewrite placement to use intersect/compatible functions

  dma-buf:
   - docs update
   - improve signalling when debugging

  udmabuf:
   - fix failure path GPF

  dp:
   - drop dp/mst legacy code
   - atomic mst state support
   - audio infoframe packing

  panel:
   - Samsung LTL101AL01
   - B120XAN01.0
   - R140NWF5 RH
   - DMT028VGHMCMI-1A T
   - AUO B133UAN02.1
   - IVO M133NW4J-R3
   - Innolux N120ACA-EA1

  amdgpu:
   - Gang submit support
   - Mode2 reset for RDNA2
   - New IP support:
        DCN 3.1.4, 3.2
        SMU 13.x
        NBIO 7.7
        GC 11.x
        PSP 13.x
        SDMA 6.x
        GMC 11.x
   - DSC passthrough support
   - PSP fixes for TA support
   - vangogh GFXOFF stats
   - clang fixes
   - gang submit CS cleanup prep work
   - fix VRAM eviction issues

  amdkfd:
   - GC 10.3 IP ISA fixes
   - fix CRIU regression
   - CPU fault on COW mapping fixes

  i915:
   - align fw versioning with kernel practices
   - add display substruct to i915 private
   - add initial runtime info to driver info
   - split out HDCP and backlight registers
   - MEI XeHP SDV GSC support
   - add per-gt sysfs defaults
   - TLB invalidation improvements
   - Disable PCI BAR resize on 32-bit
   - GuC firmware updates and compat changes
   - GuC log timestamp translation
   - DG2 preemption workaround changes
   - DG2 improved HDMI pixel clocks support
   - PCI BAR sanity checks
   - Enable DC5 on DG2
   - DG2 DMC fw bumped
   - ADL-S PCI ID added
   - Meteorlake enablement
   - Rename ggtt_view to gtt_view
   - host RPS fixes
   - release mmaps on rpm suspend on discrete
   - clocking and dpll refactoring
   - VBT definitions and parsing updates
   - SKL watermark code extracted to separate file
   - allow seamless M/N changes on eDP panels
   - BUG_ON removal and cleanups

  msm:
   - DPU:
       simplified VBIF configuration
       cleanup CTL interfaces
   - DSI:
       removed unused msm_display_dsc_config struct
       switch regulator calls to new API
       switched to PANEL_BRIDGE for direct attached panels
   - DSI_PHY: convert drivers to parent_hws
   - DP: cleanup pixel_rate handling
   - HDMI: turned hdmi-phy-8996 into OF clk provider
   - misc dt-bindings fixes
   - choose eDP as primary display if it's available
   - support getting interconnects from either the mdss or the mdp5/dpu
     device nodes
   - gem: Shrinker + LRU re-work:
   - adds a shared GEM LRU+shrinker helper and moves msm over to that
   - reduce lock contention between retire and submit by avoiding the
     need to acquire obj lock in retire path (and instead using resv
     seeing obj's busyness in the shrinker
   - fix reclaim vs submit issues
   - GEM fault injection for triggering userspace error paths
   - Map/unmap optimization
   - Improved robustness for a6xx GPU recovery

  virtio:
   - improve error and edge conditions handling
   - convert to use managed helpers
   - stop exposing LINEAR modifier

  mgag200:
   - split modeset handling per model

  udl:
   - suspend/disconnect handling improvements

  vc4:
   - rework HDMI power up
   - depend on PM
   - better unplugging support

  ast:
   - resolution handling improvements

  ingenic:
   - add JZ4760(B) support
   - avoid a modeset when sharpness property is unchanged
   - use the new PM ops

  it6505:
   - power seq and clock updates

  ssd130x:
   - regmap bulk write
   - use atomic helpers instead of simple helpers

  via:
   - rename via_drv to via_dri1, consolidate all code.

  radeon:
   - drop DP MST experimental support
   - delayed work flush fix
   - use time_after

  ti-sn65dsi86:
   - DP support

  mediatek:
   - MT8195 DP support
   - drop of_gpio header
   - remove unneeded result
   - small DP code improvements

  vkms:
   - RGB565, XRGB64 and ARGB64 support

  sun4i:
   - tv: convert to atomic

  rcar-du:
   - Synopsys DW HDMI bridge DT bindings update

  exynos:
   - use drm_display_info.is_hdmi
   - correct return of mixer_mode_valid and hdmi_mode_valid

  omap:
   - refcounting fix

  rockchip:
   - RK3568 support
   - RK3399 gamma support"

* tag 'drm-next-2022-10-05' of git://anongit.freedesktop.org/drm/drm: (1374 commits)
  drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
  drm/amdkfd: Track unified memory when switching xnack mode
  drm/amdgpu: Enable sram on vcn_4_0_2
  drm/amdgpu: Enable VCN DPG for GC11_0_1
  drm/msm: Fix build break with recent mm tree
  drm/panel: simple: Use dev_err_probe() to simplify code
  drm/panel: panel-edp: Use dev_err_probe() to simplify code
  drm/panel: simple: Add Multi-Inno Technology MI0800FT-9
  dt-bindings: display: simple: Add Multi-Inno Technology MI0800FT-9 panel
  drm/amdgpu: correct the memcpy size for ip discovery firmware
  drm/amdgpu: Skip put_reset_domain if it doesn't exist
  drm/amdgpu: remove switch from amdgpu_gmc_noretry_set
  drm/amdgpu: Fix mc_umc_status used uninitialized warning
  drm/amd/display: Prevent OTG shutdown during PSR SU
  drm/amdgpu: add page retirement handling for CPU RAS
  drm/amdgpu: use RAS error address convert api in mca notifier
  drm/amdgpu: support to convert dedicated umc mca address
  drm/amdgpu: export umc error address convert interface
  drm/amdgpu: fix sdma v4 init microcode error
  drm/amd/display: fix array-bounds error in dc_stream_remove_writeback()
  ...
2022-10-05 11:24:12 -07:00
Linus Torvalds b86406d42a * 'remove' callback converted to return void. Big change with trivial
fixes all over the tree. Other subsystems depending on this change
   have been asked to pull an immutable topic branch for this.
 * new driver for Microchip PCI1xxxx switch
 * heavy refactoring of the Mellanox BlueField driver
 * we prefer async probe in the i801 driver now
 * the rest is usual driver updates (support for more SoCs, some
   refactoring, some feature additions)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmM7T3IACgkQFA3kzBSg
 KbYnAxAAn2SXzpUuuJ05hhk/y89RWHhzSilU+7d+egYfQJlbXUl2WzYx/Wu1BSZM
 ciyXuJFIiTywdUiX1r1VeMO80zmQQZXAUG7VygAtOSk7iPSd/qTyL+7J+k1DXADI
 hGR+pZLBVfTFyY3d1qHnwKFkzByvQjc2raARv9g7kDxkSQa8xI/sXScmhGYtrLch
 DUYUK1F3Sdqbk0FsudJ5Jvd7bZCSS+n+jSR+mrZaOXbkUD4JmDUauW8pAS6UI9in
 CxnjZoOLMHdAmC9ADanLeDRXxKz23uNU/9vdZ1/DMYnNsF/TnyWl6Rz/3BFE3YFk
 Vq7A1XAK4b3oJAgM92mdvKSkmzBIzkmj02vaVyuNPtRgHZo5MsIcEnWiBhymZY5g
 W6BPrjt/8YKRKeNlP/nrZmageklepsXZbUrNQt1ws8i4bbT+CKInKbjKLnBfDgVz
 5VSd8M9+y2Jd/JaJhMt9TBNmP0W2RrThxLF06Hux1ue7k4maE7Eljvkzcd4GJ6Un
 HYePZMhwCx3aeYsFmFT/V3kHFsfyHUlIFy/vgXTEICsKUpyj/dX96ANWhe+tJdcX
 Cknmc+XOVGPm0LPPju4M8WScMjSqNODm1yfDWUe2cRKlxzI45v6x4Oxl8rWD9hb4
 KKMGXit0LOtWETlHALffwFCifs6DdaaA0IMUtMQUj8egvys0enE=
 =arni
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:

 - 'remove' callback converted to return void. Big change with trivial
   fixes all over the tree. Other subsystems depending on this change
   have been asked to pull an immutable topic branch for this.

 - new driver for Microchip PCI1xxxx switch

 - heavy refactoring of the Mellanox BlueField driver

 - we prefer async probe in the i801 driver now

 - the rest is usual driver updates (support for more SoCs, some
   refactoring, some feature additions)

* tag 'i2c-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (37 commits)
  i2c: pci1xxxx: prevent signed integer overflow
  i2c: acpi: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
  i2c: i801: Prefer async probe
  i2c: designware-pci: Use standard pattern for memory allocation
  i2c: designware-pci: Group AMD NAVI quirk parts together
  i2c: microchip: pci1xxxx: Add driver for I2C host controller in multifunction endpoint of pci1xxxx switch
  docs: i2c: slave-interface: return errno when handle I2C_SLAVE_WRITE_REQUESTED
  i2c: mlxbf: remove device tree support
  i2c: mlxbf: support BlueField-3 SoC
  i2c: cadence: Add standard bus recovery support
  i2c: mlxbf: add multi slave functionality
  i2c: mlxbf: support lock mechanism
  macintosh/ams: Adapt declaration of ams_i2c_remove() to earlier change
  i2c: riic: Use devm_platform_ioremap_resource()
  i2c: mlxbf: remove IRQF_ONESHOT
  dt-bindings: i2c: rockchip: add rockchip,rk3128-i2c
  dt-bindings: i2c: renesas,rcar-i2c: Add r8a779g0 support
  i2c: tegra: Add GPCDMA support
  i2c: scmi: Convert to be a platform driver
  i2c: rk3x: Add rv1126 support
  ...
2022-10-04 18:54:33 -07:00
Linus Torvalds d0989d01c6 hardening updates for v6.1-rc1
Various fixes across several hardening areas:
 
 - loadpin: Fix verity target enforcement (Matthias Kaehlcke).
 
 - zero-call-used-regs: Add missing clobbers in paravirt (Bill Wendling).
 
 - CFI: clean up sparc function pointer type mismatches (Bart Van Assche).
 
 - Clang: Adjust compiler flag detection for various Clang changes (Sami
   Tolvanen, Kees Cook).
 
 - fortify: Fix warnings in arch-specific code in sh, ARM, and xen.
 
 Improvements to existing features:
 
 - testing: improve overflow KUnit test, introduce fortify KUnit test,
   add more coverage to LKDTM tests (Bart Van Assche, Kees Cook).
 
 - overflow: Relax overflow type checking for wider utility.
 
 New features:
 
 - string: Introduce strtomem() and strtomem_pad() to fill a gap in
   strncpy() replacement needs.
 
 - um: Enable FORTIFY_SOURCE support.
 
 - fortify: Enable run-time struct member memcpy() overflow warning.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmM4chcWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJvq1D/9uKU03RozAOnzhi4gcgRnHZSAK
 oOQOkPwnkUgFU0yOnMkNYOZ7njLnM+CjCN3RJ9SSpD2lrQ23PwLeThAuOzy0brPO
 0iAksIztSF3e5tAyFjtFkjswrY8MSv/TkF0WttTOSOj3lCUcwatF0FBkclCOXtwu
 ILXfG7K8E17r/wsUejN+oMAI42ih/YeVQAZpKRymEEJsK+Lly7OT4uu3fdFWVb1P
 M77eRLI2Vg1eSgMVwv6XdwGakpUdwsboK7do0GGX+JOrhayJoCfY2IpwyPz9ciel
 jsp9OQs8NrlPJMa2sQ7LDl+b5EQl/MtggX3JlQEbLs2LV7gDtYgAWNo6vxCT5Lvd
 zB7TZqIR3lrVjbtw4FAKQ+41bS4VOajk2NB3Mkiy5AfivB+6zKF+P56a+xSoNhOl
 iktpjCEP7bp4oxmTMXpOfmywjh/ZsyoMhQ2ABP7S+JZ5rHUndpPAjjuBetIcHxX2
 28Wlr4aFIF9ff9caasg4sMYXcQMGnuLUlUKngceUbd1umZZRNZ1gaIxYpm9poefm
 qd/lvTIvzn9V8IB8wHVmvafbvDbV88A+2bKJdSUDA352Dt9PvqT7yI0dmbMNliGL
 os+iLPW6Y6x38BxhXax0HR9FEhO3Eq7kLdNdc4J29NvISg8HHaifwNrG41lNwaWL
 cuc6IAjLxiRk3NsUpg==
 =HZ6+
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening updates from Kees Cook:
 "Most of the collected changes here are fixes across the tree for
  various hardening features (details noted below).

  The most notable new feature here is the addition of the memcpy()
  overflow warning (under CONFIG_FORTIFY_SOURCE), which is the next step
  on the path to killing the common class of "trivially detectable"
  buffer overflow conditions (i.e. on arrays with sizes known at compile
  time) that have resulted in many exploitable vulnerabilities over the
  years (e.g. BleedingTooth).

  This feature is expected to still have some undiscovered false
  positives. It's been in -next for a full development cycle and all the
  reported false positives have been fixed in their respective trees.
  All the known-bad code patterns we could find with Coccinelle are also
  either fixed in their respective trees or in flight.

  The commit message in commit 54d9469bc5 ("fortify: Add run-time WARN
  for cross-field memcpy()") for the feature has extensive details, but
  I'll repeat here that this is a warning _only_, and is not intended to
  actually block overflows (yet). The many patches fixing array sizes
  and struct members have been landing for several years now, and we're
  finally able to turn this on to find any remaining stragglers.

  Summary:

  Various fixes across several hardening areas:

   - loadpin: Fix verity target enforcement (Matthias Kaehlcke).

   - zero-call-used-regs: Add missing clobbers in paravirt (Bill
     Wendling).

   - CFI: clean up sparc function pointer type mismatches (Bart Van
     Assche).

   - Clang: Adjust compiler flag detection for various Clang changes
     (Sami Tolvanen, Kees Cook).

   - fortify: Fix warnings in arch-specific code in sh, ARM, and xen.

  Improvements to existing features:

   - testing: improve overflow KUnit test, introduce fortify KUnit test,
     add more coverage to LKDTM tests (Bart Van Assche, Kees Cook).

   - overflow: Relax overflow type checking for wider utility.

  New features:

   - string: Introduce strtomem() and strtomem_pad() to fill a gap in
     strncpy() replacement needs.

   - um: Enable FORTIFY_SOURCE support.

   - fortify: Enable run-time struct member memcpy() overflow warning"

* tag 'hardening-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (27 commits)
  Makefile.extrawarn: Move -Wcast-function-type-strict to W=1
  hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
  sparc: Unbreak the build
  x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
  x86/paravirt: clean up typos and grammaros
  fortify: Convert to struct vs member helpers
  fortify: Explicitly check bounds are compile-time constants
  x86/entry: Work around Clang __bdos() bug
  ARM: decompressor: Include .data.rel.ro.local
  fortify: Adjust KUnit test for modular build
  sh: machvec: Use char[] for section boundaries
  kunit/memcpy: Avoid pathological compile-time string size
  lib: Improve the is_signed_type() kunit test
  LoadPin: Require file with verity root digests to have a header
  dm: verity-loadpin: Only trust verity targets with enforcement
  LoadPin: Fix Kconfig doc about format of file with verity digests
  um: Enable FORTIFY_SOURCE
  lkdtm: Update tests for memcpy() run-time warnings
  fortify: Add run-time WARN for cross-field memcpy()
  fortify: Use SIZE_MAX instead of (size_t)-1
  ...
2022-10-03 17:24:22 -07:00
Tomas Winkler bd58904a32 mei: pxp: support matching with a gfx discrete card
With on-boards graphics card, both i915 and MEI
are in the same device hierarchy with the same parent,
while for discrete gfx card the MEI is its child device.
Adjust the match function for that scenario
by matching MEI parent device with i915.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Vitaly Lubart <vitaly.lubart@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-7-daniele.ceraolospurio@intel.com
2022-10-03 11:29:12 -07:00
Vitaly Lubart c72891256a mei: pxp: add command streamer API to the PXP driver
The discrete graphics card with GSC firmware
using command streamer API hence it requires to enhance
pxp module with the new gsc_command() handler.

The handler is implemented via mei_pxp_gsc_command() which is
just a thin wrapper around mei_cldev_send_gsc_command()

Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-6-daniele.ceraolospurio@intel.com
2022-10-03 11:29:11 -07:00
Vitaly Lubart 2266e58a1c mei: bus: extend bus API to support command streamer API
Add mei bus API for sending gsc commands: mei_cldev_send_gsc_command()

The GSC commands are originated in the graphics stack
and are in form of SGL DMA buffers.
The GSC commands are synchronous, the response is received
in the same call on the out sg list buffers.
The function setups pointers for in and out sg lists in the
mei sgl extended header and sends it to the firmware.

Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-5-daniele.ceraolospurio@intel.com
2022-10-03 11:29:10 -07:00
Tomas Winkler 2af56dde08 mei: adjust extended header kdocs
Fix kdoc for struct mei_ext_hdr and mei_ext_begin().

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-4-daniele.ceraolospurio@intel.com
2022-10-03 11:29:09 -07:00
Tomas Winkler 5d5bc18971 mei: bus: enable sending gsc commands
GSC command is and extended header containing a scatter gather
list and without a data buffer. Using MEI_CL_IO_SGL flag,
the caller send the GSC command as a data and the function internally
moves it to the extended header.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Vitaly Lubart <vitaly.lubart@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-3-daniele.ceraolospurio@intel.com
2022-10-03 11:29:08 -07:00
Tomas Winkler 4ed1cc997f mei: add support to GSC extended header
GSC extend header is of variable size and data
is provided in a sgl list inside the header
and not in the data buffers, need to enable the path.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Vitaly Lubart <vitaly.lubart@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-2-daniele.ceraolospurio@intel.com
2022-10-03 11:29:07 -07:00
Matthew Wilcox (Oracle) d9fa0e37cd cxl: remove vma linked list walk
Use the VMA iterator instead.  This requires a little restructuring of the
surrounding code to hoist the mm to the caller.  That turns
cxl_prefault_one() into a trivial function, so call cxl_fault_segment()
directly.

Link: https://lkml.kernel.org/r/20220906194824.2110408-38-Liam.Howlett@oracle.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:20 -07:00
Sami Tolvanen 607289a7cd treewide: Drop function_nocfi
With -fsanitize=kcfi, we no longer need function_nocfi() as
the compiler won't change function references to point to a
jump table. Remove all implementations and uses of the macro.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-14-samitolvanen@google.com
2022-09-26 10:13:14 -07:00
Sami Tolvanen cf90d03835 lkdtm: Emit an indirect call for CFI tests
Clang can convert the indirect calls in lkdtm_CFI_FORWARD_PROTO into
direct calls. Move the call into a noinline function that accepts the
target address as an argument to ensure the compiler actually emits an
indirect call instead.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-8-samitolvanen@google.com
2022-09-26 10:13:13 -07:00
Shang XiaoJing 9ea224b119 mei: gsc: Remove redundant dev_err call
devm_ioremap_resource() prints error message in itself. Remove the
dev_err call to avoid redundant error message.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Link: https://lore.kernel.org/r/20220923100841.17719-1-shangxiaojing@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-24 14:57:24 +02:00
Jilin Yuan 4b25cf09c6 mei: fix repeated words in comments
Delete the redundant word 'from'.

Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220918100431.28381-1-yuanjilin@cdjrlc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-24 14:57:21 +02:00
Gaosheng Cui 1b46c82146 drivers/misc/sgi-xp: Remove orphan declarations from drivers/misc/sgi-xp/xp.h
Remove the following orphan declarations from drivers/misc/sgi-xp/xp.h:
1. xp_nofault_PIOR_target
2. xp_error_PIOR
3. xp_nofault_PIOR

They have been removed since commit 9726bfcdb9 ("misc/sgi-xp:
remove SGI SN2 support"), so remove them.

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20220913110356.764711-1-cuigaosheng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-24 14:57:19 +02:00
Christophe JAILLET 62e5d00684 misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
'aux_bus' is freed in the remove function but not in the error handling
path of the probe.

Use devm_kzalloc() to simplify the remove function and fix the leak in the
probe.

Fixes: 393fc2f594 ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/17e19926669a1654e5f2495bf3b289581183d02e.1663482259.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-22 16:54:35 +02:00
Christophe JAILLET c8b4747569 misc: microchip: pci1xxxx: Do not disable the pci device twice in gp_aux_bus_remove()
gp_aux_bus_probe() uses pcim_enable_device(), so there is no point in
calling pci_disable_device() explicitly in the remove function.

Fixes: 393fc2f594 ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/8a3a385b3ae15ee7497469ec3250302b626a018b.1663482259.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-22 16:54:31 +02:00
Kumaravel Thiagarajan dc2c96a39d misc: microchip: pci1xxxx: use DEFINE_SIMPLE_DEV_PM_OPS() in place of the SIMPLE_DEV_PM_OPS() in pci1xxxx's gpio driver
build errors listed below and reported by Sudip Mukherjee
<sudipm.mukherjee@gmail.com> for the builds of
riscv, s390, csky, alpha and loongarch allmodconfig are fixed in
this patch.

drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c:311:12: error: 'pci1xxxx_gpio_resume' defined but not used [-Werror=unused-function]
  311 | static int pci1xxxx_gpio_resume(struct device *dev)
      |            ^~~~~~~~~~~~~~~~~~~~
drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c:295:12: error: 'pci1xxxx_gpio_suspend' defined but not used [-Werror=unused-function]
  295 | static int pci1xxxx_gpio_suspend(struct device *dev)
      |            ^~~~~~~~~~~~~~~~~~~~~

Fixes: 4ec7ac90ff ("misc: microchip: pci1xxxx: Add power management functions - suspend & resume handlers.")
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Link: https://lore.kernel.org/r/20220915094729.646185-1-kumaravel.thiagarajan@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-22 16:52:52 +02:00
Yihao Han 0e8bf26c77 misc: microchip: pci1xxxx: Remove duplicate include
Remove duplicate include in mchp_pci1xxxx_gpio.c

Fixes: 7d3e4d807d ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.")
Reviewed-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Link: https://lore.kernel.org/r/20220913030257.22352-1-hanyihao@vivo.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-22 16:52:42 +02:00
Dave Airlie 72ca70acc7 Merge tag 'drm-intel-gt-next-2022-09-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Cross-subsystem Changes:

- MEI subsystem pieces for XeHP SDV GSC support
  These are Acked-by Greg.

Driver Changes:

- Release mmaps on RPM suspend on discrete GPUs (Anshuman)
- Update GuC version to 7.5 on DG1, DG2 and ADL
- Revert "drm/i915/dg2: extend Wa_1409120013 to DG2" (Lucas)
- MTL enabling incl. standalone media (Matt R, Lucas)
- Explicitly clear BB_OFFSET for new contexts on Gen8+ (Chris)
- Fix throttling / perf limit reason decoding (Ashutosh)
- XeHP SDV GSC support (Vitaly, Alexander, Tomas)

- Fix issues with overrding firmware file paths (John)
- Invert if-else ladders to check latest version first (Lucas)
- Cancel GuC engine busyness worker synchronously (Umesh)

- Skip applying copy engine fuses outside PVC (Lucas)
- Eliminate Gen10 frequency read function (Lucas)
- Static code checker fixes (Gaosheng)
- Selftest improvements (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YyQ4Jgl3cpGL1/As@jlahtine-mobl.ger.corp.intel.com
2022-09-21 07:42:47 +10:00
Oded Gabbay 259cee1c24 habanalabs: eliminate aggregate use warning
When doing sizeof() and giving as argument a dereference of
a pointer-to-a-pointer object, clang will issue a warning.

Eliminate the warning by passing struct <name>*

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-20 15:52:27 +03:00
Tomer Tayar e403856468 habanalabs/gaudi: use 8KB aligned address for TPC kernels
I$ prefetch is enabled when sending a TPC kernel to initialize the TPC
memory, and it has a restriction that the base address will be aligned
to 8KB.
Currently the base address is 128 bytes from the start address of the
device SRAM, so prefetching will start 128 bytes before the actual
kernel memory.
Modify the kernel address to be 8KB aligned.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-20 15:51:54 +03:00
farah kassabri 6b9b9e244f habanalabs: remove some f/w descriptor validations
To be forward-backward compatible with the firmware in the initial
communication during preboot, we need to remove the validation of the
header size. This will allow us to add more fields to the
lkd_fw_comms_desc structure.

Instead of the validation of the header size, we just print warning
when some mismatch in descriptor has been revealed, and we calculate
the CRC base on descriptor size reported by the firmware instead of
calculating it ourselves.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-20 15:46:45 +03:00
Ohad Sharabi 8412bb69ed habanalabs: build ASICs from new to old
Newer ASICs code changes more often, has more chance to fail
compilation. So, let's compile them first so errors in those files
will fail compilation sooner.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-20 15:42:08 +03:00
Ofir Bitton bb677d527e habanalabs/gaudi2: allow user to flush PCIE by read
In order for the user to flush PCIE he needs to read some register
from PCIE block. The chosen register is SPECIAL_GLBL_SPARE_0 and
hence needs to be unsecured.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:10:01 +03:00
Oded Gabbay 4f3ce5e0d0 habanalabs: failure to open device due to reset is debug level
If the user wants to open the device, and the device is currently in
reset, the user will get an error from the open().

We don't need to display an error in the dmesg for that as it is
not a real error and we can spam the kernel log with this message.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:09:04 +03:00
Li zeming 006fd8cb65 habanalabs/gaudi2: Remove unnecessary (void*) conversions
The void pointer object can be directly assigned to different structure
objects, it does not need to be cast.

Signed-off-by: Li zeming <zeming@nfschina.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:40 +03:00
Dani Liberman 0c88760f8f habanalabs/gaudi2: add secured attestation info uapi
User will provide a nonce via the ioctl, and will retrieve
secured attestation data of the boot, generated using given
nonce.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:40 +03:00
Dani Liberman 43657dadfe habanalabs/gaudi2: add handling to pmmu events in eqe handler
In order to get the error cause and the captured address in case of
page fault, added pmmu events to eqe handler.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Tal Cohen ff13b900b0 habanalabs/gaudi: change TPC Assert to use TPC DEC instead of QMAN err
This change is done while there is a problem to use QMAN error for
TPC assert async. The problem involves security limitation that exists
to generate the assert via QMAN error.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Dani Liberman 97a78e3d8e habanalabs: rename error info structure
As a preparation for adding more errors to it,
change to more suitable name.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
farah kassabri 04d53cd2a6 habanalabs/gaudi2: get f/w reset status register dynamically
Get the firmware reset status address from the dynamic registers
we read from the firmware instead of using a define.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Tomer Tayar f0b6d3cc29 habanalabs/gaudi2: increase hard-reset sleep time to 2 sec
The access to the device registers is blocked during hard reset, until
preboot runs and allows the access to specific registers, including the
PSOC BTM_FSM register which is used to know when the reset is done.
Between the reset request and until this register is polled there is a
small delay of 500 msec which is not enough for F/W to process the reset
and for preboot to run, so the register might be accessed while it is
blocked.
To avoid it, increase the delay to 2 sec.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Tomer Tayar cecde184ca habanalabs/gaudi2: print RAZWI info upon PCIe access error
Add the dump of the RAZWI information when a PCIe access is blocked by
RR.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Oded Gabbay 82736b063f habanalabs: MMU invalidation h/w is per device
The code used the mmu mutex to protect access to the context's page
tables and invalidation of the MMU cache. Because pgt are per
context, the mmu mutex was a member of the context object.

The problem is that the device has a single MMU invalidation h/w
(per MMU). Therefore, the mmu mutex should not be a property of the
context but a property of the device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:39 +03:00
Tal Cohen 6f0818c9fc habanalabs: new notifier events for device state
Add new notifier events that inform several device states.
General H/W error raised on device general H/W error occurs.
User engine error is raised when a device engine informs of an error.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
Oded Gabbay c833ac1a5f habanalabs/gaudi2: free event irq if init fails
In case initialization fails after event irq was requested, we need to
release that irq.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
Ohad Sharabi 76925f55c9 habanalabs: fix resetting the DRAM BAR
Current code does not takes into account the new DRAM region base
and so calculated address is wrong and can lead to crush.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
Ofir Bitton 0626fa1a4d habanalabs: add support for new cpucp return codes
Firmware now responds with a more detailed cpucp return codes.
Driver can now distinguish between error and debug return codes.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
Tomer Tayar a0fc8688c0 habanalabs/gaudi2: read F/W security indication after hard reset
F/W security status might change after every reset.

Add the reading of the preboot status to the hard reset sequence, which
among others reads this security indication.

As this preboot status reading includes the waiting for the preboot to
be ready, it can be removed from the CPU init which is done in a later
stage.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
Ofir Bitton aee3fd74fe habanalabs/gaudi: rename mme cfg error response print
Current description is misleading hence we rename it to a more
suitable error description.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00
farah kassabri 62adba0a55 habanalabs: fix possible hole in device va
cb_map_mem() uses gen_pool_alloc() to get virtual address for
mapping a CB.
The mapping is done in chunks of page size, so if the CB size is
larger, it is possible that the allocated virtual addresses won't
be consecutive.
User retrieves this device VA which returns the virtual address
in the first va_block. If there is a "hole" in the virtual addresses,
user can configure a HW block with a bad device VA.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19 15:08:38 +03:00