This patch reverses the DPM clocks levels output of pp_dpm_mclk
and pp_dpm_fclk for renoir.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels are
given the reversed orders by PMFW. Like the memory DPM clocks
that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clocks levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This patch reverses the DPM clocks levels output of pp_dpm_mclk
and pp_dpm_fclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels that from
the DFPstateTable are given the reversed orders by PMFW. Like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clocks levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This patch reverses the DPM clocks levels output of pp_dpm_mclk
and pp_dpm_fclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels that from
the DFPstateTable are given the reversed orders by PMFW. Like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clocks levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This patch reverses the DPM clocks levels output of pp_dpm_mclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels that from
the DFPstateTable are given the reversed orders by PMFW. Like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clocks levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This patch reverses the DPM clocks levels output of pp_dpm_mclk
and pp_dpm_fclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clocks levels that from
the DFPstateTable are given the reversed orders by PMFW. Like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talks to the driver then has to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clocks levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Until commit 5c2712387d ("cacheinfo: Fix LLC is not exported through
sysfs"), cacheinfo called populate_cache_leaves() for CPU coming online
which let the arch specific functions handle (at least on x86)
populating the shared_cpu_map. However, with the changes in the
aforementioned commit, populate_cache_leaves() is not called when a CPU
comes online as a result of hotplug since last_level_cache_is_valid()
returns true as the cacheinfo data is not discarded. The CPU coming
online is not present in shared_cpu_map, however, it will not be added
since the cpu_cacheinfo->cpu_map_populated flag is set (it is set in
populate_cache_leaves() when cacheinfo is first populated for x86)
This can lead to inconsistencies in the shared_cpu_map when an offlined
CPU comes online again. Example below depicts the inconsistency in the
shared_cpu_list in cacheinfo when CPU8 is offlined and onlined again on
a 3rd Generation EPYC processor:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
9-15,136-143
Clear the flag when the CPU is removed from shared_cpu_map when
cache_shared_cpu_map_remove() is called during CPU hotplug. This will
allow cache_shared_cpu_map_setup() to add the CPU coming back online in
the shared_cpu_map. Set the flag again when the shared_cpu_map is setup.
Following are results of performing the same test as described above with
the changes:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
8,136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
8-15,136-143
Fixes: 5c2712387d ("cacheinfo: Fix LLC is not exported through sysfs")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-3-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While building the shared_cpu_map, check if the cache level and cache
type matches. On certain systems that build the cache topology based on
the instance ID, there are cases where the same ID may repeat across
multiple cache levels, leading inaccurate topology.
In event of CPU offlining, the cache_shared_cpu_map_remove() does not
consider if IDs at same level are being compared. As a result, when same
IDs repeat across different cache levels, the CPU going offline is not
removed from all the shared_cpu_map.
Below is the output of cache topology of CPU8 and it's SMT sibling after
CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor
(2 x 64C/128T) running kernel release v6.3:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of
index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the
SMT siblings, and they have the same cache instance ID, CPU 2 is only
removed from the first index with matching ID which is index1 (L1i) in
this case. With this fix, the results are as expected when performing
the same experiment on the same system:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
When rebuilding topology, the same problem appears as
cache_shared_cpu_map_setup() implements a similar logic. Consider the
same 3rd Generation EPYC processor: CPUs in Core 1, that share the L1
and L2 caches, have L1 and L2 instance ID as 1. For all the CPUs on
the second chiplet, the L3 ID is also 1 leading to grouping on CPUs from
Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs
sharing one cache domain. This went undetected since x86 processors
depended on arch specific populate_cache_leaves() method to repopulate
the shared_cpus_map when CPU came back online until kernel release
v6.3-rc5.
Fixes: 198102c910 ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code will, if we do not specify unique labels
for the SRAM subnodes, fail to register several nodes named
the same.
Example:
sram@40020000 {
(...)
sram@0 {
(...)
};
sram@1000 {
(...)
};
};
Since the child->name in both cases will be "sram" the
gen_pool_create() will fail because the name is not unique.
Use dev_name() for the device as this will have bus ID
set to the fully translated address for the node, and that
will always be unique.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-sram-v2-2-6e62ad551faa@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This adds an SoC-specific binding for the banks of eSRAM
available in the ST-Ericsson U8500.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230417-ux500-sram-v2-1-6e62ad551faa@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().
This bug wasn't trigger by the tests, but observed by Dan's visual
inspection of the code.
The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.
Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Tianfei Zhang <tianfei.zhang@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kselftest@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.
The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.
Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tianfei Zhang <tianfei.zhang@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kselftest@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Function ll_rw_block was removed in commit 79f5978420 ("fs/buffer:
remove ll_rw_block() helper"). There is no unified function to sumbit
read or write buffer in block layer for now. Consider similar sematics,
we can choose submit_bh() to replace ll_rw_block() as predefined crash
point. In submit_bh(), it also takes read or write flag as the first
argument and invoke submit_bio() to submit I/O request to block layer.
Fixes: 79f5978420 ("fs/buffer: remove ll_rw_block() helper")
Signed-off-by: Yue Zhao <findns94@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230503162944.3969-1-findns94@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There was a bug where this code forgot to unlock the tdev->mutex if the
kzalloc() failed. Fix this issue, by moving the allocation outside the
lock.
Fixes: 2d1e952a2b ("mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. SPEAKUP_SERIALIO thus needs to depend on HAS_IOPORT.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-34-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. Add dependencies for those drivers that use them.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-27-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. As PC style parport uses these functions we need to
handle this dependency.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-24-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. We thus need to add HAS_IOPORT as dependency for
those drivers using them. With that the !S390 dependency on ISDN can be
removed as all drivers without HAS_IOPORT requirement now build.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-21-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. We thus need to guard sections of code calling them
as alternative access methods.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-11-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 104_QUAD_8 counter driver uses devm_ioport_map() without depending
on HAS_IOPORT_MAP. This means the driver is not usable on platforms such
as s390 which do not support I/O port mapping. Add the missing
HAS_IOPORT_MAP dependency to make this explicit.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-8-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. We thus need to add HAS_IOPORT as dependency for
those drivers using them.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-7-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. We thus need to add HAS_IOPORT as dependency for
those drivers using them.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230522105049.1467313-4-schnelle@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Small rc bug fixes:
- Fix 64K ARM page size support in bnxt_re and efa
- bnxt_re fixes for a memory leak, incorrect error handling and a remove a
bogus FW failure when running on a VF
- Update MAINTAINERS for hns and efa
- Fix two rxe regressions added this merge window in error unwind and
incorrect spinlock primitives
- hns gets a better algorithm for allocating page tables to avoid running
out of resources, and a timeout adjustment
- Fix a text case failure in hns
- Use after free in irdma and fix incorrect construction of a WQE causing
mis-execution
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZHd8IwAKCRCFwuHvBreF
YYOFAQDkEWwKF1+L+vvI0Do6gTPEb1D5KDRl3uasl4AAVl66UgEA8UulKNiZF/iJ
L9NS8M4C/7WPMtPNFty3QjtSK5BXkwg=
=FCJZ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
- Fix 64K ARM page size support in bnxt_re and efa
- bnxt_re fixes for a memory leak, incorrect error handling and a
remove a bogus FW failure when running on a VF
- Update MAINTAINERS for hns and efa
- Fix two rxe regressions added this merge window in error unwind and
incorrect spinlock primitives
- hns gets a better algorithm for allocating page tables to avoid
running out of resources, and a timeout adjustment
- Fix a text case failure in hns
- Use after free in irdma and fix incorrect construction of a WQE
causing mis-execution
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/irdma: Fix Local Invalidate fencing
RDMA/irdma: Prevent QP use after free
MAINTAINERS: Update maintainer of Amazon EFA driver
RDMA/bnxt_re: Do not enable congestion control on VFs
RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
RDMA/bnxt_re: Fix a possible memory leak
RDMA/hns: Modify the value of long message loopback slice
RDMA/hns: Fix base address table allocation
RDMA/hns: Fix timeout attr in query qp for HIP08
RDMA/efa: Fix unsupported page sizes in device
RDMA/rxe: Convert spin_{lock_bh,unlock_bh} to spin_{lock_irqsave,unlock_irqrestore}
RDMA/rxe: Fix double unlock in rxe_qp.c
MAINTAINERS: Update maintainers of HiSilicon RoCE
RDMA/bnxt_re: Fix the page_size used during the MR creation
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmR3asIACgkQ8vlZVpUN
gaObtQgAoPD1ytW5rfDHIK3s7SXImETH/viqXshlPLF9VcJBO4c5Cj1smfgG1o5v
7iitA3KqT/WZ1sp1ileP0hErYYhAaQRwTzSvEwB7DYJp9KdQu6Gqa8TYL1WoXbih
tNJAeDJaZtPPaeaMQsBdR8cmKeGVVo7ErrcFU6759Ew2g6+UbO2eKjRlhx5xA35q
fszI3A5aqz5TYrbap6tnlzovTDbuJo8O8dNXUp07iPwA2Lllf+v54Anl/iN85rR9
+TbrTuvs4rA1Gbd42coro902Hsg3HmslTI6ajnz5oDPYF9/M3KYRYEzP/ufVzh6p
dmQ3rwYTPSPc1rrLGmXh3YeYIpuJAA==
=71Qb
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix two regressions in ext4 and a number of issues reported by syzbot"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: enable the lazy init thread when remounting read/write
ext4: fix fsync for non-directories
ext4: add lockdep annotations for i_data_sem for ea_inode's
ext4: disallow ea_inodes with extended attributes
ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find()
ext4: add EA_INODE checking to ext4_iget()
to_amba_device() now properly keeps the const-ness of the dev pointer
passed into it, while as before it could be lost.
Signed-off-by: Prathu Baronia <prathubaronia2011@gmail.com>
Link: https://lore.kernel.org/r/20230518134656.9559-1-prathubaronia2011@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a Device Feature List (DFL) feature id as a generic mechanism
to expose a vendor-specific FPGA IP to user space. The feature id
is intended for use with IPs that do not need any kernel services
beyond exposure to user space through the UIO DFL driver.
The feature id is used in, e.g., Intel Oak Springs Canyon IPUs
to expose various IPs to user space, e.g., Network Controller
Sideband Interface (NC-SI), BaseNIC, and VirtIO management.
Link: https://github.com/OPAE/dfl-feature-id
Signed-off-by: Peter Colberg <peter.colberg@intel.com>
Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Reviewed-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20230531030737.12989-1-peter.colberg@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vma_close frees qfrs but not clears q->qfrs, which still points
to the freed object, leading to subsequent mmap fail.
So vma_close clears q->qfrs as well.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/r/20230511095921.9331-3-zhangfei.gao@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The inode can be different in a container, for example, a docker and host
both open the same uacce parent device, which uses the same uacce struct
but different inode, so uacce->inode is not enough.
What's worse, when docker stops, the inode will be destroyed as well,
causing use-after-free in uacce_remove.
So use q->mapping to replace uacce->inode->i_mapping.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/r/20230511095921.9331-2-zhangfei.gao@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The functions are redundant as they are empty and performed by the misc
driver.
Signed-off-by: Bo Svangård <bo.svangard@sylog.se>
Acked-by: Michal Simek <michal.simek@amd.com>
Link: https://lore.kernel.org/r/20230422195933.523874-1-bo.svangard@sylog.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No Management involved in Zone Appened.
Fixes: bd83fe6f2c ("nvme: add verbose error logging")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
gcc 13 may assign another type to enumeration constants than gcc 12. Split
the large enum at the top of source file stex.c such that the type of the
constants used in time expressions is changed back to the same type chosen
by gcc 12. This patch suppresses compiler warnings like this one:
In file included from ./include/linux/bitops.h:7,
from ./include/linux/kernel.h:22,
from drivers/scsi/stex.c:13:
drivers/scsi/stex.c: In function ‘stex_common_handshake’:
./include/linux/typecheck.h:12:25: error: comparison of distinct pointer types lacks a cast [-Werror]
12 | (void)(&__dummy == &__dummy2); \
| ^~
./include/linux/jiffies.h:106:10: note: in expansion of macro ‘typecheck’
106 | typecheck(unsigned long, b) && \
| ^~~~~~~~~
drivers/scsi/stex.c:1035:29: note: in expansion of macro ‘time_after’
1035 | if (time_after(jiffies, before + MU_MAX_DELAY * HZ)) {
| ^~~~~~~~~~
See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107405.
Cc: stable@vger.kernel.org
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230529195034.3077-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If an attempt at contacting a receiver or a device fails because the
receiver or device never responds, don't restart the communication, only
restart it if the receiver or device answers that it's busy, as originally
intended.
This was the behaviour on communication timeout before commit 586e8fede7
("HID: logitech-hidpp: Retry commands when device is busy").
This fixes some overly long waits in a critical path on boot, when
checking whether the device is connected by getting its HID++ version.
Signed-off-by: Bastien Nocera <hadess@hadess.net>
Suggested-by: Mark Lord <mlord@pobox.com>
Fixes: 586e8fede7 ("HID: logitech-hidpp: Retry commands when device is busy")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217412
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Early alternatives are called with the mmu disabled, and then should not
access any global symbols through the GOT since it requires relocations,
relocations that we do before but *virtually*. So only use medany code
model for this early code.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com> # booted on nezha & unmatched
Fixes: 39b3307294 ("riscv: Introduce CONFIG_RELOCATABLE")
Link: https://lore.kernel.org/r/20230526154630.289374-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The bug here is that you cannot rely on getting the same socket
from multiple calls to fget() because userspace can influence
that. This is a kind of double fetch bug.
The fix is to delete the svc_alien_sock() function and instead do
the checking inside the svc_addsock() function.
Fixes: 3064639423 ("nfsd: check passed socket's net matches NFSd superblock's one")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
All callers of trace_probe_primary_from_call() check the return
value to be non NULL. However, the function returns
list_first_entry(&tpe->probes, ...) which can never be NULL.
Additionally, it does not check for the list being possibly empty,
possibly causing a type confusion on empty lists.
Use list_first_entry_or_null() which solves both problems.
Link: https://lore.kernel.org/linux-trace-kernel/20230128-list-entry-null-check-v1-1-8bde6a3da2ef@diag.uniroma1.it/
Fixes: 60d53e2c3b ("tracing/probe: Split trace_event related data from trace_probe")
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Syzkaller got the following report:
BUG: KASAN: use-after-free in sk_setup_caps+0x621/0x690 net/core/sock.c:2018
Read of size 8 at addr ffff888027f82780 by task syz-executor276/3255
The function sk_setup_caps (called by ip6_sk_dst_store_flow->
ip6_dst_store) referenced already freed memory as this memory was
freed by parallel task in udpv6_sendmsg->ip6_sk_dst_lookup_flow->
sk_dst_check.
task1 (connect) task2 (udp6_sendmsg)
sk_setup_caps->sk_dst_set |
| sk_dst_check->
| sk_dst_set
| dst_release
sk_setup_caps references |
to already freed dst_entry|
The reason for this race condition is: sk_setup_caps() keeps using
the dst after transferring the ownership to the dst cache.
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KVM maintains a mask of supported CPUs when a vPMU type is explicitly
selected by userspace and is used to reject any attempt to run the vCPU
on an unsupported CPU. This is great, but we're still beholden to the
default behavior where vCPUs can be scheduled anywhere and guest
counters may silently stop working.
Avoid confusing the next poor sod to look at this code and document the
intended behavior.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230525212723.3361524-3-oliver.upton@linux.dev
To date KVM has relied on using a perf event to probe the core PMU at
the time of vPMU initialization. Behind the scenes perf_event_init()
would iteratively walk the PMUs of the system and return the PMU that
could handle the event. However, an upcoming change in perf core will
drop the iterative walk, thereby breaking the fragile dance we do on the
KVM side.
Avoid the problem altogether by iterating over the list of supported
PMUs maintained in KVM, returning the core PMU that matches the CPU
we were called on.
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230525212723.3361524-2-oliver.upton@linux.dev
The current code for the length calculation wrongly truncates the reported
length of the groups array, causing an under report of the subscribed
groups. To fix this, use 'BITS_TO_BYTES()' which rounds up the
division by 8.
Fixes: b42be38b27 ("netlink: add API to retrieve all group memberships")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230529153335.389815-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The reference count on page table allocations is increased for every
'counted' PTE (valid or donated) in the table in addition to the initial
reference from ->zalloc_page(). kvm_pgtable_stage2_free_removed() fails
to drop the last reference on the root of the table walk, meaning we
leak memory.
Fix it by dropping the last reference after the free walker returns,
at which point all references for 'counted' PTEs have been released.
Cc: stable@vger.kernel.org
Fixes: 5c359cca1f ("KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make")
Reported-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230530193213.1663411-1-oliver.upton@linux.dev
When use the following command to test:
1)ip link add bond0 type bond
2)ip link set bond0 up
3)tc qdisc add dev bond0 root handle ffff: mq
4)tc qdisc replace dev bond0 parent ffff:fff1 handle ffff: mq
The kernel reports NULL pointer dereference issue. The stack information
is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in:
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mq_attach+0x44/0xa0
lr : qdisc_graft+0x20c/0x5cc
sp : ffff80000e2236a0
x29: ffff80000e2236a0 x28: ffff0000c0e59d80 x27: ffff0000c0be19c0
x26: ffff0000cae3e800 x25: 0000000000000010 x24: 00000000fffffff1
x23: 0000000000000000 x22: ffff0000cae3e800 x21: ffff0000c9df4000
x20: ffff0000c9df4000 x19: 0000000000000000 x18: ffff80000a934000
x17: ffff8000f5b56000 x16: ffff80000bb08000 x15: 0000000000000000
x14: 0000000000000000 x13: 6b6b6b6b6b6b6b6b x12: 6b6b6b6b00000001
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff0000c0be0730 x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008
x5 : ffff0000cae3e864 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000001 x1 : ffff8000090bc23c x0 : 0000000000000000
Call trace:
mq_attach+0x44/0xa0
qdisc_graft+0x20c/0x5cc
tc_modify_qdisc+0x1c4/0x664
rtnetlink_rcv_msg+0x354/0x440
netlink_rcv_skb+0x64/0x144
rtnetlink_rcv+0x28/0x34
netlink_unicast+0x1e8/0x2a4
netlink_sendmsg+0x308/0x4a0
sock_sendmsg+0x64/0xac
____sys_sendmsg+0x29c/0x358
___sys_sendmsg+0x90/0xd0
__sys_sendmsg+0x7c/0xd0
__arm64_sys_sendmsg+0x2c/0x38
invoke_syscall+0x54/0x114
el0_svc_common.constprop.1+0x90/0x174
do_el0_svc+0x3c/0xb0
el0_svc+0x24/0xec
el0t_64_sync_handler+0x90/0xb4
el0t_64_sync+0x174/0x178
This is because when mq is added for the first time, qdiscs in mq is set
to NULL in mq_attach(). Therefore, when replacing mq after adding mq, we
need to initialize qdiscs in the mq before continuing to graft. Otherwise,
it will couse NULL pointer dereference issue in mq_attach(). And the same
issue will occur in the attach functions of mqprio, taprio and htb.
ffff:fff1 means that the repalce qdisc is ingress. Ingress does not allow
any qdisc to be attached. Therefore, ffff:fff1 is incorrectly used, and
the command should be dropped.
Fixes: 6ec1c69a8f ("net_sched: add classful multiqueue dummy scheduler")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Tested-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230527093747.3583502-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, after creating an ingress (or clsact) Qdisc and grafting it
under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under
e.g. a TBF Qdisc:
$ ip link add ifb0 type ifb
$ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000
$ tc qdisc add dev ifb0 clsact
$ tc qdisc link dev ifb0 handle ffff: parent 1:1
$ tc qdisc show dev ifb0
qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms
qdisc clsact ffff: parent ffff:fff1 refcnt 2
^^^^^^^^
clsact's refcount has increased: it is now grafted under both
TC_H_CLSACT and 1:1.
ingress and clsact Qdiscs should only be used under TC_H_INGRESS
(TC_H_CLSACT). Prohibit regrafting them.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1
(TC_H_INGRESS, TC_H_CLSACT):
$ ip link add name ifb0 type ifb
$ tc qdisc add dev ifb0 parent ffff:fff1 htb
$ tc qdisc add dev ifb0 clsact
Error: Exclusivity flag on, cannot modify.
$ drgn
...
>>> ifb0 = netdev_get_by_name(prog, "ifb0")
>>> qdisc = ifb0.ingress_queue.qdisc_sleeping
>>> print(qdisc.ops.id.string_().decode())
htb
>>> qdisc.flags.value_() # TCQ_F_INGRESS
2
Only allow ingress and clsact Qdiscs under ffff:fff1. Return -EINVAL
for everything else. Make TCQ_F_INGRESS a static flag of ingress and
clsact Qdiscs.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
clsact Qdiscs are only supposed to be created under TC_H_CLSACT (which
equals TC_H_INGRESS). Return -EOPNOTSUPP if 'parent' is not
TC_H_CLSACT.
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ingress Qdiscs are only supposed to be created under TC_H_INGRESS.
Return -EOPNOTSUPP if 'parent' is not TC_H_INGRESS, similar to
mq_init().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>