Commit Graph

873840 Commits

Author SHA1 Message Date
wangzhimin cf43a321eb usb: xhci: xhci-plat: Support for Phytium Pe220x
Add XHCI_RESET_ON_RESUME quirk for Phytium Pe220x
Phytium Pe220x xHCI host controller does not have suspend/resume
support. Therefore, use of the XHCI_RESET_ON_RESUME quirk is
mandatory in order to avoid failures after resume.

Signed-off-by: wangzhimin <wangzhimin1179@phytium.com.cn>
(cherry picked from commit 33d50ca483)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:27:46 +08:00
wangzhimin fe907db167 dts: phytium: Add dts for Phytium Pe220x SoCs
Add initial device tree for Phytium Pe220x SoCs. Phytium Pe220x
series has three specs (Pe2201/Pe2202/Pe2204), distinguished by
the number of CPU core. Besides CPU cores, on-chip peripherals
also vary. Thus, we split them into three separate DTBs.

Signed-off-by: wangzhimin <wangzhimin1179@phytium.com.cn>
(cherry picked from commit 1d5bb1d501)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:17 +08:00
Malloy Liu d89d180b6f SPI platform driver support for Phytium desktop CPUS
SPI platform driver for Phytium desktop CPUs, such as FT-2000/4 and D2000.

Reviewed-by: Hongbo Mao <maohongbo@phytium.com.cn>
(cherry picked from commit 555ae1361a)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
liqian 179588c1a5 GPIO driver support for Phytium desktop and embedded CPUs
GPIO dirver fix patch. Support Phytiumm Desktop and Embedded CPUs, such as D2000 and E2000.

Reviewed-by: Hongbo Mao <maohongbo@phytium.com.cn>
(cherry picked from commit c2a5e2beae)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
wangzhimin 865c34dd67 This patch makes stmmac driver support SBSA compatible Phytium
SoC's on-chip RGMII.

Signed-off-by: wangzhimin <wangzhimin1179@phytium.com.cn>
(cherry picked from commit 4611897533)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
Wang Nan 8e96d80aac RTC driver support for Phytium desktop and embedded CPUs
RTC driver function fix patch Support Phytium desktop and embedded processors, such as D2000 and E2000

Reviewed-by: Wang Nan <wangnan1505@phytium.com.cn>
(cherry picked from commit 00f079b210)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
Wei Tian f658346e99 HDA driver support for Phytium desktop
(cherry picked from commit 61b278c01b)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
JianChao Sheng cace117756 serial driver support for Phytium desktop and embedded CPUs
serial driver function fix patch. Support Phytium desktop and embedded processors, such as D2000.

Reviewed-by:JianChao Sheng <shengjianchao@phytium.com.cn>
(cherry picked from commit 92e0f87830)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
Chen 1f75d9911b mailbox driver support for Phytium desktop and embedded CPUs
Mailbox driver function patch. Support Phytium desktop and embedded processors, such as D2000 and E2000.

Reviewed-by: Hongbo Mao <maohongbo@phytium.com.cn>
(cherry picked from commit f6e90a3999)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
Jiakun Shuai 602399421d I2C driver support for Phytium Desktop CPUs
I2C driver function fix patch. Support Phytium desktop and embedded processors, such as D2000 and E2000.

Conflicts:
	drivers/i2c/busses/i2c-designware-platdrv.c

Reviewed-by: Hongbo Mao <maohongbo@phytium.com.cn>
(cherry picked from commit 5ec3d87d62)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
LiQian 1de253bccd Add Phytium BT BMC driver support
adds a simple device driver to expose the BT interface
on Phytium SoC as a character device. Such SOCs are commonly
used as BMCs (BaseBoard Management Controllers) and this driver
implements the BMC side of the BT interface.

(cherry picked from commit ccb23948d0)
Signed-off-by: Alex Shi <alexsshi@tencent.com>

Conflicts:
	drivers/char/ipmi/Kconfig
	drivers/char/ipmi/Makefile
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
LiQian da85008b88 Add Phytium KCS IPMI BMC driver support
(cherry picked from commit 4d16c334ea)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
wjl00563 1b10cd6409 powerpc: export arch_trigger_cpumask_backtrace
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:14 +08:00
Chen Siyu 6c6b6ecd88 update can driver for phytium D2000 Soc
fix the problem for setting bitrate on phytium D2000 Soc

Signed-off-by: Chen Siyu <chensiyu1321@phytium.com.cn>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:13 +08:00
Chen Siyu 0cdc4fd1f6 drivers,can: dt-bindings-can-phytium-Add-bindings-for-Phytium-CAN
add-Phytium-CAN-controller-support

Conflicts:
	MAINTAINERS

reviewed-by: Chen Siyu <chensiyu1321@phytium.com.cn>
Signed-off-by: Cheng Quan <chengquan@phytium.com.cn>
Signed-off-by: Li Zhengguang <lizhengguang1317@phytium.com.cn>
Signed-off-by: Chen Baozi <chenbaozi@phytium.com.cn>
Signed-off-by: Chen Siyu <chensiyu1321@phytium.com.cn>
(cherry picked from commit 213232efb8)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:13 +08:00
Fuhai Wang db1d1c415b netfilter: nf_tables: reject QUEUE/DROP verdict parameters
commit f342de4e2f33e0e39165d8639387aa6c19dff660 upstream.

This reverts commit e0abdadcc6.

core.c:nf_hook_slow assumes that the upper 16 bits of NF_DROP
verdicts contain a valid errno, i.e. -EPERM, -EHOSTUNREACH or similar,
or 0.

Due to the reverted commit, its possible to provide a positive
value, e.g. NF_ACCEPT (1), which results in use-after-free.

Its not clear to me why this commit was made.

NF_QUEUE is not used by nftables; "queue" rules in nftables
will result in use of "nft_queue" expression.

If we later need to allow specifiying errno values from userspace
(do not know why), this has to call NF_DROP_GETERR and check that
"err <= 0" holds true.

Fixes: e0abdadcc6 ("netfilter: nf_tables: accept QUEUE/DROP verdict parameters")
Cc: stable@vger.kernel.org
Reported-by: Notselwyn <notselwyn@pwning.tech>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fuhai Wang <fuhaiwang@tencent.com>
Signed-off-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:26:13 +08:00
Jianping Liu bdaf238df4 config: enable CONFIG_NGBE and CONFIG_TXGBE
NGBE: support wangxun 1GbE driver
TXGBE: support wangxun 10GbE driver.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 12:25:27 +08:00
Jianping Liu b7dacbe7a4 drivers/net,wangxun: fix compile error
The commit 91dc13667e have change change
the struct of ethtool_ops, which will cause compile error.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-28 11:43:46 +08:00
Duanqiang Wen b38c5bb0b9 net: wangxun: txgbe: support wangxun 10GbE driver
add support for Wangxun 10GbE driver, source files and
functions are the same as wangxun out of box drivevr
release version txgbe-1.3.5.1.

Signed-off-by: Duanqiang Wen <duanqiangwen@net-swift.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:22:01 +08:00
Duanqiang Wen cb52153887 net: wangxun: ngbe: support wangxun 1GbE driver
add support for wangxun 1GbE driver, source files and functions
are the same as wangxun oob ngbe-1.2.5.3.

Signed-off-by: Duanqiang Wen <duanqiangwen@net-swift.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:22:01 +08:00
Jianping Liu 5706ab9788 config: enable CONFIG_INFINIBAND_HNS
Sync CONFIG_INFINIBAND_* from linux-5.4/lts/5.4.119-20.0009, ZhongXing NIC RDMA
needing infiniband driver.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:22:01 +08:00
Jianping Liu 38caf94087 config: update config with out manual change
Update config with the commands as below:
make tencentconfig
make savedefconfig
mv defconfig arch/$ARCH/configs/tencent.config

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:22:01 +08:00
Byeonguk Jeong 4dea7a47fa bpf: Fix out-of-bounds write in trie_get_next_key()
[ Upstream commit 13400ac8fb80c57c2bfb12ebd35ee121ce9b4d21 ]

trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
full paths from the root to leaves. For example, consider a trie with
max_prefixlen is 8, and the nodes with key 0x00/0, 0x00/1, 0x00/2, ...
0x00/8 inserted. Subsequent calls to trie_get_next_key with _key with
.prefixlen = 8 make 9 nodes be written on the node stack with size 8.

Fixes: b471f2f1de ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Tested-by: Hou Tao <houtao1@huawei.com>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/Zxx384ZfdlFYnz6J@localhost.localdomain
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Tao Chen 78568f3d72 bpf: Check percpu map value size first
[ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ]

Percpu map is often used, but the map value size limit often ignored,
like issue: https://github.com/iovisor/bcc/issues/2519. Actually,
percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we
can check the value size whether it exceeds PCPU_MIN_UNIT_SIZE first,
like percpu map of local_storage. Maybe the error message seems clearer
compared with "cannot allocate memory".

Signed-off-by: Jinke Han <jinkehan@didiglobal.com>
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Florian Lehner af7ecf360c bpf, lpm: Fix check prefixlen before walking trie
[ Upstream commit 9b75dbeb36fcd9fc7ed51d370310d0518a387769 ]

When looking up an element in LPM trie, the condition 'matchlen ==
trie->max_prefixlen' will never return true, if key->prefixlen is larger
than trie->max_prefixlen. Consequently all elements in the LPM trie will
be visited and no element is returned in the end.

To resolve this, check key->prefixlen first before walking the LPM trie.

Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Signed-off-by: Florian Lehner <dev@der-flo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231105085801.3742-1-dev@der-flo.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Shung-Hsi Yu d5d51155da bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
commit 291d044fd51f8484066300ee42afecf8c8db7b3a upstream.

BPF_END and BPF_NEG has a different specification for the source bit in
the opcode compared to other ALU/ALU64 instructions, and is either
reserved or use to specify the byte swap endianness. In both cases the
source bit does not encode source operand location, and src_reg is a
reserved field.

backtrack_insn() currently does not differentiate BPF_END and BPF_NEG
from other ALU/ALU64 instructions, which leads to r0 being incorrectly
marked as precise when processing BPF_ALU | BPF_TO_BE | BPF_END
instructions. This commit teaches backtrack_insn() to correctly mark
precision for such case.

While precise tracking of BPF_NEG and other BPF_END instructions are
correct and does not need fixing, this commit opt to process all BPF_NEG
and BPF_END instructions within the same if-clause to better align with
current convention used in the verifier (e.g. check_alu_op).

Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Cc: stable@vger.kernel.org
Reported-by: Mohamed Mahmoud <mmahmoud@redhat.com>
Closes: https://lore.kernel.org/r/87jzrrwptf.fsf@toke.dk
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: Tao Lyu <tao.lyu@epfl.ch>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231102053913.12004-2-shung-hsi.yu@suse.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Toke Høiland-Jørgensen ecab4d8a7f bpf: Avoid deadlock when using queue and stack maps from NMI
[ Upstream commit a34a9f1a19 ]

Sysbot discovered that the queue and stack maps can deadlock if they are
being used from a BPF program that can be called from NMI context (such as
one that is attached to a perf HW counter event). To fix this, add an
in_nmi() check and use raw_spin_trylock() in NMI context, erroring out if
grabbing the lock fails.

Fixes: f1a2e44a3a ("bpf: add queue and stack maps")
Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
Tested-by: Hsin-Wei Hung <hsinweih@uci.edu>
Co-developed-by: Hsin-Wei Hung <hsinweih@uci.edu>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230911132815.717240-1-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Martin KaFai Lau a52663f05b bpf: Address KCSAN report on bpf_lru_list
[ Upstream commit ee9fd0ac30 ]

KCSAN reported a data-race when accessing node->ref.
Although node->ref does not have to be accurate,
take this chance to use a more common READ_ONCE() and WRITE_ONCE()
pattern instead of data_race().

There is an existing bpf_lru_node_is_ref() and bpf_lru_node_set_ref().
This patch also adds bpf_lru_node_clear_ref() to do the
WRITE_ONCE(node->ref, 0) also.

==================================================================
BUG: KCSAN: data-race in __bpf_lru_list_rotate / __htab_lru_percpu_map_update_elem

write to 0xffff888137038deb of 1 bytes by task 11240 on cpu 1:
__bpf_lru_node_move kernel/bpf/bpf_lru_list.c:113 [inline]
__bpf_lru_list_rotate_active kernel/bpf/bpf_lru_list.c:149 [inline]
__bpf_lru_list_rotate+0x1bf/0x750 kernel/bpf/bpf_lru_list.c:240
bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:329 [inline]
bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline]
bpf_lru_pop_free+0x638/0xe20 kernel/bpf/bpf_lru_list.c:499
prealloc_lru_pop kernel/bpf/hashtab.c:290 [inline]
__htab_lru_percpu_map_update_elem+0xe7/0x820 kernel/bpf/hashtab.c:1316
bpf_percpu_hash_update+0x5e/0x90 kernel/bpf/hashtab.c:2313
bpf_map_update_value+0x2a9/0x370 kernel/bpf/syscall.c:200
generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1687
bpf_map_do_batch+0x2d9/0x3d0 kernel/bpf/syscall.c:4534
__sys_bpf+0x338/0x810
__do_sys_bpf kernel/bpf/syscall.c:5096 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5094 [inline]
__x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5094
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff888137038deb of 1 bytes by task 11241 on cpu 0:
bpf_lru_node_set_ref kernel/bpf/bpf_lru_list.h:70 [inline]
__htab_lru_percpu_map_update_elem+0x2f1/0x820 kernel/bpf/hashtab.c:1332
bpf_percpu_hash_update+0x5e/0x90 kernel/bpf/hashtab.c:2313
bpf_map_update_value+0x2a9/0x370 kernel/bpf/syscall.c:200
generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1687
bpf_map_do_batch+0x2d9/0x3d0 kernel/bpf/syscall.c:4534
__sys_bpf+0x338/0x810
__do_sys_bpf kernel/bpf/syscall.c:5096 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5094 [inline]
__x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5094
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x01 -> 0x00

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 11241 Comm: syz-executor.3 Not tainted 6.3.0-rc7-syzkaller-00136-g6a66fdd29ea1 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023
==================================================================

Reported-by: syzbot+ebe648a84e8784763f82@syzkaller.appspotmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20230511043748.1384166-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:06 +08:00
Will Deacon bd0ebbe839 bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields
commit 0613d8ca9a upstream.

A narrow load from a 64-bit context field results in a 64-bit load
followed potentially by a 64-bit right-shift and then a bitwise AND
operation to extract the relevant data.

In the case of a 32-bit access, an immediate mask of 0xffffffff is used
to construct a 64-bit BPP_AND operation which then sign-extends the mask
value and effectively acts as a glorified no-op. For example:

0:	61 10 00 00 00 00 00 00	r0 = *(u32 *)(r1 + 0)

results in the following code generation for a 64-bit field:

	ldr	x7, [x7]	// 64-bit load
	mov	x10, #0xffffffffffffffff
	and	x7, x7, x10

Fix the mask generation so that narrow loads always perform a 32-bit AND
operation:

	ldr	x7, [x7]	// 64-bit load
	mov	w10, #0xffffffff
	and	w7, w7, w10

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Krzesimir Nowak <krzesimir@kinvolk.io>
Cc: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Fixes: 31fd85816d ("bpf: permits narrower load from bpf program context fields")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230518102528.1341-1-will@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
Stanislav Fomichev 55ec926393 bpf: Don't EFAULT for getsockopt with optval=NULL
[ Upstream commit 00e74ae086 ]

Some socket options do getsockopt with optval=NULL to estimate the size
of the final buffer (which is returned via optlen). This breaks BPF
getsockopt assumptions about permitted optval buffer size. Let's enforce
these assumptions only when non-NULL optval is provided.

Fixes: 0d01da6afc ("bpf: implement getsockopt and setsockopt hooks")
Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ZD7Js4fj5YyI2oLd@google.com/T/#mb68daf700f87a9244a15d01d00c3f0e5b08f49f7
Link: https://lore.kernel.org/bpf/20230418225343.553806-2-sdf@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
Daniel Borkmann d30e8bf7fc bpf: Fix incorrect verifier pruning due to missing register precision taints
[ Upstream commit 71b547f561 ]

Juan Jose et al reported an issue found via fuzzing where the verifier's
pruning logic prematurely marks a program path as safe.

Consider the following program:

   0: (b7) r6 = 1024
   1: (b7) r7 = 0
   2: (b7) r8 = 0
   3: (b7) r9 = -2147483648
   4: (97) r6 %= 1025
   5: (05) goto pc+0
   6: (bd) if r6 <= r9 goto pc+2
   7: (97) r6 %= 1
   8: (b7) r9 = 0
   9: (bd) if r6 <= r9 goto pc+1
  10: (b7) r6 = 0
  11: (b7) r0 = 0
  12: (63) *(u32 *)(r10 -4) = r0
  13: (18) r4 = 0xffff888103693400 // map_ptr(ks=4,vs=48)
  15: (bf) r1 = r4
  16: (bf) r2 = r10
  17: (07) r2 += -4
  18: (85) call bpf_map_lookup_elem#1
  19: (55) if r0 != 0x0 goto pc+1
  20: (95) exit
  21: (77) r6 >>= 10
  22: (27) r6 *= 8192
  23: (bf) r1 = r0
  24: (0f) r0 += r6
  25: (79) r3 = *(u64 *)(r0 +0)
  26: (7b) *(u64 *)(r1 +0) = r3
  27: (95) exit

The verifier treats this as safe, leading to oob read/write access due
to an incorrect verifier conclusion:

  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (b7) r6 = 1024                     ; R6_w=1024
  1: (b7) r7 = 0                        ; R7_w=0
  2: (b7) r8 = 0                        ; R8_w=0
  3: (b7) r9 = -2147483648              ; R9_w=-2147483648
  4: (97) r6 %= 1025                    ; R6_w=scalar()
  5: (05) goto pc+0
  6: (bd) if r6 <= r9 goto pc+2         ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff00000000; 0xffffffff)) R9_w=-2147483648
  7: (97) r6 %= 1                       ; R6_w=scalar()
  8: (b7) r9 = 0                        ; R9=0
  9: (bd) if r6 <= r9 goto pc+1         ; R6=scalar(umin=1) R9=0
  10: (b7) r6 = 0                       ; R6_w=0
  11: (b7) r0 = 0                       ; R0_w=0
  12: (63) *(u32 *)(r10 -4) = r0
  last_idx 12 first_idx 9
  regs=1 stack=0 before 11: (b7) r0 = 0
  13: R0_w=0 R10=fp0 fp-8=0000????
  13: (18) r4 = 0xffff8ad3886c2a00      ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  15: (bf) r1 = r4                      ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  16: (bf) r2 = r10                     ; R2_w=fp0 R10=fp0
  17: (07) r2 += -4                     ; R2_w=fp-4
  18: (85) call bpf_map_lookup_elem#1   ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0)
  19: (55) if r0 != 0x0 goto pc+1       ; R0=0
  20: (95) exit

  from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
  21: (77) r6 >>= 10                    ; R6_w=0
  22: (27) r6 *= 8192                   ; R6_w=0
  23: (bf) r1 = r0                      ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
  24: (0f) r0 += r6
  last_idx 24 first_idx 19
  regs=40 stack=0 before 23: (bf) r1 = r0
  regs=40 stack=0 before 22: (27) r6 *= 8192
  regs=40 stack=0 before 21: (77) r6 >>= 10
  regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
  parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
  last_idx 18 first_idx 9
  regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
  regs=40 stack=0 before 17: (07) r2 += -4
  regs=40 stack=0 before 16: (bf) r2 = r10
  regs=40 stack=0 before 15: (bf) r1 = r4
  regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00
  regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
  regs=40 stack=0 before 11: (b7) r0 = 0
  regs=40 stack=0 before 10: (b7) r6 = 0
  25: (79) r3 = *(u64 *)(r0 +0)         ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
  26: (7b) *(u64 *)(r1 +0) = r3         ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
  27: (95) exit

  from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
  11: (b7) r0 = 0                       ; R0_w=0
  12: (63) *(u32 *)(r10 -4) = r0
  last_idx 12 first_idx 11
  regs=1 stack=0 before 11: (b7) r0 = 0
  13: R0_w=0 R10=fp0 fp-8=0000????
  13: (18) r4 = 0xffff8ad3886c2a00      ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  15: (bf) r1 = r4                      ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  16: (bf) r2 = r10                     ; R2_w=fp0 R10=fp0
  17: (07) r2 += -4                     ; R2_w=fp-4
  18: (85) call bpf_map_lookup_elem#1
  frame 0: propagating r6
  last_idx 19 first_idx 11
  regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
  regs=40 stack=0 before 17: (07) r2 += -4
  regs=40 stack=0 before 16: (bf) r2 = r10
  regs=40 stack=0 before 15: (bf) r1 = r4
  regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00
  regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
  regs=40 stack=0 before 11: (b7) r0 = 0
  parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0
  last_idx 9 first_idx 9
  regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
  parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=0 R10=fp0
  last_idx 8 first_idx 0
  regs=40 stack=0 before 8: (b7) r9 = 0
  regs=40 stack=0 before 7: (97) r6 %= 1
  regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=40 stack=0 before 5: (05) goto pc+0
  regs=40 stack=0 before 4: (97) r6 %= 1025
  regs=40 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024
  19: safe
  frame 0: propagating r6
  last_idx 9 first_idx 0
  regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=40 stack=0 before 5: (05) goto pc+0
  regs=40 stack=0 before 4: (97) r6 %= 1025
  regs=40 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024

  from 6 to 9: safe
  verification time 110 usec
  stack depth 4
  processed 36 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2

The verifier considers this program as safe by mistakenly pruning unsafe
code paths. In the above func#0, code lines 0-10 are of interest. In line
0-3 registers r6 to r9 are initialized with known scalar values. In line 4
the register r6 is reset to an unknown scalar given the verifier does not
track modulo operations. Due to this, the verifier can also not determine
precisely which branches in line 6 and 9 are taken, therefore it needs to
explore them both.

As can be seen, the verifier starts with exploring the false/fall-through
paths first. The 'from 19 to 21' path has both r6=0 and r9=0 and the pointer
arithmetic on r0 += r6 is therefore considered safe. Given the arithmetic,
r6 is correctly marked for precision tracking where backtracking kicks in
where it walks back the current path all the way where r6 was set to 0 in
the fall-through branch.

Next, the pruning logics pops the path 'from 9 to 11' from the stack. Also
here, the state of the registers is the same, that is, r6=0 and r9=0, so
that at line 19 the path can be pruned as it is considered safe. It is
interesting to note that the conditional in line 9 turned r6 into a more
precise state, that is, in the fall-through path at the beginning of line
10, it is R6=scalar(umin=1), and in the branch-taken path (which is analyzed
here) at the beginning of line 11, r6 turned into a known const r6=0 as
r9=0 prior to that and therefore (unsigned) r6 <= 0 concludes that r6 must
be 0 (**):

  [...]                                 ; R6_w=scalar()
  9: (bd) if r6 <= r9 goto pc+1         ; R6=scalar(umin=1) R9=0
  [...]

  from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
  [...]

The next path is 'from 6 to 9'. The verifier considers the old and current
state equivalent, and therefore prunes the search incorrectly. Looking into
the two states which are being compared by the pruning logic at line 9, the
old state consists of R6_rwD=Pscalar() R9_rwD=0 R10=fp0 and the new state
consists of R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968)
R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0. While r6 had the reg->precise flag
correctly set in the old state, r9 did not. Both r6'es are considered as
equivalent given the old one is a superset of the current, more precise one,
however, r9's actual values (0 vs 0x80000000) mismatch. Given the old r9
did not have reg->precise flag set, the verifier does not consider the
register as contributing to the precision state of r6, and therefore it
considered both r9 states as equivalent. However, for this specific pruned
path (which is also the actual path taken at runtime), register r6 will be
0x400 and r9 0x80000000 when reaching line 21, thus oob-accessing the map.

The purpose of precision tracking is to initially mark registers (including
spilled ones) as imprecise to help verifier's pruning logic finding equivalent
states it can then prune if they don't contribute to the program's safety
aspects. For example, if registers are used for pointer arithmetic or to pass
constant length to a helper, then the verifier sets reg->precise flag and
backtracks the BPF program instruction sequence and chain of verifier states
to ensure that the given register or stack slot including their dependencies
are marked as precisely tracked scalar. This also includes any other registers
and slots that contribute to a tracked state of given registers/stack slot.
This backtracking relies on recorded jmp_history and is able to traverse
entire chain of parent states. This process ends only when all the necessary
registers/slots and their transitive dependencies are marked as precise.

The backtrack_insn() is called from the current instruction up to the first
instruction, and its purpose is to compute a bitmask of registers and stack
slots that need precision tracking in the parent's verifier state. For example,
if a current instruction is r6 = r7, then r6 needs precision after this
instruction and r7 needs precision before this instruction, that is, in the
parent state. Hence for the latter r7 is marked and r6 unmarked.

For the class of jmp/jmp32 instructions, backtrack_insn() today only looks
at call and exit instructions and for all other conditionals the masks
remain as-is. However, in the given situation register r6 has a dependency
on r9 (as described above in **), so also that one needs to be marked for
precision tracking. In other words, if an imprecise register influences a
precise one, then the imprecise register should also be marked precise.
Meaning, in the parent state both dest and src register need to be tracked
for precision and therefore the marking must be more conservative by setting
reg->precise flag for both. The precision propagation needs to cover both
for the conditional: if the src reg was marked but not the dst reg and vice
versa.

After the fix the program is correctly rejected:

  func#0 @0
  0: R1=ctx(off=0,imm=0) R10=fp0
  0: (b7) r6 = 1024                     ; R6_w=1024
  1: (b7) r7 = 0                        ; R7_w=0
  2: (b7) r8 = 0                        ; R8_w=0
  3: (b7) r9 = -2147483648              ; R9_w=-2147483648
  4: (97) r6 %= 1025                    ; R6_w=scalar()
  5: (05) goto pc+0
  6: (bd) if r6 <= r9 goto pc+2         ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff80000000; 0x7fffffff),u32_min=-2147483648) R9_w=-2147483648
  7: (97) r6 %= 1                       ; R6_w=scalar()
  8: (b7) r9 = 0                        ; R9=0
  9: (bd) if r6 <= r9 goto pc+1         ; R6=scalar(umin=1) R9=0
  10: (b7) r6 = 0                       ; R6_w=0
  11: (b7) r0 = 0                       ; R0_w=0
  12: (63) *(u32 *)(r10 -4) = r0
  last_idx 12 first_idx 9
  regs=1 stack=0 before 11: (b7) r0 = 0
  13: R0_w=0 R10=fp0 fp-8=0000????
  13: (18) r4 = 0xffff9290dc5bfe00      ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  15: (bf) r1 = r4                      ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  16: (bf) r2 = r10                     ; R2_w=fp0 R10=fp0
  17: (07) r2 += -4                     ; R2_w=fp-4
  18: (85) call bpf_map_lookup_elem#1   ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0)
  19: (55) if r0 != 0x0 goto pc+1       ; R0=0
  20: (95) exit

  from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
  21: (77) r6 >>= 10                    ; R6_w=0
  22: (27) r6 *= 8192                   ; R6_w=0
  23: (bf) r1 = r0                      ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
  24: (0f) r0 += r6
  last_idx 24 first_idx 19
  regs=40 stack=0 before 23: (bf) r1 = r0
  regs=40 stack=0 before 22: (27) r6 *= 8192
  regs=40 stack=0 before 21: (77) r6 >>= 10
  regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
  parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
  last_idx 18 first_idx 9
  regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
  regs=40 stack=0 before 17: (07) r2 += -4
  regs=40 stack=0 before 16: (bf) r2 = r10
  regs=40 stack=0 before 15: (bf) r1 = r4
  regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
  regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
  regs=40 stack=0 before 11: (b7) r0 = 0
  regs=40 stack=0 before 10: (b7) r6 = 0
  25: (79) r3 = *(u64 *)(r0 +0)         ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
  26: (7b) *(u64 *)(r1 +0) = r3         ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
  27: (95) exit

  from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
  11: (b7) r0 = 0                       ; R0_w=0
  12: (63) *(u32 *)(r10 -4) = r0
  last_idx 12 first_idx 11
  regs=1 stack=0 before 11: (b7) r0 = 0
  13: R0_w=0 R10=fp0 fp-8=0000????
  13: (18) r4 = 0xffff9290dc5bfe00      ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  15: (bf) r1 = r4                      ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  16: (bf) r2 = r10                     ; R2_w=fp0 R10=fp0
  17: (07) r2 += -4                     ; R2_w=fp-4
  18: (85) call bpf_map_lookup_elem#1
  frame 0: propagating r6
  last_idx 19 first_idx 11
  regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
  regs=40 stack=0 before 17: (07) r2 += -4
  regs=40 stack=0 before 16: (bf) r2 = r10
  regs=40 stack=0 before 15: (bf) r1 = r4
  regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
  regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
  regs=40 stack=0 before 11: (b7) r0 = 0
  parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0
  last_idx 9 first_idx 9
  regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
  parent didn't have regs=240 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=P0 R10=fp0
  last_idx 8 first_idx 0
  regs=240 stack=0 before 8: (b7) r9 = 0
  regs=40 stack=0 before 7: (97) r6 %= 1
  regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=240 stack=0 before 5: (05) goto pc+0
  regs=240 stack=0 before 4: (97) r6 %= 1025
  regs=240 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024
  19: safe

  from 6 to 9: R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0
  9: (bd) if r6 <= r9 goto pc+1
  last_idx 9 first_idx 0
  regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=240 stack=0 before 5: (05) goto pc+0
  regs=240 stack=0 before 4: (97) r6 %= 1025
  regs=240 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024
  last_idx 9 first_idx 0
  regs=200 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=240 stack=0 before 5: (05) goto pc+0
  regs=240 stack=0 before 4: (97) r6 %= 1025
  regs=240 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024
  11: R6=scalar(umax=18446744071562067968) R9=-2147483648
  11: (b7) r0 = 0                       ; R0_w=0
  12: (63) *(u32 *)(r10 -4) = r0
  last_idx 12 first_idx 11
  regs=1 stack=0 before 11: (b7) r0 = 0
  13: R0_w=0 R10=fp0 fp-8=0000????
  13: (18) r4 = 0xffff9290dc5bfe00      ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  15: (bf) r1 = r4                      ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
  16: (bf) r2 = r10                     ; R2_w=fp0 R10=fp0
  17: (07) r2 += -4                     ; R2_w=fp-4
  18: (85) call bpf_map_lookup_elem#1   ; R0_w=map_value_or_null(id=3,off=0,ks=4,vs=48,imm=0)
  19: (55) if r0 != 0x0 goto pc+1       ; R0_w=0
  20: (95) exit

  from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=scalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm????
  21: (77) r6 >>= 10                    ; R6_w=scalar(umax=18014398507384832,var_off=(0x0; 0x3fffffffffffff))
  22: (27) r6 *= 8192                   ; R6_w=scalar(smax=9223372036854767616,umax=18446744073709543424,var_off=(0x0; 0xffffffffffffe000),s32_max=2147475456,u32_max=-8192)
  23: (bf) r1 = r0                      ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
  24: (0f) r0 += r6
  last_idx 24 first_idx 21
  regs=40 stack=0 before 23: (bf) r1 = r0
  regs=40 stack=0 before 22: (27) r6 *= 8192
  regs=40 stack=0 before 21: (77) r6 >>= 10
  parent didn't have regs=40 stack=0 marks: R0_rw=map_value(off=0,ks=4,vs=48,imm=0) R6_r=Pscalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm????
  last_idx 19 first_idx 11
  regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
  regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
  regs=40 stack=0 before 17: (07) r2 += -4
  regs=40 stack=0 before 16: (bf) r2 = r10
  regs=40 stack=0 before 15: (bf) r1 = r4
  regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
  regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
  regs=40 stack=0 before 11: (b7) r0 = 0
  parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0
  last_idx 9 first_idx 0
  regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
  regs=240 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
  regs=240 stack=0 before 5: (05) goto pc+0
  regs=240 stack=0 before 4: (97) r6 %= 1025
  regs=240 stack=0 before 3: (b7) r9 = -2147483648
  regs=40 stack=0 before 2: (b7) r8 = 0
  regs=40 stack=0 before 1: (b7) r7 = 0
  regs=40 stack=0 before 0: (b7) r6 = 1024
  math between map_value pointer and register with unbounded min value is not allowed
  verification time 886 usec
  stack depth 4
  processed 49 insns (limit 1000000) max_states_per_insn 1 total_states 5 peak_states 5 mark_read 2

Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Reported-by: Juan Jose Lopez Jaimez <jjlopezjaimez@google.com>
Reported-by: Meador Inge <meadori@google.com>
Reported-by: Simon Scannell <simonscannell@google.com>
Reported-by: Nenad Stojanovski <thenenadx@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Juan Jose Lopez Jaimez <jjlopezjaimez@google.com>
Reviewed-by: Meador Inge <meadori@google.com>
Reviewed-by: Simon Scannell <simonscannell@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
Xingui Yang d9f1cc8a3b scsi: libsas: Fix the failure of adding phy with zero-address to port
commit 06036a0a5db34642c5dbe22021a767141f010b7a upstream.

As of commit 7d1d865181 ("[SCSI] libsas: fix false positive 'device
attached' conditions"), reset the phy->entacted_sas_addr address to a
zero-address when the link rate is less than 1.5G.

Currently we find that when a new device is attached, and the link rate is
less than 1.5G, but the device type is not NO_DEVICE, for example: the link
rate is SAS_PHY_RESET_IN_PROGRESS and the device type is stp. After setting
the phy->entacted_sas_addr address to the zero address, the port will
continue to be created for the phy with the zero-address, and other phys
with the zero-address will be tried to be added to the new port:

[562240.051197] sas: ex 500e004aaaaaaa1f phy19:U:0 attached: 0000000000000000 (no device)
// phy19 is deleted but still on the parent port's phy_list
[562240.062536] sas: ex 500e004aaaaaaa1f phy0 new device attached
[562240.062616] sas: ex 500e004aaaaaaa1f phy00:U:5 attached: 0000000000000000 (stp)
[562240.062680] port-7:7:0: trying to add phy phy-7:7:19 fails: it's already part of another port

Therefore, it should be the same as sas_get_phy_attached_dev(). Only when
device_type is SAS_PHY_UNUSED, sas_address is set to the 0 address.

Fixes: 7d1d865181 ("[SCSI] libsas: fix false positive 'device attached' conditions")
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Link: https://lore.kernel.org/r/20240312141103.31358-5-yangxingui@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
Xingui Yang 7c5c571ee5 scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed
commit ab2068a6fb84751836a84c26ca72b3beb349619d upstream.

The expander phy will be treated as broadcast flutter in the next
revalidation after the exp-attached end device probe failed, as follows:

[78779.654026] sas: broadcast received: 0
[78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10
[78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached
[78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached: 500e004aaaaaaa05 (stp)
[78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found
[78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
[78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
...
[78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[78835.171344] sas: sas_probe_sata: for exp-attached device 500e004aaaaaaa05 returned -19
[78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone
[78835.187487] sas: broadcast received: 0
[78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10
[78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05
[78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached: 500e004aaaaaaa05 (stp)
[78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter
[78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0

The cause of the problem is that the related ex_phy's attached_sas_addr was
not cleared after the end device probe failed, so reset it.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Link: https://lore.kernel.org/r/20240619091742.25465-1-yangxingui@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
Jason Yan 13c3aa7632 scsi: libsas: Abort all in-flight requests when device is gone
commit 0e4b1791d9 upstream.

When a disk is removed with in-flight I/O, the application needs to wait
for 30 seconds (depending on the timeout configuration) to hear back from
the kernel. Xingui tried to fix this issue by aborting the ATA link for
SATA devices[1], however this approach left the SAS devices unresolved.

Try to fix this issue by aborting all in-flight requests when the device is
gone. This is implemented by iterating over the tagset.

[1] https://lore.kernel.org/lkml/234e04db-7539-07e4-a6b8-c6b05f78193d@opensource.wdc.com/T/

Cc: Xingui Yang <yangxingui@huawei.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20230330110930.175539-1-yanaijie@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
yuehongwu 933e551af2 config: enable CONFIG_IMA for aarch64/x86 platform
This commit is made to align the kernel config with the 0009-kabi branch.

Conflicts:
	arch/arm64/configs/tencent.config

Signed-off-by: yuehongwu <yuehongwu@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
mengensun a1ca443b2a net: add FOU support for arm
[tapd]
https://tapd.woa.com/tapd_fe/20422414/bug/detail/1020422414132816169

compile FOU module in tree

Reviewed-by: Hui Li <caelli@tencent.com>
Signed-off-by: mengensun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
mengensun 4574041a34 net: add MPLS_IPTUNNEL for arm64
[tapd]
https://tapd.woa.com/tapd_fe/20422414/bug/detail/1020422414132816799

compile mpls ip tunnel module in tree

Reviewed-by: Hui Li <caelli@tencent.com>
Signed-off-by: mengensun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
mengensun 68a92caa25 crypto: add cast5 and cast6
Open CONFIG_CRYPTO_CAST5 and CONFIG_CRYPTO_CAST6

[tapd]
https://tapd.woa.com/tapd_fe/20422414/bug/detail/1020422414132825831

Reviewed-by: Hui Li <caelli@tencent.com>
Signed-off-by: mengensun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
caelli e0a2b84344 config: enable arm64 CONFIG_TASKSTATS
enabel arm64 CONFIG_TASKSTATS to meet with x86.

Reviewed-by: Mengen Sun <mengensun@tencent.com>
Signed-off-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
yilingjin b27c0c8467 arm64: change config for performance
change some config for performance

Conflicts:
	arch/arm64/configs/tencent.config

Signed-off-by: yilingjin <yilingjin@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:05 +08:00
caelli a801aac3ae arm64: change config for performance
change some config for performance

Signed-off-by: caelli <caelli@tencent.com>
Reviewed-by: yilingjin <yilingjin@tencent.com>
Reviewed-by: yuehongwu <yuehongwu@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:04 +08:00
caelli 7a94805aa8 alinux: arm64: adjust tk_core memory layout
to #29722367

On some specific hardware with 128 bytes LLC cacheline, tk_core may
cause false sharing problem. We can align it to 128 bytes so that
it won't be affected by other global variables.

This change will make a bit waste on cache utilization but get good
number of performance improvement. So for both 64 and 128 bytes aligned
LLC cacheline, we adjust tk_core memory layout to avoid potential cacheline
contention.

Signed-off-by: Peng Wang <rocking@linux.alibaba.com>
Acked-by: Shanpei Chen <shanpeic@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: caelli <caelli@tencent.com>
Reviewed-by: yilingjin <yilingjin@tencent.com>
Reviewed-by: yuehongwu <yuehongwu@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:04 +08:00
yilingjin 9e1bfb392a arm64/configs: turn CONFIG_RTC_DRV_EFI on
Signed-off-by: yilingjin <yilingjin@tencent.com>
Reviewed-by: Hui Li <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:04 +08:00
sumiyawang fcfb0a9343 driver: update hisilicon hardware crypto engine
keep the hisilicon crypto driver up to 1.3.11, autoprobe
the modules when hardware enabled.

Signed-off-by: sumiyawang <sumiyawang@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Chun Liu <kaicliu@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:04 +08:00
Jinliang Zheng 343e48d1c2 Revert "crypto: hisilicon - Kunpeng916 crypto driver don't sleep when in softirq"
This reverts commit cf411bcc65.

Revert to facilitate integration of higher version drivers: 1.3.11

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Chun Liu <kaicliu@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:04 +08:00
yuehongwu 36dd8c7c77 config: Enabled some configs to fix hinic errors at arm64.
- CONFIG_PCI_PF_STUB=m
- CONFIG_PCI_IOV=y
- CONFIG_THIRDPARTY_HINIC=m
- CONFIG_THIRDPARTY_DRM_AST=m
delete CONFIG_DRM_AST=m
delete CONFIG_HINIC=m

Conflicts:
	arch/arm64/configs/tencent.config

Signed-off-by: yuehongwu <yuehongwu@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:13:02 +08:00
MengEn Sun 340a0a10c9 ice: add ice driver for arm64
PTP feature of ice NIC is depend on x86 cpu feature of known
Constant TSC frequency there is no tsc on arm64, so do not
init the PTP callback function on arm64.

btw: we compile the ice driver as modules from now on

Reviewed-by: yuehongwu <yuehongwu@tencent.com>
Signed-off-by: MengEn Sun <mengensun@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: mengensun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:04:26 +08:00
Jianping Liu 83a5deaf4b drivers/thirdparty,ice: fix compile error
The commit 91dc13667e have change change
the struct of ethtool_ops, which will cause compile error.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-27 15:03:56 +08:00
Jinliang Zheng a2936f347e ice: update thirdparty ice nic driver to 1.10.1.2
Conflicts:
	drivers/thirdparty/ice/ice_ethtool.c

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: mengensun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 19:22:04 +08:00
caelli 9660939fa0 config: enable vcan
enable vcan config.

Signed-off-by: caelli <caelli@tencent.com>
Reviewed-by: JinLiang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 19:06:28 +08:00