In the Linux kernel, the following vulnerability has been resolved:
NFSD: fix race between nfsd registration and exports_proc
As of now nfsd calls create_proc_exports_entry() at start of init_nfsd
and cleanup by remove_proc_entry() at last of exit_nfsd.
Which causes kernel OOPs if there is race between below 2 operations:
(i) exportfs -r
(ii) mount -t nfsd none /proc/fs/nfsd
for 5.4 kernel ARM64:
CPU 1:
el1_irq+0xbc/0x180
arch_counter_get_cntvct+0x14/0x18
running_clock+0xc/0x18
preempt_count_add+0x88/0x110
prep_new_page+0xb0/0x220
get_page_from_freelist+0x2d8/0x1778
__alloc_pages_nodemask+0x15c/0xef0
__vmalloc_node_range+0x28c/0x478
__vmalloc_node_flags_caller+0x8c/0xb0
kvmalloc_node+0x88/0xe0
nfsd_init_net+0x6c/0x108 [nfsd]
ops_init+0x44/0x170
register_pernet_operations+0x114/0x270
register_pernet_subsys+0x34/0x50
init_nfsd+0xa8/0x718 [nfsd]
do_one_initcall+0x54/0x2e0
CPU 2 :
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
PC is at : exports_net_open+0x50/0x68 [nfsd]
Call trace:
exports_net_open+0x50/0x68 [nfsd]
exports_proc_open+0x2c/0x38 [nfsd]
proc_reg_open+0xb8/0x198
do_dentry_open+0x1c4/0x418
vfs_open+0x38/0x48
path_openat+0x28c/0xf18
do_filp_open+0x70/0xe8
do_sys_open+0x154/0x248
Sometimes it crashes at exports_net_open() and sometimes cache_seq_next_rcu().
and same is happening on latest 6.14 kernel as well:
[ 0.000000] Linux version 6.14.0-rc5-next-20250304-dirty
...
[ 285.455918] Unable to handle kernel paging request at virtual address 00001f4800001f48
...
[ 285.464902] pc : cache_seq_next_rcu+0x78/0xa4
...
[ 285.469695] Call trace:
[ 285.470083] cache_seq_next_rcu+0x78/0xa4 (P)
[ 285.470488] seq_read+0xe0/0x11c
[ 285.470675] proc_reg_read+0x9c/0xf0
[ 285.470874] vfs_read+0xc4/0x2fc
[ 285.471057] ksys_read+0x6c/0xf4
[ 285.471231] __arm64_sys_read+0x1c/0x28
[ 285.471428] invoke_syscall+0x44/0x100
[ 285.471633] el0_svc_common.constprop.0+0x40/0xe0
[ 285.471870] do_el0_svc_compat+0x1c/0x34
[ 285.472073] el0_svc_compat+0x2c/0x80
[ 285.472265] el0t_32_sync_handler+0x90/0x140
[ 285.472473] el0t_32_sync+0x19c/0x1a0
[ 285.472887] Code: f9400885 93407c23 937d7c27 11000421 (f86378a3)
[ 285.473422] ---[ end trace 0000000000000000 ]---
It reproduced simply with below script:
while [ 1 ]
do
/exportfs -r
done &
while [ 1 ]
do
insmod /nfsd.ko
mount -t nfsd none /proc/fs/nfsd
umount /proc/fs/nfsd
rmmod nfsd
done &
So exporting interfaces to user space shall be done at last and
cleanup at first place.
With change there is no Kernel OOPs.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to do sanity check on sit_bitmap_size
w/ below testcase, resize will generate a corrupted image which
contains inconsistent metadata, so when mounting such image, it
will trigger kernel panic:
touch img
truncate -s $((512*1024*1024*1024)) img
mkfs.f2fs -f img $((256*1024*1024))
resize.f2fs -s -i img -t $((1024*1024*1024))
mount img /mnt/f2fs
------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.h:863!
Oops: invalid opcode: 0000 [#1] SMP PTI
CPU: 11 UID: 0 PID: 3922 Comm: mount Not tainted 6.15.0-rc1+ #191 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:f2fs_ra_meta_pages+0x47c/0x490
Call Trace:
f2fs_build_segment_manager+0x11c3/0x2600
f2fs_fill_super+0xe97/0x2840
mount_bdev+0xf4/0x140
legacy_get_tree+0x2b/0x50
vfs_get_tree+0x29/0xd0
path_mount+0x487/0xaf0
__x64_sys_mount+0x116/0x150
do_syscall_64+0x82/0x190
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fdbfde1bcfe
The reaseon is:
sit_i->bitmap_size is 192, so size of sit bitmap is 192*8=1536, at maximum
there are 1536 sit blocks, however MAIN_SEGS is 261893, so that sit_blk_cnt
is 4762, build_sit_entries() -> current_sit_addr() tries to access
out-of-boundary in sit_bitmap at offset from [1536, 4762), once sit_bitmap
and sit_bitmap_mirror is not the same, it will trigger f2fs_bug_on().
Let's add sanity check in f2fs_sanity_check_ckpt() to avoid panic.
In the Linux kernel, the following vulnerability has been resolved:
fbdev: Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var
If fb_add_videomode() in do_register_framebuffer() fails to allocate
memory for fb_videomode, it will later lead to a null-ptr dereference in
fb_videomode_to_var(), as the fb_info is registered while not having the
mode in modelist that is expected to be there, i.e. the one that is
described in fb_info->var.
================================================================
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 30371 Comm: syz-executor.1 Not tainted 5.10.226-syzkaller #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:fb_videomode_to_var+0x24/0x610 drivers/video/fbdev/core/modedb.c:901
Call Trace:
display_to_var+0x3a/0x7c0 drivers/video/fbdev/core/fbcon.c:929
fbcon_resize+0x3e2/0x8f0 drivers/video/fbdev/core/fbcon.c:2071
resize_screen drivers/tty/vt/vt.c:1176 [inline]
vc_do_resize+0x53a/0x1170 drivers/tty/vt/vt.c:1263
fbcon_modechanged+0x3ac/0x6e0 drivers/video/fbdev/core/fbcon.c:2720
fbcon_update_vcs+0x43/0x60 drivers/video/fbdev/core/fbcon.c:2776
do_fb_ioctl+0x6d2/0x740 drivers/video/fbdev/core/fbmem.c:1128
fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1203
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:739
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x67/0xd1
================================================================
Even though fbcon_init() checks beforehand if fb_match_mode() in
var_to_display() fails, it can not prevent the panic because fbcon_init()
does not return error code. Considering this and the comment in the code
about fb_match_mode() returning NULL - "This should not happen" - it is
better to prevent registering the fb_info if its mode was not set
successfully. Also move fb_add_videomode() closer to the beginning of
do_register_framebuffer() to avoid having to do the cleanup on fail.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
In the Linux kernel, the following vulnerability has been resolved:
mm: fix uprobe pte be overwritten when expanding vma
Patch series "Fix uprobe pte be overwritten when expanding vma".
This patch (of 4):
We encountered a BUG alert triggered by Syzkaller as follows:
BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
And we can reproduce it with the following steps:
1. register uprobe on file at zero offset
2. mmap the file at zero offset:
addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
3. mremap part of vma1 to new vma2:
addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
4. mremap back to orig addr1:
mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
In step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new vma2
with range [addr2, addr2 + 8192], and remap uprobe anon page from the vma1
to vma2, then unmap the vma1 range [addr1, addr1 + 4096].
In step 4, the vma2 range [addr2, addr2 + 4096] will be remap back to the
addr range [addr1, addr1 + 4096]. Since the addr range [addr1 + 4096,
addr1 + 8192] still maps the file, it will take vma_merge_new_range to
expand the range, and then do uprobe_mmap in vma_complete. Since the
merged vma pgoff is also zero offset, it will install uprobe anon page to
the merged vma. However, the upcomming move_page_tables step, which use
set_pte_at to remap the vma2 uprobe pte to the merged vma, will overwrite
the newly uprobe pte in the merged vma, and lead that pte to be orphan.
Since the uprobe pte will be remapped to the merged vma, we can remove the
unnecessary uprobe_mmap upon merged vma.
This problem was first found in linux-6.6.y and also exists in the
community syzkaller:
https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/
In the Linux kernel, the following vulnerability has been resolved:
smb: client: add NULL check in automount_fullpath
page is checked for null in __build_path_from_dentry_optional_prefix
when tcon->origin_fullpath is not set. However, the check is missing when
it is set.
Add a check to prevent a potential NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
fbcon: Make sure modelist not set on unregistered console
It looks like attempting to write to the "store_modes" sysfs node will
run afoul of unregistered consoles:
UBSAN: array-index-out-of-bounds in drivers/video/fbdev/core/fbcon.c:122:28
index -1 is out of range for type 'fb_info *[32]'
...
fbcon_info_from_console+0x192/0x1a0 drivers/video/fbdev/core/fbcon.c:122
fbcon_new_modelist+0xbf/0x2d0 drivers/video/fbdev/core/fbcon.c:3048
fb_new_modelist+0x328/0x440 drivers/video/fbdev/core/fbmem.c:673
store_modes+0x1c9/0x3e0 drivers/video/fbdev/core/fbsysfs.c:113
dev_attr_store+0x55/0x80 drivers/base/core.c:2439
static struct fb_info *fbcon_registered_fb[FB_MAX];
...
static signed char con2fb_map[MAX_NR_CONSOLES];
...
static struct fb_info *fbcon_info_from_console(int console)
...
return fbcon_registered_fb[con2fb_map[console]];
If con2fb_map contains a -1 things go wrong here. Instead, return NULL,
as callers of fbcon_info_from_console() are trying to compare against
existing "info" pointers, so error handling should kick in correctly.
In the Linux kernel, the following vulnerability has been resolved:
net: clear the dst when changing skb protocol
A not-so-careful NAT46 BPF program can crash the kernel
if it indiscriminately flips ingress packets from v4 to v6:
BUG: kernel NULL pointer dereference, address: 0000000000000000
ip6_rcv_core (net/ipv6/ip6_input.c:190:20)
ipv6_rcv (net/ipv6/ip6_input.c:306:8)
process_backlog (net/core/dev.c:6186:4)
napi_poll (net/core/dev.c:6906:9)
net_rx_action (net/core/dev.c:7028:13)
do_softirq (kernel/softirq.c:462:3)
netif_rx (net/core/dev.c:5326:3)
dev_loopback_xmit (net/core/dev.c:4015:2)
ip_mc_finish_output (net/ipv4/ip_output.c:363:8)
NF_HOOK (./include/linux/netfilter.h:314:9)
ip_mc_output (net/ipv4/ip_output.c:400:5)
dst_output (./include/net/dst.h:459:9)
ip_local_out (net/ipv4/ip_output.c:130:9)
ip_send_skb (net/ipv4/ip_output.c:1496:8)
udp_send_skb (net/ipv4/udp.c:1040:8)
udp_sendmsg (net/ipv4/udp.c:1328:10)
The output interface has a 4->6 program attached at ingress.
We try to loop the multicast skb back to the sending socket.
Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr
and changes skb->protocol to v6. We enter ip6_rcv_core which
tries to use skb_dst(). But the dst is still an IPv4 one left
after IPv4 mcast output.
Clear the dst in all BPF helpers which change the protocol.
Try to preserve metadata dsts, those may carry non-routing
metadata.
In the Linux kernel, the following vulnerability has been resolved:
bpf: fix ktls panic with sockmap
[ 2172.936997] ------------[ cut here ]------------
[ 2172.936999] kernel BUG at lib/iov_iter.c:629!
......
[ 2172.944996] PKRU: 55555554
[ 2172.945155] Call Trace:
[ 2172.945299] <TASK>
[ 2172.945428] ? die+0x36/0x90
[ 2172.945601] ? do_trap+0xdd/0x100
[ 2172.945795] ? iov_iter_revert+0x178/0x180
[ 2172.946031] ? iov_iter_revert+0x178/0x180
[ 2172.946267] ? do_error_trap+0x7d/0x110
[ 2172.946499] ? iov_iter_revert+0x178/0x180
[ 2172.946736] ? exc_invalid_op+0x50/0x70
[ 2172.946961] ? iov_iter_revert+0x178/0x180
[ 2172.947197] ? asm_exc_invalid_op+0x1a/0x20
[ 2172.947446] ? iov_iter_revert+0x178/0x180
[ 2172.947683] ? iov_iter_revert+0x5c/0x180
[ 2172.947913] tls_sw_sendmsg_locked.isra.0+0x794/0x840
[ 2172.948206] tls_sw_sendmsg+0x52/0x80
[ 2172.948420] ? inet_sendmsg+0x1f/0x70
[ 2172.948634] __sys_sendto+0x1cd/0x200
[ 2172.948848] ? find_held_lock+0x2b/0x80
[ 2172.949072] ? syscall_trace_enter+0x140/0x270
[ 2172.949330] ? __lock_release.isra.0+0x5e/0x170
[ 2172.949595] ? find_held_lock+0x2b/0x80
[ 2172.949817] ? syscall_trace_enter+0x140/0x270
[ 2172.950211] ? lockdep_hardirqs_on_prepare+0xda/0x190
[ 2172.950632] ? ktime_get_coarse_real_ts64+0xc2/0xd0
[ 2172.951036] __x64_sys_sendto+0x24/0x30
[ 2172.951382] do_syscall_64+0x90/0x170
......
After calling bpf_exec_tx_verdict(), the size of msg_pl->sg may increase,
e.g., when the BPF program executes bpf_msg_push_data().
If the BPF program sets cork_bytes and sg.size is smaller than cork_bytes,
it will return -ENOSPC and attempt to roll back to the non-zero copy
logic. However, during rollback, msg->msg_iter is reset, but since
msg_pl->sg.size has been increased, subsequent executions will exceed the
actual size of msg_iter.
'''
iov_iter_revert(&msg->msg_iter, msg_pl->sg.size - orig_size);
'''
The changes in this commit are based on the following considerations:
1. When cork_bytes is set, rolling back to non-zero copy logic is
pointless and can directly go to zero-copy logic.
2. We can not calculate the correct number of bytes to revert msg_iter.
Assume the original data is "abcdefgh" (8 bytes), and after 3 pushes
by the BPF program, it becomes 11-byte data: "abc?de?fgh?".
Then, we set cork_bytes to 6, which means the first 6 bytes have been
processed, and the remaining 5 bytes "?fgh?" will be cached until the
length meets the cork_bytes requirement.
However, some data in "?fgh?" is not within 'sg->msg_iter'
(but in msg_pl instead), especially the data "?" we pushed.
So it doesn't seem as simple as just reverting through an offset of
msg_iter.
3. For non-TLS sockets in tcp_bpf_sendmsg, when a "cork" situation occurs,
the user-space send() doesn't return an error, and the returned length is
the same as the input length parameter, even if some data is cached.
Additionally, I saw that the current non-zero-copy logic for handling
corking is written as:
'''
line 1177
else if (ret != -EAGAIN) {
if (ret == -ENOSPC)
ret = 0;
goto send_end;
'''
So it's ok to just return 'copied' without error when a "cork" situation
occurs.
In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw88: fix the 'para' buffer size to avoid reading out of bounds
Set the size to 6 instead of 2, since 'para' array is passed to
'rtw_fw_bt_wifi_control(rtwdev, para[0], ¶[1])', which reads
5 bytes:
void rtw_fw_bt_wifi_control(struct rtw_dev *rtwdev, u8 op_code, u8 *data)
{
...
SET_BT_WIFI_CONTROL_DATA1(h2c_pkt, *data);
SET_BT_WIFI_CONTROL_DATA2(h2c_pkt, *(data + 1));
...
SET_BT_WIFI_CONTROL_DATA5(h2c_pkt, *(data + 4));
Detected using the static analysis tool - Svace.