In the Linux kernel, the following vulnerability has been resolved:
svcrdma: fix miss destroy percpu_counter in svc_rdma_proc_init()
There's issue as follows:
RPC: Registered rdma transport module.
RPC: Registered rdma backchannel transport module.
RPC: Unregistered rdma transport module.
RPC: Unregistered rdma backchannel transport module.
BUG: unable to handle page fault for address: fffffbfff80c609a
PGD 123fee067 P4D 123fee067 PUD 123fea067 PMD 10c624067 PTE 0
Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
RIP: 0010:percpu_counter_destroy_many+0xf7/0x2a0
Call Trace:
<TASK>
__die+0x1f/0x70
page_fault_oops+0x2cd/0x860
spurious_kernel_fault+0x36/0x450
do_kern_addr_fault+0xca/0x100
exc_page_fault+0x128/0x150
asm_exc_page_fault+0x26/0x30
percpu_counter_destroy_many+0xf7/0x2a0
mmdrop+0x209/0x350
finish_task_switch.isra.0+0x481/0x840
schedule_tail+0xe/0xd0
ret_from_fork+0x23/0x80
ret_from_fork_asm+0x1a/0x30
</TASK>
If register_sysctl() return NULL, then svc_rdma_proc_cleanup() will not
destroy the percpu counters which init in svc_rdma_proc_init().
If CONFIG_HOTPLUG_CPU is enabled, residual nodes may be in the
'percpu_counters' list. The above issue may occur once the module is
removed. If the CONFIG_HOTPLUG_CPU configuration is not enabled, memory
leakage occurs.
To solve above issue just destroy all percpu counters when
register_sysctl() return NULL.
In the Linux kernel, the following vulnerability has been resolved:
nfsd: release svc_expkey/svc_export with rcu_work
The last reference for `cache_head` can be reduced to zero in `c_show`
and `e_show`(using `rcu_read_lock` and `rcu_read_unlock`). Consequently,
`svc_export_put` and `expkey_put` will be invoked, leading to two
issues:
1. The `svc_export_put` will directly free ex_uuid. However,
`e_show`/`c_show` will access `ex_uuid` after `cache_put`, which can
trigger a use-after-free issue, shown below.
==================================================================
BUG: KASAN: slab-use-after-free in svc_export_show+0x362/0x430 [nfsd]
Read of size 1 at addr ff11000010fdc120 by task cat/870
CPU: 1 UID: 0 PID: 870 Comm: cat Not tainted 6.12.0-rc3+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_address_description.constprop.0+0x2c/0x3a0
print_report+0xb9/0x280
kasan_report+0xae/0xe0
svc_export_show+0x362/0x430 [nfsd]
c_show+0x161/0x390 [sunrpc]
seq_read_iter+0x589/0x770
seq_read+0x1e5/0x270
proc_reg_read+0xe1/0x140
vfs_read+0x125/0x530
ksys_read+0xc1/0x160
do_syscall_64+0x5f/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Allocated by task 830:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x8f/0xa0
__kmalloc_node_track_caller_noprof+0x1bc/0x400
kmemdup_noprof+0x22/0x50
svc_export_parse+0x8a9/0xb80 [nfsd]
cache_do_downcall+0x71/0xa0 [sunrpc]
cache_write_procfs+0x8e/0xd0 [sunrpc]
proc_reg_write+0xe1/0x140
vfs_write+0x1a5/0x6d0
ksys_write+0xc1/0x160
do_syscall_64+0x5f/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 868:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x37/0x50
kfree+0xf3/0x3e0
svc_export_put+0x87/0xb0 [nfsd]
cache_purge+0x17f/0x1f0 [sunrpc]
nfsd_destroy_serv+0x226/0x2d0 [nfsd]
nfsd_svc+0x125/0x1e0 [nfsd]
write_threads+0x16a/0x2a0 [nfsd]
nfsctl_transaction_write+0x74/0xa0 [nfsd]
vfs_write+0x1a5/0x6d0
ksys_write+0xc1/0x160
do_syscall_64+0x5f/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
2. We cannot sleep while using `rcu_read_lock`/`rcu_read_unlock`.
However, `svc_export_put`/`expkey_put` will call path_put, which
subsequently triggers a sleeping operation due to the following
`dput`.
=============================
WARNING: suspicious RCU usage
5.10.0-dirty #141 Not tainted
-----------------------------
...
Call Trace:
dump_stack+0x9a/0xd0
___might_sleep+0x231/0x240
dput+0x39/0x600
path_put+0x1b/0x30
svc_export_put+0x17/0x80
e_show+0x1c9/0x200
seq_read_iter+0x63f/0x7c0
seq_read+0x226/0x2d0
vfs_read+0x113/0x2c0
ksys_read+0xc9/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x67/0xd1
Fix these issues by using `rcu_work` to help release
`svc_expkey`/`svc_export`. This approach allows for an asynchronous
context to invoke `path_put` and also facilitates the freeing of
`uuid/exp/key` after an RCU grace period.
In the Linux kernel, the following vulnerability has been resolved:
NFSD: Prevent NULL dereference in nfsd4_process_cb_update()
@ses is initialized to NULL. If __nfsd4_find_backchannel() finds no
available backchannel session, setup_callback_client() will try to
dereference @ses and segfault.
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix race in concurrent f2fs_stop_gc_thread
In my test case, concurrent calls to f2fs shutdown report the following
stack trace:
Oops: general protection fault, probably for non-canonical address 0xc6cfff63bb5513fc: 0000 [#1] PREEMPT SMP PTI
CPU: 0 UID: 0 PID: 678 Comm: f2fs_rep_shutdo Not tainted 6.12.0-rc5-next-20241029-g6fb2fa9805c5-dirty #85
Call Trace:
<TASK>
? show_regs+0x8b/0xa0
? __die_body+0x26/0xa0
? die_addr+0x54/0x90
? exc_general_protection+0x24b/0x5c0
? asm_exc_general_protection+0x26/0x30
? kthread_stop+0x46/0x390
f2fs_stop_gc_thread+0x6c/0x110
f2fs_do_shutdown+0x309/0x3a0
f2fs_ioc_shutdown+0x150/0x1c0
__f2fs_ioctl+0xffd/0x2ac0
f2fs_ioctl+0x76/0xe0
vfs_ioctl+0x23/0x60
__x64_sys_ioctl+0xce/0xf0
x64_sys_call+0x2b1b/0x4540
do_syscall_64+0xa7/0x240
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The root cause is a race condition in f2fs_stop_gc_thread() called from
different f2fs shutdown paths:
[CPU0] [CPU1]
---------------------- -----------------------
f2fs_stop_gc_thread f2fs_stop_gc_thread
gc_th = sbi->gc_thread
gc_th = sbi->gc_thread
kfree(gc_th)
sbi->gc_thread = NULL
< gc_th != NULL >
kthread_stop(gc_th->f2fs_gc_task) //UAF
The commit c7f114d864ac ("f2fs: fix to avoid use-after-free in
f2fs_stop_gc_thread()") attempted to fix this issue by using a read
semaphore to prevent races between shutdown and remount threads, but
it fails to prevent all race conditions.
Fix it by converting to write lock of s_umount in f2fs_do_shutdown().
In the Linux kernel, the following vulnerability has been resolved:
virtiofs: use pages instead of pointer for kernel direct IO
When trying to insert a 10MB kernel module kept in a virtio-fs with cache
disabled, the following warning was reported:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 404 at mm/page_alloc.c:4551 ......
Modules linked in:
CPU: 1 PID: 404 Comm: insmod Not tainted 6.9.0-rc5+ #123
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
RIP: 0010:__alloc_pages+0x2bf/0x380
......
Call Trace:
<TASK>
? __warn+0x8e/0x150
? __alloc_pages+0x2bf/0x380
__kmalloc_large_node+0x86/0x160
__kmalloc+0x33c/0x480
virtio_fs_enqueue_req+0x240/0x6d0
virtio_fs_wake_pending_and_unlock+0x7f/0x190
queue_request_and_unlock+0x55/0x60
fuse_simple_request+0x152/0x2b0
fuse_direct_io+0x5d2/0x8c0
fuse_file_read_iter+0x121/0x160
__kernel_read+0x151/0x2d0
kernel_read+0x45/0x50
kernel_read_file+0x1a9/0x2a0
init_module_from_file+0x6a/0xe0
idempotent_init_module+0x175/0x230
__x64_sys_finit_module+0x5d/0xb0
x64_sys_call+0x1c3/0x9e0
do_syscall_64+0x3d/0xc0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
......
</TASK>
---[ end trace 0000000000000000 ]---
The warning is triggered as follows:
1) syscall finit_module() handles the module insertion and it invokes
kernel_read_file() to read the content of the module first.
2) kernel_read_file() allocates a 10MB buffer by using vmalloc() and
passes it to kernel_read(). kernel_read() constructs a kvec iter by
using iov_iter_kvec() and passes it to fuse_file_read_iter().
3) virtio-fs disables the cache, so fuse_file_read_iter() invokes
fuse_direct_io(). As for now, the maximal read size for kvec iter is
only limited by fc->max_read. For virtio-fs, max_read is UINT_MAX, so
fuse_direct_io() doesn't split the 10MB buffer. It saves the address and
the size of the 10MB-sized buffer in out_args[0] of a fuse request and
passes the fuse request to virtio_fs_wake_pending_and_unlock().
4) virtio_fs_wake_pending_and_unlock() uses virtio_fs_enqueue_req() to
queue the request. Because virtiofs need DMA-able address, so
virtio_fs_enqueue_req() uses kmalloc() to allocate a bounce buffer for
all fuse args, copies these args into the bounce buffer and passed the
physical address of the bounce buffer to virtiofsd. The total length of
these fuse args for the passed fuse request is about 10MB, so
copy_args_to_argbuf() invokes kmalloc() with a 10MB size parameter and
it triggers the warning in __alloc_pages():
if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
return NULL;
5) virtio_fs_enqueue_req() will retry the memory allocation in a
kworker, but it won't help, because kmalloc() will always return NULL
due to the abnormal size and finit_module() will hang forever.
A feasible solution is to limit the value of max_read for virtio-fs, so
the length passed to kmalloc() will be limited. However it will affect
the maximal read size for normal read. And for virtio-fs write initiated
from kernel, it has the similar problem but now there is no way to limit
fc->max_write in kernel.
So instead of limiting both the values of max_read and max_write in
kernel, introducing use_pages_for_kvec_io in fuse_conn and setting it as
true in virtiofs. When use_pages_for_kvec_io is enabled, fuse will use
pages instead of pointer to pass the KVEC_IO data.
After switching to pages for KVEC_IO data, these pages will be used for
DMA through virtio-fs. If these pages are backed by vmalloc(),
{flush|invalidate}_kernel_vmap_range() are necessary to flush or
invalidate the cache before the DMA operation. So add two new fields in
fuse_args_pages to record the base address of vmalloc area and the
condition indicating whether invalidation is needed. Perform the flush
in fuse_get_user_pages() for write operations and the invalidation in
fuse_release_user_pages() for read operations.
It may seem necessary to introduce another fie
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: fix potential array underflow in ucsi_ccg_sync_control()
The "command" variable can be controlled by the user via debugfs. The
worry is that if con_index is zero then "&uc->ucsi->connector[con_index
- 1]" would be an array underflow.
In the Linux kernel, the following vulnerability has been resolved:
phy: realtek: usb: fix NULL deref in rtk_usb3phy_probe
In rtk_usb3phy_probe() devm_kzalloc() may return NULL
but this returned value is not checked.
In the Linux kernel, the following vulnerability has been resolved:
phy: realtek: usb: fix NULL deref in rtk_usb2phy_probe
In rtk_usb2phy_probe() devm_kzalloc() may return NULL
but this returned value is not checked.
In the Linux kernel, the following vulnerability has been resolved:
tcp: Fix use-after-free of nreq in reqsk_timer_handler().
The cited commit replaced inet_csk_reqsk_queue_drop_and_put() with
__inet_csk_reqsk_queue_drop() and reqsk_put() in reqsk_timer_handler().
Then, oreq should be passed to reqsk_put() instead of req; otherwise
use-after-free of nreq could happen when reqsk is migrated but the
retry attempt failed (e.g. due to timeout).
Let's pass oreq to reqsk_put().
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: Fix possible deadlocks
This fixes possible deadlocks like the following caused by
hci_cmd_sync_dequeue causing the destroy function to run:
INFO: task kworker/u19:0:143 blocked for more than 120 seconds.
Tainted: G W O 6.8.0-2024-03-19-intel-next-iLS-24ww14 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u19:0 state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
Workqueue: hci0 hci_cmd_sync_work [bluetooth]
Call Trace:
<TASK>
__schedule+0x374/0xaf0
schedule+0x3c/0xf0
schedule_preempt_disabled+0x1c/0x30
__mutex_lock.constprop.0+0x3ef/0x7a0
__mutex_lock_slowpath+0x13/0x20
mutex_lock+0x3c/0x50
mgmt_set_connectable_complete+0xa4/0x150 [bluetooth]
? kfree+0x211/0x2a0
hci_cmd_sync_dequeue+0xae/0x130 [bluetooth]
? __pfx_cmd_complete_rsp+0x10/0x10 [bluetooth]
cmd_complete_rsp+0x26/0x80 [bluetooth]
mgmt_pending_foreach+0x4d/0x70 [bluetooth]
__mgmt_power_off+0x8d/0x180 [bluetooth]
? _raw_spin_unlock_irq+0x23/0x40
hci_dev_close_sync+0x445/0x5b0 [bluetooth]
hci_set_powered_sync+0x149/0x250 [bluetooth]
set_powered_sync+0x24/0x60 [bluetooth]
hci_cmd_sync_work+0x90/0x150 [bluetooth]
process_one_work+0x13e/0x300
worker_thread+0x2f7/0x420
? __pfx_worker_thread+0x10/0x10
kthread+0x107/0x140
? __pfx_kthread+0x10/0x10
ret_from_fork+0x3d/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>