In the Linux kernel, the following vulnerability has been resolved:
cgroup: split cgroup_destroy_wq into 3 workqueues
A hung task can occur during [1] LTP cgroup testing when repeatedly
mounting/unmounting perf_event and net_prio controllers with
systemd.unified_cgroup_hierarchy=1. The hang manifests in
cgroup_lock_and_drain_offline() during root destruction.
Related case:
cgroup_fj_function_perf_event cgroup_fj_function.sh perf_event
cgroup_fj_function_net_prio cgroup_fj_function.sh net_prio
Call Trace:
cgroup_lock_and_drain_offline+0x14c/0x1e8
cgroup_destroy_root+0x3c/0x2c0
css_free_rwork_fn+0x248/0x338
process_one_work+0x16c/0x3b8
worker_thread+0x22c/0x3b0
kthread+0xec/0x100
ret_from_fork+0x10/0x20
Root Cause:
CPU0 CPU1
mount perf_event umount net_prio
cgroup1_get_tree cgroup_kill_sb
rebind_subsystems // root destruction enqueues
// cgroup_destroy_wq
// kill all perf_event css
// one perf_event css A is dying
// css A offline enqueues cgroup_destroy_wq
// root destruction will be executed first
css_free_rwork_fn
cgroup_destroy_root
cgroup_lock_and_drain_offline
// some perf descendants are dying
// cgroup_destroy_wq max_active = 1
// waiting for css A to die
Problem scenario:
1. CPU0 mounts perf_event (rebind_subsystems)
2. CPU1 unmounts net_prio (cgroup_kill_sb), queuing root destruction work
3. A dying perf_event CSS gets queued for offline after root destruction
4. Root destruction waits for offline completion, but offline work is
blocked behind root destruction in cgroup_destroy_wq (max_active=1)
Solution:
Split cgroup_destroy_wq into three dedicated workqueues:
cgroup_offline_wq – Handles CSS offline operations
cgroup_release_wq – Manages resource release
cgroup_free_wq – Performs final memory deallocation
This separation eliminates blocking in the CSS free path while waiting for
offline operations to complete.
[1] https://github.com/linux-test-project/ltp/blob/master/runtest/controllers
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: smbdirect: verify remaining_data_length respects max_fragmented_recv_size
This is inspired by the check for data_offset + data_length.
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: smbdirect: validate data_offset and data_length field of smb_direct_data_transfer
If data_offset and data_length of smb_direct_data_transfer struct are
invalid, out of bounds issue could happen.
This patch validate data_offset and data_length field in recv_done.
In the Linux kernel, the following vulnerability has been resolved:
cnic: Fix use-after-free bugs in cnic_delete_task
The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(),
which does not guarantee that the delayed work item 'delete_task' has
fully completed if it was already running. Additionally, the delayed work
item is cyclic, the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only
blocks and waits for work items that were already queued to the
workqueue prior to its invocation. Any work items submitted after
flush_workqueue() is called are not included in the set of tasks that the
flush operation awaits. This means that after the cyclic work items have
finished executing, a delayed work item may still exist in the workqueue.
This leads to use-after-free scenarios where the cnic_dev is deallocated
by cnic_free_dev(), while delete_task remains active and attempt to
dereference cnic_dev in cnic_delete_task().
A typical race condition is illustrated below:
CPU 0 (cleanup) | CPU 1 (delayed work callback)
cnic_netdev_event() |
cnic_stop_hw() | cnic_delete_task()
cnic_cm_stop_bnx2x_hw() | ...
cancel_delayed_work() | /* the queue_delayed_work()
flush_workqueue() | executes after flush_workqueue()*/
| queue_delayed_work()
cnic_free_dev(dev)//free | cnic_delete_task() //new instance
| dev = cp->dev; //use
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the cyclic delayed work item is properly canceled and that any
ongoing execution of the work item completes before the cnic_dev is
deallocated. Furthermore, since cancel_delayed_work_sync() uses
__flush_work(work, true) to synchronously wait for any currently
executing instance of the work item to finish, the flush_workqueue()
becomes redundant and should be removed.
This bug was identified through static analysis. To reproduce the issue
and validate the fix, I simulated the cnic PCI device in QEMU and
introduced intentional delays — such as inserting calls to ssleep()
within the cnic_delete_task() function — to increase the likelihood
of triggering the bug.
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Harden uplink netdev access against device unbind
The function mlx5_uplink_netdev_get() gets the uplink netdevice
pointer from mdev->mlx5e_res.uplink_netdev. However, the netdevice can
be removed and its pointer cleared when unbound from the mlx5_core.eth
driver. This results in a NULL pointer, causing a kernel panic.
BUG: unable to handle page fault for address: 0000000000001300
at RIP: 0010:mlx5e_vport_rep_load+0x22a/0x270 [mlx5_core]
Call Trace:
<TASK>
mlx5_esw_offloads_rep_load+0x68/0xe0 [mlx5_core]
esw_offloads_enable+0x593/0x910 [mlx5_core]
mlx5_eswitch_enable_locked+0x341/0x420 [mlx5_core]
mlx5_devlink_eswitch_mode_set+0x17e/0x3a0 [mlx5_core]
devlink_nl_eswitch_set_doit+0x60/0xd0
genl_family_rcv_msg_doit+0xe0/0x130
genl_rcv_msg+0x183/0x290
netlink_rcv_skb+0x4b/0xf0
genl_rcv+0x24/0x40
netlink_unicast+0x255/0x380
netlink_sendmsg+0x1f3/0x420
__sock_sendmsg+0x38/0x60
__sys_sendto+0x119/0x180
do_syscall_64+0x53/0x1d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Ensure the pointer is valid before use by checking it for NULL. If it
is valid, immediately call netdev_hold() to take a reference, and
preventing the netdevice from being freed while it is in use.
In the Linux kernel, the following vulnerability has been resolved:
smb: client: let recv_done verify data_offset, data_length and remaining_data_length
This is inspired by the related server fixes.
In the Linux kernel, the following vulnerability has been resolved:
drm: bridge: anx7625: Fix NULL pointer dereference with early IRQ
If the interrupt occurs before resource initialization is complete, the
interrupt handler/worker may access uninitialized data such as the I2C
tcpc_client device, potentially leading to NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer
Since commit 7d5e9737efda ("net: rfkill: gpio: get the name and type from
device property") rfkill_find_type() gets called with the possibly
uninitialized "const char *type_name;" local variable.
On x86 systems when rfkill-gpio binds to a "BCM4752" or "LNV4752"
acpi_device, the rfkill->type is set based on the ACPI acpi_device_id:
rfkill->type = (unsigned)id->driver_data;
and there is no "type" property so device_property_read_string() will fail
and leave type_name uninitialized, leading to a potential crash.
rfkill_find_type() does accept a NULL pointer, fix the potential crash
by initializing type_name to NULL.
Note likely sofar this has not been caught because:
1. Not many x86 machines actually have a "BCM4752"/"LNV4752" acpi_device
2. The stack happened to contain NULL where type_name is stored
In the Linux kernel, the following vulnerability has been resolved:
dm-stripe: fix a possible integer overflow
There's a possible integer overflow in stripe_io_hints if we have too
large chunk size. Test if the overflow happened, and if it did, don't set
limits->io_min and limits->io_opt;
In the Linux kernel, the following vulnerability has been resolved:
crypto: af_alg - Set merge to zero early in af_alg_sendmsg
If an error causes af_alg_sendmsg to abort, ctx->merge may contain
a garbage value from the previous loop. This may then trigger a
crash on the next entry into af_alg_sendmsg when it attempts to do
a merge that can't be done.
Fix this by setting ctx->merge to zero near the start of the loop.