In the Linux kernel, the following vulnerability has been resolved:
bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()
Currently, calling bpf_map_kmalloc_node() from __bpf_async_init() can
cause various locking issues; see the following stack trace (edited for
style) as one example:
...
[10.011566] do_raw_spin_lock.cold
[10.011570] try_to_wake_up (5) double-acquiring the same
[10.011575] kick_pool rq_lock, causing a hardlockup
[10.011579] __queue_work
[10.011582] queue_work_on
[10.011585] kernfs_notify
[10.011589] cgroup_file_notify
[10.011593] try_charge_memcg (4) memcg accounting raises an
[10.011597] obj_cgroup_charge_pages MEMCG_MAX event
[10.011599] obj_cgroup_charge_account
[10.011600] __memcg_slab_post_alloc_hook
[10.011603] __kmalloc_node_noprof
...
[10.011611] bpf_map_kmalloc_node
[10.011612] __bpf_async_init
[10.011615] bpf_timer_init (3) BPF calls bpf_timer_init()
[10.011617] bpf_prog_xxxxxxxxxxxxxxxx_fcg_runnable
[10.011619] bpf__sched_ext_ops_runnable
[10.011620] enqueue_task_scx (2) BPF runs with rq_lock held
[10.011622] enqueue_task
[10.011626] ttwu_do_activate
[10.011629] sched_ttwu_pending (1) grabs rq_lock
...
The above was reproduced on bpf-next (b338cf849ec8) by modifying
./tools/sched_ext/scx_flatcg.bpf.c to call bpf_timer_init() during
ops.runnable(), and hacking the memcg accounting code a bit to make
a bpf_timer_init() call more likely to raise an MEMCG_MAX event.
We have also run into other similar variants (both internally and on
bpf-next), including double-acquiring cgroup_file_kn_lock, the same
worker_pool::lock, etc.
As suggested by Shakeel, fix this by using __GFP_HIGH instead of
GFP_ATOMIC in __bpf_async_init(), so that e.g. if try_charge_memcg()
raises an MEMCG_MAX event, we call __memcg_memory_event() with
@allow_spinning=false and avoid calling cgroup_file_notify() there.
Depends on mm patch
"memcg: skip cgroup_file_notify if spinning is not allowed":
https://lore.kernel.org/bpf/20250905201606.66198-1-shakeel.butt@linux.dev/
v0 approach s/bpf_map_kmalloc_node/bpf_mem_alloc/
https://lore.kernel.org/bpf/20250905061919.439648-1-yepeilin@google.com/
v1 approach:
https://lore.kernel.org/bpf/20250905234547.862249-1-yepeilin@google.com/
In the Linux kernel, the following vulnerability has been resolved:
net: fec: Fix possible NPD in fec_enet_phy_reset_after_clk_enable()
The function of_phy_find_device may return NULL, so we need to take
care before dereferencing phy_dev.
In the Linux kernel, the following vulnerability has been resolved:
libceph: fix invalid accesses to ceph_connection_v1_info
There is a place where generic code in messenger.c is reading and
another place where it is writing to con->v1 union member without
checking that the union member is active (i.e. msgr1 is in use).
On 64-bit systems, con->v1.auth_retry overlaps with con->v2.out_iter,
so such a read is almost guaranteed to return a bogus value instead of
0 when msgr2 is in use. This ends up being fairly benign because the
side effect is just the invalidation of the authorizer and successive
fetching of new tickets.
con->v1.connect_seq overlaps with con->v2.conn_bufs and the fact that
it's being written to can cause more serious consequences, but luckily
it's not something that happens often.
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: ti: edma: Fix memory allocation size for queue_priority_map
Fix a critical memory allocation bug in edma_setup_from_hw() where
queue_priority_map was allocated with insufficient memory. The code
declared queue_priority_map as s8 (*)[2] (pointer to array of 2 s8),
but allocated memory using sizeof(s8) instead of the correct size.
This caused out-of-bounds memory writes when accessing:
queue_priority_map[i][0] = i;
queue_priority_map[i][1] = i;
The bug manifested as kernel crashes with "Oops - undefined instruction"
on ARM platforms (BeagleBoard-X15) during EDMA driver probe, as the
memory corruption triggered kernel hardening features on Clang.
Change the allocation to use sizeof(*queue_priority_map) which
automatically gets the correct size for the 2D array structure.
In the Linux kernel, the following vulnerability has been resolved:
can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB
can_put_echo_skb() takes ownership of the SKB and it may be freed
during or after the call.
However, xilinx_can xcan_write_frame() keeps using SKB after the call.
Fix that by only calling can_put_echo_skb() after the code is done
touching the SKB.
The tx_lock is held for the entire xcan_write_frame() execution and
also on the can_get_echo_skb() side so the order of operations does not
matter.
An earlier fix commit 3d3c817c3a40 ("can: xilinx_can: Fix usage of skb
memory") did not move the can_put_echo_skb() call far enough.
[mkl: add "commit" in front of sha1 in patch description]
[mkl: fix indention]
In the Linux kernel, the following vulnerability has been resolved:
wifi: brcmfmac: fix use-after-free when rescheduling brcmf_btcoex_info work
The brcmf_btcoex_detach() only shuts down the btcoex timer, if the
flag timer_on is false. However, the brcmf_btcoex_timerfunc(), which
runs as timer handler, sets timer_on to false. This creates critical
race conditions:
1.If brcmf_btcoex_detach() is called while brcmf_btcoex_timerfunc()
is executing, it may observe timer_on as false and skip the call to
timer_shutdown_sync().
2.The brcmf_btcoex_timerfunc() may then reschedule the brcmf_btcoex_info
worker after the cancel_work_sync() has been executed, resulting in
use-after-free bugs.
The use-after-free bugs occur in two distinct scenarios, depending on
the timing of when the brcmf_btcoex_info struct is freed relative to
the execution of its worker thread.
Scenario 1: Freed before the worker is scheduled
The brcmf_btcoex_info is deallocated before the worker is scheduled.
A race condition can occur when schedule_work(&bt_local->work) is
called after the target memory has been freed. The sequence of events
is detailed below:
CPU0 | CPU1
brcmf_btcoex_detach | brcmf_btcoex_timerfunc
| bt_local->timer_on = false;
if (cfg->btcoex->timer_on) |
... |
cancel_work_sync(); |
... |
kfree(cfg->btcoex); // FREE |
| schedule_work(&bt_local->work); // USE
Scenario 2: Freed after the worker is scheduled
The brcmf_btcoex_info is freed after the worker has been scheduled
but before or during its execution. In this case, statements within
the brcmf_btcoex_handler() — such as the container_of macro and
subsequent dereferences of the brcmf_btcoex_info object will cause
a use-after-free access. The following timeline illustrates this
scenario:
CPU0 | CPU1
brcmf_btcoex_detach | brcmf_btcoex_timerfunc
| bt_local->timer_on = false;
if (cfg->btcoex->timer_on) |
... |
cancel_work_sync(); |
... | schedule_work(); // Reschedule
|
kfree(cfg->btcoex); // FREE | brcmf_btcoex_handler() // Worker
/* | btci = container_of(....); // USE
The kfree() above could | ...
also occur at any point | btci-> // USE
during the worker's execution|
*/ |
To resolve the race conditions, drop the conditional check and call
timer_shutdown_sync() directly. It can deactivate the timer reliably,
regardless of its current state. Once stopped, the timer_on state is
then set to false.
In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: fix use-after-free in cmp_bss()
Following bss_free() quirk introduced in commit 776b3580178f
("cfg80211: track hidden SSID networks properly"), adjust
cfg80211_update_known_bss() to free the last beacon frame
elements only if they're not shared via the corresponding
'hidden_beacon_bss' pointer.
In the Linux kernel, the following vulnerability has been resolved:
fs: writeback: fix use-after-free in __mark_inode_dirty()
An use-after-free issue occurred when __mark_inode_dirty() get the
bdi_writeback that was in the progress of switching.
CPU: 1 PID: 562 Comm: systemd-random- Not tainted 6.6.56-gb4403bd46a8e #1
......
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mark_inode_dirty+0x124/0x418
lr : __mark_inode_dirty+0x118/0x418
sp : ffffffc08c9dbbc0
........
Call trace:
__mark_inode_dirty+0x124/0x418
generic_update_time+0x4c/0x60
file_modified+0xcc/0xd0
ext4_buffered_write_iter+0x58/0x124
ext4_file_write_iter+0x54/0x704
vfs_write+0x1c0/0x308
ksys_write+0x74/0x10c
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0xc0/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x40/0xe4
el0t_64_sync_handler+0x120/0x12c
el0t_64_sync+0x194/0x198
Root cause is:
systemd-random-seed kworker
----------------------------------------------------------------------
___mark_inode_dirty inode_switch_wbs_work_fn
spin_lock(&inode->i_lock);
inode_attach_wb
locked_inode_to_wb_and_lock_list
get inode->i_wb
spin_unlock(&inode->i_lock);
spin_lock(&wb->list_lock)
spin_lock(&inode->i_lock)
inode_io_list_move_locked
spin_unlock(&wb->list_lock)
spin_unlock(&inode->i_lock)
spin_lock(&old_wb->list_lock)
inode_do_switch_wbs
spin_lock(&inode->i_lock)
inode->i_wb = new_wb
spin_unlock(&inode->i_lock)
spin_unlock(&old_wb->list_lock)
wb_put_many(old_wb, nr_switched)
cgwb_release
old wb released
wb_wakeup_delayed() accesses wb,
then trigger the use-after-free
issue
Fix this race condition by holding inode spinlock until
wb_wakeup_delayed() finished.
In the Linux kernel, the following vulnerability has been resolved:
i40e: Fix potential invalid access when MAC list is empty
list_first_entry() never returns NULL - if the list is empty, it still
returns a pointer to an invalid object, leading to potential invalid
memory access when dereferenced.
Fix this by using list_first_entry_or_null instead of list_first_entry.