In the Linux kernel, the following vulnerability has been resolved:
serial: caif: fix use-after-free in caif_serial ldisc_close()
There is a use-after-free bug in caif_serial where handle_tx() may
access ser->tty after the tty has been freed.
The race condition occurs between ldisc_close() and packet transmission:
CPU 0 (close) CPU 1 (xmit)
------------- ------------
ldisc_close()
tty_kref_put(ser->tty)
[tty may be freed here]
<-- race window -->
caif_xmit()
handle_tx()
tty = ser->tty // dangling ptr
tty->ops->write() // UAF!
schedule_work()
ser_release()
unregister_netdevice()
The root cause is that tty_kref_put() is called in ldisc_close() while
the network device is still active and can receive packets.
Since ser and tty have a 1:1 binding relationship with consistent
lifecycles (ser is allocated in ldisc_open and freed in ser_release
via unregister_netdevice, and each ser binds exactly one tty), we can
safely defer the tty reference release to ser_release() where the
network device is unregistered.
Fix this by moving tty_kref_put() from ldisc_close() to ser_release(),
after unregister_netdevice(). This ensures the tty reference is held
as long as the network device exists, preventing the UAF.
Note: We save ser->tty before unregister_netdevice() because ser is
embedded in netdev's private data and will be freed along with netdev
(needs_free_netdev = true).
How to reproduce: Add mdelay(500) at the beginning of ldisc_close()
to widen the race window, then run the reproducer program [1].
Note: There is a separate deadloop issue in handle_tx() when using
PORT_UNKNOWN serial ports (e.g., /dev/ttyS3 in QEMU without proper
serial backend). This deadloop exists even without this patch,
and is likely caused by inconsistency between uart_write_room() and
uart_write() in serial core. It has been addressed in a separate
patch [2].
KASAN report:
==================================================================
BUG: KASAN: slab-use-after-free in handle_tx+0x5d1/0x620
Read of size 1 at addr ffff8881131e1490 by task caif_uaf_trigge/9929
Call Trace:
<TASK>
dump_stack_lvl+0x10e/0x1f0
print_report+0xd0/0x630
kasan_report+0xe4/0x120
handle_tx+0x5d1/0x620
dev_hard_start_xmit+0x9d/0x6c0
__dev_queue_xmit+0x6e2/0x4410
packet_xmit+0x243/0x360
packet_sendmsg+0x26cf/0x5500
__sys_sendto+0x4a3/0x520
__x64_sys_sendto+0xe0/0x1c0
do_syscall_64+0xc9/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f615df2c0d7
Allocated by task 9930:
Freed by task 64:
Last potentially related work creation:
The buggy address belongs to the object at ffff8881131e1000
which belongs to the cache kmalloc-cg-2k of size 2048
The buggy address is located 1168 bytes inside of
freed 2048-byte region [ffff8881131e1000, ffff8881131e1800)
The buggy address belongs to the physical page:
page_owner tracks the page as allocated
page last free pid 9778 tgid 9778 stack trace:
Memory state around the buggy address:
ffff8881131e1380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8881131e1400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8881131e1480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8881131e1500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8881131e1580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
[1]: https://gist.github.com/mrpre/f683f244544f7b11e7fa87df9e6c2eeb
[2]: https://lore.kernel.org/linux-serial/20260204074327.226165-1-jiayuan.chen@linux.dev/T/#u
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix double free in rxe_srq_from_init
In rxe_srq_from_init(), the queue pointer 'q' is assigned to
'srq->rq.queue' before copying the SRQ number to user space.
If copy_to_user() fails, the function calls rxe_queue_cleanup()
to free the queue, but leaves the now-invalid pointer in
'srq->rq.queue'.
The caller of rxe_srq_from_init() (rxe_create_srq) eventually
calls rxe_srq_cleanup() upon receiving the error, which triggers
a second rxe_queue_cleanup() on the same memory, leading to a
double free.
The call trace looks like this:
kmem_cache_free+0x.../0x...
rxe_queue_cleanup+0x1a/0x30 [rdma_rxe]
rxe_srq_cleanup+0x42/0x60 [rdma_rxe]
rxe_elem_release+0x31/0x70 [rdma_rxe]
rxe_create_srq+0x12b/0x1a0 [rdma_rxe]
ib_create_srq_user+0x9a/0x150 [ib_core]
Fix this by moving 'srq->rq.queue = q' after copy_to_user.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/uverbs: Validate wqe_size before using it in ib_uverbs_post_send
ib_uverbs_post_send() uses cmd.wqe_size from userspace without any
validation before passing it to kmalloc() and using the allocated
buffer as struct ib_uverbs_send_wr.
If a user provides a small wqe_size value (e.g., 1), kmalloc() will
succeed, but subsequent accesses to user_wr->opcode, user_wr->num_sge,
and other fields will read beyond the allocated buffer, resulting in
an out-of-bounds read from kernel heap memory. This could potentially
leak sensitive kernel information to userspace.
Additionally, providing an excessively large wqe_size can trigger a
WARNING in the memory allocation path, as reported by syzkaller.
This is inconsistent with ib_uverbs_unmarshall_recv() which properly
validates that wqe_size >= sizeof(struct ib_uverbs_recv_wr) before
proceeding.
Add the same validation for ib_uverbs_post_send() to ensure wqe_size
is at least sizeof(struct ib_uverbs_send_wr).
In the Linux kernel, the following vulnerability has been resolved:
scsi: csiostor: Fix dereference of null pointer rn
The error exit path when rn is NULL ends up deferencing the null pointer rn
via the use of the macro CSIO_INC_STATS. Fix this by adding a new error
return path label after the use of the macro to avoid the deference.
In the Linux kernel, the following vulnerability has been resolved:
ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1
When allocating initialized blocks from a large unwritten extent, or
when splitting an unwritten extent during end I/O and converting it to
initialized, there is currently a potential issue of stale data if the
extent needs to be split in the middle.
0 A B N
[UUUUUUUUUUUU] U: unwritten extent
[--DDDDDDDD--] D: valid data
|<- ->| ----> this range needs to be initialized
ext4_split_extent() first try to split this extent at B with
EXT4_EXT_DATA_ENTIRE_VALID1 and EXT4_EXT_MAY_ZEROOUT flag set, but
ext4_split_extent_at() failed to split this extent due to temporary lack
of space. It zeroout B to N and mark the entire extent from 0 to N
as written.
0 A B N
[WWWWWWWWWWWW] W: written extent
[SSDDDDDDDDZZ] Z: zeroed, S: stale data
ext4_split_extent() then try to split this extent at A with
EXT4_EXT_DATA_VALID2 flag set. This time, it split successfully and left
a stale written extent from 0 to A.
0 A B N
[WW|WWWWWWWWWW]
[SS|DDDDDDDDZZ]
Fix this by pass EXT4_EXT_DATA_PARTIAL_VALID1 to ext4_split_extent_at()
when splitting at B, don't convert the entire extent to written and left
it as unwritten after zeroing out B to N. The remaining work is just
like the standard two-part split. ext4_split_extent() will pass the
EXT4_EXT_DATA_VALID2 flag when it calls ext4_split_extent_at() for the
second time, allowing it to properly handle the split. If the split is
successful, it will keep extent from 0 to A as unwritten.
In the Linux kernel, the following vulnerability has been resolved:
apparmor: fix NULL sock in aa_sock_file_perm
Deal with the potential that sock and sock-sk can be NULL during
socket setup or teardown. This could lead to an oops. The fix for NULL
pointer dereference in __unix_needs_revalidation shows this is at
least possible for af_unix sockets. While the fix for af_unix sockets
applies for newer mediation this is still the fall back path for older
af_unix mediation and other sockets, so ensure it is covered.
In the Linux kernel, the following vulnerability has been resolved:
ipvs: skip ipv6 extension headers for csum checks
Protocol checksum validation fails for IPv6 if there are extension
headers before the protocol header. iph->len already contains its
offset, so use it to fix the problem.
In the Linux kernel, the following vulnerability has been resolved:
smack: /smack/doi: accept previously used values
Writing to /smack/doi a value that has ever been
written there in the past disables networking for
non-ambient labels.
E.g.
# cat /smack/doi
3
# netlabelctl -p cipso list
Configured CIPSO mappings (1)
DOI value : 3
mapping type : PASS_THROUGH
# netlabelctl -p map list
Configured NetLabel domain mappings (3)
domain: "_" (IPv4)
protocol: UNLABELED
domain: DEFAULT (IPv4)
protocol: CIPSO, DOI = 3
domain: DEFAULT (IPv6)
protocol: UNLABELED
# cat /smack/ambient
_
# cat /proc/$$/attr/smack/current
_
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.964 ms
# echo foo >/proc/$$/attr/smack/current
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.956 ms
unknown option 86
# echo 4 >/smack/doi
# echo 3 >/smack/doi
!> [ 214.050395] smk_cipso_doi:691 cipso add rc = -17
# echo 3 >/smack/doi
!> [ 249.402261] smk_cipso_doi:678 remove rc = -2
!> [ 249.402261] smk_cipso_doi:691 cipso add rc = -17
# ping -c1 10.1.95.12
!!> ping: 10.1.95.12: Address family for hostname not supported
# echo _ >/proc/$$/attr/smack/current
# ping -c1 10.1.95.12
64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.617 ms
This happens because Smack keeps decommissioned DOIs,
fails to re-add them, and consequently refuses to add
the “default” domain map:
# netlabelctl -p cipso list
Configured CIPSO mappings (2)
DOI value : 3
mapping type : PASS_THROUGH
DOI value : 4
mapping type : PASS_THROUGH
# netlabelctl -p map list
Configured NetLabel domain mappings (2)
domain: "_" (IPv4)
protocol: UNLABELED
!> (no ipv4 map for default domain here)
domain: DEFAULT (IPv6)
protocol: UNLABELED
Fix by clearing decommissioned DOI definitions and
serializing concurrent DOI updates with a new lock.
Also:
- allow /smack/doi to live unconfigured, since
adding a map (netlbl_cfg_cipsov4_map_add) may fail.
CIPSO_V4_DOI_UNKNOWN(0) indicates the unconfigured DOI
- add new DOI before removing the old default map,
so the old map remains if the add fails
(2008-02-04, Casey Schaufler)
In the Linux kernel, the following vulnerability has been resolved:
bpf: fix end-of-list detection in cgroup_storage_get_next_key()
list_next_entry() never returns NULL -- when the current element is the
last entry it wraps to the list head via container_of(). The subsequent
NULL check is therefore dead code and get_next_key() never returns
-ENOENT for the last element, instead reading storage->key from a bogus
pointer that aliases internal map fields and copying the result to
userspace.
Replace it with list_entry_is_head() so the function correctly returns
-ENOENT when there are no more entries.
In the Linux kernel, the following vulnerability has been resolved:
bpf: reject negative CO-RE accessor indices in bpf_core_parse_spec()
CO-RE accessor strings are colon-separated indices that describe a path
from a root BTF type to a target field, e.g. "0:1:2" walks through
nested struct members. bpf_core_parse_spec() parses each component with
sscanf("%d"), so negative values like -1 are silently accepted. The
subsequent bounds checks (access_idx >= btf_vlen(t)) only guard the
upper bound and always pass for negative values because C integer
promotion converts the __u16 btf_vlen result to int, making the
comparison (int)(-1) >= (int)(N) false for any positive N.
When -1 reaches btf_member_bit_offset() it gets cast to u32 0xffffffff,
producing an out-of-bounds read far past the members array. A crafted
BPF program with a negative CO-RE accessor on any struct that exists in
vmlinux BTF (e.g. task_struct) crashes the kernel deterministically
during BPF_PROG_LOAD on any system with CONFIG_DEBUG_INFO_BTF=y
(default on major distributions). The bug is reachable with CAP_BPF:
BUG: unable to handle page fault for address: ffffed11818b6626
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Oops: Oops: 0000 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 85 Comm: poc Not tainted 7.0.0-rc6 #18 PREEMPT(full)
RIP: 0010:bpf_core_parse_spec (tools/lib/bpf/relo_core.c:354)
RAX: 00000000ffffffff
Call Trace:
<TASK>
bpf_core_calc_relo_insn (tools/lib/bpf/relo_core.c:1321)
bpf_core_apply (kernel/bpf/btf.c:9507)
check_core_relo (kernel/bpf/verifier.c:19475)
bpf_check (kernel/bpf/verifier.c:26031)
bpf_prog_load (kernel/bpf/syscall.c:3089)
__sys_bpf (kernel/bpf/syscall.c:6228)
</TASK>
CO-RE accessor indices are inherently non-negative (struct member index,
array element index, or enumerator index), so reject them immediately
after parsing.