In the Linux kernel, the following vulnerability has been resolved:
xhci: Fix NULL pointer dereference when reading portli debugfs files
Michal reported and debgged a NULL pointer dereference bug in the
recently added portli debugfs files
Oops is caused when there are more port registers counted in
xhci->max_ports than ports reported by Supported Protocol capabilities.
This is possible if max_ports is more than maximum port number, or
if there are gaps between ports of different speeds the 'Supported
Protocol' capabilities.
In such cases port->rhub will be NULL so we can't reach xhci behind it.
Add an explicit NULL check for this case, and print portli in hex
without dereferencing port->rhub.
In the Linux kernel, the following vulnerability has been resolved:
usb: xhci: Fix memory leak in xhci_disable_slot()
xhci_alloc_command() allocates a command structure and, when the
second argument is true, also allocates a completion structure.
Currently, the error handling path in xhci_disable_slot() only frees
the command structure using kfree(), causing the completion structure
to leak.
Use xhci_free_command() instead of kfree(). xhci_free_command() correctly
frees both the command structure and the associated completion structure.
Since the command structure is allocated with zero-initialization,
command->in_ctx is NULL and will not be erroneously freed by
xhci_free_command().
This bug was found using an experimental static analysis tool we are
developing. The tool is based on the LLVM framework and is specifically
designed to detect memory management issues. It is currently under
active development and not yet publicly available, but we plan to
open-source it after our research is published.
The bug was originally detected on v6.13-rc1 using our static analysis
tool, and we have verified that the issue persists in the latest mainline
kernel.
We performed build testing on x86_64 with allyesconfig using GCC=11.4.0.
Since triggering these error paths in xhci_disable_slot() requires specific
hardware conditions or abnormal state, we were unable to construct a test
case to reliably trigger these specific error paths at runtime.
In the Linux kernel, the following vulnerability has been resolved:
rust_binder: avoid reading the written value in offsets array
When sending a transaction, its offsets array is first copied into the
target proc's vma, and then the values are read back from there. This is
normally fine because the vma is a read-only mapping, so the target
process cannot change the value under us.
However, if the target process somehow gains the ability to write to its
own vma, it could change the offset before it's read back, causing the
kernel to misinterpret what the sender meant. If the sender happens to
send a payload with a specific shape, this could in the worst case lead
to the receiver being able to privilege escalate into the sender.
The intent is that gaining the ability to change the read-only vma of
your own process should not be exploitable, so remove this TOCTOU read
even though it's unexploitable without another Binder bug.
In the Linux kernel, the following vulnerability has been resolved:
rust_binder: check ownership before using vma
When installing missing pages (or zapping them), Rust Binder will look
up the vma in the mm by address, and then call vm_insert_page (or
zap_page_range_single). However, if the vma is closed and replaced with
a different vma at the same address, this can lead to Rust Binder
installing pages into the wrong vma.
By installing the page into a writable vma, it becomes possible to write
to your own binder pages, which are normally read-only. Although you're
not supposed to be able to write to those pages, the intent behind the
design of Rust Binder is that even if you get that ability, it should not
lead to anything bad. Unfortunately, due to another bug, that is not the
case.
To fix this, store a pointer in vm_private_data and check that the vma
returned by vma_lookup() has the right vm_ops and vm_private_data before
trying to use the vma. This should ensure that Rust Binder will refuse
to interact with any other VMA. The plan is to introduce more vma
abstractions to avoid this unsafe access to vm_ops and vm_private_data,
but for now let's start with the simplest possible fix.
C Binder performs the same check in a slightly different way: it
provides a vm_ops->close that sets a boolean to true, then checks that
boolean after calling vma_lookup(), but this is more fragile
than the solution in this patch. (We probably still want to do both, but
the vm_ops->close callback will be added later as part of the follow-up
vma API changes.)
It's still possible to remap the vma so that pages appear in the right
vma, but at the wrong offset, but this is a separate issue and will be
fixed when Rust Binder gets a vm_ops->close callback.
In the Linux kernel, the following vulnerability has been resolved:
rust_binder: fix oneway spam detection
The spam detection logic in TreeRange was executed before the current
request was inserted into the tree. So the new request was not being
factored in the spam calculation. Fix this by moving the logic after
the new range has been inserted.
Also, the detection logic for ArrayRange was missing altogether which
meant large spamming transactions could get away without being detected.
Fix this by implementing an equivalent low_oneway_space() in ArrayRange.
Note that I looked into centralizing this logic in RangeAllocator but
iterating through 'state' and 'size' got a bit too complicated (for me)
and I abandoned this effort.
In the Linux kernel, the following vulnerability has been resolved:
ALSA: usb-audio: Check endpoint numbers at parsing Scarlett2 mixer interfaces
The Scarlett2 mixer quirk in USB-audio driver may hit a NULL
dereference when a malformed USB descriptor is passed, since it
assumes the presence of an endpoint in the parsed interface in
scarlett2_find_fc_interface(), as reported by fuzzer.
For avoiding the NULL dereference, just add the sanity check of
bNumEndpoints and skip the invalid interface.
In the Linux kernel, the following vulnerability has been resolved:
ceph: fix i_nlink underrun during async unlink
During async unlink, we drop the `i_nlink` counter before we receive
the completion (that will eventually update the `i_nlink`) because "we
assume that the unlink will succeed". That is not a bad idea, but it
races against deletions by other clients (or against the completion of
our own unlink) and can lead to an underrun which emits a WARNING like
this one:
WARNING: CPU: 85 PID: 25093 at fs/inode.c:407 drop_nlink+0x50/0x68
Modules linked in:
CPU: 85 UID: 3221252029 PID: 25093 Comm: php-cgi8.1 Not tainted 6.14.11-cm4all1-ampere #655
Hardware name: Supermicro ARS-110M-NR/R12SPD-A, BIOS 1.1b 10/17/2023
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : drop_nlink+0x50/0x68
lr : ceph_unlink+0x6c4/0x720
sp : ffff80012173bc90
x29: ffff80012173bc90 x28: ffff086d0a45aaf8 x27: ffff0871d0eb5680
x26: ffff087f2a64a718 x25: 0000020000000180 x24: 0000000061c88647
x23: 0000000000000002 x22: ffff07ff9236d800 x21: 0000000000001203
x20: ffff07ff9237b000 x19: ffff088b8296afc0 x18: 00000000f3c93365
x17: 0000000000070000 x16: ffff08faffcbdfe8 x15: ffff08faffcbdfec
x14: 0000000000000000 x13: 45445f65645f3037 x12: 34385f6369706f74
x11: 0000a2653104bb20 x10: ffffd85f26d73290 x9 : ffffd85f25664f94
x8 : 00000000000000c0 x7 : 0000000000000000 x6 : 0000000000000002
x5 : 0000000000000081 x4 : 0000000000000481 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff08727d3f91e8
Call trace:
drop_nlink+0x50/0x68 (P)
vfs_unlink+0xb0/0x2e8
do_unlinkat+0x204/0x288
__arm64_sys_unlinkat+0x3c/0x80
invoke_syscall.constprop.0+0x54/0xe8
do_el0_svc+0xa4/0xc8
el0_svc+0x18/0x58
el0t_64_sync_handler+0x104/0x130
el0t_64_sync+0x154/0x158
In ceph_unlink(), a call to ceph_mdsc_submit_request() submits the
CEPH_MDS_OP_UNLINK to the MDS, but does not wait for completion.
Meanwhile, between this call and the following drop_nlink() call, a
worker thread may process a CEPH_CAP_OP_IMPORT, CEPH_CAP_OP_GRANT or
just a CEPH_MSG_CLIENT_REPLY (the latter of which could be our own
completion). These will lead to a set_nlink() call, updating the
`i_nlink` counter to the value received from the MDS. If that new
`i_nlink` value happens to be zero, it is illegal to decrement it
further. But that is exactly what ceph_unlink() will do then.
The WARNING can be reproduced this way:
1. Force async unlink; only the async code path is affected. Having
no real clue about Ceph internals, I was unable to find out why the
MDS wouldn't give me the "Fxr" capabilities, so I patched
get_caps_for_async_unlink() to always succeed.
(Note that the WARNING dump above was found on an unpatched kernel,
without this kludge - this is not a theoretical bug.)
2. Add a sleep call after ceph_mdsc_submit_request() so the unlink
completion gets handled by a worker thread before drop_nlink() is
called. This guarantees that the `i_nlink` is already zero before
drop_nlink() runs.
The solution is to skip the counter decrement when it is already zero,
but doing so without a lock is still racy (TOCTOU). Since
ceph_fill_inode() and handle_cap_grant() both hold the
`ceph_inode_info.i_ceph_lock` spinlock while set_nlink() runs, this
seems like the proper lock to protect the `i_nlink` updates.
I found prior art in NFS and SMB (using `inode.i_lock`) and AFS (using
`afs_vnode.cb_lock`). All three have the zero check as well.
In the Linux kernel, the following vulnerability has been resolved:
usb: gadget: f_ncm: Fix net_device lifecycle with device_move
The network device outlived its parent gadget device during
disconnection, resulting in dangling sysfs links and null pointer
dereference problems.
A prior attempt to solve this by removing SET_NETDEV_DEV entirely [1]
was reverted due to power management ordering concerns and a NO-CARRIER
regression.
A subsequent attempt to defer net_device allocation to bind [2] broke
1:1 mapping between function instance and network device, making it
impossible for configfs to report the resolved interface name. This
results in a regression where the DHCP server fails on pmOS.
Use device_move to reparent the net_device between the gadget device and
/sys/devices/virtual/ across bind/unbind cycles. This preserves the
network interface across USB reconnection, allowing the DHCP server to
retain their binding.
Introduce gether_attach_gadget()/gether_detach_gadget() helpers and use
__free(detach_gadget) macro to undo attachment on bind failure. The
bind_count ensures device_move executes only on the first bind.
[1] https://lore.kernel.org/lkml/f2a4f9847617a0929d62025748384092e5f35cce.camel@crapouillou.net/
[2] https://lore.kernel.org/linux-usb/795ea759-7eaf-4f78-81f4-01ffbf2d7961@ixit.cz/
In the Linux kernel, the following vulnerability has been resolved:
usb: legacy: ncm: Fix NPE in gncm_bind
Commit 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle
with bind/unbind") deferred the allocation of the net_device. This
change leads to a NULL pointer dereference in the legacy NCM driver as
it attempts to access the net_device before it's fully instantiated.
Store the provided qmult, host_addr, and dev_addr into the struct
ncm_opts->net_opts during gncm_bind(). These values will be properly
applied to the net_device when it is allocated and configured later in
the binding process by the NCM function driver.
In the Linux kernel, the following vulnerability has been resolved:
usb: gadget: f_ncm: Fix atomic context locking issue
The ncm_set_alt function was holding a mutex to protect against races
with configfs, which invokes the might-sleep function inside an atomic
context.
Remove the struct net_device pointer from the f_ncm_opts structure to
eliminate the contention. The connection state is now managed by a new
boolean flag to preserve the use-after-free fix from
commit 6334b8e4553c ("usb: gadget: f_ncm: Fix UAF ncm object at re-bind
after usb ep transport error").
BUG: sleeping function called from invalid context
Call Trace:
dump_stack_lvl+0x83/0xc0
dump_stack+0x14/0x16
__might_resched+0x389/0x4c0
__might_sleep+0x8e/0x100
...
__mutex_lock+0x6f/0x1740
...
ncm_set_alt+0x209/0xa40
set_config+0x6b6/0xb40
composite_setup+0x734/0x2b40
...