In the Linux kernel, the following vulnerability has been resolved:
macvlan: fix error recovery in macvlan_common_newlink()
valis provided a nice repro to crash the kernel:
ip link add p1 type veth peer p2
ip link set address 00:00:00:00:00:20 dev p1
ip link set up dev p1
ip link set up dev p2
ip link add mv0 link p2 type macvlan mode source
ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20
ping -c1 -I p1 1.2.3.4
He also gave a very detailed analysis:
<quote valis>
The issue is triggered when a new macvlan link is created with
MACVLAN_MODE_SOURCE mode and MACVLAN_MACADDR_ADD (or
MACVLAN_MACADDR_SET) parameter, lower device already has a macvlan
port and register_netdevice() called from macvlan_common_newlink()
fails (e.g. because of the invalid link name).
In this case macvlan_hash_add_source is called from
macvlan_change_sources() / macvlan_common_newlink():
This adds a reference to vlan to the port's vlan_source_hash using
macvlan_source_entry.
vlan is a pointer to the priv data of the link that is being created.
When register_netdevice() fails, the error is returned from
macvlan_newlink() to rtnl_newlink_create():
if (ops->newlink)
err = ops->newlink(dev, ¶ms, extack);
else
err = register_netdevice(dev);
if (err < 0) {
free_netdev(dev);
goto out;
}
and free_netdev() is called, causing a kvfree() on the struct
net_device that is still referenced in the source entry attached to
the lower device's macvlan port.
Now all packets sent on the macvlan port with a matching source mac
address will trigger a use-after-free in macvlan_forward_source().
</quote valis>
With all that, my fix is to make sure we call macvlan_flush_sources()
regardless of @create value whenever "goto destroy_macvlan_port;"
path is taken.
Many thanks to valis for following up on this issue.
In the Linux kernel, the following vulnerability has been resolved:
scsi: target: iscsi: Fix use-after-free in iscsit_dec_session_usage_count()
In iscsit_dec_session_usage_count(), the function calls complete() while
holding the sess->session_usage_lock. Similar to the connection usage count
logic, the waiter signaled by complete() (e.g., in the session release
path) may wake up and free the iscsit_session structure immediately.
This creates a race condition where the current thread may attempt to
execute spin_unlock_bh() on a session structure that has already been
deallocated, resulting in a KASAN slab-use-after-free.
To resolve this, release the session_usage_lock before calling complete()
to ensure all dereferences of the sess pointer are finished before the
waiter is allowed to proceed with deallocation.
In the Linux kernel, the following vulnerability has been resolved:
KVM: Don't clobber irqfd routing type when deassigning irqfd
When deassigning a KVM_IRQFD, don't clobber the irqfd's copy of the IRQ's
routing entry as doing so breaks kvm_arch_irq_bypass_del_producer() on x86
and arm64, which explicitly look for KVM_IRQ_ROUTING_MSI. Instead, to
handle a concurrent routing update, verify that the irqfd is still active
before consuming the routing information. As evidenced by the x86 and
arm64 bugs, and another bug in kvm_arch_update_irqfd_routing() (see below),
clobbering the entry type without notifying arch code is surprising and
error prone.
As a bonus, checking that the irqfd is active provides a convenient
location for documenting _why_ KVM must not consume the routing entry for
an irqfd that is in the process of being deassigned: once the irqfd is
deleted from the list (which happens *before* the eventfd is detached), it
will no longer receive updates via kvm_irq_routing_update(), and so KVM
could deliver an event using stale routing information (relative to
KVM_SET_GSI_ROUTING returning to userspace).
As an even better bonus, explicitly checking for the irqfd being active
fixes a similar bug to the one the clobbering is trying to prevent: if an
irqfd is deactivated, and then its routing is changed,
kvm_irq_routing_update() won't invoke kvm_arch_update_irqfd_routing()
(because the irqfd isn't in the list). And so if the irqfd is in bypass
mode, IRQs will continue to be posted using the old routing information.
As for kvm_arch_irq_bypass_del_producer(), clobbering the routing type
results in KVM incorrectly keeping the IRQ in bypass mode, which is
especially problematic on AMD as KVM tracks IRQs that are being posted to
a vCPU in a list whose lifetime is tied to the irqfd.
Without the help of KASAN to detect use-after-free, the most common
sympton on AMD is a NULL pointer deref in amd_iommu_update_ga() due to
the memory for irqfd structure being re-allocated and zeroed, resulting
in irqfd->irq_bypass_data being NULL when read by
avic_update_iommu_vcpu_affinity():
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 40cf2b9067 P4D 40cf2b9067 PUD 408362a067 PMD 0
Oops: Oops: 0000 [#1] SMP
CPU: 6 UID: 0 PID: 40383 Comm: vfio_irq_test
Tainted: G U W O 6.19.0-smp--5dddc257e6b2-irqfd #31 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025
RIP: 0010:amd_iommu_update_ga+0x19/0xe0
Call Trace:
<TASK>
avic_update_iommu_vcpu_affinity+0x3d/0x90 [kvm_amd]
__avic_vcpu_load+0xf4/0x130 [kvm_amd]
kvm_arch_vcpu_load+0x89/0x210 [kvm]
vcpu_load+0x30/0x40 [kvm]
kvm_arch_vcpu_ioctl_run+0x45/0x620 [kvm]
kvm_vcpu_ioctl+0x571/0x6a0 [kvm]
__se_sys_ioctl+0x6d/0xb0
do_syscall_64+0x6f/0x9d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x46893b
</TASK>
---[ end trace 0000000000000000 ]---
If AVIC is inhibited when the irfd is deassigned, the bug will manifest as
list corruption, e.g. on the next irqfd assignment.
list_add corruption. next->prev should be prev (ffff8d474d5cd588),
but was 0000000000000000. (next=ffff8d8658f86530).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
Oops: invalid opcode: 0000 [#1] SMP
CPU: 128 UID: 0 PID: 80818 Comm: vfio_irq_test
Tainted: G U W O 6.19.0-smp--f19dc4d680ba-irqfd #28 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025
RIP: 0010:__list_add_valid_or_report+0x97/0xc0
Call Trace:
<TASK>
avic_pi_update_irte+0x28e/0x2b0 [kvm_amd]
kvm_pi_update_irte+0xbf/0x190 [kvm]
kvm_arch_irq_bypass_add_producer+0x72/0x90 [kvm]
irq_bypass_register_consumer+0xcd/0x170 [irqbypa
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
pmdomain: imx8m-blk-ctrl: fix out-of-range access of bc->domains
Fix out-of-range access of bc->domains in imx8m_blk_ctrl_remove().
In the Linux kernel, the following vulnerability has been resolved:
ALSA: aloop: Fix racy access at PCM trigger
The PCM trigger callback of aloop driver tries to check the PCM state
and stop the stream of the tied substream in the corresponding cable.
Since both check and stop operations are performed outside the cable
lock, this may result in UAF when a program attempts to trigger
frequently while opening/closing the tied stream, as spotted by
fuzzers.
For addressing the UAF, this patch changes two things:
- It covers the most of code in loopback_check_format() with
cable->lock spinlock, and add the proper NULL checks. This avoids
already some racy accesses.
- In addition, now we try to check the state of the capture PCM stream
that may be stopped in this function, which was the major pain point
leading to UAF.
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: mmp_pdma: Fix race condition in mmp_pdma_residue()
Add proper locking in mmp_pdma_residue() to prevent use-after-free when
accessing descriptor list and descriptor contents.
The race occurs when multiple threads call tx_status() while the tasklet
on another CPU is freeing completed descriptors:
CPU 0 CPU 1
----- -----
mmp_pdma_tx_status()
mmp_pdma_residue()
-> NO LOCK held
list_for_each_entry(sw, ..)
DMA interrupt
dma_do_tasklet()
-> spin_lock(&desc_lock)
list_move(sw->node, ...)
spin_unlock(&desc_lock)
| dma_pool_free(sw) <- FREED!
-> access sw->desc <- UAF!
This issue can be reproduced when running dmatest on the same channel with
multiple threads (threads_per_chan > 1).
Fix by protecting the chain_running list iteration and descriptor access
with the chan->desc_lock spinlock.
In the Linux kernel, the following vulnerability has been resolved:
wifi: wlcore: ensure skb headroom before skb_push
This avoids occasional skb_under_panic Oops from wl1271_tx_work. In this case, headroom is
less than needed (typically 110 - 94 = 16 bytes).
In the Linux kernel, the following vulnerability has been resolved:
flex_proportions: make fprop_new_period() hardirq safe
Bernd has reported a lockdep splat from flexible proportions code that is
essentially complaining about the following race:
<timer fires>
run_timer_softirq - we are in softirq context
call_timer_fn
writeout_period
fprop_new_period
write_seqcount_begin(&p->sequence);
<hardirq is raised>
...
blk_mq_end_request()
blk_update_request()
ext4_end_bio()
folio_end_writeback()
__wb_writeout_add()
__fprop_add_percpu_max()
if (unlikely(max_frac < FPROP_FRAC_BASE)) {
fprop_fraction_percpu()
seq = read_seqcount_begin(&p->sequence);
- sees odd sequence so loops indefinitely
Note that a deadlock like this is only possible if the bdi has configured
maximum fraction of writeout throughput which is very rare in general but
frequent for example for FUSE bdis. To fix this problem we have to make
sure write section of the sequence counter is irqsafe.
In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix race in mptcp_pm_nl_flush_addrs_doit()
syzbot and Eulgyu Kim reported crashes in mptcp_pm_nl_get_local_id()
and/or mptcp_pm_nl_is_backup()
Root cause is list_splice_init() in mptcp_pm_nl_flush_addrs_doit()
which is not RCU ready.
list_splice_init_rcu() can not be called here while holding pernet->lock
spinlock.
Many thanks to Eulgyu Kim for providing a repro and testing our patches.