In the Linux kernel, the following vulnerability has been resolved:
dmaengine: idxd: fix device leaks on compat bind and unbind
Make sure to drop the reference taken when looking up the idxd device as
part of the compat bind and unbind sysfs interface.
In the Linux kernel, the following vulnerability has been resolved:
net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is
called only when the timer is enabled, we need to call
j1939_session_deactivate_activate_next() if we cancelled the timer.
Otherwise, refcount for j1939_session leaks, which will later appear as
| unregister_netdevice: waiting for vcan0 to become free. Usage count = 2.
problem.
In the Linux kernel, the following vulnerability has been resolved:
nvme-tcp: fix NULL pointer dereferences in nvmet_tcp_build_pdu_iovec
Commit efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length")
added ttag bounds checking and data_offset
validation in nvmet_tcp_handle_h2c_data_pdu(), but it did not validate
whether the command's data structures (cmd->req.sg and cmd->iov) have
been properly initialized before processing H2C_DATA PDUs.
The nvmet_tcp_build_pdu_iovec() function dereferences these pointers
without NULL checks. This can be triggered by sending H2C_DATA PDU
immediately after the ICREQ/ICRESP handshake, before
sending a CONNECT command or NVMe write command.
Attack vectors that trigger NULL pointer dereferences:
1. H2C_DATA PDU sent before CONNECT → both pointers NULL
2. H2C_DATA PDU for READ command → cmd->req.sg allocated, cmd->iov NULL
3. H2C_DATA PDU for uninitialized command slot → both pointers NULL
The fix validates both cmd->req.sg and cmd->iov before calling
nvmet_tcp_build_pdu_iovec(). Both checks are required because:
- Uninitialized commands: both NULL
- READ commands: cmd->req.sg allocated, cmd->iov NULL
- WRITE commands: both allocated
In the Linux kernel, the following vulnerability has been resolved:
net/sched: sch_qfq: do not free existing class in qfq_change_class()
Fixes qfq_change_class() error case.
cl->qdisc and cl should only be freed if a new class and qdisc
were allocated, or we risk various UAF.
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix crash on profile change rollback failure
mlx5e_netdev_change_profile can fail to attach a new profile and can
fail to rollback to old profile, in such case, we could end up with a
dangling netdev with a fully reset netdev_priv. A retry to change
profile, e.g. another attempt to call mlx5e_netdev_change_profile via
switchdev mode change, will crash trying to access the now NULL
priv->mdev.
This fix allows mlx5e_netdev_change_profile() to handle previous
failures and an empty priv, by not assuming priv is valid.
Pass netdev and mdev to all flows requiring
mlx5e_netdev_change_profile() and avoid passing priv.
In mlx5e_netdev_change_profile() check if current priv is valid, and if
not, just attach the new profile without trying to access the old one.
This fixes the following oops, when enabling switchdev mode for the 2nd
time after first time failure:
## Enabling switchdev mode first time:
mlx5_core 0012:03:00.1: E-Switch: Supported tc chains and prios offload
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12
workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12
mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12
^^^^^^^^
mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
## retry: Enabling switchdev mode 2nd time:
mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload
BUG: kernel NULL pointer dereference, address: 0000000000000038
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 13 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc4+ #91 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:mlx5e_detach_netdev+0x3c/0x90
Code: 50 00 00 f0 80 4f 78 02 48 8b bf e8 07 00 00 48 85 ff 74 16 48 8b 73 78 48 d1 ee 83 e6 01 83 f6 01 40 0f b6 f6 e8 c4 42 00 00 <48> 8b 45 38 48 85 c0 74 08 48 89 df e8 cc 47 40 1e 48 8b bb f0 07
RSP: 0018:ffffc90000673890 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8881036a89c0 RCX: 0000000000000000
RDX: ffff888113f63800 RSI: ffffffff822fe720 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000002dcd R09: 0000000000000000
R10: ffffc900006738e8 R11: 00000000ffffffff R12: 0000000000000000
R13: 0000000000000000 R14: ffff8881036a89c0 R15: 0000000000000000
FS: 00007fdfb8384740(0000) GS:ffff88856a9d6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 0000000112ae0005 CR4: 0000000000370ef0
Call Trace:
<TASK>
mlx5e_netdev_change_profile+0x45/0xb0
mlx5e_vport_rep_load+0x27b/0x2d0
mlx5_esw_offloads_rep_load+0x72/0xf0
esw_offloads_enable+0x5d0/0x970
mlx5_eswitch_enable_locked+0x349/0x430
? is_mp_supported+0x57/0xb0
mlx5_devlink_eswitch_mode_set+0x26b/0x430
devlink_nl_eswitch_set_doit+0x6f/0xf0
genl_family_rcv_msg_doit+0xe8/0x140
genl_rcv_msg+0x18b/0x290
? __pfx_devlink_nl_pre_doit+0x10/0x10
? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10
? __pfx_devlink_nl_post_doit+0x10/0x10
? __pfx_genl_rcv_msg+0x10/0x10
netlink_rcv_skb+0x52/0x100
genl_rcv+0x28/0x40
netlink_unicast+0x282/0x3e0
? __alloc_skb+0xd6/0x190
netlink_sendmsg+0x1f7/0x430
__sys_sendto+0x213/0x220
? __sys_recvmsg+0x6a/0xd0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x50/0x1f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fdfb8495047
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: tegra-adma: Fix use-after-free
A use-after-free bug exists in the Tegra ADMA driver when audio streams
are terminated, particularly during XRUN conditions. The issue occurs
when the DMA buffer is freed by tegra_adma_terminate_all() before the
vchan completion tasklet finishes accessing it.
The race condition follows this sequence:
1. DMA transfer completes, triggering an interrupt that schedules the
completion tasklet (tasklet has not executed yet)
2. Audio playback stops, calling tegra_adma_terminate_all() which
frees the DMA buffer memory via kfree()
3. The scheduled tasklet finally executes, calling vchan_complete()
which attempts to access the already-freed memory
Since tasklets can execute at any time after being scheduled, there is
no guarantee that the buffer will remain valid when vchan_complete()
runs.
Fix this by properly synchronizing the virtual channel completion:
- Calling vchan_terminate_vdesc() in tegra_adma_stop() to mark the
descriptors as terminated instead of freeing the descriptor.
- Add the callback tegra_adma_synchronize() that calls
vchan_synchronize() which kills any pending tasklets and frees any
terminated descriptors.
Crash logs:
[ 337.427523] BUG: KASAN: use-after-free in vchan_complete+0x124/0x3b0
[ 337.427544] Read of size 8 at addr ffff000132055428 by task swapper/0/0
[ 337.427562] Call trace:
[ 337.427564] dump_backtrace+0x0/0x320
[ 337.427571] show_stack+0x20/0x30
[ 337.427575] dump_stack_lvl+0x68/0x84
[ 337.427584] print_address_description.constprop.0+0x74/0x2b8
[ 337.427590] kasan_report+0x1f4/0x210
[ 337.427598] __asan_load8+0xa0/0xd0
[ 337.427603] vchan_complete+0x124/0x3b0
[ 337.427609] tasklet_action_common.constprop.0+0x190/0x1d0
[ 337.427617] tasklet_action+0x30/0x40
[ 337.427623] __do_softirq+0x1a0/0x5c4
[ 337.427628] irq_exit+0x110/0x140
[ 337.427633] handle_domain_irq+0xa4/0xe0
[ 337.427640] gic_handle_irq+0x64/0x160
[ 337.427644] call_on_irq_stack+0x20/0x4c
[ 337.427649] do_interrupt_handler+0x7c/0x90
[ 337.427654] el1_interrupt+0x30/0x80
[ 337.427659] el1h_64_irq_handler+0x18/0x30
[ 337.427663] el1h_64_irq+0x7c/0x80
[ 337.427667] cpuidle_enter_state+0xe4/0x540
[ 337.427674] cpuidle_enter+0x54/0x80
[ 337.427679] do_idle+0x2e0/0x380
[ 337.427685] cpu_startup_entry+0x2c/0x70
[ 337.427690] rest_init+0x114/0x130
[ 337.427695] arch_call_rest_init+0x18/0x24
[ 337.427702] start_kernel+0x380/0x3b4
[ 337.427706] __primary_switched+0xc0/0xc8
In the Linux kernel, the following vulnerability has been resolved:
dm-verity: disable recursive forward error correction
There are two problems with the recursive correction:
1. It may cause denial-of-service. In fec_read_bufs, there is a loop that
has 253 iterations. For each iteration, we may call verity_hash_for_block
recursively. There is a limit of 4 nested recursions - that means that
there may be at most 253^4 (4 billion) iterations. Red Hat QE team
actually created an image that pushes dm-verity to this limit - and this
image just makes the udev-worker process get stuck in the 'D' state.
2. It doesn't work. In fec_read_bufs we store data into the variable
"fio->bufs", but fio bufs is shared between recursive invocations, if
"verity_hash_for_block" invoked correction recursively, it would
overwrite partially filled fio->bufs.
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: avoid chain re-validation if possible
Hamza Mahfooz reports cpu soft lock-ups in
nft_chain_validate():
watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [iptables-nft-re:37547]
[..]
RIP: 0010:nft_chain_validate+0xcb/0x110 [nf_tables]
[..]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_immediate_validate+0x36/0x50 [nf_tables]
nft_chain_validate+0xc9/0x110 [nf_tables]
nft_table_validate+0x6b/0xb0 [nf_tables]
nf_tables_validate+0x8b/0xa0 [nf_tables]
nf_tables_commit+0x1df/0x1eb0 [nf_tables]
[..]
Currently nf_tables will traverse the entire table (chain graph), starting
from the entry points (base chains), exploring all possible paths
(chain jumps). But there are cases where we could avoid revalidation.
Consider:
1 input -> j2 -> j3
2 input -> j2 -> j3
3 input -> j1 -> j2 -> j3
Then the second rule does not need to revalidate j2, and, by extension j3,
because this was already checked during validation of the first rule.
We need to validate it only for rule 3.
This is needed because chain loop detection also ensures we do not exceed
the jump stack: Just because we know that j2 is cycle free, its last jump
might now exceed the allowed stack size. We also need to update all
reachable chains with the new largest observed call depth.
Care has to be taken to revalidate even if the chain depth won't be an
issue: chain validation also ensures that expressions are not called from
invalid base chains. For example, the masquerade expression can only be
called from NAT postrouting base chains.
Therefore we also need to keep record of the base chain context (type,
hooknum) and revalidate if the chain becomes reachable from a different
hook location.
In the Linux kernel, the following vulnerability has been resolved:
net: dsa: properly keep track of conduit reference
Problem description
-------------------
DSA has a mumbo-jumbo of reference handling of the conduit net device
and its kobject which, sadly, is just wrong and doesn't make sense.
There are two distinct problems.
1. The OF path, which uses of_find_net_device_by_node(), never releases
the elevated refcount on the conduit's kobject. Nominally, the OF and
non-OF paths should result in objects having identical reference
counts taken, and it is already suspicious that
dsa_dev_to_net_device() has a put_device() call which is missing in
dsa_port_parse_of(), but we can actually even verify that an issue
exists. With CONFIG_DEBUG_KOBJECT_RELEASE=y, if we run this command
"before" and "after" applying this patch:
(unbind the conduit driver for net device eno2)
echo 0000:00:00.2 > /sys/bus/pci/drivers/fsl_enetc/unbind
we see these lines in the output diff which appear only with the patch
applied:
kobject: 'eno2' (ffff002009a3a6b8): kobject_release, parent 0000000000000000 (delayed 1000)
kobject: '109' (ffff0020099d59a0): kobject_release, parent 0000000000000000 (delayed 1000)
2. After we find the conduit interface one way (OF) or another (non-OF),
it can get unregistered at any time, and DSA remains with a long-lived,
but in this case stale, cpu_dp->conduit pointer. Holding the net
device's underlying kobject isn't actually of much help, it just
prevents it from being freed (but we never need that kobject
directly). What helps us to prevent the net device from being
unregistered is the parallel netdev reference mechanism (dev_hold()
and dev_put()).
Actually we actually use that netdev tracker mechanism implicitly on
user ports since commit 2f1e8ea726e9 ("net: dsa: link interfaces with
the DSA master to get rid of lockdep warnings"), via netdev_upper_dev_link().
But time still passes at DSA switch probe time between the initial
of_find_net_device_by_node() code and the user port creation time, time
during which the conduit could unregister itself and DSA wouldn't know
about it.
So we have to run of_find_net_device_by_node() under rtnl_lock() to
prevent that from happening, and release the lock only with the netdev
tracker having acquired the reference.
Do we need to keep the reference until dsa_unregister_switch() /
dsa_switch_shutdown()?
1: Maybe yes. A switch device will still be registered even if all user
ports failed to probe, see commit 86f8b1c01a0a ("net: dsa: Do not
make user port errors fatal"), and the cpu_dp->conduit pointers
remain valid. I haven't audited all call paths to see whether they
will actually use the conduit in lack of any user port, but if they
do, it seems safer to not rely on user ports for that reference.
2. Definitely yes. We support changing the conduit which a user port is
associated to, and we can get into a situation where we've moved all
user ports away from a conduit, thus no longer hold any reference to
it via the net device tracker. But we shouldn't let it go nonetheless
- see the next change in relation to dsa_tree_find_first_conduit()
and LAG conduits which disappear.
We have to be prepared to return to the physical conduit, so the CPU
port must explicitly keep another reference to it. This is also to
say: the user ports and their CPU ports may not always keep a
reference to the same conduit net device, and both are needed.
As for the conduit's kobject for the /sys/class/net/ entry, we don't
care about it, we can release it as soon as we hold the net device
object itself.
History and blame attribution
-----------------------------
The code has been refactored so many times, it is very difficult to
follow and properly attribute a blame, but I'll try to make a short
history which I hope to be correct.
We have two distinct probing paths:
- one for OF, introduced in 2016 i
---truncated---