Security Vulnerabilities
- CVEs Published In February 2024
In the Linux kernel, the following vulnerability has been resolved:
iommu/vt-d: Remove WO permissions on second-level paging entries
When the first level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported as the PRESENT bit (implying Read permission) should
always set. When using second level, we still give separate permissions
that allows WriteOnly which seems inconsistent and awkward. We want to
have consistent behavior. After moving to 1st level, we don't want things
to work sometimes, and break if we use 2nd level for the same mappings.
Hence remove this configuration.
In the Linux kernel, the following vulnerability has been resolved:
udp: skip L4 aggregation for UDP tunnel packets
If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
are UDP tunnels available in the system, udp_gro_receive() could end-up
doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
the outer UDP tunnel level for packets effectively carrying and UDP
tunnel header.
That could cause inner protocol corruption. If e.g. the relevant
packets carry a vxlan header, different vxlan ids will be ignored/
aggregated to the same GSO packet. Inner headers will be ignored, too,
so that e.g. TCP over vxlan push packets will be held in the GRO
engine till the next flush, etc.
Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
current packet could land in a UDP tunnel, and let udp_gro_receive()
do GRO via udp_sk(sk)->gro_receive.
The check implemented in this patch is broader than what is strictly
needed, as the existing UDP tunnel could be e.g. configured on top of
a different device: we could end-up skipping GRO at-all for some packets.
Anyhow, that is a very thin corner case and covering it will add quite
a bit of complexity.
v1 -> v2:
- hopefully clarify the commit message
In the Linux kernel, the following vulnerability has been resolved:
ASoC: q6afe-clocks: fix reprobing of the driver
Q6afe-clocks driver can get reprobed. For example if the APR services
are restarted after the firmware crash. However currently Q6afe-clocks
driver will oops because hw.init will get cleared during first _probe
call. Rewrite the driver to fill the clock data at runtime rather than
using big static array of clocks.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: avoid deadlock between hci_dev->lock and socket lock
Commit eab2404ba798 ("Bluetooth: Add BT_PHY socket option") added a
dependency between socket lock and hci_dev->lock that could lead to
deadlock.
It turns out that hci_conn_get_phy() is not in any way relying on hdev
being immutable during the runtime of this function, neither does it even
look at any of the members of hdev, and as such there is no need to hold
that lock.
This fixes the lockdep splat below:
======================================================
WARNING: possible circular locking dependency detected
5.12.0-rc1-00026-g73d464503354 #10 Not tainted
------------------------------------------------------
bluetoothd/1118 is trying to acquire lock:
ffff8f078383c078 (&hdev->lock){+.+.}-{3:3}, at: hci_conn_get_phy+0x1c/0x150 [bluetooth]
but task is already holding lock:
ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}:
lock_sock_nested+0x72/0xa0
l2cap_sock_ready_cb+0x18/0x70 [bluetooth]
l2cap_config_rsp+0x27a/0x520 [bluetooth]
l2cap_sig_channel+0x658/0x1330 [bluetooth]
l2cap_recv_frame+0x1ba/0x310 [bluetooth]
hci_rx_work+0x1cc/0x640 [bluetooth]
process_one_work+0x244/0x5f0
worker_thread+0x3c/0x380
kthread+0x13e/0x160
ret_from_fork+0x22/0x30
-> #2 (&chan->lock#2/1){+.+.}-{3:3}:
__mutex_lock+0xa3/0xa10
l2cap_chan_connect+0x33a/0x940 [bluetooth]
l2cap_sock_connect+0x141/0x2a0 [bluetooth]
__sys_connect+0x9b/0xc0
__x64_sys_connect+0x16/0x20
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #1 (&conn->chan_lock){+.+.}-{3:3}:
__mutex_lock+0xa3/0xa10
l2cap_chan_connect+0x322/0x940 [bluetooth]
l2cap_sock_connect+0x141/0x2a0 [bluetooth]
__sys_connect+0x9b/0xc0
__x64_sys_connect+0x16/0x20
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #0 (&hdev->lock){+.+.}-{3:3}:
__lock_acquire+0x147a/0x1a50
lock_acquire+0x277/0x3d0
__mutex_lock+0xa3/0xa10
hci_conn_get_phy+0x1c/0x150 [bluetooth]
l2cap_sock_getsockopt+0x5a9/0x610 [bluetooth]
__sys_getsockopt+0xcc/0x200
__x64_sys_getsockopt+0x20/0x30
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
&hdev->lock --> &chan->lock#2/1 --> sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
lock(&chan->lock#2/1);
lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
lock(&hdev->lock);
*** DEADLOCK ***
1 lock held by bluetoothd/1118:
#0: ffff8f07e831d920 (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}-{0:0}, at: l2cap_sock_getsockopt+0x8b/0x610 [bluetooth]
stack backtrace:
CPU: 3 PID: 1118 Comm: bluetoothd Not tainted 5.12.0-rc1-00026-g73d464503354 #10
Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
Call Trace:
dump_stack+0x7f/0xa1
check_noncircular+0x105/0x120
? __lock_acquire+0x147a/0x1a50
__lock_acquire+0x147a/0x1a50
lock_acquire+0x277/0x3d0
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
? __lock_acquire+0x2e1/0x1a50
? lock_is_held_type+0xb4/0x120
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
__mutex_lock+0xa3/0xa10
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
? lock_acquire+0x277/0x3d0
? mark_held_locks+0x49/0x70
? mark_held_locks+0x49/0x70
? hci_conn_get_phy+0x1c/0x150 [bluetooth]
hci_conn_get_phy+0x
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
ataflop: potential out of bounds in do_format()
The function uses "type" as an array index:
q = unit[drive].disk[type]->queue;
Unfortunately the bounds check on "type" isn't done until later in the
function. Fix this by moving the bounds check to the start.
In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix overflows checks in provide buffers
Colin reported before possible overflow and sign extension problems in
io_provide_buffers_prep(). As Linus pointed out previous attempt did nothing
useful, see d81269fecb8ce ("io_uring: fix provide_buffers sign extension").
Do that with help of check_<op>_overflow helpers. And fix struct
io_provide_buf::len type, as it doesn't make much sense to keep it
signed.
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nftables: Fix a memleak from userdata error path in new objects
Release object name if userdata allocation fails.
In the Linux kernel, the following vulnerability has been resolved:
arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1"
on the command line hits a warning during kernel entry, due to the way
we manipulate the PMR.
Early in the entry sequence, we call lockdep_hardirqs_off() to inform
lockdep that interrupts have been masked (as the HW sets DAIF wqhen
entering an exception). Architecturally PMR_EL1 is not affected by
exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in
the exception entry sequence, so early in exception entry the PMR can
indicate that interrupts are unmasked even though they are masked by
DAIF.
If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that
interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the
exception entry paths, and hence lockdep_hardirqs_off() will WARN() that
something is amiss.
We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during
exception entry so that kernel code sees a consistent environment. We
must also update local_daif_inherit() to undo this, as currently only
touches DAIF. For other paths, local_daif_restore() will update both
DAIF and the PMR. With this done, we can remove the existing special
cases which set this later in the entry code.
We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with
local_daif_save(), as this will warn if it ever encounters
(GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This
matches the gic_prio_kentry_setup that we have to retain for
ret_to_user.
The original splat from Zenghui's report was:
| DEBUG_LOCKS_WARN_ON(!irqs_disabled())
| WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
| Modules linked in:
| CPU: 3 PID: 125 Comm: modprobe Tainted: G W 5.12.0-rc8+ #463
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : lockdep_hardirqs_off+0xd4/0xe8
| lr : lockdep_hardirqs_off+0xd4/0xe8
| sp : ffff80002a39bad0
| pmr_save: 000000e0
| x29: ffff80002a39bad0 x28: ffff0000de214bc0
| x27: ffff0000de1c0400 x26: 000000000049b328
| x25: 0000000000406f30 x24: ffff0000de1c00a0
| x23: 0000000020400005 x22: ffff8000105f747c
| x21: 0000000096000044 x20: 0000000000498ef9
| x19: ffff80002a39bc88 x18: ffffffffffffffff
| x17: 0000000000000000 x16: ffff800011c61eb0
| x15: ffff800011700a88 x14: 0720072007200720
| x13: 0720072007200720 x12: 0720072007200720
| x11: 0720072007200720 x10: 0720072007200720
| x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
| x7 : ffff8000119f0800 x6 : c0000000ffff7fff
| x5 : ffff8000119f07a8 x4 : 0000000000000001
| x3 : 9bcdab23f2432800 x2 : ffff800011730538
| x1 : 9bcdab23f2432800 x0 : 0000000000000000
| Call trace:
| lockdep_hardirqs_off+0xd4/0xe8
| enter_from_kernel_mode.isra.5+0x7c/0xa8
| el1_abort+0x24/0x100
| el1_sync_handler+0x80/0xd0
| el1_sync+0x6c/0x100
| __arch_clear_user+0xc/0x90
| load_elf_binary+0x9fc/0x1450
| bprm_execve+0x404/0x880
| kernel_execve+0x180/0x188
| call_usermodehelper_exec_async+0xdc/0x158
| ret_from_fork+0x10/0x18
In the Linux kernel, the following vulnerability has been resolved:
ethernet:enic: Fix a use after free bug in enic_hard_start_xmit
In enic_hard_start_xmit, it calls enic_queue_wq_skb(). Inside
enic_queue_wq_skb, if some error happens, the skb will be freed
by dev_kfree_skb(skb). But the freed skb is still used in
skb_tx_timestamp(skb).
My patch makes enic_queue_wq_skb() return error and goto spin_unlock()
incase of error. The solution is provided by Govind.
See https://lkml.org/lkml/2021/4/30/961.
In the Linux kernel, the following vulnerability has been resolved:
sctp: do asoc update earlier in sctp_sf_do_dupcook_a
There's a panic that occurs in a few of envs, the call trace is as below:
[] general protection fault, ... 0x29acd70f1000a: 0000 [#1] SMP PTI
[] RIP: 0010:sctp_ulpevent_notify_peer_addr_change+0x4b/0x1fa [sctp]
[] sctp_assoc_control_transport+0x1b9/0x210 [sctp]
[] sctp_do_8_2_transport_strike.isra.16+0x15c/0x220 [sctp]
[] sctp_cmd_interpreter.isra.21+0x1231/0x1a10 [sctp]
[] sctp_do_sm+0xc3/0x2a0 [sctp]
[] sctp_generate_timeout_event+0x81/0xf0 [sctp]
This is caused by a transport use-after-free issue. When processing a
duplicate COOKIE-ECHO chunk in sctp_sf_do_dupcook_a(), both COOKIE-ACK
and SHUTDOWN chunks are allocated with the transort from the new asoc.
However, later in the sideeffect machine, the old asoc is used to send
them out and old asoc's shutdown_last_sent_to is set to the transport
that SHUTDOWN chunk attached to in sctp_cmd_setup_t2(), which actually
belongs to the new asoc. After the new_asoc is freed and the old asoc
T2 timeout, the old asoc's shutdown_last_sent_to that is already freed
would be accessed in sctp_sf_t2_timer_expire().
Thanks Alexander and Jere for helping dig into this issue.
To fix it, this patch is to do the asoc update first, then allocate
the COOKIE-ACK and SHUTDOWN chunks with the 'updated' old asoc. This
would make more sense, as a chunk from an asoc shouldn't be sent out
with another asoc. We had fixed quite a few issues caused by this.