In the Linux kernel, the following vulnerability has been resolved:
greybus: gb-beagleplay: fix sleep in atomic context in hdlc_tx_frames()
hdlc_append() calls usleep_range() to wait for circular buffer space,
but it is called with tx_producer_lock (a spinlock) held via
hdlc_tx_frames() -> hdlc_append_tx_frame()/hdlc_append_tx_u8()/etc.
Sleeping while holding a spinlock is illegal and can trigger
"BUG: scheduling while atomic".
Fix this by moving the buffer-space wait out of hdlc_append() and into
hdlc_tx_frames(), before the spinlock is acquired. The new flow:
1. Pre-calculate the worst-case encoded frame length.
2. Wait (with sleep) outside the lock until enough space is available,
kicking the TX consumer work to drain the buffer.
3. Acquire the spinlock, re-verify space, and write the entire frame
atomically.
This ensures that sleeping only happens without any lock held, and
that frames are either fully enqueued or not written at all.
This bug was found by the CodeQL static analysis tool (interprocedural
sleep-in-atomic query) and my code review.
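The three-step flow above can be modelled in user space. The following is a minimal sketch, not the driver code: struct ring, RING_SIZE, ring_space(), worst_case_frame_len() and ring_try_write() are hypothetical names, and the framing assumed for the worst case (two flag bytes around address, control, payload and a 2-byte CRC, every inner byte escaped) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u /* hypothetical capacity */

struct ring {
    uint8_t buf[RING_SIZE];
    size_t head; /* next slot the producer writes */
    size_t tail; /* next slot the consumer reads */
};

/* Free bytes; one slot stays unused so that head == tail means "empty". */
static size_t ring_space(const struct ring *r)
{
    return (r->tail + RING_SIZE - r->head - 1) % RING_SIZE;
}

/*
 * Step 1: worst-case encoded length. Assumed framing: two flag bytes around
 * address, control, payload and a 2-byte CRC, with every inner byte escaped.
 */
static size_t worst_case_frame_len(size_t payload_len)
{
    return 2 + 2 * (1 + 1 + payload_len + 2);
}

/*
 * Step 3: meant to run with the producer lock held. Re-checks space and
 * either copies the whole frame or leaves the ring untouched.
 */
static bool ring_try_write(struct ring *r, const uint8_t *data, size_t len)
{
    assert(len < RING_SIZE); /* a frame larger than the ring can never fit */
    if (ring_space(r) < len)
        return false; /* caller drops the lock, sleeps, and retries (step 2) */
    for (size_t i = 0; i < len; i++) {
        r->buf[r->head] = data[i];
        r->head = (r->head + 1) % RING_SIZE;
    }
    return true;
}
```

In this model the only place a caller would sleep is after ring_try_write() returns false and the lock has been released, which mirrors the intent of the fix.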
In the Linux kernel, the following vulnerability has been resolved:
mm/mempolicy: fix memory leaks in weighted_interleave_auto_store()
weighted_interleave_auto_store() fetches old_wi_state inside the if
(!input) block only. This causes two memory leaks:
1. When a user writes "false" and the current mode is already manual,
the function returns early without freeing the freshly allocated
new_wi_state.
2. When a user writes "true", old_wi_state stays NULL because the
fetch is skipped entirely. The old state is then overwritten by
rcu_assign_pointer() but never freed, since the cleanup path is
gated on old_wi_state being non-NULL. A user can trigger this
repeatedly by writing "1" in a loop.
Fix both leaks by moving the old_wi_state fetch before the input check,
making it unconditional. This also allows a unified early return for both
"true" and "false" when the requested mode matches the current mode.
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
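The effect of fetching the old state unconditionally can be illustrated with a small user-space model. This is a sketch under assumptions, not the kernel function: struct wi_state, cur_state, live_states, state_alloc(), state_free() and auto_store() are hypothetical, a plain pointer stands in for the RCU-protected one, and the old state is freed immediately instead of after a grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct wi_state { bool mode_auto; };

static struct wi_state *cur_state; /* stands in for the RCU-protected pointer */
static int live_states;            /* live allocations, used to spot leaks */

static struct wi_state *state_alloc(bool mode_auto)
{
    struct wi_state *s = malloc(sizeof(*s));
    if (s) {
        s->mode_auto = mode_auto;
        live_states++;
    }
    return s;
}

static void state_free(struct wi_state *s)
{
    if (!s)
        return;
    assert(live_states > 0);
    free(s);
    live_states--;
}

/* Returns 0 on success, -1 if the allocation fails. */
static int auto_store(bool want_auto)
{
    struct wi_state *new_state = state_alloc(want_auto);
    struct wi_state *old_state;

    if (!new_state)
        return -1;

    /* Fetch the old state before looking at the input, for both values. */
    old_state = cur_state;

    /* Unified early return: nothing to change, so drop the new allocation. */
    if (old_state && old_state->mode_auto == want_auto) {
        state_free(new_state);
        return 0;
    }

    cur_state = new_state; /* rcu_assign_pointer() in the kernel */
    state_free(old_state); /* the kernel waits for a grace period first */
    return 0;
}
```

Writing the same value in a loop leaves exactly one live allocation in this model, which is the property the two leaks violated.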
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Validate pad and ICRC before payload_size() in rxe_rcv
rxe_rcv() currently checks only that the incoming packet is at least
header_size(pkt) bytes long before payload_size() is used.
However, payload_size() subtracts both the attacker-controlled BTH pad
field and RXE_ICRC_SIZE from pkt->paylen:
payload_size = pkt->paylen - offset[RXE_PAYLOAD] - bth_pad(pkt)
- RXE_ICRC_SIZE
This means a short packet can still make payload_size() underflow even
if it includes enough bytes for the fixed headers. Simply requiring
header_size(pkt) + RXE_ICRC_SIZE is not sufficient either, because a
packet with a forged non-zero BTH pad can still leave payload_size()
negative and pass an underflowed value to later receive-path users.
Fix this by validating pkt->paylen against the full minimum length
required by payload_size(): header_size(pkt) + bth_pad(pkt) +
RXE_ICRC_SIZE.
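The length check can be sketched as a standalone predicate. This is an illustration only: paylen_ok() and payload_len() are hypothetical helpers, ICRC_SIZE stands in for RXE_ICRC_SIZE (assumed to be 4 bytes here), and the header size is passed in rather than derived from a packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ICRC_SIZE 4u /* stand-in for RXE_ICRC_SIZE, assumed to be 4 bytes */

/* True when paylen covers the headers, the claimed pad, and the ICRC. */
static bool paylen_ok(size_t paylen, size_t hdr_size, unsigned int pad)
{
    return paylen >= hdr_size + pad + ICRC_SIZE;
}

/* Payload length; only meaningful once paylen_ok() has accepted the packet. */
static size_t payload_len(size_t paylen, size_t hdr_size, unsigned int pad)
{
    assert(paylen_ok(paylen, hdr_size, pad)); /* otherwise this would underflow */
    return paylen - hdr_size - pad - ICRC_SIZE;
}
```

With a 12-byte header, a 16-byte packet passes with pad 0 but fails with pad 3, which is the forged-pad case described above.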
In the Linux kernel, the following vulnerability has been resolved:
ipmi:ssif: Clean up kthread on errors
If an error occurs after the ssif kthread is created, but before the
main IPMI code starts the ssif interface, the ssif kthread will not
be stopped.
So make sure the kthread is stopped on an error condition if it is
running.
In the Linux kernel, the following vulnerability has been resolved:
md/md-llbitmap: skip reading rdevs that are not in_sync
When reading bitmap pages from member disks, the code iterates through
all rdevs and attempts to read from the first available one. However,
it only checks for raid_disk assignment and Faulty flag, missing the
In_sync flag check.
This can cause bitmap data to be read from spare disks that are still
being rebuilt and don't have valid bitmap information yet. Reading
stale or uninitialized bitmap data from such disks can lead to
incorrect dirty bit tracking, potentially causing data corruption
during recovery or normal operation.
Add the In_sync flag check to ensure bitmap pages are only read from
fully synchronized member disks that have valid bitmap data.
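The selection rule can be sketched as follows. This is a simplified model, not the md code: struct member_disk, the FLAG_* bit values, can_read_bitmap_from() and first_readable() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the Faulty and In_sync rdev flags. */
enum { FLAG_FAULTY = 1u << 0, FLAG_IN_SYNC = 1u << 1 };

struct member_disk {
    int raid_disk;      /* negative when no slot is assigned */
    unsigned int flags;
};

/* A disk is a valid bitmap source only if it passes all three checks. */
static bool can_read_bitmap_from(const struct member_disk *d)
{
    if (d->raid_disk < 0)
        return false;
    if (d->flags & FLAG_FAULTY)
        return false;
    if (!(d->flags & FLAG_IN_SYNC))
        return false; /* the check that was missing: still rebuilding */
    return true;
}

/* Index of the first usable disk, or -1 if none qualifies. */
static int first_readable(const struct member_disk *disks, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (can_read_bitmap_from(&disks[i]))
            return i;
    return -1;
}
```

A disk that already has a slot but is still being rebuilt is skipped in this model, so the read falls through to the next synchronized member.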
In the Linux kernel, the following vulnerability has been resolved:
net: ks8851: Reinstate disabling of BHs around IRQ handler
If the driver executes ks8851_irq() and a TX packet has been sent, the
driver wakes the TX queue via netif_wake_queue(), which schedules the TX
softirq to queue packets for this device.
If CONFIG_PREEMPT_RT=y is set AND a packet has also been received by
the MAC, then ks8851_rx_pkts() calls netdev_alloc_skb_ip_align() to
allocate SKBs for the received packets. If netdev_alloc_skb_ip_align()
is called with BH enabled, then local_bh_enable() at the end of
netdev_alloc_skb_ip_align() will trigger the pending softirq processing,
which may ultimately call the .xmit callback ks8851_start_xmit_par().
ks8851_start_xmit_par() then tries to take the struct ks8851_net_par
.lock spinlock, which is already held by ks8851_irq(), the function from
which ks8851_start_xmit_par() was called. This leads to a deadlock, which
the kernel reports with the trace listed below.
If CONFIG_PREEMPT_RT is not set, then since commit 0913ec336a6c0
("net: ks8851: Fix deadlock with the SPI chip variant") the deadlock
can also be triggered without a received packet in the RX FIFO. The
pending softirqs will be processed on return from
spin_unlock_bh(&ks->statelock) in ks8851_irq(), which triggers the
deadlock as well.
Fix the problem by disabling BH around critical sections, including the
IRQ handler, thus preventing the net_tx_action() softirq from triggering
during these critical sections. The net_tx_action() softirq is triggered
once BHs are re-enabled at the end of the IRQ handler, after all the
other IRQ handler actions have been completed.
__schedule from schedule_rtlock+0x1c/0x34
schedule_rtlock from rtlock_slowlock_locked+0x548/0x904
rtlock_slowlock_locked from rt_spin_lock+0x60/0x9c
rt_spin_lock from ks8851_start_xmit_par+0x74/0x1a8
ks8851_start_xmit_par from netdev_start_xmit+0x20/0x44
netdev_start_xmit from dev_hard_start_xmit+0xd0/0x188
dev_hard_start_xmit from sch_direct_xmit+0xb8/0x25c
sch_direct_xmit from __qdisc_run+0x1f8/0x4ec
__qdisc_run from qdisc_run+0x1c/0x28
qdisc_run from net_tx_action+0x1f0/0x268
net_tx_action from handle_softirqs+0x1a4/0x270
handle_softirqs from __local_bh_enable_ip+0xcc/0xe0
__local_bh_enable_ip from __alloc_skb+0xd8/0x128
__alloc_skb from __netdev_alloc_skb+0x3c/0x19c
__netdev_alloc_skb from ks8851_irq+0x388/0x4d4
ks8851_irq from irq_thread_fn+0x24/0x64
irq_thread_fn from irq_thread+0x178/0x28c
irq_thread from kthread+0x12c/0x138
kthread from ret_from_fork+0x14/0x28
In the Linux kernel, the following vulnerability has been resolved:
KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT
If loading L1's CR3 fails on a nested #VMEXIT, nested_svm_vmexit()
returns an error code that most callers ignore, and KVM continues to
run L1 with corrupted state. A sane recovery is not possible in this
case, and HW behavior is to cause a shutdown. Inject a triple fault
instead, and do not return early from nested_svm_vmexit(). Continue
cleaning up the vCPU state (e.g. clear pending exceptions), to handle
the failure as gracefully as possible.
From the APM:
Upon #VMEXIT, the processor performs the following actions in order to
return to the host execution context:
...
  if (illegal host state loaded, or exception while loading host state)
    shutdown
  else
    execute first host instruction following the VMRUN
Remove the return value of nested_svm_vmexit(), which is mostly
unchecked anyway.
In the Linux kernel, the following vulnerability has been resolved:
crypto: authencesn - reject short ahash digests during instance creation
authencesn requires either a zero authsize or an authsize of at least
4 bytes because the ESN encrypt/decrypt paths always move 4 bytes of
high-order sequence number data at the end of the authenticated data.
While crypto_authenc_esn_setauthsize() already rejects explicit
non-zero authsizes in the range 1..3, crypto_authenc_esn_create()
still copied auth->digestsize into inst->alg.maxauthsize without
validating it. The AEAD core then initialized the tfm's default
authsize from that value.
As a result, selecting an ahash with digest size 1..3, such as
cbcmac(cipher_null), exposed authencesn instances whose default
authsize was invalid even though setauthsize() would have rejected the
same value. AF_ALG could then trigger the ESN tail handling with a
too-short tag and hit an out-of-bounds access.
Reject authencesn instances whose ahash digest size is in the invalid
non-zero range 1..3 so that no tfm can inherit an unsupported default
authsize.
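The creation-time check can be modelled in a few lines. This is a sketch under assumptions, not the crypto API: struct esn_instance, esn_instance_create(), esn_default_authsize() and ESN_HI_SEQ_LEN are hypothetical names, and the -EINVAL return value is assumed for illustration.

```c
#include <assert.h>
#include <errno.h>

#define ESN_HI_SEQ_LEN 4u /* bytes of high-order sequence number that ESN moves */

struct esn_instance {
    unsigned int maxauthsize;
};

/* Refuse to create an instance whose digest size is in the range 1..3. */
static int esn_instance_create(struct esn_instance *inst, unsigned int digestsize)
{
    if (digestsize > 0 && digestsize < ESN_HI_SEQ_LEN)
        return -EINVAL; /* assumed error code */
    inst->maxauthsize = digestsize;
    return 0;
}

/* The default authsize a tfm would inherit; it can no longer be 1..3. */
static unsigned int esn_default_authsize(const struct esn_instance *inst)
{
    assert(inst->maxauthsize == 0 || inst->maxauthsize >= ESN_HI_SEQ_LEN);
    return inst->maxauthsize;
}
```

Because the rejection happens at creation, the default value and any value later accepted by a setauthsize-style check obey the same rule.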
In the Linux kernel, the following vulnerability has been resolved:
vfio/cdx: Fix NULL pointer dereference in interrupt trigger path
Add validation to ensure MSI is configured before accessing cdx_irqs
array in vfio_cdx_set_msi_trigger(). Without this check, userspace
can trigger a NULL pointer dereference by calling VFIO_DEVICE_SET_IRQS
with VFIO_IRQ_SET_DATA_BOOL or VFIO_IRQ_SET_DATA_NONE flags before
ever setting up interrupts via VFIO_IRQ_SET_DATA_EVENTFD.
The vfio_cdx_msi_enable() function allocates the cdx_irqs array and
sets config_msi to 1 only when called through the EVENTFD path. The
trigger loop (for DATA_BOOL/DATA_NONE) assumed this had already been
done, but there was no enforcement of this call ordering.
This matches the protection used in the PCI VFIO driver where
vfio_pci_set_msi_trigger() checks irq_is() before the trigger loop.
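The ordering problem and the guard can be illustrated with a user-space model. This is a sketch, not the vfio-cdx driver: struct dev_state, struct irq_slot and trigger_range() are hypothetical, and -EINVAL is an assumed error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct irq_slot {
    int fired; /* number of times this interrupt was triggered */
};

struct dev_state {
    bool config_msi;       /* set only once the interrupt array exists */
    struct irq_slot *irqs; /* NULL until interrupts are set up */
    unsigned int nr;
};

/* Trigger interrupts [start, start + count); refuse if MSI is not configured. */
static int trigger_range(struct dev_state *d, unsigned int start, unsigned int count)
{
    assert(d != NULL);
    if (!d->config_msi)
        return -EINVAL; /* guard: irqs may still be NULL here */
    if (start > d->nr || count > d->nr - start)
        return -EINVAL;
    for (unsigned int i = start; i < start + count; i++)
        d->irqs[i].fired++;
    return 0;
}
```

Without the config_msi test, the first call in a "trigger before setup" sequence would index a NULL array, which is the dereference described above.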
In the Linux kernel, the following vulnerability has been resolved:
mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP
On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held. As a
result, alloc_frozen_pages_nolock() called from NMI context can
re-enter rmqueue() and acquire the zone lock that the interrupted
context is already holding, corrupting the freelists.
With CONFIG_DEBUG_SPINLOCK on UP, the following BUG is triggered with
the slub_kunit test module:
BUG: spinlock trylock failure on UP on CPU#0, kunit_try_catch/243
[...]
Call Trace:
<NMI>
dump_stack_lvl+0x3f/0x60
do_raw_spin_trylock+0x41/0x50
_raw_spin_trylock+0x24/0x50
rmqueue.isra.0+0x2a9/0xa70
get_page_from_freelist+0xeb/0x450
alloc_frozen_pages_nolock_noprof+0x111/0x1e0
allocate_slab+0x42a/0x500
___slab_alloc+0xa7/0x4c0
kmalloc_nolock_noprof+0x164/0x310
[...]
</NMI>
Fix this by returning NULL early when invoked from NMI on a UP kernel.