In the Linux kernel, the following vulnerability has been resolved:
mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
A soft lockup warning was observed on a relative small system x86-64
system with 16 GB of memory when running a debug kernel with kmemleak
enabled.
watchdog: BUG: soft lockup - CPU#8 stuck for 33s! [kworker/8:1:134]
The test system was running a workload with hot unplug happening in
parallel. Then kemleak decided to disable itself due to its inability to
allocate more kmemleak objects. The debug kernel has its
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE set to 40,000.
The soft lockup happened in kmemleak_do_cleanup() when the existing
kmemleak objects were being removed and deleted one-by-one in a loop via a
workqueue. In this particular case, there are at least 40,000 objects
that need to be processed and given the slowness of a debug kernel and the
fact that a raw_spinlock has to be acquired and released in
__delete_object(), it could take a while to properly handle all these
objects.
As kmemleak has been disabled in this case, the object removal and
deletion process can be further optimized as locking isn't really needed.
However, it is probably not worth the effort to optimize for such an edge
case that should rarely happen. So the simple solution is to call
cond_resched() at periodic interval in the iteration loop to avoid soft
lockup.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not allow relocation of partially dropped subvolumes
[BUG]
There is an internal report that balance triggered transaction abort,
with the following call trace:
item 85 key (594509824 169 0) itemoff 12599 itemsize 33
extent refs 1 gen 197740 flags 2
ref#0: tree block backref root 7
item 86 key (594558976 169 0) itemoff 12566 itemsize 33
extent refs 1 gen 197522 flags 2
ref#0: tree block backref root 7
...
BTRFS error (device loop0): extent item not found for insert, bytenr 594526208 num_bytes 16384 parent 449921024 root_objectid 934 owner 1 offset 0
BTRFS error (device loop0): failed to run delayed ref for logical 594526208 num_bytes 16384 type 182 action 1 ref_mod 1: -117
------------[ cut here ]------------
BTRFS: Transaction aborted (error -117)
WARNING: CPU: 1 PID: 6963 at ../fs/btrfs/extent-tree.c:2168 btrfs_run_delayed_refs+0xfa/0x110 [btrfs]
And btrfs check doesn't report anything wrong related to the extent
tree.
[CAUSE]
The cause is a little complex, firstly the extent tree indeed doesn't
have the backref for 594526208.
The extent tree only have the following two backrefs around that bytenr
on-disk:
item 65 key (594509824 METADATA_ITEM 0) itemoff 13880 itemsize 33
refs 1 gen 197740 flags TREE_BLOCK
tree block skinny level 0
(176 0x7) tree block backref root CSUM_TREE
item 66 key (594558976 METADATA_ITEM 0) itemoff 13847 itemsize 33
refs 1 gen 197522 flags TREE_BLOCK
tree block skinny level 0
(176 0x7) tree block backref root CSUM_TREE
But the such missing backref item is not an corruption on disk, as the
offending delayed ref belongs to subvolume 934, and that subvolume is
being dropped:
item 0 key (934 ROOT_ITEM 198229) itemoff 15844 itemsize 439
generation 198229 root_dirid 256 bytenr 10741039104 byte_limit 0 bytes_used 345571328
last_snapshot 198229 flags 0x1000000000001(RDONLY) refs 0
drop_progress key (206324 EXTENT_DATA 2711650304) drop_level 2
level 2 generation_v2 198229
And that offending tree block 594526208 is inside the dropped range of
that subvolume. That explains why there is no backref item for that
bytenr and why btrfs check is not reporting anything wrong.
But this also shows another problem, as btrfs will do all the orphan
subvolume cleanup at a read-write mount.
So half-dropped subvolume should not exist after an RW mount, and
balance itself is also exclusive to subvolume cleanup, meaning we
shouldn't hit a subvolume half-dropped during relocation.
The root cause is, there is no orphan item for this subvolume.
In fact there are 5 subvolumes from around 2021 that have the same
problem.
It looks like the original report has some older kernels running, and
caused those zombie subvolumes.
Thankfully upstream commit 8d488a8c7ba2 ("btrfs: fix subvolume/snapshot
deletion not triggered on mount") has long fixed the bug.
[ENHANCEMENT]
For repairing such old fs, btrfs-progs will be enhanced.
Considering how delayed the problem will show up (at run delayed ref
time) and at that time we have to abort transaction already, it is too
late.
Instead here we reject any half-dropped subvolume for reloc tree at the
earliest time, preventing confusion and extra time wasted on debugging
similar bugs.
In the Linux kernel, the following vulnerability has been resolved:
mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
When netpoll is enabled, calling pr_warn_once() while holding
kmemleak_lock in mem_pool_alloc() can cause a deadlock due to lock
inversion with the netconsole subsystem. This occurs because
pr_warn_once() may trigger netpoll, which eventually leads to
__alloc_skb() and back into kmemleak code, attempting to reacquire
kmemleak_lock.
This is the path for the deadlock.
mem_pool_alloc()
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
-> pr_warn_once()
-> netconsole subsystem
-> netpoll
-> __alloc_skb
-> __create_object
-> raw_spin_lock_irqsave(&kmemleak_lock, flags);
Fix this by setting a flag and issuing the pr_warn_once() after
kmemleak_lock is released.
In the Linux kernel, the following vulnerability has been resolved:
Revert "fs/ntfs3: Replace inode_trylock with inode_lock"
This reverts commit 69505fe98f198ee813898cbcaf6770949636430b.
Initially, conditional lock acquisition was removed to fix an xfstest bug
that was observed during internal testing. The deadlock reported by syzbot
is resolved by reintroducing conditional acquisition. The xfstest bug no
longer occurs on kernel version 6.16-rc1 during internal testing. I
assume that changes in other modules may have contributed to this.
In the Linux kernel, the following vulnerability has been resolved:
NFS: Fix filehandle bounds checking in nfs_fh_to_dentry()
The function needs to check the minimal filehandle length before it can
access the embedded filehandle.
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath11k: fix sleeping-in-atomic in ath11k_mac_op_set_bitrate_mask()
ath11k_mac_disable_peer_fixed_rate() is passed as the iterator to
ieee80211_iterate_stations_atomic(). Note in this case the iterator is
required to be atomic, however ath11k_mac_disable_peer_fixed_rate() does
not follow it as it might sleep. Consequently below warning is seen:
BUG: sleeping function called from invalid context at wmi.c:304
Call Trace:
<TASK>
dump_stack_lvl
__might_resched.cold
ath11k_wmi_cmd_send
ath11k_wmi_set_peer_param
ath11k_mac_disable_peer_fixed_rate
ieee80211_iterate_stations_atomic
ath11k_mac_op_set_bitrate_mask.cold
Change to ieee80211_iterate_stations_mtx() to fix this issue.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30
In the Linux kernel, the following vulnerability has been resolved:
serial: 8250: fix panic due to PSLVERR
When the PSLVERR_RESP_EN parameter is set to 1, the device generates
an error response if an attempt is made to read an empty RBR (Receive
Buffer Register) while the FIFO is enabled.
In serial8250_do_startup(), calling serial_port_out(port, UART_LCR,
UART_LCR_WLEN8) triggers dw8250_check_lcr(), which invokes
dw8250_force_idle() and serial8250_clear_and_reinit_fifos(). The latter
function enables the FIFO via serial_out(p, UART_FCR, p->fcr).
Execution proceeds to the serial_port_in(port, UART_RX).
This satisfies the PSLVERR trigger condition.
When another CPU (e.g., using printk()) is accessing the UART (UART
is busy), the current CPU fails the check (value & ~UART_LCR_SPAR) ==
(lcr & ~UART_LCR_SPAR) in dw8250_check_lcr(), causing it to enter
dw8250_force_idle().
Put serial_port_out(port, UART_LCR, UART_LCR_WLEN8) under the port->lock
to fix this issue.
Panic backtrace:
[ 0.442336] Oops - unknown exception [#1]
[ 0.442343] epc : dw8250_serial_in32+0x1e/0x4a
[ 0.442351] ra : serial8250_do_startup+0x2c8/0x88e
...
[ 0.442416] console_on_rootfs+0x26/0x70
In the Linux kernel, the following vulnerability has been resolved:
s390/ism: fix concurrency management in ism_cmd()
The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time. Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.
This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.
A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.
On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible. Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.
In the Linux kernel, the following vulnerability has been resolved:
parisc: Revise __get_user() to probe user read access
Because of the way read access support is implemented, read access
interruptions are only triggered at privilege levels 2 and 3. The
kernel executes at privilege level 0, so __get_user() never triggers
a read access interruption (code 26). Thus, it is currently possible
for user code to access a read protected address via a system call.
Fix this by probing read access rights at privilege level 3 (PRIV_USER)
and setting __gu_err to -EFAULT (-14) if access isn't allowed.
Note the cmpiclr instruction does a 32-bit compare because COND macro
doesn't work inside asm.
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix refcount leak causing resource not released
When ksmbd_conn_releasing(opinfo->conn) returns true,the refcount was not
decremented properly, causing a refcount leak that prevents the count from
reaching zero and the memory from being released.