In the Linux kernel, the following vulnerability has been resolved:
btrfs: don't BUG() on unexpected delayed ref type in run_one_delayed_ref()
There is no need to BUG(), we can just return an error and log an error
message.
In the Linux kernel, the following vulnerability has been resolved:
md raid: fix hang when stopping arrays with metadata through dm-raid
When using device-mapper's dm-raid target, stopping a RAID array can cause
the system to hang under specific conditions.
This occurs when:
- A dm-raid managed device tree is suspended from top to bottom
(the top-level RAID device is suspended first, followed by its
underlying metadata and data devices)
- The top-level RAID device is then removed
Removing the top-level device triggers a hang in the following sequence:
the dm-raid destructor calls md_stop(), which tries to flush the
write-intent bitmap by writing to the metadata sub-devices. However, these
devices are already suspended, making them unable to complete the write-intent
operations and causing an indefinite block.
Fix:
- Prevent bitmap flushing when md_stop() is called from dm-raid
destructor context
and avoid a quiescing/unquescing cycle which could also cause I/O
- Still allow write-intent bitmap flushing when called from dm-raid
suspend context
This ensures that RAID array teardown can complete successfully even when the
underlying devices are in a suspended state.
This second patch uses md_is_rdwr() to distinguish between suspend and
destructor paths as elaborated on above.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure()
[BUG]
There is a bug report that when btrfs hits ENOSPC error in a critical
path, btrfs flips RO (this part is expected, although the ENOSPC bug
still needs to be addressed).
The problem is after the RO flip, if there is a read repair pending, we
can hit the ASSERT() inside btrfs_repair_io_failure() like the following:
BTRFS info (device vdc): relocating block group 30408704 flags metadata|raid1
------------[ cut here ]------------
BTRFS: Transaction aborted (error -28)
WARNING: fs/btrfs/extent-tree.c:3235 at __btrfs_free_extent.isra.0+0x453/0xfd0, CPU#1: btrfs/383844
Modules linked in: kvm_intel kvm irqbypass
[...]
---[ end trace 0000000000000000 ]---
BTRFS info (device vdc state EA): 2 enospc errors during balance
BTRFS info (device vdc state EA): balance: ended with status: -30
BTRFS error (device vdc state EA): parent transid verify failed on logical 30556160 mirror 2 wanted 8 found 6
BTRFS error (device vdc state EA): bdev /dev/nvme0n1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
[...]
assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
------------[ cut here ]------------
assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
kernel BUG at fs/btrfs/bio.c:938!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 868 Comm: kworker/u8:13 Tainted: G W N 6.19.0-rc6+ #4788 PREEMPT(full)
Tainted: [W]=WARN, [N]=TEST
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
Workqueue: btrfs-endio simple_end_io_work
RIP: 0010:btrfs_repair_io_failure.cold+0xb2/0x120
RSP: 0000:ffffc90001d2bcf0 EFLAGS: 00010246
RAX: 0000000000000051 RBX: 0000000000001000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8305cf42 RDI: 00000000ffffffff
RBP: 0000000000000002 R08: 00000000fffeffff R09: ffffffff837fa988
R10: ffffffff8327a9e0 R11: 6f69747265737361 R12: ffff88813018d310
R13: ffff888168b8a000 R14: ffffc90001d2bd90 R15: ffff88810a169000
FS: 0000000000000000(0000) GS:ffff8885e752c000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
------------[ cut here ]------------
[CAUSE]
The cause of -ENOSPC error during the test case btrfs/124 is still
unknown, although it's known that we still have cases where metadata can
be over-committed but can not be fulfilled correctly, thus if we hit
such ENOSPC error inside a critical path, we have no choice but abort
the current transaction.
This will mark the fs read-only.
The problem is inside the btrfs_repair_io_failure() path that we require
the fs not to be mount read-only. This is normally fine, but if we are
doing a read-repair meanwhile the fs flips RO due to a critical error,
we can enter btrfs_repair_io_failure() with super block set to
read-only, thus triggering the above crash.
[FIX]
Just replace the ASSERT() with a proper return if the fs is already
read-only.
In the Linux kernel, the following vulnerability has been resolved:
rapidio: replace rio_free_net() with kfree() in rio_scan_alloc_net()
When idtab allocation fails, net is not registered with rio_add_net() yet,
so kfree(net) is sufficient to release the memory. Set mport->net to NULL
to avoid dangling pointer.
In the Linux kernel, the following vulnerability has been resolved:
octeontx2-af: Workaround SQM/PSE stalls by disabling sticky
NIX SQ manager sticky mode is known to cause stalls when multiple SQs
share an SMQ and transmit concurrently. Additionally, PSE may deadlock
on transitions between sticky and non-sticky transmissions. There is
also a credit drop issue observed when certain condition clocks are
gated.
work around these hardware errata by:
- Disabling SQM sticky operation:
- Clear TM6 (bit 15)
- Clear TM11 (bit 14)
- Disabling sticky → non-sticky transition path that can deadlock PSE:
- Clear TM5 (bit 23)
- Preventing credit drops by keeping the control-flow clock enabled:
- Set TM9 (bit 21)
These changes are applied via NIX_AF_SQM_DBG_CTL_STATUS. With this
configuration the SQM/PSE maintain forward progress under load without
credit loss, at the cost of disabling sticky optimizations.
In the Linux kernel, the following vulnerability has been resolved:
drm: Account property blob allocations to memcg
DRM_IOCTL_MODE_CREATEPROPBLOB allows userspace to allocate arbitrary-sized
property blobs backed by kernel memory.
Currently, the blob data allocation is not accounted to the allocating
process's memory cgroup, allowing unprivileged users to trigger unbounded
kernel memory consumption and potentially cause system-wide OOM.
Mark the property blob data allocation with GFP_KERNEL_ACCOUNT so that the memory
is properly charged to the caller's memcg. This ensures existing cgroup
memory limits apply and prevents uncontrolled kernel memory growth without
introducing additional policy or per-file limits.
In the Linux kernel, the following vulnerability has been resolved:
ext4: move ext4_percpu_param_init() before ext4_mb_init()
When running `kvm-xfstests -c ext4/1k -C 1 generic/383` with the
`DOUBLE_CHECK` macro defined, the following panic is triggered:
==================================================================
EXT4-fs error (device vdc): ext4_validate_block_bitmap:423:
comm mount: bg 0: bad block bitmap checksum
BUG: unable to handle page fault for address: ff110000fa2cc000
PGD 3e01067 P4D 3e02067 PUD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 2386 Comm: mount Tainted: G W
6.18.0-gba65a4e7120a-dirty #1152 PREEMPT(none)
RIP: 0010:percpu_counter_add_batch+0x13/0xa0
Call Trace:
<TASK>
ext4_mark_group_bitmap_corrupted+0xcb/0xe0
ext4_validate_block_bitmap+0x2a1/0x2f0
ext4_read_block_bitmap+0x33/0x50
mb_group_bb_bitmap_alloc+0x33/0x80
ext4_mb_add_groupinfo+0x190/0x250
ext4_mb_init_backend+0x87/0x290
ext4_mb_init+0x456/0x640
__ext4_fill_super+0x1072/0x1680
ext4_fill_super+0xd3/0x280
get_tree_bdev_flags+0x132/0x1d0
vfs_get_tree+0x29/0xd0
vfs_cmd_create+0x59/0xe0
__do_sys_fsconfig+0x4f6/0x6b0
do_syscall_64+0x50/0x1f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
==================================================================
This issue can be reproduced using the following commands:
mkfs.ext4 -F -q -b 1024 /dev/sda 5G
tune2fs -O quota,project /dev/sda
mount /dev/sda /tmp/test
With DOUBLE_CHECK defined, mb_group_bb_bitmap_alloc() reads
and validates the block bitmap. When the validation fails,
ext4_mark_group_bitmap_corrupted() attempts to update
sbi->s_freeclusters_counter. However, this percpu_counter has not been
initialized yet at this point, which leads to the panic described above.
Fix this by moving the execution of ext4_percpu_param_init() to occur
before ext4_mb_init(), ensuring the per-CPU counters are initialized
before they are used.
In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw88: 8822b: Avoid WARNING in rtw8822b_config_trx_mode()
rtw8822b_set_antenna() can be called from userspace when the chip is
powered off. In that case a WARNING is triggered in
rtw8822b_config_trx_mode() because trying to read the RF registers
when the chip is powered off returns an unexpected value.
Call rtw8822b_config_trx_mode() in rtw8822b_set_antenna() only when
the chip is powered on.
------------[ cut here ]------------
write RF mode table fail
WARNING: CPU: 0 PID: 7183 at rtw8822b.c:824 rtw8822b_config_trx_mode.constprop.0+0x835/0x840 [rtw88_8822b]
CPU: 0 UID: 0 PID: 7183 Comm: iw Tainted: G W OE 6.17.5-arch1-1 #1 PREEMPT(full) 01c39fc421df2af799dd5e9180b572af860b40c1
Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: LENOVO 82KR/LNVNB161216, BIOS HBCN18WW 08/27/2021
RIP: 0010:rtw8822b_config_trx_mode.constprop.0+0x835/0x840 [rtw88_8822b]
Call Trace:
<TASK>
rtw8822b_set_antenna+0x57/0x70 [rtw88_8822b 370206f42e5890d8d5f48eb358b759efa37c422b]
rtw_ops_set_antenna+0x50/0x80 [rtw88_core 711c8fb4f686162be4625b1d0b8e8c6a5ac850fb]
ieee80211_set_antenna+0x60/0x100 [mac80211 f1845d85d2ecacf3b71867635a050ece90486cf3]
nl80211_set_wiphy+0x384/0xe00 [cfg80211 296485ee85696d2150309a6d21a7fbca83d3dbda]
? netdev_run_todo+0x63/0x550
genl_family_rcv_msg_doit+0xfc/0x160
genl_rcv_msg+0x1aa/0x2b0
? __pfx_nl80211_pre_doit+0x10/0x10 [cfg80211 296485ee85696d2150309a6d21a7fbca83d3dbda]
? __pfx_nl80211_set_wiphy+0x10/0x10 [cfg80211 296485ee85696d2150309a6d21a7fbca83d3dbda]
? __pfx_nl80211_post_doit+0x10/0x10 [cfg80211 296485ee85696d2150309a6d21a7fbca83d3dbda]
? __pfx_genl_rcv_msg+0x10/0x10
netlink_rcv_skb+0x59/0x110
genl_rcv+0x28/0x40
netlink_unicast+0x285/0x3c0
? __alloc_skb+0xdb/0x1a0
netlink_sendmsg+0x20d/0x430
____sys_sendmsg+0x39f/0x3d0
? import_iovec+0x2f/0x40
___sys_sendmsg+0x99/0xe0
? refill_obj_stock+0x12e/0x240
__sys_sendmsg+0x8a/0xf0
do_syscall_64+0x81/0x970
? do_syscall_64+0x81/0x970
? ksys_read+0x73/0xf0
? do_syscall_64+0x81/0x970
? count_memcg_events+0xc2/0x190
? handle_mm_fault+0x1d7/0x2d0
? do_user_addr_fault+0x21a/0x690
? exc_page_fault+0x7e/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
---[ end trace 0000000000000000 ]---
In the Linux kernel, the following vulnerability has been resolved:
xfrm: esp: avoid in-place decrypt on shared skb frags
MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
so later paths that may modify packet data can first make a private
copy. The IPv4/IPv6 datagram append paths did not set this flag when
splicing pages into UDP skbs.
That leaves an ESP-in-UDP packet made from shared pipe pages looking
like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
fast path for uncloned skbs without a frag_list and decrypts in place
over data that is not owned privately by the skb.
Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
TCP. Also make ESP input fall back to skb_cow_data() when the flag is
present, so ESP does not decrypt externally backed frags in place.
Private nonlinear skb frags still use the existing fast path.
This intentionally does not change ESP output. In esp_output_head(),
the path that appends the ESP trailer to existing skb tailroom without
calling skb_cow_data() is not reachable for nonlinear skbs:
skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
tailen is positive. Thus ESP output will either use the separate
destination-frag path or fall back to skb_cow_data().