In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Fix crash due to stale SRB access around I/O timeouts
Ensure SRB is returned during I/O timeout error escalation. If that is not
possible fail the escalation path.
Following crash stack was seen:
BUG: unable to handle kernel paging request at 0000002f56aa90f8
IP: qla_chk_edif_rx_sa_delete_pending+0x14/0x30 [qla2xxx]
Call Trace:
? qla2x00_status_entry+0x19f/0x1c50 [qla2xxx]
? qla2x00_start_sp+0x116/0x1170 [qla2xxx]
? dma_pool_alloc+0x1d6/0x210
? mempool_alloc+0x54/0x130
? qla24xx_process_response_queue+0x548/0x12b0 [qla2xxx]
? qla_do_work+0x2d/0x40 [qla2xxx]
? process_one_work+0x14c/0x390
In the Linux kernel, the following vulnerability has been resolved:
sched, cpuset: Fix dl_cpu_busy() panic due to empty cs->cpus_allowed
With cgroup v2, the cpuset's cpus_allowed mask can be empty indicating
that the cpuset will just use the effective CPUs of its parent. So
cpuset_can_attach() can call task_can_attach() with an empty mask.
This can lead to cpumask_any_and() returns nr_cpu_ids causing the call
to dl_bw_of() to crash due to percpu value access of an out of bound
CPU value. For example:
[80468.182258] BUG: unable to handle page fault for address: ffffffff8b6648b0
:
[80468.191019] RIP: 0010:dl_cpu_busy+0x30/0x2b0
:
[80468.207946] Call Trace:
[80468.208947] cpuset_can_attach+0xa0/0x140
[80468.209953] cgroup_migrate_execute+0x8c/0x490
[80468.210931] cgroup_update_dfl_csses+0x254/0x270
[80468.211898] cgroup_subtree_control_write+0x322/0x400
[80468.212854] kernfs_fop_write_iter+0x11c/0x1b0
[80468.213777] new_sync_write+0x11f/0x1b0
[80468.214689] vfs_write+0x1eb/0x280
[80468.215592] ksys_write+0x5f/0xe0
[80468.216463] do_syscall_64+0x5c/0x80
[80468.224287] entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix that by using effective_cpus instead. For cgroup v1, effective_cpus
is the same as cpus_allowed. For v2, effective_cpus is the real cpumask
to be used by tasks within the cpuset anyway.
Also update task_can_attach()'s 2nd argument name to cs_effective_cpus to
reflect the change. In addition, a check is added to task_can_attach()
to guard against the possibility that cpumask_any_and() may return a
value >= nr_cpu_ids.
In the Linux kernel, the following vulnerability has been resolved:
dm thin: fix use-after-free crash in dm_sm_register_threshold_callback
Fault inject on pool metadata device reports:
BUG: KASAN: use-after-free in dm_pool_register_metadata_threshold+0x40/0x80
Read of size 8 at addr ffff8881b9d50068 by task dmsetup/950
CPU: 7 PID: 950 Comm: dmsetup Tainted: G W 5.19.0-rc6 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_address_description.constprop.0.cold+0xeb/0x3f4
kasan_report.cold+0xe6/0x147
dm_pool_register_metadata_threshold+0x40/0x80
pool_ctr+0xa0a/0x1150
dm_table_add_target+0x2c8/0x640
table_load+0x1fd/0x430
ctl_ioctl+0x2c4/0x5a0
dm_ctl_ioctl+0xa/0x10
__x64_sys_ioctl+0xb3/0xd0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
This can be easily reproduced using:
echo offline > /sys/block/sda/device/state
dd if=/dev/zero of=/dev/mapper/thin bs=4k count=10
dmsetup load pool --table "0 20971520 thin-pool /dev/sda /dev/sdb 128 0 0"
If a metadata commit fails, the transaction will be aborted and the
metadata space maps will be destroyed. If a DM table reload then
happens for this failed thin-pool, a use-after-free will occur in
dm_sm_register_threshold_callback (called from
dm_pool_register_metadata_threshold).
Fix this by in dm_pool_register_metadata_threshold() by returning the
-EINVAL error if the thin-pool is in fail mode. Also fail pool_ctr()
with a new error message: "Error registering metadata threshold".
In the Linux kernel, the following vulnerability has been resolved:
iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
KASAN reports:
[ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
[ 4.683454][ T0]
[ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
[ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[ 4.703196][ T0] Call Trace:
[ 4.706334][ T0] <TASK>
[ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
after converting the type of the first argument (@nr, bit number)
of arch_test_bit() from `long` to `unsigned long`[0].
Under certain conditions (for example, when ACPI NUMA is disabled
via command line), pxm_to_node() can return %NUMA_NO_NODE (-1).
It is valid 'magic' number of NUMA node, but not valid bit number
to use in bitops.
node_online() eventually descends to test_bit() without checking
for the input, assuming it's on caller side (which might be good
for perf-critical tasks). There, -1 becomes %ULONG_MAX which leads
to an insane array index when calculating bit position in memory.
For now, add an explicit check for @node being not %NUMA_NO_NODE
before calling test_bit(). The actual logics didn't change here
at all.
[0] https://github.com/norov/linux/commit/0e862838f290147ea9c16db852d8d494b552d38d
In the Linux kernel, the following vulnerability has been resolved:
block: don't allow the same type rq_qos add more than once
In our test of iocost, we encountered some list add/del corruptions of
inner_walk list in ioc_timer_fn.
The reason can be described as follows:
cpu 0 cpu 1
ioc_qos_write ioc_qos_write
ioc = q_to_ioc(queue);
if (!ioc) {
ioc = kzalloc();
ioc = q_to_ioc(queue);
if (!ioc) {
ioc = kzalloc();
...
rq_qos_add(q, rqos);
}
...
rq_qos_add(q, rqos);
...
}
When the io.cost.qos file is written by two cpus concurrently, rq_qos may
be added to one disk twice. In that case, there will be two iocs enabled
and running on one disk. They own different iocgs on their active list. In
the ioc_timer_fn function, because of the iocgs from two iocs have the
same root iocg, the root iocg's walk_list may be overwritten by each other
and this leads to list add/del corruptions in building or destroying the
inner_walk list.
And so far, the blk-rq-qos framework works in case that one instance for
one type rq_qos per queue by default. This patch make this explicit and
also fix the crash above.
In the Linux kernel, the following vulnerability has been resolved:
firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
When scpi probe fails, at any point, we need to ensure that the scpi_info
is not set and will remain NULL until the probe succeeds. If it is not
taken care, then it could result use-after-free as the value is exported
via get_scpi_ops() and could refer to a memory allocated via devm_kzalloc()
but freed when the probe fails.
In the Linux kernel, the following vulnerability has been resolved:
net: atlantic: fix aq_vec index out of range error
The final update statement of the for loop exceeds the array range, the
dereference of self->aq_vec[i] is not checked and then leads to the
index out of range error.
Also fixed this kind of coding style in other for loop.
[ 97.937604] UBSAN: array-index-out-of-bounds in drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1404:48
[ 97.937607] index 8 is out of range for type 'aq_vec_s *[8]'
[ 97.937608] CPU: 38 PID: 3767 Comm: kworker/u256:18 Not tainted 5.19.0+ #2
[ 97.937610] Hardware name: Dell Inc. Precision 7865 Tower/, BIOS 1.0.0 06/12/2022
[ 97.937611] Workqueue: events_unbound async_run_entry_fn
[ 97.937616] Call Trace:
[ 97.937617] <TASK>
[ 97.937619] dump_stack_lvl+0x49/0x63
[ 97.937624] dump_stack+0x10/0x16
[ 97.937626] ubsan_epilogue+0x9/0x3f
[ 97.937627] __ubsan_handle_out_of_bounds.cold+0x44/0x49
[ 97.937629] ? __scm_send+0x348/0x440
[ 97.937632] ? aq_vec_stop+0x72/0x80 [atlantic]
[ 97.937639] aq_nic_stop+0x1b6/0x1c0 [atlantic]
[ 97.937644] aq_suspend_common+0x88/0x90 [atlantic]
[ 97.937648] aq_pm_suspend_poweroff+0xe/0x20 [atlantic]
[ 97.937653] pci_pm_suspend+0x7e/0x1a0
[ 97.937655] ? pci_pm_suspend_noirq+0x2b0/0x2b0
[ 97.937657] dpm_run_callback+0x54/0x190
[ 97.937660] __device_suspend+0x14c/0x4d0
[ 97.937661] async_suspend+0x23/0x70
[ 97.937663] async_run_entry_fn+0x33/0x120
[ 97.937664] process_one_work+0x21f/0x3f0
[ 97.937666] worker_thread+0x4a/0x3c0
[ 97.937668] ? process_one_work+0x3f0/0x3f0
[ 97.937669] kthread+0xf0/0x120
[ 97.937671] ? kthread_complete_and_exit+0x20/0x20
[ 97.937672] ret_from_fork+0x22/0x30
[ 97.937676] </TASK>
v2. fixed "warning: variable 'aq_vec' set but not used"
v3. simplified a for loop
In the Linux kernel, the following vulnerability has been resolved:
iavf: Fix adminq error handling
iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocates with dma_alloc_coherent
memory for VF mailbox.
Free DMA regions for both ASQ and ARQ in case error happens during
configuration of ASQ/ARQ registers.
Without this change it is possible to see when unloading interface:
74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32]
One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]