In the Linux kernel, the following vulnerability has been resolved:
tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
When running ltp testcase(ltp/testcases/kernel/pty/pty04.c) with arm64, there is a soft lockup,
which look like this one:
Workqueue: events_unbound flush_to_ldisc
Call trace:
dump_backtrace+0x0/0x1ec
show_stack+0x24/0x30
dump_stack+0xd0/0x128
panic+0x15c/0x374
watchdog_timer_fn+0x2b8/0x304
__run_hrtimer+0x88/0x2c0
__hrtimer_run_queues+0xa4/0x120
hrtimer_interrupt+0xfc/0x270
arch_timer_handler_phys+0x40/0x50
handle_percpu_devid_irq+0x94/0x220
__handle_domain_irq+0x88/0xf0
gic_handle_irq+0x84/0xfc
el1_irq+0xc8/0x180
slip_unesc+0x80/0x214 [slip]
tty_ldisc_receive_buf+0x64/0x80
tty_port_default_receive_buf+0x50/0x90
flush_to_ldisc+0xbc/0x110
process_one_work+0x1d4/0x4b0
worker_thread+0x180/0x430
kthread+0x11c/0x120
In the testcase pty04, The first process call the write syscall to send
data to the pty master. At the same time, the workqueue will do the
flush_to_ldisc to pop data in a loop until there is no more data left.
When the sender and workqueue running in different core, the sender sends
data fastly in full time which will result in workqueue doing work in loop
for a long time and occuring softlockup in flush_to_ldisc with kernel
configured without preempt. So I add need_resched check and cond_resched
in the flush_to_ldisc loop to avoid it.
In the Linux kernel, the following vulnerability has been resolved:
arm64: dts: qcom: msm8998: Fix CPU/L2 idle state latency and residency
The entry/exit latency and minimum residency in state for the idle
states of MSM8998 were ..bad: first of all, for all of them the
timings were written for CPU sleep but the min-residency-us param
was miscalculated (supposedly, while porting this from downstream);
Then, the power collapse states are setting PC on both the CPU
cluster *and* the L2 cache, which have different timings: in the
specific case of L2 the times are higher so these ones should be
taken into account instead of the CPU ones.
This parameter misconfiguration was not giving particular issues
because on MSM8998 there was no CPU scaling at all, so cluster/L2
power collapse was rarely (if ever) hit.
When CPU scaling is enabled, though, the wrong timings will produce
SoC unstability shown to the user as random, apparently error-less,
sudden reboots and/or lockups.
This set of parameters are stabilizing the SoC when CPU scaling is
ON and when power collapse is frequently hit.
In the Linux kernel, the following vulnerability has been resolved:
scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
The following warning was observed running syzkaller:
[ 3813.830724] sg_write: data in/out 65466/242 bytes for SCSI command 0x9e-- guessing data in;
[ 3813.830724] program syz-executor not setting count and/or reply_len properly
[ 3813.836956] ==================================================================
[ 3813.839465] BUG: KASAN: stack-out-of-bounds in sg_copy_buffer+0x157/0x1e0
[ 3813.841773] Read of size 4096 at addr ffff8883cf80f540 by task syz-executor/1549
[ 3813.846612] Call Trace:
[ 3813.846995] dump_stack+0x108/0x15f
[ 3813.847524] print_address_description+0xa5/0x372
[ 3813.848243] kasan_report.cold+0x236/0x2a8
[ 3813.849439] check_memory_region+0x240/0x270
[ 3813.850094] memcpy+0x30/0x80
[ 3813.850553] sg_copy_buffer+0x157/0x1e0
[ 3813.853032] sg_copy_from_buffer+0x13/0x20
[ 3813.853660] fill_from_dev_buffer+0x135/0x370
[ 3813.854329] resp_readcap16+0x1ac/0x280
[ 3813.856917] schedule_resp+0x41f/0x1630
[ 3813.858203] scsi_debug_queuecommand+0xb32/0x17e0
[ 3813.862699] scsi_dispatch_cmd+0x330/0x950
[ 3813.863329] scsi_request_fn+0xd8e/0x1710
[ 3813.863946] __blk_run_queue+0x10b/0x230
[ 3813.864544] blk_execute_rq_nowait+0x1d8/0x400
[ 3813.865220] sg_common_write.isra.0+0xe61/0x2420
[ 3813.871637] sg_write+0x6c8/0xef0
[ 3813.878853] __vfs_write+0xe4/0x800
[ 3813.883487] vfs_write+0x17b/0x530
[ 3813.884008] ksys_write+0x103/0x270
[ 3813.886268] __x64_sys_write+0x77/0xc0
[ 3813.886841] do_syscall_64+0x106/0x360
[ 3813.887415] entry_SYSCALL_64_after_hwframe+0x44/0xa9
This issue can be reproduced with the following syzkaller log:
r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x26e1, 0x0)
r1 = syz_open_procfs(0xffffffffffffffff, &(0x7f0000000000)='fd/3\x00')
open_by_handle_at(r1, &(0x7f00000003c0)=ANY=[@ANYRESHEX], 0x602000)
r2 = syz_open_dev$sg(&(0x7f0000000000), 0x0, 0x40782)
write$binfmt_aout(r2, &(0x7f0000000340)=ANY=[@ANYBLOB="00000000deff000000000000000000000000000000000000000000000000000047f007af9e107a41ec395f1bded7be24277a1501ff6196a83366f4e6362bc0ff2b247f68a972989b094b2da4fb3607fcf611a22dd04310d28c75039d"], 0x126)
In resp_readcap16() we get "int alloc_len" value -1104926854, and then pass
the huge arr_len to fill_from_dev_buffer(), but arr is only 32 bytes. This
leads to OOB in sg_copy_buffer().
To solve this issue, define alloc_len as u32.
In the Linux kernel, the following vulnerability has been resolved:
scsi: pm80xx: Fix memory leak during rmmod
Driver failed to release all memory allocated. This would lead to memory
leak during driver removal.
Properly free memory when the module is removed.
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: validate payload size in ipc response
If installing malicious ksmbd-tools, ksmbd.mountd can return invalid ipc
response to ksmbd kernel server. ksmbd should validate payload size of
ipc response from ksmbd.mountd to avoid memory overrun or
slab-out-of-bounds. This patch validate 3 ipc response that has payload.
In the Linux kernel, the following vulnerability has been resolved:
btrfs: dev-replace: properly validate device names
There's a syzbot report that device name buffers passed to device
replace are not properly checked for string termination which could lead
to a read out of bounds in getname_kernel().
Add a helper that validates both source and target device name buffers.
For devid as the source initialize the buffer to empty string in case
something tries to read it later.
This was originally analyzed and fixed in a different way by Edward Adam
Davis (see links).
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal()
Places the logic for checking if the group's block bitmap is corrupt under
the protection of the group lock to avoid allocating blocks from the group
with a corrupted block bitmap.
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found()
Determine if the group block bitmap is corrupted before using ac_b_ex in
ext4_mb_try_best_found() to avoid allocating blocks from a group with a
corrupted block bitmap in the following concurrency and making the
situation worse.
ext4_mb_regular_allocator
ext4_lock_group(sb, group)
ext4_mb_good_group
// check if the group bbitmap is corrupted
ext4_mb_complex_scan_group
// Scan group gets ac_b_ex but doesn't use it
ext4_unlock_group(sb, group)
ext4_mark_group_bitmap_corrupted(group)
// The block bitmap was corrupted during
// the group unlock gap.
ext4_mb_try_best_found
ext4_lock_group(ac->ac_sb, group)
ext4_mb_use_best_found
mb_mark_used
// Allocating blocks in block bitmap corrupted group
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid dividing by 0 in mb_update_avg_fragment_size() when block bitmap corrupt
Determine if bb_fragments is 0 instead of determining bb_free to eliminate
the risk of dividing by zero when the block bitmap is corrupted.
In the Linux kernel, the following vulnerability has been resolved:
aoe: avoid potential deadlock at set_capacity
Move set_capacity() outside of the section procected by (&d->lock).
To avoid possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
[1] lock(&bdev->bd_size_lock);
local_irq_disable();
[2] lock(&d->lock);
[3] lock(&bdev->bd_size_lock);
<Interrupt>
[4] lock(&d->lock);
*** DEADLOCK ***
Where [1](&bdev->bd_size_lock) hold by zram_add()->set_capacity().
[2]lock(&d->lock) hold by aoeblk_gdalloc(). And aoeblk_gdalloc()
is trying to acquire [3](&bdev->bd_size_lock) at set_capacity() call.
In this situation an attempt to acquire [4]lock(&d->lock) from
aoecmd_cfg_rsp() will lead to deadlock.
So the simplest solution is breaking lock dependency
[2](&d->lock) -> [3](&bdev->bd_size_lock) by moving set_capacity()
outside.