In the Linux kernel, the following vulnerability has been resolved:
qed: allow sleep in qed_mcp_trace_dump()
By default, qed_mcp_cmd_and_union() delays 10us at a time in a loop
that can run 500K times, so calls to qed_mcp_nvm_rd_cmd()
may block the current thread for over 5s.
We observed thread scheduling delays over 700ms in production,
with stacktraces pointing to this code as the culprit.
qed_mcp_trace_dump() is called from ethtool, so sleeping is permitted.
It already can sleep in qed_mcp_halt(), which calls qed_mcp_cmd().
Add a "can sleep" parameter to qed_find_nvram_image() and
qed_nvram_read() so they can sleep during qed_mcp_trace_dump().
qed_mcp_trace_get_meta_info() and qed_mcp_trace_read_meta(),
called only by qed_mcp_trace_dump(), allow these functions to sleep.
I can't tell if the other caller (qed_grc_dump_mcp_hw_dump()) can sleep,
so keep b_can_sleep set to false when it calls these functions.
An example stacktrace from a custom warning we added to the kernel
showing a thread that has not scheduled despite long needing resched:
[ 2745.362925,17] ------------[ cut here ]------------
[ 2745.362941,17] WARNING: CPU: 23 PID: 5640 at arch/x86/kernel/irq.c:233 do_IRQ+0x15e/0x1a0()
[ 2745.362946,17] Thread not rescheduled for 744 ms after irq 99
[ 2745.362956,17] Modules linked in: ...
[ 2745.363339,17] CPU: 23 PID: 5640 Comm: lldpd Tainted: P O 4.4.182+ #202104120910+6d1da174272d.61x
[ 2745.363343,17] Hardware name: FOXCONN MercuryB/Quicksilver Controller, BIOS H11P1N09 07/08/2020
[ 2745.363346,17] 0000000000000000 ffff885ec07c3ed8 ffffffff8131eb2f ffff885ec07c3f20
[ 2745.363358,17] ffffffff81d14f64 ffff885ec07c3f10 ffffffff81072ac2 ffff88be98ed0000
[ 2745.363369,17] 0000000000000063 0000000000000174 0000000000000074 0000000000000000
[ 2745.363379,17] Call Trace:
[ 2745.363382,17] <IRQ> [<ffffffff8131eb2f>] dump_stack+0x8e/0xcf
[ 2745.363393,17] [<ffffffff81072ac2>] warn_slowpath_common+0x82/0xc0
[ 2745.363398,17] [<ffffffff81072b4c>] warn_slowpath_fmt+0x4c/0x50
[ 2745.363404,17] [<ffffffff810d5a8e>] ? rcu_irq_exit+0xae/0xc0
[ 2745.363408,17] [<ffffffff817c99fe>] do_IRQ+0x15e/0x1a0
[ 2745.363413,17] [<ffffffff817c7ac9>] common_interrupt+0x89/0x89
[ 2745.363416,17] <EOI> [<ffffffff8132aa74>] ? delay_tsc+0x24/0x50
[ 2745.363425,17] [<ffffffff8132aa04>] __udelay+0x34/0x40
[ 2745.363457,17] [<ffffffffa04d45ff>] qed_mcp_cmd_and_union+0x36f/0x7d0 [qed]
[ 2745.363473,17] [<ffffffffa04d5ced>] qed_mcp_nvm_rd_cmd+0x4d/0x90 [qed]
[ 2745.363490,17] [<ffffffffa04e1dc7>] qed_mcp_trace_dump+0x4a7/0x630 [qed]
[ 2745.363504,17] [<ffffffffa04e2556>] ? qed_fw_asserts_dump+0x1d6/0x1f0 [qed]
[ 2745.363520,17] [<ffffffffa04e4ea7>] qed_dbg_mcp_trace_get_dump_buf_size+0x37/0x80 [qed]
[ 2745.363536,17] [<ffffffffa04ea881>] qed_dbg_feature_size+0x61/0xa0 [qed]
[ 2745.363551,17] [<ffffffffa04eb427>] qed_dbg_all_data_size+0x247/0x260 [qed]
[ 2745.363560,17] [<ffffffffa0482c10>] qede_get_regs_len+0x30/0x40 [qede]
[ 2745.363566,17] [<ffffffff816c9783>] ethtool_get_drvinfo+0xe3/0x190
[ 2745.363570,17] [<ffffffff816cc152>] dev_ethtool+0x1362/0x2140
[ 2745.363575,17] [<ffffffff8109bcc6>] ? finish_task_switch+0x76/0x260
[ 2745.363580,17] [<ffffffff817c2116>] ? __schedule+0x3c6/0x9d0
[ 2745.363585,17] [<ffffffff810dbd50>] ? hrtimer_start_range_ns+0x1d0/0x370
[ 2745.363589,17] [<ffffffff816c1e5b>] ? dev_get_by_name_rcu+0x6b/0x90
[ 2745.363594,17] [<ffffffff816de6a8>] dev_ioctl+0xe8/0x710
[ 2745.363599,17] [<ffffffff816a58a8>] sock_do_ioctl+0x48/0x60
[ 2745.363603,17] [<ffffffff816a5d87>] sock_ioctl+0x1c7/0x280
[ 2745.363608,17] [<ffffffff8111f393>] ? seccomp_phase1+0x83/0x220
[ 2745.363612,17] [<ffffffff811e3503>] do_vfs_ioctl+0x2b3/0x4e0
[ 2745.363616,17] [<ffffffff811e3771>] SyS_ioctl+0x41/0x70
[ 2745.363619,17] [<ffffffff817c6ffe>] entry_SYSCALL_64_fastpath+0x1e/0x79
[ 2745.363622,17] ---[ end trace f6954aa440266421 ]---
In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Fix handling of lrbp->cmd
ufshcd_queuecommand() may be called two times in a row for a SCSI command
before it is completed. Hence make the following changes:
- In the functions that submit a command, do not check the old value of
lrbp->cmd nor clear lrbp->cmd in error paths.
- In ufshcd_release_scsi_cmd(), do not clear lrbp->cmd.
See also scsi_send_eh_cmnd().
This commit prevents that the following appears if a command times out:
WARNING: at drivers/ufs/core/ufshcd.c:2965 ufshcd_queuecommand+0x6f8/0x9a8
Call trace:
ufshcd_queuecommand+0x6f8/0x9a8
scsi_send_eh_cmnd+0x2c0/0x960
scsi_eh_test_devices+0x100/0x314
scsi_eh_ready_devs+0xd90/0x114c
scsi_error_handler+0x2b4/0xb70
kthread+0x16c/0x1e0
In the Linux kernel, the following vulnerability has been resolved:
virtio_net: Fix error unwinding of XDP initialization
When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error causing net device open failed. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. Need to roll back the previous rq initialization
to avoid leaks in error unwinding of init code.
Also extract helper functions of disable and enable queue pairs.
Use newly introduced disable helper function in error unwinding and
virtnet_close. Use enable helper function in virtnet_open.
In the Linux kernel, the following vulnerability has been resolved:
ext4: allow ext4_get_group_info() to fail
Previously, ext4_get_group_info() would treat an invalid group number
as BUG(), since in theory it should never happen. However, if a
malicious attaker (or fuzzer) modifies the superblock via the block
device while it is the file system is mounted, it is possible for
s_first_data_block to get set to a very large number. In that case,
when calculating the block group of some block number (such as the
starting block of a preallocation region), could result in an
underflow and very large block group number. Then the BUG_ON check in
ext4_get_group_info() would fire, resutling in a denial of service
attack that can be triggered by root or someone with write access to
the block device.
For a quality of implementation perspective, it's best that even if
the system administrator does something that they shouldn't, that it
will not trigger a BUG. So instead of BUG'ing, ext4_get_group_info()
will call ext4_error and return NULL. We also add fallback code in
all of the callers of ext4_get_group_info() that it might NULL.
Also, since ext4_get_group_info() was already borderline to be an
inline function, un-inline it. The results in a next reduction of the
compiled text size of ext4 by roughly 2k.
In the Linux kernel, the following vulnerability has been resolved:
start_kernel: Add __no_stack_protector function attribute
Back during the discussion of
commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
we discussed the need for a function attribute to control the omission
of stack protectors on a per-function basis; at the time Clang had
support for no_stack_protector but GCC did not. This was fixed in
gcc-11. Now that the function attribute is available, let's start using
it.
Callers of boot_init_stack_canary need to use this function attribute
unless they're compiled with -fno-stack-protector, otherwise the canary
stored in the stack slot of the caller will differ upon the call to
boot_init_stack_canary. This will lead to a call to __stack_chk_fail()
then panic.
In the Linux kernel, the following vulnerability has been resolved:
ipv6: Add lwtunnel encap size of all siblings in nexthop calculation
In function rt6_nlmsg_size(), the length of nexthop is calculated
by multipling the nexthop length of fib6_info and the number of
siblings. However if the fib6_info has no lwtunnel but the siblings
have lwtunnels, the nexthop length is less than it should be, and
it will trigger a warning in inet6_rt_notify() as follows:
WARNING: CPU: 0 PID: 6082 at net/ipv6/route.c:6180 inet6_rt_notify+0x120/0x130
......
Call Trace:
<TASK>
fib6_add_rt2node+0x685/0xa30
fib6_add+0x96/0x1b0
ip6_route_add+0x50/0xd0
inet6_rtm_newroute+0x97/0xa0
rtnetlink_rcv_msg+0x156/0x3d0
netlink_rcv_skb+0x5a/0x110
netlink_unicast+0x246/0x350
netlink_sendmsg+0x250/0x4c0
sock_sendmsg+0x66/0x70
___sys_sendmsg+0x7c/0xd0
__sys_sendmsg+0x5d/0xb0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
This bug can be reproduced by script:
ip -6 addr add 2002::2/64 dev ens2
ip -6 route add 100::/64 via 2002::1 dev ens2 metric 100
for i in 10 20 30 40 50 60 70;
do
ip link add link ens2 name ipv_$i type ipvlan
ip -6 addr add 2002::$i/64 dev ipv_$i
ifconfig ipv_$i up
done
for i in 10 20 30 40 50 60;
do
ip -6 route append 100::/64 encap ip6 dst 2002::$i via 2002::1
dev ipv_$i metric 100
done
ip -6 route append 100::/64 via 2002::1 dev ipv_70 metric 100
This patch fixes it by adding nexthop_len of every siblings using
rt6_nh_nlmsg_size().
In the Linux kernel, the following vulnerability has been resolved:
iommu: Fix error unwind in iommu_group_alloc()
If either iommu_group_grate_file() fails then the
iommu_group is leaked.
Destroy it on these error paths.
Found by kselftest/iommu/iommufd_fail_nth
In the Linux kernel, the following vulnerability has been resolved:
iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()
This condition needs to match the previous "if (epcp->state == LISTEN) {"
exactly to avoid a NULL dereference of either "listen_ep" or "ep". The
problem is that "epcp" has been re-assigned so just testing
"if (epcp->state == LISTEN) {" a second time is not sufficient.