Security Vulnerabilities
- CVEs Published In August 2024
In the Linux kernel, the following vulnerability has been resolved:
iommu/arm-smmu-v3: Don't unregister on shutdown
Similar to SMMUv2, this driver calls iommu_device_unregister() from the
shutdown path, which removes the IOMMU groups with no coordination
whatsoever with their users - shutdown methods are optional in device
drivers. This can lead to NULL pointer dereferences in those drivers'
DMA API calls, or worse.
Instead of calling the full arm_smmu_device_remove() from
arm_smmu_device_shutdown(), let's pick only the relevant function call -
arm_smmu_device_disable() - more or less the reverse of
arm_smmu_device_reset() - and call just that from the shutdown path.
In the Linux kernel, the following vulnerability has been resolved:
iommu/arm-smmu: Don't unregister on shutdown
Michael Walle says he noticed the following stack trace while performing
a shutdown with "reboot -f". He suggests he got "lucky" and just hit the
correct spot for the reboot while there was a packet transmission in
flight.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 6.1.0-rc5-00088-gf3600ff8e322 #1930
Hardware name: Kontron KBox A-230-LS (DT)
pc : iommu_get_dma_domain+0x14/0x20
lr : iommu_dma_map_page+0x9c/0x254
Call trace:
iommu_get_dma_domain+0x14/0x20
dma_map_page_attrs+0x1ec/0x250
enetc_start_xmit+0x14c/0x10b0
enetc_xmit+0x60/0xdc
dev_hard_start_xmit+0xb8/0x210
sch_direct_xmit+0x11c/0x420
__dev_queue_xmit+0x354/0xb20
ip6_finish_output2+0x280/0x5b0
__ip6_finish_output+0x15c/0x270
ip6_output+0x78/0x15c
NF_HOOK.constprop.0+0x50/0xd0
mld_sendpack+0x1bc/0x320
mld_ifc_work+0x1d8/0x4dc
process_one_work+0x1e8/0x460
worker_thread+0x178/0x534
kthread+0xe0/0xe4
ret_from_fork+0x10/0x20
Code: d503201f f9416800 d503233f d50323bf (f9404c00)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
This appears to be reproducible when the board has a fixed IP address,
is ping flooded from another host, and "reboot -f" is used.
The following is one more manifestation of the issue:
$ reboot -f
kvm: exiting hardware virtualization
cfg80211: failed to load regulatory.db
arm-smmu 5000000.iommu: disabling translation
sdhci-esdhc 2140000.mmc: Removing from iommu group 11
sdhci-esdhc 2150000.mmc: Removing from iommu group 12
fsl-edma 22c0000.dma-controller: Removing from iommu group 17
dwc3 3100000.usb: Removing from iommu group 9
dwc3 3110000.usb: Removing from iommu group 10
ahci-qoriq 3200000.sata: Removing from iommu group 2
fsl-qdma 8380000.dma-controller: Removing from iommu group 20
platform f080000.display: Removing from iommu group 0
etnaviv-gpu f0c0000.gpu: Removing from iommu group 1
etnaviv etnaviv: Removing from iommu group 1
caam_jr 8010000.jr: Removing from iommu group 13
caam_jr 8020000.jr: Removing from iommu group 14
caam_jr 8030000.jr: Removing from iommu group 15
caam_jr 8040000.jr: Removing from iommu group 16
fsl_enetc 0000:00:00.0: Removing from iommu group 4
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
fsl_enetc 0000:00:00.1: Removing from iommu group 5
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
fsl_enetc 0000:00:00.2: Removing from iommu group 6
fsl_enetc_mdio 0000:00:00.3: Removing from iommu group 8
mscc_felix 0000:00:00.5: Removing from iommu group 3
fsl_enetc 0000:00:00.6: Removing from iommu group 7
pcieport 0001:00:00.0: Removing from iommu group 18
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu: GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
pcieport 0002:00:00.0: Removing from iommu group 19
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8
pc : iommu_get_dma_domain+0x14/0x20
lr : iommu_dma_unmap_page+0x38/0xe0
Call trace:
iommu_get_dma_domain+0x14/0x20
dma_unmap_page_attrs+0x38/0x1d0
en
---truncated---
In the Linux kernel, the following vulnerability has been resolved:
ixgbe: fix pci device refcount leak
As the comment of pci_get_domain_bus_and_slot() says, it
returns a PCI device with refcount incremented, when finish
using it, the caller must decrement the reference count by
calling pci_dev_put().
In ixgbe_get_first_secondary_devfn() and ixgbe_x550em_a_has_mii(),
pci_dev_put() is called to avoid leak.
In the Linux kernel, the following vulnerability has been resolved:
arm64/mm: fix incorrect file_map_count for invalid pmd
The page table check trigger BUG_ON() unexpectedly when split hugepage:
------------[ cut here ]------------
kernel BUG at mm/page_table_check.c:119!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 7 PID: 210 Comm: transhuge-stres Not tainted 6.1.0-rc3+ #748
Hardware name: linux,dummy-virt (DT)
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : page_table_check_set.isra.0+0x398/0x468
lr : page_table_check_set.isra.0+0x1c0/0x468
[...]
Call trace:
page_table_check_set.isra.0+0x398/0x468
__page_table_check_pte_set+0x160/0x1c0
__split_huge_pmd_locked+0x900/0x1648
__split_huge_pmd+0x28c/0x3b8
unmap_page_range+0x428/0x858
unmap_single_vma+0xf4/0x1c8
zap_page_range+0x2b0/0x410
madvise_vma_behavior+0xc44/0xe78
do_madvise+0x280/0x698
__arm64_sys_madvise+0x90/0xe8
invoke_syscall.constprop.0+0xdc/0x1d8
do_el0_svc+0xf4/0x3f8
el0_svc+0x58/0x120
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x19c/0x1a0
[...]
On arm64, pmd_leaf() will return true even if the pmd is invalid due to
pmd_present_invalid() check. So in pmdp_invalidate() the file_map_count
will not only decrease once but also increase once. Then in set_pte_at(),
the file_map_count increase again, and so trigger BUG_ON() unexpectedly.
Add !pmd_present_invalid() check in pmd_user_accessible_page() to fix the
problem.
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer
There are 3 possible interrupt sources are handled by DP controller,
HPDstatus, Controller state changes and Aux read/write transaction.
At every irq, DP controller have to check isr status of every interrupt
sources and service the interrupt if its isr status bits shows interrupts
are pending. There is potential race condition may happen at current aux
isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
even irq is not for aux read or write transaction. This may cause aux read
transaction return premature if host aux data read is in the middle of
waiting for sink to complete transferring data to host while irq happen.
This will cause host's receiving buffer contains unexpected data. This
patch fixes this problem by checking aux isr and return immediately at
aux isr handler if there are no any isr status bits set.
Current there is a bug report regrading eDP edid corruption happen during
system booting up. After lengthy debugging to found that VIDEO_READY
interrupt was continuously firing during system booting up which cause
dp_aux_isr() to complete dp_aux_cmd_fifo_tx() prematurely to retrieve data
from aux hardware buffer which is not yet contains complete data transfer
from sink. This cause edid corruption.
Follows are the signature at kernel logs when problem happen,
EDID has corrupt header
panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
Changes in v2:
-- do complete if (ret == IRQ_HANDLED) ay dp-aux_isr()
-- add more commit text
Changes in v3:
-- add Stephen suggested
-- dp_aux_isr() return IRQ_XXX back to caller
-- dp_ctrl_isr() return IRQ_XXX back to caller
Changes in v4:
-- split into two patches
Changes in v5:
-- delete empty line between tags
Changes in v6:
-- remove extra "that" and fixed line more than 75 char at commit text
Patchwork: https://patchwork.freedesktop.org/patch/516121/
In the Linux kernel, the following vulnerability has been resolved:
drm/virtio: Fix GEM handle creation UAF
Userspace can guess the handle value and try to race GEM object creation
with handle close, resulting in a use-after-free if we dereference the
object after dropping the handle's reference. For that reason, dropping
the handle's reference must be done *after* we are done dereferencing
the object.
In the Linux kernel, the following vulnerability has been resolved:
gsmi: fix null-deref in gsmi_get_variable
We can get EFI variables without fetching the attribute, so we must
allow for that in gsmi.
commit 859748255b43 ("efi: pstore: Omit efivars caching EFI varstore
access layer") added a new get_variable call with attr=NULL, which
triggers panic in gsmi.
In the Linux kernel, the following vulnerability has been resolved:
usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate()
In Google internal bug 265639009 we've received an (as yet) unreproducible
crash report from an aarch64 GKI 5.10.149-android13 running device.
AFAICT the source code is at:
https://android.googlesource.com/kernel/common/+/refs/tags/ASB-2022-12-05_13-5.10
The call stack is:
ncm_close() -> ncm_notify() -> ncm_do_notify()
with the crash at:
ncm_do_notify+0x98/0x270
Code: 79000d0b b9000a6c f940012a f9400269 (b9405d4b)
Which I believe disassembles to (I don't know ARM assembly, but it looks sane enough to me...):
// halfword (16-bit) store presumably to event->wLength (at offset 6 of struct usb_cdc_notification)
0B 0D 00 79 strh w11, [x8, #6]
// word (32-bit) store presumably to req->Length (at offset 8 of struct usb_request)
6C 0A 00 B9 str w12, [x19, #8]
// x10 (NULL) was read here from offset 0 of valid pointer x9
// IMHO we're reading 'cdev->gadget' and getting NULL
// gadget is indeed at offset 0 of struct usb_composite_dev
2A 01 40 F9 ldr x10, [x9]
// loading req->buf pointer, which is at offset 0 of struct usb_request
69 02 40 F9 ldr x9, [x19]
// x10 is null, crash, appears to be attempt to read cdev->gadget->max_speed
4B 5D 40 B9 ldr w11, [x10, #0x5c]
which seems to line up with ncm_do_notify() case NCM_NOTIFY_SPEED code fragment:
event->wLength = cpu_to_le16(8);
req->length = NCM_STATUS_BYTECOUNT;
/* SPEED_CHANGE data is up/down speeds in bits/sec */
data = req->buf + sizeof *event;
data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));
My analysis of registers and NULL ptr deref crash offset
(Unable to handle kernel NULL pointer dereference at virtual address 000000000000005c)
heavily suggests that the crash is due to 'cdev->gadget' being NULL when executing:
data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));
which calls:
ncm_bitrate(NULL)
which then calls:
gadget_is_superspeed(NULL)
which reads
((struct usb_gadget *)NULL)->max_speed
and hits a panic.
AFAICT, if I'm counting right, the offset of max_speed is indeed 0x5C.
(remember there's a GKI KABI reservation of 16 bytes in struct work_struct)
It's not at all clear to me how this is all supposed to work...
but returning 0 seems much better than panic-ing...
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: idxd: Let probe fail when workqueue cannot be enabled
The workqueue is enabled when the appropriate driver is loaded and
disabled when the driver is removed. When the driver is removed it
assumes that the workqueue was enabled successfully and proceeds to
free allocations made during workqueue enabling.
Failure during workqueue enabling does not prevent the driver from
being loaded. This is because the error path within drv_enable_wq()
returns success unless a second failure is encountered
during the error path. By returning success it is possible to load
the driver even if the workqueue cannot be enabled and
allocations that do not exist are attempted to be freed during
driver remove.
Some examples of problematic flows:
(a)
idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
In above flow, if idxd_wq_request_irq() fails then
idxd_wq_unmap_portal() is called on error exit path, but
drv_enable_wq() returns 0 because idxd_wq_disable() succeeds. The
driver is thus loaded successfully.
idxd_dmaengine_drv_remove()->drv_disable_wq()->idxd_wq_unmap_portal()
Above flow on driver unload triggers the WARN in devm_iounmap() because
the device resource has already been removed during error path of
drv_enable_wq().
(b)
idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
In above flow, if idxd_wq_request_irq() fails then
idxd_wq_init_percpu_ref() is never called to initialize the percpu
counter, yet the driver loads successfully because drv_enable_wq()
returns 0.
idxd_dmaengine_drv_remove()->__idxd_wq_quiesce()->percpu_ref_kill():
Above flow on driver unload triggers a BUG when attempting to drop the
initial ref of the uninitialized percpu ref:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Fix the drv_enable_wq() error path by returning the original error that
indicates failure of workqueue enabling. This ensures that the probe
fails when an error is encountered and the driver remove paths are only
attempted when the workqueue was enabled successfully.
In the Linux kernel, the following vulnerability has been resolved:
USB: gadgetfs: Fix race between mounting and unmounting
The syzbot fuzzer and Gerald Lee have identified a use-after-free bug
in the gadgetfs driver, involving processes concurrently mounting and
unmounting the gadgetfs filesystem. In particular, gadgetfs_fill_super()
can race with gadgetfs_kill_sb(), causing the latter to deallocate
the_device while the former is using it. The output from KASAN says,
in part:
BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
BUG: KASAN: use-after-free in atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
BUG: KASAN: use-after-free in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: use-after-free in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: use-after-free in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: use-after-free in put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
BUG: KASAN: use-after-free in gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
Write of size 4 at addr ffff8880276d7840 by task syz-executor126/18689
CPU: 0 PID: 18689 Comm: syz-executor126 Not tainted 6.1.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
...
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
__refcount_sub_and_test include/linux/refcount.h:272 [inline]
__refcount_dec_and_test include/linux/refcount.h:315 [inline]
refcount_dec_and_test include/linux/refcount.h:333 [inline]
put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
vfs_get_super fs/super.c:1190 [inline]
get_tree_single+0xd0/0x160 fs/super.c:1207
vfs_get_tree+0x88/0x270 fs/super.c:1531
vfs_fsconfig_locked fs/fsopen.c:232 [inline]
The simplest solution is to ensure that gadgetfs_fill_super() and
gadgetfs_kill_sb() are serialized by making them both acquire a new
mutex.