In the Linux kernel, the following vulnerability has been resolved:
mm/hugetlb: unshare page tables during VMA split, not before
Currently, __split_vma() triggers hugetlb page table unsharing through
vm_ops->may_split(). This happens before the VMA lock and rmap locks are
taken - which is too early, it allows racing VMA-locked page faults in our
process and racing rmap walks from other processes to cause page tables to
be shared again before we actually perform the split.
Fix it by explicitly calling into the hugetlb unshare logic from
__split_vma() in the same place where THP splitting also happens. At that
point, both the VMA and the rmap(s) are write-locked.
An annoying detail is that we can now call into the helper
hugetlb_unshare_pmds() from two different locking contexts:
1. from hugetlb_split(), holding:
- mmap lock (exclusively)
- VMA lock
- file rmap lock (exclusively)
2. hugetlb_unshare_all_pmds(), which I think is designed to be able to
call us with only the mmap lock held (in shared mode), but currently
only runs while holding mmap lock (exclusively) and VMA lock
Backporting note:
This commit fixes a racy protection that was introduced in commit
b30c14cd6102 ("hugetlb: unshare some PMDs when splitting VMAs"); that
commit claimed to fix an issue introduced in 5.13, but it should actually
also go all the way back.
[jannh@google.com: v2]
In the Linux kernel, the following vulnerability has been resolved:
ALSA: bcd2000: Fix a UAF bug on the error path of probing
When the driver fails in snd_card_register() at probe time, it will free
the 'bcd2k->midi_out_urb' before killing it, which may cause a UAF bug.
The following log can reveal it:
[ 50.727020] BUG: KASAN: use-after-free in bcd2000_input_complete+0x1f1/0x2e0 [snd_bcd2000]
[ 50.727623] Read of size 8 at addr ffff88810fab0e88 by task swapper/4/0
[ 50.729530] Call Trace:
[ 50.732899] bcd2000_input_complete+0x1f1/0x2e0 [snd_bcd2000]
Fix this by adding usb_kill_urb() before usb_free_urb().
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: do not allow SET_ID to refer to another table
When doing lookups for sets on the same batch by using its ID, a set from a
different table can be used.
Then, when the table is removed, a reference to the set may be kept after
the set is freed, leading to a potential use-after-free.
When looking for sets by ID, use the table that was used for the lookup by
name, and only return sets belonging to that same table.
This fixes CVE-2022-2586, also reported as ZDI-CAN-17470.
In the Linux kernel, the following vulnerability has been resolved:
scsi: sg: Allow waiting for commands to complete on removed device
When a SCSI device is removed while in active use, currently sg will
immediately return -ENODEV on any attempt to wait for active commands that
were sent before the removal. This is problematic for commands that use
SG_FLAG_DIRECT_IO since the data buffer may still be in use by the kernel
when userspace frees or reuses it after getting ENODEV, leading to
corrupted userspace memory (in the case of READ-type commands) or corrupted
data being sent to the device (in the case of WRITE-type commands). This
has been seen in practice when logging out of a iscsi_tcp session, where
the iSCSI driver may still be processing commands after the device has been
marked for removal.
Change the policy to allow userspace to wait for active sg commands even
when the device is being removed. Return -ENODEV only when there are no
more responses to read.
In the Linux kernel, the following vulnerability has been resolved:
usbnet: Fix linkwatch use-after-free on disconnect
usbnet uses the work usbnet_deferred_kevent() to perform tasks which may
sleep. On disconnect, completion of the work was originally awaited in
->ndo_stop(). But in 2003, that was moved to ->disconnect() by historic
commit "[PATCH] USB: usbnet, prevent exotic rtnl deadlock":
https://git.kernel.org/tglx/history/c/0f138bbfd83c
The change was made because back then, the kernel's workqueue
implementation did not allow waiting for a single work. One had to wait
for completion of *all* work by calling flush_scheduled_work(), and that
could deadlock when waiting for usbnet_deferred_kevent() with rtnl_mutex
held in ->ndo_stop().
The commit solved one problem but created another: It causes a
use-after-free in USB Ethernet drivers aqc111.c, asix_devices.c,
ax88179_178a.c, ch9200.c and smsc75xx.c:
* If the drivers receive a link change interrupt immediately before
disconnect, they raise EVENT_LINK_RESET in their (non-sleepable)
->status() callback and schedule usbnet_deferred_kevent().
* usbnet_deferred_kevent() invokes the driver's ->link_reset() callback,
which calls netif_carrier_{on,off}().
* That in turn schedules the work linkwatch_event().
Because usbnet_deferred_kevent() is awaited after unregister_netdev(),
netif_carrier_{on,off}() may operate on an unregistered netdev and
linkwatch_event() may run after free_netdev(), causing a use-after-free.
In 2010, usbnet was changed to only wait for a single instance of
usbnet_deferred_kevent() instead of *all* work by commit 23f333a2bfaf
("drivers/net: don't use flush_scheduled_work()").
Unfortunately the commit neglected to move the wait back to
->ndo_stop(). Rectify that omission at long last.
In the Linux kernel, the following vulnerability has been resolved:
ARM: OMAP2+: display: Fix refcount leak bug
In omapdss_init_fbdev(), of_find_node_by_name() will return a node
pointer with refcount incremented. We should use of_node_put() when
it is not used anymore.
In the Linux kernel, the following vulnerability has been resolved:
ARM: OMAP2+: pdata-quirks: Fix refcount leak bug
In pdata_quirks_init_clocks(), the loop contains
of_find_node_by_name() but without corresponding of_node_put().
In the Linux kernel, the following vulnerability has been resolved:
ext2: Add more validity checks for inode counts
Add checks verifying number of inodes stored in the superblock matches
the number computed from number of inodes per group. Also verify we have
at least one block worth of inodes per group. This prevents crashes on
corrupted filesystems.
In the Linux kernel, the following vulnerability has been resolved:
arm64: fix oops in concurrently setting insn_emulation sysctls
emulation_proc_handler() changes table->data for proc_dointvec_minmax
and can generate the following Oops if called concurrently with itself:
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
| Internal error: Oops: 96000006 [#1] SMP
| Call trace:
| update_insn_emulation_mode+0xc0/0x148
| emulation_proc_handler+0x64/0xb8
| proc_sys_call_handler+0x9c/0xf8
| proc_sys_write+0x18/0x20
| __vfs_write+0x20/0x48
| vfs_write+0xe4/0x1d0
| ksys_write+0x70/0xf8
| __arm64_sys_write+0x20/0x28
| el0_svc_common.constprop.0+0x7c/0x1c0
| el0_svc_handler+0x2c/0xa0
| el0_svc+0x8/0x200
To fix this issue, keep the table->data as &insn->current_mode and
use container_of() to retrieve the insn pointer. Another mutex is
used to protect against the current_mode update but not for retrieving
insn_emulation as table->data is no longer changing.