In the Linux kernel, the following vulnerability has been resolved:
x86/kexec: Disable KCOV instrumentation after load_segments()
The load_segments() function changes segment registers, invalidating GS base
(which KCOV relies on for per-cpu data). When CONFIG_KCOV is enabled, any
subsequent instrumented C code call (e.g. native_gdt_invalidate()) begins
crashing the kernel in an endless loop.
To reproduce the problem, it's sufficient to do kexec on a KCOV-instrumented
kernel:
  $ kexec -l /boot/otherKernel
  $ kexec -e
The real-world context for this problem is enabling crash dump collection in
syzkaller. For this, the tool loads a panic kernel before fuzzing and then
calls makedumpfile after the panic. This workflow requires both CONFIG_KEXEC
and CONFIG_KCOV to be enabled simultaneously.
Adding safeguards directly to the KCOV fast path (__sanitizer_cov_trace_pc())
is undesirable, as it would introduce extra performance overhead.
Disabling instrumentation for the individual functions would be too fragile,
so disable KCOV instrumentation for the entire machine_kexec_64.c and
physaddr.c. If coverage-guided fuzzing ever needs these components in the
future, other approaches should be considered.
The problem is not relevant for 32-bit kernels, as CONFIG_KCOV is not supported
there.
[ bp: Space out comment for better readability. ]
In the Linux kernel, the following vulnerability has been resolved:
thermal: core: Fix thermal zone device registration error path
If thermal_zone_device_register_with_trips() fails after registering
a thermal zone device, it needs to wait for the tz->removal completion
like thermal_zone_device_unregister(), in case user space has managed
to take a reference to the thermal zone device's kobject, in which case
thermal_release() may not be called by the error path itself and tz may
be freed prematurely.
Add the missing wait_for_completion() call to the thermal zone device
registration error path.
In the Linux kernel, the following vulnerability has been resolved:
USB: dummy-hcd: Fix interrupt synchronization error
This fixes an error in synchronization in the dummy-hcd driver. The
error has a somewhat involved history. The synchronization mechanism
was introduced by commit 7dbd8f4cabd9 ("USB: dummy-hcd: Fix erroneous
synchronization change"), which added an emulated "interrupts enabled"
flag together with code emulating synchronize_irq() (it waits until
all current handler callbacks have returned).
But the emulated interrupt-disable occurred too late, after the driver
containing the handler callback routines had been told that it was
unbound and no more callbacks would occur. Commit 4a5d797a9f9c ("usb:
gadget: dummy_hcd: fix gpf in gadget_setup") tried to fix this by
moving the synchronize_irq() emulation code from dummy_stop() to
dummy_pullup(), which runs before the unbind callback.
There still were races, though, because the emulated interrupt-disable
still occurred too late. It couldn't be moved to dummy_pullup(),
because that routine can be called for reasons other than an impending
unbind. Therefore commits 7dc0c55e9f30 ("USB: UDC core: Add
udc_async_callbacks gadget op") and 04145a03db9d ("USB: UDC: Implement
udc_async_callbacks in dummy-hcd") added an API allowing the UDC core
to tell dummy-hcd exactly when emulated interrupts and their callbacks
should be disabled.
That brings us to the current state of things, which is still wrong
because the emulated synchronize_irq() occurs before the emulated
interrupt-disable! That's no good, because it means that more emulated
interrupts can occur after the synchronize_irq() emulation has run,
leading to the possibility that a callback handler may be running when
the gadget driver is unbound.
To fix this, we have to move the synchronize_irq() emulation code yet
again, to the dummy_udc_async_callbacks() routine, which takes care of
enabling and disabling emulated interrupt requests. The
synchronization will now run immediately after emulated interrupts are
disabled, which is where it belongs.
In the Linux kernel, the following vulnerability has been resolved:
wifi: iwlwifi: mvm: don't send a 6E related command when not supported
MCC_ALLOWED_AP_TYPE_CMD is related to 6E support. Do not send it if the
device doesn't support 6E.
Apparently, the firmware mistakenly advertises support for this
command even on AX201, which does not support 6E, and the firmware
then crashes.
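The gating described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct wifi_dev, its fields, and should_send_allowed_ap_type_cmd() are hypothetical names, not the iwlwifi data structures or API.

```c
#include <stdbool.h>

/* Hypothetical, simplified device description; not the iwlwifi structures. */
struct wifi_dev {
    bool fw_advertises_cmd;  /* firmware claims to support the command */
    bool hw_supports_6e;     /* hardware can actually operate with 6E */
};

/*
 * Send the 6E-related command only when the firmware advertises it AND the
 * device supports 6E. Trusting the firmware flag alone is not enough if the
 * firmware advertises the command by mistake.
 */
static bool should_send_allowed_ap_type_cmd(const struct wifi_dev *dev)
{
    return dev->fw_advertises_cmd && dev->hw_supports_6e;
}
```

A device that advertises the command but lacks 6E support yields false here, so the command is never sent to it.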
In the Linux kernel, the following vulnerability has been resolved:
media: solo6x10: Check for out of bounds chip_id
Clang with CONFIG_UBSAN_SHIFT=y noticed a condition where a signed type
(the literal "1" is an "int") could end up being shifted beyond 32 bits,
so it added instrumentation. Because inlining exposes two is_tw286x()
calls, Clang then decides that the second one must be undefined
behavior and elides the rest of the function[1]. This is a known problem
with Clang that is still being worked on, but it can be avoided entirely
by checking chip_id against the existing max chip ID. With that check in
place, no runtime instrumentation is added at all, since everything is
known to be within bounds.
Additionally use an unsigned value for the shift to remove the
instrumentation even without the explicit bounds checking.
[hverkuil: fix checkpatch warning for is_tw286x]
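The two measures described above (a bounds check on the chip ID and an unsigned shift operand) can be sketched as follows. MAX_CHIPS, struct capture_dev and chip_is_present() are hypothetical names chosen for the illustration; they are not the solo6x10 definitions.

```c
#include <stdbool.h>

#define MAX_CHIPS 4  /* hypothetical upper bound on chip_id */

/* Hypothetical device state: one bit per detected chip. */
struct capture_dev {
    unsigned int chip_mask;
};

static bool chip_is_present(const struct capture_dev *dev, unsigned int chip_id)
{
    /* Reject out-of-range IDs first, so the shift count is provably small. */
    if (chip_id >= MAX_CHIPS)
        return false;

    /* 1U keeps the shift on an unsigned type instead of a signed int. */
    return (dev->chip_mask & (1U << chip_id)) != 0;
}
```

With the range check in front, the compiler can see that the shift count is always below the width of the type.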
In the Linux kernel, the following vulnerability has been resolved:
most: core: fix leak on early registration failure
A recent commit fixed a resource leak on early registration failures but
for some reason left out the first error path which still leaks the
resources associated with the interface.
Also fix up the first error path so that the interface is always
released on errors.
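A minimal sketch of the idea that every failure path, including the earliest one, must release the interface. struct iface, iface_release() and the fail_at test knob are hypothetical and exist only for this illustration; they do not mirror the most core code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical interface object; 'released' records that cleanup ran. */
struct iface {
    bool released;
};

static void iface_release(struct iface *iface)
{
    iface->released = true;
}

/*
 * fail_at is a test knob: 0 = success, 1 = first step fails,
 * 2 = a later step fails.
 */
static int iface_register(struct iface *iface, int fail_at)
{
    int ret;

    ret = (fail_at == 1) ? -ENOMEM : 0;  /* first step, e.g. an ID allocation */
    if (ret)
        goto err_release;                /* the early path must release too */

    ret = (fail_at == 2) ? -EINVAL : 0;  /* later step */
    if (ret)
        goto err_release;

    return 0;

err_release:
    iface_release(iface);
    return ret;
}
```

Routing all failures through one label makes it harder for a single error path to be left out of the cleanup.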
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: fix sync handling in amdgpu_dma_buf_move_notify
Invalidating a dmabuf will impact other users of the shared BO.
In the scenario where process A moves the BO, it needs to inform
process B about the move and process B will need to update its
page table.
The commit fixes a synchronisation bug caused by the use of the
ticket: it made amdgpu_vm_handle_moved behave as if updating
the page table immediately was correct but in this case it's not.
An example is the following scenario, with 2 GPUs and glxgears
running on GPU0 and Xorg running on GPU1, on a system where P2P
PCI isn't supported:
  glxgears:
    export linear buffer from GPU0 and import using GPU1
    submit frame rendering to GPU0
    submit tiled->linear blit
  Xorg:
    copy of linear buffer
The sequence of jobs would be:
  drm_sched_job_run                          # GPU0, frame rendering
  drm_sched_job_queue                        # GPU0, blit
  drm_sched_job_done                         # GPU0, frame rendering
  drm_sched_job_run                          # GPU0, blit
  move linear buffer for GPU1 access         #
  amdgpu_dma_buf_move_notify -> update pt    # GPU0
At this point the blit job on GPU0 is still running and would
likely produce a page fault.
In the Linux kernel, the following vulnerability has been resolved:
spi: spidev: fix lock inversion between spi_lock and buf_lock
The spidev driver previously used two mutexes, spi_lock and buf_lock,
but acquired them in different orders depending on the code path:
  write()/read(): buf_lock -> spi_lock
  ioctl():        spi_lock -> buf_lock
This AB-BA locking pattern triggers lockdep warnings and can
cause real deadlocks:
  WARNING: possible circular locking dependency detected
  spidev_ioctl() -> mutex_lock(&spidev->buf_lock)
  spidev_sync_write() -> mutex_lock(&spidev->spi_lock)
  *** DEADLOCK ***
The issue is reproducible with a simple userspace program that
performs write() and SPI_IOC_WR_MAX_SPEED_HZ ioctl() calls from
separate threads on the same spidev file descriptor.
Fix this by simplifying the locking model and removing the lock
inversion entirely. spidev_sync() no longer performs any locking,
and all callers serialize access using spi_lock.
buf_lock is removed since its functionality is fully covered by
spi_lock, eliminating the possibility of lock ordering issues.
This removes the lock inversion and prevents deadlocks without
changing userspace ABI or behaviour.
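The single-lock model described above can be illustrated with a small userspace sketch using POSIX mutexes. struct dev_model and the dev_*() functions are hypothetical stand-ins, not the spidev code. The point is only that the inner transfer helper takes no lock and every entry point takes the same one, so no ordering between two locks can exist.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a device guarded by exactly one lock. */
struct dev_model {
    pthread_mutex_t lock;    /* the only lock: guards speed_hz, buf and len */
    unsigned int speed_hz;
    unsigned char buf[32];
    size_t len;
};

/* Performs the "transfer". Takes no lock itself: caller must hold dev->lock. */
static size_t dev_sync_locked(struct dev_model *dev)
{
    return dev->len;
}

/* write()-like path: takes the single lock, then calls the lockless helper. */
static size_t dev_write(struct dev_model *dev, const void *data, size_t len)
{
    size_t done;

    if (len > sizeof(dev->buf))
        len = sizeof(dev->buf);

    pthread_mutex_lock(&dev->lock);
    memcpy(dev->buf, data, len);
    dev->len = len;
    done = dev_sync_locked(dev);
    pthread_mutex_unlock(&dev->lock);
    return done;
}

/* ioctl()-like path: takes the same lock, so no AB-BA ordering is possible. */
static void dev_set_speed(struct dev_model *dev, unsigned int hz)
{
    pthread_mutex_lock(&dev->lock);
    dev->speed_hz = hz;
    pthread_mutex_unlock(&dev->lock);
}
```

Both paths may be called concurrently from different threads; with one mutex there is no second lock to acquire in the opposite order.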
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix dsc eDP issue
[why]
A function hook needs to be checked before it is used.
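A minimal sketch of checking an optional function hook before calling it. struct hw_funcs, its set_dsc member, and try_set_dsc() are hypothetical names for this illustration, not the amdgpu display structures, and the assumption that an unimplemented hook is a NULL pointer is made only for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook table; some hardware variants may leave hooks unset. */
struct hw_funcs {
    void (*set_dsc)(bool enable, bool *state);  /* may be NULL */
};

/* Returns true if the hook existed and was called, false if it was skipped. */
static bool try_set_dsc(const struct hw_funcs *funcs, bool enable, bool *state)
{
    if (!funcs || !funcs->set_dsc)
        return false;  /* hook not implemented: skip instead of dereferencing */

    funcs->set_dsc(enable, state);
    return true;
}

/* Example hook implementation used to exercise the wrapper. */
static void demo_set_dsc(bool enable, bool *state)
{
    *state = enable;
}
```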
In the Linux kernel, the following vulnerability has been resolved:
bpf: Properly mark live registers for indirect jumps
For a `gotox rX` instruction the rX register should be marked as used
in the compute_insn_live_regs() function. Fix this.
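To illustrate why the jump register has to count as used, here is a simplified liveness step in plain C. The instruction kinds, struct insn, and the choice to keep the jump target register in the dst field are assumptions made for this sketch; this is not the kernel's BPF encoding or the verifier code.

```c
#include <stdint.h>

/* Hypothetical, simplified instruction kinds; not the kernel's BPF encoding. */
enum insn_kind { INSN_MOV_REG, INSN_JUMP_DIRECT, INSN_JUMP_INDIRECT };

struct insn {
    enum insn_kind kind;
    uint8_t dst;  /* INSN_MOV_REG: destination; INSN_JUMP_INDIRECT: target reg */
    uint8_t src;  /* INSN_MOV_REG: source register */
};

/* Registers read by the instruction, as a bitmask (bit n = register n). */
static uint16_t insn_use_mask(const struct insn *in)
{
    switch (in->kind) {
    case INSN_MOV_REG:
        return (uint16_t)(1u << in->src);
    case INSN_JUMP_INDIRECT:
        return (uint16_t)(1u << in->dst);  /* the jump reads its register */
    default:
        return 0;
    }
}

/* Registers written by the instruction. */
static uint16_t insn_def_mask(const struct insn *in)
{
    return in->kind == INSN_MOV_REG ? (uint16_t)(1u << in->dst) : 0;
}

/* Standard backward liveness step: live_in = use | (live_out & ~def). */
static uint16_t insn_live_in(const struct insn *in, uint16_t live_out)
{
    return (uint16_t)(insn_use_mask(in) | (live_out & ~insn_def_mask(in)));
}
```

If the indirect jump case returned an empty use mask, the register holding the jump target would look dead before the jump, and an analysis relying on liveness could treat its value as irrelevant.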