In the context switch logic Xen attempts to skip an IBPB in the case of
a vCPU returning to a CPU on which it was the previous vCPU to run.
While safe for Xen's isolation between vCPUs, this prevents the guest
kernel correctly isolating between tasks. Consider:
1) vCPU runs on CPU A, running task 1.
2) vCPU moves to CPU B, idle gets scheduled on A. Xen skips IBPB.
3) On CPU B, guest kernel switches from task 1 to 2, issuing IBPB.
4) vCPU moves back to CPU A. Xen skips IBPB again.
Now, task 2 is running on CPU A with task 1's training still in the BTB.
[This CNA information record relates to multiple CVEs; the
text explains which aspects/vulnerabilities correspond to which CVE.]
Some Viridian hypercalls can specify a mask of vCPU IDs as an input, in
one of three formats. Xen has boundary checking bugs with all three
formats, which can cause out-of-bounds reads and writes while processing
the inputs.
* CVE-2025-58147. Hypercalls using the HV_VP_SET Sparse format can
cause vpmask_set() to write out of bounds when converting the bitmap
to Xen's format.
* CVE-2025-58148. Hypercalls using any input format can cause
send_ipi() to read d->vcpu[] out-of-bounds, and operate on a wild
vCPU pointer.
When passing through PCI devices, the detach logic in libxl won't remove
access permissions to any 64bit memory BARs the device might have. As a
result a domain can still have access any 64bit memory BAR when such
device is no longer assigned to the domain.
For PV domains the permission leak allows the domain itself to map the memory
in the page-tables. For HVM it would require a compromised device model or
stubdomain to map the leaked memory into the HVM domain p2m.
[This CNA information record relates to multiple CVEs; the
text explains which aspects/vulnerabilities correspond to which CVE.]
Some Viridian hypercalls can specify a mask of vCPU IDs as an input, in
one of three formats. Xen has boundary checking bugs with all three
formats, which can cause out-of-bounds reads and writes while processing
the inputs.
* CVE-2025-58147. Hypercalls using the HV_VP_SET Sparse format can
cause vpmask_set() to write out of bounds when converting the bitmap
to Xen's format.
* CVE-2025-58148. Hypercalls using any input format can cause
send_ipi() to read d->vcpu[] out-of-bounds, and operate on a wild
vCPU pointer.
When setting up interrupt remapping for legacy PCI(-X) devices,
including PCI(-X) bridges, a lookup of the upstream bridge is required.
This lookup, itself involving acquiring of a lock, is done in a context
where acquiring that lock is unsafe. This can lead to a deadlock.
Certain instructions need intercepting and emulating by Xen. In some
cases Xen emulates the instruction by replaying it, using an executable
stub. Some instructions may raise an exception, which is supposed to be
handled gracefully. Certain replayed instructions have additional logic
to set up and recover the changes to the arithmetic flags.
For replayed instructions where the flags recovery logic is used, the
metadata for exception handling was incorrect, preventing Xen from
handling the the exception gracefully, treating it as fatal instead.
PVH guests have their ACPI tables constructed by the toolstack. The
construction involves building the tables in local memory, which are
then copied into guest memory. While actually used parts of the local
memory are filled in correctly, excess space that is being allocated is
left with its prior contents.
Certain PCI devices in a system might be assigned Reserved Memory
Regions (specified via Reserved Memory Region Reporting, "RMRR") for
Intel VT-d or Unity Mapping ranges for AMD-Vi. These are typically used
for platform tasks such as legacy USB emulation.
Since the precise purpose of these regions is unknown, once a device
associated with such a region is active, the mappings of these regions
need to remain continuouly accessible by the device. In the logic
establishing these mappings, error handling was flawed, resulting in
such mappings to potentially remain in place when they should have been
removed again. Respective guests would then gain access to memory
regions which they aren't supposed to have access to.
In x86's APIC (Advanced Programmable Interrupt Controller) architecture,
error conditions are reported in a status register. Furthermore, the OS
can opt to receive an interrupt when a new error occurs.
It is possible to configure the error interrupt with an illegal vector,
which generates an error when an error interrupt is raised.
This case causes Xen to recurse through vlapic_error(). The recursion
itself is bounded; errors accumulate in the the status register and only
generate an interrupt when a new status bit becomes set.
However, the lock protecting this state in Xen will try to be taken
recursively, and deadlock.
An optional feature of PCI MSI called "Multiple Message" allows a
device to use multiple consecutive interrupt vectors. Unlike for MSI-X,
the setting up of these consecutive vectors needs to happen all in one
go. In this handling an error path could be taken in different
situations, with or without a particular lock held. This error path
wrongly releases the lock even when it is not currently held.