In the Linux kernel, the following vulnerability has been resolved:
coresight: tmc-etr: Fix race condition between sysfs and perf mode
When trying to run perf and sysfs mode simultaneously, the WARN_ON()
in tmc_etr_enable_hw() is triggered sometimes:
WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
[..snip..]
Call trace:
tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
coresight_enable_path+0x1c8/0x218 [coresight]
coresight_enable_sysfs+0xa4/0x228 [coresight]
enable_source_store+0x58/0xa8 [coresight]
dev_attr_store+0x20/0x40
sysfs_kf_write+0x4c/0x68
kernfs_fop_write_iter+0x120/0x1b8
vfs_write+0x2c8/0x388
ksys_write+0x74/0x108
__arm64_sys_write+0x24/0x38
el0_svc_common.constprop.0+0x64/0x148
do_el0_svc+0x24/0x38
el0_svc+0x3c/0x130
el0t_64_sync_handler+0xc8/0xd0
el0t_64_sync+0x1ac/0x1b0
---[ end trace 0000000000000000 ]---
Since the enablement of sysfs mode is separeted into two critical regions,
one for sysfs buffer allocation and another for hardware enablement, it's
possible to race with the perf mode. Fix this by double check whether
the perf mode's been used before enabling the hardware in sysfs mode.
mode:
[sysfs mode] [perf mode]
tmc_etr_get_sysfs_buffer()
spin_lock(&drvdata->spinlock)
[sysfs buffer allocation]
spin_unlock(&drvdata->spinlock)
spin_lock(&drvdata->spinlock)
tmc_etr_enable_hw()
drvdata->etr_buf = etr_perf->etr_buf
spin_unlock(&drvdata->spinlock)
spin_lock(&drvdata->spinlock)
tmc_etr_enable_hw()
WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
the perf side
spin_unlock(&drvdata->spinlock)
With this fix, we retain the check for CS_MODE_PERF in get_etr_sysfs_buf.
This ensures we verify whether the perf mode's already running before we
actually allocate the buffer. Then we can save the time of
allocating/freeing the sysfs buffer if race with the perf mode.
In the Linux kernel, the following vulnerability has been resolved:
ibmveth: Disable GSO for packets with small MSS
Some physical adapters on Power systems do not support segmentation
offload when the MSS is less than 224 bytes. Attempting to send such
packets causes the adapter to freeze, stopping all traffic until
manually reset.
Implement ndo_features_check to disable GSO for packets with small MSS
values. The network stack will perform software segmentation instead.
The 224-byte minimum matches ibmvnic
commit <f10b09ef687f> ("ibmvnic: Enforce stronger sanity checks
on GSO packets")
which uses the same physical adapters in SEA configurations.
The issue occurs specifically when the hardware attempts to perform
segmentation (gso_segs > 1) with a small MSS. Single-segment GSO packets
(gso_segs == 1) do not trigger the problematic LSO code path and are
transmitted normally without segmentation.
Add an ndo_features_check callback to disable GSO when MSS < 224 bytes.
Also call vlan_features_check() to ensure proper handling of VLAN packets,
particularly QinQ (802.1ad) configurations where the hardware parser may
not support certain offload features.
Validated using iptables to force small MSS values. Without the fix,
the adapter freezes. With the fix, packets are segmented in software
and transmission succeeds. Comprehensive regression testing completedd
(MSS tests, performance, stability).
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix WQ_MEM_RECLAIM warning
When sunrpc is used, if a reset triggered, our wq may lead the
following trace:
workqueue: WQ_MEM_RECLAIM xprtiod:xprt_rdma_connect_worker [rpcrdma]
is flushing !WQ_MEM_RECLAIM hns_roce_irq_workq:flush_work_handle
[hns_roce_hw_v2]
WARNING: CPU: 0 PID: 8250 at kernel/workqueue.c:2644 check_flush_dependency+0xe0/0x144
Call trace:
check_flush_dependency+0xe0/0x144
start_flush_work.constprop.0+0x1d0/0x2f0
__flush_work.isra.0+0x40/0xb0
flush_work+0x14/0x30
hns_roce_v2_destroy_qp+0xac/0x1e0 [hns_roce_hw_v2]
ib_destroy_qp_user+0x9c/0x2b4
rdma_destroy_qp+0x34/0xb0
rpcrdma_ep_destroy+0x28/0xcc [rpcrdma]
rpcrdma_ep_put+0x74/0xb4 [rpcrdma]
rpcrdma_xprt_disconnect+0x1d8/0x260 [rpcrdma]
xprt_rdma_connect_worker+0xc0/0x120 [rpcrdma]
process_one_work+0x1cc/0x4d0
worker_thread+0x154/0x414
kthread+0x104/0x144
ret_from_fork+0x10/0x18
Since QP destruction frees memory, this wq should have the WQ_MEM_RECLAIM.
In the Linux kernel, the following vulnerability has been resolved:
inet: RAW sockets using IPPROTO_RAW MUST drop incoming ICMP
Yizhou Zhao reported that simply having one RAW socket on protocol
IPPROTO_RAW (255) was dangerous.
socket(AF_INET, SOCK_RAW, 255);
A malicious incoming ICMP packet can set the protocol field to 255
and match this socket, leading to FNHE cache changes.
inner = IP(src="192.168.2.1", dst="8.8.8.8", proto=255)/Raw("TEST")
pkt = IP(src="192.168.1.1", dst="192.168.2.1")/ICMP(type=3, code=4, nexthopmtu=576)/inner
"man 7 raw" states:
A protocol of IPPROTO_RAW implies enabled IP_HDRINCL and is able
to send any IP protocol that is specified in the passed header.
Receiving of all IP protocols via IPPROTO_RAW is not possible
using raw sockets.
Make sure we drop these malicious packets.
In the Linux kernel, the following vulnerability has been resolved:
nfc: hci: shdlc: Stop timers and work before freeing context
llc_shdlc_deinit() purges SHDLC skb queues and frees the llc_shdlc
structure while its timers and state machine work may still be active.
Timer callbacks can schedule sm_work, and sm_work accesses SHDLC state
and the skb queues. If teardown happens in parallel with a queued/running
work item, it can lead to UAF and other shutdown races.
Stop all SHDLC timers and cancel sm_work synchronously before purging the
queues and freeing the context.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
power: supply: rt9455: Fix use-after-free in power_supply_changed()
Using the `devm_` variant for requesting IRQ _before_ the `devm_`
variant for allocating/registering the `power_supply` handle, means that
the `power_supply` handle will be deallocated/unregistered _before_ the
interrupt handler (since `devm_` naturally deallocates in reverse
allocation order). This means that during removal, there is a race
condition where an interrupt can fire just _after_ the `power_supply`
handle has been freed, *but* just _before_ the corresponding
unregistration of the IRQ handler has run.
This will lead to the IRQ handler calling `power_supply_changed()` with
a freed `power_supply` handle. Which usually crashes the system or
otherwise silently corrupts the memory...
Note that there is a similar situation which can also happen during
`probe()`; the possibility of an interrupt firing _before_ registering
the `power_supply` handle. This would then lead to the nasty situation
of using the `power_supply` handle *uninitialized* in
`power_supply_changed()`.
Fix this racy use-after-free by making sure the IRQ is requested _after_
the registration of the `power_supply` handle.
In the Linux kernel, the following vulnerability has been resolved:
spi: wpcm-fiu: Fix potential NULL pointer dereference in wpcm_fiu_probe()
platform_get_resource_byname() can return NULL, which would cause a crash
when passed the pointer to resource_size().
Move the fiu->memory_size assignment after the error check for
devm_ioremap_resource() to prevent the potential NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix out-of-bounds stream encoder index v3
eng_id can be negative and that stream_enc_regs[]
can be indexed out of bounds.
eng_id is used directly as an index into stream_enc_regs[], which has
only 5 entries. When eng_id is 5 (ENGINE_ID_DIGF) or negative, this can
access memory past the end of the array.
Add a bounds check using ARRAY_SIZE() before using eng_id as an index.
The unsigned cast also rejects negative values.
This avoids out-of-bounds access.
Fixes the below smatch error:
dcn*_resource.c: stream_encoder_create() may index
stream_enc_regs[eng_id] out of bounds (size 5).
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn351/dcn351_resource.c
1246 static struct stream_encoder *dcn35_stream_encoder_create(
1247 enum engine_id eng_id,
1248 struct dc_context *ctx)
1249 {
...
1255
1256 /* Mapping of VPG, AFMT, DME register blocks to DIO block instance */
1257 if (eng_id <= ENGINE_ID_DIGF) {
ENGINE_ID_DIGF is 5. should <= be <?
Unrelated but, ugh, why is Smatch saying that "eng_id" can be negative?
end_id is type signed long, but there are checks in the caller which prevent it from being negative.
1258 vpg_inst = eng_id;
1259 afmt_inst = eng_id;
1260 } else
1261 return NULL;
1262
...
1281
1282 dcn35_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios,
1283 eng_id, vpg, afmt,
--> 1284 &stream_enc_regs[eng_id],
^^^^^^^^^^^^^^^^^^^^^^^ This stream_enc_regs[] array has 5 elements so we are one element beyond the end of the array.
...
1287 return &enc1->base;
1288 }
v2: use explicit bounds check as suggested by Roman/Dan; avoid unsigned int cast
v3: The compiler already knows how to compare the two values, so the
cast (int) is not needed. (Roman)
In the Linux kernel, the following vulnerability has been resolved:
pstore/ram: fix buffer overflow in persistent_ram_save_old()
persistent_ram_save_old() can be called multiple times for the same
persistent_ram_zone (e.g., via ramoops_pstore_read -> ramoops_get_next_prz
for PSTORE_TYPE_DMESG records).
Currently, the function only allocates prz->old_log when it is NULL,
but it unconditionally updates prz->old_log_size to the current buffer
size and then performs memcpy_fromio() using this new size. If the
buffer size has grown since the first allocation (which can happen
across different kernel boot cycles), this leads to:
1. A heap buffer overflow (OOB write) in the memcpy_fromio() calls
2. A subsequent OOB read when ramoops_pstore_read() accesses the buffer
using the incorrect (larger) old_log_size
The KASAN splat would look similar to:
BUG: KASAN: slab-out-of-bounds in ramoops_pstore_read+0x...
Read of size N at addr ... by task ...
The conditions are likely extremely hard to hit:
0. Crash with a ramoops write of less-than-record-max-size bytes.
1. Reboot: ramoops registers, pstore_get_records(0) reads old crash,
allocates old_log with size X
2. Crash handler registered, timer started (if pstore_update_ms >= 0)
3. Oops happens (non-fatal, system continues)
4. pstore_dump() writes oops via ramoops_pstore_write() size Y (>X)
5. pstore_new_entry = 1, pstore_timer_kick() called
6. System continues running (not a panic oops)
7. Timer fires after pstore_update_ms milliseconds
8. pstore_timefunc() → schedule_work() → pstore_dowork() → pstore_get_records(1)
9. ramoops_get_next_prz() → persistent_ram_save_old()
10. buffer_size() returns Y, but old_log is X bytes
11. Y > X: memcpy_fromio() overflows heap
Requirements:
- a prior crash record exists that did not fill the record size
(almost impossible since the crash handler writes as much as it
can possibly fit into the record, capped by max record size and
the kmsg buffer almost always exceeds the max record size)
- pstore_update_ms >= 0 (disabled by default)
- Non-fatal oops (system survives)
Free and reallocate the buffer when the new size differs from the
previously allocated size. This ensures old_log always has sufficient
space for the data being copied.