In the Linux kernel, the following vulnerability has been resolved:
smb: client: let smbd_destroy() call disable_work_sync(&info->post_send_credits_work)
In smbd_destroy() we may destroy the memory so we better
wait until post_send_credits_work is no longer pending
and will never be started again.
I actually just hit the case using rxe:
WARNING: CPU: 0 PID: 138 at drivers/infiniband/sw/rxe/rxe_verbs.c:1032 rxe_post_recv+0x1ee/0x480 [rdma_rxe]
...
[ 5305.686979] [ T138] smbd_post_recv+0x445/0xc10 [cifs]
[ 5305.687135] [ T138] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5305.687149] [ T138] ? __kasan_check_write+0x14/0x30
[ 5305.687185] [ T138] ? __pfx_smbd_post_recv+0x10/0x10 [cifs]
[ 5305.687329] [ T138] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 5305.687356] [ T138] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5305.687368] [ T138] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5305.687378] [ T138] ? _raw_spin_unlock_irqrestore+0x11/0x60
[ 5305.687389] [ T138] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5305.687399] [ T138] ? get_receive_buffer+0x168/0x210 [cifs]
[ 5305.687555] [ T138] smbd_post_send_credits+0x382/0x4b0 [cifs]
[ 5305.687701] [ T138] ? __pfx_smbd_post_send_credits+0x10/0x10 [cifs]
[ 5305.687855] [ T138] ? __pfx___schedule+0x10/0x10
[ 5305.687865] [ T138] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 5305.687875] [ T138] ? queue_delayed_work_on+0x8e/0xa0
[ 5305.687889] [ T138] process_one_work+0x629/0xf80
[ 5305.687908] [ T138] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5305.687917] [ T138] ? __kasan_check_write+0x14/0x30
[ 5305.687933] [ T138] worker_thread+0x87f/0x1570
...
It means rxe_post_recv was called after rdma_destroy_qp().
This happened because put_receive_buffer() was triggered
by ib_drain_qp() and called:
queue_work(info->workqueue, &info->post_send_credits_work);
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix smbdirect_recv_io leak in smbd_negotiate() error path
During tests of another unrelated patch I was able to trigger this
error: Objects remaining on __kmem_cache_shutdown()
In the Linux kernel, the following vulnerability has been resolved:
jbd2: check 'jh->b_transaction' before removing it from checkpoint
Following process will corrupt ext4 image:
Step 1:
jbd2_journal_commit_transaction
__jbd2_journal_insert_checkpoint(jh, commit_transaction)
// Put jh into trans1->t_checkpoint_list
journal->j_checkpoint_transactions = commit_transaction
// Put trans1 into journal->j_checkpoint_transactions
Step 2:
do_get_write_access
test_clear_buffer_dirty(bh) // clear buffer dirty,set jbd dirty
__jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2
Step 3:
drop_cache
journal_shrink_one_cp_list
jbd2_journal_try_remove_checkpoint
if (!trylock_buffer(bh)) // lock bh, true
if (buffer_dirty(bh)) // buffer is not dirty
__jbd2_journal_remove_checkpoint(jh)
// remove jh from trans1->t_checkpoint_list
Step 4:
jbd2_log_do_checkpoint
trans1 = journal->j_checkpoint_transactions
// jh is not in trans1->t_checkpoint_list
jbd2_cleanup_journal_tail(journal) // trans1 is done
Step 5: Power cut, trans2 is not committed, jh is lost in next mounting.
Fix it by checking 'jh->b_transaction' before remove it from checkpoint.
In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw88: Fix memory leak in rtw88_usb
Kmemleak shows the following leak arising from routine in the usb
probe routine:
unreferenced object 0xffff895cb29bba00 (size 512):
comm "(udev-worker)", pid 534, jiffies 4294903932 (age 102751.088s)
hex dump (first 32 bytes):
77 30 30 30 00 00 00 00 02 2f 2d 2b 30 00 00 00 w000...../-+0...
02 00 2a 28 00 00 00 00 ff 55 ff ff ff 00 00 00 ..*(.....U......
backtrace:
[<ffffffff9265fa36>] kmalloc_trace+0x26/0x90
[<ffffffffc17eec41>] rtw_usb_probe+0x2f1/0x680 [rtw_usb]
[<ffffffffc03e19fd>] usb_probe_interface+0xdd/0x2e0 [usbcore]
[<ffffffff92b4f2fe>] really_probe+0x18e/0x3d0
[<ffffffff92b4f5b8>] __driver_probe_device+0x78/0x160
[<ffffffff92b4f6bf>] driver_probe_device+0x1f/0x90
[<ffffffff92b4f8df>] __driver_attach+0xbf/0x1b0
[<ffffffff92b4d350>] bus_for_each_dev+0x70/0xc0
[<ffffffff92b4e51e>] bus_add_driver+0x10e/0x210
[<ffffffff92b50935>] driver_register+0x55/0xf0
[<ffffffffc03e0708>] usb_register_driver+0x88/0x140 [usbcore]
[<ffffffff92401153>] do_one_initcall+0x43/0x210
[<ffffffff9254f42a>] do_init_module+0x4a/0x200
[<ffffffff92551d1c>] __do_sys_finit_module+0xac/0x120
[<ffffffff92ee6626>] do_syscall_64+0x56/0x80
[<ffffffff9300006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
The leak was verified to be real by unloading the driver, which resulted
in a dangling pointer to the allocation.
The allocated memory is freed in rtw_usb_intf_deinit().
In the Linux kernel, the following vulnerability has been resolved:
null_blk: fix poll request timeout handling
When doing io_uring benchmark on /dev/nullb0, it's easy to crash the
kernel if poll requests timeout triggered, as reported by David. [1]
BUG: kernel NULL pointer dereference, address: 0000000000000008
Workqueue: kblockd blk_mq_timeout_work
RIP: 0010:null_timeout_rq+0x4e/0x91
Call Trace:
? null_timeout_rq+0x4e/0x91
blk_mq_handle_expired+0x31/0x4b
bt_iter+0x68/0x84
? bt_tags_iter+0x81/0x81
__sbitmap_for_each_set.constprop.0+0xb0/0xf2
? __blk_mq_complete_request_remote+0xf/0xf
bt_for_each+0x46/0x64
? __blk_mq_complete_request_remote+0xf/0xf
? percpu_ref_get_many+0xc/0x2a
blk_mq_queue_tag_busy_iter+0x14d/0x18e
blk_mq_timeout_work+0x95/0x127
process_one_work+0x185/0x263
worker_thread+0x1b5/0x227
This is indeed a race problem between null_timeout_rq() and null_poll().
null_poll() null_timeout_rq()
spin_lock(&nq->poll_lock)
list_splice_init(&nq->poll_list, &list)
spin_unlock(&nq->poll_lock)
while (!list_empty(&list))
req = list_first_entry()
list_del_init()
...
blk_mq_add_to_batch()
// req->rq_next = NULL
spin_lock(&nq->poll_lock)
// rq->queuelist->next == NULL
list_del_init(&rq->queuelist)
spin_unlock(&nq->poll_lock)
Fix these problems by setting requests state to MQ_RQ_COMPLETE under
nq->poll_lock protection, in which null_timeout_rq() can safely detect
this race and early return.
Note this patch just fix the kernel panic when request timeout happen.
[1] https://lore.kernel.org/all/3893581.1691785261@warthog.procyon.org.uk/
In the Linux kernel, the following vulnerability has been resolved:
PM / devfreq: Fix leak in devfreq_dev_release()
srcu_init_notifier_head() allocates resources that need to be released
with a srcu_cleanup_notifier_head() call.
Reported by kmemleak.
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Fix hci_suspend_sync crash
If hci_unregister_dev() frees the hci_dev object but hci_suspend_notifier
may still be accessing it, it can cause the program to crash.
Here's the call trace:
<4>[102152.653246] Call Trace:
<4>[102152.653254] hci_suspend_sync+0x109/0x301 [bluetooth]
<4>[102152.653259] hci_suspend_dev+0x78/0xcd [bluetooth]
<4>[102152.653263] hci_suspend_notifier+0x42/0x7a [bluetooth]
<4>[102152.653268] notifier_call_chain+0x43/0x6b
<4>[102152.653271] __blocking_notifier_call_chain+0x48/0x69
<4>[102152.653273] __pm_notifier_call_chain+0x22/0x39
<4>[102152.653276] pm_suspend+0x287/0x57c
<4>[102152.653278] state_store+0xae/0xe5
<4>[102152.653281] kernfs_fop_write+0x109/0x173
<4>[102152.653284] __vfs_write+0x16f/0x1a2
<4>[102152.653287] ? selinux_file_permission+0xca/0x16f
<4>[102152.653289] ? security_file_permission+0x36/0x109
<4>[102152.653291] vfs_write+0x114/0x21d
<4>[102152.653293] __x64_sys_write+0x7b/0xdb
<4>[102152.653296] do_syscall_64+0x59/0x194
<4>[102152.653299] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
This patch holds the reference count of the hci_dev object while
processing it in hci_suspend_notifier to avoid potential crash
caused by the race condition.
In the Linux kernel, the following vulnerability has been resolved:
can: gs_usb: fix time stamp counter initialization
If the gs_usb device driver is unloaded (or unbound) before the
interface is shut down, the USB stack first calls the struct
usb_driver::disconnect and then the struct net_device_ops::ndo_stop
callback.
In gs_usb_disconnect() all pending bulk URBs are killed, i.e. no more
RX'ed CAN frames are send from the USB device to the host. Later in
gs_can_close() a reset control message is send to each CAN channel to
remove the controller from the CAN bus. In this race window the USB
device can still receive CAN frames from the bus and internally queue
them to be send to the host.
At least in the current version of the candlelight firmware, the queue
of received CAN frames is not emptied during the reset command. After
loading (or binding) the gs_usb driver, new URBs are submitted during
the struct net_device_ops::ndo_open callback and the candlelight
firmware starts sending its already queued CAN frames to the host.
However, this scenario was not considered when implementing the
hardware timestamp function. The cycle counter/time counter
infrastructure is set up (gs_usb_timestamp_init()) after the USBs are
submitted, resulting in a NULL pointer dereference if
timecounter_cyc2time() (via the call chain:
gs_usb_receive_bulk_callback() -> gs_usb_set_timestamp() ->
gs_usb_skb_set_timestamp()) is called too early.
Move the gs_usb_timestamp_init() function before the URBs are
submitted to fix this problem.
For a comprehensive solution, we need to consider gs_usb devices with
more than 1 channel. The cycle counter/time counter infrastructure is
setup per channel, but the RX URBs are per device. Once gs_can_open()
of _a_ channel has been called, and URBs have been submitted, the
gs_usb_receive_bulk_callback() can be called for _all_ available
channels, even for channels that are not running, yet. As cycle
counter/time counter has not set up, this will again lead to a NULL
pointer dereference.
Convert the cycle counter/time counter from a "per channel" to a "per
device" functionality. Also set it up, before submitting any URBs to
the device.
Further in gs_usb_receive_bulk_callback(), don't process any URBs for
not started CAN channels, only resubmit the URB.
In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Fix handling of lrbp->cmd
ufshcd_queuecommand() may be called two times in a row for a SCSI command
before it is completed. Hence make the following changes:
- In the functions that submit a command, do not check the old value of
lrbp->cmd nor clear lrbp->cmd in error paths.
- In ufshcd_release_scsi_cmd(), do not clear lrbp->cmd.
See also scsi_send_eh_cmnd().
This commit prevents that the following appears if a command times out:
WARNING: at drivers/ufs/core/ufshcd.c:2965 ufshcd_queuecommand+0x6f8/0x9a8
Call trace:
ufshcd_queuecommand+0x6f8/0x9a8
scsi_send_eh_cmnd+0x2c0/0x960
scsi_eh_test_devices+0x100/0x314
scsi_eh_ready_devs+0xd90/0x114c
scsi_error_handler+0x2b4/0xb70
kthread+0x16c/0x1e0
In the Linux kernel, the following vulnerability has been resolved:
iommu/amd/iommu_v2: Fix pasid_state refcount dec hit 0 warning on pasid unbind
When unbinding pasid - a race condition exists vs outstanding page faults.
To prevent this, the pasid_state object contains a refcount.
* set to 1 on pasid bind
* incremented on each ppr notification start
* decremented on each ppr notification done
* decremented on pasid unbind
Since refcount_dec assumes that refcount will never reach 0:
the current implementation causes the following to be invoked on
pasid unbind:
REFCOUNT_WARN("decrement hit 0; leaking memory")
Fix this issue by changing refcount_dec to refcount_dec_and_test
to explicitly handle refcount=1.