In the Linux kernel, the following vulnerability has been resolved:
fscrypt: stop using keyrings subsystem for fscrypt_master_key
The approach of fs/crypto/ internally managing the fscrypt_master_key
structs as the payloads of "struct key" objects contained in a
"struct key" keyring has outlived its usefulness. The original idea was
to simplify the code by reusing code from the keyrings subsystem.
However, several issues have arisen that can't easily be resolved:
- When a master key struct is destroyed, blk_crypto_evict_key() must be
called on any per-mode keys embedded in it. (This started being the
case when inline encryption support was added.) Yet, the keyrings
subsystem can arbitrarily delay the destruction of keys, even past the
time the filesystem was unmounted. Therefore, currently there is no
easy way to call blk_crypto_evict_key() when a master key is
destroyed. Currently, this is worked around by holding an extra
reference to the filesystem's request_queue(s). But it was overlooked
that the request_queue reference is *not* guaranteed to pin the
corresponding blk_crypto_profile too; for device-mapper devices that
support inline crypto, it doesn't. This can cause a use-after-free.
- When the last inode that was using an incompletely-removed master key
is evicted, the master key removal is completed by removing the key
struct from the keyring. Currently this is done via key_invalidate().
Yet, key_invalidate() takes the key semaphore. This can deadlock when
called from the shrinker, since in fscrypt_ioctl_add_key(), memory is
allocated with GFP_KERNEL under the same semaphore.
- More generally, the fact that the keyrings subsystem can arbitrarily
delay the destruction of keys (via garbage collection delay, or via
random processes getting temporary key references) is undesirable, as
it means we can't strictly guarantee that all secrets are ever wiped.
- Doing the master key lookups via the keyrings subsystem results in the
key_permission LSM hook being called. fscrypt doesn't want this, as
all access control for encrypted files is designed to happen via the
files themselves, like any other files. The workaround which SELinux
users are using is to change their SELinux policy to grant key search
access to all domains. This works, but it is an odd extra step that
shouldn't really have to be done.
The fix for all these issues is to change the implementation to what I
should have done originally: don't use the keyrings subsystem to keep
track of the filesystem's fscrypt_master_key structs. Instead, just
store them in a regular kernel data structure, and rework the reference
counting, locking, and lifetime accordingly. Retain support for
RCU-mode key lookups by using a hash table. Replace fscrypt_sb_free()
with fscrypt_sb_delete(), which releases the keys synchronously and runs
a bit earlier during unmount, so that block devices are still available.
A side effect of this patch is that neither the master keys themselves
nor the filesystem keyrings will be listed in /proc/keys anymore.
("Master key users" and the master key users keyrings will still be
listed.) However, this was mostly an implementation detail, and it was
intended just for debugging purposes. I don't know of anyone using it.
This patch does *not* change how "master key users" (->mk_users) works;
that still uses the keyrings subsystem. That is still needed for key
quotas, and changing that isn't necessary to solve the issues listed
above. If we decide to change that too, it would be a separate patch.
I've marked this as fixing the original commit that added the fscrypt
keyring, but as noted above the most important issue that this patch
fixes wasn't introduced until the addition of inline encryption support.
In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: fix memory leak in query_regdb_file()
In the function query_regdb_file() the alpha2 parameter is duplicated
using kmemdup() and subsequently freed in regdb_fw_cb(). However,
request_firmware_nowait() can fail without calling regdb_fw_cb() and
thus leak memory.
In the Linux kernel, the following vulnerability has been resolved:
ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()
Change num_ghes from int to unsigned int, preventing an overflow
and causing subsequent vmalloc() to fail.
The overflow happens in ghes_estatus_pool_init() when calculating
len during execution of the statement below as both multiplication
operands here are signed int:
len += (num_ghes * GHES_ESOURCE_PREALLOC_MAX_SIZE);
The following call trace is observed because of this bug:
[ 9.317108] swapper/0: vmalloc error: size 18446744071562596352, exceeds total pages, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0-1
[ 9.317131] Call Trace:
[ 9.317134] <TASK>
[ 9.317137] dump_stack_lvl+0x49/0x5f
[ 9.317145] dump_stack+0x10/0x12
[ 9.317146] warn_alloc.cold+0x7b/0xdf
[ 9.317150] ? __device_attach+0x16a/0x1b0
[ 9.317155] __vmalloc_node_range+0x702/0x740
[ 9.317160] ? device_add+0x17f/0x920
[ 9.317164] ? dev_set_name+0x53/0x70
[ 9.317166] ? platform_device_add+0xf9/0x240
[ 9.317168] __vmalloc_node+0x49/0x50
[ 9.317170] ? ghes_estatus_pool_init+0x43/0xa0
[ 9.317176] vmalloc+0x21/0x30
[ 9.317177] ghes_estatus_pool_init+0x43/0xa0
[ 9.317179] acpi_hest_init+0x129/0x19c
[ 9.317185] acpi_init+0x434/0x4a4
[ 9.317188] ? acpi_sleep_proc_init+0x2a/0x2a
[ 9.317190] do_one_initcall+0x48/0x200
[ 9.317195] kernel_init_freeable+0x221/0x284
[ 9.317200] ? rest_init+0xe0/0xe0
[ 9.317204] kernel_init+0x1a/0x130
[ 9.317205] ret_from_fork+0x22/0x30
[ 9.317208] </TASK>
[ rjw: Subject and changelog edits ]
In the Linux kernel, the following vulnerability has been resolved:
media: meson: vdec: fix possible refcount leak in vdec_probe()
v4l2_device_unregister need to be called to put the refcount got by
v4l2_device_register when vdec_probe fails or vdec_remove is called.
In the Linux kernel, the following vulnerability has been resolved:
net: gso: fix panic on frag_list with mixed head alloc types
Since commit 3dcbdb134f32 ("net: gso: Fix skb_segment splat when
splitting gso_size mangled skb having linear-headed frag_list"), it is
allowed to change gso_size of a GRO packet. However, that commit assumes
that "checking the first list_skb member suffices; i.e if either of the
list_skb members have non head_frag head, then the first one has too".
It turns out this assumption does not hold. We've seen BUG_ON being hit
in skb_segment when skbs on the frag_list had differing head_frag with
the vmxnet3 driver. This happens because __netdev_alloc_skb and
__napi_alloc_skb can return a skb that is page backed or kmalloced
depending on the requested size. As the result, the last small skb in
the GRO packet can be kmalloced.
There are three different locations where this can be fixed:
(1) We could check head_frag in GRO and not allow GROing skbs with
different head_frag. However, that would lead to performance
regression on normal forward paths with unmodified gso_size, where
!head_frag in the last packet is not a problem.
(2) Set a flag in bpf_skb_net_grow and bpf_skb_net_shrink indicating
that NETIF_F_SG is undesirable. That would need to eat a bit in
sk_buff. Furthermore, that flag can be unset when all skbs on the
frag_list are page backed. To retain good performance,
bpf_skb_net_grow/shrink would have to walk the frag_list.
(3) Walk the frag_list in skb_segment when determining whether
NETIF_F_SG should be cleared. This of course slows things down.
This patch implements (3). To limit the performance impact in
skb_segment, the list is walked only for skbs with SKB_GSO_DODGY set
that have gso_size changed. Normal paths thus will not hit it.
We could check only the last skb but since we need to walk the whole
list anyway, let's stay on the safe side.
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix wrong reg type conversion in release_reference()
Some helper functions will allocate memory. To avoid memory leaks, the
verifier requires the eBPF program to release these memories by calling
the corresponding helper functions.
When a resource is released, all pointer registers corresponding to the
resource should be invalidated. The verifier use release_references() to
do this job, by apply __mark_reg_unknown() to each relevant register.
It will give these registers the type of SCALAR_VALUE. A register that
will contain a pointer value at runtime, but of type SCALAR_VALUE, which
may allow the unprivileged user to get a kernel pointer by storing this
register into a map.
Using __mark_reg_not_init() while NOT allow_ptr_leaks can mitigate this
problem.
In the Linux kernel, the following vulnerability has been resolved:
HID: hyperv: fix possible memory leak in mousevsc_probe()
If hid_add_device() returns error, it should call hid_destroy_device()
to free hid_dev which is allocated in hid_allocate_device().
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix BUG_ON() when directory entry has invalid rec_len
The rec_len field in the directory entry has to be a multiple of 4. A
corrupted filesystem image can be used to hit a BUG() in
ext4_rec_len_to_disk(), called from make_indexed_dir().
------------[ cut here ]------------
kernel BUG at fs/ext4/ext4.h:2413!
...
RIP: 0010:make_indexed_dir+0x53f/0x5f0
...
Call Trace:
<TASK>
? add_dirent_to_buf+0x1b2/0x200
ext4_add_entry+0x36e/0x480
ext4_add_nondir+0x2b/0xc0
ext4_create+0x163/0x200
path_openat+0x635/0xe90
do_filp_open+0xb4/0x160
? __create_object.isra.0+0x1de/0x3b0
? _raw_spin_unlock+0x12/0x30
do_sys_openat2+0x91/0x150
__x64_sys_open+0x6c/0xa0
do_syscall_64+0x3c/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
The fix simply adds a call to ext4_check_dir_entry() to validate the
directory entry, returning -EFSCORRUPTED if the entry is invalid.