| File | Last commit | Last updated |
|---|---|---|
fs: prepare for stackable filesystems backing file helpers mainline inclusion from mainline-v6.8-rc1 commit f91a704f7161c2cf0fcd41fa9fbec4355b813fff category: feature bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f91a704f7161c2cf0fcd41fa9fbec4355b813fff -------------------------------- In preparation for factoring out some backing file io helpers from overlayfs, move backing_file_open() into a new file fs/backing-file.c and header. Add a MAINTAINERS entry for stackable filesystems and add a Kconfig FS_STACK which stackable filesystems need to select. For now, the backing_file struct, the backing_file alloc/free functions and the backing_file_real_path() accessor remain internal to file_table.c. We may change that in the future. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: modify layer parameter parsing We ran into issues where mount(8) passed multiple lower layers as one big string through fsconfig(). But the fsconfig() FSCONFIG_SET_STRING option is limited to 256 bytes in strndup_user(). While this would be fixable by extending the fsconfig() buffer I'd rather encourage users to append layers via multiple fsconfig() calls as the interface allows nicely for this. This has also been requested as a feature before. With this port to the new mount api the following will be possible: fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", "/lower1", 0); /* set upper layer */ fsconfig(fs_fd, FSCONFIG_SET_STRING, "upperdir", "/upper", 0); /* append "/lower2", "/lower3", and "/lower4" */ fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower2:/lower3:/lower4", 0); /* turn index feature on */ fsconfig(fs_fd, FSCONFIG_SET_STRING, "index", "on", 0); /* append "/lower5" */ fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower5", 0); Specifying ':' would have been rejected so this isn't a regression. And we can't simply use "lowerdir=/lower" to append on top of existing layers as "lowerdir=/lower,lowerdir=/other-lower" would make "/other-lower" the only lower layer so we'd break uapi if we changed this. So the ':' prefix seems a good compromise. Users can choose to specify multiple layers at once or individual layers. A layer is appended if it starts with ":". This requires that the user has already added at least one layer before. If lowerdir is specified again without a leading ":" then all previous layers are dropped and replaced with the new layers. If lowerdir is specified and empty then all layers are simply dropped. An additional change is that overlayfs will now parse and resolve layers right when they are specified in fsconfig() instead of deferring until super block creation. This allows users to receive early errors. It also allows users to actually use up to 500 layers, something which was theoretically possible but ended up not working due to the mount option string passed via mount(2) being too large. This also allows a more privileged process to set config options for a lesser privileged process as the creds for fsconfig() and the creds for fsopen() can differ. We could restrict that they match by enforcing that the creds of fsopen() and fsconfig() match but I don't see why that needs to be the case and allows for a good delegation mechanism. Plus, in the future it means we're able to extend overlayfs mount options and allow users to specify layers via file descriptors instead of paths: fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower1", dirfd); /* append */ fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower2", dirfd); /* append */ fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower3", dirfd); /* clear all layers specified until now */ fsconfig(FSCONFIG_SET_STRING, "lowerdir", NULL, 0); This would be especially nice if users create an overlayfs mount on top of idmapped layers or just in general private mounts created via open_tree(OPEN_TREE_CLONE). Those mounts would then never have to appear anywhere in the filesystem. But for now just do the minimal thing. We should probably aim to move more validation into ovl_fs_parse_param() so users get errors before fsconfig(FSCONFIG_CMD_CREATE). But that can be done in additional patches later. This is now also rebased on top of the lazy lowerdata lookup which allows the specification of data-only layers using the new "::" syntax. The rules are simple. A data-only layer cannot be followed by any regular layers and data layers must be preceded by at least one regular layer. Parsing the lowerdir mount option must change because of this. The original patchset used the old lowerdir parsing function to split a lowerdir mount option string such as: lowerdir=/lower1:/lower2::/lower3::/lower4 simply replacing each non-escaped ":" by "\0". So sequences of non-escaped ":" were counted as layers. For example, the previous lowerdir mount option above would've counted 6 layers instead of 4 and a lowerdir mount option such as: lowerdir="/lower1:/lower2::/lower3::/lower4:::::::::::::::::::::::::::" would be counted as 33 layers. Other than being ugly this didn't matter much because kern_path() would reject the first "\0" layer. However, this overcounting of layers becomes problematic when we base allocations on it where we very much only want to allocate space for 4 layers instead of 33. So the new parsing function rejects non-escaped sequences of colons other than ":" and "::" immediately instead of relying on kern_path(). Link: https://github.com/util-linux/util-linux/issues/2287 Link: https://github.com/util-linux/util-linux/issues/1992 Link: https://bugs.archlinux.org/task/78702 Link: https://lore.kernel.org/linux-unionfs/20230530-klagen-zudem-32c0908c2108@brauner Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> | 2 years ago | |
fs: move file_start_write() into direct_splice_actor() mainline inclusion from mainline-v6.8-rc1 commit da40448ce4eb4de18eb7b0db61dddece32677939 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=da40448ce4eb4de18eb7b0db61dddece32677939 -------------------------------- The callers of do_splice_direct() hold file_start_write() on the output file. This may cause file permission hooks to be called indirectly on an overlayfs lower layer, which is on the same filesystem of the output file and could lead to deadlock with fanotify permission events. To fix this potential deadlock, move file_start_write() from the callers into the direct_splice_actor(), so file_start_write() will not be held while splicing from the input file. Suggested-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20231128214258.GA2398475@perftesting/ Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231130141624.3338942-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: reorder ovl_want_write() after ovl_inode_lock() mainline inclusion from mainline-v6.7-rc1 commit 162d06444070c12827d604a2cb6b6bd98d48cbb0 category: feature bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=162d06444070c12827d604a2cb6b6bd98d48cbb0 -------------------------------- Make the locking order of ovl_inode_lock() strictly between the two vfs stacked layers, i.e.: - ovl vfs locks: sb_writers, inode_lock, ... - ovl_inode_lock - upper vfs locks: sb_writers, inode_lock, ... To that effect, move ovl_want_write() into the helpers ovl_nlink_start() and ovl_copy_up_start which currently take the ovl_inode_lock() after ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: reorder ovl_want_write() after ovl_inode_lock() mainline inclusion from mainline-v6.7-rc1 commit 162d06444070c12827d604a2cb6b6bd98d48cbb0 category: feature bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=162d06444070c12827d604a2cb6b6bd98d48cbb0 -------------------------------- Make the locking order of ovl_inode_lock() strictly between the two vfs stacked layers, i.e.: - ovl vfs locks: sb_writers, inode_lock, ... - ovl_inode_lock - upper vfs locks: sb_writers, inode_lock, ... To that effect, move ovl_want_write() into the helpers ovl_nlink_start() and ovl_copy_up_start which currently take the ovl_inode_lock() after ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
fs: pass offset and result to backing_file end_write() callback mainline inclusion from mainline-v6.12-rc5 commit f03b296e8b516dbd63f57fc9056c1b0da1b9a0ff category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f03b296e8b516dbd63f57fc9056c1b0da1b9a0ff -------------------------------- This is needed for extending fuse inode size after fuse passthrough write. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Link: https://lore.kernel.org/linux-fsdevel/CAJfpegs=cvZ_NYy6Q_D42XhYS=Sjj5poM1b5TzXzOVvX=R36aA@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: reorder ovl_want_write() after ovl_inode_lock() mainline inclusion from mainline-v6.7-rc1 commit 162d06444070c12827d604a2cb6b6bd98d48cbb0 category: feature bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=162d06444070c12827d604a2cb6b6bd98d48cbb0 -------------------------------- Make the locking order of ovl_inode_lock() strictly between the two vfs stacked layers, i.e.: - ovl vfs locks: sb_writers, inode_lock, ... - ovl_inode_lock - upper vfs locks: sb_writers, inode_lock, ... To that effect, move ovl_want_write() into the helpers ovl_nlink_start() and ovl_copy_up_start which currently take the ovl_inode_lock() after ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: do not encode lower fh with upper sb_writers held mainline inclusion from mainline-v6.7-rc1 commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 category: feature bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBHLU4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 -------------------------------- When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs. The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal. Move all the callers that encode lower fh to before ovl_want_write(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Conflicts: fs/overlayfs/copy_up.c fs/overlayfs/overlayfs.h [Context differences.] Signed-off-by: Yifan Qiao <qiaoyifan4@huawei.com> | 1 year ago | |
ovl: remove unused forward declaration stable inclusion from stable-v6.6.88 commit 4f7b6029ae8e22169c8a002ca0431865e63011c5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID6MDL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4f7b6029ae8e22169c8a002ca0431865e63011c5 -------------------------------- [ Upstream commit a6eb9a4a69cc360b930dad9dc8513f8fd9b3577f ] The ovl_get_verity_xattr() function was never added, only its declaration. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> Fixes: 184996e92e86 ("ovl: Validate verity xattr when resolving lowerdata") Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 4f7b6029ae8e22169c8a002ca0431865e63011c5) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 7 months ago | |
ovl: make use of ->layers safe in rcu pathwalk ovl_permission() accesses ->layers[...].mnt; we can't have ->layers freed without an RCU delay on fs shutdown. Fortunately, kern_unmount_array() that is used to drop those mounts does include an RCU delay, so freeing is delayed; unfortunately, the array passed to kern_unmount_array() is formed by mangling ->layers contents and that happens without any delays. The ->layers[...].name string entries are used to store the strings to display in "lowerdir=..." by ovl_show_options(). Those entries are not accessed in RCU walk. Move the name strings into a separate array ofs->config.lowerdirs and reuse the ofs->config.lowerdirs array as the temporary mount array to pass to kern_unmount_array(). Reported-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20231002023711.GP3389589@ZenIV/ Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> | 2 years ago | |
ovl: fail if trusted xattrs are needed but caller lacks permission stable inclusion from stable-v6.6.55 commit bf47be5479b3e611910e1133f4d5413aa77e86c5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IB0MX4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bf47be5479b3e611910e1133f4d5413aa77e86c5 -------------------------------- commit 6c4a5f96450415735c31ed70ff354f0ee5cbf67b upstream. Some overlayfs features require permission to read/write trusted.* xattrs. These include redirect_dir, verity, metacopy, and data-only layers. This patch adds additional validations at mount time to stop overlays from mounting in certain cases where the resulting mount would not function according to the user's expectations because they lack permission to access trusted.* xattrs (for example, not global root.) Similar checks in ovl_make_workdir() that disable features instead of failing are still relevant and used in cases where the resulting mount can still work "reasonably well." Generally, if the feature was enabled through kernel config or module option, any mount that worked before will still work the same; this applies to redirect_dir and metacopy. The user must explicitly request these features in order to generate a mount failure. Verity and data-only layers on the other hand must be explicitly requested and have no "reasonable" disabled or degraded alternative, so mounts attempting either always fail. "lower data-only dirs require metacopy support" moved down in case userxattr is set, which disables metacopy. Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Mike Baynton <mike@mbaynton.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Wen Zhiwei <wenzhiwei@kylinos.cn> | 1 year ago | |
ovl: store and show the user provided lowerdir mount option stable inclusion from stable-v6.6.23 commit 26532aeb3cec005d85b99298430077cb44858490 bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=26532aeb3cec005d85b99298430077cb44858490 -------------------------------- [ Upstream commit 0cea4c097d97fdc89de488bd4202d0b087ccec58 ] We are about to add new mount options for adding lowerdir one by one, but those mount options will not support escaping. For the existing case, where lowerdir mount option is provided as a colon separated list, store the user provided (possibly escaped) string and display it as is when showing the lowerdir mount option. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Stable-dep-of: 2824083db76c ("ovl: Always reject mounting over case-insensitive directories") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 years ago | |
ovl: Fix uninit-value in ovl_fill_real stable inclusion from stable-v6.6.128 commit 43b6f69e18063ae01911990f8726e9927e070ebf category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/ Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=43b6f69e18063ae01911990f8726e9927e070ebf -------------------------------- commit 43b6f69e18063ae01911990f8726e9927e070ebf upstream. [ Upstream commit 1992330d90dd766fcf1730fd7bf2d6af65370ac4 ] Syzbot reported a KMSAN uninit-value issue in ovl_fill_real. This issue's call chain is: __do_sys_getdents64() -> iterate_dir() ... -> ext4_readdir() -> fscrypt_fname_alloc_buffer() // alloc -> fscrypt_fname_disk_to_usr // write without tail '\0' -> dir_emit() -> ovl_fill_real() // read by strcmp() The string is used to store the decrypted directory entry name for an encrypted inode. As shown in the call chain, fscrypt_fname_disk_to_usr() writes it without null termination. However, ovl_fill_real() uses strcmp() to compare the name against "..", which assumes a null-terminated string and may trigger a KMSAN uninit-value warning when the buffer tail contains uninit data. Reported-by: syzbot+d130f98b2c265fae5297@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d130f98b2c265fae5297 Fixes: 4edb83bb1041 ("ovl: constant d_ino for non-merge dirs") Signed-off-by: Qing Wang <wangqing7171@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20260128132406.23768-2-amir73il@gmail.com Acked-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Wang Hai <wanghai38@huawei.com> | 25 days ago | |
ovl: don't allow datadir only stable inclusion from stable-v6.6.88 commit 0874b629f65320778e7e3e206177770666d9db18 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID6MDL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0874b629f65320778e7e3e206177770666d9db18 -------------------------------- commit eb3a04a8516ee9b5174379306f94279fc90424c4 upstream. In theory overlayfs could support upper layer directly referring to a data layer, but there's no current use case for this. Originally, when data-only layers were introduced, this wasn't allowed, only introduced by the "datadir+" feature, but without actually handling this case, resulting in an Oops. Fix by disallowing datadir without lowerdir. Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Fixes: 24e16e385f22 ("ovl: add support for appending lowerdirs one by one") Cc: <stable@vger.kernel.org> # v6.7 Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Alexander Larsson <alexl@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0874b629f65320778e7e3e206177770666d9db18) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 7 months ago | |
ovl: Check for NULL d_inode() in ovl_dentry_upper() stable inclusion from stable-v6.6.96 commit 2cbeb47ea983cb79ca6464c5fd1abcb1372a7532 category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/8365 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2cbeb47ea983cb79ca6464c5fd1abcb1372a7532 -------------------------------- [ Upstream commit 8a39f1c870e9d6fbac5638f3a42a6a6363829c49 ] In ovl_path_type() and ovl_is_metacopy_dentry() GCC notices that it is possible for OVL_E() to return NULL (which implies that d_inode(dentry) may be NULL). This would result in out of bounds reads via container_of(), seen with GCC 15's -Warray-bounds -fdiagnostics-details. For example: In file included from arch/x86/include/generated/asm/rwonce.h:1, from include/linux/compiler.h:339, from include/linux/export.h:5, from include/linux/linkage.h:7, from include/linux/fs.h:5, from fs/overlayfs/util.c:7: In function 'ovl_upperdentry_dereference', inlined from 'ovl_dentry_upper' at ../fs/overlayfs/util.c:305:9, inlined from 'ovl_path_type' at ../fs/overlayfs/util.c:216:6: include/asm-generic/rwonce.h:44:26: error: array subscript 0 is outside array bounds of 'struct inode[7486503276667837]' [-Werror=array-bounds=] 44 \| #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x)) \| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/rwonce.h:50:9: note: in expansion of macro '__READ_ONCE' 50 \| __READ_ONCE(x); \ \| ^~~~~~~~~~~ fs/overlayfs/ovl_entry.h:195:16: note: in expansion of macro 'READ_ONCE' 195 \| return READ_ONCE(oi->__upperdentry); \| ^~~~~~~~~ 'ovl_path_type': event 1 185 \| return inode ? OVL_I(inode)->oe : NULL; 'ovl_path_type': event 2 Avoid this by allowing ovl_dentry_upper() to return NULL if d_inode() is NULL, as that means the problematic dereferencing can never be reached. Note that this fixes the over-eager compiler warning in an effort to be able to enable -Warray-bounds globally. There is no known behavioral bug here. Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 2cbeb47ea983cb79ca6464c5fd1abcb1372a7532) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 4 months ago | |
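
The colon rules from the "ovl: modify layer parameter parsing" commit message in the table above can be illustrated with a small user-space sketch. It is not the kernel's parser: `count_layers()` and `struct layer_count` are hypothetical names introduced here, and rejecting empty layer names and a trailing separator is an assumption of this sketch rather than something the commit message states. The leading-":" append form used with fsconfig() is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct layer_count {
	int regular;   /* layers separated by a single ":" */
	int data_only; /* layers introduced by "::" */
};

/*
 * Count the layers in a lowerdir-style string such as
 * "/lower1:/lower2::/lower3::/lower4".
 *
 * Returns 0 on success and -1 if the string breaks a rule:
 *  - runs of unescaped colons other than ":" and "::" are rejected,
 *  - a data-only layer may not be followed by a regular layer,
 *  - data-only layers need at least one regular layer before them,
 *  - (sketch assumption) empty names and trailing separators are rejected.
 */
int count_layers(const char *s, struct layer_count *out)
{
	int in_data = 0; /* becomes 1 once the first "::" has been seen */

	assert(s != NULL && out != NULL);
	out->regular = 0;
	out->data_only = 0;

	if (*s == '\0' || *s == ':')
		return -1; /* empty string, or no regular layer first */

	for (;;) {
		size_t len = 0;
		size_t colons = 0;

		/* Scan one layer name; a backslash escapes the next byte. */
		while (s[len] != '\0' && s[len] != ':') {
			if (s[len] == '\\' && s[len + 1] != '\0')
				len++;
			len++;
		}
		if (len == 0)
			return -1;
		if (in_data)
			out->data_only++;
		else
			out->regular++;
		s += len;
		if (*s == '\0')
			return 0;

		/* Measure the run of unescaped colons that follows. */
		while (s[colons] == ':')
			colons++;
		s += colons;
		if (colons > 2 || *s == '\0')
			return -1; /* ":::" or longer, or trailing separator */
		if (colons == 2)
			in_data = 1;
		else if (in_data)
			return -1; /* regular layer after a data-only layer */
	}
}
```

For the example string `/lower1:/lower2::/lower3::/lower4`, this sketch reports two regular and two data-only layers (four in total), and it returns -1 for any run of three or more unescaped colons instead of counting each colon as a layer.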