| 文件 | 最后提交记录 | 最后更新时间 |
|---|---|---|
ceph: select FS_ENCRYPTION_ALGS if FS_ENCRYPTION stable inclusion from stable-v6.6.14 commit a8b91a92d4d630da26e7a747911d230d42e9f9a9 bugzilla: https://gitee.com/openeuler/kernel/issues/I99TJK Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a8b91a92d4d630da26e7a747911d230d42e9f9a9 -------------------------------- commit 9c896d6bc3dfef86659a6a1fb25ccdea5dbef6a3 upstream. The kconfig options for filesystems that support FS_ENCRYPTION are supposed to select FS_ENCRYPTION_ALGS. This is needed to ensure that required crypto algorithms get enabled as loadable modules or builtin as is appropriate for the set of enabled filesystems. Do this for CEPH_FS so that there aren't any missing algorithms if someone happens to have CEPH_FS as their only enabled filesystem that supports encryption. Cc: stable@vger.kernel.org Fixes: f061feda6c54 ("ceph: add fscrypt ioctls and ceph.fscrypt.auth vxattr") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
ceph: fscrypt_auth handling for ceph Most fscrypt-enabled filesystems store the crypto context in an xattr, but that's problematic for ceph as xatts are governed by the XATTR cap, but we really want the crypto context as part of the AUTH cap. Because of this, the MDS has added two new inode metadata fields: fscrypt_auth and fscrypt_file. The former is used to hold the crypto context, and the latter is used to track the real file size. Parse new fscrypt_auth and fscrypt_file fields in inode traces. For now, we don't use fscrypt_file, but fscrypt_auth is used to hold the fscrypt context. Allow the client to use a setattr request for setting the fscrypt_auth field. Since this is not a standard setattr request from the VFS, we add a new field to __ceph_setattr that carries ceph-specific inode attrs. Have the set_context op do a setattr that sets the fscrypt_auth value, and get_context just return the contents of that field (since it should always be available). Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 2 年前 | |
Merge tag 'ceph-for-6.6-rc1' of https://github.com/ceph/ceph-client Pull ceph updates from Ilya Dryomov: "Mixed with some fixes and cleanups, this brings in reasonably complete fscrypt support to CephFS! The list of things which don't work with encryption should be fairly short, mostly around the edges: fallocate (not supported well in CephFS to begin with), copy_file_range (requires re-encryption), non-default striping patterns. This was a multi-year effort principally by Jeff Layton with assistance from Xiubo Li, Luís Henriques and others, including several dependant changes in the MDS, netfs helper library and fscrypt framework itself" * tag 'ceph-for-6.6-rc1' of https://github.com/ceph/ceph-client: (53 commits) ceph: make num_fwd and num_retry to __u32 ceph: make members in struct ceph_mds_request_args_ext a union rbd: use list_for_each_entry() helper libceph: do not include crypto/algapi.h ceph: switch ceph_lookup/atomic_open() to use new fscrypt helper ceph: fix updating i_truncate_pagecache_size for fscrypt ceph: wait for OSD requests' callbacks to finish when unmounting ceph: drop messages from MDS when unmounting ceph: update documentation regarding snapshot naming limitations ceph: prevent snapshot creation in encrypted locked directories ceph: add support for encrypted snapshot names ceph: invalidate pages when doing direct/sync writes ceph: plumb in decryption during reads ceph: add encryption support to writepage and writepages ceph: add read/modify/write to ceph_sync_write ceph: align data in pages in ceph_sync_write ceph: don't use special DIO path for encrypted inodes ceph: add truncate size handling support for fscrypt ceph: add object version support for sync read libceph: allow ceph_osdc_new_request to accept a multi-op read ... | 2 年前 | |
ceph: supply snapshot context in ceph_uninline_data() stable inclusion from stable-v6.6.128 commit a87a445ac1d9f83335cd00c6f87c590cbe5e9667 category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/ Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a87a445ac1d9f83335cd00c6f87c590cbe5e9667 -------------------------------- commit a87a445ac1d9f83335cd00c6f87c590cbe5e9667 upstream. [ Upstream commit 305ff6b3a03c230d3c07b61457e961406d979693 ] The ceph_uninline_data function was missing proper snapshot context handling for its OSD write operations. Both CEPH_OSD_OP_CREATE and CEPH_OSD_OP_WRITE requests were passing NULL instead of the appropriate snapshot context, which could lead to unnecessary object clone. Reproducer: ../src/vstart.sh --new -x --localhost --bluestore // turn on cephfs inline data ./bin/ceph fs set a inline_data true --yes-i-really-really-mean-it // allow fs_a client to take snapshot ./bin/ceph auth caps client.fs_a mds 'allow rwps fsname=a' mon 'allow r fsname=a' osd 'allow rw tag cephfs data=a' // mount cephfs with fuse, since kernel cephfs doesn't support inline write ceph-fuse --id fs_a -m 127.0.0.1:40318 --conf ceph.conf -d /mnt/mycephfs/ // bump snapshot seq mkdir /mnt/mycephfs/.snap/snap1 echo "foo" > /mnt/mycephfs/test // umount and mount it again using kernel cephfs client umount /mnt/mycephfs mount -t ceph fs_a@.a=/ /mnt/mycephfs/ -o conf=./ceph.conf echo "bar" >> /mnt/mycephfs/test ./bin/rados listsnaps -p cephfs.a.data $(printf "%x\n" $(stat -c %i /mnt/mycephfs/test)).00000000 will see this object does unnecessary clone 1000000000a.00000000 (seq:2): cloneid snaps size overlap 2 2 4 [] head - 8 but it's expected to see 10000000000.00000000 (seq:2): cloneid snaps size overlap head - 8 since there's no snapshot between these 2 writes clone happened because the first osd request CEPH_OSD_OP_CREATE doesn't pass snap context so object is created with snap seq 0, but later data writeback is equipped with snapshot context. snap.seq(1) > object snap seq(0), so osd does object clone. This fix properly acquiring the snapshot context before performing write operations. Signed-off-by: ethanwu <ethanwu@synology.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Wang Hai <wanghai38@huawei.com> | 25 天前 | |
ceph: rename _to_client() to _to_fs_client() stable inclusion from stable-v6.6.29 commit 985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c -------------------------------- [ Upstream commit 5995d90d2d19f337df6a50bcf4699ef053214dac ] We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
netfs: Further cleanups after struct netfs_inode wrapper introduced Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too). Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page. netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there. Changes ======= - Updated the kerneldoc comments and documentation [DH]. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/ | 3 年前 | |
ceph: rename _to_client() to _to_fs_client() stable inclusion from stable-v6.6.29 commit 985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c -------------------------------- [ Upstream commit 5995d90d2d19f337df6a50bcf4699ef053214dac ] We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 8 年前 | |
ceph: fix oops due to invalid pointer for kfree() in parse_longname() mainline inclusion from mainline-v6.19 commit bc8dedae022ce3058659c3addef3ec4b41d15e00 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13836/ CVE: CVE-2026-23201 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bc8dedae022ce3058659c3addef3ec4b41d15e00 -------------------------------- This fixes a kernel oops when reading ceph snapshot directories (.snap), for example by simply running ls /mnt/my_ceph/.snap. The variable str is guarded by __free(kfree), but advanced by one for skipping the initial '_' in snapshot names. Thus, kfree() is called with an invalid pointer. This patch removes the need for advancing the pointer so kfree() is called with correct memory pointer. Steps to reproduce: 1. Create snapshots on a cephfs volume (I've 63 snaps in my testcase) 2. Add cephfs mount to fstab $ echo "samba-fileserver@.files=/volumes/datapool/stuff/3461082b-ecc9-4e82-8549-3fd2590d3fb6 /mnt/test/stuff ceph acl,noatime,_netdev 0 0" >> /etc/fstab 3. Reboot the system $ systemctl reboot 4. Check if it's really mounted $ mount | grep stuff 5. List snapshots (expected 63 snapshots on my system) $ ls /mnt/test/stuff/.snap Now ls hangs forever and the kernel log shows the oops. Cc: stable@vger.kernel.org Fixes: 101841c38346 ("[ceph] parse_longname(): strrchr() expects NUL-terminated string") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220807 Suggested-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vogelbacher <daniel@chaospixel.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Conflicts: fs/ceph/crypto.c [Context conflicts due to unmmerged commit 38d46409c463 ("ceph: print cluster fsid and client global_id in all debug logs")] Signed-off-by: Pan Taixi <pantaixi1@huawei.com> | 3 个月前 | |
ceph: add support for encrypted snapshot names Since filenames in encrypted directories are encrypted and shown as a base64-encoded string when the directory is locked, make snapshot names show a similar behaviour. When creating a snapshot, .snap directories for every subdirectory will show the snapshot name in the "long format": # mkdir .snap/my-snap # ls my-dir/.snap/ _my-snap_1099511627782 Encrypted snapshots will need to be able to handle these by encrypting/decrypting only the snapshot part of the string ('my-snap'). Also, since the MDS prevents snapshot names to be bigger than 240 characters it is necessary to adapt CEPH_NOHASH_NAME_MAX to accommodate this extra limitation. [ idryomov: drop const on !CONFIG_FS_ENCRYPTION branch too ] Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 2 年前 | |
ceph: add a bunch of missing ceph_path_info initializers mainline inclusion from mainline-v7.0-rc4 commit 43323a5934b660afae687e8e4e95ac328615a5c4 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14967 CVE: CVE-2026-43408 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43323a5934b660afae687e8e4e95ac328615a5c4 -------------------------------- ceph_mdsc_build_path() must be called with a zero-initialized ceph_path_info parameter, or else the following ceph_mdsc_free_path_info() may crash. Example crash (on Linux 6.18.12): virt_to_cache: Object is not a Slab page! WARNING: CPU: 184 PID: 2871736 at mm/slub.c:6732 kmem_cache_free+0x316/0x400 [...] Call Trace: [...] ceph_open+0x13d/0x3e0 do_dentry_open+0x134/0x480 vfs_open+0x2a/0xe0 path_openat+0x9a3/0x1160 [...] cache_from_obj: Wrong slab cache. names_cache but object is from ceph_inode_info WARNING: CPU: 184 PID: 2871736 at mm/slub.c:6746 kmem_cache_free+0x2dd/0x400 [...] kernel BUG at mm/slub.c:634! Oops: invalid opcode: 0000 [#1] SMP NOPTI RIP: 0010:__slab_free+0x1a4/0x350 Some of the ceph_mdsc_build_path() callers had initializers, but others had not, even though they were all added by commit 15f519e9f883 ("ceph: fix race condition validating r_parent before applying state"). The ones without initializer are suspectible to random crashes. (I can imagine it could even be possible to exploit this bug to elevate privileges.) Unfortunately, these Ceph functions are undocumented and its semantics can only be derived from the code. I see that ceph_mdsc_build_path() initializes the structure only on success, but not on error. Calling ceph_mdsc_free_path_info() after a failed ceph_mdsc_build_path() call does not even make sense, but that's what all callers do, and for it to be safe, the structure must be zero-initialized. The least intrusive approach to fix this is therefore to add initializers everywhere. Cc: stable@vger.kernel.org Fixes: 15f519e9f883 ("ceph: fix race condition validating r_parent before applying state") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Conflicts: fs/ceph/debugfs.c fs/ceph/dir.c fs/ceph/file.c fs/ceph/inode.c [Context conflicts in fs/ceph/debugfs.c. For other files, struct ceph_path_info path_info are all properly initialized to {0}] Signed-off-by: Pan Taixi <pantaixi1@huawei.com> | 1 个月前 | |
ceph: only d_add() negative dentries when they are unhashed stable inclusion from stable-v6.6.140 commit 83ce43a21bb7df8dd52228afdd918d2d058eefde category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/15294 CVE: CVE-2026-46052 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=83ce43a21bb7df8dd52228afdd918d2d058eefde -------------------------------- [ Upstream commit 803447f93d75ab6e40c85e6d12b5630d281d70d6 ] Ceph can call d_add(dentry, NULL) on a negative dentry that is already present in the primary dcache hash. In the current VFS that is not safe. d_add() goes through __d_add() to __d_rehash(), which unconditionally reinserts dentry->d_hash into the hlist_bl bucket. If the dentry is already hashed, reinserting the same node can corrupt the bucket, including creating a self-loop. Once that happens, __d_lookup() can spin forever in the hlist_bl walk, typically looping only on the d_name.hash mismatch check and eventually triggering RCU stall reports like this one: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 87-....: (2100 ticks this GP) idle=3a4c/1/0x4000000000000000 softirq=25003319/25003319 fqs=829 rcu: (t=2101 jiffies g=79058445 q=698988 ncpus=192) CPU: 87 UID: 2952868916 PID: 3933303 Comm: php-cgi8.3 Not tainted 6.18.17-i1-amd #950 NONE Hardware name: Dell Inc. PowerEdge R7615/0G9DHV, BIOS 1.6.6 09/22/2023 RIP: 0010:__d_lookup+0x46/0xb0 Code: c1 e8 07 48 8d 04 c2 48 8b 00 49 89 fc 49 89 f5 48 89 c3 48 83 e3 fe 48 83 f8 01 77 0f eb 2d 0f 1f 44 00 00 48 8b 1b 48 85 db <74> 20 39 6b 18 75 f3 48 8d 7b 78 e8 ba 85 d0 00 4c 39 63 10 74 1f RSP: 0018:ff745a70c8253898 EFLAGS: 00000282 RAX: ff26e470054cb208 RBX: ff26e470054cb208 RCX: 000000006e958966 RDX: ff26e48267340000 RSI: ff745a70c82539b0 RDI: ff26e458f74655c0 RBP: 000000006e958966 R08: 0000000000000180 R09: 9cd08d909b919a89 R10: ff26e458f74655c0 R11: 0000000000000000 R12: ff26e458f74655c0 R13: ff745a70c82539b0 R14: d0d0d0d0d0d0d0d0 R15: 2f2f2f2f2f2f2f2f FS: 00007f5770896980(0000) GS:ff26e482c5d88000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5764de50c0 CR3: 000000a72abb5001 CR4: 0000000000771ef0 PKRU: 55555554 Call Trace: <TASK> lookup_fast+0x9f/0x100 walk_component+0x1f/0x150 link_path_walk+0x20e/0x3d0 path_lookupat+0x68/0x180 filename_lookup+0xdc/0x1e0 vfs_statx+0x6c/0x140 vfs_fstatat+0x67/0xa0 __do_sys_newfstatat+0x24/0x60 do_syscall_64+0x6a/0x230 entry_SYSCALL_64_after_hwframe+0x76/0x7e This is reachable with reused cached negative dentries. A Ceph lookup or atomic_open can be handed a negative dentry that is already hashed, and fs/ceph/dir.c then hits one of two paths that incorrectly assume "negative" also means "unhashed": - ceph_finish_lookup(): MDS reply is -ENOENT with no trace -> d_add(dentry, NULL) - ceph_lookup(): local ENOENT fast path for a complete directory with shared caps -> d_add(dentry, NULL) Both paths can therefore re-add an already-hashed negative dentry. Ceph already uses the correct pattern elsewhere: ceph_fill_trace() only calls d_add(dn, NULL) for a negative null-dentry reply when d_unhashed(dn) is true. Fix both fs/ceph/dir.c sites the same way: only call d_add() for a negative dentry when it is actually unhashed. If the negative dentry is already hashed, leave it in place and reuse it as-is. This preserves the existing behavior for unhashed dentries while avoiding d_hash list corruption for reused hashed negatives. Cc: stable@vger.kernel.org Fixes: 2817b000b02c ("ceph: directory operations") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> [ kept existing dout() debug call instead of upstream's doutc() form when adding the d_unhashed() guard around d_add() ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tengda Wu <wutengda2@huawei.com> | 29 天前 | |
ceph: rename _to_client() to _to_fs_client() stable inclusion from stable-v6.6.29 commit 985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c -------------------------------- [ Upstream commit 5995d90d2d19f337df6a50bcf4699ef053214dac ] We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
ceph: supply snapshot context in ceph_zero_partial_object() stable inclusion from stable-v6.6.128 commit 9efa154609cdb658f51c7d76b30a09f7e6485250 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14817 CVE: CVE-2026-43273 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9efa154609cdb658f51c7d76b30a09f7e6485250 -------------------------------- [ Upstream commit f16bd3fa74a2084ee7e16a8a2be7e7399b970907 ] The ceph_zero_partial_object function was missing proper snapshot context for its OSD write operations, which could lead to data inconsistencies in snapshots. Reproducer: ../src/vstart.sh --new -x --localhost --bluestore ./bin/ceph auth caps client.fs_a mds 'allow rwps fsname=a' mon 'allow r fsname=a' osd 'allow rw tag cephfs data=a' mount -t ceph fs_a@.a=/ /mnt/mycephfs/ -o conf=./ceph.conf dd if=/dev/urandom of=/mnt/mycephfs/foo bs=64K count=1 mkdir /mnt/mycephfs/.snap/snap1 md5sum /mnt/mycephfs/.snap/snap1/foo fallocate -p -o 0 -l 4096 /mnt/mycephfs/foo echo 3 > /proc/sys/vm/drop/caches md5sum /mnt/mycephfs/.snap/snap1/foo # get different md5sum!! Cc: stable@vger.kernel.org Fixes: ad7a60de882ac ("ceph: punch hole support") Signed-off-by: ethanwu <ethanwu@synology.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Pu Lehui <pulehui@huawei.com> | 1 个月前 | |
ceph: Fix incorrect flush end position calculation stable inclusion from stable-v6.6.89 commit f55e7f8abbd3f6f2402882183051f3bbc49a7b4c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ID73WA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f55e7f8abbd3f6f2402882183051f3bbc49a7b4c -------------------------------- [ Upstream commit f452a2204614fc10e2c3b85904c4bd300c2789dc ] In ceph, in fill_fscrypt_truncate(), the end flush position is calculated by: loff_t lend = orig_pos + CEPH_FSCRYPT_BLOCK_SHIFT - 1; but that's using the block shift not the block size. Fix this to use the block size instead. Fixes: 5c64737d2536 ("ceph: add truncate size handling support for fscrypt") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit f55e7f8abbd3f6f2402882183051f3bbc49a7b4c) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 7 个月前 | |
ceph: fix kerneldoc copypasta over ceph_start_io_direct Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 5 年前 | |
ceph: add buffered/direct exclusionary locking for reads and writes xfstest generic/451 intermittently fails. The test does O_DIRECT writes to a file, and then reads back the result using buffered I/O, while running a separate set of tasks that are also doing buffered reads. The client will invalidate the cache prior to a direct write, but it's easy for one of the other readers' replies to race in and reinstantiate the invalidated range with stale data. To fix this, we must to serialize direct I/O writes and buffered reads. We could just sprinkle in some shared locks on the i_rwsem for reads, and increase the exclusive footprint on the write side, but that would cause O_DIRECT writes to end up serialized vs. other direct requests. Instead, borrow the scheme used by nfs.ko. Buffered writes take the i_rwsem exclusively, but buffered reads take a shared lock, allowing them to run in parallel. O_DIRECT requests also take a shared lock, but we need for them to not run in parallel with buffered reads. A flag on the ceph_inode_info is used to indicate whether it's in direct or buffered I/O mode. When a conflicting request is submitted, it will block until the inode can be flipped to the necessary mode. Link: https://tracker.ceph.com/issues/40985 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 6 年前 | |
ceph: rename _to_client() to _to_fs_client() stable inclusion from stable-v6.6.29 commit 985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c -------------------------------- [ Upstream commit 5995d90d2d19f337df6a50bcf4699ef053214dac ] We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 8 年前 | |
ceph: add checking of wait_for_completion_killable() return value stable inclusion from stable-v6.6.117 commit 34353a0cd39bdda2fc6ea4b575d2a1770034097a category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/8763 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=34353a0cd39bdda2fc6ea4b575d2a1770034097a -------------------------------- [ Upstream commit b7ed1e29cfe773d648ca09895b92856bd3a2092d ] The Coverity Scan service has detected the calling of wait_for_completion_killable() without checking the return value in ceph_lock_wait_for_completion() [1]. The CID 1636232 defect contains explanation: "If the function returns an error value, the error value may be mistaken for a normal value. In ceph_lock_wait_for_completion(): Value returned from a function is not checked for errors before being used. (CWE-252)". The patch adds the checking of wait_for_completion_killable() return value and return the error code from ceph_lock_wait_for_completion(). [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1636232 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 34353a0cd39bdda2fc6ea4b575d2a1770034097a) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 3 个月前 | |
ceph: fix memory leaks in ceph_mdsc_build_path() stable inclusion from stable-v6.6.130 commit 657dc653b06a3cc0282aea447a3f137fa94066a4 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/14977 CVE: CVE-2026-43419 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=657dc653b06a3cc0282aea447a3f137fa94066a4 -------------------------------- commit 040d159a45ded7f33201421a81df0aa2a86e5a0b upstream. Add __putname() calls to error code paths that did not free the "path" pointer obtained by __getname(). If ownership of this pointer is not passed to the caller via path_info.path, the function must free it before returning. Cc: stable@vger.kernel.org Fixes: 3fd945a79e14 ("ceph: encode encrypted name in ceph_mdsc_build_path and dentry release") Fixes: 550f7ca98ee0 ("ceph: give up on paths longer than PATH_MAX") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: fs/ceph/mds_client.c [Context conflict.] Signed-off-by: Pu Lehui <pulehui@huawei.com> | 1 个月前 | |
ceph: fix race condition validating r_parent before applying state mainline inclusion from mainline-v6.17-rc6 commit 15f519e9f883b316d86e2bb6b767a023aafd9d83 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/8352 CVE: CVE-2025-39927 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15f519e9f883b316d86e2bb6b767a023aafd9d83 -------------------------------- Add validation to ensure the cached parent directory inode matches the directory info in MDS replies. This prevents client-side race conditions where concurrent operations (e.g. rename) cause r_parent to become stale between request initiation and reply processing, which could lead to applying state changes to incorrect directory inodes. [ idryomov: folded a kerneldoc fixup and a follow-up fix from Alex to move CEPH_CAP_PIN reference when r_parent is updated: When the parent directory lock is not held, req->r_parent can become stale and is updated to point to the correct inode. However, the associated CEPH_CAP_PIN reference was not being adjusted. The CEPH_CAP_PIN is a reference on an inode that is tracked for accounting purposes. Moving this pin is important to keep the accounting balanced. When the pin was not moved from the old parent to the new one, it created two problems: The reference on the old, stale parent was never released, causing a reference leak. A reference for the new parent was never acquired, creating the risk of a reference underflow later in ceph_mdsc_release_request(). This patch corrects the logic by releasing the pin from the old parent and acquiring it for the new parent when r_parent is switched. This ensures reference accounting stays balanced. ] Cc: stable@vger.kernel.org Signed-off-by: Alex Markuze <amarkuze@redhat.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Conflicts: fs/ceph/debugfs.c fs/ceph/dir.c fs/ceph/file.c fs/ceph/inode.c fs/ceph/mds_client.c fs/ceph/mds_client.h [The code has undergone multiple refactorings, leading to many context conflicts.] Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> | 5 个月前 | |
ceph: pass the mdsc to several helpers stable inclusion from stable-v6.6.29 commit 2e2023e9a4c2936a33da1cf66782e83e26e5b635 bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2e2023e9a4c2936a33da1cf66782e83e26e5b635 -------------------------------- [ Upstream commit 197b7d792d6aead2e30d4b2c054ffabae2ed73dc ] We will use the 'mdsc' to get the global_id in the following commits. Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
ceph: never send metrics if disable_send_metrics is set Even the 'disable_send_metrics' is true so when the session is being opened it will always trigger to send the metric for the first time. Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 2 年前 | |
ceph: include average/stdev r/w/m latency in mds metrics stdev is computed in cephfs-top tool - clients forward square of sums and IO count required to calculate stdev. Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 4 年前 | |
ceph: fix invalid pointer access if get_quota_realm return ERR_PTR stable inclusion from stable-v6.6.16 commit d8fedfb4be52ce6e61010cbbeaef16d2e7b93e22 bugzilla: https://gitee.com/openeuler/kernel/issues/I99TJK Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d8fedfb4be52ce6e61010cbbeaef16d2e7b93e22 -------------------------------- [ Upstream commit 0f4cf64eabc6e16cfc2704f1960e82dc79d91c8d ] This issue is reported by smatch that get_quota_realm() might return ERR_PTR but we did not handle it. It's not a immediate bug, while we still should address it to avoid potential bugs if get_quota_realm() is changed to return other ERR_PTR in future. Set ceph_snap_realm's pointer in get_quota_realm()'s to address this issue, the pointer would be set to NULL if get_quota_realm() failed to get struct ceph_snap_realm, so no ERR_PTR would happen any more. [ xiubli: minor code style clean up ] Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
ceph: rename _to_client() to _to_fs_client() stable inclusion from stable-v6.6.29 commit 985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=985b9ee8a2cfc702cb9a63ca4f22b0b244297d2c -------------------------------- [ Upstream commit 5995d90d2d19f337df6a50bcf4699ef053214dac ] We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: b372e96bd0a3 ("ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> | 2 年前 | |
ceph: add getvxattr op Problem: Some directory vxattrs (e.g. ceph.dir.pin.random) are governed by information that isn't necessarily shared with the client. Add support for the new GETVXATTR operation, which allows the client to query the MDS directly for vxattrs. When the client is queried for a vxattr that doesn't have a special handler, have it issue a GETVXATTR to the MDS directly. Solution: Adds new getvxattr op to fetch ceph.dir.pin*, ceph.dir.layout* and ceph.file.layout* vxattrs. If the entire layout for a dir or a file is being set, then it is expected that the layout be set in standard JSON format. Individual field value retrieval is not wrapped in JSON. The JSON format also applies while setting the vxattr if the entire layout is being set in one go. As a temporary measure, setting a vxattr can also be done in the old format. The old format will be deprecated in the future. URL: https://tracker.ceph.com/issues/51062 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 4 年前 | |
ceph: set superblock s_magic for IMA fsmagic matching stable inclusion from stable-v6.6.95 commit 96707ff5818f29926e4dd3626f56035bf7e45b2b category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/8365 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=96707ff5818f29926e4dd3626f56035bf7e45b2b -------------------------------- commit 72386d5245b249f5a0a8fabb881df7ad947b8ea4 upstream. The CephFS kernel driver forgets to set the filesystem magic signature in its superblock. As a result, IMA policy rules based on fsmagic matching do not apply as intended. This causes a major performance regression in Talos Linux [1] when mounting CephFS volumes, such as when deploying Rook Ceph [2]. Talos Linux ships a hardened kernel with the following IMA policy (irrelevant lines omitted): [...] dont_measure fsmagic=0xc36400 # CEPH_SUPER_MAGIC [...] measure func=FILE_CHECK mask=^MAY_READ euid=0 measure func=FILE_CHECK mask=^MAY_READ uid=0 [...] Currently, IMA compares 0xc36400 == 0x0 for CephFS files, resulting in all files opened with O_RDONLY or O_RDWR getting measured with SHA512 on every open(2): 10 69990c87e8af323d47e2d6ae4... ima-ng sha512:<hash> /data/cephfs/test-file Since O_WRONLY is rare, this results in an order of magnitude lower performance than expected for practically all file operations. Properly setting CEPH_SUPER_MAGIC in the CephFS superblock resolves the regression. Tests performed on a 3x replicated Ceph v19.3.0 cluster across three i5-7200U nodes each equipped with one Micron 7400 MAX M.2 disk (BlueStore) and Gigabit ethernet, on Talos Linux v1.10.2: FS-Mark 3.3 Test: 500 Files, Empty Files/s > Higher Is Better 6.12.27-talos . 16.6 |==== +twelho patch . 208.4 |==================================================== FS-Mark 3.3 Test: 500 Files, 1KB Size Files/s > Higher Is Better 6.12.27-talos . 15.6 |======= +twelho patch . 118.6 |==================================================== FS-Mark 3.3 Test: 500 Files, 32 Sub Dirs, 1MB Size Files/s > Higher Is Better 6.12.27-talos . 12.7 |=============== +twelho patch . 44.7 |===================================================== IO500 [3] 2fcd6d6 results (benchmarks within variance omitted): | IO500 benchmark | 6.12.27-talos | +twelho patch | Speedup | |-------------------|----------------|----------------|-----------| | mdtest-easy-write | 0.018524 kIOPS | 1.135027 kIOPS | 6027.33 % | | mdtest-hard-write | 0.018498 kIOPS | 0.973312 kIOPS | 5161.71 % | | ior-easy-read | 0.064727 GiB/s | 0.155324 GiB/s | 139.97 % | | mdtest-hard-read | 0.018246 kIOPS | 0.780800 kIOPS | 4179.29 % | This applies outside of synthetic benchmarks as well, for example, the time to rsync a 55 MiB directory with ~12k of mostly small files drops from an unusable 10m5s to a reasonable 26s (23x the throughput). [1]: https://www.talos.dev/ [2]: https://www.talos.dev/v1.10/kubernetes-guides/configuration/ceph-with-rook/ [3]: https://github.com/IO500/io500 Cc: stable@vger.kernel.org Signed-off-by: Dennis Marttinen <twelho@welho.tech> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 96707ff5818f29926e4dd3626f56035bf7e45b2b) Signed-off-by: Wentao Guan <guanwentao@uniontech.com> | 4 个月前 | |
ceph: try to allocate a smaller extent map for sparse read stable inclusion from stable-v6.6.69 commit fb98248fc4a26e73d032e53160aee5d566d410cc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBNEPJ Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fb98248fc4a26e73d032e53160aee5d566d410cc -------------------------------- [ Upstream commit aaefabc4a5f7ae48682c4d2d5d10faaf95c08eb9 ] In fscrypt case and for a smaller read length we can predict the max count of the extent map. And for small read length use cases this could save some memories. [ idryomov: squash into a single patch to avoid build break, drop redundant variable in ceph_alloc_sparse_ext_map() ] Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: 18d44c5d062b ("ceph: allocate sparse_ext map only for sparse reads") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Wen Zhiwei <wenzhiwei@kylinos.cn> | 1 年前 | |
ceph: move net/ceph/ceph_fs.c to fs/ceph/util.c All of these functions are only called from CephFS, so move them into ceph.ko, and drop the exports. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> | 6 年前 | |
ceph: fix a buffer leak in __ceph_setxattr() stable inclusion from stable-v6.6.141 commit 4bfdcefdaa6092a06cacd59389c7756b36e6de8c category: bugfix bugzilla: https://atomgit.com/openeuler/kernel/issues/ Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4bfdcefdaa6092a06cacd59389c7756b36e6de8c -------------------------------- commit 5d3cc36b4e77a27ce7b686b7c59c7072bcb3fa8e upstream. The old_blob in __ceph_setxattr() can store ci->i_xattrs.prealloc_blob value during the retry. However, it is never called the ceph_buffer_put() for the old_blob object. This patch fixes the issue of the buffer leak. Cc: stable@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Wang Hai <wanghai38@huawei.com> | 25 天前 |
| 文件 | 最后提交记录 | 最后更新时间 |
|---|---|---|
| 2 年前 | ||
| 2 年前 | ||
| 2 年前 | ||
| 25 天前 | ||
| 2 年前 | ||
| 3 年前 | ||
| 2 年前 | ||
| 8 年前 | ||
| 3 个月前 | ||
| 2 年前 | ||
| 1 个月前 | ||
| 29 天前 | ||
| 2 年前 | ||
| 1 个月前 | ||
| 7 个月前 | ||
| 5 年前 | ||
| 6 年前 | ||
| 2 年前 | ||
| 8 年前 | ||
| 3 个月前 | ||
| 1 个月前 | ||
| 5 个月前 | ||
| 2 年前 | ||
| 2 年前 | ||
| 4 年前 | ||
| 2 年前 | ||
| 2 年前 | ||
| 4 年前 | ||
| 4 个月前 | ||
| 1 年前 | ||
| 6 年前 | ||
| 25 天前 |