kernel/fs/unicode · openEuler/kernel - AtomGit

Wwenzhiwei11unicode: Fix utf8_load() error path

11d9d08c创建于 2025年2月10日历史提交

文件	最后提交记录	最后更新时间
.gitignore	unicode: fix .gitignore for generated utfdata file Commit 2b3d04787012 ("unicode: Add utf8-data module") changed the generated utf8data file from 'utf8data.h' to 'utf8data.c', but didn't change the comments or the .gitignore to match. The comments should be updated too, but at least they don't cause any visible breakage. But the gitignore file needs changing to avoid git complaining about untracked files. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	4 年前
Kconfig	unicode: clean up the Kconfig symbol confusion Turn the CONFIG_UNICODE symbol into a tristate that generates some always built in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol. Note that a lot of the IS_ENABLED() checks could be turned from cpp statements into normal ifs, but this change is intended to be fairly mechanic, so that should be cleaned up later. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>	4 年前
Makefile	kbuild: unify cmd_copy and cmd_shipped cmd_copy and cmd_shipped have similar functionality. The difference is that cmd_copy uses 'cp' while cmd_shipped 'cat'. Unify them into cmd_copy because this macro name is more intuitive. Going forward, cmd_copy will use 'cat' to avoid the permission issue. I also thought of 'cp --no-preserve=mode' but this option is not mentioned in the POSIX spec [1], so I am keeping the 'cat' command. [1]: https://pubs.opengroup.org/onlinepubs/009695299/utilities/cp.html Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>	4 年前
README.utf8data	unicode: update to Unicode 12.1.0 final Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>	7 年前
mkutf8data.c	Revert "unicode: Don't special case ignorable code points" stable inclusion from stable-v6.6.66 commit 7535956ffe5bd9060402b4e67f91792c4cdf1ed2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBD0BE Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7535956ffe5bd9060402b4e67f91792c4cdf1ed2 -------------------------------- [ Upstream commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b ] This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91. It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior. So now you can't look up those names, because they hash to something different. Of course, it's also entirely possible that in the meantime people have created new files with the new ("more correct") case folding logic, and reverting will just make other things break. The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug. Reported-by: Qi Han <hanqi@vivo.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Cc: Gabriel Krisman Bertazi <krisman@suse.de> Requested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Cheng Yu <serein.chengyu@huawei.com>	1 年前
utf8-core.c	unicode: Fix utf8_load() error path stable inclusion from stable-v6.6.64 commit c4b6c1781f6cc4e2283120ac8d873864b8056f21 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBL4B6 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c4b6c1781f6cc4e2283120ac8d873864b8056f21 -------------------------------- [ Upstream commit 156bb2c569cd869583c593d27a5bd69e7b2a4264 ] utf8_load() requests the symbol "utf8_data_table" and then checks if the requested UTF-8 version is supported. If it's unsupported, it tries to put the data table using symbol_put(). If an unsupported version is requested, symbol_put() fails like this: kernel BUG at kernel/module/main.c:786! RIP: 0010:__symbol_put+0x93/0xb0 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die+0x2e/0x50 ? do_trap+0xca/0x110 ? do_error_trap+0x65/0x80 ? __symbol_put+0x93/0xb0 ? exc_invalid_op+0x51/0x70 ? __symbol_put+0x93/0xb0 ? asm_exc_invalid_op+0x1a/0x20 ? __pfx_cmp_name+0x10/0x10 ? __symbol_put+0x93/0xb0 ? __symbol_put+0x62/0xb0 utf8_load+0xf8/0x150 That happens because symbol_put() expects the unique string that identify the symbol, instead of a pointer to the loaded symbol. Fix that by using such string. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20240902225511.757831-2-andrealmeid@igalia.com Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Wen Zhiwei <wenzhiwei@kylinos.cn>	1 年前
utf8-norm.c	unicode: only export internal symbols for the selftests The exported symbols in utf8-norm.c are not needed for normal file system consumers, so move them to conditional _GPL exports just for the selftest. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>	4 年前
utf8-selftest.c	unicode: Add utf8-data module utf8data.h contains a large database table which is an auto-generated decodification trie for the unicode normalization functions. Allow building it into a separate module. Based on a patch from Shreeya Patel <shreeya.patel@collabora.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>	4 年前
utf8data.c_shipped	Revert "unicode: Don't special case ignorable code points" stable inclusion from stable-v6.6.66 commit 7535956ffe5bd9060402b4e67f91792c4cdf1ed2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBD0BE Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7535956ffe5bd9060402b4e67f91792c4cdf1ed2 -------------------------------- [ Upstream commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b ] This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91. It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior. So now you can't look up those names, because they hash to something different. Of course, it's also entirely possible that in the meantime people have created new files with the new ("more correct") case folding logic, and reverting will just make other things break. The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug. Reported-by: Qi Han <hanqi@vivo.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Cc: Gabriel Krisman Bertazi <krisman@suse.de> Requested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Cheng Yu <serein.chengyu@huawei.com>	1 年前
utf8n.h	unicode: Add utf8-data module utf8data.h contains a large database table which is an auto-generated decodification trie for the unicode normalization functions. Allow building it into a separate module. Based on a patch from Shreeya Patel <shreeya.patel@collabora.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>	4 年前