If I mount smb fs with mount_smbfs -Eiso8859-2:cp852 -Lcs_CZ.ISO8859-2 ...
and there are some file/directory names, where some characters could not
be converted, directory tree traversion hangs. It is because iconv_xxx()
function returns -1, which is forwarded to the upper layer, where -1
is incorrectly taken as ERESTART and path conversion is restarted from
the beginning, which results in infinite loop.
Fix: I'm currently trying the following patch, which fixes the hang problem,
however I'm not sure, how much is it really correct.
Try to mount some smb fs with character conversions (for example, I do
use -Eiso8859-2:cp852,-Lcs_CZ.ISO8859-2), where iconv could not convert
all characters in file/directory names and try to run find . on mounted
directory - it should hang when it finds nonconvertible file/directory
name (maybe it "works" just for directory names, which is the case
where I tried to find the problem).
The patch looks correct.
bp@ has stated that this patch looks correct.
Over to maintainer(s).
Created attachment 175596 [details]
Revised patch for stable/11 and head
This problem occurs on haed and stable/11, and maybe older branches.
Unfortunately, previous patch no longer applicable to stable/10 and head.
Attached patch is fixed to be applicable to stable/11 and head, amd64.
Not 100% sure the check order is correct, but running OK for me.
(Not tryed for stable/10 and earlier branch.)
Without this, `cp -a` stoppes (running but in infinite loop?) on directory containing Japanese (UTF-8) filename if any conversion is specified (even if it's UTF-8:UTF-8).
Can someone confirm and commit?
(In reply to Tomoaki AOKI from comment #4)
> Without this, `cp -a` stoppes (running but in infinite loop?) on directory
> containing Japanese (UTF-8) filename if any conversion is specified (even if
> it's UTF-8:UTF-8).
> Can someone confirm and commit?
If I understand the code correctly, with such patch you can get a mix of partially converted name with the remaining unconverted part. It may lead to unpredictable results for charsets with different character length, I guess. Maybe it will be better to fix iconv? Or what if smb_copy_conv() will use some reasonable error code? Can you test this patch?
@@ -334,7 +334,7 @@ smb_copy_iconv(struct mbchain *mbp, c_caddr_t src, caddr_t dst,
*dstlen -= outlen;
- return error;
+ return (EINVAL);
(In reply to Andrey V. Elsukov from comment #5)
With my patch, if smb_copy_iconv() is called recursively, you're right.
If not, my patch can be better, as whole src is kept unchanged if iconv_conv() returns -1.
*According to the prototype of iconv_conv() in src/sys/sys/iconv.h,
src is constant, so shouldn't be changed at all by iconv_conv().
With my patch, if modified code path is executed, cp would copy with original filename unchanged and succeeds, while with yours the file is skipped.
Anyway, your patch would be safer and cp didn't hang for me, at least for now, while my patch can be more convenient if iconv implementation is problematic and the src string should be safe for dst.
Will need another eye to proceed.
By the way, your patch couldn't be applied (incomplete and need fix).
Sorry for delay.
Created attachment 175826 [details]
Andrey's patch modified to be applicable
Andrey's patch fixed to be applicable.
*Added truncated tail portion.
*Some space to tab conversion.
(In reply to Tomoaki AOKI from comment #6)
> (In reply to Andrey V. Elsukov from comment #5)
> With my patch, if smb_copy_iconv() is called recursively, you're right.
> If not, my patch can be better, as whole src is kept unchanged if
> iconv_conv() returns -1.
> *According to the prototype of iconv_conv() in src/sys/sys/iconv.h,
> src is constant, so shouldn't be changed at all by iconv_conv().
> With my patch, if modified code path is executed, cp would copy with
> original filename unchanged and succeeds, while with yours the file is
mb_put_mem() can allocate new mbuf and link it with existent. Imagine that one mbuf will be not enough to store converted name. In such case smb_copy_iconv() will be called several times for one name (to convert parts of name stored in different mbufs). So, it is possible, that first part will be converted correctly, but second call will fail. In such case I think it is not correct to just copy part of name as is.
But I don't know how real this scenario can be :)
For bugs that match the following
- Status Is In progress
- Untouched since 2018-01-01.
- Affects Base System OR Documentation
Reset to open status.
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Created attachment 233412 [details]
Fix build Andrey's patch at and after main git 8b83d7e0ee54416b0ee58bd85f9c0ae7fb3357a1
At main git 8b83d7e0ee54416b0ee58bd85f9c0ae7fb3357a1, unused-but-set-variables are no longer allowed.
This breaks kernel build with Andrey's patch applied.
Fix this by deleting variable "error" definition and avoid setting "error" by casting function iconv_conv() as void inside function smb_copy_iconv().
No functional change intended.