Bug 264196 - sysutils/fusefs-ntfs: read(2)ing big files in blocks of 32768 bytes and smaller fails with EINVAL
Summary: sysutils/fusefs-ntfs: read(2)ing big files in blocks of 32768 bytes and small...
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-ports-bugs (Nobody)
URL:
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2022-05-24 04:11 UTC by Alexey Dokuchaev
Modified: 2023-01-25 20:16 UTC (History)
4 users (show)

See Also:
bugzilla: maintainer-feedback? (freebsd)


Attachments
Demo program that does the read(2) (803 bytes, text/plain)
2022-05-24 04:11 UTC, Alexey Dokuchaev
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-24 04:11:08 UTC
Created attachment 234168 [details]
Demo program that does the read(2)

With the recent 14.0-CURRENT (n255230) and fusefs-ntfs-2017.3.23 (UBLIO=off), I'm frequently getting errors when copying not-very-small files with scp(1) to remote machines, but not with rsync(1); cat(1)ing file e.g. to /dev/null also fails.  Running those programs under truss(1) reveals interesting bug: reading the files on Windows 10 formatted NTFS with block of size 32768 and smaller (stat(2) returns optimal size of 4096 bytes, which both scp(1) and cat(1) use) fails after several blocks; rsync(1) uses 262144 and does not fail.  Even more interestingly, it does not fail on first couple of blocks and on previously read ones, i.e. once the file had been read, it can be read again with smaller block size.  I've written a program to demonstrate this (attached):

$ ls -l xorgproto-2021.5.tbz
-rw-r--r--  1 danfe  wheel  245998 21 янв.  14:11 xorgproto-2021.5.tbz
$ /tmp/a.out xorgproto-2021.5.tbz 32768
using i/o size 32768 (optimal 4096)
read: Invalid argument
1 reads were successful

$ /tmp/a.out xorgproto-2021.5.tbz 32768
using i/o size 32768 (optimal 4096)
read: Invalid argument
3 reads were successful

$ /tmp/a.out xorgproto-2021.5.tbz 32768
using i/o size 32768 (optimal 4096)

$ /tmp/a.out xorgproto-2021.5.tbz -1
using i/o size 4096 (optimal 4096)

I'm not sure if the bug is in the port or fusefs(5) implementation, so I'm CC'ing Alan in case he might have a clue.
Comment 1 Alan Somers freebsd_committer freebsd_triage 2022-05-24 11:54:44 UTC
I can't reproduce the problem with a different fuse file system, so it's probably a bug in fuse-ntfs.  I can try to investigate, but you'll have to tell me how to create the file system first.  Have you tried running the daemon in the foreground, with debug output?
Comment 2 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-24 13:58:20 UTC
(In reply to Alan Somers from comment #1)
> I can try to investigate, but you'll have to tell me how to create the file
> system first.
$ pkg which `which mkntfs`
/usr/local/sbin/mkntfs was installed by package fusefs-ntfs-2017.3.23
$ truncate -s 1G ntfs
# mdconfig -f ~alice/ntfs
md1
# mkntfs -f /dev/md1
... Have a nice day.
# ntfs-3g /dev/md1 /mnt/bar
$ cd /mnt/bar
$ truncate -s 10M foo
$ /tmp/a.out foo -1
using i/o size 4096 (optimal 4096)
read: Invalid argument
1 reads were successful
...
$ /tmp/a.out foo -1
using i/o size 4096 (optimal 4096)
read: Invalid argument
177 reads were successful
...

> Have you tried running the daemon in the foreground, with debug output?
Not yet, will try.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2022-05-25 00:07:37 UTC
I have a WIP updating fuse-ntfs to 2021.8.22 in review D34079

Are you able to test that version?
Comment 4 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-25 00:11:16 UTC
(In reply to Kubilay Kocak from comment #3)
> Are you able to test that version?
Well, now that I've provided instructions in comment #2 how to reproduce it locally, you should be able to easily test it yourself, no? :-)
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2022-05-25 00:25:12 UTC
(In reply to Alexey Dokuchaev from comment #4)

I'm just trying to assist by providing avenues/possible options to isolate that may save people some time.
Comment 6 Alan Somers freebsd_committer freebsd_triage 2022-05-25 01:19:19 UTC
(In reply to Alexey Dokuchaev from comment #2)
I can't reproduce the bug using these instructions:

> ntfs-3g --version
ntfs-3g 2017.3.23 external FUSE 29
> uname -a
FreeBSD myhostname 14.0-CURRENT FreeBSD 14.0-CURRENT #393 loadavg-1k-n255214-639d0eae55bb-dirty: Fri May  6 17:17:27 MDT 2022     somers@myhostname:/usr/obj/usr/home/somers/src/freebsd.org/src/amd64.amd64/sys/GENERIC amd64
Comment 7 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-25 02:06:15 UTC
(In reply to Alan Somers from comment #6)
> I can't reproduce the bug using these instructions
Oh well, thank you for trying in any case!  I'll play with the port's options, check out Kubilay's WIP update, and other debugging to obtain and share more data points later.
Comment 8 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-25 04:04:18 UTC
(In reply to Alan Somers from comment #1)
> Have you tried running the daemon in the foreground, with debug output?
So I did.  Apparently the bug exhibits itself when the controversial (see bug #194526 and bug #206978) UBLIO option is OFF (mentioned in the comment #1).  Here's the relevant part of the diff between correct and buggy debug output:

 [...]
 unique: 14, opcode: BMAP (37), nodeid: 3, insize: 56, pid: <snip>
-   unique: 14, error: -78 (Function not implemented), outsize: 16
-unique: 15, opcode: READ (15), nodeid: 3, insize: 80, pid: <snip>
-read[0] 131072 bytes from 65536 flags: 0x0
-   read[0] 131072 bytes from 65536
-   unique: 15, success, outsize: 131088
-unique: 16, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip>
+bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1
+   unique: 14, error: -22 (Invalid argument), outsize: 16
+unique: 15, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip>
 getattr /Temp/qemu6-6.2.0.pkg
-   unique: 16, success, outsize: 120
-unique: 17, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip>
+   unique: 15, success, outsize: 120
+unique: 16, opcode: BMAP (37), nodeid: 3, insize: 56, pid: <snip>
+bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1
+   unique: 16, error: -22 (Invalid argument), outsize: 16
+unique: 17, opcode: FLUSH (25), nodeid: 3, insize: 64, pid: <snip>
+   unique: 17, error: -78 (Function not implemented), outsize: 16
+unique: 18, opcode: SETATTR (4), nodeid: 3, insize: 128, pid: <snip>
 getattr /Temp/qemu6-6.2.0.pkg
-   unique: 17, success, outsize: 120
-unique: 18, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip>
-getattr /Temp/qemu6-6.2.0.pkg
    unique: 18, success, outsize: 120

(In reply to Alan Somers from comment #6)
> I can't reproduce the bug using these instructions
Can you rebuild the port with UBLIO=off and retry?  In bug #194526, comment #8 you say that fusefs(5) now has sane caching builtin and suggest that we eliminate libublio, so I guessed you must have it disabled for a long time. :-)
Comment 9 Alan Somers freebsd_committer freebsd_triage 2022-05-25 12:55:19 UTC
Yes, when I rebuild it without ublio your test program fails immediately.
Comment 10 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-26 04:51:27 UTC
(In reply to Kubilay Kocak from comment #3)
> I have a WIP updating fuse-ntfs to 2021.8.22 in review D34079
It behaves the same way WRT this bug.  Apart from that, proposed update path is quite dubious and doing it literally from scratch is uncalled for.  Please hold it off for the time being, thank you.
Comment 11 Alexey Dokuchaev freebsd_committer freebsd_triage 2022-05-26 07:25:58 UTC
(In reply to Alexey Dokuchaev from comment #8)
> ...
> +bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1
> +   unique: 14, error: -22 (Invalid argument), outsize: 16
Looks like this -EINVAL comes from `static int ntfs_fuse_bmap()' in the src/ntfs-3g.c around line 2773.  Adding debug fprintf() like this:

> if (blocksize > ctx->vol->cluster_size) {
>    fprintf(stderr, " >>> blocksize = %zd, cluster_size = %d\n",
>      blocksize, ctx->vol->cluster_size);
>    return -EINVAL;
>}
yields ">>> blocksize = 65536, cluster_size = 4096".
Comment 12 Gleb Popov freebsd_committer freebsd_triage 2023-01-25 20:16:48 UTC
Bumped into this with simple

dd if=/media/NTFS Volume/4Gb.file of=/dev/zero bs=1

Interesting observation:

1. Insert the stick, mount the FS.

2. Run the command for the first time:
1+0 records in
1+0 records out
1 bytes transferred in 0.002173 secs (460 bytes/sec)

3. Run the command again:
65537+0 records in
65537+0 records out
65537 bytes transferred in 0.435892 secs (150352 bytes/sec)

4. And again:
131073+0 records in
131073+0 records out
131073 bytes transferred in 0.861837 secs (152086 bytes/sec)

5. Again:
196609+0 records in
196609+0 records out
196609 bytes transferred in 1.306456 secs (150490 bytes/sec)

6. And the last time:
262145+0 records in
262145+0 records out
262145 bytes transferred in 2.224164 secs (117862 bytes/sec)

7. Unmount, eject, insert, mount.

8. Retry the test - everything is same up to numbers.

Hope this will give a hint on what's going wrong.