Created attachment 234168 [details] Demo program that does the read(2) With the recent 14.0-CURRENT (n255230) and fusefs-ntfs-2017.3.23 (UBLIO=off), I'm frequently getting errors when copying not-very-small files with scp(1) to remote machines, but not with rsync(1); cat(1)ing file e.g. to /dev/null also fails. Running those programs under truss(1) reveals interesting bug: reading the files on Windows 10 formatted NTFS with block of size 32768 and smaller (stat(2) returns optimal size of 4096 bytes, which both scp(1) and cat(1) use) fails after several blocks; rsync(1) uses 262144 and does not fail. Even more interestingly, it does not fail on first couple of blocks and on previously read ones, i.e. once the file had been read, it can be read again with smaller block size. I've written a program to demonstrate this (attached): $ ls -l xorgproto-2021.5.tbz -rw-r--r-- 1 danfe wheel 245998 21 янв. 14:11 xorgproto-2021.5.tbz $ /tmp/a.out xorgproto-2021.5.tbz 32768 using i/o size 32768 (optimal 4096) read: Invalid argument 1 reads were successful $ /tmp/a.out xorgproto-2021.5.tbz 32768 using i/o size 32768 (optimal 4096) read: Invalid argument 3 reads were successful $ /tmp/a.out xorgproto-2021.5.tbz 32768 using i/o size 32768 (optimal 4096) $ /tmp/a.out xorgproto-2021.5.tbz -1 using i/o size 4096 (optimal 4096) I'm not sure if the bug is in the port or fusefs(5) implementation, so I'm CC'ing Alan in case he might have a clue.
I can't reproduce the problem with a different fuse file system, so it's probably a bug in fuse-ntfs. I can try to investigate, but you'll have to tell me how to create the file system first. Have you tried running the daemon in the foreground, with debug output?
(In reply to Alan Somers from comment #1) > I can try to investigate, but you'll have to tell me how to create the file > system first. $ pkg which `which mkntfs` /usr/local/sbin/mkntfs was installed by package fusefs-ntfs-2017.3.23 $ truncate -s 1G ntfs # mdconfig -f ~alice/ntfs md1 # mkntfs -f /dev/md1 ... Have a nice day. # ntfs-3g /dev/md1 /mnt/bar $ cd /mnt/bar $ truncate -s 10M foo $ /tmp/a.out foo -1 using i/o size 4096 (optimal 4096) read: Invalid argument 1 reads were successful ... $ /tmp/a.out foo -1 using i/o size 4096 (optimal 4096) read: Invalid argument 177 reads were successful ... > Have you tried running the daemon in the foreground, with debug output? Not yet, will try.
I have a WIP updating fuse-ntfs to 2021.8.22 in review D34079 Are you able to test that version?
(In reply to Kubilay Kocak from comment #3) > Are you able to test that version? Well, now that I've provided instructions in comment #2 how to reproduce it locally, you should be able to easily test it yourself, no? :-)
(In reply to Alexey Dokuchaev from comment #4) I'm just trying to assist by providing avenues/possible options to isolate that may save people some time.
(In reply to Alexey Dokuchaev from comment #2) I can't reproduce the bug using these instructions: > ntfs-3g --version ntfs-3g 2017.3.23 external FUSE 29 > uname -a FreeBSD myhostname 14.0-CURRENT FreeBSD 14.0-CURRENT #393 loadavg-1k-n255214-639d0eae55bb-dirty: Fri May 6 17:17:27 MDT 2022 somers@myhostname:/usr/obj/usr/home/somers/src/freebsd.org/src/amd64.amd64/sys/GENERIC amd64
(In reply to Alan Somers from comment #6) > I can't reproduce the bug using these instructions Oh well, thank you for trying in any case! I'll play with the port's options, check out Kubilay's WIP update, and other debugging to obtain and share more data points later.
(In reply to Alan Somers from comment #1) > Have you tried running the daemon in the foreground, with debug output? So I did. Apparently the bug exhibits itself when the controversial (see bug #194526 and bug #206978) UBLIO option is OFF (mentioned in the comment #1). Here's the relevant part of the diff between correct and buggy debug output: [...] unique: 14, opcode: BMAP (37), nodeid: 3, insize: 56, pid: <snip> - unique: 14, error: -78 (Function not implemented), outsize: 16 -unique: 15, opcode: READ (15), nodeid: 3, insize: 80, pid: <snip> -read[0] 131072 bytes from 65536 flags: 0x0 - read[0] 131072 bytes from 65536 - unique: 15, success, outsize: 131088 -unique: 16, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip> +bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1 + unique: 14, error: -22 (Invalid argument), outsize: 16 +unique: 15, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip> getattr /Temp/qemu6-6.2.0.pkg - unique: 16, success, outsize: 120 -unique: 17, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip> + unique: 15, success, outsize: 120 +unique: 16, opcode: BMAP (37), nodeid: 3, insize: 56, pid: <snip> +bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1 + unique: 16, error: -22 (Invalid argument), outsize: 16 +unique: 17, opcode: FLUSH (25), nodeid: 3, insize: 64, pid: <snip> + unique: 17, error: -78 (Function not implemented), outsize: 16 +unique: 18, opcode: SETATTR (4), nodeid: 3, insize: 128, pid: <snip> getattr /Temp/qemu6-6.2.0.pkg - unique: 17, success, outsize: 120 -unique: 18, opcode: GETATTR (3), nodeid: 3, insize: 56, pid: <snip> -getattr /Temp/qemu6-6.2.0.pkg unique: 18, success, outsize: 120 (In reply to Alan Somers from comment #6) > I can't reproduce the bug using these instructions Can you rebuild the port with UBLIO=off and retry? In bug #194526, comment #8 you say that fusefs(5) now has sane caching builtin and suggest that we eliminate libublio, so I guessed you must have it disabled for a long time. :-)
Yes, when I rebuild it without ublio your test program fails immediately.
(In reply to Kubilay Kocak from comment #3) > I have a WIP updating fuse-ntfs to 2021.8.22 in review D34079 It behaves the same way WRT this bug. Apart from that, proposed update path is quite dubious and doing it literally from scratch is uncalled for. Please hold it off for the time being, thank you.
(In reply to Alexey Dokuchaev from comment #8) > ... > +bmap /Temp/qemu6-6.2.0.pkg blocksize: 65536 index: 1 > + unique: 14, error: -22 (Invalid argument), outsize: 16 Looks like this -EINVAL comes from `static int ntfs_fuse_bmap()' in the src/ntfs-3g.c around line 2773. Adding debug fprintf() like this: > if (blocksize > ctx->vol->cluster_size) { > fprintf(stderr, " >>> blocksize = %zd, cluster_size = %d\n", > blocksize, ctx->vol->cluster_size); > return -EINVAL; >} yields ">>> blocksize = 65536, cluster_size = 4096".
Bumped into this with simple dd if=/media/NTFS Volume/4Gb.file of=/dev/zero bs=1 Interesting observation: 1. Insert the stick, mount the FS. 2. Run the command for the first time: 1+0 records in 1+0 records out 1 bytes transferred in 0.002173 secs (460 bytes/sec) 3. Run the command again: 65537+0 records in 65537+0 records out 65537 bytes transferred in 0.435892 secs (150352 bytes/sec) 4. And again: 131073+0 records in 131073+0 records out 131073 bytes transferred in 0.861837 secs (152086 bytes/sec) 5. Again: 196609+0 records in 196609+0 records out 196609 bytes transferred in 1.306456 secs (150490 bytes/sec) 6. And the last time: 262145+0 records in 262145+0 records out 262145 bytes transferred in 2.224164 secs (117862 bytes/sec) 7. Unmount, eject, insert, mount. 8. Retry the test - everything is same up to numbers. Hope this will give a hint on what's going wrong.