Bug 205816

Summary: [ext2fs] [patch] EXT4 sparse blocks unsupported, contain garbage when read
Product: Base System Reporter: Damjan Jovanovic <damjan.jov>
Component: kernAssignee: Pedro F. Giffuni <pfg>
Status: Closed FIXED    
Severity: Affects Many People CC: ardovm, fs, gnehzuil, nowak, pfg
Priority: --- Keywords: feature, patch
Version: CURRENTFlags: pfg: mfc-stable10+
pfg: mfc-stable9-
Hardware: Any   
OS: Any   
Attachments:
Description Flags
preliminary patch to implement support for EXT4 sparse files
none
preliminary patch to implement support for EXT4 sparse files, version 2
none
support for EXT4 sparse files, version 3
none
support for EXT4 sparse files, cleaned up
none
Patch for 9-STABLE to restore compilation none

Description Damjan Jovanovic 2016-01-03 04:35:00 UTC
Created attachment 164979 [details]
preliminary patch to implement support for EXT4 sparse files

The ext2fs module does not support sparse files on EXT4 filesystems, yet doesn't fail. When attempting to read the sparse blocks from a sparse file, instead of zeroes, they contain garbage data read from the wrong disk blocks.

Here's an example sparse file in debugfs:

debugfs:  dump_extents /home/user/Documents/file.iso
Level Entries           Logical            Physical Length Flags
 0/ 1   1/  1       0 - 1122599  1082775            1122600
 1/ 1   1/ 53       0 -       5  1372160 -  1372165      6
 1/ 1   2/ 53       8 -      11  1372168 -  1372171      4
 1/ 1   3/ 53      16 -      19  1372176 -  1372179      4
 1/ 1   4/ 53      24 -      27  1372184 -  1372187      4
 1/ 1   5/ 53      32 -      33  1372192 -  1372193      2
...

Note how block ranges 0-5, 8-11, 16-19 are allocated, but the in between blocks are sparse.

The problem starts in the ext4_ext_binsearch() function in ext2_extents.c which expects the block ranges in the extents to be continuous, and doesn't check whether the lbn parameter even lies inside the found extent's block range. It blindly sets:

path->ep_ext = l - 1;

which will set the pointer to invalid memory (a part of struct ext2_extent_header) when lbn=0 and the first block of a file is sparse. Also in the above example, lbn=6 will use the 0-5 extent, reading block 0 instead.

My preliminary patch improves ext4_ext_binsearch() to check for out of range blocks, marks the path as being sparse, and stores the range of extents starting at the lbn that is sparse. Sparse extents are cached in ext4_ext_read() for future reuse. Actual reading of sparse blocks is implemented by copying from a static array of zeroes (is there a better way of doing it?) and read() definitely works, but mmap() doesn't seem to (system instantly reboots when reading even 1 byte from mapped region, but this also happens for some dense files, so it's not (just?) my patch).

If testing against other EXT4 implementations, note that ext4fuse also has this bug.
Comment 1 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-03 14:42:16 UTC
Thank you Damjan: I am CC'ing Zheng Liu as he did the original ext4 implementation.
Comment 2 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-03 14:43:15 UTC
drop flz@ ... I meant lz@
Comment 3 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-03 15:38:42 UTC
Off the top of my head ...

We do have some support for sparse files, based on the generic
support for UFS:

https://svnweb.freebsd.org/base/head/sys/fs/ext2fs/ext2_vnops.c?r1=252956&r2=252955&pathrev=252956

I don't think we have any filesystem-specific code to deal
with it (which would be ideal).
Comment 4 Damjan Jovanovic 2016-01-08 03:52:55 UTC
Created attachment 165234 [details]
preliminary patch to implement support for EXT4 sparse files, version 2

New version of the patch, rebased against the latest CURRENT. Should work, but could be more efficient, eg. this case will be slow:
  path->ep_sparse_ext.e_len = 1; // FIXME: where does it end?
Comment 5 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-08 19:33:06 UTC
(In reply to Damjan Jovanovic from comment #4)
I like this patch.
The only change I'd do is dropping the (int) cast from the second argument
in uiomove, but that comes from the original code.

I will try that and hold a bit more for some code review from Zheng but
I will probably commit it over the weekend if there is silence.
Comment 6 Damjan Jovanovic 2016-01-10 17:20:58 UTC
Created attachment 165366 [details]
support for EXT4 sparse files, version 3

New version of the patch with these changes:
* Dropped the (int) cast from the second argument in uiomove().
* Curly braces now match the code style of surrounding code.
* ep_sparse_ext and ep_ext are now in a union, since only 1 can be valid.
* Read ahead for mmap has been implemented, and the performance of reading from large memory mapped files has improved more than 400 fold, from 251 kB/second to 106 MB/second :)
Comment 7 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-10 22:49:09 UTC
(In reply to Damjan Jovanovic from comment #6)

Nice!

The patch looks good but there is something wrong when I apply it:

 % svn patch --dry-run --strip 1 patch-PR165234
C         sys/fs/ext2fs/ext2_bmap.c
>         rejected hunk @@ -112,9 +109,11 @@
>         rejected hunk @@ -124,6 +123,9 @@
C         sys/fs/ext2fs/ext2_extents.c
>         rejected hunk @@ -82,8 +82,7 @@
>         rejected hunk @@ -92,8 +91,7 @@
C         sys/fs/ext2fs/ext2_extents.h
>         rejected hunk @@ -85,8 +85,10 @@
C         sys/fs/ext2fs/ext2_vnops.c
>         rejected hunk @@ -1863,16 +1863,13 @@
Summary of conflicts:
  Text conflicts: 4
Comment 8 Damjan Jovanovic 2016-01-11 02:05:30 UTC
(In reply to Pedro F. Giffuni from comment #7)

Thank you. It applies here on 2 computers. Are you applying it to the latest CURRENT, and have no earlier version of that patch already applied?
Comment 9 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-11 02:28:41 UTC
(In reply to Damjan Jovanovic from comment #8)

Hi. Yes, I am using current, checked out from subversion, and I tried with:
- svn patch
- patch from base
- gpatch

They all fail :(.
Try to email me the patch directly to my FreeBSD address.
Comment 10 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-11 16:08:58 UTC
Created attachment 165400 [details]
support for EXT4 sparse files, cleaned up

Starting from version 3, I cleaned up some style(9) issues.
Also reused EXT2_MAX_BLOCK_SIZE instead of the hard coded 4096.

I will commit this later unless there are objections.

Any reason why it shouldn't work against 10.x ?
Comment 11 Damjan Jovanovic 2016-01-11 16:52:09 UTC
(In reply to Pedro F. Giffuni from comment #10)

Great :).

I've tested it on PC-BSD 10.2 on a physical machine in addition to FreeBSD CURRENT on VirtualBox. It applies and works on both. I also managed to use it to copy all 25 GB of files from my Linux partition with no problems, so it seems ready to be committed.
Comment 12 commit-hook freebsd_committer freebsd_triage 2016-01-11 19:15:47 UTC
A commit references this bug:

Author: pfg
Date: Mon Jan 11 19:14:56 UTC 2016
New revision: 293680
URL: https://svnweb.freebsd.org/changeset/base/293680

Log:
  ext4: add support for reading sparse files

  Add support for sparse files in ext4. Also implement read-ahead, which
  greatly increases the performance when transferring files from ext4.

  Both features implemented by Damjan Jovanovic.

  PR:		205816
  MFC after:	1 week

Changes:
  head/sys/fs/ext2fs/ext2_bmap.c
  head/sys/fs/ext2fs/ext2_extents.c
  head/sys/fs/ext2fs/ext2_extents.h
  head/sys/fs/ext2fs/ext2_vnops.c
Comment 13 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-11 19:45:29 UTC
I am unsure if this should go into 9 as I can't really test.
Comment 14 Kubilay Kocak freebsd_committer freebsd_triage 2016-01-11 20:14:51 UTC
@Pedro The mfc-stableX flags are used to track when/after a change has been committed to the respective branch (+, meaning done), or in the case where it wont, or shouldn't be (-, with comment). If you're not sure, leave the value as ? 

You can mouse-over the flag names for more detail and example usage/meaning
Comment 15 nowak 2016-01-14 11:09:13 UTC
(In reply to Damjan Jovanovic from comment #0)

There is "zero_region" that can be used here instead of the static array.

while (xfersize > 0) {
	ssize_t len = MIN(xfersize, ZERO_REGION_SIZE);
	error = uiomove(__DECONST(void *, zero_region), len, uio);
	if (error) {
		return error;
	}
	xfersize -= len;
}
Comment 16 commit-hook freebsd_committer freebsd_triage 2016-01-18 15:40:32 UTC
A commit references this bug:

Author: pfg
Date: Mon Jan 18 15:39:33 UTC 2016
New revision: 294271
URL: https://svnweb.freebsd.org/changeset/base/294271

Log:
  MFC	r293680
  ext4: add support for reading sparse files

  Add support for sparse files in ext4. Also implement read-ahead, which
  greatly increases the performance when transferring files from ext4.
  The sparse file support has become very common in ext4.

  Both features implemented by Damjan Jovanovic.

  PR:		205816

Changes:
_U  stable/10/
  stable/10/sys/fs/ext2fs/ext2_bmap.c
  stable/10/sys/fs/ext2fs/ext2_extents.c
  stable/10/sys/fs/ext2fs/ext2_extents.h
  stable/10/sys/fs/ext2fs/ext2_vnops.c
Comment 17 commit-hook freebsd_committer freebsd_triage 2016-01-18 15:44:35 UTC
A commit references this bug:

Author: pfg
Date: Mon Jan 18 15:43:37 UTC 2016
New revision: 294273
URL: https://svnweb.freebsd.org/changeset/base/294273

Log:
  MFC	r293680
  ext4: add support for reading sparse files

  Add support for sparse files in ext4. Also implement read-ahead, which
  greatly increases the performance when transferring files from ext4.
  The sparse file support has become very common in ext4.

  Both features implemented by Damjan Jovanovic.

  PR:		205816

Changes:
_U  stable/9/sys/
_U  stable/9/sys/fs/
  stable/9/sys/fs/ext2fs/ext2_bmap.c
  stable/9/sys/fs/ext2fs/ext2_extents.c
  stable/9/sys/fs/ext2fs/ext2_extents.h
  stable/9/sys/fs/ext2fs/ext2_vnops.c
Comment 18 Arrigo Marchiori 2016-01-19 09:27:24 UTC
(In reply to commit-hook from comment #17)

Hello,

this commit seems to break builds on 9-STABLE, because of the unnamed union in stable/9/sys/fs/ext2fs/ext2_extents.h

Excerpt from the build log:
[...]
In file included from @/fs/ext2fs/inode.h:46,
                 from /usr/src/sys/modules/ext2fs/../../fs/ext2fs/ext2_extents.c:41:
@/fs/ext2fs/ext2_extents.h:91: warning: declaration does not declare anything

HTH.
Comment 19 Arrigo Marchiori 2016-01-19 11:41:30 UTC
Created attachment 165814 [details]
Patch for 9-STABLE to restore compilation

This patch is a tentative adaptation of base r294273 to restore compilation on 9-STABLE.
An anonymous union was given a name. An additional assignment is added to avoid a compilation warning.
Comment 20 commit-hook freebsd_committer freebsd_triage 2016-01-19 14:15:31 UTC
A commit references this bug:

Author: pfg
Date: Tue Jan 19 14:15:10 UTC 2016
New revision: 294323
URL: https://svnweb.freebsd.org/changeset/base/294323

Log:
  Revert r294273
  MFC	r293680
  ext4: add support for reading sparse files

  The change was meant for newer versions and doesn't work out of the box.
  While it seems easy to adapt I prfer not to have the implementations
  diverge at this time.

  Reported by:	Arrigo Marchiori
  PR:		205816

Changes:
_U  stable/9/sys/
_U  stable/9/sys/fs/
  stable/9/sys/fs/ext2fs/ext2_bmap.c
  stable/9/sys/fs/ext2fs/ext2_extents.c
  stable/9/sys/fs/ext2fs/ext2_extents.h
  stable/9/sys/fs/ext2fs/ext2_vnops.c
Comment 21 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-01-19 14:19:00 UTC
(In reply to Arrigo Marchiori from comment #19)

Thanks for the report Arrigo!

Merging to stable was admittedly a bad decision.
I reverted the change as I don't want to handle diverging
versions of the ext2fs driver at this time.

I may consider merging a fixed version later.

Pedro.
Comment 22 Pedro F. Giffuni freebsd_committer freebsd_triage 2016-02-09 03:36:45 UTC
*** Bug 205932 has been marked as a duplicate of this bug. ***