Bug 197876 - [devfs] an error in devfs leads to data loss and to faulty block size being reported - GEOM providers' sector and media sizes are not reflected in the block device entries
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-bugs (Nobody)
Reported: 2015-02-21 09:50 UTC by jau
Modified: 2024-11-01 13:34 UTC (History)

Attachments
fix devfs_getattr() just enough to make it report sizes for GEOM providers (907 bytes, patch)
2015-02-23 17:49 UTC, jau

Description jau 2015-02-21 09:50:35 UTC
When I set a 16k sector size for a new GEOM gate provider, the existing consumers
happily use multiples of 16k when tasting the new provider. However, the sector
size is not reflected in the block size that stat() reports for the new device
entry: stat() reports the entry as using 4k blocks.
If I ignore the block size reported by stat() and instead use the sector size
that was set when the new provider was created, everything works
(dd bs=16384 ...).
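
For reference, the values in question are the ones any program obtains from
stat(2). A minimal sketch using only the POSIX interface; the helper name
query_stat_sizes and the struct are made up for illustration and do not come
from any FreeBSD tool:

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical container for the three size-related stat fields. */
struct stat_sizes {
	long long size;     /* st_size: length in bytes */
	long long blocks;   /* st_blocks: allocated blocks */
	long      blksize;  /* st_blksize: preferred I/O size in bytes */
};

/*
 * Fill *out from stat(2) on path.
 * Returns 0 on success, -1 on failure (errno is set by stat()).
 */
int
query_stat_sizes(const char *path, struct stat_sizes *out)
{
	struct stat sb;

	if (stat(path, &sb) != 0)
		return (-1);
	out->size = (long long)sb.st_size;
	out->blocks = (long long)sb.st_blocks;
	out->blksize = (long)sb.st_blksize;
	return (0);
}
```

A caller that sizes its buffers from blksize would use 4096 for the device
described here, not the 16384 the provider actually requires.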

The big problem is that programs usually know nothing about the true block size
beyond what stat() reports. Think of stdio, for example: any program using stdio
to access a device will end up trying to use the wrong block size.
When a user space application (fstyp, for example) tries to read or write the
new device with the wrong block size, the request never gets all the way to
GEOM gate.
Something higher up in the chain notices that the block size used by the program
does not match the sector size of the underlying GEOM provider and returns
EINVAL to the program.
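
The rejection described above can be modelled as a simple alignment check.
This is only a sketch of the observed behaviour, not the actual kernel code
path, which has not been identified at this point in the report; the function
name check_io_request is hypothetical:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Model of the observed behaviour: a request whose offset or length is
 * not a whole multiple of the provider's sector size fails with EINVAL.
 * Returns 0 if the request would be accepted.
 */
int
check_io_request(uint64_t offset, uint64_t length, uint32_t sectorsize)
{
	if (sectorsize == 0)
		return (EINVAL);
	if (offset % sectorsize != 0 || length % sectorsize != 0)
		return (EINVAL);
	return (0);
}
```

Under this model, with a 16k sector size a 4096-byte read at offset 0 is
rejected, while a 16384-byte read at the same offset is accepted, which matches
the dd bs=16384 observation.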

For the time being I do not know how and where the device entry gets the faulty
block size.
Comment 1 jau 2015-02-23 08:26:08 UTC
It seems that the same disparity between GEOM's internal settings and what
stat() shows applies to all GEOM device entries.
It also seems that this odd behavior is rooted in the vnode attributes being
left uninitialized. In addition to st_blksize being persistently 4k, st_size
and st_blocks are not initialized at all, even though the information needed
to fill them in is properly stored in the mediasize field of the provider.
The only reason st_blksize is reported as 4k is this line in the function
vn_stat (see vfs_vnops.c):

sb->st_blksize = max(PAGE_SIZE, vap->va_blocksize);
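
To illustrate why that line yields 4k: it acts as a lower bound, so any
attribute block size at or below the page size is reported as the page size,
while a larger value would pass through unchanged. A small user-space sketch,
assuming a 4096-byte page as seen in the output in this report (the names are
illustrative, not kernel symbols):

```c
/* Assumed page size; matches the 4096 seen in the stat() output here. */
#define SKETCH_PAGE_SIZE 4096UL

/* Same shape as the quoted vn_stat line: report at least one page. */
unsigned long
reported_blksize(unsigned long va_blocksize)
{
	return (va_blocksize > SKETCH_PAGE_SIZE ?
	    va_blocksize : SKETCH_PAGE_SIZE);
}
```

So a sector size of 16384 would be reported as is if it ever reached
va_blocksize, whereas a value such as 512 would still show up as 4096.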

The contents of the vattr structure seem to be largely just what vattr_null
fills in (see vfs_subr.c).
Here are a couple of examples:

(sleipnir:pts/1) 9:58 ~> pfstat /dev/ggate0 
/dev/ggate0:
	st_dev: 	1895890688
	st_ino: 	190
	st_mode:	0x21a0
	st_nlink:	1
	st_uid: 	0
	st_gid: 	5
	st_rdev:	190
	st_size:	0
	st_blocks:	0
	st_blksize:	4096
	st_flags:	0x0
	st_gen: 	0
	st_btim:	1970-01-01 01:59:59.000000000
	st_mtim:	2015-02-23 10:00:14.989084901
	st_ctim:	2015-02-23 10:00:14.989084901
	st_atim:	2015-02-23 10:00:14.989084901

(sleipnir:pts/1) DING! ~> pfstat /dev/mirror/root
/dev/mirror/root:
	st_dev: 	1895890688
	st_ino: 	201
	st_mode:	0x21a0
	st_nlink:	1
	st_uid: 	0
	st_gid: 	5
	st_rdev:	201
	st_size:	0
	st_blocks:	0
	st_blksize:	4096
	st_flags:	0x0
	st_gen: 	0
	st_btim:	1970-01-01 01:59:59.000000000
	st_mtim:	2015-02-23 09:20:16.130940000
	st_ctim:	2015-02-23 09:20:16.130940000
	st_atim:	2015-02-23 09:22:11.158069059


In fact, I think this is now mostly a matter of finding the proper place to
fill in the appropriate 'struct vattr' fields from the provider structure.
Comment 2 jau 2015-02-23 17:49:18 UTC
Created attachment 153379
fix devfs_getattr() just enough to make it report sizes for GEOM providers

This patch fixes the problems with the GEOM provider size fields reported to
user space via the [lf]stat() calls.

Now the same example cases shown before look a whole lot better.
Even the 16k sectorsize/st_blksize, which was previously misreported as 4k,
comes out correctly.

/dev/mirror/root:
	st_dev: 	1895890688
	st_ino: 	201
	st_mode:	0x21a0
	st_nlink:	1
	st_uid: 	0
	st_gid: 	5
	st_rdev:	201
	st_size:	2147483136
	st_blocks:	4194303
	st_blksize:	4096
	st_flags:	0x0
	st_gen: 	0
	st_btim:	1970-01-01 01:59:59.000000000
	st_mtim:	2015-02-23 19:23:13.180258000
	st_ctim:	2015-02-23 19:23:13.180258000
	st_atim:	2015-02-23 19:25:07.508476106

/dev/ggate0:
	st_dev: 	1895890688
	st_ino: 	177
	st_mode:	0x21a0
	st_nlink:	1
	st_uid: 	0
	st_gid: 	5
	st_rdev:	177
	st_size:	68719476736
	st_blocks:	134217728
	st_blksize:	16384
	st_flags:	0x0
	st_gen: 	0
	st_btim:	1970-01-01 01:59:59.000000000
	st_mtim:	2015-02-23 19:43:48.037354325
	st_ctim:	2015-02-23 19:43:48.037354325
	st_atim:	2015-02-23 19:43:48.037354325
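
The numbers above are consistent with deriving the stat fields from the
provider as follows. This is a sketch of the arithmetic only, not the attached
patch; the struct and function names are hypothetical, and the 512-byte unit
for st_blocks and the 4096-byte page size are assumptions that match the
output shown:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE  4096u  /* assumed page size */
#define SKETCH_BLOCK_UNIT 512u   /* assumed unit of st_blocks */

/* Hypothetical stand-in for the two provider fields that matter here. */
struct provider_sizes {
	uint64_t mediasize;   /* total size in bytes */
	uint32_t sectorsize;  /* sector size in bytes */
};

/* The three stat fields computed from the provider. */
struct derived_stat {
	uint64_t st_size;
	uint64_t st_blocks;
	uint32_t st_blksize;
};

struct derived_stat
derive_stat(struct provider_sizes pp)
{
	struct derived_stat ds;

	ds.st_size = pp.mediasize;
	ds.st_blocks = pp.mediasize / SKETCH_BLOCK_UNIT;
	/* Lower bound of one page, as in the vn_stat line quoted earlier. */
	ds.st_blksize = pp.sectorsize > SKETCH_PAGE_SIZE ?
	    pp.sectorsize : SKETCH_PAGE_SIZE;
	return (ds);
}
```

For /dev/ggate0, a mediasize of 68719476736 and a sectorsize of 16384 give
st_blocks = 134217728 and st_blksize = 16384, as in the output. For
/dev/mirror/root, 2147483136 / 512 = 4194303, and its st_blksize of 4096 would
result from any sector size of 4096 or less.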
Comment 3 jau 2015-02-28 09:51:09 UTC
The same mistake is present in at least 11.x and 10.1-stable.
Most likely it is present in any and all FreeBSD versions
which have carried support for devfs.
Comment 4 jau 2015-05-31 12:05:47 UTC
At least OS X seems to be properly reporting device block sizes which
are larger than the minimum block cluster.

/dev/disk0:
        st_dev:         488796360
        st_ino:         589
        st_mode:        0x61a0
        st_nlink:       1
        st_uid:         0
        st_gid:         5
        st_rdev:        16777216
        st_size:        0
        st_blocks:      0
        st_blksize:     2048
        st_flags:       0x0
        st_gen:         0
        st_btim:        2015-04-25 12:14:41.000000000
        st_mtim:        2015-04-25 12:14:41.000000000
        st_ctim:        2015-04-25 12:14:41.000000000
        st_atim:        2015-04-25 12:14:41.000000000

/dev/rdisk0:
        st_dev:         488796360
        st_ino:         591
        st_mode:        0x21a0
        st_nlink:       1
        st_uid:         0
        st_gid:         5
        st_rdev:        16777216
        st_size:        0
        st_blocks:      0
        st_blksize:     131072
        st_flags:       0x0
        st_gen:         0
        st_btim:        2015-04-25 12:14:41.000000000
        st_mtim:        2015-04-25 12:14:41.000000000
        st_ctim:        2015-04-25 12:14:41.000000000
        st_atim:        2015-05-27 08:52:16.996702000
Comment 5 Mark Linimon freebsd_committer freebsd_triage 2024-10-03 03:42:32 UTC
^Triage: clear stale flags.

To submitter: is this aging PR still relevant?
Comment 6 crest 2024-11-01 13:34:59 UTC
Stacking a GEOM provider that changes the reported blocksize (e.g. gnop create -S 16384) and exporting it via GEOM gate should work; if it doesn't, you have a simple reproducer.