Bug 179248 - A return value of telldir(3) only seekable for once
Summary: A return value of telldir(3) only seekable for once
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: standards (show other bugs)
Version: 9.1-STABLE
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-standards (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-03 08:20 UTC by Akinori MUSHA
Modified: 2018-05-28 19:45 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Akinori MUSHA freebsd_committer freebsd_triage 2013-06-03 08:20:00 UTC
Our implementation of telldir(3)/seekdir(3) is not POSIX compliant in that a value obtained from telldir(3) is invalidated after calling seekdir(3) and then readdir(3).

IEEE Std 1003.1, 2008/2013 says that only a call of rewinddir(3) may invalidate the location values returned by telldir(3):

    If the value of loc was not obtained from an earlier call to
    telldir(), or if a call to rewinddir() occurred between the call
    to telldir() and the call to seekdir(), the results of subsequent
    calls to readdir() are unspecified.

Fix: 

I don't have a quick fix for this, as it may need a revamp of how the location thing is defined.

NetBSD seems to have a different implementation which doesn't have this problem.
However, I'm not sure if theirs is flawless esp. wrt memory management.
How-To-Repeat: % cat ../dirtest.c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>

int main(void)
{
	DIR *dirp;
	struct dirent *dp;
	long pos;

	if ((dirp = opendir(".")) == NULL) return 1;
	printf("telldir = %ld\n", telldir(dirp));

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", telldir(dirp));

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", telldir(dirp));

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", pos = telldir(dirp));

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", telldir(dirp));

	printf("seekdir to %ld\n", pos);
	seekdir(dirp, pos);

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", telldir(dirp));

	printf("seekdir to %ld\n", pos);
	seekdir(dirp, pos);

	if ((dp = readdir(dirp)) == NULL) return 1;
	printf("readdir = %s\n", dp->d_name);
	printf("telldir = %ld\n", telldir(dirp));

	(void)closedir(dirp);
	return 0;
}
% make ../dirtest
cc -O2 -pipe -g -march=core2  ../dirtest.c  -o ../dirtest
% ls -al
total 35
drwxr-xr-x   2 knu  knu   6 Jun  3 15:39 .
drwx------  11 knu  knu  53 Jun  3 15:39 ..
-rw-r--r--   1 knu  knu   0 Jun  3 15:39 aaa
-rw-r--r--   1 knu  knu   0 Jun  3 15:39 bbb
-rw-r--r--   1 knu  knu   0 Jun  3 15:39 ccc
-rw-r--r--   1 knu  knu   0 Jun  3 15:39 ddd
% ../dirtest
telldir = 1
readdir = .
telldir = 2
readdir = ..
telldir = 3
readdir = aaa
telldir = 4
readdir = ccc
telldir = 5
seekdir to 4
readdir = ccc    # <= OK
telldir = 6
seekdir to 4
readdir = bbb    # <= FAIL
telldir = 7
Comment 1 Jilles Tjoelker freebsd_committer freebsd_triage 2013-06-11 22:29:53 UTC
On Mon, Jun 03, 2013 at 07:14:46AM +0000, Akinori MUSHA wrote:
> >Number:         179248
> >Category:       standards
> >Synopsis:       A return value of telldir(3) only seekable for once
> [snip]
> >Description:
> Our implementation of telldir(3)/seekdir(3) is not POSIX compliant in
> that a value obtained from telldir(3) is invalidated after calling
> seekdir(3) and then readdir(3).

> IEEE Std 1003.1, 2008/2013 says that only a call of rewinddir(3) may
> invalidate the location values returned by telldir(3):

>     If the value of loc was not obtained from an earlier call to
>     telldir(), or if a call to rewinddir() occurred between the call
>     to telldir() and the call to seekdir(), the results of subsequent
>     calls to readdir() are unspecified.

I think the problem is that telldir()/seekdir() want to return to the
same directory entry within the block, instead of to the beginning of
the block, while the required bits for the entry within the block are
not available in telldir()'s return value.

Some other platforms provide kernel support for this operation. The
struct dirent has a field d_off which is the file offset of the next
entry. It looks like the ino64 patches from Gleb Kurtsou add this
functionality to the FreeBSD kernel. With this, telldir() returns the
d_off value in the last dirent returned by readdir() (or 0) and
seekdir() simply calls lseek().

As a result, a telldir()/seekdir() sequence may set the directory
"backwards" a few entries even if it has been unmodified, because UFS
truncates the offset to a block boundary. This may require a network
filesystem to deny requests for a single directory entry at a time.
Alternatively, UFS may replace the truncated bits with the number of
directory entries to skip. This takes advantage of d_off being more like
a "cookie" than a true file offset.

The kernel may have a similar "out of bits" problem when an application
with 32-bit long calls getdirentries(2) on an NFSv3 directory which
returns 64-bit cookies, and also with unionfs and mount -o union.

In the case of unionfs, the kernel appears to use some sort of state in
the unionfs vnode and assumes that the directory cookies are otherwise
unique enough. This likely causes problems if lseek() is used with a
non-zero offset/cookie.

In the case of mount -o union, the kernel "solves" the problem by
irreversibly modifying the open file description to refer to the lower
layer after the upper layer's entries have been read; the only way to
deal with this in userland is to read the entire directory on a
duplicate open file description (created with open(fd, ".", ...)) on
opendir() and rewinddir() (bug: rewinddir() does not do this, violating
POSIX's requirement that rewinddir() pick up changes made to the
directory).

> >Fix:
> I don't have a quick fix for this, as it may need a revamp of how the
> location thing is defined.

> NetBSD seems to have a different implementation which doesn't have
> this problem.
> However, I'm not sure if theirs is flawless esp. wrt memory
> management.

NetBSD stores the (block, entry) pairs uniquely and for the life of the
DIR object (perhaps discarding them upon rewinddir() as well). This
means no memory is "leaked" per se but memory consumption on a DIR that
has many telldir() calls is proportional to the number of entries in the
directory.

Also, a "proper" solution is possible if you are willing to accept that
it does not work for all filesystems. Most filesystems leave some of the
bits zero (particularly if there are 64 of them) which can then be used
to store the entry number. However, a malloc-based solution is then
still necessary for filesystems that do need all the bits or very large
directories.

-- 
Jilles Tjoelker
Comment 2 John Baldwin freebsd_committer freebsd_triage 2014-09-03 17:59:58 UTC
While this is not quite fully fixed as described by Jilles, I think the simpler bug you initially described is addressed by http://svnweb.freebsd.org/base?view=revision&revision=269204
Comment 3 Ed Maste freebsd_committer freebsd_triage 2017-08-18 13:12:08 UTC
This is not reproducible in stable/10 and later, presumably due to the commit John mentions. In -current we now have d_off as a consequence of Gleb's ino64 work and should revisit Jilles' comment #1.
Comment 4 Konstantin Belousov freebsd_committer freebsd_triage 2017-08-18 13:28:44 UTC
(In reply to Ed Maste from comment #3)
d_off is not functional yet.
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:45:50 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.