Bug 18095

Summary: mmap region with non zero offset is corrupted after madvise call
Product: Base System Reporter: pha <pha>
Component: i386Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me CC: pha
Priority: Normal    
Version: 4.0-RELEASE   
Hardware: Any   
OS: Any   

Description pha 2000-04-19 14:50:01 UTC
This is a VM subsystem bug related to a sequence of calls to mmap and madvise.

It first appeared in snapshots sometime after the July 5th, 1999 Walnut Creek
FreeBSD 4.0 snapshot.

What happens is that in the following case, the memory region visible to the
running progrem is corrupted:

 region =  mmap a file with offset 16K
 madvise(region, MADV_SEQUENTIAL)
 madvise(region, MADV_WILLNEED)

 after that sequence, the region has incorrect contents.  If the madvise calls
 are commented out, the map works properly.  My workaround is to not use
 madvise on post July 5th 1999 4.0 snapshot machines.  The incorrect contents
 are actually data from the same file, but shifted by 16384 bytes, except for the
 last 16384 bytes, which are apparently correct.

 I believe that the _very first_ time madvise is used, that it will not fail.
 Any subsequent run will fail until reboot.

Fix: 

no clue.  I understand the VM system was reworked after July 5th snapshot, and I
have to believe that this test case is not common, so it slipped by...
How-To-Repeat: 
Run the following program three times -
first time creates the test file,
second time shows operation without madvise.
third time shows operation with madvise (which fails).

See comments a few lines below for run instructions.


/*
 * test whether an object can be mapped in two places in VM.
 *
 * compile this program two ways:
 * cc -o vm vm.c
 * vm foo		(creates the file foo with block ids)
 * vm foo		(tests the file foo - should work, because we aren't using madvise)
 *
 * cc -o vm -DUSEMADVISE vm.c
 * vm foo		(now fails on FreeBSD after July 5th 1999 Walnut creek snapshot,
 *				 because madvise corrupts vm maps for second offset object)
 *
 *
 * usage: vm <filename>
 *
 * if <filename> does not exist, creates a FILESIZE file, populates
 * each unique getpagesize() block of data with a getpagesize()/4
 * array of numbers, each of which is the page index into the file.
 *
 * if <filename> does exist, the file is mapped in two sections.
 * the first section the first HEADERSIZE bytes, and is mapped where the OS
 * wants to map it.  The second section (FILESIZE - HEADERSIZE) is mapped via
 * a second mmap call.
 *
 * then both sections are checked for the correct values.
 *
 *
 * written by Paul H. Anderson (pha@pdq.com) to debug a problem with FreeBSD
 * 4.0 apparently after the vm system was re-written.
 *
 * this program is not copyrighted and may be used for any purpose whatsoever,
 * please give credit, please don't remove this notice.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>

#define FILESIZE (1024*1024)	/* file is 1MB long for now */
#define HEADERSIZE (16*1024)	/* header is 16K long for now */

int makethefile(char *filename);
int checkthefile(char *filename);

int main(int argc, char **argv)
{
	struct stat sb;

	if(argc!=2) {
		fprintf(stderr,"usage: %s <filename>\n", argv[0]);
		exit(1);
	}

	if(stat(argv[1], &sb)) {
		if(errno==ENOENT) exit(makethefile(argv[1]));
		perror(argv[0]);
		exit(1);
	}

	/* file exists, so check it. */

	exit(checkthefile(argv[1]));
}

/*
 * basic idea here is to write a file with indentifying information
 * in each block - we just write the block #.
 */

int makethefile(char *filename)
{
	int *block = (int *) calloc(getpagesize() * sizeof(int),1);
	int fd;
	int i;

	printf("creating a %d byte file with blocks marked with their block number.\n", FILESIZE);
	fd = open(filename, O_RDWR|O_CREAT, 0666);

	for(i=0;i<FILESIZE / getpagesize(); i++) {
		block[0] = i;
		if(write(fd, block, getpagesize()) != getpagesize()) {
			fprintf(stderr,"failed to write %d bytes to file %s\n", getpagesize(), filename);
			close(fd);
			exit(1);
		}
	}
	close(fd);
}

/*
 * here we want to check the file
 *
 */
int checkthefile(char *filename)
{
	int *block1, *block2;
	int fd1, fd2;
	int i,j;
	int errors = 0;


	fd1 = open(filename, O_RDONLY);
	block1 = (int *) mmap(NULL, (size_t) HEADERSIZE, PROT_READ, MAP_PRIVATE, fd1, (off_t)0);
#if defined(USEMADVISE)
	madvise(block1, HEADERSIZE, MADV_SEQUENTIAL);
	madvise(block1, HEADERSIZE, MADV_WILLNEED);
#endif

	fd2 = open(filename, O_RDONLY);
	block2 = (int *) mmap(NULL, FILESIZE - HEADERSIZE, PROT_READ, MAP_PRIVATE, fd2, (off_t) HEADERSIZE);
#if defined(USEMADVISE)
	madvise(block2, FILESIZE - HEADERSIZE, MADV_SEQUENTIAL);
	madvise(block2, FILESIZE - HEADERSIZE, MADV_WILLNEED);
#endif

	for(i=0;i<HEADERSIZE / getpagesize(); i++) {
		int index = (i)*getpagesize() / sizeof(int);
		if(block1[index] != i) {
			fprintf(stderr,"expected page with page id %d but got id %d at address %x\n", (int)i, (int)block1[index], (int)&block1[index]);
			errors++;
		}
	}
	/* finished looking at header, now look at second mapped region */

	for(i=HEADERSIZE / getpagesize(); i<FILESIZE / getpagesize(); i++) {
		int index = (i-HEADERSIZE/getpagesize())*getpagesize() / sizeof(int);
		if(block2[index] != i) {
			fprintf(stderr,"expected page with page id %d but got id %d at address %x\n", (int)i, (int)block2[index], (int)&block2[index]);
			errors++;
		}
	}
	if(errors == 0) {
#if defined(USEMADVISE)
		printf("vm system test worked fine - you either fixed the problem, or we are running\n");
		printf("on a FreeBSD machine equal to or prior the July 5th, 1999 Walnut creek snapshot\n");
#else
		printf("madvise not being used, so the fact that this test works only tells you\n");
		printf("that mmap alone with an offset appears to work fine.\n");
#endif
	}
	return errors;
}
Comment 1 Matthew Dillon 2000-05-14 19:41:34 UTC
    Paul, try this patch to /usr/src/sys/vm/vm_map.c and tell me if 
    it works.  It appears to work for me when I test using your 
    test program.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

Index: vm_map.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.c,v
retrieving revision 1.187
diff -u -r1.187 vm_map.c
--- vm_map.c	2000/02/28 04:10:35	1.187
+++ vm_map.c	2000/05/14 18:31:06
@@ -1127,15 +1127,19 @@
 		     (current != &map->header) && (current->start < end);
 		     current = current->next
 		) {
+			vm_offset_t useStart;
+
 			if (current->eflags & MAP_ENTRY_IS_SUB_MAP)
 				continue;
 
 			pindex = OFF_TO_IDX(current->offset);
 			count = atop(current->end - current->start);
+			useStart = current->start;
 
 			if (current->start < start) {
 				pindex += atop(start - current->start);
 				count -= atop(start - current->start);
+				useStart = start;
 			}
 			if (current->end > end)
 				count -= atop(current->end - end);
@@ -1148,7 +1152,7 @@
 			if (behav == MADV_WILLNEED) {
 				pmap_object_init_pt(
 				    map->pmap, 
-				    current->start,
+				    useStart,
 				    current->object.vm_object,
 				    pindex, 
 				    (count << PAGE_SHIFT),
Comment 2 Matt Dillon freebsd_committer freebsd_triage 2000-05-14 19:52:39 UTC
State Changed
From-To: open->closed


A fix is under test and will be committed ASAP. 


Comment 3 pha 2000-05-15 16:06:19 UTC
Hi,

I was able to verify that installation of this fix appears to correct the
madvise problem at my site on my machines, with my application as well.

I have not done extensive testing of the overall system to see if for some
reason it introduces anything new, but I doubt it will.

Thank you very much, Matt, and anyone else who worked on fixing the
problem!  Great job!

Paul

On Sun, 14 May 2000, Matthew Dillon wrote:

>     Paul, try this patch to /usr/src/sys/vm/vm_map.c and tell me if 
>     it works.  It appears to work for me when I test using your 
>     test program.
> 
> 					-Matt
> 					Matthew Dillon 
> 					<dillon@backplane.com>
> 
> Index: vm_map.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/vm/vm_map.c,v
> retrieving revision 1.187
> diff -u -r1.187 vm_map.c
> --- vm_map.c	2000/02/28 04:10:35	1.187
> +++ vm_map.c	2000/05/14 18:31:06
> @@ -1127,15 +1127,19 @@
>  		     (current != &map->header) && (current->start < end);
>  		     current = current->next
>  		) {
> +			vm_offset_t useStart;
> +
>  			if (current->eflags & MAP_ENTRY_IS_SUB_MAP)
>  				continue;
>  
>  			pindex = OFF_TO_IDX(current->offset);
>  			count = atop(current->end - current->start);
> +			useStart = current->start;
>  
>  			if (current->start < start) {
>  				pindex += atop(start - current->start);
>  				count -= atop(start - current->start);
> +				useStart = start;
>  			}
>  			if (current->end > end)
>  				count -= atop(current->end - end);
> @@ -1148,7 +1152,7 @@
>  			if (behav == MADV_WILLNEED) {
>  				pmap_object_init_pt(
>  				    map->pmap, 
> -				    current->start,
> +				    useStart,
>  				    current->object.vm_object,
>  				    pindex, 
>  				    (count << PAGE_SHIFT),
> 

+---------------------------------------------------+
| Paul Anderson           Public Data Queries, Inc. |
| pha@pdq.com             734-213-4964 x308         |
+---------------------------------------------------+