Bug 22667

Summary: kernel panic when attempting to remove file with big UID
Product: Base System
Reporter: eugene <eugene>
Component: kern
Assignee: dwmalone
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 3.4-RELEASE   
Hardware: Any   
OS: Any   

Description eugene 2000-11-07 20:30:01 UTC
After a power outage the filesystem was corrupted, and multiple special files were found with invalid times and uid/gid values. (The uids and gids were very big, possibly as large as an unsigned long can hold.) fsck ran unusually long (3 times longer than normal) but didn't find anything bad and marked the filesystem as clean. The filesystem was mounted with the nodev flag for safety, and any attempt to remove those bad files caused a kernel panic.

Here's information from crash dump:
Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
fault virtual address   = 0x28
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0179eb1
stack pointer           = 0x10:0xd2ef9cbc
frame pointer           = 0x10:0xd2ef9cc4
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 51461 (rm)
interrupt mask          =  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

backtrace:
#0  0xc0148f63 in boot ()
#1  0xc014921b in panic ()
#2  0xc01d28ab in trap_fatal ()
#3  0xc01d254b in trap_pfault ()
#4  0xc01d21c2 in trap ()
#5  0xc0179eb1 in spec_strategy ()
#6  0xc0179635 in spec_vnoperate ()
#7  0xc01a52f1 in ufs_vnoperatespec ()
#8  0xc019b2a0 in ffs_indirtrunc ()
#9  0xc019ae30 in ffs_truncate ()
#10 0xc019fb05 in ufs_inactive ()
#11 0xc01a52f1 in ufs_vnoperatespec ()
#12 0xc016ddd6 in vput ()
#13 0xc0171519 in unlink ()
#14 0xc01d2b1f in syscall ()
#15 0xc01c18cc in Xint0x80_syscall ()
----
Every time we attempted to remove any of those files, the kernel crashed in a similar way, and the backtrace was the same.

I browsed the CVS tree and didn't find any changes related to this issue, so I believe the problem still exists in 4.x.

Fix: 

There's no fix. You can only clean those inodes through fsdb, and only by hand, because fsdb closes stdin after every command, which makes scripting impossible.
How-To-Repeat: I don't know, since the filesystem was damaged by a power outage. But you can try this:
Create a special file and tweak its *time and uid/gid parameters through fsdb,
making the uid/gid as big as possible. Then mount the filesystem and try to delete that file.
Comment 1 iedowse 2000-11-07 20:56:54 UTC
In message <200011072021.MAA10065@alj.me.ru>, eugene@nsb.sovam.com writes:

>#8  0xc019b2a0 in ffs_indirtrunc ()
>#9  0xc019ae30 in ffs_truncate ()

>I browsed cvs tree and didnt find any changes related to this issue so i
>believe that problem still exists in 4.x

I believe that this is a duplicate of PR bin/19426, which was fixed
in -current and 4-stable in July. The problem was actually that
fsck fails to zero di_size in device inodes; when these inodes are
removed, the kernel will panic when it tries to truncate them (hence
the ffs_truncate in the traceback you provided).

	http://www.freebsd.org/cgi/query-pr.cgi?pr=19426 

Ian
Comment 2 dwmalone freebsd_committer freebsd_triage 2000-11-07 22:55:22 UTC
State Changed
From-To: open->feedback

Can you let us know if the patch from the PR fixes the problem? 

(Mind you - you probably don't have a filesystem with the problem 
any more. I can look into merging the change into 3.X if using 
a local patch or moving to 4.X isn't an option). 


Comment 3 dwmalone freebsd_committer freebsd_triage 2000-11-07 22:55:22 UTC
Responsible Changed
From-To: freebsd-bugs->dwmalone

Original PR was mine.
Comment 4 iedowse freebsd_committer freebsd_triage 2001-04-10 20:03:27 UTC
State Changed
From-To: feedback->closed

I believe this problem is already fixed (see PR bin/19426 for 
further details).