The following crash was reported in a FreeNAS12 server:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address = 0x410
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80aa4a57
> stack pointer = 0x28:0xfffffe021f94f150
> frame pointer = 0x28:0xfffffe021f94f1d0
> code segment = base rx0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 4908 (nfsd: service)
> trap number = 12
> panic: page fault
> cpuid = 1
> time = 1619545070
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe021f94ee10
> vpanic() at vpanic+0x17b/frame 0xfffffe021f94ee60
> panic() at panic+0x43/frame 0xfffffe021f94eec0
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe021f94ef20
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe021f94ef70
> trap() at trap+0x286/frame 0xfffffe021f94f080
> calltrap() at calltrap+0x8/frame 0xfffffe021f94f080
> --- trap 0xc, rip = 0xffffffff80aa4a57, rsp = 0xfffffe021f94f150, rbp = 0xfffffe021f94f1d0 ---
> __mtx_lock_sleep() at __mtx_lock_sleep+0xd7/frame 0xfffffe021f94f1d0
> clnt_bck_svccall() at clnt_bck_svccall+0x10a/frame 0xfffffe021f94f210
> svc_vc_recv() at svc_vc_recv+0x1b2/frame 0xfffffe021f94f2e0
> svc_run_internal() at svc_run_internal+0x377/frame 0xfffffe021f94f420
> svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe021f94f430
> fork_exit() at fork_exit+0x7e/frame 0xfffffe021f94f470
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe021f94f470
> --- trap 0xc, rip = 0x8002e1b2a, rsp = 0x7fffffffe578, rbp = 0x7fffffffe810 ---
> KDB: enter: panic

This crash in clnt_bck_svccall() appears to have occurred because the CLIENT structure for handling the callback RPCs has already been free'd.
Freeing this CLIENT structure only occurs when the ClientID (not the same thing, despite the name similarity) has been destroyed.
Created attachment 224761
ref cnt the CLIENT structure so that it is not prematurely free'd

This patch acquires a reference on the CLIENT structure that is not released until the associated socket structure is destroyed. This should prevent the structure from being free'd before a callback reply has been processed. It also adds a check for the closed or closing state, so that the code does not try to process a callback reply after the CLIENT structure has been CLNT_CLOSE()'d. I think this should fix the crash. I am waiting for a review and, hopefully, a positive test report from Michael Dexter, who reported the crash.
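To illustrate the idea only (this is not the attached patch), here is a minimal single-threaded user-space sketch of the pattern described above: the transport takes its own reference on the client handle, drops it only when the transport is destroyed, and refuses to process a reply once the handle is marked closed. Every name in it (struct client, struct xprt, client_acquire(), handle_callback_reply() and so on) is hypothetical and stands in for the kernel's CLIENT and socket structures. Real kernel code would also need locking and atomic reference counts, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel CLIENT structure. */
struct client {
	int	refs;	/* number of holders; freed when this reaches 0 */
	bool	closed;	/* set once the handle has been closed */
};

/* Hypothetical stand-in for the socket-side transport structure. */
struct xprt {
	struct client *cl;	/* back pointer used for callback replies */
};

static int clients_freed;	/* demo-only counter of completed frees */

static struct client *
client_create(void)
{
	struct client *cl = calloc(1, sizeof(*cl));

	assert(cl != NULL);
	cl->refs = 1;		/* the creator holds the first reference */
	return (cl);
}

static void
client_acquire(struct client *cl)
{
	cl->refs++;
}

static void
client_release(struct client *cl)
{
	if (--cl->refs == 0) {
		clients_freed++;
		free(cl);
	}
}

static void
client_close(struct client *cl)
{
	cl->closed = true;	/* no further replies should be processed */
}

/* The transport takes its own reference when it stores the pointer. */
static void
xprt_attach(struct xprt *xp, struct client *cl)
{
	client_acquire(cl);
	xp->cl = cl;
}

/* That reference is dropped only when the transport itself goes away. */
static void
xprt_destroy(struct xprt *xp)
{
	if (xp->cl != NULL) {
		client_release(xp->cl);
		xp->cl = NULL;
	}
}

/* Returns true if a callback reply would be processed. */
static bool
handle_callback_reply(struct xprt *xp)
{
	struct client *cl = xp->cl;

	if (cl == NULL || cl->closed)
		return (false);	/* closed or closing: skip the reply */
	return (true);
}
```

In this sketch, if the creator calls client_close() and then client_release() while the transport is still attached, the structure stays allocated and handle_callback_reply() returns false instead of touching freed memory. The memory is freed only when xprt_destroy() drops the last reference.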
Created attachment 224763
ref cnt the CLIENT structure so that it is not prematurely free'd for freebsd12

Same patch as 224761, but for older kernels. Search sys/fs/nfsserver/nfs_nfsdstate.c for xp_p2. If you find 3 occurrences, use this patch. If you find 2, use 224761.
The patch has been committed and MFC'd.