My root fs is out of space. When I do "make buildkernel && sudo make installkernel", the buildkernel fails almost immediately but in that case I'd expect the whole command to fail:

markj@biggie> make -j32 KERNCONF=GENERIC buildkernel && sudo make installkernel -j32 KERNCONF=GENERIC INSTKERNNAME= kernel.test && sudo nextboot -k kernel.test && sudo shutdown -r now
/: write failed, filesystem is full
--- buildkernel ---
--- installkernel ---
make[1] warning: /usr/home/markj/src/freebsd/: Read-only file system.
--- _installcheck_kernel ---
--------------------------------------------------------------
>>> Install check kernel
--------------------------------------------------------------
--- installkernel ---
--------------------------------------------------------------
>>> Installing kernel GENERIC on Tue Oct 19 09:36:11 EDT 2021
--------------------------------------------------------------
...
markj@biggie> dmesg | tail -n 2
lo0: link state changed to UP
pid 8820 (make), uid 1001 inumber 7865089 on /: filesystem full
markj@biggie>
I used dtrace to stop the make process when the error occurs:

markj@biggie> sudo dtrace -n 'syscall:::return /errno == ENOSPC/{raise(SIGSTOP);}' -w

Then

markj@biggie> gdb -q -p $(pgrep make)
Attaching to process 56889
Reading symbols from /usr/bin/make...
Reading symbols from /usr/lib/debug//usr/bin/make.debug...
Reading symbols from /lib/libc.so.7...
warning: the debug information found in "/usr/lib/debug//lib/libc.so.7.debug" does not match "/lib/libc.so.7" (CRC mismatch).
(No debugging symbols found in /lib/libc.so.7)
Reading symbols from /libexec/ld-elf.so.1...
warning: the debug information found in "/usr/lib/debug//libexec/ld-elf.so.1.debug" does not match "/libexec/ld-elf.so.1" (CRC mismatch).
(No debugging symbols found in /libexec/ld-elf.so.1)
[Switching to LWP 101646 of process 56889]
0x00000008011d0faa in _write () from /lib/libc.so.7
(gdb) bt
#0  0x00000008011d0faa in _write () from /lib/libc.so.7
#1  0x00000008011b4307 in ?? () from /lib/libc.so.7
#2  0x00000008011aa57e in fflush () from /lib/libc.so.7
#3  0x0000000001039a5e in ShellWriter_WriteFmt (wr=<optimized out>, fmt=0x10286f4 "%s\n", arg=0x8018a0102 "cd /usr/home/markj/src/freebsd; PATH=/sbin:/bin:/usr/sbin:/usr/bin MAKE_CMD=\"make\" make -m /usr/home/markj/src/freebsd/share/mk -f Makefile.inc1 TARGET=amd64 TARGET_ARCH=amd64 buildkernel") at /root/freebsd/contrib/bmake/job.c:798
#4  JobWriteCommand (job=<optimized out>, wr=<optimized out>, ln=0x80190cf00, ucmd=<optimized out>) at /root/freebsd/contrib/bmake/job.c:1010
#5  JobWriteCommands (job=<optimized out>, job@entry=0x801913fc0) at /root/freebsd/contrib/bmake/job.c:1048
#6  0x0000000001037985 in JobWriteShellCommands (job=0x801913fc0, gn=0x801905f00, out_run=<optimized out>) at /root/freebsd/contrib/bmake/job.c:1646
#7  JobStart (gn=<optimized out>, special=false) at /root/freebsd/contrib/bmake/job.c:1725
#8  Job_Make (gn=gn@entry=0x801905f00) at /root/freebsd/contrib/bmake/job.c:2170
#9  0x000000000103f735 in MakeStartJobs () at /root/freebsd/contrib/bmake/make.c:1028
#10 0x000000000103f4c4 in Make_Run (targs=targs@entry=0x7fffffffdeb0) at /root/freebsd/contrib/bmake/make.c:1391
#11 0x000000000103ce4c in runTargets () at /root/freebsd/contrib/bmake/main.c:953
#12 main_Run () at /root/freebsd/contrib/bmake/main.c:1642
#13 main (argc=17129376, argv=<optimized out>) at /root/freebsd/contrib/bmake/main.c:1701

So we're writing to the "job file" set up by JobWriteShellCommands(), which creates a tmpfile and unlinks it.
And errors from writing are indeed discarded:

791 static void
792 ShellWriter_WriteFmt(ShellWriter *wr, const char *fmt, const char *arg)
793 {
794 	DEBUG1(JOB, fmt, arg);
795
796 	(void)fprintf(wr->f, fmt, arg);
797 	/* XXX: Is flushing needed in any case, or only if f == stdout? */
798 	(void)fflush(wr->f);
799 }
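For illustration only, here is a minimal sketch of what a non-discarding variant could look like. This is not the bmake fix: the ShellWriter struct below is a cut-down stand-in and ShellWriter_WriteFmtChecked is a hypothetical name. It only shows that both fprintf() and fflush() report failure through their return values, which the (void) casts throw away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for bmake's ShellWriter; only the FILE * matters here. */
typedef struct ShellWriter {
	FILE *f;
} ShellWriter;

/*
 * Hypothetical error-checking variant of ShellWriter_WriteFmt: returns
 * true only if both the formatted write and the flush succeeded.
 */
static bool
ShellWriter_WriteFmtChecked(ShellWriter *wr, const char *fmt, const char *arg)
{
	assert(wr != NULL && wr->f != NULL);

	/* fprintf() returns a negative value on an output error. */
	if (fprintf(wr->f, fmt, arg) < 0)
		return false;
	/* fflush() returns EOF (nonzero) if the underlying write fails. */
	if (fflush(wr->f) != 0)
		return false;
	return true;
}
```

A caller would then have to treat a false return as a failed job rather than go on to run a truncated or empty command file.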
FYI, this will be fixed in bmake 20211024.
(In reply to Simon J. Gerraty from comment #3) Great, thank you!
(In reply to Simon J. Gerraty from comment #3) I suppose this is fixed by https://github.com/NetBSD/src/commit/48b38ee7e341477ac9dd3d4d804b951179581199 , now in FreeBSD? I can't verify it at the moment but checking for errors from fclose() seems sufficient. Thanks.
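As a rough sketch of the "check at close time" idea (not the code from the commit linked above, and close_checked is a hypothetical helper name): individual writes go unchecked, and the stream is examined once when it is closed. The sketch also consults ferror(), on the assumption that an earlier failed write has set the stream's error indicator, rather than relying on fclose() alone to report it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: close f and report whether all output appears to
 * have reached the file. ferror() catches failures already recorded on
 * the stream; fclose() catches a failure while flushing what is left in
 * the buffer.
 */
static bool
close_checked(FILE *f)
{
	bool had_error;

	assert(f != NULL);

	had_error = ferror(f) != 0;
	if (fclose(f) != 0)
		had_error = true;
	return !had_error;
}
```

With something like this, the job file writer can keep its simple unchecked fprintf() calls and fail the job only once, at the point where the file is finished.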