Hi,

I currently have some trouble with an amd64 HEAD build machine. This machine cross-compiles for an ARMv6 host. During the compilation, objcopy enters an infinite loop. The process is stuck (unkillable) in the "RUN" state:

@@
  PID  %CPU %MEM   VSZ  RSS TT STAT STARTED      TIME COMMAND
93219  99.5  0.1 12844 3580  - R    11:46   184:34.58 objcopy -j .peh ...
@@

The only thing I can get for now is the kernel backtrace via "procstat -kk 93219", which I run in a loop. Here are some of the samples:

@@
__lockmgr_args+0x62a getblkx+0x154 breadn_flags+0x3d vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
gbincore+0x38 getblkx+0xab breadn_flags+0x3d vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
breadn_flags+0x1e9 vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
__lockmgr_args+0x672 binsfree+0x51 vfs_bio_getpages+0x386 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
vm_page_grab+0x6b vfs_bio_getpages+0x4ac ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
ffs_getpages+0x78 VOP_GETPAGES_APV+0x56 ...
@@

The "common" part is vfs_bio_getpages, which seems to loop endlessly.

What can I do to provide more information on this issue?

Best regards
Alexandre
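To make the "common part" of such samples easier to see, the frames can be tallied per function. The following is only a rough sketch: the stack fragments are copied from the samples quoted above, while the helper names `function_names` and `frame_counts` are made up for illustration and are not part of any FreeBSD tool.

```python
from collections import Counter

# Stack fragments copied from the procstat -kk samples quoted above,
# one string per sample (the trailing "..." parts are left out).
SAMPLES = [
    "__lockmgr_args+0x62a getblkx+0x154 breadn_flags+0x3d vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
    "gbincore+0x38 getblkx+0xab breadn_flags+0x3d vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
    "breadn_flags+0x1e9 vfs_bio_getpages+0x323 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
    "__lockmgr_args+0x672 binsfree+0x51 vfs_bio_getpages+0x386 ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
    "vm_page_grab+0x6b vfs_bio_getpages+0x4ac ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
    "ffs_getpages+0x78 VOP_GETPAGES_APV+0x56",
]


def function_names(sample):
    """Return the set of function names in one sample, dropping the +0x offsets."""
    return {frame.split("+", 1)[0] for frame in sample.split()}


def frame_counts(samples):
    """Count in how many samples each function name appears."""
    counts = Counter()
    for sample in samples:
        counts.update(function_names(sample))
    return counts


counts = frame_counts(SAMPLES)
for name, n in counts.most_common():
    print(f"{n}/{len(SAMPLES)}  {name}")
```

Functions that appear in nearly every sample while the frames above them keep changing are the likely home of the loop; here that points at vfs_bio_getpages and its callers.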
Some additional info:
- I'm running on UFS
- I'm running asynchronous
- The machine has 12 CPUs
I updated the issue and changed the affected version: stable/12 has the same problem.
All file systems are OK. I have to run fsck manually each time, because the tool reports the error "PARTIALLY TRUNCATED INODE" and is unable to recover from it.

# mount
/dev/ufs/root on / (ufs, local, noatime)
devfs on /dev (devfs, local, multilabel)
/dev/ufs/var on /var (ufs, local, noatime)
/dev/ufs/tmp on /tmp (ufs, asynchronous, local, noatime)
/dev/ufs/usr on /usr (ufs, asynchronous, local, noatime)
/dev/ufs/home on /home (ufs, asynchronous, local, noatime)

# tunefs -p /dev/ufs/root (all file systems are the same)
Password:
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            5240
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)                                 root
I have not yet been able to reproduce the problem. I have a core file from Alexandre's host:
https://people.freebsd.org/~pho/bug236961.12.0-STABLE.coredump.txz
After some investigation, it seems that the condition "if (ma[i]->valid != VM_PAGE_BITS_ALL)" (in vfs_bio_getpages) is always true in my case.
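To illustrate why a condition like the one above can keep a process spinning, here is a purely illustrative toy model, not the kernel code. Everything in it is an assumption made for the example: the value of `VM_PAGE_BITS_ALL`, the dictionary used as a "page", the helper names, and above all the `max_passes` cap, which exists only so the sketch terminates.

```python
# Assumption for this sketch: 8 valid bits per page
# (one bit per 512-byte block of a 4 KiB page).
VM_PAGE_BITS_ALL = 0xFF


def read_pages(pages, read_block, max_passes=5):
    """Re-read until every page is fully valid.

    Returns the number of passes needed, or None if max_passes was
    exhausted. The cap is hypothetical; a loop without one would
    simply never finish in the second case.
    """
    for attempt in range(1, max_passes + 1):
        for page in pages:
            if page["valid"] != VM_PAGE_BITS_ALL:
                read_block(page)
        if all(page["valid"] == VM_PAGE_BITS_ALL for page in pages):
            return attempt
    return None


def healthy_read(page):
    """Hypothetical reader that makes the page fully valid."""
    page["valid"] = VM_PAGE_BITS_ALL


def stuck_read(page):
    """Hypothetical reader that always leaves the last block invalid."""
    page["valid"] = 0x7F


ok = read_pages([{"valid": 0}, {"valid": 0}], healthy_read)
stuck = read_pages([{"valid": 0}, {"valid": 0}], stuck_read)
print("healthy:", ok, "stuck:", stuck)
```

In the healthy case a single pass is enough. When something keeps a page from ever becoming fully valid, every pass ends with the same check failing, which would match a process stuck at 99.5% CPU with vfs_bio_getpages in every backtrace.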
Hello,

The problem disappears when I put the /tmp folder (via a symlink) on the same partition as /home (where the build runs).

To recap my disk configuration:
- The build (sources + objects) runs on the /home partition
- /tmp is on the same disk as /home, but located before it (/tmp is quicker than /home)
- Both /home and /tmp are "async + noatime"
- I use ccache (but that seems not relevant)
- Swap is not the problem (the freeze still occurs when I disable it)
- When /tmp is a symlink to a folder in /home, the problem disappears
What does "swapctl -l" show? "systat -swap" also helps to monitor swap page usage.
We believe this will be fixed by r359464. *** This bug has been marked as a duplicate of bug 242626 ***