Bug 256352 - website build: hugo hangs sometimes forever
Summary: website build: hugo hangs sometimes forever
Status: Open
Alias: None
Product: Documentation
Classification: Unclassified
Component: Documentation (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Many People
Assignee: freebsd-doc (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-06-01 19:59 UTC by Wolfram Schneider
Modified: 2021-06-13 06:24 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Wolfram Schneider freebsd_committer 2021-06-01 19:59:54 UTC
Our new website build system depends on hugo. Sometimes it hangs and does not finish

/usr/local/bin/hugo --minify
Start building sites ?
[...]
Total in 247238 ms
--- generate-pgpkeys-txt ---
/usr/local/bin/ruby ./tools/global-pgpkeys-creator.rb


using truss(1) I see that a hugo user thread is waiting some nanoseconds for something and then runs in a timeout

ERR#60 'Operation timed out'

and a kevent is called with a ~10 seconds timeout. This repeats and never ends.


_umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
compat11.kevent(3,0x0,0,{ },64,{ 9.977036585 })	 = 0 (0x0)
_umtx_op(0xc000088558,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
_umtx_op(0xc000088558,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
nanosleep({ 0.010000000 })			 = 0 (0x0)
compat11.kevent(3,0x0,0,{ },64,{ 9.976893055 })	 = 0 (0x0)
_umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
_umtx_op(0xc00051e558,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
_umtx_op(0xc00051e558,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
Comment 1 Wolfram Schneider freebsd_committer 2021-06-01 20:00:26 UTC
I run hugo with the verbose parameter and I see this happens right after syncing the static files:

--- generate-releases ---
/usr/local/bin/python3 ./tools/releases-toml.py -p ./shared/releases.adoc
--- build ---
/usr/local/bin/hugo --verbose
INFO 2021/05/29 08:58:11 Translation func for language it not found, use default.
INFO 2021/05/29 08:58:11 Translation func for language hu not found, use default.
INFO 2021/05/29 08:58:11 Using config file:
Start building sites ?
INFO 2021/05/29 08:58:11 syncing static files to /usr/local/freebsd/doc/website/public/
Comment 2 Wolfram Schneider freebsd_committer 2021-06-01 20:10:39 UTC
I was able to run the whole build process with truss(1). This takes time. Here is the output before and when it starts to run in an endless loop. (the full log is 800MB)

 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/public/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=940726,size=4369,blksize=4608 },0x0) = 0 (0x0)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: open("/usr/local/freebsd/doc/website/public/security/so_public_key.asc",O_RDONLY|O_CLOEXEC,00) = 7 (0x7)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: fstat(7,{ mode=-rw-rw-r-- ,inode=940726,size=4369,blksize=4608 }) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: open("/usr/local/freebsd/doc/website/static/security/so_public_key.asc",O_RDONLY|O_CLOEXEC,00) = 8 (0x8)
 7680: thr_kill(105085,SIGURG)                   = 0 (0x0)
 7680: fstat(8,{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 }) = 0 (0x0)
 7680: SIGNAL 16 (SIGURG) code=SI_LWP pid=7680 uid=1024
 7680: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
 7680: sigreturn(0xc0001076c0)                   EJUSTRETURN
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: read(7,"-----BEGIN PGP PUBLIC KEY BLOCK-"...,1000) = 1000 (0x3e8)
 7680: read(8,"-----BEGIN PGP PUBLIC KEY BLOCK-"...,1000) = 1000 (0x3e8)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: read(7,"lwCeNtPmA+g5\ncn24psuzOeh1tRElIm"...,1000) = 1000 (0x3e8)
 7680: read(8,"lwCeNtPmA+g5\ncn24psuzOeh1tRElIm"...,1000) = 1000 (0x3e8)
 7680: nanosleep({ 0.000020000 })                = 0 (0x0)
 7680: read(7,"IXlXJR0TvuDuJOaLNvTOC3OI8L97fdBc"...,1000) = 1000 (0x3e8)
 7680: read(8,"IXlXJR0TvuDuJOaLNvTOC3OI8L97fdBc"...,1000) = 1000 (0x3e8)
 7680: nanosleep({ 0.000040000 })                = 0 (0x0)
 7680: read(7,"TECRuM2/T2zHHr0AdKuBPF28U+H\nTxy"...,1000) = 1000 (0x3e8)
 7680: nanosleep({ 0.000080000 })                = 0 (0x0)
 7680: read(8,"TECRuM2/T2zHHr0AdKuBPF28U+H\nTxy"...,1000) = 1000 (0x3e8)
 7680: read(7,"0A\nUEzs6oMF5YOFE8LmykesbUHAbC07"...,1000) = 369 (0x171)
 7680: read(8,"0A\nUEzs6oMF5YOFE8LmykesbUHAbC07"...,1000) = 369 (0x171)
 7680: nanosleep({ 0.000160000 })                = 0 (0x0)
 7680: read(7,0xc00096e400,1000)                 = 0 (0x0)
 7680: read(8,0xc00096e800,1000)                 = 0 (0x0)
 7680: close(8)                                  = 0 (0x0)
 7680: close(7)                                  = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/public/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=940726,size=4369,blksize=4608 },0x0) = 0 (0x0)
 7680: nanosleep({ 0.000320000 })                = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security/so_public_key.asc",{ mode=-rw-rw-r-- ,inode=936130,size=4369,blksize=4608 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/public/security",{ mode=drwxr-xr-x ,inode=936779,size=13,blksize=131072 },0x0) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/static/security",{ mode=drwxrwxr-x ,inode=932802,size=7,blksize=131072 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
 7680: fstatat(AT_FDCWD,"/usr/local/freebsd/doc/website/public/",{ mode=drwxr-xr-x ,inode=936173,size=68,blksize=131072 },0x0) = 0 (0x0)
 7680: utimensat(AT_FDCWD,"/usr/local/freebsd/doc/website/public/",{ -6795364579.128654848, -6795364579.128654848 },0x0) = 0 (0x0)
 7680: nanosleep({ 0.000640000 })                = 0 (0x0)
 7680: _umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
 7680: compat11.kevent(3,0x0,0,{ },64,{ 9.995060341 }) = 0 (0x0)
 7680: _umtx_op(0xc000060d58,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 7680: _umtx_op(0xc000060d58,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
 7680: nanosleep({ 0.001280000 })                = 0 (0x0)
 7680: _umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
 7680: nanosleep({ 0.002560000 })                = 0 (0x0)
 7680: nanosleep({ 0.005120000 })                = 0 (0x0)
 7680: compat11.kevent(3,0x0,0,{ },64,{ 9.910313681 }) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
 7680: nanosleep({ 0.010000000 })                = 0 (0x0)
 7680: compat11.kevent(3,0x0,0,{ },64,{ 9.901809311 }) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
 7680: _umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
 7680: nanosleep({ 0.010000000 })                = 0 (0x0)
 7680: _umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
 7680: compat11.kevent(3,0x0,0,{ },64,{ 9.911707291 }) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 7680: _umtx_op(0xc000088158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
 7680: nanosleep({ 0.010000000 })                = 0 (0x0)
 7680: compat11.kevent(3,0x0,0,{ },64,{ 9.911323811 }) = 0 (0x0)

[...]

at least you can kill the process with ^C

 7680: _umtx_op(0x2e45a18,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x7fffdfffcdc8) ERR#60 'Operation timed out'
 7680: _umtx_op(0x2e46478,UMTX_OP_WAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 7680: _umtx_op(0x2e46478,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) = 0 (0x0)
 7680: nanosleep({ 0.010000000 })                = 0 (0x0)
 7680: _umtx_op(0x2e46478,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x18,0x0) ERR#4 'Interrupted system call'
 7680: SIGNAL 2 (SIGINT) code=SI_KERNEL
 7680: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
 7680: sigprocmask(SIG_UNBLOCK,{ SIGINT },0x0)   = 0 (0x0)
 7674: kill(-7680,SIGINT)                        ERR#3 'No such process'
 7680: thr_self(0xc000009510)                    = 0 (0x0)
 7680: thr_kill(100554,SIGINT)                   = 0 (0x0)
 7680: SIGNAL 2 (SIGINT) code=SI_LWP pid=7680 uid=1024
 7680: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
 7680: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 }) = 0 (0x0)
 7680: sigaction(SIGINT,{ SIG_DFL SA_ONSTACK|SA_RESTART|SA_SIGINFO ss_t },{ 0x802e0e410 SA_ONSTACK|SA_RESTART|SA_SIGINFO ss_t }) = 0 (0x0)
 7680: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
 7680: sigprocmask(SIG_UNBLOCK,{ SIGINT },0x0)   = 0 (0x0)
 7680: thr_self(0xc0000089e8)                    = 0 (0x0)
 7680: thr_kill(100554,SIGINT)                   = 0 (0x0)
 7680: SIGNAL 2 (SIGINT) code=SI_LWP pid=7680 uid=1024
 7680: <thread 105085 exited>
 7680: <thread 105083 exited>
 7680: <thread 105086 exited>
 7680: <thread 105087 exited>
 7680: <thread 105088 exited>
 7680: <thread 105084 exited>
 7680: process killed, signal = 2
Comment 3 Danilo G. Baio freebsd_committer 2021-06-05 00:58:30 UTC
I've seen this as well, it does not happen in the first build for me, only when website/public has contents so it always get stuck on 'syncing static files'.
Comment 4 Ceri Davies freebsd_committer 2021-06-10 12:37:30 UTC
This looks like it is related to the go runtime, as https://ask.csdn.net/questions/3196768 feels quite similar.

As a workaround, a pre-build step of "rm -rf public/*" seems to work well but if content is actually being served from that directory then that will obviously be problematic.
Comment 5 Wolfram Schneider freebsd_committer 2021-06-13 06:21:51 UTC
I can reproduce this on 3 different machines (12-STABLE, 13-STABLE, 14-CURRENT). The first run is ok, the second hangs after 5 seconds - always.

Given that there is a workaround (rm -rf website/public) I think this issue is not critical, just annoying.
Comment 6 Wolfram Schneider freebsd_committer 2021-06-13 06:24:35 UTC
Actually, you don't have to remove all the 8040 files in website/public/. It is enough to remove the cgi directory (9 files)

rm -rf website/public/cgi