On FreeBSD amd64, after about 196 days uptime, the clock_t data type rolls over from 2147483647 (0x7FFFFFFF) to -2147483648 (0x80000000). When this happens, cl_times()'s return value, which is unsigned long (64bit) jumps from 0x000000007FFFFFFF to 0xFFFFFFFF80000000, wreaking all sorts of havoc, e.g. total spam of messages like: May 18 10:00:05 baweb01-tev heartbeat: [84855]: WARN: Gmain_timeout_dispatch: Dispatch function for memory stats was delayed 144115187989455872 ms (> 5000 ms) before being called (GSource: 0x802531318) May 18 10:00:05 baweb01-tev heartbeat: [84855]: WARN: Gmain_timeout_dispatch: Dispatch function for memory stats was delayed 144115187989455879 ms (> 5000 ms) before being called (GSource: 0x802531318) till heartbeat finally shuts down. Furthermore, on this platform, ./configure does not detect that the rollover workaround has to be used, as sizeof(long) is 8, which for some odd reason is the condition for "clock_t is long enough". Note: I placed the same bug report upstream on http://developerbugs.linux-foundation.org/show_bug.cgi?id=2596 Fix: I attached a patch doing the following changes: - fix to ./configure to check size of clock_t, not long - changed return type of cl_times() from unsigned long to clock_t - make all callers able to cope with that change - time_longclock() is the only caller that doesn't immediately convert cl_times()'s return value to a clock_t, so only that one needed changes - changed wraparound logic to no longer rely on unsigned long being "good enough". Instead, the static variables are changed to longclock_t, and sign extension is explicitly prevented by an AND with the maximum "unsigned" clock_t value It fixes the issue in our lab, and should do no harm on other platforms. Patch attached with submission follows: How-To-Repeat: Also, for testing purposes, I provided the source code of a "fake uptime" preload library to fake the return value of times(). That way, the problem can be reproduced without having access to a FreeBSD amd64 system with an uptime of 196 days. Usage: gcc -shared -o /tmp/fu.so fu.c -Os -fPIC -s -Wall -Wextra env LD_PRELOAD=/tmp/fu.so FU_TIMES=0x7FFF0000 /usr/local/etc/rc.d/heartbeat start Note that the library may need changes if libc is not /lib/libc.so.7 Source code: #include <sys/times.h> #include <err.h> #include <stdbool.h> #include <stdlib.h> #include <dlfcn.h> typedef clock_t (times_t) (struct tms *buffer); static times_t *orig_times = NULL; static bool initialized = false; static clock_t offset = 0; static void init() { struct tms t; if(initialized) return; initialized = true; orig_times = (times_t *) dlfunc(RTLD_NEXT, "times"); if(!orig_times) { void *libc = dlopen("/lib/libc.so.7", RTLD_LAZY); orig_times = (times_t *) dlfunc(libc, "times"); } if(!orig_times) errx(1, "cannot find times()"); const char *p = getenv("FU_TIMES"); if(p) { clock_t ret = times(&t); offset = strtoll(p, NULL, 0) - ret; } else offset = 0; initialized = true; } clock_t times(struct tms *buffer) { init(); clock_t ret = orig_times(buffer); return ret + offset; }
Responsible Changed From-To: freebsd-ports-bugs->freebsd-bugs Reassign to src.
This PR seems to have a patch against sysutils/heartbeat, not the kernel source. Yuri
This really needs reassigning to ports, or better, to the maintainer of the= sysutils/heartbeat port. This is no kernel issue at all, heartbeat has special logic to handle clock= _t rollover, but because of these bugs reported here it does not work. Best regards, Rudolf Polzer=
Responsible Changed From-To: freebsd-bugs->freebsd-ports-bugs Apparently this really applies to the sysutils/heartbeat port. The PR confused me.
Hi Justin, I guess you haven't got a mail from gnats about PR 157176. http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/157176 Please do a review of the patch. From what I read on http://developerbugs.linux-foundation.org/show_bug.cgi?id=2596 it seems the bug was confirmed and the patch was adopted in a similar form upstream. Regards, olli
Hey Olli, The patch seems fine. I'll have the patch files created shortly. - Justin On Sat, Aug 27, 2011 at 2:25 PM, Olli Hauer <ohauer@freebsd.org> wrote: > Hi Justin, > > I guess you haven't got a mail from gnats about PR 157176. > http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/157176 > > Please do a review of the patch. From what I read on > http://developerbugs.linux-foundation.org/show_bug.cgi?id=2596 > it seems the bug was confirmed and the patch was adopted in > a similar form upstream. > > Regards, > olli >
Patch files are attached. These can be thrown into the heartbeat port's files directory. Updated: patch-configure.in New: patch-include-clplumbing-longclock.h patch-lib-clplumbping-longclock.c - Justin
On 2011-09-08 01:23, Justin Head wrote: > Patch files are attached. These can be thrown into the heartbeat > port's files directory. > > Updated: > patch-configure.in > > New: > patch-include-clplumbing-longclock.h > patch-lib-clplumbping-longclock.c > > - Justin Hi Justin, thanks for reply and sorry for my delay. A tinderbuild on 8.2-and64 is in progress, will commit later if I see no issues. -- Regards, olli
ohauer 2011-09-20 21:40:51 UTC FreeBSD ports repository Modified files: sysutils/heartbeat Makefile sysutils/heartbeat/files patch-configure.in Added files: sysutils/heartbeat/files patch-include-clplumbing-longclock.h patch-lib-clplumbping-longclock.c Log: - fix crash when clock_t (signed 32bit int) rolls over on amd64 Thanks to Rudolf Polzer and Justin Head! PR: ports/157176 Submitted by: Rudolf Polzer <rpolzer@mucke-novak.de> Approved by: Justin Head <ports@encarnate.com> (maintainer) Revision Changes Path 1.40 +1 -1 ports/sysutils/heartbeat/Makefile 1.3 +19 -2 ports/sysutils/heartbeat/files/patch-configure.in 1.1 +20 -0 ports/sysutils/heartbeat/files/patch-include-clplumbing-longclock.h (new) 1.1 +79 -0 ports/sysutils/heartbeat/files/patch-lib-clplumbping-longclock.c (new) _______________________________________________ cvs-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/cvs-all To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
State Changed From-To: open->closed Committed Thanks!
Responsible Changed From-To: freebsd-ports-bugs->ohauer Committed Thanks!