| Summary: | pthreads: Cannot set scheduling timer/Cannot set virtual timer | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Lawrence D. Lopez <lawlopez> |
| Component: | kern | Assignee: | freebsd-threads (Nobody) <threads> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | Unspecified | ||
| Hardware: | Any | ||
| OS: | Any | ||
In message <20000718222750.0210537B6CF@hub.freebsd.org>, lawlopez@cisco.com writes:

>A timer interrupt occurs at this point which then calls
>a timeout function which does not mask off
>timer interrupts but which processes for a period of time
>long enough so that the original timecounter element used
>by microtime is reused.

This is the culprit, do you know which timeout function this is ?

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Responsible Changed
From-To: freebsd-bugs->jasone
Over to maintainer.

Responsible Changed
From-To: jasone->freebsd-bugs

Responsible Changed
From-To: freebsd-bugs->freebsd-threads
Assign to threads mailing list

Why does the bug belong to thread library ? it is a kernel timer race bug.
David Xu

State Changed
From-To: open->closed
libc_r is no longer supported and the timeout code has changed substantially
The threads library is terminating because setitimer is being called with a negative number of microseconds.

This is occurring because a timeout function is taking so long that all of the timecounter elements are used.

This is occurring because the system call gettimeofday is returning a negative number of microseconds to _thread_kern_sched() in /usr/src/lib/libc_r_g/uthread/uthread_kern.c

This is occurring because the microtime() function called by the system call gettimeofday in /usr/src/sys/kern/kern_clock.c is returning a negative number of microseconds.

This is occurring because the inline function tco_delta() is returning a negative time.

This is occurring because the timecounter structure used by microtime and given to tco_delta is being modified while tco_delta is using it. Specifically, tco_delta() is calling tc->tc_get_timecount(tc) and is storing the value in a register. A timer interrupt occurs at this point which then calls a timeout function which does not mask off timer interrupts but which processes for a period of time long enough so that the original timecounter element used by microtime is reused. At this point sync_other_counter() resets tc->tc_offset_count and tco_delta() returns a very large number.

    static __inline unsigned
    tco_delta(struct timecounter *tc)
    {
            return ((tc->tc_get_timecount(tc) - tc->tc_offset_count) &
                tc->tc_counter_mask);
    }

Fix:

Add to the kernel configuration file:

    options "NTIMECOUNTER=100"

This would allow for timeout functions that run for up to 1000 milliseconds. The comments in LINT, while they may be correct, should be more specific.

How-To-Repeat:

It is difficult to reproduce the problem. I think if you start up a timeout function which spins for 100 ms and then call gettimeofday() you may get lucky.
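
The arithmetic behind the "very large number" can be replayed in user space. The sketch below is not kernel code: struct fake_timecounter, fake_delta() and replay_race() are hypothetical names made up for illustration, the counter is assumed to be 32 bits wide, and the hardware reading is passed in as a plain argument so that the "read the counter, get interrupted, then subtract" ordering can be forced by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct timecounter; only the
 * two fields that take part in the delta calculation are modelled. */
struct fake_timecounter {
	uint32_t offset_count;	/* counter value at the last update */
	uint32_t counter_mask;	/* mask for the counter width */
};

/* Same subtraction-and-mask as tco_delta(), with the counter reading
 * supplied by the caller instead of read from hardware. */
static uint32_t
fake_delta(uint32_t sampled_count, const struct fake_timecounter *tc)
{
	assert(tc != NULL);
	return ((sampled_count - tc->offset_count) & tc->counter_mask);
}

/* Replay the race: the counter is sampled first, then an "interrupt"
 * moves offset_count past the sample before the subtraction runs. */
static uint32_t
replay_race(void)
{
	struct fake_timecounter tc = { 1000u, 0xffffffffu };
	uint32_t sampled = 1500u;	/* value held in a register */

	tc.offset_count = 2000u;	/* element reused and resynchronised */
	return (fake_delta(sampled, &tc));
}
```

Without the interference the delta would be 500 counts. With offset_count advanced past the sampled value, the unsigned subtraction wraps and the result is just under 2^32 counts, which is consistent with the bogus microsecond values seen further up the call chain.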