Bug 78444 - [sched_ule] doesn't keep track of the sleep time of a process
Summary: [sched_ule] doesn't keep track of the sleep time of a process
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 6.0-CURRENT
Hardware: Any Any
: Normal Affects Only Me
Assignee: Antoine Brodin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-03-05 13:10 UTC by Antoine Brodin
Modified: 2012-12-29 12:38 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Antoine Brodin 2005-03-05 13:10:13 UTC
When using the sched_ule scheduler, the sleep time of a process isn't
computed.
kg_slptime is zeroed during the ksegrp creation and after that, it's
never modified.
In sched_ule.c, something named kg_slptime is manipulated but it's not
the sleep time of a process: there's the macro
#define kg_slptime kg_sched->skg_slptime
It represents an history of the process's sleep and not a sleep time
and it would be bogus to use it as the sleep time.

Side effects: swapout doesn't work as intended: a process can never be
swaped out since its sleep time is 0 and is below swap_idle_threshold1.

Fix: 

Having a lightweight schedcpu routine running every hz ticks like in
sched_4bsd ?
The problem is that such a routine would use a O(n) algorithm and would
 penalize quite unnecessarily sched_ule.

Another idea: add a sched_slptime routine in sched_ule.c for swapout
and fill_kinfo_thread to use ?

add something like this to sched_ule.c:

%%
u_int
sched_slptime(struct ksegrp *kg)
{
	struct thread *td;
	u_int slptime = UINT_MAX;
	u_int hzticks;

	PROC_LOCK_ASSERT(kg->kg_proc, MA_OWNED);
	mtx_assert(&sched_lock, MA_OWNED);
	FOREACH_THREAD_IN_GROUP(kg, td) {
		if (td->td_slptime) {
			hzticks = ticks - td->td_slptime;			
			slptime = min(slptime, hzticks);
		} else
			return (0);
	}
	return (slptime / hz);
}
%%

add something like this to sched_4bsd.c:

%%
u_int
sched_slptime(struct ksegrp *kg)
{

	return (kg->kg_slptime);
}
%%

add something like this to sched.h:

%%
u_int sched_slptime(struct ksegrp *kg);
%%

and use sched_slptime(kg) in vm_glue.c and kern_proc.c instead of
kg->kg_slptime
How-To-Repeat: Use ps:
$ ps -o sl
 SL
  0
  0
  0
  0
  0
  0
  0
  0
  0
  0
  0
  0
Comment 1 Antoine Brodin 2005-03-05 21:03:17 UTC
The assertion PROC_LOCK_ASSERT(kg->kg_proc, MA_OWNED); in sched_slptime
is not necessary and wrong.
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2005-03-09 10:13:08 UTC
Responsible Changed
From-To: freebsd-bugs->jeff

Over to ULE author.
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2007-03-14 22:24:26 UTC
State Changed
From-To: open->feedback

The ULE scheduler has been extensively upgraded in -CURRENT.  Does this 
problem still occur there?
Comment 4 Antoine Brodin 2007-03-15 15:11:51 UTC
Mark Linimon <linimon@FreeBSD.org> wrote:
> Synopsis: [sched_ule] doesn't keep track of the sleep time of a process
> 
> State-Changed-From-To: open->feedback
> State-Changed-By: linimon
> State-Changed-When: Wed Mar 14 22:24:26 UTC 2007
> State-Changed-Why: 
> The ULE scheduler has been extensively upgraded in -CURRENT.  Does this
> problem still occur there?
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=78444

The problem still occurs.
%%
> grep -R td_slptime /sys/
/sys/kern/kern_proc.c:  kp->ki_slptime = td->td_slptime;
/sys/kern/sched_4bsd.c:                         if (td->td_slptime > 1) {
/sys/kern/sched_4bsd.c:                         td->td_slptime = 0;
/sys/kern/sched_4bsd.c:                         td->td_slptime++;
/sys/kern/sched_4bsd.c:                 if (td->td_slptime > 1)
/sys/kern/sched_4bsd.c: if (td->td_slptime > 5 * loadfac)
/sys/kern/sched_4bsd.c:         td->td_slptime--;       /* was incremented in schedcpu() */
/sys/kern/sched_4bsd.c:         while (newcpu && --td->td_slptime)
/sys/kern/sched_4bsd.c: td->td_slptime = 0;
/sys/kern/sched_4bsd.c: if (td->td_slptime > 1) {
/sys/kern/sched_4bsd.c: td->td_slptime = 0;
/sys/sys/proc.h:        u_int           td_slptime;     /* (j) How long completely blocked. */
/sys/vm/vm_glue.c:                              pri = p->p_swtime + td->td_slptime;
/sys/vm/vm_glue.c:                              if (td->td_slptime < swap_idle_threshold1)
/sys/vm/vm_glue.c:                                  (td->td_slptime < swap_idle_threshold2)))
/sys/vm/vm_glue.c:                              if (minslptime > td->td_slptime)
/sys/vm/vm_glue.c:                                      minslptime = td->td_slptime;
%%

So td_slptime is used by kern/kern_proc.c and vm/vm_glue.c but in the
sched_ule case it is always 0.
Comment 5 Mark Linimon freebsd_committer freebsd_triage 2007-03-15 17:11:13 UTC
State Changed
From-To: feedback->open

Submitter notes the problem still recurs.
Comment 6 jroberson 2007-03-17 20:38:24 UTC
On Thu, 15 Mar 2007, Antoine Brodin wrote:

> Mark Linimon <linimon@FreeBSD.org> wrote:
>> Synopsis: [sched_ule] doesn't keep track of the sleep time of a process
>>
>> State-Changed-From-To: open->feedback
>> State-Changed-By: linimon
>> State-Changed-When: Wed Mar 14 22:24:26 UTC 2007
>> State-Changed-Why:
>> The ULE scheduler has been extensively upgraded in -CURRENT.  Does this
>> problem still occur there?
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=78444
>
> The problem still occurs.
> %%
>> grep -R td_slptime /sys/
> /sys/kern/kern_proc.c:  kp->ki_slptime = td->td_slptime;
> /sys/kern/sched_4bsd.c:                         if (td->td_slptime > 1) {
> /sys/kern/sched_4bsd.c:                         td->td_slptime = 0;
> /sys/kern/sched_4bsd.c:                         td->td_slptime++;
> /sys/kern/sched_4bsd.c:                 if (td->td_slptime > 1)
> /sys/kern/sched_4bsd.c: if (td->td_slptime > 5 * loadfac)
> /sys/kern/sched_4bsd.c:         td->td_slptime--;       /* was incremented in schedcpu() */
> /sys/kern/sched_4bsd.c:         while (newcpu && --td->td_slptime)
> /sys/kern/sched_4bsd.c: td->td_slptime = 0;
> /sys/kern/sched_4bsd.c: if (td->td_slptime > 1) {
> /sys/kern/sched_4bsd.c: td->td_slptime = 0;
> /sys/sys/proc.h:        u_int           td_slptime;     /* (j) How long completely blocked. */
> /sys/vm/vm_glue.c:                              pri = p->p_swtime + td->td_slptime;
> /sys/vm/vm_glue.c:                              if (td->td_slptime < swap_idle_threshold1)
> /sys/vm/vm_glue.c:                                  (td->td_slptime < swap_idle_threshold2)))
> /sys/vm/vm_glue.c:                              if (minslptime > td->td_slptime)
> /sys/vm/vm_glue.c:                                      minslptime = td->td_slptime;
> %%
>
> So td_slptime is used by kern/kern_proc.c and vm/vm_glue.c but in the
> sched_ule case it is always 0.
>

Atoine,

Do you have any interest in working on a patch for this?  I propose that 
we change slptime to a ticks value and modify sched_4bsd to get a delta 
from the present slptime in the updatepri() calculations etc.  In this way 
it would be compatible with ULE's O(1) algorithm and the rest of the 
kernel could assume it was a tick count as well.

Thanks,
Jeff
Comment 7 Antoine Brodin 2007-03-17 21:34:56 UTC
On Sat, 17 Mar 2007 12:38:24 -0800 (PST)
Jeff Roberson <jroberson@chesapeake.net> wrote:
> Do you have any interest in working on a patch for this?  I propose that 
> we change slptime to a ticks value and modify sched_4bsd to get a delta 
> from the present slptime in the updatepri() calculations etc.  In this way 
> it would be compatible with ULE's O(1) algorithm and the rest of the 
> kernel could assume it was a tick count as well.

I don't have much time now so I don't think I'll be able to provide a
patch.
p_swtime has to be changed too, it is a number of seconds for sched_4bsd
and a number of ticks for sched_ule

Cheers,

Antoine
Comment 8 Antoine Brodin freebsd_committer freebsd_triage 2012-12-29 12:37:22 UTC
State Changed
From-To: open->closed

Someone fixed this bug, I can no longer reproduce on 9-stable and -current 


Comment 9 Antoine Brodin freebsd_committer freebsd_triage 2012-12-29 12:37:22 UTC
Responsible Changed
From-To: jeff->antoine

Someone fixed this bug, I can no longer reproduce on 9-stable and -current