Bug 107171 - [patch] [ncurses] systat(1) doesn't die when it's xterm is killed while it's running
Summary: [patch] [ncurses] systat(1) doesn't die when it's xterm is killed while it's ...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 6.1-RELEASE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Jaakko Heinonen
URL:
Keywords:
: 32667 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-12-24 16:00 UTC by Josh Paetzel <josh@tcbug.org>
Modified: 2015-04-27 13:16 UTC (History)
1 user (show)

See Also:


Attachments
systat-hang-on-getch-loop-fix.patch (1.44 KB, patch)
2008-01-26 21:25 UTC, Jaakko Heinonen
no flags Details | Diff
systat-hang-on-getch-loop-fix.patch (1.48 KB, patch)
2008-01-27 12:19 UTC, Jaakko Heinonen
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Josh Paetzel <josh@tcbug.org> 2006-12-24 16:00:29 UTC

I discovered my CPU at 0% idle this morning.  top reported systat using
95% of the CPU time.  I didn't have systat running anywhere, then remembered
I had it running in an xterm last night and killed the xterm without stopping
systat.  This seems to be 100% reproducable, but interestingly enough only
happens when systat is run as root.  Running systat as a normal user doesn't
trigger this behavior.....killing its xterm kills systat in that case.

It's annoying that systat doesn't die in this situation, the fact that it
maxes the CPU is what really makes this a nasty behavior.

Fix: 

I don't know of a fix but the workaround is:

1) Don't run systat as root
2) Don't kill an xterm with a running systat in it
How-To-Repeat: 

startx
open 2 xterms
su to root in one of them
start top in the other
systat -if in the root xterm
kill the xterm running systat
watch systat run away to 100% cpu usage
Comment 1 Josh Paetzel <josh@tcbug.org> 2006-12-30 08:24:08 UTC
For what it's worth I've done some more testing and the problem 
doesn't seem to be restricted to xterms.  The same behaviour can be 
reproduced using pterms, aterms, and eterms.

I've also confirmed that other people can reproduce the issue, 
although all on 6.x i386 boxes.

-- 
Thanks,

Josh Paetzel
Comment 2 Nate Eldredge 2007-03-08 10:06:35 UTC
The bug is in /usr/src/usr.bin/systat/keyboard.c, where an error return by 
getch() is ignored.  When you close an xterm, it sends SIGHUP to all 
processes running on that terminal, and all further terminal I/O by those 
processes fails with ENXIO.  But a process running as a different user 
(e.g. root) can't receive the SIGHUP, and systat ignores the ENXIO and 
retries endlessly.

I presume the retry is to cope with the case where getch() was interrupted 
by a signal.  So probably it should only retry if errno = EINTR.  This is 
essentially what top does.  Also, ferror(stdin) is a bogus test; getch() 
does not go through stdio and so ferror(stdin) will never be set, and the 
clearerr() is bogus too.

EINTR is probably moot for FreeBSD anyway since it appears our ncurses 
handles that transparently, but it doesn't hurt to check.

Here is a patch which fixes the problem.  It still assumes the constant 
ERR returned by getch() is -1 which is not great style, but I'll touch as 
little as possible here.

--- /usr/src/usr.bin/systat/keyboard.c.old	Thu Mar  8 02:00:35 2007
+++ /usr/src/usr.bin/systat/keyboard.c	Thu Mar  8 02:00:40 2007
@@ -42,6 +42,7 @@
  #include <ctype.h>
  #include <signal.h>
  #include <termios.h>
+#include <errno.h>

  #include "systat.h"
  #include "extern.h"
@@ -58,9 +59,11 @@
                  do {
                          refresh();
                          ch = getch() & 0177;
-                        if (ch == 0177 && ferror(stdin)) {
-                                clearerr(stdin);
-                                continue;
+                        if (ch == 0177) {
+				if (errno == EINTR)
+                                	continue;
+				else
+					die(0);
                          }
                          if (ch >= 'A' && ch <= 'Z')
                                  ch += 'a' - 'A';

-- 
Nate Eldredge
nge@cs.hmc.edu
Comment 3 Bruce Evans 2007-03-08 12:20:56 UTC
On Thu, 8 Mar 2007, Nate Eldredge wrote:

> I presume the retry is to cope with the case where getch() was interrupted
> by a signal.  So probably it should only retry if errno = EINTR.  This is
> essentially what top does.  Also, ferror(stdin) is a bogus test; getch()
> does not go through stdio and so ferror(stdin) will never be set, and the
> clearerr() is bogus too.
>
> EINTR is probably moot for FreeBSD anyway since it appears our ncurses
> handles that transparently, but it doesn't hurt to check.

ncurses has code to actively hide it, but this doesn't seem to be
configured in FreeBSD (option HIDE_EINTR).  Thus getch()'s man page
may be wrong in claiming that the ncurses implementation never returns
EINTR.

In fact, it is a bug for getch() to never return EINTR, and a worse
bug to loop in an input function when read() returns EINTR.  Most
unthreaded applications that use signal handlers need to change the
BSD default of restarting syscalls after a signal, so that their signal
handlers can just set a flag and depend on their main loop checking
the flag promptly.  Restarting syscalls prevents the flag being checked
promptly by looping calling read() (or another syscall) in the kernel,
and HIDE_EINTR would prevent the flag being checked promptly by looping
calling read() in the library.

However, EINTR is probably irrelevant for the current bug.  systat
doesn't catch SIGHUP in order to clean up up unsafely like it does
for SIGINT, so any SIGHUP that gives EOF on the terminal would either:
- kill the process uncleanly, or
- not be delivered to the process because it is masked (no EINTR since
   it is not caught), or
- not be delivered to the process because of job control stuff.
So systat just needs a simple check for EOF from getch() and the
remainly problem is that getch(), like getc(), unimproves on read()
by not having any way to distinguish the non-error of EOF from errors.

Bruce
Comment 4 Nate Eldredge 2007-03-09 03:09:22 UTC
Bruce Evans wrote:

> However, EINTR is probably irrelevant for the current bug. systat
> doesn't catch SIGHUP in order to clean up up unsafely like it does
> for SIGINT, so any SIGHUP that gives EOF on the terminal would either:
> - kill the process uncleanly, or
> - not be delivered to the process because it is masked (no EINTR since
> it is not caught), or
> - not be delivered to the process because of job control stuff.

I included the EINTR handling because systat does use SIGALRM.  In fact it 
actually does all the display updating inside the SIGALRM handler (nice 
huh?).  If we are sure that restarting syscalls are in effect (systat does 
not explicitly select this behavior), then I agree that systat can die on 
EINTR like any other error.  But if not then EINTR could legitimately 
appear every time the alarm rings, and the correct behavior is to retry.

> So systat just needs a simple check for EOF from getch() and the
> remainly problem is that getch(), like getc(), unimproves on read()
> by not having any way to distinguish the non-error of EOF from errors.

Note getch() returns ERR, not EOF.  (In the current implementation they 
are both equal to -1.)

getc() is actually okay; when getc() returns EOF you can use ferror(f) and 
feof(f) to decide if it was an error or end-of-file.  But yes, with 
getch() you're stuck.  I guess because it's normally used on a terminal 
device in raw mode, where you wouldn't normally receive an end-of-file 
(since ^D wouldn't work).

-- 
Nate Eldredge
nge@cs.hmc.edu
Comment 5 Dominic Marks 2008-01-02 15:20:25 UTC
All

I have stumbled upon this PR after finding a systat process running
at 99% cpu for the umpteenth time. I don't know enough about the
subject matter to comment on the solution but I would be very happy
for the problem to go away.

The patch has not been committed, was there an issue with it to
prevent this?

Happy new year,
Dominic Marks
Comment 6 Jaakko Heinonen 2008-01-26 21:25:35 UTC
Here's an updated patch against HEAD. I took the keyboard.c snippet from
OpenBSD and added a signal handling for SIGHUP in main.c.
Comment 7 Jaakko Heinonen 2008-01-27 12:19:53 UTC
Updated patch to add a missing include for stdlib.h in keyboard.c. (For
exit(3) prototype.)
Comment 8 Rebecca Cran freebsd_committer freebsd_triage 2008-05-14 23:34:54 UTC
The same problem also appears to affect many other full-screen ncurses 
apps: I've seen it occur with ee, sysinstall, sade and tzsetup too.

-- 
Bruce
Comment 9 Jaakko Heinonen freebsd_committer freebsd_triage 2009-10-08 19:17:37 UTC
Responsible Changed
From-To: freebsd-bugs->jh

Take.
Comment 10 dfilter service freebsd_committer freebsd_triage 2009-10-11 13:32:40 UTC
Author: jh
Date: Sun Oct 11 12:32:25 2009
New Revision: 197956
URL: http://svn.freebsd.org/changeset/base/197956

Log:
  - Catch SIGHUP to perform cleanup before exiting.
  - Exit if getch() returns with an error other than EINTR. Otherwise
    systat(1) may get stuck in an infinite loop if it doesn't receive
    SIGHUP when terminal closes. [1]
  - Remove attempt to clear stdio error indicators. getch() doesn't use
    stdio, making it useless. [2]
  - Remove unneeded masking of getch() return value. [2]
  
  PR:		bin/107171
  Reviewed by:	bde
  Approved by:	trasz (mentor)
  Obtained from:	OpenBSD [1]
  Suggested by:	bde [2]
  MFC after:	1 month

Modified:
  head/usr.bin/systat/keyboard.c
  head/usr.bin/systat/main.c

Modified: head/usr.bin/systat/keyboard.c
==============================================================================
--- head/usr.bin/systat/keyboard.c	Sun Oct 11 12:23:56 2009	(r197955)
+++ head/usr.bin/systat/keyboard.c	Sun Oct 11 12:32:25 2009	(r197956)
@@ -39,8 +39,10 @@ __FBSDID("$FreeBSD$");
 static const char sccsid[] = "@(#)keyboard.c	8.1 (Berkeley) 6/6/93";
 #endif
 
+#include <errno.h>
 #include <ctype.h>
 #include <signal.h>
+#include <stdlib.h>
 #include <termios.h>
 
 #include "systat.h"
@@ -57,10 +59,11 @@ keyboard(void)
                 move(CMDLINE, 0);
                 do {
                         refresh();
-                        ch = getch() & 0177;
-                        if (ch == 0177 && ferror(stdin)) {
-                                clearerr(stdin);
-                                continue;
+                        ch = getch();
+                        if (ch == ERR) {
+                                if (errno == EINTR)
+                                        continue;
+                                exit(1);
                         }
                         if (ch >= 'A' && ch <= 'Z')
                                 ch += 'a' - 'A';

Modified: head/usr.bin/systat/main.c
==============================================================================
--- head/usr.bin/systat/main.c	Sun Oct 11 12:23:56 2009	(r197955)
+++ head/usr.bin/systat/main.c	Sun Oct 11 12:32:25 2009	(r197956)
@@ -133,6 +133,7 @@ main(int argc, char **argv)
 			exit(1);
 		}
 	}
+	signal(SIGHUP, die);
 	signal(SIGINT, die);
 	signal(SIGQUIT, die);
 	signal(SIGTERM, die);
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 11 Jaakko Heinonen freebsd_committer freebsd_triage 2009-10-11 13:53:39 UTC
State Changed
From-To: open->patched

Patched in head (r197956).
Comment 12 dfilter service freebsd_committer freebsd_triage 2009-11-15 14:11:45 UTC
Author: jh
Date: Sun Nov 15 14:11:26 2009
New Revision: 199290
URL: http://svn.freebsd.org/changeset/base/199290

Log:
  MFC r197956:
  
  - Catch SIGHUP to perform cleanup before exiting.
  - Exit if getch() returns with an error other than EINTR. Otherwise
    systat(1) may get stuck in an infinite loop if it doesn't receive
    SIGHUP when terminal closes.
  - Remove attempt to clear stdio error indicators. getch() doesn't use
    stdio, making it useless.
  - Remove unneeded masking of getch() return value.
  
  PR:		bin/107171
  Approved by:	trasz (mentor)

Modified:
  stable/8/usr.bin/systat/keyboard.c
  stable/8/usr.bin/systat/main.c
Directory Properties:
  stable/8/usr.bin/systat/   (props changed)

Modified: stable/8/usr.bin/systat/keyboard.c
==============================================================================
--- stable/8/usr.bin/systat/keyboard.c	Sun Nov 15 11:43:28 2009	(r199289)
+++ stable/8/usr.bin/systat/keyboard.c	Sun Nov 15 14:11:26 2009	(r199290)
@@ -39,8 +39,10 @@ __FBSDID("$FreeBSD$");
 static const char sccsid[] = "@(#)keyboard.c	8.1 (Berkeley) 6/6/93";
 #endif
 
+#include <errno.h>
 #include <ctype.h>
 #include <signal.h>
+#include <stdlib.h>
 #include <termios.h>
 
 #include "systat.h"
@@ -57,10 +59,11 @@ keyboard(void)
                 move(CMDLINE, 0);
                 do {
                         refresh();
-                        ch = getch() & 0177;
-                        if (ch == 0177 && ferror(stdin)) {
-                                clearerr(stdin);
-                                continue;
+                        ch = getch();
+                        if (ch == ERR) {
+                                if (errno == EINTR)
+                                        continue;
+                                exit(1);
                         }
                         if (ch >= 'A' && ch <= 'Z')
                                 ch += 'a' - 'A';

Modified: stable/8/usr.bin/systat/main.c
==============================================================================
--- stable/8/usr.bin/systat/main.c	Sun Nov 15 11:43:28 2009	(r199289)
+++ stable/8/usr.bin/systat/main.c	Sun Nov 15 14:11:26 2009	(r199290)
@@ -133,6 +133,7 @@ main(int argc, char **argv)
 			exit(1);
 		}
 	}
+	signal(SIGHUP, die);
 	signal(SIGINT, die);
 	signal(SIGQUIT, die);
 	signal(SIGTERM, die);
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 13 dfilter service freebsd_committer freebsd_triage 2010-05-21 17:04:12 UTC
Author: jh
Date: Fri May 21 16:03:57 2010
New Revision: 208381
URL: http://svn.freebsd.org/changeset/base/208381

Log:
  MFC r197956:
  
  - Catch SIGHUP to perform cleanup before exiting.
  - Exit if getch() returns with an error other than EINTR. Otherwise
    systat(1) may get stuck in an infinite loop if it doesn't receive
    SIGHUP when terminal closes.
  - Remove attempt to clear stdio error indicators. getch() doesn't use
    stdio, making it useless.
  - Remove unneeded masking of getch() return value.
  
  PR:		bin/107171

Modified:
  stable/7/usr.bin/systat/keyboard.c
  stable/7/usr.bin/systat/main.c
Directory Properties:
  stable/7/usr.bin/systat/   (props changed)

Modified: stable/7/usr.bin/systat/keyboard.c
==============================================================================
--- stable/7/usr.bin/systat/keyboard.c	Fri May 21 16:01:57 2010	(r208380)
+++ stable/7/usr.bin/systat/keyboard.c	Fri May 21 16:03:57 2010	(r208381)
@@ -39,8 +39,10 @@ __FBSDID("$FreeBSD$");
 static const char sccsid[] = "@(#)keyboard.c	8.1 (Berkeley) 6/6/93";
 #endif
 
+#include <errno.h>
 #include <ctype.h>
 #include <signal.h>
+#include <stdlib.h>
 #include <termios.h>
 
 #include "systat.h"
@@ -57,10 +59,11 @@ keyboard()
                 move(CMDLINE, 0);
                 do {
                         refresh();
-                        ch = getch() & 0177;
-                        if (ch == 0177 && ferror(stdin)) {
-                                clearerr(stdin);
-                                continue;
+                        ch = getch();
+                        if (ch == ERR) {
+                                if (errno == EINTR)
+                                        continue;
+                                exit(1);
                         }
                         if (ch >= 'A' && ch <= 'Z')
                                 ch += 'a' - 'A';

Modified: stable/7/usr.bin/systat/main.c
==============================================================================
--- stable/7/usr.bin/systat/main.c	Fri May 21 16:01:57 2010	(r208380)
+++ stable/7/usr.bin/systat/main.c	Fri May 21 16:03:57 2010	(r208381)
@@ -133,6 +133,7 @@ main(int argc, char **argv)
 			exit(1);
 		}
 	}
+	signal(SIGHUP, die);
 	signal(SIGINT, die);
 	signal(SIGQUIT, die);
 	signal(SIGTERM, die);
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 14 Jaakko Heinonen freebsd_committer freebsd_triage 2010-05-21 21:24:33 UTC
State Changed
From-To: patched->closed

Fixed in head, stable/8 and stable/7.
Comment 15 Jilles Tjoelker freebsd_committer freebsd_triage 2015-04-27 13:16:05 UTC
*** Bug 32667 has been marked as a duplicate of this bug. ***