Bug 230273 - linux_select CALL is very slow (100 ms)
Summary: linux_select CALL is very slow (100 ms)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.2-RELEASE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-emulation (Nobody)
Depends on:
Blocks: 247219
  Show dependency treegraph
Reported: 2018-08-01 21:20 UTC by Jason Mader
Modified: 2020-10-14 16:16 UTC (History)
2 users (show)

See Also:


Note You need to log in before you can comment on or make changes to this bug.
Description Jason Mader 2018-08-01 21:20:23 UTC
Linux compatibility layer is taking a long time with either a read() or select() in a jail. I noticed this because the select() is called about 20 times during a network exchange, so the network communication takes about 2 seconds. I always see this take 100 ms, so it makes me think it's a timeout of some kind.

0.000007856 linux_time(0x0)			 = 1533157182 (0x5b621f3e)
0.000017328 linux_newstat("/etc/localtime",0x7fffffffc500) = 0 (0x0)
0.000009030 linux_select(0x1,0x7fffffffd310,0x0,0x0,0x7fffffffd390) = 1 (0x1)
0.000009219 read(0,"f\0\0\0",4)			 = 4 (0x4)
0.104589092 linux_select(0x1,0x7fffffffd290,0x0,0x0,0x7fffffffd310) = 1 (0x1)

Here's a ktrace of a similar call sequence,

 70145 lstc_server CALL  linux_time(0)
 70145 lstc_server RET   linux_time 1533157805/0x5b6221ad
 70145 lstc_server CALL  linux_newstat(0x800b56e6b,0x7fffffffc500)
 70145 lstc_server NAMI  "/compat/linux/etc/localtime"
 70145 lstc_server NAMI  "/etc/localtime"
 70145 lstc_server STRU  struct stat {dev=2853443365, ino=53, mode=0100444, nlink=1, uid=0, gid=0, rdev=4294967295, atime=1504808312.221830000, mtime=1504808312.221900000, ctime=1532210790.994694000, birthtime=1504808312.221900000, size=118, blksize=4096, blocks=1, flags=0x800 }
 70145 lstc_server RET   linux_newstat 0
 70145 lstc_server CALL  linux_select(0x1,0x7fffffffd310,0,0,0x7fffffffd390)
 70145 lstc_server RET   linux_select 1
 70145 lstc_server CALL  read(0,0x7fffffffd3a0,0x4)
 70145 lstc_server GIO   fd 0 read 4 bytes
 70145 lstc_server RET   read 4
 70145 lstc_server CALL  linux_select(0x1,0x7fffffffd290,0,0,0x7fffffffd310)
 70145 lstc_server RET   linux_select 1

And maybe related, or not, even with

jail {

linux_semop() always returns an error.

0.000006559 linux_time(0x0)			 = 1533157182 (0x5b621f3e)
0.000007433 linux_semop(0x20000,0x7fffffffd3d0,0x1) ERR#-11 'Resource temporarily unavailable'
0.000380425 linux_select(0x6,0x7fffffffd310,0x0,0x0,0x7fffffffd390) = 1 (0x1)
0.000010566 linux_accept(0x4,0x0,0x7fffffffd3ac) = 0 (0x0)
0.000007921 linux_getpeername(0x0,0x7fffffffd1b0,0x7fffffffd1cc) = 0 (0x0)
0.000007532 linux_getpeername(0x0,0x7fffffffd1b0,0x7fffffffd1cc) = 0 (0x0)
0.000009828 linux_semop(0x20000,0x7fffffffd3d0,0x1) ERR#-11 'Resource temporarily unavailable'
0.000009462 linux_getpeername(0x0,0x7fffffffd3c0,0x7fffffffd3dc) = 0 (0x0)
0.000009209 linux_getpeername(0x0,0x7fffffffd3c0,0x7fffffffd3dc) = 0 (0x0)
Comment 1 Conrad Meyer freebsd_committer 2018-08-01 21:39:26 UTC
What does the fd being read correspond to?  What is the program trying to do?  I don't think we can diagnose this from the little information provided.
Comment 2 Conrad Meyer freebsd_committer 2018-08-01 21:43:51 UTC
Based on the implementation of linux_select in compat/linux/linux_misc, I don't see how the wrapper would introduce 100ms of latency.
Comment 3 Jason Mader 2018-08-02 10:35:07 UTC
(In reply to Conrad Meyer from comment #1)

It might be,

 12650 lstc_server CALL  linux_socket(0x1,0x80002,0)
 12650 lstc_server RET   linux_socket 0
 12650 lstc_server CALL  linux_connect(0,0x800d911c0,0x6e)
 12650 lstc_server STRU  struct sockaddr { AF_LOCAL, /dev/log }
 12650 lstc_server NAMI  "/dev/log"
 12650 lstc_server RET   linux_connect -1 errno -2 No such file or directory
 12650 lstc_server CALL  close(0)
 12650 lstc_server RET   close 0

FreeBSD 11.2 Bug 228351 devfs doesn't expose /dev/log, even with `add path tty unhide` in rules.
Comment 4 Conrad Meyer freebsd_committer 2018-08-02 19:14:35 UTC
(In reply to Jason Mader from comment #3)
So your program is maybe blindly calling select(2) on an fd that isn't even open and encountering a timeout when nothing happens?  That would not be a linuxulator bug.
Comment 5 Conrad Meyer freebsd_committer 2018-08-02 19:17:02 UTC
(In reply to Conrad Meyer from comment #4)
Hm, though in that case you would expect EBADF.  But perhaps something else is being opened and receives the fd number.
Comment 6 Jason Mader 2018-08-03 19:16:41 UTC
(In reply to Conrad Meyer from comment #4)
Correct, it likely isn't. I've run the binary on the host system (not in the jail) where it has access to the /dev/log symlink, and it still has the delay. I'm going to try to see if I can use dtrace to find out what's happening between the read() and the select().
Comment 7 Dmitry Chagin freebsd_committer 2020-02-05 11:25:04 UTC
hi, Jason. what is the program u are using? how to repeat u case?
Comment 8 Jason Mader 2020-02-05 11:52:09 UTC
(In reply to Dmitry Chagin from comment #7)

It's an application vendor's license server program, so not easy for anyone else to recreate with that program.
Comment 9 Dmitry Chagin freebsd_committer 2020-02-05 20:51:35 UTC
(In reply to Jason Mader from comment #8)
so, may be you show the full ktrace output from server via kdump -HAR -m 32