Summary: graphics/drm-fbsd12.0-kmod: massive memory leak
Product: Ports & Packages
Reporter: Dirk Meyer <dinoex>
Component: Individual Port(s)
Assignee: freebsd-x11 (Nobody) <x11>
Severity: Affects Some People
CC: bennett, gljennjohn, swills
Description Dirk Meyer 2020-10-09 14:36:27 UTC
My desktop computer has 16 GiB RAM, but it seems that drm-fbsd12.0-kmod keeps leaking memory, using up all memory in one day and forcing me to reboot.

This computer uses an Intel i5-6200U CPU and its integrated Intel Skylake graphics. The operating system is FreeBSD 12.1. The version of drm-fbsd12.0-kmod is 4.16.g20200221.

When the system had just booted, wired memory was below 1 GiB.

top:
Mem: 3049M Active, 393M Inact, 174M Laundry, 981M Wired, 497M Buf, 11G Free
Swap: 16G Total, 16G Free

After the system had run for a few hours, wired memory had grown to 14 GiB. The desktop was still responsive, but swap usage was increasing.

After the system had run longer, wired memory reached 16 GiB. The desktop was unresponsive, and even shell commands froze for 40 seconds.

The "vmstat -z" totals remain stable:
1,392,114,304 TOTAL 1,333,297,844 58,816,460

See also: https://github.com/FreeBSDDesktop/kms-drm/issues/247
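For anyone who wants to log the growth described in this report, the following is a minimal Python sketch that parses a top(1) "Mem:" line into MiB values so the Wired figure can be compared over time. The function name parse_top_mem and the unit table are illustrative assumptions, not part of any FreeBSD tool; the sample line is the one quoted in the description.

```python
import re

# Multipliers for the size suffixes printed by top(1); results are in MiB.
_UNIT_TO_MIB = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}


def parse_top_mem(line):
    """Parse a top(1) 'Mem:' line into a {label: MiB} dict (hypothetical helper)."""
    # Drop everything from 'Swap:' onward in case both lines were pasted together.
    mem_part = line.split("Swap:")[0]
    fields = {}
    for size, unit, label in re.findall(r"(\d+)([KMGT])\s+([A-Za-z]+)", mem_part):
        fields[label] = int(size) * _UNIT_TO_MIB[unit]
    return fields


# Sample taken from the description above (state shortly after boot).
sample = ("Mem: 3049M Active, 393M Inact, 174M Laundry, 981M Wired, "
          "497M Buf, 11G Free Swap: 16G Total, 16G Free")
mem = parse_top_mem(sample)
print(mem["Wired"], mem["Free"])
```

Running the same parser on later samples would show whether Wired keeps climbing while Free shrinks, which is the pattern reported here.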
Comment 1 Scott Bennett 2020-10-12 06:42:15 UTC
Your description matches that of a collection of bugs present in 11.2 to 11.4 and 12.x that others began complaining of on other lists within days of the release of 11.2. I have complained on the x11 and stable lists about them a few times this year because they are still a problem and because the FreeBSD developers have not addressed them. Over time others have found a number of sysctl workarounds, and I have found a few more. I have accumulated several that, in combination, allow me to keep my system usable for weeks at a time, rather than one to four days. They do not fix the bugs, however.

One glaring bug is that vm.max_wired is now ignored by the kernel. With vm.max_wired=786432 I have seen the amount of real memory tied up in pagefixing (a.k.a. "wiring" in Berkeley dialect) exceed 6700 MB on an 8 GB machine. My view is that vm.max_wired should either be honored or removed from the source code tree. It is worth noting that ZFS ARC does not appear to be to blame. It rarely exceeds the quasi-limit of vfs.zfs.arc_max by more than ~200 MB.

The following sysctl variables, when set to values much larger than their defaults, seem to help keep a system able to do work, or recoverable to that condition without a reboot: vm.v_free_min, vm.pageout_wakeup_thresh, vm.pageout_oom_seq. Reducing the value of vfs.zfs.arc_max may also help, depending upon your system's configuration.

Also, be aware that maintaining a large ccache directory tree to use with "make buildworld", "make buildkernel", or "portmaster -a" will save you a great deal of time, but will also hasten the day when a reboot will be necessary. My advice for those cases is to keep CCACHE_DIR and, for buildworld and buildkernel, /usr/ports or other DESTDIR, in a file system that can be easily unmounted and, perhaps, remounted in order to free up its associated buffer cache memory (most of which should long since have been pagefreed immediately upon completion of an I/O operation anyway). It appears that the kernel no longer does enough of the pagefreeing that it used to do at appropriate times.

Also, if you use ZFS, it will help to reduce the limit on ZFS ARC size by setting vfs.zfs.arc_max. Note, however, that this is merely a crutch to give you more operational time before you have to intervene manually.

FWIW, my speculation is that this mess of bugs, of the kind that should have vanished from FreeBSD by release 1.x, was introduced into 13-CURRENT and later backported into 11.2 and 12.x. (I am aware that it affects 12.1, but I do not know whether 12.0 was affected.) I currently am running 11.4-STABLE at r364474. I do not think these bugs have much, if anything, to do with the graphics stack; rather, they appear to be VM subsystem bugs. My system still suffers from them, even though there is no safe-to-use graphics support for a Radeon HD 5770 card, and therefore my system is not running X11. :-(
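To put the vm.max_wired figure from this comment in perspective, here is a small arithmetic sketch in Python. It assumes that vm.max_wired is expressed in pages and that the page size is 4096 bytes (typical for amd64); neither is stated in the comment, so both are assumptions, and the helper name pages_to_mib is hypothetical. The 6700 MB observation is treated as MiB for the comparison.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB base page size


def pages_to_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to MiB (hypothetical helper)."""
    return pages * page_size / (1024 * 1024)


# vm.max_wired value quoted in the comment, interpreted as a page count.
limit_mib = pages_to_mib(786432)

# Wired memory the commenter reports having seen, in MB.
observed_mib = 6700

print(limit_mib, round(observed_mib / limit_mib, 2))
```

Under these assumptions the configured limit corresponds to 3072 MiB, so the observed wired memory is more than twice the limit, which is consistent with the claim that the setting is not being enforced.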
Comment 2 Gary Jennejohn 2020-10-12 08:29:13 UTC
(In reply to Scott Bennett from comment #1) I use HEAD, and vm.max_wired no longer exists there. However, there is vm.max_user_wired, which has this description: "vm.max_user_wired: system-wide limit to user-wired page count". If this also exists in your version, it might help to set it.
Comment 3 Scott Bennett 2020-10-12 11:13:26 UTC
Given that the only user who can pagefix is root, I am not sure how the new vm.max_user_wired will be useful. Further, if it really does what the description says, it will not solve the problems that vm.max_wired could have helped with, and possibly solved, if it were properly supported in 11 and 12. Those problems appear to be caused by the kernel, not a user program. And no, there is only vm.max_wired in 11. I can't say for 12 at the moment because the laptop that has 12.1 on it is shut off. Next time I use it I can look to see whether vm.max_user_wired is present. If a sysctl to limit pagefixing on a per-process basis were added, without replacing vm.max_wired, that could be useful in a case of root running something that might pagefix too much memory concurrently. It's a shame that vm.max_wired is being dumped instead of repaired. In any case, this PR probably ought to be reassigned to the proper team for the VM system.