Bug 249932 - graphics/drm-fbsd12.0-kmod: panic on certain graphics operations
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-27 06:49 UTC by Martin Birgmeier
Modified: 2021-11-19 14:57 UTC
CC List: 4 users

See Also:
bugzilla: maintainer-feedback? (x11)


Attachments
/var/crash/core.txt.1 (481.99 KB, text/plain)
2020-09-27 06:49 UTC, Martin Birgmeier
/var/run/dmesg.boot (15.31 KB, text/plain)
2020-09-27 06:49 UTC, Martin Birgmeier
/var/log/Xorg.0.log (39.81 KB, text/plain)
2020-09-27 06:49 UTC, Martin Birgmeier
boot messages with kernel from head@r366217 (14.18 KB, text/plain)
2020-10-03 08:07 UTC, Martin Birgmeier
core.txt.8 panic info (379.25 KB, text/plain)
2020-11-01 21:32 UTC, Patrick McMunn

Description Martin Birgmeier 2020-09-27 06:49:00 UTC
Created attachment 218349
/var/crash/core.txt.1

Scenario:
- FreeBSD 12.1-RELEASE-p6 #6 r362488M
- latest ports, amongst them drm-fbsd12.0-kmod-4.16.g20200221, drm-kmod-g20190710, gpu-firmware-kmod-g20200920
- CPU: AMD Phenom(tm) II X4 955 Processor (3210.75-MHz K8-class CPU)
- Chipset: "ATI Radeon HD 4290" (ChipID = 0x9714)
- Starting X via sddm
- Logging in
- Starting firefox
- Modify toolbar, drag one icon

Result:
- The machine panics

Note:
- This has been happening for a long time; from time to time I check whether it has been fixed.
Comment 1 Martin Birgmeier 2020-09-27 06:49:28 UTC
Created attachment 218350
/var/run/dmesg.boot
Comment 2 Martin Birgmeier 2020-09-27 06:49:59 UTC
Created attachment 218351
/var/log/Xorg.0.log
Comment 3 Val Packett 2020-09-27 11:50:57 UTC
We have seen the radeon_gem_object_free + ttm_bo_cleanup_refs_or_queue trace before: bug 237544, bug 234760. Nice to get a report from someone who knows about kgdb :)

Would be great if you could:

- upgrade to a CURRENT kernel
- git clone https://github.com/freebsd/drm-kmod
- add 'KCONFIG+= DRM_DEBUG_MM' to kconfig.mk
- install the driver (make -jN, make install)
- try to reproduce
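
The steps above could be scripted roughly as follows. This is only a sketch under assumptions: enable_drm_debug_mm is a made-up helper name, kconfig.mk is assumed to sit at the top of the drm-kmod checkout, and -j4 merely stands in for -jN. The clone and build commands are left as comments so that nothing runs by accident.

```shell
#!/bin/sh
# Sketch only: enable_drm_debug_mm is a hypothetical helper name.
# Appends the debug option to the given kconfig.mk unless that exact
# line is already present, so running it twice is harmless.
enable_drm_debug_mm() {
    kconfig="$1"
    line='KCONFIG+= DRM_DEBUG_MM'
    grep -qxF -- "$line" "$kconfig" 2>/dev/null ||
        printf '%s\n' "$line" >> "$kconfig"
}

# Intended use on a machine running the CURRENT kernel
# (the install step needs root):
#   git clone https://github.com/freebsd/drm-kmod
#   cd drm-kmod
#   enable_drm_debug_mm kconfig.mk   # location of kconfig.mk is an assumption
#   make -j4 && make install         # -j4 stands in for -jN
```

Appending rather than editing in place keeps the change easy to revert: deleting the last line of kconfig.mk restores the default build.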
Comment 4 Martin Birgmeier 2020-09-27 13:27:34 UTC
Current as in stable/12 or in head?

And in any case, just kernel + modules, no userland (that would be too heavy for this use case, which is "reliable zfs server")?

Will I be able to go back to releng/12.1 regarding zfs?

I basically just want to flip /boot/kernel and /boot/modules.

-- Martin
Comment 5 Val Packett 2020-09-27 20:47:36 UTC
CURRENT == head, yes.

Of course you can keep 12 userland.

Regarding ZFS, just don't zpool upgrade. No upgrade will be done automatically.
Comment 6 Martin Birgmeier 2020-09-28 17:09:33 UTC
Already working on it; it will take a while, as I also have to recompile all ports for head to be prepared for drm-kmod. Or can I just compile the git repo using the native tools, without any ports installed (all in a VM running head, before transferring the result to the machine in question)?

(See also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249957 ;-))

-- Martin
Comment 7 Martin Birgmeier 2020-09-29 19:27:31 UTC
I have recompiled head @ r366217 and then all (my usual) ports @ r550470. Then cloned drm-kmod and set DRM_DEBUG_MM. uname -a shows, amongst others, "FreeBSD 13.0-CURRENT #0 r366217M".

Running make in drm-kmod yields the error below.

How can I proceed?

-- Martin

...
cc  -O2 -pipe '-DKBUILD_MODNAME="drm"' '-DLINUXKPI_PARAM_PREFIX=drm_' -DDRM_SYSCTL_PARAM_PREFIX=_dri -DLINUXKPI_VERSION=50000 -DCONFIG_DRM_AMDGPU_CIK -DCONFIG_DRM_AMDGPU_SI -DCONFIG_DRM_AMD_DC -DCONFIG_DRM_AMD_DC_FBC -DCONFIG_DRM_AMD_POWERPLAY -DCONFIG_DRM_I915_ALPHA_SUPPORT -DCONFIG_DRM_I915_FORCE_PROBE='"*"' -DCONFIG_DRM_I915_CAPTURE_ERROR -DCONFIG_DRM_I915_SPIN_REQUEST=5 -DCONFIG_DRM_I915_USERFAULT_AUTOSUSPEND=250 -DCONFIG_DRM_LOAD_EDID_FIRMWARE -DCONFIG_DRM_MIPI_DSI -DCONFIG_DRM_PANEL_ORIENTATION_QUIRKS -DCONFIG_DRM_VMWGFX_FBCON -DCONFIG_DRM_DEBUG_MM -DCONFIG_DRM_FBDEV_EMULATION -DCONFIG_DRM_FBDEV_OVERALLOC=100 -DCONFIG_DRM_LEGACY -DCONFIG_DRM_VM -DCONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG -DCONFIG_BACKLIGHT_CLASS_DEVICE -DCONFIG_DMI -DCONFIG_FB -DCONFIG_MTRR -DCONFIG_PCI -DCONFIG_PM -DCONFIG_SMP -DCONFIG_ACPI -DCONFIG_ACPI_SLEEP -DCONFIG_AGP -DCONFIG_X86 -DCONFIG_X86_PAT -DCONFIG_64BIT -DCONFIG_AS_MOVNTDQA -DCONFIG_COMPAT -DCONFIG_X64_64 -DCONFIG_DRM_AMD_DC_DCN1_0 -DCONFIG_DRM_AMD_DC_DCN1_01 -DCONFIG_DRM_AMD_DC_DCN2_0 -DCONFIG_DRM_AMD_DC_DSC_SUPPORT  -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc  -I/usr/tmp/drm-kmod/linuxkpi/gplv2/include -I/auto/z/SRC/FreeBSD/head/sys/compat/linuxkpi/common/include -I/usr/tmp/drm-kmod/linuxkpi/dummy/include -I/usr/tmp/drm-kmod/drivers/gpu/drm -I/usr/tmp/drm-kmod/include -I/usr/tmp/drm-kmod/include/drm -I/usr/tmp/drm-kmod/include/uapi -I/usr/tmp/drm-kmod/drivers/gpu -include /usr/tmp/drm-kmod/drm/opt_global.h -I. 
-I/auto/z/SRC/FreeBSD/head/sys -I/auto/z/SRC/FreeBSD/head/sys/contrib/ck/include -fno-common  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fdebug-prefix-map=./machine=/auto/z/SRC/FreeBSD/head/sys/amd64/include -fdebug-prefix-map=./x86=/auto/z/SRC/FreeBSD/head/sys/x86/include     -MD  -MF.depend.drm_mm.o -MTdrm_mm.o -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -Wno-pointer-sign -Wno-format -Wno-format-zero-length   -mno-aes -mno-avx  -std=iso9899:1999 -c /usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c -o drm_mm.o
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:112:6: error: implicit declaration of function 'stack_trace_save' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
            ^
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:115:16: error: implicit declaration of function 'stack_depot_save' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        node->stack = stack_depot_save(entries, n, GFP_NOWAIT);
                      ^
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:115:16: note: did you mean 'stack_trace_save'?
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:112:6: note: 'stack_trace_save' declared here
        n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
            ^
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:136:16: error: implicit declaration of function 'stack_depot_fetch' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                nr_entries = stack_depot_fetch(node->stack, &entries);
                             ^
/usr/tmp/drm-kmod/drivers/gpu/drm/drm_mm.c:137:3: error: implicit declaration of function 'stack_trace_snprint' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                stack_trace_snprint(buf, BUFSZ, entries, nr_entries, 0);
                ^
4 errors generated.
*** Error code 1

Stop.
make[1]: stopped in /usr/tmp/drm-kmod/drm
*** Error code 1
Comment 8 Val Packett 2020-09-30 12:36:11 UTC
(In reply to Martin Birgmeier from comment #7)

Oh, I guess we don't have everything DRM_DEBUG_MM needs: the stack trace and stack depot functions it calls are not declared in this build.
I'll see if we can avoid using the stack depot things and still get some of the debug assertions here.

For now you can just test without DRM_DEBUG_MM.

> can I just compile the git repo using the native tools, without any ports installed (all in a VM running head, before transferring the result to the machine in question)?

Yeah, I'm pretty sure all you need is for /usr/src to match the kernel you'll be running. Ports should not matter.
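
A rough way to check that a source tree matches a kernel is to compare version numbers. The sketch below is an illustration, not an established procedure: src_osreldate and src_matches_kernel are made-up helper names, and the comparison is coarse, because __FreeBSD_version does not change with every commit. On FreeBSD, uname -K prints the version number of the running kernel.

```shell
#!/bin/sh
# Sketch only: src_osreldate and src_matches_kernel are hypothetical helper names.

# Print the __FreeBSD_version value defined in a source tree's sys/sys/param.h.
src_osreldate() {
    awk '$1 == "#define" && $2 == "__FreeBSD_version" { print $3; exit }' \
        "$1/sys/sys/param.h"
}

# Succeed if the source tree in $1 carries the version number given in $2.
# On FreeBSD, pass "$(uname -K)" to compare against the running kernel.
src_matches_kernel() {
    [ "$(src_osreldate "$1")" = "$2" ]
}

# Intended use:
#   src_matches_kernel /usr/src "$(uname -K)" ||
#       echo "source tree and running kernel differ"
```

When building in a VM for another machine, the second argument would instead be the version number of the kernel that will run on the target.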
Comment 9 Val Packett 2020-09-30 12:51:52 UTC
Okay, just change

#ifdef CONFIG_DRM_DEBUG_MM

to

#if defined(CONFIG_DRM_DEBUG_MM) && defined(__linux__)

in drivers/gpu/drm/drm_mm.c
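
For anyone who prefers to script that change, here is a minimal sketch. guard_debug_mm is a made-up helper name. It rewrites every line that is exactly "#ifdef CONFIG_DRM_DEBUG_MM" (no trailing whitespace) and writes through a temporary file rather than relying on sed -i, whose syntax differs between platforms.

```shell
#!/bin/sh
# Sketch only: guard_debug_mm is a hypothetical helper name.
# Rewrites each line that is exactly "#ifdef CONFIG_DRM_DEBUG_MM" so the
# guarded code is compiled only when __linux__ is defined as well.
guard_debug_mm() {
    src="$1"
    tmp="$src.tmp.$$"
    # "\&" in the replacement is a literal ampersand, not the matched text.
    sed 's/^#ifdef CONFIG_DRM_DEBUG_MM$/#if defined(CONFIG_DRM_DEBUG_MM) \&\& defined(__linux__)/' \
        "$src" > "$tmp" && mv "$tmp" "$src"
}

# Intended use, from the top of the drm-kmod checkout:
#   guard_debug_mm drivers/gpu/drm/drm_mm.c
```

Running git diff in the checkout afterwards shows exactly which lines were changed.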
Comment 10 Martin Birgmeier 2020-10-03 08:07:17 UTC
Created attachment 218482
boot messages with kernel from head@r366217

Status:
- Built kernel from head@r366217
- Built drm-kmod from 677e58f482149531cbe1ef30507c34be433edd09
- Replaced /boot/kernel and /boot/modules

Result:
- no zpools imported (not necessary for this test, as /usr/local is on / and that is UFS)
- no graphics - some microcode file seems to be missing?

-- Martin
Comment 11 Val Packett 2020-10-03 14:46:39 UTC
(In reply to Martin Birgmeier from comment #10)
Right, you need to rebuild https://github.com/FreeBSDDesktop/kms-firmware for the new kernel too. Sorry, forgot to mention.
Comment 12 Martin Birgmeier 2020-10-03 16:55:38 UTC
O.k. so the result now is negative - which is to say that there was no panic and therefore nothing that could be reported. I tried to tax it using firefox and scrolling and video playback, no untoward things happened (unlike with 12.1).

The system just worked (tm). ;-)

Except that it did not import the zfs pool, but I assume/hope this does not matter.

So where do we go from here? Will all this nice stuff be in 12.2? Or come as a ports update?

-- Martin
Comment 13 Val Packett 2020-10-04 18:27:21 UTC
(In reply to Martin Birgmeier from comment #12)
Hm. If you want to stay on 12.x kernels you could try the drm-v5.0-fbsd12.1 branch that's on the FreeBSDDesktop github. It's not actively supported though. I think someone made the decision to stick with 4.16-fbsd12.0 for the whole 12.x series. Maybe it's time to rethink that if 5.0 fixes radeon.

But wait. I don't think you've tried manually building 4.16 on the 12 kernel.
Maybe the code is fine but the build in the official packages is somehow busted/mismatched.
Comment 14 Niclas Zeising (freebsd_committer, freebsd_triage) 2020-10-05 07:58:51 UTC
(In reply to Val Packett from comment #13)

There are regressions with the 5.0 release, where the console doesn't always update as it should; that's the reason we stuck with 4.16 for 12.

Feel free to try the 5.0 branch, but be aware that it is not supported.
Comment 15 Patrick McMunn 2020-11-01 21:32:23 UTC
Created attachment 219285
core.txt.8 panic info

I'm running 12.2-RELEASE on a DELL XPS400. It's using the r300 Mesa driver. I have frequent kernel panics -- usually when trying to run any graphical application other than the Plasma 5 desktop, though even that has been enough to panic the system more than once. I also experience frequent flickering of the screen -- mostly when scrolling a browser page up or down -- and the screen doesn't necessarily refresh properly, so old images may remain where newer ones should appear. This also hinders ALT-TAB switching: I may have to press it 3-6 times or more just to switch between programs and get the new window to appear properly.
Comment 16 Emmanuel Vadot (freebsd_committer, freebsd_triage) 2021-11-19 14:57:36 UTC
Based on the comments, this is fixed on FreeBSD >= 13. drm-fbsd12-kmod isn't maintained, so closing this PR.