Bug 237642 - x11-drivers/xf86-video-ati: Upgrade from 18.1.0 to 19.0.0 results in invisible mouse pointer
Summary: x11-drivers/xf86-video-ati: Upgrade from 18.1.0 to 19.0.0 results in invisibl...
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-x11 mailing list
URL: https://gitlab.freedesktop.org/xorg/d...
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-29 12:22 UTC by Jason W. Bacon
Modified: 2020-04-16 17:49 UTC (History)
7 users (show)

See Also:
zeising: merge-quarterly?


Attachments
fix ati-legacy with xserver 1.20.7 (2.48 KB, patch)
2020-02-24 09:33 UTC, Niclas Zeising
no flags Details | Diff
files/patch-src_drmmode__display.c (702 bytes, patch)
2020-04-14 07:14 UTC, Alexey Dokuchaev
no flags Details | Diff
final (hopefully) fix (1.39 KB, patch)
2020-04-14 15:52 UTC, Niclas Zeising
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jason W. Bacon freebsd_committer 2019-04-29 12:22:52 UTC
Following a routine pkg upgrade, a couple of my systems no longer had a visible mouse pointer.  X11 was otherwise fully functional.  If I could blindly get the mouse over a button and click, it responded normally.

Both systems have Radeon GPUs and I noticed that xf86-video-ati had been recently upgraded, so I reverted to 18.1.0 and this fixed the issue.

My hardware is quite old and I found that xf86-video-ati-legacy also works for me.  I do not know whether this is a viable solution for everyone, though.

I added a modified v18.1.0 port to my WIP collection:

https://github.com/outpaddling/freebsd-ports-wip

It has PKGNAMESUFFIX=-18 so that pkg upgrade won't replace it with the broken v19.0.0.
Comment 1 Citroën 2019-05-07 14:53:55 UTC
Same here.

Defining "SWcursor "on" in Xorg configuration file(s) restores the pointer, alas soft. The default hardware pointer used to show in previous version of xf86-video-ati.

Mouse (USB) works normally, as reported.
Comment 3 Citroën 2019-05-12 15:28:18 UTC
Just built 19.0.1,1 from ports.
Same as before. No hardware cursor, only software. Mouse otherwise OK.
Comment 4 Andrew 2019-10-13 18:27:40 UTC
I just detected the same problem after upgrading xf86-video-ati from 18.1.0,1 to 19.0.1,1 on a Compaq Evo N620c laptop (good old days.. :) ).

Here is its GPU model:

% pciconf -lv vgapci0
vgapci0@pci0:1:0:0:	class=0x030000 card=0x00580e11 chip=0x4c571002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RV200/M7 [Mobility Radeon 7500]'
    class      = display
    subclass   = VGA
Comment 5 cn1337 2020-02-21 19:20:51 UTC
I had the same problem and used the legacy driver (x11-drivers/xf86-video-ati-legacy) as a workaround.

After I just upgraded x11-servers/xorg-server to 1.20.7 I can no longer build 
the legacy driver. 


My device:
pciconf -lv vgapci0
vgapci0@pci0:1:0:0:	class=0x030000 card=0x20031787 chip=0x944c1002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RV770 LE [Radeon HD 4830]'
    class      = display
    subclass   = VGA
Comment 6 Tatsuki Makino 2020-02-23 05:00:28 UTC
My radeon may be having the same problem, but I can't confirm it because "xsetroot -cursor_name X_cursor" is written in ~/.xinitrc.
Comment 7 Niclas Zeising freebsd_committer 2020-02-24 09:33:10 UTC
Created attachment 211884 [details]
fix ati-legacy with xserver 1.20.7

(In reply to cn1337 from comment #5)

I have a patch for the ati-legacy driver, it is attached here.  I don't know if it works though, so please test it.
Comment 8 Niclas Zeising freebsd_committer 2020-02-24 09:34:00 UTC
WHich DRM driver are those of you with this issue using?
Comment 9 cn1337 2020-03-05 14:45:52 UTC
(In reply to Niclas Zeising from comment #7)
Thanks for the patch and your efforts. I can build the legacy drivers with it.

BUT:
I no longer need to. With xserver 1.20.7 I can user the regular driver (xf86-video-ati-19.1.0_1,1) and everything works fine. 

Try using the regular driver after you upgrade to 1.20.7 it might resolve this issue.
Comment 10 Tatsuki Makino 2020-03-05 21:25:49 UTC
The current version of xf86-video-ati should be combined with drm-fbsd12.0-kmod-4.16.g20200221 or later if 12 series is used.
As a result, I have not yet encountered kernel panic. :)
Comment 11 Gavin Atkinson freebsd_committer freebsd_triage 2020-03-26 19:59:27 UTC
I seem to be hitting the same issue on 12.1-STABLE r359108 with:

drm-legacy-kmod-g20200306
gpu-firmware-kmod-g20200130
xf86-video-ati-19.1.0_2,1
xorg-server-1.20.7_2,1

(all installed from packages), on:

vgapci0@pci0:1:0:0:	class=0x030000 card=0x90181028 chip=0x95c51002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'RV620 LE [Radeon HD 3450]'

(note that contrary to https://wiki.freebsd.org/Graphics/AMD-GPU-Matrix I need to use drm-legacy-kmod and not drm-kmod for this otherwise the machine hangs at boot).

Setting "SWcursor "on" is a workaround for me too.
Comment 12 Alexey Dokuchaev freebsd_committer 2020-03-27 08:47:01 UTC
(In reply to Gavin Atkinson from comment #11)
> I need to use drm-legacy-kmod and not drm-kmod for this otherwise
> the machine hangs at boot).
FWIW, nothing except drm-legacy-kmod had never worked for me as well, locking up the machine upon "kldload radeonkms".  This is observed on RV710 (Radeon HD 4350/4550) and Richland (Radeon HD 8550G).
Comment 13 Alexey Dokuchaev freebsd_committer 2020-04-13 13:16:24 UTC
Diffing the xf86-video-ati-{18.1.0,19.0.1}/ChangeLog files shows these three cursor-related upstream commits:

  803f872f7d4b2d80be434bb42ce64dfd295b122c	Use two HW cursor buffers ...
  91e557f78ad261e76a1829f54722c2c0781742d2	Update cursor position in drm...
  92df709786830d4e30a106dd49d8e0355c50c8f0	Use drmIoctl in drmmode_show...

Reverting the last 92df709786830d4e30a106dd49d8e0355c50c8f0 change gives me back the HW cursor.  Unfortunately, it requires reverting two other commits as well, but the culprit seems to be somewhere along the conversion from drmModeSetCursor[2]() to drmIoctl() in the drmmode_show_cursor().  Reverting just the first two commits does not help.
Comment 14 Alexey Dokuchaev freebsd_committer 2020-04-14 07:14:06 UTC
Created attachment 213384 [details]
files/patch-src_drmmode__display.c

PoC patch which replaces drmIoctl() calls back with drmModeSetCursor[2]() ones.  This is enough to restore the HW cursor with version 19.1.0, but the root cause of the bug probably lies deeper.
Comment 15 Niclas Zeising freebsd_committer 2020-04-14 07:37:10 UTC
(In reply to Alexey Dokuchaev from comment #14)

Thank you!
I wonder if it's possible to look into what drmIoctl does, perhaps it's something missing there that we need to implement.
Comment 16 Niclas Zeising freebsd_committer 2020-04-14 07:58:31 UTC
(In reply to Niclas Zeising from comment #15)
I had a quick look, and drmIoctl() really just does an ioctl() call into the driver.  Looking at the code, the old and the new code looks semantically the same...  Can you test just swapping one of the calls at the time, and see if it is ok, or if both are needed?
Comment 17 Jan Beich freebsd_committer 2020-04-14 08:14:01 UTC
drmModeSetCursor2() appears to discard DRM_MODE_CURSOR_MOVE from https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/commit/91e557f78ad2
Comment 18 Niclas Zeising freebsd_committer 2020-04-14 08:26:50 UTC
(In reply to Jan Beich from comment #17)

In the current code, before the suggested patch from danfe, drmModeSetCursor2 is not called anywhere.

I'm not sure what you mean by it discarding DRM_MODE_CURSOR_MOVE, can you explain?  It is added to the argument structure in line 1252, but I can't see it ever being removed from the flags.
Comment 19 Niclas Zeising freebsd_committer 2020-04-14 08:33:10 UTC
both the call to drmIoctl(DRM_MODE_CURSOR_MOVE) and drmIoctl(DRM_MODE_CURSOR_MOVE2) in drmmode_show_cursor() use the same argument structure. This is then passed directly to the ioctl() call.  I wonder if this causes issues in our ioctl code, since the two different ioctls use different structures in the drm kernel driver.

When using drmModeSetCursor and drmModeSetCursor2, the correct structure is used.

If this is the cause, then switching out drmIoctl(DRM_MODE_CURSOR_MOVE) to drmModeSetCursor() should be enough to fix the issue, maybe.  That, and to figure out which drmIoctl() call that's causing the issue, is why I want danfe to test only replacing one call at the time.
Comment 20 Jan Beich freebsd_committer 2020-04-14 08:40:11 UTC
(In reply to Niclas Zeising from comment #18)
When danfe@ changed to drmModeSetCursor2 and drmModeSetCursor modified arg.flags are no longer used, so it's always DRM_MODE_CURSOR_BO instead of DRM_MODE_CURSOR_BO | DRM_MODE_CURSOR_MOVE.
Comment 21 Alexey Dokuchaev freebsd_committer 2020-04-14 08:44:17 UTC
I've arrived at the similar conclusions about the same structure (re/ab)use in the code.  I've observed that making only the first replacement (drmModeSetCursor2) and leaving the last drmIoctl() near the function exit is enough to fix the cursor.  Replacing just the last call alone does not help.
Comment 22 Alexey Dokuchaev freebsd_committer 2020-04-14 08:51:28 UTC
(In reply to Jan Beich from comment #20)
> always DRM_MODE_CURSOR_BO instead of DRM_MODE_CURSOR_BO | DRM_MODE_CURSOR_MOVE
I've commented out //arg.flags |= DRM_MODE_CURSOR_MOVE; line and left both drmIoctl()s, it did not help.
Comment 23 Niclas Zeising freebsd_committer 2020-04-14 08:59:52 UTC
(In reply to Alexey Dokuchaev from comment #21)

Which is strange, as I thought the opposite would be true considering the struct abuse.  I wonder if the last call is never reached...  Any chance you can check the flags value before the call, or explicitly set it to just DRM_MODE_CURSOR_BO before the call to drmIoctl()?  Although, this seem to be explicitly set earlier in the drm_show_cursor() function.
Comment 24 Niclas Zeising freebsd_committer 2020-04-14 09:03:41 UTC
(In reply to Niclas Zeising from comment #23)

One other difference is that in libdrm, drmModeSetCursor() and drmModeSetCursor2() both call DRM_IOCTL() to do the actual ioctl, and DRM_IOCTL calls drmIoctl(), but checks if the return from drmIoctl() is negative, and if it is negative, retruns -errno instead of the return value from drmIoctl().  I guess you could try swapping drmIoctl() to DRM_IOCTL() and see if that works.  But now I'm only guessing.
Comment 25 Alexey Dokuchaev freebsd_committer 2020-04-14 09:24:12 UTC
(In reply to Niclas Zeising from comment #24)
> But now I'm only guessing.
You guess it correct. :-)  If I replace the first call with DRM_IOCTL() as defined in libdrm's xf86drmMode.c, the cursor is back.  Or if I check if (ret < 0) instead of if (ret == -EINVAL), leaving drmIoctl() as is -- this also fixes the problem.
Comment 26 Niclas Zeising freebsd_committer 2020-04-14 09:49:53 UTC
(In reply to Alexey Dokuchaev from comment #25)

So, something we do wrt. return values from the ioctl() is different from what linux does.  Is there any way to check the return value and print/log it somewhere?  I think you can use xf86DrvMsg for this.  You'll find how they work starting here: https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/hw/xfree86/common/xf86Helper.c#L999

Something like xf86DrvMsg(pScrn->scrnIndex, X_INFO, ...) could work, I guess.  The rest of the call is a regular printf() format string I think.

I'm wondering if the call DRM_IOCTL_MODE_CURSOR2 is failing, but before your patch the code path that sets use_set_cursor2 to false isn't taken.  With your patch we instead fall back to use DRM_IOCTL_MODE_CURSOR, and that ioctl works

Thanks for helping out!
Comment 27 Jan Beich freebsd_committer 2020-04-14 09:55:06 UTC
(In reply to Niclas Zeising from comment #26)
> I'm wondering if the call DRM_IOCTL_MODE_CURSOR2 is failing

drm-legacy-kmod doesn't support DRM_IOCTL_MODE_CURSOR2, so drmIoctl would likely return EINVAL. I suspect on Linux no one tests such old kernels.
Comment 28 Alexey Dokuchaev freebsd_committer 2020-04-14 10:01:42 UTC
(In reply to Niclas Zeising from comment #26)
> Is there any way to check the return value and print/log it somewhere?
Somewhat expectedly, return value is -1.
Comment 29 Niclas Zeising freebsd_committer 2020-04-14 10:05:23 UTC
(In reply to Jan Beich from comment #27)

So, the question is if it returns EINVAL or -EINVAL, or something else.  If it returns -EINVAL, then the code as-is should work.  Interestingly, as danfe reports, changing to check for any negative return works, which would catch -EINVAL.

I'm also curious if this is a problem only with drm-legacy-kmod, or also with drm-kmod.
Comment 30 Niclas Zeising freebsd_committer 2020-04-14 10:12:19 UTC
(In reply to Alexey Dokuchaev from comment #28)

So, what the code probably should do, on FreeBSD, is
if (ret == -1 && errno == -EINVAL) maybe.

I wonder if drmIoctl() in libdrm needs to be rewritten on FreeBSD, to return errno or -errno rather than the return value of the actual ioctl() call.
Comment 31 Alexey Dokuchaev freebsd_committer 2020-04-14 10:26:56 UTC
> I'm also curious if this is a problem only with drm-legacy-kmod,
> or also with drm-kmod.
I've tried building graphics/drm-fbsd11.2-kmod on my 13-CURRENT, which is based on Linux 4.11 code and thus should suck less than 4.16 one, but that was wishful thinking.  There were some error messages on the console before the screen went blank and machine locked up, I'd have to see if this contributes to the problem.

> I wonder if drmIoctl() in libdrm needs to be rewritten on FreeBSD,
> to return errno or -errno rather than the return value of the actual
> ioctl() call.
No.  drmIoctl() should follow ioctl() semantics and return -1 while passing the error code in errno.  Introducing DRM_IOCTL() was kind of ugly, but breaking drmIoctl() would be even worse.

The previous code was correct; when Michel Dänzer decided to bypass existing API and call drmIoctl() directly, he should have updated the code to check for error condition properly instead of comparing return value with what would be in the errno.
Comment 32 Niclas Zeising freebsd_committer 2020-04-14 10:34:08 UTC
(In reply to Alexey Dokuchaev from comment #31)

Ignore drm-kmod for now then.  There have been several changes to FreeBSD and to lkpi so that drm-fbsd11.2-kmod will not build.

I'm still not sure about drmIoctl(), I asked some xorg devs, and they were unsure about the return from linux ioctl().

In any case, does it work if you change the xf86-video-ati code to check for ret == -1 && errno == -EINVAL (or maybe EINVAL)?
Comment 33 Alexey Dokuchaev freebsd_committer 2020-04-14 10:36:58 UTC
(In reply to Niclas Zeising from comment #30)
> So, what the code probably should do, on FreeBSD, is
> if (ret == -1 && errno == -EINVAL) maybe.
Not just on FreeBSD.  This is correct POSIX semantics (modulo the errno negation).  Linux' ioctl(2) manpage says that "a few ioctl() requests use the return value as an output parameter and return a nonnegative value on success" but that's utterly broken.  Ideally, drmIoctl() should be fixed to always behave per POSIX.
Comment 34 Alexey Dokuchaev freebsd_committer 2020-04-14 10:42:43 UTC
(In reply to Niclas Zeising from comment #32)
> There have been several changes to FreeBSD and to lkpi so that
> drm-fbsd11.2-kmod will not build.
It builds just fine with couple of patches.  I suspect something goes wrong when initializing GPU firmware (gpu-firmware-kmod port is installed).

> I'm still not sure about drmIoctl(), I asked some xorg devs, and they
> were unsure about the return from linux ioctl().
If would be nice if you could convince them to read and follow *BSD or POSIX manpages, not Linux' ones. :-)

> In any case, does it work if you change the xf86-video-ati code to check
> for ret == -1 && errno == -EINVAL (or maybe EINVAL)?
I'll test and report shortly.
Comment 35 Niclas Zeising freebsd_committer 2020-04-14 10:49:51 UTC
(In reply to Alexey Dokuchaev from comment #34)

I think convincing Linux kernel people to change the behavior of ioctl() is quite futile. :)

As you say, it is probably wrong to change drmIoctl() as well, the caller has to be responsible for actually checking errno.

Other parts of libdrm does this for you with the DRM_IOCTL() function, but not drmIoctl() directly.  This is probably why it worked before.

Big thanks for all the help so far!
Comment 36 Alexey Dokuchaev freebsd_committer 2020-04-14 11:12:49 UTC
(In reply to Niclas Zeising from comment #35)
> I think convincing Linux kernel people to change the behavior of ioctl()
> is quite futile. :)
I was talking about convincing X.Org devs to follow POSIX, not Linux kernel people. :-)

> Other parts of libdrm does this for you with the DRM_IOCTL() function, but
> not drmIoctl() directly.  This is probably why it worked before.
Right.  Anyway, changing just the check on line 1278 to "if (ret == -1 && errno == EINVAL)" restores correct behavior (HW cursor is back).
Comment 37 Niclas Zeising freebsd_committer 2020-04-14 11:18:50 UTC
(In reply to Alexey Dokuchaev from comment #36)

Then that's the patch we'll use, and submit upstream as well.  It seems to be the most correct one as well.  Can you submit it upstream, or do you want me to do it?
Comment 38 Alexey Dokuchaev freebsd_committer 2020-04-14 11:26:54 UTC
(In reply to Niclas Zeising from comment #37)
I can try myself first, and ask for help if I do something wrong.  Is it here? https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/-/issues
Comment 39 Niclas Zeising freebsd_committer 2020-04-14 15:52:11 UTC
Created attachment 213392 [details]
final (hopefully) fix

Hi!
Big thanks for all the help danfe!
Attached is a patch that implements the last iteration of solutions we discussed. I'll commit it later today, but posting here for a quick review beforehand.
Comment 40 commit-hook freebsd_committer 2020-04-15 19:55:00 UTC
A commit references this bug:

Author: zeising
Date: Wed Apr 15 19:54:50 UTC 2020
New revision: 531787
URL: https://svnweb.freebsd.org/changeset/ports/531787

Log:
  x11-drivers/xf86-video-ati: Fix missing hw cursor

  Add an upstream patch (submitted by myself and danfe) to fix the issue where
  the hw cursor sometimes does not work when using x11-drivers/xf86-video-ati.
  Big thanks to danfe@ for debugging and testing and figuring out what's going
  on.

  For details on the change, see upstream issue:
  https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/-/issues/190

  PR:		237642
  Reported by:	jwb
  MFH:		2020Q2

Changes:
  head/x11-drivers/xf86-video-ati/Makefile
  head/x11-drivers/xf86-video-ati/distinfo
Comment 41 Niclas Zeising freebsd_committer 2020-04-15 19:56:59 UTC
Committed the fix (as it landed upstream).
Awaiting MFH

Big thanks once again to danfe@ for figuring out what's going on, and for help with testing!
Comment 42 Jason W. Bacon freebsd_committer 2020-04-15 21:04:36 UTC
(In reply to Niclas Zeising from comment #41)

VERY big thanks!
Comment 43 commit-hook freebsd_committer 2020-04-16 17:48:47 UTC
A commit references this bug:

Author: zeising
Date: Thu Apr 16 17:47:56 UTC 2020
New revision: 531865
URL: https://svnweb.freebsd.org/changeset/ports/531865

Log:
  MFH: r531787

  x11-drivers/xf86-video-ati: Fix missing hw cursor

  Add an upstream patch (submitted by myself and danfe) to fix the issue where
  the hw cursor sometimes does not work when using x11-drivers/xf86-video-ati.
  Big thanks to danfe@ for debugging and testing and figuring out what's going
  on.

  For details on the change, see upstream issue:
  https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/-/issues/190

  PR:		237642
  Reported by:	jwb

  Approved by:	ports-secteam (joenum)

Changes:
_U  branches/2020Q2/
  branches/2020Q2/x11-drivers/xf86-video-ati/Makefile
  branches/2020Q2/x11-drivers/xf86-video-ati/distinfo
Comment 44 Niclas Zeising freebsd_committer 2020-04-16 17:49:06 UTC
MFH Done