Bug 226495 - graphics/mesa-dri: GPU freeze on Sandy Bridge system
Summary: graphics/mesa-dri: GPU freeze on Sandy Bridge system
Status: Closed Feedback Timeout
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL: https://blog.sleeplessbeastie.eu/2014...
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-10 05:07 UTC by rkoberman
Modified: 2018-06-10 06:41 UTC

See Also:


Description rkoberman 2018-03-10 05:07:28 UTC
Since the last Mesa update, my ThinkPad T520 with Sandy Bridge graphics has been freezing intermittently. It has done so four times with nothing obvious in common, so it is quite infrequent. I am running the standard DRM, not drm-next. (I plan to try that shortly.)

The only way I have found to recover is to ssh in from my phone and kill xinit. I can then restart MATE and all is well again.

I log the following:
Mar  9 08:20:12 rogue kernel: error: [drm:pid1599:__gen6_gt_force_wake_get] *ERROR* Timed out waiting for forcewake to ack request.
Mar  9 08:20:12 rogue kernel: error: [drm:pid1599:__gen6_gt_wait_for_thread_c0] *ERROR* GT thread status wait timed out
Mar  9 08:20:12 rogue kernel: error: [drm:pid1599:gen6_gt_check_fifodbg] *ERROR* MMIO read or write has been dropped 3
Mar  9 08:21:12 rogue kernel: error: [drm:pid12:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar  9 08:21:12 rogue kernel: info: [drm] capturing error event; look for more information in sysctl hw.dri.0.info.i915_error_state
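
For anyone who wants to scan their own logs for the same pattern, here is a minimal Python sketch that picks the drm messages apart and flags the hangcheck event. The regular expression is inferred only from the five lines above, and the names `DRM_LINE`, `parse_drm_line`, and `find_gpu_hangs` are hypothetical helpers, not part of any FreeBSD tool.

```python
import re

# Assumption: the layout is inferred from the log lines quoted above
# (syslog timestamp, host, "kernel:", level, "[drm:pidN:function]", message).
DRM_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<level>\w+):\s+\[drm(?::pid(?P<pid>\d+):(?P<func>\w+))?\]\s+"
    r"(?P<msg>.*)$"
)


def parse_drm_line(line):
    """Return a dict of fields for a drm kernel message, or None if no match."""
    match = DRM_LINE.match(line)
    return match.groupdict() if match else None


def find_gpu_hangs(lines):
    """Return the parsed entries that were logged by i915_hangcheck_hung."""
    entries = (parse_drm_line(line) for line in lines)
    return [e for e in entries if e and e["func"] == "i915_hangcheck_hung"]
```

Feeding it the lines of /var/log/messages (for example `find_gpu_hangs(open("/var/log/messages"))`) would list each hang together with its timestamp, which makes it easier to check whether the forcewake errors always precede a freeze.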

I have a copy of the contents of hw.dri.0.info.i915_error_state from after the restart of MATE, though I have no idea whether it remains valid after X and MATE have been restarted. I will attach the text.

These errors (except the last two lines) do occur from time to time, but the recovery is usually quick, often too quick to be noticed and at other times causing a brief pause. I believe these happened on older DRM versions, but without the freezes.

The system is running 11-STABLE:
FreeBSD rogue 11.1-STABLE FreeBSD 11.1-STABLE #0 r330560: Wed Mar  7 09:09:42 PST 2018     root@rogue:/usr/obj/usr/src/sys/GENERIC.4BSD  amd64


info: [drm] Initialized drm 1.1.0 20060810
CPU: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz (2491.96-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,$
  Features2=0x1fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,C$
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
Comment 1 rkoberman 2018-03-10 05:20:44 UTC
The attachment was too big. The file is available at http://ykoberman.dlinkddns.com/FreeBSD/gpu_hang_sysctl.txt.
Comment 2 Carlos J. Puga Medina freebsd_committer freebsd_triage 2018-03-10 12:16:35 UTC
Hi Robert,

I also had this problem with the Intel Sandy Bridge GPU: the whole system was crashing while using Chromium.

Add the following to /boot/loader.conf:

# Enable power-saving idle states of the GPU
drm.i915.enable_rc6=7

# Enable semaphores
drm.i915.semaphores=1

If you are using Chromium, disable GPU hardware acceleration to avoid crashes (use OpenGL ES 2.0 SwiftShader instead).
Comment 3 rkoberman 2018-03-10 17:15:33 UTC
(In reply to Carlos J. Puga Medina from comment #2)
Thanks for the suggestions, Carlos.

I seem to have a dilemma. I need to have VT-d enabled to support my Windows 7 virtual system, so the question is whether VT-d with semaphores is more stable than VT-d without them.

From the blog, it is not clear to me that enabling rc6 is related to my problem or just a good thing to do. In any case, I had already been running with it set to '5'. There was a note that a bug in Sandy Bridge could lead to hangs if bit 1 was set, though a later note indicated that an update to the DRM code might disable that optimization on affected Sandy Bridge systems. This is from old memory as I have been running it at '5' for several years.
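
To make the difference between '5' and '7' concrete, here is a small Python sketch that decodes the value. It assumes the FreeBSD tunable uses the same bitmask as the Linux i915 `enable_rc6` parameter (1 = rc6, 2 = deep rc6, 4 = deepest rc6); `RC6_STAGES` and `decode_rc6` are hypothetical names used only for illustration.

```python
# Assumption: same bitmask meaning as the Linux i915 enable_rc6 parameter.
RC6_STAGES = {
    1: "rc6",
    2: "deep rc6",     # this is "bit 1", the stage mentioned as risky above
    4: "deepest rc6",
}


def decode_rc6(value):
    """Return the names of the RC6 stages enabled by a bitmask value."""
    return [name for bit, name in RC6_STAGES.items() if value & bit]


print(decode_rc6(7))  # ['rc6', 'deep rc6', 'deepest rc6']
print(decode_rc6(5))  # ['rc6', 'deepest rc6']
```

Under that assumption, a value of 7 turns on every stage including deep rc6, while 5 leaves bit 1 clear, so the two settings differ only in the stage that the note about Sandy Bridge hangs referred to.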

I'll monitor for errors and see if enabling semaphores makes things better or worse.

Finally, where did "Robert" come from? I have not used that name for almost 60 years, though that is my first name. I started using my middle name in first grade. I'd like to find the source of any such reference and squash it, if possible.

Again, thanks!
Kevin
Comment 4 Niclas Zeising freebsd_committer freebsd_triage 2018-05-20 08:25:54 UTC
Hi!
Have you tried the latest version of Mesa now in ports (18.0.4)?
If you are tracking FreeBSD stable or current, can you also switch to graphics/drm-stable-kmod or graphics/drm-next-kmod and see if the problem goes away?

Thanks!
Regards
-- 
Niclas
Comment 5 Niclas Zeising freebsd_committer freebsd_triage 2018-06-10 06:41:11 UTC
Feedback timeout.  If this is still an issue, please test with updated ports (mesa) and possibly drm-stable-kmod.  If that doesn't work, please re-open this or open a new PR.