Bug 232483 - loader crashes on "lsdev" command
Summary: loader crashes on "lsdev" command
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2018-10-20 18:05 UTC by Lev A. Serebryakov
Modified: 2019-04-21 04:16 UTC (History)
2 users (show)

See Also:


Attachments
lsdev output (329.80 KB, image/png)
2018-11-04 10:42 UTC, Toomas Soome
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lev A. Serebryakov freebsd_committer freebsd_triage 2018-10-20 18:05:43 UTC
loader (default one, for MBR/bsdlabel) built from stable/12 r339445 crashes right after "lsdev" console command.

Here is "Screenshot" (retyped from photo of monitor):

OK lsdev
disk devices:
  disk0: BIOS drive C(83363580 X512):
    disk0s1: FreeBSD
    disk0s2: FreeBSD
    disk0s3: FreeBSD
panic: free: guard1 fail @0xcaec9680 from unknown:0
--> Press a key on the console to reboot <--

This disk contains 4 (not 3) FreeBS slices.
Comment 1 Toomas Soome freebsd_committer freebsd_triage 2018-11-04 10:42:16 UTC
Created attachment 198931 [details]
lsdev output

I could not replicate the issue...
Comment 2 Lev A. Serebryakov freebsd_committer freebsd_triage 2018-11-05 21:21:37 UTC
How could I provide additional diagnostics? I could reproduce it 100%.
Comment 3 Toomas Soome freebsd_committer freebsd_triage 2018-11-06 07:07:59 UTC
(In reply to Lev A. Serebryakov from comment #2)
first of all, could you provide gpart list, so we could see how the table should look like, also, does lsdev -v produce the same panic?

the unknown:0 is also weird - it should be source file and line number where the free is called from, but of course we can guess that from code path.... 

if you press space on early spin, you can start other loader variants, do they produce the same crash?
Comment 4 Lev A. Serebryakov freebsd_committer freebsd_triage 2018-11-06 20:49:50 UTC
gpart show

=>      63  30031187  da0  MBR  (14G)
        63         1       - free -  (512B)
        64   1060232    1  freebsd  (518M)
   1060296   1060232    2  freebsd  [active]  (518M)
   2120528     16384    3  freebsd  (8.0M)
   2136912   8388608    4  freebsd  (4.0G)
  10525520  19505730       - free -  (9.3G)

=>      0  1060232  da0s1  BSD  (518M)
        0       16         - free -  (8.0K)
       16  1060216      1  !0  (518M)

=>      0  1060232  da0s2  BSD  (518M)
        0       16         - free -  (8.0K)
       16  1060216      1  !0  (518M)

lsdev -v crashes after showing 3 partitions (slices) too, with exactly same message.

I've tried  /boot/loader (which is loader_lua), /boot/loader_4th and /boot/loader_simp, results are exactly the same, with and without "-v" flag to lsdev. All three crashes at same place with same message (6 crashes in total).

Now I'm running 13.0-CURRENT r340152, but as far as I can see, nothing has been changed in this area beteween new 12 branch and head.

I could build loaded with any additional custom diagnostics, if it is needed.

And, BTW, here is my /boot/loader.conf:

====
# We don't have writeable / FS
entropy_cache_load="NO"

beastie_disable="YES"
autoboot_delay="3"

# Use serial
#boot_serial="YES"
#boot_multicons="YES"
#console="comconsole,vidconsole"
#comconsole_speed="115200"

kern.geom.label.disk_ident.enable=0

dumpdev="/dev/da0s4"

net.link.ifqmaxlen=16384
net.isr.maxthreads=2
net.isr.numthreads=2
net.isr.dispatch="direct"

net.graph.maxdata=4096

hw.em.max_interrupt_rate=32000

hint.acpi_throttle.0.disabled=1
hint.acpi_throttle.1.disabled=1
====
Comment 5 Toomas Soome freebsd_committer freebsd_triage 2018-11-07 10:25:56 UTC
Ok, I am in trouble there, I got:

=>      63  41942977  da2  MBR  (20G)
        63         1       - free -  (512B)
        64   1060864    1  freebsd  (518M)
   1060928   1060864    2  freebsd  (518M)
   2121792     16384    3  freebsd  (8.0M)
   2138176   8388608    4  freebsd  (4.0G)
  10526784  31416256       - free -  (15G)

=>      0  1060864  da2s1  BSD  (518M)
        0  1060864         - free -  (518M)

=>      0  1060864  da2s2  BSD  (518M)
        0  1060864         - free -  (518M)


so, how do I create the entries: 
=>      0  1060232  da0s1  BSD  (518M)
        0       16         - free -  (8.0K)
       16  1060216      1  !0  (518M)

That is partition 1 inside freebsd slice 1 and 2. Note, those *should* appear in lsdev output (for you) as: disk0s1p1 and disk0s2p1, but they are completely missing.

It seems to me, the best option there is to build loader with PART_DEBUG defined (you can set it in the stand/common/part.c, then check out what debug messages we will get and then we can see what kind of issue we are stepping on there.
Comment 6 Toomas Soome freebsd_committer freebsd_triage 2018-11-07 12:25:11 UTC
Anyhow, I was able to get loader panic with this partition table, so we can work on from there.
Comment 7 Toomas Soome freebsd_committer freebsd_triage 2018-11-07 13:50:35 UTC
proposed fix is in https://reviews.freebsd.org/D17890
Comment 8 Lev A. Serebryakov freebsd_committer freebsd_triage 2018-11-07 14:07:52 UTC
I'll try it 8 or 9 hours later, thnx!
Comment 9 commit-hook freebsd_committer freebsd_triage 2018-11-07 21:37:30 UTC
A commit references this bug:

Author: tsoome
Date: Wed Nov  7 21:36:52 UTC 2018
New revision: 340240
URL: https://svnweb.freebsd.org/changeset/base/340240

Log:
  loader: ptable_open() check for ptable_cd9660read result is wrong

  The ptable_*read() functions return NULL on read errors (and partition table
  closed as an side effect). The ptable_open must check the return value and
  act properly.

  PR:		232483
  Reported by:	lev
  Reviewed by:	lev,cem
  Differential Revision:	https://reviews.freebsd.org/D17890

Changes:
  head/stand/common/part.c
Comment 10 Lev A. Serebryakov freebsd_committer freebsd_triage 2018-11-28 19:56:00 UTC
Should it be closed or we are waiting for MFC?
Comment 11 commit-hook freebsd_committer freebsd_triage 2018-11-29 13:01:58 UTC
A commit references this bug:

Author: tsoome
Date: Thu Nov 29 13:01:22 UTC 2018
New revision: 341225
URL: https://svnweb.freebsd.org/changeset/base/341225

Log:
  MFC: r340240:
  loader: ptable_open() check for ptable_cd9660read result is wrong

  The ptable_*read() functions return NULL on read errors (and partition table
  closed as an side effect). The ptable_open must check the return value and
  act properly.

  PR:		232483
  Reported by:	lev
  Reviewed by:	lev,cem
  Differential Revision:	https://reviews.freebsd.org/D17890

Changes:
_U  stable/12/
  stable/12/stand/common/part.c
Comment 12 commit-hook freebsd_committer freebsd_triage 2019-04-21 04:16:41 UTC
A commit references this bug:

Author: kevans
Date: Sun Apr 21 04:15:58 UTC 2019
New revision: 346480
URL: https://svnweb.freebsd.org/changeset/base/346480

Log:
  MFC r338262, r339334, r339796, r340240, r340857, r340917, r341007

  r338262:
  stand: fdt: Drop some write-only assignments/variables and leaked bits

  Generally straightforward enough; a copy of argv[1] was being made in
  command_fdt_internal, solely used for a comparison within the
  handler-search, then promptly leaked.

  r339334:
  loader.efi: add poweroff command

  Add poweroff command to make life a bit easier.

  r339796:
  Simplify the EFI delay() function by calling BS->Stall()

  r340240:
  loader: ptable_open() check for ptable_cd9660read result is wrong

  The ptable_*read() functions return NULL on read errors (and partition table
  closed as an side effect). The ptable_open must check the return value and
  act properly.

  r340857:
  Nuke out buffer overflow safety marker code, it duplicates similar code in
  the malloc()/free() as well as having potential of softening the handling
  in case error is detected down to a mere warning as compared to hard panic
  in free().

  r340917:
  Update pxeboot(8) manual page to reflect the next-server change in the ISC DHCP v3 server.

  r341007:
  Bump the date of pxeboot(8) manual page for r340917.

  PR:		123484, 232483

Changes:
_U  stable/11/
  stable/11/stand/common/bcache.c
  stable/11/stand/common/part.c
  stable/11/stand/efi/libefi/delay.c
  stable/11/stand/efi/loader/main.c
  stable/11/stand/fdt/fdt_loader_cmd.c
  stable/11/stand/i386/pxeldr/pxeboot.8