Bug 135690 - [panic] [ata] ufs_dirbad: /backuphd: bad dir ino 22259126 at offset 0: mangled entry
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
Keywords: crash
Reported: 2009-06-18 08:00 UTC by takeda
Modified: 2024-01-04 09:40 UTC

Description takeda 2009-06-18 08:00:06 UTC
Ok... so this is a bit complicated...

Output from kgdb:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
ad2: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=355843068
panic: ufs_dirbad: /backuphd: bad dir ino 22259126 at offset 0: mangled entry
KDB: stack backtrace:
db_trace_self_wrapper(c0809ee8,c0882620,c081def3,d139f930,d139f930,...) at db_trace_self_wrapper+0x26
panic(c081def3,c26c84dc,153a5b6,0,c081dfb1,...) at panic+0xf8
ufs_dirbad(c2d01900,0,c081dfb1,0,d139f9cc,...) at ufs_dirbad+0x73
ufs_lookup(d139f9f8,d139f9f8,d139fbcc,d139fbb8,c2b49200,...) at ufs_lookup+0x4bd
vfs_cache_lookup(d139fa84,c05f5271,2,c2d0fe04,d139faa4,...) at vfs_cache_lookup+0xf2
VOP_LOOKUP_APV(c085a940,d139fa84,d139fbcc,c080fb81,2a9,...) at VOP_LOOKUP_APV+0x3d
lookup(d139fba4,c24d5400,0,d139fbc0,c2d0fe8c,...) at lookup+0x50f
namei(d139fba4,d139fb44,60,0,c2ad98c0,...) at namei+0x3a8
kern_stat(c2ad98c0,bfbfd710,0,d139fc14,52,...) at kern_stat+0x3d
stat(c2ad98c0,d139fcf8,8,c05a18b5,c2ad98c0,...) at stat+0x2f
syscall(d139fd38) at syscall+0x208
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (188, FreeBSD ELF32, stat), eip = 0x206fa46b, esp = 0xbfbfc14c, ebp = 0xbfbfc168 ---
Uptime: 1h23m45s
Physical memory: 367 MB
Dumping 77 MB: 62 46 30 14

Reading symbols from /boot/kernel/mac_seeotheruids.ko...Reading symbols from /boot/kernel/mac_seeotheruids.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/mac_seeotheruids.ko
Reading symbols from /boot/kernel/geom_journal.ko...Reading symbols from /boot/kernel/geom_journal.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_journal.ko
Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0  doadump () at pcpu.h:196
196             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc05824eb in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc058288c in panic (fmt=0xc081def3 "ufs_dirbad: %s: bad dir ino %lu at offset %ld: %s")
    at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0768733 in ufs_dirbad (ip=Variable "ip" is not available.
) at /usr/src/sys/ufs/ufs/ufs_lookup.c:607
#4  0xc076992d in ufs_lookup (ap=0xd139f9f8) at /usr/src/sys/ufs/ufs/ufs_lookup.c:297
#5  0xc05f2b52 in vfs_cache_lookup (ap=0xd139fa84) at vnode_if.h:83
#6  0xc07e0bfd in VOP_LOOKUP_APV (vop=0xc085ae60, a=0xd139fa84) at vnode_if.c:99
#7  0xc05f970f in lookup (ndp=0xd139fba4) at vnode_if.h:57
#8  0xc05fa788 in namei (ndp=0xd139fba4) at /usr/src/sys/kern/vfs_lookup.c:219
#9  0xc0608b3d in kern_stat (td=0xc2ad98c0, path=0xbfbfd710 <Address 0xbfbfd710 out of bounds>, pathseg=UIO_USERSPACE,
    sbp=0xd139fc14) at /usr/src/sys/kern/vfs_syscalls.c:2113
#10 0xc0608d3f in stat (td=0xc2ad98c0, uap=0xd139fcf8) at /usr/src/sys/kern/vfs_syscalls.c:2097
#11 0xc07cbb38 in syscall (frame=0xd139fd38) at /usr/src/sys/i386/i386/trap.c:1090
#12 0xc07b50f0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#13 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up
#1  0xc05824eb in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
418                     doadump();
(kgdb) up
#2  0xc058288c in panic (fmt=0xc081def3 "ufs_dirbad: %s: bad dir ino %lu at offset %ld: %s")
    at /usr/src/sys/kern/kern_shutdown.c:574
574             boot(bootopt);
(kgdb) up
#3  0xc0768733 in ufs_dirbad (ip=Variable "ip" is not available.
) at /usr/src/sys/ufs/ufs/ufs_lookup.c:607
607                     panic("ufs_dirbad: %s: bad dir ino %lu at offset %ld: %s",
(kgdb) up
#4  0xc076992d in ufs_lookup (ap=0xd139f9f8) at /usr/src/sys/ufs/ufs/ufs_lookup.c:297
297                             ufs_dirbad(dp, dp->i_offset, "mangled entry");
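
For readers unfamiliar with this panic: frame #4 shows ufs_lookup calling ufs_dirbad with "mangled entry", which happens when a directory entry read from disk fails basic consistency checks. Below is a minimal sketch of that kind of check, loosely modeled on the record-length tests in ufs_lookup.c. It is not the kernel code; the name `looks_mangled`, the little-endian 8-byte header layout, and the 512-byte directory block size are assumptions made for illustration.

```python
import struct

DIRBLKSIZ = 512  # assumed size of one directory block
HEADER = struct.Struct("<IHBB")  # assumed layout: d_ino, d_reclen, d_type, d_namlen


def dirsiz(namlen):
    """Smallest valid record: 8-byte header plus NUL-terminated name, padded to 4 bytes."""
    return HEADER.size + ((namlen + 1 + 3) & ~3)


def looks_mangled(block, offset):
    """Return True if the directory entry at `offset` fails basic length checks."""
    d_ino, d_reclen, d_type, d_namlen = HEADER.unpack_from(block, offset)
    room = DIRBLKSIZ - (offset & (DIRBLKSIZ - 1))  # bytes left in this directory block
    if d_reclen == 0 or d_reclen > room:
        return True  # record is empty or runs past the end of the block
    if d_reclen & 0x3:
        return True  # record length is not a multiple of 4
    if d_reclen < dirsiz(d_namlen):
        return True  # record is too short to hold its own name
    return False
```

With these assumptions, a well-formed "." entry (record length 12, name length 1) passes, while a block consisting only of zero bytes fails the first check because its record length is 0.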


I have two disks.
The disk that holds the empty /backuphd mount-point directory uses UFS2 (no soft updates):
/dev/ad0s1a on / (ufs, NFS exported, local)

On top of that I have a partition from another disk (/dev/ad2); I was experimenting and set up journaling on it. That disk is mounted as /backuphd.

Now, perhaps this isn't really related to the issue, but for completeness' sake I'll mention it. I have one directory called "david" that I just created to back up files from a Windows machine. I also have samba3 set up for sharing my home directory. I didn't feel like creating a separate share for this, so I simply made a symlink to that directory.

Here's the layout
/dev/ad0s1a on / (ufs, NFS exported, local)
devfs on /dev (devfs, local)
/dev/ad0s1d on /var (ufs, local, with quotas, soft-updates)
/dev/ad0s1e on /tmp (ufs, local, soft-updates)
/dev/ad0s1f on /usr (ufs, NFS exported, local, with quotas, soft-updates)
devfs on /var/named/dev (devfs, local)
devfs on /var/db/dhcpd/dev (devfs, local)
mayumi:/usr/obj on /usr/obj (nfs)
/dev/ad2.journal on /backuphd (ufs, asynchronous, NFS exported, local, gjournal)

The drives seem to be clean:
[chinatsu]:/usr/obj/usr/src/sys/CHINATSU# fsck /
** /dev/ad0s1a (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
11727 files, 200752 used, 53063 free (2799 frags, 6283 blocks, 1.1% fragmentation)

[chinatsu]:/usr/obj/usr/src/sys/CHINATSU# fsck /backuphd
** /dev/ad2.journal (NO WRITE)
** Last Mounted on /backuphd
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
401506 files, 91325079 used, 144677463 free (16119 frags, 18082668 blocks, 0.0% fragmentation)

The filesystems are clean, as you can see. I haven't repaired them manually since the last crash; all that ran was the standard check that happens when a filesystem wasn't properly dismounted.

Fix: No idea

How-To-Repeat: I don't know the exact steps; it seems random.

This crash happened about 3 times within an hour. I stopped accessing the drive after that, and the system has now been running for 11 hours. It had an uptime of 70 days or so before the crash.

It feels like the crash was caused by scanning the contents of a directory, though that's just my assumption.

Basically, the first crash happened after a night of copying files. In the morning it looked like the copy was done, so I went to Total Commander (a Windows file manager; remember, this was through Samba) to see the files, and instantly I heard the computer restarting. Total Commander refreshes the directory listing whenever it is brought to the foreground.
Comment 1 Volker Werth freebsd_committer freebsd_triage 2009-06-18 13:27:06 UTC
State Changed
From-To: open->feedback

Derek,
nothing complicated here. It seems like your drive ad2 is bad. Can you please check
your drive for faults using S.M.A.R.T. (sysutils/smartmontools)? Either that,
a broken ATA controller, or an ATA driver bug might have caused your panic.
You may want to play with DMA timeouts, but often that doesn't help. One possible
workaround is to disable ATA DMA completely, but that will reduce performance
_a lot_.
I am inclined to close this PR but am waiting for you to check your drive...
Comment 2 takeda 2009-06-18 19:35:06 UTC
I have smartd running and it scans the disk periodically. Here's its
current output. I ran a short scan and it didn't report any errors. I'm
running the extended one now, but for some reason it doesn't show progress.
I'll check in 4 hours whether it is indeed working... Though the disk seems to
be operating (it isn't going to sleep as it usually does when it is not
in use).

Here's output as of now:

smartctl version 5.38 [i386-portbld-freebsd7.1] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar SE family
Device Model:     WDC WD5000AAKB-00YSA0
Serial Number:    WD-WCAS84876881
Firmware Version: 12.01C02
User Capacity:    500 107 862 016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jun 18 11:30:39 2009 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.
Total time to complete Offline
data collection:                 (13200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 154) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x203f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   192   173   021    Pre-fail  Always       -       5400
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       918
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10577
 10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       18
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       13
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       918
194 Temperature_Celsius     0x0022   091   083   000    Old_age   Always       -       59
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     10574         -
# 2  Short offline       Aborted by host               90%     10574         -
# 3  Short offline       Aborted by host               50%     10574         -
# 4  Short offline       Completed without error       00%     10570         -
# 5  Short offline       Completed without error       00%      8674         -
# 6  Short offline       Completed without error       00%      8650         -
# 7  Extended offline    Completed without error       00%      8628         -
# 8  Short offline       Completed without error       00%      8602         -
# 9  Short offline       Completed without error       00%      7046         -
#10  Short offline       Completed without error       00%      7022         -
#11  Short offline       Completed without error       00%      6998         -
#12  Short offline       Completed without error       00%      6974         -
#13  Extended offline    Completed without error       00%      6952         -
#14  Short offline       Completed without error       00%      6926         -
#15  Short offline       Completed without error       00%      6902         -
#16  Short offline       Completed without error       00%      6878         -
#17  Short offline       Completed without error       00%      6854         -
#18  Short offline       Completed without error       00%      6830         -
#19  Short offline       Completed without error       00%      6806         -
#20  Extended offline    Completed without error       00%      6785         -
#21  Short offline       Completed without error       00%      6759         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Comment 3 takeda 2009-06-18 20:14:40 UTC
Ok it finished without error:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     10577         -
# 2  Short offline       Completed without error       00%     10574         -
# 3  Short offline       Aborted by host               90%     10574         -
# 4  Short offline       Aborted by host               50%     10574         -
# 5  Short offline       Completed without error       00%     10570         -
Comment 4 takeda 2009-06-18 21:57:53 UTC
Looks like my original e-mail with the extensive tests reached the
destination.

I haven't noticed anything new in the SMART output, except those
additional tests which passed successfully.
Raw_Read_Error_Rate, Reallocated_Sector_Ct, Seek_Error_Rate,
Spin_Retry_Count, Calibration_Retry_Count, Reallocated_Event_Count,
Current_Pending_Sector, Offline_Uncorrectable, UDMA_CRC_Error_Count
and Multi_Zone_Error_Rate are all 0.
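
The manual check above can also be scripted. The following is a rough sketch that scans a SMART attribute table in the layout quoted earlier in this thread and reports any of the error-related attributes whose raw value is not 0. The function name `nonzero_error_attributes` and the `WATCHED` set are made up for this example, and the parsing assumes the ten-column table format shown in the smartctl output above.

```python
# Attributes listed in the comment above as being 0 on this drive.
WATCHED = {
    "Raw_Read_Error_Rate", "Reallocated_Sector_Ct", "Seek_Error_Rate",
    "Spin_Retry_Count", "Calibration_Retry_Count", "Reallocated_Event_Count",
    "Current_Pending_Sector", "Offline_Uncorrectable", "UDMA_CRC_Error_Count",
    "Multi_Zone_Error_Rate",
}


def nonzero_error_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) != 0:
            flagged[name] = int(raw)
    return flagged
```

Fed the attribute table from comment 2, this would return an empty dict, matching the observation that all of these counters are 0.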

If the controller were broken, do you think it would show up in SMART?
I'm asking since the controller is on the disk (it's an IDE disk).

Is there a good way to check the cable besides replacing it with a new
one and seeing if the problem persists?

I hadn't noticed this in my logfiles before:

First crash:
Jun 17 08:43:28 chinatsu sshguard[90671]: Blocking 202.104.3.83: 4 failures over 0 seconds.
Jun 17 09:30:48 chinatsu kernel: ad2: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=344820288
Jun 17 09:30:54 chinatsu kernel: ad2: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=344820288
Jun 17 09:35:05 chinatsu syslogd: kernel boot file is /boot/kernel/kernel
Jun 17 09:35:05 chinatsu kernel: Copyright (c) 1992-2009 The FreeBSD Project.
Jun 17 09:35:05 chinatsu kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
[...]
Jun 17 09:35:05 chinatsu kernel: GEOM_JOURNAL: Journal 56801142: ad2 contains data.
Jun 17 09:35:05 chinatsu kernel: GEOM_JOURNAL: Journal 56801142: ad2 contains journal.
Jun 17 09:35:05 chinatsu kernel: GEOM_JOURNAL: Journal ad2 consistent.
Jun 17 09:35:05 chinatsu kernel: Trying to mount root from ufs:/dev/ad0s1a
Jun 17 09:35:05 chinatsu kernel: WARNING: / was not properly dismounted
Jun 17 09:35:05 chinatsu savecore: reboot after panic: ufs_dirbad: /backuphd: bad dir ino 21573632 at offset 0: mangled entry
Jun 17 09:35:05 chinatsu savecore: no dump, not enough free space on device (110564 available, need 118358)
Jun 17 09:35:05 chinatsu savecore: unsaved dumps found but not saved

I didn't have enough space on /var to save the dump.

Second one (no signs of DMA errors or anything):
Jun 17 09:40:38 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 09:43:24 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 09:46:29 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 09:52:31 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 10:06:28 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 10:35:55 chinatsu named[833]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 10:59:51 chinatsu syslogd: kernel boot file is /boot/kernel/kernel
[...]
Jun 17 10:59:51 chinatsu kernel: GEOM_JOURNAL: Journal 56801142: ad2 contains data.
Jun 17 10:59:51 chinatsu kernel: GEOM_JOURNAL: Journal 56801142: ad2 contains journal.
Jun 17 10:59:51 chinatsu kernel: GEOM_JOURNAL: Journal ad2 consistent.
Jun 17 10:59:51 chinatsu kernel: Trying to mount root from ufs:/dev/ad0s1a
Jun 17 10:59:51 chinatsu kernel: WARNING: / was not properly dismounted
Jun 17 10:59:51 chinatsu savecore: reboot after panic: ufs_dirbad: /backuphd: bad dir ino 22259126 at offset 0: mangled entry
Jun 17 10:59:51 chinatsu savecore: writing core to vmcore.0
Jun 17 11:00:07 chinatsu named[829]: starting BIND 9.4.2-P2 -t /var/named -u bind
Jun 17 11:00:08 chinatsu named[829]: command channel listening on 127.0.0.1#953
Jun 17 11:00:08 chinatsu named[829]: command channel listening on ::1#953

Third one:
Jun 17 11:37:51 chinatsu sshguard[1145]: Blocking 59.124.109.227: 4 failures over 740 seconds.
Jun 17 12:03:16 chinatsu named[829]: transfer of 'pckrzyz.pl/IN' from 83.14.32.234#53: failed to connect: timed out
Jun 17 12:41:26 chinatsu kernel: ad2: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=344820316
Jun 17 12:41:31 chinatsu kernel: ad2: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=344820316
Jun 17 12:45:26 chinatsu syslogd: kernel boot file is /boot/kernel/kernel
Jun 17 12:45:26 chinatsu kernel: ad2: FAILURE - READ_DMA48 timed out LBA=344820316
Jun 17 12:45:26 chinatsu kernel: GEOM_JOURNAL: Error while reading data from ad2 (error=5).
Jun 17 12:45:26 chinatsu kernel: panic: ufs_dirbad: /backuphd: bad dir ino 21573639 at offset 512: mangled entry
Jun 17 12:45:26 chinatsu kernel: KDB: stack backtrace:
Jun 17 12:45:26 chinatsu kernel: db_trace_self_wrapper(c0809ee8,c0882620,c081def3,d1372930,d1372930,...) at db_trace_self_wrapper+0x26
Jun 17 12:45:26 chinatsu kernel: panic(c081def3,c271e4dc,1493007,200,c081dfb1,...) at panic+0xf8
Jun 17 12:45:26 chinatsu kernel: ufs_dirbad(c2d20180,200,c081dfb1,0,d13729cc,...) at ufs_dirbad+0x73
Jun 17 12:45:26 chinatsu kernel: ufs_lookup(d13729f8,d13729f8,d1372bcc,d1372bb8,c2b91d00,...) at ufs_lookup+0x4bd
Jun 17 12:45:26 chinatsu kernel: vfs_cache_lookup(d1372a84,c05f5271,2,c2d1f8a0,d1372aa4,...) at vfs_cache_lookup+0xf2
Jun 17 12:45:26 chinatsu kernel: VOP_LOOKUP_APV(c085a940,d1372a84,d1372bcc,c080fb81,2a9,...) at VOP_LOOKUP_APV+0x3d
Jun 17 12:45:26 chinatsu kernel: lookup(d1372ba4,c2b3f000,0,d1372bc0,c2d22088,...) at lookup+0x50f
Jun 17 12:45:26 chinatsu kernel: namei(d1372ba4,d1372b44,60,0,c288a8c0,...) at namei+0x3a8
Jun 17 12:45:26 chinatsu kernel: kern_stat(c288a8c0,bfbfe514,0,d1372c14,52,...) at kern_stat+0x3d
Jun 17 12:45:26 chinatsu kernel: stat(c288a8c0,d1372cf8,8,c05a18b5,c288a8c0,...) at stat+0x2f
Jun 17 12:45:26 chinatsu kernel: syscall(d1372d38) at syscall+0x208
Jun 17 12:45:26 chinatsu kernel: Xint0x80_syscall() at Xint0x80_syscall+0x20
Jun 17 12:45:26 chinatsu kernel: --- syscall (188, FreeBSD ELF32, stat), eip = 0x206fa46b, esp = 0xbfbfca7c, ebp = 0xbfbfca98 ---
[...]
Jun 17 12:45:26 chinatsu kernel: Trying to mount root from ufs:/dev/ad0s1a
Jun 17 12:45:26 chinatsu kernel: WARNING: / was not properly dismounted
Jun 17 12:45:27 chinatsu savecore: reboot after panic: ufs_dirbad: /backuphd: bad dir ino 21573639 at offset 512: mangled entry
Jun 17 12:45:27 chinatsu savecore: writing core to vmcore.1
Jun 17 12:45:45 chinatsu named[829]: starting BIND 9.4.2-P2 -t /var/named -u bind
Jun 17 12:45:45 chinatsu named[829]: command channel listening on 127.0.0.1#953
Jun 17 12:45:45 chinatsu named[829]: command channel listening on ::1#953
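
To compare the failing sectors across crashes, the ATA error lines can be pulled out of the logs. This is a small sketch that collects the distinct LBAs per device from lines shaped like the ad2 TIMEOUT and FAILURE messages above. The function name `lbas_by_device` and the regular expression are assumptions based only on the message format seen in this report.

```python
import re
from collections import defaultdict

# Matches e.g. "ad2: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=344820288"
ATA_ERR = re.compile(r"\b(ad\d+): (TIMEOUT|FAILURE) - (\S+) .*?LBA=(\d+)")


def lbas_by_device(lines):
    """Map device name to a sorted list of distinct LBAs seen in ATA error lines."""
    seen = defaultdict(set)
    for line in lines:
        match = ATA_ERR.search(line)
        if match:
            seen[match.group(1)].add(int(match.group(4)))
    return {dev: sorted(lbas) for dev, lbas in seen.items()}
```

For the first and third crash logs above, the timeouts are at LBA 344820288 and LBA 344820316, which are 28 sectors apart.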

As for installing 7.2, I can start doing that. Though the system was
stable for quite some time until this began to happen.

Oh, one more thing that might be important but that I forgot about.
Before I started copying the data, I got an error while using ls.
ls was reporting that one directory entry was invalid or something like
that (it was two days ago, I don't remember exactly). I ran fsck and it
found a few errors; I fixed all of them. It was kind of weird that they
were there in the first place, since journaling was in use. That was
before I experienced the crashes. Is it possible that the filesystem
is somehow damaged there and fsck wasn't able to fix it correctly?

As for the suggestion of changing UDMA, how can I do it?
Comment 5 Philip M. Gollucci freebsd_committer freebsd_triage 2009-07-09 08:38:50 UTC
State Changed
From-To: feedback->open

Maintainer has approved.
Comment 6 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:58:42 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 7 Karel Gardas 2021-11-18 14:26:39 UTC
Not sure if I may report here too, but I'm able to see the same panic quite regularly on the RISC-V platform with the latest FreeBSD VM image, running in qemu 5.2.0 on top of FreeBSD 13-p5 on an x64 host.

My panic looks like this:
FreeBSD/riscv (freebsd) (ttyu0)

login: panic: ufs_dirbad: /: bad dir ino 739685 at offset 0: mangled entry
cpuid = 3
time = 1637242167
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x148
panic() at panic+0x2a
ufs_lookup_ino() at ufs_lookup_ino+0xb18
ufs_lookup() at ufs_lookup+0x16
VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x30
vfs_cache_lookup() at vfs_cache_lookup+0xae
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x30
lookup() at lookup+0x45e
namei() at namei+0x35c
kern_statat() at kern_statat+0xe8
sys_fstatat() at sys_fstatat+0x1e
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0
KDB: enter: panic
[ thread pid 779 tid 100087 ]
Stopped at      kdb_enter+0x4c: sd      zero,0(a0)
db> where
Tracing pid 779 tid 100087 td 0xffffffc0957c7100
kdb_enter() at kdb_enter+0x4a
vpanic() at vpanic+0x18c
panic() at panic+0x2a
ufs_lookup_ino() at ufs_lookup_ino+0xb18
ufs_lookup() at ufs_lookup+0x16
VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x30
vfs_cache_lookup() at vfs_cache_lookup+0xae
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x30
vfs_cache_lookup() at vfs_cache_lookup+0xae
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x30
lookup() at lookup+0x45e
namei() at namei+0x35c
kern_statat() at kern_statat+0xe8
sys_fstatat() at sys_fstatat+0x1e
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0

db>

It usually happens after some time of using the RISC-V VM. When using it I usually do:
- pkg install git
- git clone latest src
- buildworld
- buildkernel

Sometimes I'm not able to complete these steps; sometimes I am, and the panic happens later.
Comment 8 Karel Gardas 2021-11-18 14:29:12 UTC
Forgot to mention. I run qemu with:
$HOME/sfw/qemu-5.2.0/bin/qemu-system-riscv64 -machine virt -smp 4 -m 4G -nographic -device virtio-blk-device,drive=hd -drive file=FreeBSD-14.0-CURRENT-riscv-riscv64.qcow2,if=none,id=hd -device virtio-net-device,netdev=net -netdev user,id=net,hostfwd=tcp::2233-:22 -bios $HOME/sfw/opensbi/generic/fw_jump.elf -kernel $HOME/sfw/u-boot-qemu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf -object rng-random,filename=/dev/urandom,id=rng -device virtio-rng-device,rng=rng -nographic -append "root=LABEL=rootfs console=ttyS0"

and download images from here:
https://download.freebsd.org/ftp/snapshots/VM-IMAGES/14.0-CURRENT/riscv64/

After downloading and uncompressing the image, I usually resize it (qcow2) by +20GB to have more space. FreeBSD nicely resizes the filesystem in the VM.

Besides enabling ssh server and adding user, I've not done anything to alter standard VM configuration.
Comment 9 Graham Perrin freebsd_committer freebsd_triage 2022-10-17 12:18:01 UTC
Keyword: 

    crash

– in lieu of summary line prefix: 

    [panic]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk). 

Keyword descriptions and search interface: 

    <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>