| Summary: | unusable minimal installation of FreeBSD-7.0 | | |
|---|---|---|---|
| Product: | Base System | Reporter: | Fernan Aguero <fernan> |
| Component: | amd64 | Assignee: | freebsd-amd64 (Nobody) <amd64> |
| Status: | Closed FIXED | | |
| Severity: | Affects Only Me | | |
| Priority: | Normal | | |
| Version: | 7.0-RELEASE | | |
| Hardware: | Any | | |
| OS: | Any | | |
Description
Fernan Aguero
2008-04-10 14:40:00 UTC
ld-elf.so.1 doesn't use anything from /usr/src. The specific assert that is
failing is this:
assert(ELF_R_TYPE(rela->r_info) == R_X86_64_JMP_SLOT);
in reloc_plt() in src/libexec/rtld-elf/amd64/reloc.c. I wonder if you somehow
have 32-bit binaries instead of 64-bit?
--
John Baldwin
I wrote down that error on paper and typed it in the email
... but I'm sure it said /usr/src ... that's why I decided
to set my /etc/fstab to mount freebsd-7.0 sources from another FreeBSD
box and symlink /usr/src.
> I wonder if you somehow have 32-bit binaries instead of 64-bit?
I wonder the same thing because after doing that, even
though make was now OK (I was able to cd
/usr/ports/sysutils/screen && make install), other commands
failed (vi, Exec format error. Binary file not executable).
This is a Dell PowerEdge SC1435 with two Opteron 2210
processors and 8 GB RAM.
I tried running brandelf on some executables, but brandelf
itself would not run!
We reinstalled the box from scratch, reformatting the disk
and now choosing a Standard install (Developer: full
sources, docs, no X and no games) to no avail.
Fernan
On Thu, Apr 10, 2008 at 6:36 AM, Fernan Aguero <fernan@unsam.edu.ar> wrote:
> Because of the minimal setup, /usr/src is empty.
>
> This in turn produces the following errors upon boot:
> ld-elf.so.1: assert failed: /usr/src/libexec/rtld-elf/amd64/reloc.c:341
> Abort trap (core dumped)
This doesn't make sense to me. I just built a dozen machines using
'minimal' installs in the freebsd.org cluster. Nothing but the base dist
and a kernel. /usr/src was quite empty, and I didn't see anything like
this, even once.
If I had to guess, it seems more like you ended up with a corrupted set
of binaries being installed or in memory somehow. Did you also reboot in
between the initial attempts to do this and mounting /usr/src? Something
else has to be going on. The simple existence of /usr/src/* can't affect
this.
--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
> I wrote down that error on paper and typed it in the email
> ... but I'm sure it said /usr/src ... that's why I decided
> to set my /etc/fstab to mount freebsd-7.0 sources from another FreeBSD
> box and symlink /usr/src.
That's because the assert() macro puts the full filename of the current
file into the binary when it is compiled for the error message. The
binary is not trying to read anything from /usr/src itself.
> > I wonder if you somehow have 32-bit binaries instead of 64-bit?
>
> I wonder the same thing because after doing that, even
> though make was now OK (I was able to cd
> /usr/ports/sysutils/screen && make install), other commands
> failed (vi, Exec format error. Binary file not executable).
It certainly sounds like you have mixed and matched some things. Maybe
just do a minimal install but include the 'lib32' dist for 32-bit binary
compat? minimal probably doesn't include it (but vi also should be a
64-bit binary, try using 'file' rather than brandelf to see what file
thinks vi is).
--
John Baldwin
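For reference, the 32-bit versus 64-bit question raised above can also be answered by reading the ELF header directly: the first four bytes are the magic 0x7f 'E' 'L' 'F', and the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The following is only a minimal sketch; the function name elf_class is made up for illustration, and it assumes a working od(1), which may not be the case on a system whose dynamically linked binaries are already failing.

```shell
# elf_class FILE (hypothetical helper): print "32-bit", "64-bit" or "not ELF"
# by inspecting the first five bytes of FILE.
elf_class() {
    # Bytes 0-3 must be the ELF magic: 7f 45 4c 46.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # Byte 4 is EI_CLASS: 01 = ELFCLASS32, 02 = ELFCLASS64.
    class=$(od -An -tx1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        01) echo "32-bit" ;;
        02) echo "64-bit" ;;
        *)  echo "unknown class $class" ;;
    esac
}
```

Running something like elf_class /usr/bin/vi on a healthy amd64 install would be expected to report a 64-bit object; a "not ELF" answer for a base-system binary would point at a damaged file rather than a 32/64-bit mix-up.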
Sorry for the delay in replying,
I have just done a new install (completely erasing and
reformatting the disk) and this time I did a standard
install, choosing developer + lib32. Now the box
stops in the boot process attempting to mount root:
[...]
ad4: 152587 <WDC WD1600JS-75NCB3 10.02E04> at ata2-master UDMA33
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/ad4s1a
/libexec/ld-elf.so.1: /lib/libncurses.so.7: Shared object
has no run-time symbol table
Enter full pathname of shell or RETURN for /bin/sh:
I cannot even type RETURN and enter a shell, because the
same message about libncurses.so.7 appears!
Disk slicing/partitioning is as follows:
1) ad4s1 (using whole disk), set bootable
2) Install Boot Manager
3) ad4s1b, 32G, swap
ad4s1a, 1G, /
ad4s1d, 1G, /tmp
ad4s1e, 5G, /usr
ad4s1f, 10G, /var
ad4s1g, rest of disk, /home
BTW, I don't understand what you mean by "It certainly
sounds like you have mixed and matched some things" ... this
is a new box, fresh, and I'm using the 7.0 ISO that I just
downloaded from freebsd.org (checksums are OK), and following
a straightforward installation procedure. I'm not trying to
select any other package apart from those that are selected
by choosing the options put forward by the installer
(remember I started with a minimal install!). So in any
case, if there's any 'mix and match' issue, it's an issue
of mixed and matched packages included in the ISO image or a
wrong combination of selections made by the installer ...
BTW, I have just successfully installed both
ubuntu-7.10-amd64 (server edition) and FreeBSD-6.1-RELEASE
(amd64) without any issues on the same box. In both cases, I
was able to SSH in from another box and install third party
packages (postgresql, screen, vim).
To me this sounds like a problem with the 7.0-RELEASE CD
and/or installer.
Fernan
On Friday 18 April 2008 10:53:08 am Fernan Aguero wrote:
> Trying to mount root from ufs:/dev/ad4s1a
> /libexec/ld-elf.so.1: /lib/libncurses.so.7: Shared object
> has no run-time symbol table
> Enter full pathname of shell or RETURN for /bin/sh:
>
> I cannot even type RETURN and enter a shell, because the
> same message about libncurses.so.7 appears!
I think you have some sort of local corruption either on the CD itself,
the ISO image, or perhaps on the hard drive? If the 7.0 CD were this
fundamentally broken there would be more widespread reports of problems
rather than this isolated incident.
--
John Baldwin
I will try and burn another copy of the ISO image ... apart
from that, as I said, the checksums on the downloaded image
match those in the checksum file, and the disk behaves OK with both
Ubuntu and FreeBSD-6.x
Fernan
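A matching checksum on the downloaded image says nothing about the burned disc. One way to check the disc itself is to read back exactly as many 2048-byte sectors as the image contains and compare digests. The sketch below is illustrative only: the names md5_stdin and verify_burn are invented here, it assumes either md5sum or FreeBSD's md5 is installed, and it assumes the image size is a multiple of 2048 bytes (true for ISO 9660 images).

```shell
# md5_stdin (hypothetical helper): print the MD5 digest of standard input,
# using md5sum where available and FreeBSD's "md5 -q" otherwise.
md5_stdin() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | awk '{ print $1 }'
    else
        md5 -q
    fi
}

# verify_burn ISO DEVICE (hypothetical helper): compare the image with the
# same number of 2048-byte sectors read back from the device.
verify_burn() {
    iso=$1
    dev=$2
    size=$(wc -c < "$iso" | tr -d ' ')
    sectors=$(( size / 2048 ))
    want=$(md5_stdin < "$iso")
    got=$(dd if="$dev" bs=2048 count="$sectors" 2>/dev/null | md5_stdin)
    if [ "$want" = "$got" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

Limiting the read to the image's sector count matters because a drive may return padding past the end of the recorded data, which would change the digest even on a good burn.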
On Fri, Apr 18, 2008 at 11:53:08AM -0300, Fernan Aguero wrote:
>Enter full pathname of shell or RETURN for /bin/sh:
>
>I cannot even type RETURN and enter a shell, because the
>same message about libncurses.so.7 appears!
What about if you enter /rescue/sh here? The tools in /rescue are all
statically linked.
I've checked my copies of both 7.0-RELEASE-amd64-livefs.iso and
7.0-RELEASE-amd64-disc1.iso and get the following results:
# rescue/md5 -r rescue/md5 libexec/ld-elf.so.1 lib/libncurses.so.7 lib/libc.so.7 bin/sh boot/kernel/kernel
55fcd4d3c7914dde00a3bf67dd045d67 rescue/md5
61620a834da2d96314ed8966c20dbc50 libexec/ld-elf.so.1
fcb9abd938532ae64dc64c444821d0d9 lib/libncurses.so.7
8ff68ce60040f07f63d41ac3400bbb54 lib/libc.so.7
7f012c2361c37d1f23ffa3279f8ac32c bin/sh
12b4c1d7484f4a332a5b3339ec492503 boot/kernel/kernel
Can you please confirm you get the same (the installer should be
verifying MD5s of these files when it installs them as well).
>To me this sounds like a problem with the 7.0-RELEASE CD
>and/or installer.
I've checked the relevant files on my copy of the CDs and I don't have
any problems running them. Possibly there is a problem with the
installer, but we should have lots more similar complaints if so.
--
Peter Jeremy
> > I think you have some sort of local corruption either on the CD itself, the
> > ISO image, or perhaps on the hard drive? If the 7.0 CD were this
> > fundamentally broken there would be more widespread reports of problems
> > rather than this isolated incident.
>
> I will try and burn another copy of the ISO image ...
The answer is that the CD was not OK (but this didn't make any
difference, as new problems arose, read on ...)
# ISO image is OK
grep amd64-disc1 7.0-amd64-CHECKSUM.MD5
MD5 (7.0-RELEASE-amd64-disc1.iso) = 0232f1b6ffde0e3e76034c9f10791acd
cat 7.0-RELEASE-amd64-disc1.iso | md5sum
0232f1b6ffde0e3e76034c9f10791acd -
# But burned CD was not OK
echo $(( $(ls -l 7.0-RELEASE-amd64-disc1.iso | awk '{ print $5 }') / 2048 ))
255535
dd if=/dev/cdrom bs=2048 count=255535 | md5sum
255535+0 records in
255535+0 records out
523335680 bytes (523 MB) copied, 121.408 seconds, 4.3 MB/s
f6de944affbeb748e7df5337aa91ea8b -
So I trashed this CD and burned the same image again, now
successfully:
dd if=/dev/cdrom bs=2048 count=255535 | md5sum
255535+0 records in
255535+0 records out
523335680 bytes (523 MB) copied, 218.401 seconds, 2.4 MB/s
0232f1b6ffde0e3e76034c9f10791acd -
I've now installed FreeBSD-7.0 (amd64) again, several times,
with either a minimal install + man pages (with or w/o lib32
compatibility) or as a standard install (all default options) and
again, as before, there are many errors upon boot:
# minimal, no lib32 compatibility
Trying to mount root from ufs:/dev/ad4s1a
/libexec/ld-elf.so.1: /lib/libedit.so.6: invalid file format
# standard install (all default options)
Starting local daemons:.
Updating motdawk: 1: Syntax error: "(" unexpected
.
Mounting late filesystems:.
Configuring syscons: keymap blanktime.
/libexec/ld-elf.so.1: Shared object "<ò " not found, required by "libssh.so.4"
(repeated three times)
Starting cron.
Local package initialization:.
/libexec/ld-elf.so.1: /lib/libm.so.5: invalid file format
I can login and do a couple of things:
mkdir /freebsd
mount host:/freebsd /freebsd
rm -rf /usr/src
ln -s /freebsd/freebsd-7.0/src /usr/src
ln -s /freebsd/ports /usr/ports
mkdir /usr/ports.workdir
echo "WRKDIRPREFIX=/usr/ports.workdir" > /etc/make.conf
But then:
cd /usr/ports/sysutils/screen
make clean && make install clean
...
===> Extracting for screen-4.0.3_1
/usr/bin/awk: 1: Syntax error: "(" unexpected
*** Error code 2
And also:
cd /usr/src
make buildworld
>>> stage 1.1: legacy release compatibility shims
cd /freebsd/freebsd-7.0/src;
MAKEOBJDIRPREFIX=/usr/obj/freebsd/freebsd-7.0/src/tmp
INSTALL="sh /freebsd/freebsd-7.0/src/tools/install.sh"
PATH=/usr/obj/freebsd/freebsd-7.0/src/tmp/legacy/usr/sbin:
... ... -DNO_WARNS legacy
awk: 1: Syntax error: "(" unexpected
Any help would be appreciated. I'm at a loss here.
Ubuntu-7.10 (amd64, server) works fine on this same box, as
well as FreeBSD-6.1-RELEASE (amd64). I even followed
the instructions to upgrade 6.1 to 7.0 only to find that
after rebooting (this is on the second time as per the instructions)
http://people.freebsd.org/~rse/upgrade/freebsd-upgrade-6x-7x.txt
I lose the ability to SSH in, and many of these "syntax
error word unexpected" appear. I cannot rebuild and
reinstall world at this point, even though the box boots and
lets me log in.
Fernan
On Mon, 21 Apr 2008 15:22:47 -0300 Fernan Aguero <fernan@iib.unsam.edu.ar> wrote:
> The answer is that the CD was not OK (but this didn't make
> any difference, as new problems arose, read on ...)
> [...]
> Any help would be appreciated. I'm at a loss here.
> Ubuntu-7.10 (amd64, server) works fine on this same box, as
> well as FreeBSD-6.1-RELEASE (amd64). I even followed
> the instructions to upgrade 6.1 to 7.0 only to find that
> after rebooting (this is on the second time as per the instructions)
> http://people.freebsd.org/~rse/upgrade/freebsd-upgrade-6x-7x.txt
> I lose the ability to SSH in, and many of these "syntax
> error word unexpected" appear. I cannot rebuild and
> reinstall world at this point, even though the box boots and
> lets me log in.
You're not going to want to hear it, but I've seen similar symptoms
with hardware problems. Two easy things to try:
1) Is this repeatable? Does it fail the exact same way each time?
2) Have you tried FreeBSD-7.0-RELEASE on x86? Just another data point.
There are two possible hardware issues: one is that some component is
marginal, and the previous release of FreeBSD and your Ubuntu test
don't stress the component in question enough to cause it to fail. The
first test will tell you whether or not this is the case: if it fails
the exact same way each time, then this isn't a part showing
intermittent failures. If the failures move around, try stress-testing
the machine.
The previous time I saw such failures in a repeatable way, the problem
was that the new version of BSD had changed an internal API, and was
now passing more registers to internal functions than previously. The
hardware was flipping a bit in one of those registers. The previous
release of the OS (and the hardware vendor's OS) never passed that many
registers, so the problem didn't arise.
If your problem is repeatable, you need to fire up the debugger on the
kernel, and watch carefully at the first point of failure. This will
also provide useful information for fixing the problem if it's
software and not hardware.
<mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
To submitter: I wonder if you could provide the dmesg of this system?
Also, are you able to try the following test:
md5 /lib/libm.so.5 /lib/libedit.so.6
reboot
md5 /lib/libm.so.5 /lib/libedit.so.6
And see if the checksums match? It really sounds like you are seeing
some data corruption somehow, and it may be hard to determine exactly
where this is happening.
Gavin
> On Mon, 21 Apr 2008 15:22:47 -0300 Fernan Aguero <fernan@iib.unsam.edu.ar> wrote:
> > The answer is that the CD was not OK (but this didn't make
> > any difference, as new problems arose, read on ...)
> > [...]
> You're not going to want to hear it, but I've seen similar symptoms
> with hardware problems.
I know this is a possibility, and I do want to hear about it -- my
on-site guarantee is still alive :)
I've booted the Dell diagnostics CD and run all diagnostic tests
(memory, disk, processors) ... no problem was found.
> Two easy things to try:
>
> 1) is this repeatable? Does it fail the exact same way each time?
No. Each time the error is different.
1. On one occasion, right after booting I got several screens of weird
symbols. In the previous email I just copied the things that would make
sense to a human being.
2. I have just reinstalled 7.0-amd64 again (standard install, didn't
select any combo of anything ... just the defaults, and didn't install
any extra package), and this time, after booting the error messages
were:
[...]
Configuring syscons: keymap blanktime.
/libexec/ld-elf.so.1: /lib/libcrypto.so.5: Unsupported relocation type 74041104 in non-PLT relocations
(message repeated 5 times)
Starting cron.
Local package initialization:.
Starting background file system checks in 60 seconds.
This time I was able to login and check libcrypto:
MD5 (/lib/libcrypto.so.5) = ee943528c8046145b60359f50f45fbf4
And as suggested by Gavin, I rebooted and checked things again.
Upon reboot, the error was now different (relocation type changed),
and the checksum for libcrypto.so.5 changed!
MD5 (/lib/libcrypto.so.5) = a24993818d2b888df053428b48a18eac
Maybe it's a problem with the disk driver in FreeBSD-7 for this
chipset? (dmesg attached) ... it seems like reading and writing to disk
is the problem ... but this only happens with FreeBSD-7.0. Any
suggestion as to how I can further debug this?
3. I reinstalled the system once again, and now, after booting into the
installed system:
init: getty repeating too quickly on port /dev/ttyv6, sleeping 30 secs
init: getty repeating too quickly on port /dev/ttyv6, sleeping 30 secs
init: getty repeating too quickly on port /dev/ttyv6, sleeping 30 secs
...
> 2) Have you tried FreeBSD-7.0-RELEASE on x86? Just another data point.
I don't have a spare i386 box available for testing 7.0 right now ... I
don't think it's possible to install 7.0-i386 on this box, or is it?
> There are two possible hardware issues: one is that some component is
> marginal, and the previous release of FreeBSD and your Ubuntu test
> don't stress the component in question enough to cause it to fail. The
> first test will tell you whether or not this is the case: if it fails
> the exact same way each time, then this isn't a part showing
> intermittent failures. If the failures move around, try stress-testing
> the machine.
In which way do you suggest stress testing? When I installed
FreeBSD-6.1-amd64 on this box, I compiled all world and kernel without
an issue. Do you suggest doing something else?
Thanks for all the help thus far.
Fernan
On Tue, Apr 22, 2008 at 04:06:32PM -0300, Fernan Aguero wrote:
>This time I was able to login and check libcrypto:
>MD5 (/lib/libcrypto.so.5) = ee943528c8046145b60359f50f45fbf4
>
>And as suggested by Gavin, I rebooted and checked things
>again.
>
>Upon reboot, the error was now different (relocation type changed),
>and the checksum for libcrypto.so.5 changed!
>MD5 (/lib/libcrypto.so.5) = a24993818d2b888df053428b48a18eac
...
>Maybe it's a problem with the disk driver in FreeBSD-7 for
>this chipset? (dmesg attached) ... it seems like reading and
>writing to disk is the problem ... but this only happens
>with FreeBSD-7.0. Any suggestion as to how I can further debug this?
...
>atapci0: <ServerWorks HT1000 SATA150 controller> port 0xecb0-0xecb7,0xeca0-0xeca3,0xecb8-0xecbf,0xeca4-0xeca7,0xece0-0xecef mem 0xefdfe000-0xefdfffff irq 6 at device 14.0 on pci3
There are a number of threads in -current discussing the brokenness of
that chipset. The impression I get is that 7.0-RELEASE will never work
for you but there are some fixes in 7.0 post -RELEASE. A sample is:
http://lists.freebsd.org/pipermail/freebsd-current/2008-March/084272.html
http://lists.freebsd.org/pipermail/freebsd-current/2008-March/084075.html
--
Peter Jeremy
On Tuesday 22 April 2008 03:32:11 pm Peter Jeremy wrote:
> There are a number of threads in -current discussing the brokenness of
> that chipset. The impression I get is that 7.0-RELEASE will never work
> for you but there are some fixes in 7.0 post -RELEASE. A sample is:
> http://lists.freebsd.org/pipermail/freebsd-current/2008-March/084272.html
> http://lists.freebsd.org/pipermail/freebsd-current/2008-March/084075.html
The 'SWKSMIO' part of the patch I posted might fix the issues with
PATA-mode.
--
John Baldwin
Please, I'm begging: PLEASE commit John's patch to 7.1! Without it,
FreeBSD is completely broken on relatively common hardware like Dell
SC1435 servers.
It looks like a fix was committed just before 7.1 was released. Could
you confirm whether this issue has been fixed?
--
Bruce
State Changed
From-To: open->feedback
Note that submitter has been asked for feedback.
+----[ Bruce Cran (11.Feb.2010 15:31):
|
| It looks like a fix was committed just before 7.1 was released. Could
| you confirm whether this issue has been fixed?
|
| --
| Bruce
|
+----]
IIRC I've tried early candidates of 7.1 and the bug was still there.
However, it seems like it's been fixed as you say:
http://lists.freebsd.org/pipermail/freebsd-bugs/2009-March/034435.html
I'm now running 7.2-STABLE on this box successfully.
Don't know if I'm being too helpful. Sorry.
--
fernan
State Changed
From-To: feedback->closed
Submitter reports that the issue has been fixed.
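The test that exposed the problem in this thread was comparing the MD5 of the same library before and after a reboot. A minimal sketch of that check follows, assuming md5sum or FreeBSD's md5 is available; the function names md5_file, snapshot_md5 and compare_md5 are invented for illustration and are not part of any FreeBSD tool. The saved list is best kept somewhere other than the suspect disk (an NFS mount, for example), since a list stored on the failing controller could itself be misread.

```shell
# md5_file FILE (hypothetical helper): print only the MD5 digest of FILE.
md5_file() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{ print $1 }'
    else
        md5 -q "$1"
    fi
}

# snapshot_md5 FILE... : print one "digest  path" line per file.
# Redirect the output to a list file before rebooting.
snapshot_md5() {
    for f in "$@"; do
        printf '%s  %s\n' "$(md5_file "$f")" "$f"
    done
}

# compare_md5 LIST: re-hash every path in LIST and report digests that
# no longer match. Returns non-zero if anything changed.
compare_md5() {
    status=0
    while read -r old path; do
        new=$(md5_file "$path")
        if [ "$old" != "$new" ]; then
            echo "CHANGED $path"
            status=1
        fi
    done < "$1"
    return $status
}
```

Typical use would be snapshot_md5 /lib/libm.so.5 /lib/libedit.so.6 > /some/other/disk/before, a reboot, and then compare_md5 /some/other/disk/before. A file that was never written to but reports CHANGED indicates corruption on the read or write path rather than a bad install image.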