Bug 283112 - New Cache files packed into the FreeBSD14.2 release ISO file but not into Beta3 ISO files
Summary: New Cache files packed into the FreeBSD14.2 release ISO file but not into Bet...
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: misc (show other bugs)
Version: 14.2-RELEASE
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: vendor
Depends on:
Blocks:
 
Reported: 2024-12-04 03:22 UTC by Yanhui He
Modified: 2025-01-15 02:33 UTC (History)
9 users (show)

See Also:


Attachments
freebsd_14.2_rc1_newfiles (10.22 KB, text/plain)
2024-12-04 03:22 UTC, Yanhui He
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yanhui He 2024-12-04 03:22:12 UTC
Created attachment 255613 [details]
freebsd_14.2_rc1_newfiles

Hi,

pycdlib is used on our test to automatic install FreeBSD 14.2. We hit the issue that we cannot use it from FreeBSD 14.2 RC1, including the released FreeBSD 14.2 Release ISO Images. pycdlib is the same one for below FreeBSD 14.2 BETA3 and release.

Like we use the below code to open FreeBSD 14.2 BETA3 64bit ISO Image.
******test_beta3_iso.py******
#!/usr/bin/env python

import pycdlib
iso = pycdlib.PyCdlib(always_consistent=True)
src_iso = "FreeBSD-14.2-BETA3-amd64-dvd1.iso"
iso.open(src_iso)
iso.close()
******test_beta3_iso.py******

Test FreeBSD 14.2 BETA3 ISO, and no errors
$ python3 test_beta3_iso.py

******test_release_iso.py******
#!/usr/bin/env python

import pycdlib
iso = pycdlib.PyCdlib(always_consistent=True)
src_iso = "FreeBSD-14.2-RELEASE-amd64-dvd1.iso"
iso.open(src_iso)
iso.close()
******test_release_iso.py******


Test RELEASE ISO, got exceptions

$ python3 test_release_iso.py
Traceback (most recent call last):
  File "/tmp/test_iso.py", line 8, in <module>
    iso.open(src_iso)
  File "/home/qiz/.local/lib/python3.10/site-packages/pycdlib/pycdlib.py", line 4123, in open
    self._open_fp(fp)
  File "/home/qiz/.local/lib/python3.10/site-packages/pycdlib/pycdlib.py", line 2310, in _open_fp
    ic_level, lastbyte = self._walk_directories(self.pvd, extent_to_ptr,
  File "/home/qiz/.local/lib/python3.10/site-packages/pycdlib/pycdlib.py", line 1151, in _walk_directories
    new_record.parent.track_child(new_record,
  File "/home/qiz/.local/lib/python3.10/site-packages/pycdlib/dr.py", line 838, in track_child
    self._add_child(child, logical_block_size, allow_duplicate, False)
  File "/home/qiz/.local/lib/python3.10/site-packages/pycdlib/dr.py", line 747, in _add_child
    raise pycdlibexception.PyCdlibInvalidInput('Failed adding duplicate name to parent')
pycdlib.pycdlibexception.PyCdlibInvalidInput: Failed adding duplicate name to parent


By comparing BETA3 with RC1/RELEASE ISO, there are some new files added from RC1 (see attachement freebsd_14.2_rc1_newfiles.txt). 

The most suspicious files are below cache files. 
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-22000-20241017_1~7ad9338d54.pkg
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-7000-20241017_1~f04ac7a71d.pkg
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-8000-20241017_1~d771c14b9f.pkg
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-9000-20241017_1~1dd55a85fd.pkg
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-ax210-20241017_1~43eea6bfa4.pkg
./var/cache/pkg/wifi-firmware-iwlwifi-kmod-bz-20241017_1~7f6e1c6d44.pkg
./var/cache/pkg/wifi-firmware-kmod-release-20241017~b0e3d437a2.pkg
./var/cache/pkg/wifi-firmware-rtw88-kmod-rtw8703b-20241017_1~25b73e7198.pkg
./var/cache/pkg/wifi-firmware-rtw88-kmod-rtw8723d-20241017_1~cbb2d43e86.pkg
./var/cache/pkg/wifi-firmware-rtw88-kmod-rtw8821c-20241017_1~0d1aa10a4d.pkg
./var/cache/pkg/wifi-firmware-rtw88-kmod-rtw8822b-20241017_1~fd58884de1.pkg
./var/cache/pkg/wifi-firmware-rtw88-kmod-rtw8822c-20241017_1~3599bbf662.pkg
./var/cache/pkg/wifi-firmware-rtw89-kmod-rtw8851b-20241017_1~f20c3d0609.pkg
./var/cache/pkg/wifi-firmware-rtw89-kmod-rtw8852a-20241017_1~c87c5db9f3.pkg
./var/cache/pkg/wifi-firmware-rtw89-kmod-rtw8852b-20241017_1~c6b2db8a10.pkg
./var/cache/pkg/wifi-firmware-rtw89-kmod-rtw8852c-20241017_1~967b9c7bf3.pkg
./var/cache/pkg/wifi-firmware-rtw89-kmod-rtw8922a-20241017_1~83ff2fd377.pkg


Shall these cache files be packed into the release ISO file?

Thanks!
Yanhui
Comment 1 Yanhui He 2024-12-05 08:42:04 UTC
Hi,

Is there any one help to take a look?

Thanks!
Yanhui
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2024-12-06 16:44:59 UTC
Indeed, those files shouldn't be there - they're cached downloads that should be cleaned as part of the image building process.  I can confirm that they're present when I mount a 14.2 iso image with cd9660.  cc'ing releng.

What exactly is tripping up pycdlib?  Is it truncating file names?
Comment 3 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-12-07 02:11:25 UTC
(In reply to Mark Johnston from comment #2)

Or not be downloaded to the staging area but only installed into it.
Hmm.  I'll look into it.

Likely unrelated to the pycdlib problem tough unless it's the symlinks in there too which are not in the list from #c0 but I see them in my staging directory from my tests still.
Comment 4 Yanhui He 2024-12-07 05:36:23 UTC
(In reply to Bjoern A. Zeeb from comment #3)
Thanks Mark and Bjoern!

pycdlib package keeps same in the last 3 years.
Comment 5 Yanhui He 2024-12-09 03:02:24 UTC
Hi Bjoern,

Any update?

Thanks!
Yanhui
Comment 6 Mark Johnston freebsd_committer freebsd_triage 2024-12-09 16:37:19 UTC
We're getting an exception here:

  1152                 try:                                                                                                                                                                                                                                                                                                   
  1153                     new_record.parent.track_child(new_record,                                                                                                                                                                                                                                                          
  1154                                                   self.logical_block_size)                                                                                                                                                                                                                                             
  1155                 except pycdlibexception.PyCdlibInvalidInput:                                                                                                                                                                                                                                                           
  1156                     # dir_record.track_child() may throw a PyCdlibInvalidInput                                                                                                                                                                                                                                         
  1157                     # if it was given a duplicate child.  However, we allow                                                                                                                                                                                                                                            
  1158                     # duplicate children if and only if this record is a file                                                                                                                                                                                                                                          
  1159                     # and the last file has the same name; this represents a                                                                                                                                                                                                                                           
  1160                     # very large file.                                                                                                                                                                                                                                                                                 
  1161                     if new_record.is_dir() or last_record is None or last_record.file_identifier() != new_record.file_identifier():                                                                                                                               
  1162                         raise

I reproduced the problem locally and found back-to-back duplicate names "WIFI_FIRMWARE_RTW88_KMOD_RTW87".  Normally the exception would be ignored, but new_record.is_dir() is true.  However, none of the files in var/cache/pkg in the ISO are directories, so I tend to suspect that there's a bug in pycdlib.
Comment 7 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-12-10 01:41:05 UTC
(In reply to Yanhui He from comment #5)

Not from me but people working on the cleanup at least:
   https://reviews.freebsd.org/D47942


Should be possible to quickly create a semi-rnaomd release DVD image without the var/cache files (not directories) to see if markj's theory holds up but given the names Mark got I would not be surprised by a bug somewhere else as well.
Comment 8 Yanhui He 2024-12-10 02:35:51 UTC
(In reply to Mark Johnston from comment #6)
Hi Mark,

AFAIK, PyCdlib doesn't have change in recent months. It's OK for FreeBSD 14.2 Beta3 ISO Image, and failed since FreeBSD 14.2 RC1 release.

Thanks!
Yanhui
Comment 9 Yanhui He 2024-12-10 02:38:46 UTC
(In reply to Bjoern A. Zeeb from comment #7)
Thanks Bjoern!

If the change will be released, can we have a semi-rnaomd release DVD image without the var/cache files (not directories)?
Comment 10 Yanhui He 2024-12-10 03:06:54 UTC
(In reply to Mark Johnston from comment #6)
Hi Mark,

https://github.com/clalancette/pycdlib/blob/master/CHANGELOG
https://pypi.org/project/pycdlib/1.3.2/#history

The latest version of pycvdlib is 1.14.0 and it was released on Jan 15, 2023.

There is no change between FreeBSD 14.2 BETA3 and RC1 release.

Thanks!
Yanhui
Comment 11 Keira Zhang 2024-12-10 03:20:05 UTC
(In reply to Mark Johnston from comment #6)
Hi Mark,
Thanks for helping to address the issue. I found more clues. It was directory /usr/local/share/licenses/wifi-firmware-rtw88-kmod-rtw8723d-20241017_1 caused the exception.

https://github.com/clalancette/pycdlib/blob/master/pycdlib/dr.py#L214 didn't return a correct file identifier length, which led to the file identifier be identified as WIFI_FIRMWARE_RTW88_KMOD_RTW87 for this directory. 

As /usr/local/share/licenses/wifi-firmware-rtw88-kmod-rtw8703b-20241017_1 had been identified as WIFI_FIRMWARE_RTW88_KMOD_RTW87 already, adding directory wifi-firmware-rtw88-kmod-rtw8723d-20241017_1 identifier caused the duplicate name exception.


We are using pycdlib to inserting /etc/installerconfig to the FreeBSD ISO for unattended installation. This issue blocks our weekly regression testing for FreeBSD 14.2 GA release. Can you please give us some advice about how we can work around the issue? 

Thanks very much!
Comment 13 Mark Johnston freebsd_committer freebsd_triage 2024-12-10 15:00:36 UTC
(In reply to Yanhui He from comment #10)
I understand, I'm saying that there is a bug in pycdlib that is exposed by the directories that appeared in /usr/local/share/licenses.

I submitted a bug report upstream: https://github.com/clalancette/pycdlib/issues/131

If I modify pycdlib to compare directory entries using their full name, the exception doesn't occur, but I'm not sure if it's right.  Here is my patch, it lets your example program work: https://github.com/markjdb/pycdlib/commit/16aaf7884296c5fb72dc298c48b30e9ca5bf82cf
Comment 14 commit-hook freebsd_committer freebsd_triage 2024-12-10 17:30:02 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6dafe8c1e46e58097b277ab9babb3ec5336713d7

commit 6dafe8c1e46e58097b277ab9babb3ec5336713d7
Author:     Muhammad Moinur Rahman <bofh@FreeBSD.org>
AuthorDate: 2024-12-10 17:24:43 +0000
Commit:     Muhammad Moinur Rahman <bofh@FreeBSD.org>
CommitDate: 2024-12-10 17:28:55 +0000

    Clean pkg cache in release media

    Recently wifi-firmware-kmod-release pkgs were added in the release
    installation media, but unfortunately the pkgs were not cleaned up after
    the installation. These kept the stale pkgs in the /var/cache. The pkgs
    should be removed with pkg clean.

    PR:             283112
    Reported by:    yanhui.he@broadcom.com
    Reviewed by:    cperciva, emaste
    Approved by:    cperciva, emaste, re (implicit)
    MFC after:      3 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D47942

 release/Makefile | 3 +++
 1 file changed, 3 insertions(+)
Comment 15 Muhammad Moinur Rahman freebsd_committer freebsd_triage 2024-12-10 17:46:33 UTC
Although my commit fixes the underlying issue and ensures that the pkg caches are not present in the next release, we do not re-roll releases at the moment, so there will be no new release media.
This will be MFC'ed on Friday, after which new images can be generated on 14.X branch without those files in /var/cache.
Comment 16 Ed Maste freebsd_committer freebsd_triage 2024-12-10 18:25:17 UTC
It looks like there's a makefs bug here - the short non-rockridge paths need to be unique. usr.sbin/makefs/cd9660.c:cd9660_rename_filename() is called to deduplicate conflicts. I can see that it is called for both files and directories, but does not manage to successfully dedup the directories. I'll submit a separate bug for that.
Comment 17 Ed Maste freebsd_committer freebsd_triage 2024-12-10 19:10:29 UTC
(In reply to Keira Zhang from comment #11)
> We are using pycdlib to inserting /etc/installerconfig to the FreeBSD ISO for
> unattended installation. This issue blocks our weekly regression testing for
> FreeBSD 14.2 GA release. Can you please give us some advice about how we can work
> around the issue?

As a result of your report we have found two issues (so far), which have fixes in progress. Unfortunately that will not solve the issue with the existing 14.2-RELEASE ISO image. We are discussing potential solutions or workarounds and will update when we have more information.
Comment 18 Yanhui He 2024-12-11 02:52:14 UTC
(In reply to Mark Johnston from comment #13)
Got it, thanks Mark!
Comment 19 Keira Zhang 2024-12-11 03:26:03 UTC
Thanks to all of you. I have tried Mark's commit in
https://github.com/markjdb/pycdlib/commit/16aaf7884296c5fb72dc298c48b30e9ca5bf82cf.

And it could rebuild FreeBSD 14.2 release image with unattended config file.  We will use it as a workaround in our weekly regression testing for now.

When there is new image in 14.x branch, and if the new image won't reproduce this issue, we can replace 14.2 release image with the new image.
Comment 20 Ed Maste freebsd_committer freebsd_triage 2024-12-13 18:30:34 UTC
(In reply to Ed Maste from comment #17)

The paths containing duplicated short names are:

/USR/LOCAL/SHARE/LICENSES/WIFI_FIRMWARE_RTW88_KMOD_RTW87/
/USR/LOCAL/SHARE/LICENSES/WIFI_FIRMWARE_RTW88_KMOD_RTW88/
/USR/LOCAL/SHARE/LICENSES/WIFI_FIRMWARE_RTW89_KMOD_RTW88/

The makefs bug affects conflicting directories only -- conflicting files are correctly handled. So while the pkg cache files are undesirable (and are being removed) they're not responsible for the pycdlib exception and the trouble you've encountered.
Comment 21 commit-hook freebsd_committer freebsd_triage 2024-12-30 19:50:46 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1f31d437428014e864bcce1223cf7017180e2608

commit 1f31d437428014e864bcce1223cf7017180e2608
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-12-30 15:01:06 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-12-30 19:48:44 +0000

    makefs: Fix cd9660 duplicate directory names

    Previously we could create cd9660 images with duplicate short (level 2)
    names.

    cd9660_level2_convert_filename used a 30-character limit (for files and
    directories), not including the '.' separator.  cd9660_rename_filename
    used a 31-character limit, including the '.'.  Directory names 31
    characters or longer (without '.') were shortened to 30 characters, and
    if a collision occurred cd9660_rename_filename uniquified them starting
    with the 31st character.  Unfortunately the directory record's name_len
    was already set, so the unique part of the name was stripped off.

    Directories are up to 31 d-characters (i.e., A-Z 0-9 and _); there is no
    provision for a '.' in a directory name.  Increase the name length limit
    to 31 for directories, and exclude '.'s.

    This name mapping and deduplication code is still fragile and convoluted
    and would beenfit from a more holistic effort.

    PR:             283238, 283112
    Reviewed by:    imp
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D48251

 usr.sbin/makefs/cd9660.c                     |  5 ++--
 usr.sbin/makefs/tests/makefs_cd9660_tests.sh | 39 ++++++++++++++++++++++++++--
 usr.sbin/makefs/tests/makefs_tests_common.sh |  2 +-
 3 files changed, 41 insertions(+), 5 deletions(-)
Comment 22 commit-hook freebsd_committer freebsd_triage 2025-01-14 21:08:22 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=40b11c1c16d85b3339be4c3c4fde2ef7fa36f3ff

commit 40b11c1c16d85b3339be4c3c4fde2ef7fa36f3ff
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-12-30 15:01:06 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2025-01-14 21:07:39 +0000

    makefs: Fix cd9660 duplicate directory names

    Previously we could create cd9660 images with duplicate short (level 2)
    names.

    cd9660_level2_convert_filename used a 30-character limit (for files and
    directories), not including the '.' separator.  cd9660_rename_filename
    used a 31-character limit, including the '.'.  Directory names 31
    characters or longer (without '.') were shortened to 30 characters, and
    if a collision occurred cd9660_rename_filename uniquified them starting
    with the 31st character.  Unfortunately the directory record's name_len
    was already set, so the unique part of the name was stripped off.

    Directories are up to 31 d-characters (i.e., A-Z 0-9 and _); there is no
    provision for a '.' in a directory name.  Increase the name length limit
    to 31 for directories, and exclude '.'s.

    This name mapping and deduplication code is still fragile and convoluted
    and would beenfit from a more holistic effort.

    PR:             283238, 283112
    Reviewed by:    imp
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D48251

    (cherry picked from commit 1f31d437428014e864bcce1223cf7017180e2608)

 usr.sbin/makefs/cd9660.c                     |  5 ++--
 usr.sbin/makefs/tests/makefs_cd9660_tests.sh | 39 ++++++++++++++++++++++++++--
 usr.sbin/makefs/tests/makefs_tests_common.sh |  2 +-
 3 files changed, 41 insertions(+), 5 deletions(-)
Comment 23 commit-hook freebsd_committer freebsd_triage 2025-01-15 02:33:57 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=79778b7aafc829e6f67cbf0d50a51836e69dc784

commit 79778b7aafc829e6f67cbf0d50a51836e69dc784
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-12-30 15:01:06 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2025-01-15 02:33:30 +0000

    makefs: Fix cd9660 duplicate directory names

    Previously we could create cd9660 images with duplicate short (level 2)
    names.

    cd9660_level2_convert_filename used a 30-character limit (for files and
    directories), not including the '.' separator.  cd9660_rename_filename
    used a 31-character limit, including the '.'.  Directory names 31
    characters or longer (without '.') were shortened to 30 characters, and
    if a collision occurred cd9660_rename_filename uniquified them starting
    with the 31st character.  Unfortunately the directory record's name_len
    was already set, so the unique part of the name was stripped off.

    Directories are up to 31 d-characters (i.e., A-Z 0-9 and _); there is no
    provision for a '.' in a directory name.  Increase the name length limit
    to 31 for directories, and exclude '.'s.

    This name mapping and deduplication code is still fragile and convoluted
    and would beenfit from a more holistic effort.

    PR:             283238, 283112
    Reviewed by:    imp
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D48251

    (cherry picked from commit 1f31d437428014e864bcce1223cf7017180e2608)
    (cherry picked from commit 40b11c1c16d85b3339be4c3c4fde2ef7fa36f3ff)

 usr.sbin/makefs/cd9660.c                     |  5 ++--
 usr.sbin/makefs/tests/makefs_cd9660_tests.sh | 39 ++++++++++++++++++++++++++--
 usr.sbin/makefs/tests/makefs_tests_common.sh |  2 +-
 3 files changed, 41 insertions(+), 5 deletions(-)