Bug 255091 - freebsd-update should retry when encountering incorrect hashes
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 13.0-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
Reported: 2021-04-15 13:01 UTC by Tim Foster
Modified: 2021-04-15 13:01 UTC

Description Tim Foster 2021-04-15 13:01:56 UTC
Upgrading to 13.0-RELEASE has been a little slow because we often hit
errors where one of the files fetched by phttpget was corrupt. This may be a
problem with our ISP; we don't know.

When hitting that error, we'd try again, hoping that this time we'd get a clean
run, often failing again and having to manually retry.

The result of deciding which files to download isn't cached, so each
time we retry, there's a lengthy calculation to do before the download
attempt starts. That is unfortunate, but completely understandable (how would
we do cache invalidation?).

Here's an example of the failure:

root@puroto:/home/timf # freebsd-update --currently-running 12.2-RELEASE -b /space/jails/pupuru -r 13.0-RELEASE upgrade
src component not installed, skipped
Looking up update.FreeBSD.org mirrors... 3 mirrors found.
Fetching metadata signature for 12.2-RELEASE from update1.freebsd.org... done.
Fetching metadata index... done.
Fetching 1 metadata patches. done.
Applying metadata patches... done.
Fetching 1 metadata files... done.
.
.
7910....7920....7930....7940....7950....7960....7970....7980....7990....8000....8010....8020....8030....8040....8050....8060....8070....8080 done.
Applying patches... done.
Fetching 1842 files... ....10....20....30....40....50....60....70....80....90....100....110....120....1.
.
.
1740....1750....1760....1770....1780....1790....1800....1810....1820....1830....1840. 013ea614ef0dd0177d85c3699276a6f40a0071ded8c54d364bb342eacdedc306 has incorrect hash.
root@puroto:/home/timf #

Instead of just bailing out, it would be better to save the files with
incorrect hashes to a list, then loop around retrying them until no files
are left.
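
The idea above could be sketched roughly as follows. This is only an
illustration in POSIX sh, not the code from the branch linked below:
fetch_and_verify, fetch_with_retries and the pass limit are hypothetical
names, and the download plus hash check is stubbed out so the loop can be
read and run on its own. It assumes file names contain no whitespace, which
holds for the hex hashes shown in the log.

```shell
#!/bin/sh
# Sketch only. fetch_and_verify is a hypothetical stand-in for the real
# download and hash comparison; it must return 0 when the file named by
# $1 arrived intact and non-zero otherwise.
fetch_and_verify() {
    # Placeholder: always succeeds. Replace with real download logic.
    return 0
}

# fetch_with_retries MAX_PASSES FILE...
# Attempt every file, collect the ones that fail verification, and loop
# over just those until none are left or MAX_PASSES passes have been made.
# Prints the names still failing (if any) and returns non-zero in that case.
fetch_with_retries() {
    max_passes=$1
    shift
    pending="$*"
    pass=1
    while [ -n "$pending" ] && [ "$pass" -le "$max_passes" ]; do
        failed=""
        for f in $pending; do
            if ! fetch_and_verify "$f"; then
                echo "$f has incorrect hash, will retry." >&2
                failed="$failed $f"
            fi
        done
        # Only the files that failed this pass are tried again.
        pending="${failed# }"
        pass=$((pass + 1))
    done
    if [ -n "$pending" ]; then
        echo "$pending"
        return 1
    fi
    return 0
}
```

With the stub replaced by a real fetch, a call such as
"fetch_with_retries 5 hash1 hash2" would make at most five passes and
re-download only the files that failed the previous pass, rather than
restarting the whole run.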

We worked on a fix for this at:

https://github.com/freebsd/freebsd-src/compare/main...timfoster:freebsd-update-retry?diff=split

which seems to be doing the right thing here.

We notice that 'portsnap' also has similar logic, bailing out on the first
error, so this work could probably be extended.

A few things we haven't yet considered in this fix:

1. Does this deserve a new command line flag '--retries <number>', where <number> is the number of retries, or the keyword 'infinite' (or 'always') to always retry?

2. What should the default behaviour be?
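
If a '--retries' flag were added, validating its argument might look
something like the sketch below. The flag does not exist in freebsd-update
today; parse_retries is a hypothetical helper, and using -1 to mean
"no limit" is just one possible convention.

```shell
#!/bin/sh
# parse_retries VALUE
# Hypothetical helper for the proposed '--retries' option.
# Prints a normalized limit: the integer as given, or -1 for "no limit".
# Returns non-zero (with a message on stderr) for anything else.
parse_retries() {
    case "$1" in
        infinite|always)
            echo -1 ;;
        ''|*[!0-9]*)
            # Empty, or contains a character that is not a digit.
            echo "invalid --retries value: '$1'" >&2
            return 1 ;;
        *)
            echo "$1" ;;
    esac
}
```

The question of the default is left open here: it would simply be whatever
value is used when the flag is absent.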

We're not sure why phttpget is downloading corrupted files, whether that's
something in our network environment, ISP, or just heavy load upstream, but
regardless of that, it would be nice if freebsd-update was more resilient.