Bug 260399 - freebsd-update: Downloading patches often fails repeatedly
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: Unspecified
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Kubilay Kocak
URL:
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2021-12-13 21:16 UTC by Some Signup
Modified: 2022-06-20 12:06 UTC
CC List: 5 users

See Also:
koobs: maintainer-feedback? (cperciva)


Description Some Signup 2021-12-13 21:16:31 UTC
I've just about had it with freebsd-update (and portsnap for much the same reasons).

I want to update several machines to 12.3-RELEASE. Fine. I've done this sort of thing before, and it's always a total pain. Not because of the process, but because freebsd-update is a horrible terrible nasty piece of work.

Snippet 1:
Preparing to download files... done.
Fetching 1760 patches....
..1750....1760 done.

Snippet 2:
Applying patches... done.
Fetching 962 files... ....10.
....960. failed.

Well I've seen this before, so run it again.

Snippet 3:
Fetching 921 patches.....10....20
...920 done.


Question: why does this have to download 921 more patches? Has it not already downloaded all the patches? What new patches is it downloading?

Snippet 4:
Fetching 880 files...
..880 failed.

Repeat until I am lucky enough to hit the precise nanosecond when this seems to work, and it seems to fail often (just like portsnap).

I really don't understand the logic of downloading the same patches over and over again when this happens. Are you really trying to tell me that all the patches are in some way corrupted or broken? If yes, I will say prove it! Because I don't believe it. 

I have never had problems downloading files directly myself using simple http/ftp type clients on any FreeBSD machine. The problem is solely with the truly horrible freebsd-update and its horrible cousin portsnap.

These two things (I cannot with any integrity refer to them as 'programs') are the biggest most horrible and nasty aspects of FreeBSD. They frequently fail. Don't try to suggest doing something with portsnap.conf or freebsd-update.conf. It's not that. It's these two horrible fail-at-the-drop-of-a-hat programs which should not be in the base system at all.

I don't have a problem with patching systems. But these two tools do not belong in the FreeBSD toolbox. 

Anyone thinking of replying "they work for me", good for you. But they do frequently fail for people as is shown through bug reports and posts here and there.

Seriously, in 2021, how hard is it to download some files reliably?
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-12-13 23:24:15 UTC
@Reporter We understand things can get frustrating. In order to make it easier, faster and more likely for the community and our developers to help you and resolve issues, in future, (pretty) please elide as much frustration as humanly possible out of the issue report and detail, leaving only the information we can easily and quickly digest, reproduce, and ideally resolve.

With that in mind, the following would be useful:

- FreeBSD version (uname -a) one is updating from
- a full log of the update process including:
-- any additional errors or messages from /var/log/messages, dmesg, or otherwise,
-- the freebsd-update mirrors that your system resolver ends up selecting for those updates
Comment 2 Some Signup 2021-12-14 00:51:54 UTC
It's not related to a version of FreeBSD. It's related solely to freebsd-update itself (and portsnap).

The simple fact is freebsd-update is not a reliable tool. I am willing to bet that if I wrote a dead simple shell script to iterate over a list of 7736 files and to fetch/curl/wget/whatever them one at a time I would get 7736 (or close to that) files at the first attempt. And if I had a file mapping files to checksums, I could check each file and see if it matches, and if any one or more files do not match their checksums or were not fetched, I could re-fetch them. I would report that at the end of the big fetch process, and retry just the files with mismatched checksums. But apparently that is too much for freebsd-update and portsnap to handle. Instead it wants to refetch thousands of patch files all over again. Seriously... this is utter madness. I do not for one second believe that, in the example shown below, 7129 patches were somehow broken the first time round. I just don't believe it. At some point you have to look at the individual tool and not the system as a whole.
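
A minimal sketch of the fetch, verify, and retry-only-the-failures loop described above. It is not how freebsd-update works internally; the function name `fetch_verified`, the `fetch` callable, and the use of SHA-256 as the checksum are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, Optional


def fetch_verified(
    expected: Dict[str, str],
    fetch: Callable[[str], Optional[bytes]],
    max_rounds: int = 3,
) -> Dict[str, bytes]:
    """Fetch every name in `expected`, re-requesting only the failures.

    `expected` maps a file name to its SHA-256 hex digest (hypothetical
    layout). `fetch` returns the file's bytes, or None if the download failed.
    """
    good: Dict[str, bytes] = {}
    todo = list(expected)
    for _ in range(max_rounds):
        failed = []
        for name in todo:
            data = fetch(name)
            if data is not None and hashlib.sha256(data).hexdigest() == expected[name]:
                good[name] = data      # verified; never requested again
            else:
                failed.append(name)    # missing or corrupt; retry next round
        todo = failed
        if not todo:
            break
    return good
```

Anything still absent from the returned dict after `max_rounds` is what would be reported as failed at the end of the run.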

Now, I'm aware that freebsd-update does a lot more than just fetch a load of files. But this is where it fails. I am not alone in seeing this. 

I've used FreeBSD for over 20 years and I'm a big fan. I've even bought FreeBSD goodies to directly support the project. But I have to be honest and vent my spleen here about this sub-standard tool. It really is not good enough. 

I manage numerous machines remotely. They are not even in the same country. I am not confident in freebsd-update to update them to a new release. Patch level updates work, seemingly because not 'too many' files are involved. But release -> release updates are a whole different thing.

The fault is solely with freebsd-update (and its ugly cousin portsnap). I've read the reports, and I know that it is not me doing something wrong nor the way my systems are configured.

If you think it's related to any of my machines, it is not. In which case here is some useless information :

root@bsddev-12:/usr/home/unhappyuser # freebsd-update upgrade -r 12.3-RELEASE
Looking up update.FreeBSD.org mirrors... 2 mirrors found.
Fetching metadata signature for 12.2-RELEASE from update2.freebsd.org... done.
Fetching metadata index... done.
Fetching 1 metadata files... done.
Inspecting system... done.

The following components of FreeBSD seem to be installed:
kernel/generic src/src world/base world/doc world/lib32

The following components of FreeBSD do not seem to be installed:
kernel/generic-dbg world/base-dbg world/lib32-dbg

Does this look reasonable (y/n)? y

Fetching metadata signature for 12.3-RELEASE from update2.freebsd.org... done.
Fetching metadata index... done.
Fetching 1 metadata patches. done.
Applying metadata patches... done.
Fetching 1 metadata files... done.
Inspecting system... done.
Fetching files from 12.2-RELEASE for merging... done.
Preparing to download files... done.
Fetching 41704 patches.....10..
...
....41690....41700.. done.


Fetching 7736 files... ....10....20
...
....7730... failed.


Today this is trying to update from 12.2-RELEASE-p7 -> 12.3-RELEASE. But I've seen it going back for ages and ages.

This is connecting to: 69.11.15.204.in-addr.arpa is an alias for 69.64/27.11.15.204.in-addr.arpa.
69.64/27.11.15.204.in-addr.arpa domain name pointer update2.freebsd.org.
Comment 3 Graham Perrin freebsd_committer freebsd_triage 2021-12-14 07:26:40 UTC
(In reply to Some Signup from comment #2)

> fault is solely with … and … portsnap

Given your knowledge of the deprecation plan for portsnap, please, have you begun using an alternative?

> mirrors … update2.freebsd.org

Wonder whether particular mirrors are contributory to difficulty. 

> not even in the same country

In which country was the update2.freebsd.org example?
Comment 4 Some Signup 2021-12-14 15:34:18 UTC
> Wonder whether particular mirrors are contributory to difficulty.

I don't think so. There are bug reports and posts in various forums from people all over the place. I doubt they are all hitting the one and same mirror.

But my point still holds: I do not believe that somehow I have downloaded about 7000 files or patches which are broken. I just don't believe it. The fault is with the tool.
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2021-12-14 22:30:17 UTC
@Reporter We've asked politely to be as constructive as possible so we can isolate the issue and resolve it for you and others.

This is an issue tracker, not a discussion forum or place to vent frustrations. If the previous behaviour continues we'll close the issue without further consideration.

Moving forward (hopefully) from here...

Colin, are you able to provide suggestions for additional tests or for further isolating this issue, with respect to debugging options or similar?
Comment 6 Colin Percival freebsd_committer freebsd_triage 2021-12-14 23:25:59 UTC
Running with --debug would help.

It would also be very useful to know something about the network.  I've seen problems like this with broken "transparent" HTTP proxies.
Comment 7 Some Signup 2021-12-15 11:30:09 UTC
I can run this with --debug.

But there is still no comment or feedback on why it needs to download 7000 patches again. I simply do not believe that that many files are broken during a download.
Comment 8 Graham Perrin freebsd_committer freebsd_triage 2021-12-19 16:37:03 UTC
(In reply to Some Signup from comment #7)

> … download … patches again. …

Subscribe to bug 257247. 

Then this bug 260399 might properly focus on failure to download.
Comment 9 Some Signup 2022-01-06 23:07:54 UTC
My apologies for the delay in trying to update my machines. It's such a pain that I put it off until I can't put it off any longer.


If I run freebsd-update with --debug I get to see thousands of errors downloading files. I can post them all here, but I don't think it's necessary. A couple of sample but unrelated entries:

http://ipv4.aws.portsnap.freebsd.org/bp/8585c48bb921235425ee2865596c60b70aab551562ade66b65902e35b979a1f8-6cf7851b2ddd1d39ac9345fb4efe852f25da266ce85a6a3906f808c6287c81e6: 404 Error (ignored)

http://ipv4.aws.portsnap.freebsd.org/f/a9be29debdb0c5186913b752ad4145f5221bd8944a58abbca1de03084d2acadf.gz: 400 Error (ignored)


Some of these errors are wrong - the files exist, and in other cases the files are indeed missing. A couple of examples:

curl http://ipv4.aws.portsnap.freebsd.org/bp/7f7a3e981fa345c60fc3dff57ed14af3b76c69d374cfaf50bbedcb9bbed2c013-85791eee88708d116494f0051e1e6930f227c9569f3d42d7658f8d01aee65a47
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.20.1</center>
</body>
</html>


curl --output thisoneisok.gz http://ipv4.aws.portsnap.freebsd.org/f/a9be29debdb0c5186913b752ad4145f5221bd8944a58abbca1de03084d2acadf.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   869  100   869    0     0   4456      0 --:--:-- --:--:-- --:--:--  4456


So there appear to be two errors here: genuinely missing files, and files which exist but freebsd-update treats as missing.

Interestingly, when I tried in a browser to get to 'http://ipv4.aws.portsnap.freebsd.org/bp' I see an error in my browser trying to get to 'http://ipv4.aws.portsnap.freebsd.org:8080/bp/'. I did not add this 8080 to the URL so I don't know where it's coming from (definitely not something locally, no proxies). It would appear to be a redirection coming from AWS.
Comment 10 Colin Percival freebsd_committer freebsd_triage 2022-01-07 02:49:27 UTC
Some quick notes -- sorry, juggling a bunch of stuff right now and won't have time to debug this properly until later:

1. Getting 404 errors in response to /bp/ requests is normal and harmless.  Portsnap and FreeBSD Update use "opportunistic patching" -- if there's a patch available it will be used but if not they'll fall back to downloading the complete new file.

2. The redirect to :8080 is a Colin screwup -- I have varnish listening on port 80 and nginx serving up the bits from port 8080.  Thanks for pointing this out; I'll dig into the nginx configuration to figure out what I did wrong there.

3. The one thing which concerns me here is the HTTP 400 errors; can you reproduce those with HTTP headers?
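
The "opportunistic patching" in point 1 can be sketched roughly as follows. This is an illustration only, not the actual phttpget/freebsd-update code: the function name and the `http_get` callable are hypothetical, and it assumes the two hashes in a /bp/ path are the old and new versions of a file, with /f/<hash>.gz holding the complete new file (the path shapes are taken from the URLs in comment 9).

```python
from typing import Callable, Optional, Tuple


def fetch_opportunistic(
    old_hash: str,
    new_hash: str,
    http_get: Callable[[str], Tuple[int, bytes]],
) -> Tuple[str, Optional[bytes]]:
    """Try the binary patch first; fall back to the complete file.

    `http_get` takes a path and returns (status code, body).
    """
    status, body = http_get(f"/bp/{old_hash}-{new_hash}")
    if status == 200:
        return "patch", body
    # A miss here is expected: it only means no patch exists for this pair.
    status, body = http_get(f"/f/{new_hash}.gz")
    if status == 200:
        return "file", body
    return "failed", None
```

Under this model a 404 on /bp/ is harmless because the full file is requested next, whereas an error on /f/ is a real failure.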
Comment 11 Some Signup 2022-01-07 13:15:56 UTC
My apologies: please note, my last post referred to freebsd-update when on that occasion I was using portsnap. But I often see the same end result of failed downloads when using freebsd-update - the symptoms are exactly the same. 

-> 3. The one thing which concerns me here is the HTTP 400 errors; can you reproduce those with HTTP headers

I can't reproduce this using curl for an individual file.

Looking at my latest attempt at using portsnap I saw this number of errors:
 714  400 Error (ignored)
7725  404 Error (ignored)
   1  408 Error (ignored)
   3  413 Error (ignored)

Ignoring the 404 errors thanks to Colin's explanation, the other errors seem to be important. I thought an HTTP 400 error means a bad request, or a request that the web server cannot handle?
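
A tally like the one above can be produced from saved --debug output with a few lines of Python. This is a sketch that assumes each error line ends in "<URL>: <code> Error (ignored)", as in the samples in comment 9; the function name `tally_errors` is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the tail of lines such as ".../f/<hash>.gz: 400 Error (ignored)".
ERROR_LINE = re.compile(r": (\d{3}) Error \(ignored\)\s*$")


def tally_errors(lines: Iterable[str]) -> Counter:
    """Count ignored HTTP errors per status code."""
    counts: Counter = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to `tally_errors` gives the per-code counts; lines that do not match the pattern are skipped.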

I've done a tcpdump while running portsnap --debug auto on a machine. I can attach a part of it. In amongst the 'good' http headers I do see some headers which appear to be 'bad'.

A typical 'good' request:
GET /bp/22c67764d7b53be769033f98ec4422a47cf71014b47dd893bfd8ba2817a37d38-ce207a35bd0fbdb64ac3326e0a2d81e0caf2eb8a9a8c650b11d052a6ea45c2b4 HTTP/1.1
Host: ipv4.aws.portsnap.freebsd.org
User-Agent: portsnap (auto, 13.0-RELEASE-p4)
Connection: Keep-Alive

A bad request:
GET /bp/5934ff241e93ae4916c3bd479e032aad2a7bb1e3a3e30c8a0cbe0851d2777ef5-8f7b4d39e928d7b3def5c19e2df1bf8986af290a1eac8545b2b134be76258b40 HTTP/1.aws.portsnap.freebsd.org
User-Agent: portsnap (auto, 13.0-RELEASE-p4)
Connection: Keep-Alive

The GET and Host parts are clobbered together and a spurious '.' is inserted. It looks like the 'HTTP/1.1' is being truncated or overwritten. I can see this same type of error several times:

GET /bp/5e8bc5f6bc657c2e0ca384f617bedd73772f7173a06b64838528832802437619-b24ab558baef9f4a5570bedc48bd720a22e1cd45d1bd38a385730da9d0ddbddb HTTP/1.aws.portsnap.freebsd.org
GET /bp/c502e4c30cb2334244f0038351847344ac7452959a913e0972a011dbb8abc0ce-1a4ca6cb90dab8fae3f2adcec09d82a86d17a20955387dfff3cdf474bbf60c17 HTTP/1.aws.portsnap.freebsd.org
GET /bp/955b599e8e767b1a714dabdae3498c0b1e6aaa5f337e058061b8e60fd2a56f5c-4e47c6280360287ebc768026f7e60b8b4d97a937d97f7e593c2c689b46c0cb41 HTTP/1.aws.portsnap.freebsd.org
GET /bp/298b882b48d8a1d54e99ebfb46d52acb8b757c14d2bcdfcb61e508a015349155-e01fa8224d0d5ff8e4b01cc6894dde48f9997229908bd52f4096cb03d58f9f8e HTTP/1.aws.portsnap.freebsd.org
...
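
To pick the clobbered requests out of a longer capture, something like the following sketch could be run over the request text extracted from the tcpdump output. It assumes one header line per input line and treats any GET line that does not end in HTTP/1.0 or HTTP/1.1 as mangled; the function name is hypothetical.

```python
import re
from typing import Iterable, List

# A well-formed request line: method, path, then the protocol version.
REQUEST_LINE = re.compile(r"^GET \S+ HTTP/1\.[01]$")


def mangled_requests(lines: Iterable[str]) -> List[str]:
    """Return the GET lines whose protocol version is not intact."""
    bad = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("GET ") and not REQUEST_LINE.match(line):
            bad.append(line)
    return bad
```

A line ending in "HTTP/1.aws.portsnap.freebsd.org" is reported, while the 'good' request shown earlier is not.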
Comment 12 Colin Percival freebsd_committer freebsd_triage 2022-01-07 19:17:20 UTC
Don't worry about portsnap vs. freebsd-update -- it's basically the same code.

If HTTP requests are getting mangled that's a very clear sign as to where the problem is.  A couple questions about that:

1. Where are you running tcpdump, on the system in question or somewhere else (firewall etc)?

2. Can you run this under ktrace so I can see the data being passed from phttpget to the kernel?

It's looking like there's either a phttpget bug, a network stack bug, or you have a glitchy network card.
Comment 13 Julian Noble 2022-06-20 12:06:54 UTC
I think the most important bug here is the failure to restart utilizing the already downloaded patches and files.

I'm usually running it on a low latency 1Gbps link in a big city in Australia - but freebsd-update has been a big *slow* problem for me for a long time on different systems and different network paths.
The additional latency and no local mirrors doesn't help - but all that would be bearable if it just wouldn't restart doing work that has already been done each time.

Really I don't think understanding exactly why a particular request failed or exactly what version people are upgrading to or from is the issue. The networks will never be perfect.

I've tried proxy/caching via nginx - but that's another layer of indirection which makes it harder to debug (e.g. it seems that 404s are common and normal - but some can be cached - others shouldn't be?)
Simply doing a quick restart of nginx means any freebsd-updates pointed to it need to restart from scratch... which suggests to me that the way this works isn't really suitable for global distribution.

I'd love to try running my own freebsd-update server - but that seems a massive job - and really I can see fewer and fewer technically inclined people even getting to that level with the freebsd-update experience as it is.

Perhaps it's fine in North America with latencies of a few 10s of milliseconds and everyone there barely cares about the odd restart from scratch?

I know freebsd-update involves a lot of cleverness in what it does - but I understand the frustrations of the OP.
Starting a slow grind through 57000+ high latency http requests again is more than a minor annoyance. It's not just an hour or two either... these update attempts for a few machines can end up taking days of an admin's time and thus get put off. The only indication at the end is "failed".
In some cases - such as running via a wrapper such as iocage - I don't think there's even a way to pass through the debug flag.


Is there a way to make freebsd-update resume from close to where it stopped?
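
Purely as an illustration of what "resume" could mean here (nothing in this thread says freebsd-update offers it): a restart could first check which files are already on disk with the right checksum and request only the rest. The cache layout, the SHA-256 checksum, and the function name `still_needed` are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def still_needed(expected: Dict[str, str], cache_dir: Path) -> List[str]:
    """List the files that are missing from `cache_dir` or fail verification.

    `expected` maps a file name to its SHA-256 hex digest (hypothetical layout).
    """
    needed = []
    for name, digest in expected.items():
        path = cache_dir / name
        if not path.is_file():
            needed.append(name)            # never downloaded
        elif hashlib.sha256(path.read_bytes()).hexdigest() != digest:
            needed.append(name)            # truncated or corrupt
    return needed
```

Only the names returned would go back into the download queue after an interrupted run.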