Summary: | freebsd-update: Downloading patches often fails repeatedly | ||
---|---|---|---|
Product: | Base System | Reporter: | Some Signup <somesignup> |
Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> |
Status: | Open --- | ||
Severity: | Affects Some People | CC: | bugs, cperciva, grahamperrin, julian, lwhsu, somesignup |
Priority: | --- | Keywords: | needs-qa |
Version: | Unspecified | Flags: | koobs: maintainer-feedback? (cperciva) |
Hardware: | Any | ||
OS: | Any |
Description
Some Signup
2021-12-13 21:16:31 UTC
@Reporter We understand things can get frustrating. In order to make it easier, faster and more likely for the community and our developers to help you and resolve issues, in future, (pretty) please elide as much frustration as humanly possible out of the issue report and detail, leaving only the information we can easily and quickly digest, reproduce, and ideally resolve.

With that in mind, the following would be useful:
- freebsd version (uname -a) one is updating from
- a full log of the update process including:
-- any additional errors or messages from /var/log/messages, dmesg, or otherwise,
-- the freebsd-update mirrors that your system resolver ends up selecting for those updates

It's not related to a version of FreeBSD. It's related solely to freebsd-update itself (and portsnap). The simple fact is freebsd-update is not a reliable tool.

I am willing to bet that if I wrote a dead simple shell script to iterate over a list of 7736 files and to fetch/curl/wget/whatever them one at a time, I would get 7736 (or close to that) files at the first attempt. And if I had a file mapping files to checksums, I could check each file and see if it matches, and if any one or more files do not match their checksums or were not fetched, I could re-fetch them. I would report that at the end of the big fetch process, and retry just the files with mismatched checksums.

But apparently that is too much for freebsd-update and portsnap to handle. Instead it wants to refetch thousands of patch files all over again. Seriously... this is utter madness. I do not for one second believe that, in the example shown below, 7129 patches were somehow broken the first time round. I just don't believe it. At some point you have to look at the individual tool and not the system as a whole.

Now, I'm aware that freebsd-update does a lot more than just fetch a load of files. But this is where it fails. I am not alone in seeing this.

I've used FreeBSD for over 20 years and I'm a big fan.
I've even bought FreeBSD goodies to directly support the project. But I have to be honest and vent my spleen here about this sub-standard tool. It really is not good enough.

I manage numerous machines remotely. They are not even in the same country. I am not confident in freebsd-update to update them to a new release. Patch level updates work, seemingly because not 'too many' files are involved. But release -> release updates are a whole different thing.

The fault is solely with freebsd-update (and its ugly cousin portsnap). I've read the reports, and I know that it is not me doing something wrong nor the way my systems are configured.

If you think it's related to any of my machines, it is not. In which case here is some useless information:

root@bsddev-12:/usr/home/unhappyuser # freebsd-update upgrade -r 12.3-RELEASE
Looking up update.FreeBSD.org mirrors... 2 mirrors found.
Fetching metadata signature for 12.2-RELEASE from update2.freebsd.org... done.
Fetching metadata index... done.
Fetching 1 metadata files... done.
Inspecting system... done.

The following components of FreeBSD seem to be installed:
kernel/generic src/src world/base world/doc world/lib32

The following components of FreeBSD do not seem to be installed:
kernel/generic-dbg world/base-dbg world/lib32-dbg

Does this look reasonable (y/n)? y

Fetching metadata signature for 12.3-RELEASE from update2.freebsd.org... done.
Fetching metadata index... done.
Fetching 1 metadata patches. done.
Applying metadata patches... done.
Fetching 1 metadata files... done.
Inspecting system... done.
Fetching files from 12.2-RELEASE for merging... done.
Preparing to download files... done.
Fetching 41704 patches.....10.. ... ....41690....41700.. done.
Fetching 7736 files... ....10....20 ... ....7730... failed.

Today this is trying to update from 12.2-RELEASE-p7 -> 12.3-RELEASE. But I've seen it going back for ages and ages.

This is connecting to:

69.11.15.204.in-addr.arpa is an alias for 69.64/27.11.15.204.in-addr.arpa.
69.64/27.11.15.204.in-addr.arpa domain name pointer update2.freebsd.org.

(In reply to Some Signup from comment #2)
> fault is solely with … and … portsnap
Given your knowledge of the deprecation plan for portsnap, please, have you begun using an alternative?
> mirrors … update2.freebsd.org
Wonder whether particular mirrors are contributory to difficulty.
> not even in the same country
In which country was the update2.freebsd.org example?

> Wonder whether particular mirrors are contributory to difficulty.
I don't think so. There are bug reports and posts in various forums from people all over the place. I doubt they are all hitting one and the same mirror.
But my point still holds: I do not believe that somehow I have downloaded about 7000 files or patches which are broken. I just don't believe it. The fault is with the tool.
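
A minimal sketch of the fetch, verify, and retry loop the reporter describes, written in Python for brevity. It is not how freebsd-update or portsnap actually work; the function names, the name-to-SHA-256 manifest, and the injectable `fetch` callable are all hypothetical and exist only to illustrate the idea of retrying just the files that failed or did not match their checksum.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(
    manifest: Dict[str, str],
    fetch: Callable[[str], Optional[bytes]],
    max_rounds: int = 3,
) -> Tuple[Dict[str, bytes], List[str]]:
    """Fetch every name in manifest, retrying only the ones that failed.

    manifest maps a file name to its expected SHA-256 hex digest
    (a hypothetical format, not the real freebsd-update index).
    fetch returns the file contents, or None if the download failed.
    Returns (verified files, names still missing after max_rounds).
    """
    good: Dict[str, bytes] = {}
    pending = list(manifest)
    for _ in range(max_rounds):
        if not pending:
            break
        retry: List[str] = []
        for name in pending:
            data = fetch(name)
            if data is not None and sha256_hex(data) == manifest[name]:
                good[name] = data      # keep it; never fetched again
            else:
                retry.append(name)     # missing or corrupt; try next round
        pending = retry
    return good, pending
```

In real use `fetch` would wrap an HTTP client, and verified files would be written to disk so that a later run could skip them; both are left out here to keep the sketch self-contained.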
@Reporter We've asked politely to be as constructive as possible so we can isolate the issue and resolve it for you and others. This is an issue tracker, not a discussion forum or place to vent frustrations. If the previous behaviour continues we'll close the issue without further consideration.

Moving forward (hopefully) from here...

Colin, are you able to provide suggestions for additional tests or for further isolating this issue, with respect to debugging options or similar?

Running with --debug would help. It would also be very useful to know something about the network. I've seen problems like this with broken "transparent" HTTP proxies.

I can run this with --debug. But there is still no comment on feedback about why it needs to download 7000 patches again. I simply do not believe that that many files are broken during a download.

(In reply to Some Signup from comment #7)
> … download … patches again. …
Subscribe to bug 257247. Then this bug 260399 might properly focus on failure to download.

My apologies for the delay in trying to update my machines. It's such a pain that I put it off until I can't put it off any longer.

If I run freebsd-update with --debug I get to see thousands of errors downloading files. I can post them all here, but I don't think it's necessary. A couple of sample but unrelated entries:

http://ipv4.aws.portsnap.freebsd.org/bp/8585c48bb921235425ee2865596c60b70aab551562ade66b65902e35b979a1f8-6cf7851b2ddd1d39ac9345fb4efe852f25da266ce85a6a3906f808c6287c81e6: 404 Error (ignored)
http://ipv4.aws.portsnap.freebsd.org/f/a9be29debdb0c5186913b752ad4145f5221bd8944a58abbca1de03084d2acadf.gz: 400 Error (ignored)

Some of these errors are wrong - the files exist, and in other cases the files are indeed missing.
A couple of examples:

curl http://ipv4.aws.portsnap.freebsd.org/bp/7f7a3e981fa345c60fc3dff57ed14af3b76c69d374cfaf50bbedcb9bbed2c013-85791eee88708d116494f0051e1e6930f227c9569f3d42d7658f8d01aee65a47
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.20.1</center>
</body>
</html>

curl --output thisoneisok.gz http://ipv4.aws.portsnap.freebsd.org/f/a9be29debdb0c5186913b752ad4145f5221bd8944a58abbca1de03084d2acadf.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   869  100   869    0     0   4456      0 --:--:-- --:--:-- --:--:--  4456

So there appear to be two errors here: genuinely missing files, and files which exist but freebsd-update treats as missing.

Interestingly, when I tried in a browser to get to 'http://ipv4.aws.portsnap.freebsd.org/bp' I see an error in my browser trying to get to 'http://ipv4.aws.portsnap.freebsd.org:8080/bp/'. I did not add this 8080 to the URL so I don't know where it's coming from (definitely not something locally, no proxies). It would appear to be a redirection coming from AWS.

Some quick notes -- sorry, juggling a bunch of stuff right now and won't have time to debug this properly until later:

1. Getting 404 errors in response to /bp/ requests is normal and harmless. Portsnap and FreeBSD Update use "opportunistic patching" -- if there's a patch available it will be used but if not they'll fall back to downloading the complete new file.
2. The redirect to :8080 is a Colin screwup -- I have varnish listening on port 80 and nginx serving up the bits from port 8080. Thanks for pointing this out; I'll dig into the nginx configuration to figure out what I did wrong there.
3. The one thing which concerns me here is the HTTP 400 errors; can you reproduce those with HTTP headers?

My apologies: please note, my last post referred to freebsd-update when on that occasion I was using portsnap.
But I often see the same end result of failed downloads when using freebsd-update - the symptoms are exactly the same.

-> 3. The one thing which concerns me here is the HTTP 400 errors; can you reproduce those with HTTP headers

I can't reproduce this using curl for an individual file.

Looking at my latest attempt at using portsnap I saw this number of errors:

714 400 Error (ignored)
7725 404 Error (ignored)
1 408 Error (ignored)
3 413 Error (ignored)

Ignoring the 404 errors thanks to Colin's explanation, the other errors seem to be important. I thought an HTTP 400 error means a bad request, or a request that the web server cannot handle?

I've done a tcpdump while running portsnap --debug auto on a machine. I can attach a part of it. In amongst the 'good' HTTP headers I do see some headers which appear to be 'bad'.

A typical 'good' request:

GET /bp/22c67764d7b53be769033f98ec4422a47cf71014b47dd893bfd8ba2817a37d38-ce207a35bd0fbdb64ac3326e0a2d81e0caf2eb8a9a8c650b11d052a6ea45c2b4 HTTP/1.1
Host: ipv4.aws.portsnap.freebsd.org
User-Agent: portsnap (auto, 13.0-RELEASE-p4)
Connection: Keep-Alive

A bad request:

GET /bp/5934ff241e93ae4916c3bd479e032aad2a7bb1e3a3e30c8a0cbe0851d2777ef5-8f7b4d39e928d7b3def5c19e2df1bf8986af290a1eac8545b2b134be76258b40 HTTP/1.aws.portsnap.freebsd.org
User-Agent: portsnap (auto, 13.0-RELEASE-p4)
Connection: Keep-Alive

The GET and Host parts are clobbered together and a spurious '.' is inserted. It looks like the 'HTTP/1.1' is being truncated or overwritten.
I can see this same type of error several times:

GET /bp/5e8bc5f6bc657c2e0ca384f617bedd73772f7173a06b64838528832802437619-b24ab558baef9f4a5570bedc48bd720a22e1cd45d1bd38a385730da9d0ddbddb HTTP/1.aws.portsnap.freebsd.org
GET /bp/c502e4c30cb2334244f0038351847344ac7452959a913e0972a011dbb8abc0ce-1a4ca6cb90dab8fae3f2adcec09d82a86d17a20955387dfff3cdf474bbf60c17 HTTP/1.aws.portsnap.freebsd.org
GET /bp/955b599e8e767b1a714dabdae3498c0b1e6aaa5f337e058061b8e60fd2a56f5c-4e47c6280360287ebc768026f7e60b8b4d97a937d97f7e593c2c689b46c0cb41 HTTP/1.aws.portsnap.freebsd.org
GET /bp/298b882b48d8a1d54e99ebfb46d52acb8b757c14d2bcdfcb61e508a015349155-e01fa8224d0d5ff8e4b01cc6894dde48f9997229908bd52f4096cb03d58f9f8e HTTP/1.aws.portsnap.freebsd.org
...

Don't worry about portsnap vs. freebsd-update -- it's basically the same code.

If HTTP requests are getting mangled that's a very clear sign as to where the problem is. A couple of questions about that:

1. Where are you running tcpdump, on the system in question or somewhere else (firewall etc)?
2. Can you run this under ktrace so I can see the data being passed from phttpget to the kernel?

It's looking like there's either a phttpget bug, a network stack bug, or you have a glitchy network card.

I think the most important bug here is the failure to restart utilizing the already downloaded patches and files.

I'm usually running it on a low latency 1Gbps link in a big city in Australia - but freebsd-update has been a big *slow* problem for me for a long time on different systems and different network paths. The additional latency and lack of local mirrors don't help - but all that would be bearable if it just wouldn't restart doing work that has already been done each time.

Really I don't think understanding exactly why a particular request failed or exactly what version people are upgrading to or from is the issue. The networks will never be perfect.
I've tried proxy/caching via nginx - but that's another layer of indirection which makes it harder to debug (e.g. it seems that 404s are common and normal - but some can be cached - others shouldn't be?). Simply doing a quick restart of nginx means any freebsd-updates pointed to it need to restart from scratch... which suggests to me that the way this works isn't really suitable for global distribution.

I'd love to try running my own freebsd-update server - but that seems a massive job - and really I can see fewer and fewer technically inclined people even getting to that level with the freebsd-update experience as it is. Perhaps it's fine in North America with latencies of a few 10s of milliseconds and everyone there barely cares about the odd restart from scratch?

I know freebsd-update involves a lot of cleverness in what it does - but I understand the frustrations of the OP. Starting a slow grind through 57000+ high latency HTTP requests again is more than a minor annoyance. It's not just an hour or two either: these update attempts for a few machines can end up taking days of an admin's time and thus get put off.

The only indication at the end is "failed". In some cases - such as running via a wrapper such as iocage - I don't think there's even a way to pass through the debug flag.

Is there a way to make freebsd-update resume from close to where it stopped?
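
For anyone triaging a similar capture, here is a rough Python sketch of the two checks done by hand earlier in this thread: tallying the "NNN Error (ignored)" lines from --debug output by status code, and picking out GET request lines that do not end in a valid HTTP/1.x version token. The function names are hypothetical, the debug-line pattern is inferred only from the two sample entries quoted above, and the paths in the usage example are shortened placeholders rather than real patch names.

```python
import re
from collections import Counter
from typing import Iterable, List

# A well-formed request line as seen in the 'good' capture: GET <path> HTTP/1.x
REQUEST_LINE = re.compile(r"^GET \S+ HTTP/1\.[01]$")

# Debug output lines of the form "<url>: 404 Error (ignored)"
# (format assumed from the samples quoted in this report).
IGNORED_ERROR = re.compile(r": (\d{3}) Error \(ignored\)\s*$")


def count_ignored_errors(lines: Iterable[str]) -> Counter:
    """Tally ignored HTTP errors in debug output by status code."""
    counts: Counter = Counter()
    for line in lines:
        match = IGNORED_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def malformed_request_lines(lines: Iterable[str]) -> List[str]:
    """Return GET lines whose version token is not HTTP/1.0 or HTTP/1.1."""
    bad: List[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("GET ") and not REQUEST_LINE.match(stripped):
            bad.append(stripped)
    return bad


if __name__ == "__main__":
    capture = [
        "GET /bp/aaaa-bbbb HTTP/1.1",
        "Host: ipv4.aws.portsnap.freebsd.org",
        "GET /bp/cccc-dddd HTTP/1.aws.portsnap.freebsd.org",
    ]
    for line in malformed_request_lines(capture):
        print("malformed:", line)
```

Non-404 counts and any malformed request lines are the interesting output; per the explanation above, 404s on /bp/ requests are expected.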