Bug 223065 - INDEX file truncated from time to time
Summary: INDEX file truncated from time to time
Status: Closed FIXED
Alias: None
Product: Services
Classification: Unclassified
Component: FTP/WWW Sites & Mirrors (show other bugs)
Version: unspecified
Hardware: Any Any
: --- Affects Many People
Assignee: Philip Paeps
URL:
Keywords:
Depends on:
Blocks: 278947 258215
  Show dependency treegraph
 
Reported: 2017-10-17 13:36 UTC by Trond Endrestøl
Modified: 2024-05-13 08:28 UTC (History)
14 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Trond Endrestøl 2017-10-17 13:36:35 UTC
From time to time I see messages such as this one when running "make -C /usr/ports fetchindex":

/usr/ports/INDEX-11.bz2                        31% of 2093 kB  396 kBps 00m02s
fetch: /usr/ports/INDEX-11.bz2 appears to be truncated: 673899/2143836 bytes
*** Error code 1

I thought the INDEX files were updated using a temporary file which is later renamed. Or maybe the scheduling is wrong so that two or more instances interfere with each other.

Also, I was unsure of which category to pick. Individual Port? Nah. Package Infrastructure? Maybe. Ports Framework? Maybe.
Comment 1 Mathieu Arnold freebsd_committer freebsd_triage 2017-10-17 16:34:02 UTC
This is on www.freebsd.org, outside of our perview.
Comment 2 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-05 14:16:54 UTC
I can confirm the bug. I see it sometimes for the documentation project during the build of the ports HTML pages:

doc/en_US.ISO8859-1/htdocs/ports$ make
/usr/bin/fetch -am -o INDEX.bz2 https://www.FreeBSD.org/ports/INDEX-11.bz2
INDEX.bz2                                      34% of 2156 kB 1009 kBps 00m01s
fetch: INDEX.bz2 appears to be truncated: 766037/2208356 bytes
*** Error code 1
Comment 3 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-05 14:21:42 UTC
We are using fetch -a

     -a, --retry
                 Automatically retry the transfer upon soft failures.

so it is unlikely a network issue between the client and the server.

I guess this is an issue with the front-end varnish daemon on www.freebsd.org. Under some conditions (low memory, high memory pressure) the server returns truncated results. Sad.
Comment 4 Trond Endrestøl 2018-01-05 15:20:03 UTC
Running "make -C /usr/ports fetchindex" repeatly shows fetch(1) coming closer and closer to the actual file length. That led me to conclude that the INDEX file is updated in-place. Maybe that's simply false.

Python flavors and DEFAULT_VERSIONS=blah-blah-blah and EMACS_PORT_NAME=emacs-nox11, both in /etc/make.conf, has forced me to run "make -C /usr/ports index" instead.
Comment 5 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-05 15:37:08 UTC
You can see the actual file size, last modification, age of the object in the cache with:

$ lynx -head -source https://www.FreeBSD.org/ports/INDEX-12.bz2
HTTP/1.1 200 OK
Date: Fri, 05 Jan 2018 15:32:27 GMT
Content-Type: application/octet-stream
Content-Length: 2209757
Connection: close
Server: nginx/1.12.2
Last-Modified: Fri, 05 Jan 2018 15:14:13 GMT
ETag: "5a4f9645-21b7dd"
X-Varnish: 216949192 233144374
Age: 74
Via: 1.1 wfe0.ysv.FreeBSD.org
X-Cache: HIT

I'm pretty sure that an update of the INDEX files is atomar. You will get an old or a new file, but never something in between, or an growing file while downloading.
Comment 6 Trond Endrestøl 2018-01-05 15:45:22 UTC
(In reply to Wolfram Schneider from comment #5)
Should we close this PR, or should it remain open?
Comment 7 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-05 17:40:35 UTC
(In reply to Trond.Endrestol from comment #6)

The PR should be remain open until we are sure it is fixed.
Comment 8 Chris Hutchinson 2018-01-05 18:41:57 UTC
(In reply to Wolfram Schneider from comment #3)
> We are using fetch -a
> 
>      -a, --retry
>                  Automatically retry the transfer upon soft failures.
> 
> so it is unlikely a network issue between the client and the server.
> 
> I guess this is an issue with the front-end varnish daemon on
> www.freebsd.org. Under some conditions (low memory, high memory pressure)
> the server returns truncated results. Sad.

I've run into trouble with this on the freebsd site in other
areas (man page archives, for example). I think unless they
either fix varnish, or freebsd uses something else; this
will *always* be a problem.
Till then, this pr(1) is likely to grow quite large.

--Chris
Comment 9 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-07 17:48:44 UTC
(In reply to Chris Hutchinson from comment #8)

FWIW, I remember this bug exists for at least 3 years, and is not related to a specific varnish or FreeBSD version. I don't believe we are the only one who have this problem.
Comment 10 Wolfram Schneider freebsd_committer freebsd_triage 2018-01-07 17:51:34 UTC
As far as I understand varnish:

the front-end has a fast connection to the backend. The backend delivers the data fast (>1GBit/s) Varnish needs to allocate memory to cache the data from the backend. If the malloc() call fails, varnish may return truncated results.

If we know the exact file size in advance, then your client will report an error message due the mismatch between the content-length and the received bytes. For dynamic content, you will get rotten results - that's horrible.

I do not understand why varnish fails to cache a 2MB "big" file. The machine has several GB RAM reserved for varnish.

I would expect that in case of a malloc() failure varnish fails back to streaming, and trash the dirty cache. Or break the TCP connection, and do a reset so the client get an error message on the lower level and can handle it (retry).
Comment 11 Chris Hutchinson 2018-01-08 01:34:27 UTC
I quite agree, that varnish is probably not well suited
for dynamic content.
In the context of this pr(1); it appears that varnish
lacks effective check summing.
IOW it isn't comparing the file(s) after creation. So
it blindly serves its cached version w/o first having
determined that the version it created was, in fact
correct -- EBADFD?

This is what it looks like from the outside in,
anyway.

--Chris
Comment 12 tech-lists 2018-07-18 11:57:17 UTC
Will this get fixed? 

It happens all the time now. Maybe something in the handbook/man ports stating that it doesn't work?
Comment 13 Wolfram Schneider freebsd_committer freebsd_triage 2018-08-02 16:18:03 UTC
(In reply to tech-lists from comment #12)

Well, we should fix the mess, not document it.
Comment 14 M. Voorhis 2018-08-17 10:51:09 UTC
(In reply to tech-lists from comment #12)

If this isn't going to be repaired, people that track the ports with svn should be told in documentation that they can build their own indexes with, for exmple on a FreeBSD-11 machine:

cd /usr/ports
/usr/local/bin/svn update
/usr/bin/make index (i.e., to build INDEX-11)
/usr/bin/make index INDEXFILE=INDEX-10
/usr/bin/make index INDEXFILE=INDEX-12

This is substantially more time consuming than just fetching the index, but will get you a reliable index file every time.

It's unfortunate that this is necessary because the building of the index has already been done elsewhere, but since a good index file isn't reliably available I've been building indexes like this for a long time.
Comment 15 Emanuel Haupt freebsd_committer freebsd_triage 2021-04-07 20:32:24 UTC
I've mitigated the issue with a simple wrapper:

until make -C /usr/ports fetchindex; do sleep 0.1; done

It's not perfect but it works.
Comment 16 Tobias Kortkamp freebsd_committer freebsd_triage 2021-04-21 05:20:17 UTC
To be honest I don't understand how this can still be a thing.  It
affects everybody.  Where is this implemented and can we take a
look at it?
Comment 17 Philip Paeps freebsd_committer freebsd_triage 2021-04-21 06:00:11 UTC
This sounds like it might be something related to the wfe* front-end machines.  I don't know how those work but I can take a look.
Comment 18 Emanuel Haupt freebsd_committer freebsd_triage 2021-04-21 06:25:06 UTC
Maybe the INDEX file could be added to a new git repository in order to not clutter the commit history of ports. Then the 'fetchindex' target could be adjusted.
Comment 19 Tobias Kortkamp freebsd_committer freebsd_triage 2021-06-04 09:18:25 UTC
(In reply to Philip Paeps from comment #17)
Did you figure out what the problem was?
Comment 20 Philip Paeps freebsd_committer freebsd_triage 2021-06-05 06:55:14 UTC
I did not.  Thanks for the nudge!  This slipped off my radar.
Comment 21 Wolfram Schneider freebsd_committer freebsd_triage 2021-06-06 07:42:20 UTC
I still can reproduce the error

$ for i in $(seq 0 999);do printf "."; /usr/bin/fetch -q -a -o INDEX.bz2  https://www.FreeBSD.org/ports/INDEX-12.bz2;done

...................................................................................................................................................................................................................................................................................................................................................fetch: INDEX.bz2 appears to be truncated: 2157442/2383323 bytes
.......................................................................................................fetch: INDEX.bz2 appears to be truncated: 839819/2383323 bytes
.fetch: INDEX.bz2 appears to be truncated: 1758337/2383323 bytes
..................................................................................................................................................................................................................................................................................................................................................................................................fetch: INDEX.bz2 appears to be truncated: 1571602/2383323 bytes
......................................fetch: INDEX.bz2 appears to be truncated: 2118718/2383323 bytes
...................................................................
[...]
Comment 22 Emanuel Haupt freebsd_committer freebsd_triage 2021-06-12 08:48:32 UTC
In case anyone is interested, I am mirroring the INDEX files for major versions 11, 12, 13, 14 and update them every 10 minutes. Feel free to use by adding the following line to /etc/make.conf:

MASTER_SITE_INDEX=http://snow.critical.ch/fbsd-ports-index/
Comment 23 Philip Paeps freebsd_committer freebsd_triage 2021-09-02 15:19:05 UTC
So this keeps coming up.  I'll try to find out where is file is generated.  Assign it to me so it goes a little higher on my list.
Comment 24 Matthias Andree freebsd_committer freebsd_triage 2021-09-02 21:00:49 UTC
Not sure if it helps... but apparently the size I get on a quick succession of retries increases each time, and fetch restarts from scratch. (where, by contrast, wget -c would pick up where it left off).  The final fetch was successful.  

Very wild guessing and stupid question:
Is this a live view of a cache or reverse proxy or load balancer slowly downloading a file and serving the part of it that it already has, without atomic replacement of a prior copy?


/usr/ports/INDEX-13.bz2                        30% of 2353 kB 1215 kBps    02s
fetch: /usr/ports/INDEX-13.bz2 appears to be truncated: 746972/2409975 bytes
*** Error code 1

Stop.
make: stopped in /usr/ports
/usr/bin/env  fetch -am -o /usr/ports/INDEX-13.bz2 https://www.FreeBSD.org/ports/INDEX-13.bz2
/usr/ports/INDEX-13.bz2                        49% of 2353 kB 1583 kBps    01s
fetch: /usr/ports/INDEX-13.bz2 appears to be truncated: 1181373/2409975 bytes
*** Error code 1

Stop.
make: stopped in /usr/ports
/usr/bin/env  fetch -am -o /usr/ports/INDEX-13.bz2 https://www.FreeBSD.org/ports/INDEX-13.bz2
/usr/ports/INDEX-13.bz2                        62% of 2353 kB 1838 kBps    00s
fetch: /usr/ports/INDEX-13.bz2 appears to be truncated: 1503200/2409975 bytes
*** Error code 1

Stop.
make: stopped in /usr/ports
/usr/bin/env  fetch -am -o /usr/ports/INDEX-13.bz2 https://www.FreeBSD.org/ports/INDEX-13.bz2
/usr/ports/INDEX-13.bz2                        76% of 2353 kB 2130 kBps    01s
fetch: /usr/ports/INDEX-13.bz2 appears to be truncated: 1851401/2409975 bytes
*** Error code 1

Stop.
make: stopped in /usr/ports
/usr/bin/env  fetch -am -o /usr/ports/INDEX-13.bz2 https://www.FreeBSD.org/ports/INDEX-13.bz2
/usr/ports/INDEX-13.bz2                        90% of 2353 kB 2418 kBps    01s
fetch: /usr/ports/INDEX-13.bz2 appears to be truncated: 2179716/2409975 bytes
*** Error code 1

Stop.
make: stopped in /usr/ports
/usr/bin/env  fetch -am -o /usr/ports/INDEX-13.bz2 https://www.FreeBSD.org/ports/INDEX-13.bz2
/usr/ports/INDEX-13.bz2                               2353 kB 2482 kBps    01s
Comment 25 Philip Paeps freebsd_committer freebsd_triage 2021-09-06 01:52:45 UTC
It does look like an atomicity problem, but I'm not sure where it occurs.

I suspect this rsync on bit0:

nice lockf -s -t 0 /home/bitmir/.index.lock \
        /usr/local/bin/rsync -vaH --delete --time-limit=9 \
        rsync://portsindexbuild.nyi.FreeBSD.org/ports-index/ \
        /home/bit/ports-index/ >> /var/log/sync-ports-index.log 2>&1

Investigating ...
Comment 26 Craig Leres freebsd_committer freebsd_triage 2021-09-06 02:17:18 UTC
(In reply to Philip Paeps from comment #25)
Suggestion: Use rsync's --bwlimit=RATE flag to make the rsync take longer so you have more time to see what's happening?
Comment 27 Philip Paeps freebsd_committer freebsd_triage 2021-10-28 02:21:04 UTC
clusteradm made several improvements to the web frontend infrastructure yesterday.  We believe that these finally fixed this problem.  I'll leave this bug open for a couple more days to let others confirm (or deny) that this is the case.
Comment 28 Danilo G. Baio freebsd_committer freebsd_triage 2021-10-30 13:13:13 UTC
(In reply to Philip Paeps from comment #27)

I can confirm it's fixed, the last three days without any issue.

Thank you!
Comment 29 Philip Paeps freebsd_committer freebsd_triage 2021-10-31 04:25:10 UTC
I'll mark this as fixed now.  Sorry this took so long to resolve.
Comment 30 Craig Leres freebsd_committer freebsd_triage 2021-10-31 17:40:57 UTC
The last time I saw this happen was on October 23rd; thanks for the fix!