Bug 202637

Summary: www/lighttpd: 1.4.36 has header corruption, and breaks under load
Product: Ports & Packages
Component: Individual Port(s)
Version: Latest
Hardware: Any
OS: Any
Status: Closed FIXED
Severity: Affects Only Me
Priority: ---
Reporter: Peter Wemm <peter>
Assignee: Guido Falsi <madpilot>
CC: delphij, madpilot, miwi, pkubaj, zi
Keywords: patch
Flags: koobs: merge-quarterly+

Attachments:
update to 1.4.37 (flags: none)

Description Peter Wemm freebsd_committer freebsd_triage 2015-08-25 03:47:13 UTC
This is a preliminary report.  The 1.4.36 update badly breaks freebsd-update.  It appears that lighttpd is decoding HTTP requests incorrectly.

2015-08-25 03:13:47: (request.c.712) invalid character in key GET /10.2-RELEASE/amd64/f/05a1e9b7b0583d98cdf1d5b6020f2f25446002a8efe0417395b75757a1ddefe5.gz HTTP/1.1\r\nHost: update2.freebsd.org\r\nUs  0 -> 400 
2015-08-25 03:13:47: (request.c.715) request-header:\nGET /10.2-RELEASE/amd64/f/05a1e9b7b0583d98cdf1d5b6020f2f25446002a8efe0417395b75757a1ddefe5.gz HTTP/1.1\r\nHost: update2.freebsd.org\r\nUs 

This manifests in the freebsd-update / portsnap clients like this:
http://update2.freebsd.org/10.2-RELEASE/amd64/f/0c301f89e862e5165519e7c65dccffbb22b3b8c5ef5db7267f98dc04812feb4d.gz: 200 OK
http://update2.freebsd.org/10.2-RELEASE/amd64/f/0c302f734e8bc0df8e4a1c26f98e7c1fa1fa837858cc24769908a0ab46b0e313.gz: 400 Error (ignored)
http://update2.freebsd.org/10.2-RELEASE/amd64/f/41065db0a842bfe35bc8f79877b4c5c8077bab9ccd34a4451b09b3cf6e079e64.gz: 200 OK
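When the client output runs to tens of thousands of lines, a tally of status codes makes the failure rate easier to see. The following is a minimal sketch, assuming every line has the "URL: CODE REASON" shape shown above; the function name tally_statuses is hypothetical and not part of phttpget or freebsd-update.

```python
from collections import Counter


def tally_statuses(lines):
    """Count status codes in client output lines shaped like 'URL: CODE REASON'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last ': ' so the colon in 'http://' is not taken as the separator.
        _url, sep, status = line.rpartition(": ")
        if not sep:
            continue
        parts = status.split()
        if parts and parts[0].isdigit():
            counts[parts[0]] += 1
    return counts


# The three client lines quoted in this report.
sample = [
    "http://update2.freebsd.org/10.2-RELEASE/amd64/f/0c301f89e862e5165519e7c65dccffbb22b3b8c5ef5db7267f98dc04812feb4d.gz: 200 OK",
    "http://update2.freebsd.org/10.2-RELEASE/amd64/f/0c302f734e8bc0df8e4a1c26f98e7c1fa1fa837858cc24769908a0ab46b0e313.gz: 400 Error (ignored)",
    "http://update2.freebsd.org/10.2-RELEASE/amd64/f/41065db0a842bfe35bc8f79877b4c5c8077bab9ccd34a4451b09b3cf6e079e64.gz: 200 OK",
]

for code, count in sorted(tally_statuses(sample).items()):
    print(code, count)
```

Any non-zero count for 400 on files that exist on the server would point at the problem described here.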

We are seeing:
* 400 bad-request errors
* corrupt data transfers (this is the most serious problem)
* spurious, transient 404 not-found errors.

Reverting to 1.4.35_05 solves it, but obviously that's not a good solution.
Comment 1 Peter Wemm freebsd_committer freebsd_triage 2015-08-25 07:05:48 UTC
As an update: we switched the affected machines to nginx, so I'm sorry to say we no longer have a test case in the freebsd.org cluster.

I'm sorry, but we needed this to work reliably ASAP.
Comment 2 Piotr Kubaj freebsd_committer 2015-08-25 10:01:48 UTC
Could you attach your lighttpd configs?
Comment 3 Peter Wemm freebsd_committer freebsd_triage 2015-08-25 20:17:57 UTC
All that's left:

# egrep -v '(^#|^$)' lighttpd.conf conf.d/*.conf
lighttpd.conf:var.log_root    = "/logs"
lighttpd.conf:var.server_root = "/data"
lighttpd.conf:var.state_dir   = "/var/run"
lighttpd.conf:var.conf_dir    = "/etc"
lighttpd.conf:server.chroot   = "/usr/local/www"
lighttpd.conf:var.vhosts_dir  = server_root + "/vhosts"
lighttpd.conf:var.cache_dir   = "/var/cache/lighttpd"
lighttpd.conf:var.socket_dir  = "/sockets"
lighttpd.conf:include "modules.conf"
lighttpd.conf:server.port = 80
lighttpd.conf:server.bind = ""
lighttpd.conf:$SERVER["socket"] == "[2001:4f8:3:ffe0:406a:0:16:1a]:80" { } 
lighttpd.conf:server.username  = "www"
lighttpd.conf:server.groupname = "www"
lighttpd.conf:server.document-root = "/data/"
lighttpd.conf:server.tag = "lighttpd"
lighttpd.conf:server.pid-file = state_dir + "/lighttpd.pid"
lighttpd.conf:server.errorlog             = log_root + "/error.log"
lighttpd.conf:include "conf.d/access_log.conf"
lighttpd.conf:include "conf.d/debug.conf"
lighttpd.conf:server.event-handler = "freebsd-kqueue"
lighttpd.conf:server.network-backend = "write"
lighttpd.conf:server.max-fds = 40960
lighttpd.conf:server.stat-cache-engine = "simple"
lighttpd.conf:server.max-connections = 20480
lighttpd.conf:server.max-keep-alive-idle = 5
lighttpd.conf:server.max-keep-alive-requests = 1024
lighttpd.conf:server.max-request-size = 512
lighttpd.conf:index-file.names += (
lighttpd.conf:  "index.xhtml", "index.html", "index.htm", "default.htm", "index.php"
lighttpd.conf:url.access-deny             = ( "~", ".inc", ".swp" )
lighttpd.conf:$HTTP["url"] =~ "\.pdf$" {
lighttpd.conf:  server.range-requests = "disable"
lighttpd.conf:static-file.exclude-extensions = ( ".php", ".pl", ".fcgi", ".scgi" )
lighttpd.conf:include "conf.d/mime.conf"
lighttpd.conf:include "conf.d/dirlisting.conf"
lighttpd.conf:server.follow-symlink = "disable"
lighttpd.conf:server.upload-dirs = ( "/var/tmp/" )
conf.d/dirlisting.conf:dir-listing.activate      = "enable"
conf.d/dirlisting.conf:dir-listing.hide-dotfiles = "disable" 
conf.d/dirlisting.conf:dir-listing.exclude       = ( "~$" )
conf.d/dirlisting.conf:dir-listing.encoding = "UTF-8"
conf.d/dirlisting.conf:dir-listing.hide-header-file = "disable"
conf.d/dirlisting.conf:dir-listing.show-header = "disable"
conf.d/dirlisting.conf:dir-listing.hide-readme-file = "disable"
conf.d/dirlisting.conf:dir-listing.show-readme = "disable"

The conf files not shown here were deleted by 'pkg delete' as they were stock / unmodified.

We saw it on the freebsd-update and portsnap servers.  They typically have a very large number of files with long file names.  The file names are usually the SHA-256 hash of the file itself.  The easiest way to trigger it was:
# find 10.2-RELEASE/amd64/f -type f >/tmp/flist
(this generates a list of about 66000 files)
somewhere else:
xargs -P8 /usr/libexec/phttpget servername.freebsd.org < /tmp/flist
and watch for the sporadic '400 bad request' responses.  With 1.4.35_05, it runs without any 400 errors.

/usr/libexec/phttpget is src/usr.sbin/portsnap/phttpget/* - it does pipelined http fetches.
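For readers unfamiliar with pipelining: the client writes several requests on one connection before reading any response, so the server often receives more than one request header in a single read. The sketch below only illustrates the byte stream such a client produces. It is not phttpget's code, the function name build_pipelined_requests is hypothetical, and a real client sends additional headers that are omitted here.

```python
def build_pipelined_requests(host, paths):
    """Return the bytes for several GET requests sent back to back on one connection."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"  # an empty line ends each request header
        )
    return "".join(requests).encode("ascii")


# Two of the paths quoted earlier in this report, against the placeholder host name.
payload = build_pipelined_requests(
    "servername.freebsd.org",
    [
        "/10.2-RELEASE/amd64/f/0c301f89e862e5165519e7c65dccffbb22b3b8c5ef5db7267f98dc04812feb4d.gz",
        "/10.2-RELEASE/amd64/f/0c302f734e8bc0df8e4a1c26f98e7c1fa1fa837858cc24769908a0ab46b0e313.gz",
    ],
)
print(payload.decode("ascii"))
```

A server that mishandles the boundary between two such requests could plausibly end up parsing a partial header, which would be consistent with the symptoms reported here, though that is a guess rather than a confirmed cause.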

If you configure debug.conf to log failed requests, you can see it reporting what appear to be truncated request headers.
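A complete HTTP request header ends with an empty line, so a logged header that lacks that terminator is a candidate for truncation. The check below is a rough sketch, assuming the error log prints CR and LF as the literal text \r and \n, as in the excerpt at the top of this report. The function name looks_truncated is hypothetical, and the logger itself may shorten long lines, so a hit is only a hint.

```python
# Literal backslash escapes, matching how the log excerpt renders CR LF CR LF.
END_OF_HEADERS = r"\r\n\r\n"


def looks_truncated(logged_header):
    """Return True if the logged request header lacks the end-of-header marker."""
    return END_OF_HEADERS not in logged_header


# Header text copied from the request-header log line in this report.
sample = (
    r"GET /10.2-RELEASE/amd64/f/"
    r"05a1e9b7b0583d98cdf1d5b6020f2f25446002a8efe0417395b75757a1ddefe5.gz"
    r" HTTP/1.1\r\nHost: update2.freebsd.org\r\nUs"
)

print(looks_truncated(sample))
```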
Comment 4 Piotr Kubaj freebsd_committer 2015-08-25 20:54:50 UTC
I'll try to reproduce this bug on my home server running 10.2-RELEASE/amd64. Since I'm currently on holiday in the countryside, I don't really have much access to the Internet (I wouldn't be surprised if it was faster to use a homing pigeon). I'll get back home on Sunday and I'll try to get to it ASAP.
Comment 5 Peter Wemm freebsd_committer freebsd_triage 2015-08-26 02:35:19 UTC
There's no rush - it was one of the last lighttpd servers in the freebsd.org cluster and it was easier to switch to nginx.  I posted the info here to make a note of what I found before I gave up, in case somebody was interested in following up with the lighttpd folks.
Comment 6 Piotr Kubaj freebsd_committer 2015-08-31 19:38:51 UTC
Created attachment 160569: update to 1.4.37

There is a new update to lighttpd-1.4.37. Here's the changelog:

    [mod_proxy] remove debug log line from error log (fixes #2659)
    [mod_dirlisting] fix dir-listing.set-footer not showing
    fix out-of-filedescriptors when uploading “large” files (fixes #2660, thx rmilecki)
    increase upload temporary chunk file size from 1MB to 16MB
    fix undefined integer shift
    rewrite network sendfile/mmap/writev/write backends
    fix some unchecked return value warnings
    [kqueue] fix kevent call
    [autoconf] define HAVE_CRYPT when crypt() is present
    [bsd xattr] fix compile break with BSD extended attributes in stat_cache
    [mod_cgi] rewrite mmap and generic (post body) send error handling
    [mmap] fix mmap alignment
    [plugins] when modules are linked statically still only load the modules given in the config
    [mmap] handle SIGBUS in network; those get triggered if the file gets smaller during reading
    fix some warnings found by coverity (“leak” in setup phase, not catching too long unix socket paths in mod_proxy)

This is supposed to fix bugs found on FreeBSD; the release announcement explicitly mentions fixing bugs affecting FreeBSD.

Also, I've made a change to the Makefile so that MYSQLAUTH implies MYSQL.
Comment 7 Piotr Kubaj freebsd_committer 2015-08-31 19:43:18 UTC
I've tested that www/lighttpd* ports build fine using poudriere on 10.2-RELEASE/amd64.
Comment 8 Guido Falsi freebsd_committer 2015-08-31 21:02:52 UTC
I'll take this PR to commit the update. Is there a plan for testing whether it really fixes the problem noticed on the FreeBSD servers?
Comment 9 commit-hook freebsd_committer 2015-08-31 23:15:00 UTC
A commit references this bug:

Author: madpilot
Date: Mon Aug 31 23:14:15 UTC 2015
New revision: 395734
URL: https://svnweb.freebsd.org/changeset/ports/395734

  - Update to 1.4.37
  - Use new OPTION helper
  - Unsilence some installation commands

  PR:			202637
  Submitted by:		peter@
  Update Submitted by:	pkubaj at riseup.net (maintainer)
  MFH:			2015Q3

Comment 10 Guido Falsi freebsd_committer 2015-08-31 23:20:14 UTC
I just committed the update; the only change I made was unsilencing some installation commands.

I'm not closing the bug since I'm unable to check whether the update actually solves the reported issue.

Comment 11 commit-hook freebsd_committer 2015-08-31 23:40:03 UTC
A commit references this bug:

Author: madpilot
Date: Mon Aug 31 23:39:28 UTC 2015
New revision: 395736
URL: https://svnweb.freebsd.org/changeset/ports/395736

  MFH: r395734, r395735


  - Update to 1.4.37
  - Use new OPTION helper
  - Unsilence some installation commands

  PR:			202637
  Submitted by:		peter@
  Update Submitted by:	pkubaj at riseup.net (maintainer)


  Fix distinfo mangled by previous commit.

  Approved by:		ports-secteam (delphij)

_U  branches/2015Q3/
Comment 12 Peter Wemm freebsd_committer freebsd_triage 2015-08-31 23:59:18 UTC
I'm inclined to suspect that it will fix it.  I think the only person left who can test lighttpd on a freebsd-update server is Ryan (zi@) - I've added him to the cc: list.
Comment 13 Ryan Steinmetz freebsd_committer freebsd_triage 2015-09-01 00:26:41 UTC
freebsd-update server has been updated.
Comment 14 Martin Wilke freebsd_committer 2016-01-16 07:33:37 UTC
It's fixed.
Comment 15 Kubilay Kocak freebsd_committer freebsd_triage 2016-01-16 15:36:40 UTC
@Martin, please assign the bug to the committer who resolved it in cases where a commit has been made (FIXED) or referenced (PR:) in a comment.

Also, this went to quarterly, so classify accordingly.