Bug 211430

Summary: devel/apr1: IPv6 to IPv4 fallback does not work in serf
Product: Ports & Packages
Reporter: Don Lewis <truckman>
Component: Individual Port(s)
Assignee: Olli Hauer <ohauer>
Status: Closed FIXED
Severity: Affects Some People
CC: ohauer
Priority: ---
Keywords: patch
Version: Latest
Flags: bugzilla: maintainer-feedback? (apache)
       ohauer: merge-quarterly?
Hardware: Any
OS: Any
URL: https://bz.apache.org/bugzilla/show_bug.cgi?id=59914
Attachments:
  Attachment 173081: patch to modify apr1 poll() emulation to match behavior expected by serf. (Flags: none)

Description Don Lewis freebsd_committer freebsd_triage 2016-07-28 22:14:12 UTC
Created attachment 173081
patch to modify apr1 poll() emulation to match behavior expected by serf.

serf depends on the poll emulation in apr returning a POLLERR event if a non-blocking connect() attempt fails in order to trigger an IPv6 -> IPv4 fallback, or a fallback to another address for a multi-homed host.  On FreeBSD, the poll emulation is done using kqueue, and the result returned by the poll() emulation is POLLIN + POLLHUP.

connect(4,{ AF_INET6 [xxxx:xxxx:xxxx:xxxx::xxxx]:80 },28) ERR#36 'Operation now in progress'
gettimeofday({ 1469515046.979614 },0x0)		 = 0 (0x0)
kevent(3,{ 4,EVFILT_READ,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,{ 4,EVFILT_WRITE,EV_ADD,0x0,0x0,0x805491300 },1,0x0,0,0x0) = 0 (0x0)
kevent(3,0x0,0,{ 4,EVFILT_READ,EV_EOF,NOTE_LOWAT|0x3c,0x0,0x805491300 4,EVFILT_WRITE,EV_EOF,NOTE_LOWAT|0x3c,0x8000,0x805491300 },32,{ 0.500000000 }) = 2 (0x2)

When serf sees this, it calls read(), which then fails with ECONNREFUSED (or whatever error the connect attempt hit); that is not even a documented read() errno value.

read(4,0x80549c064,8000)			 ERR#61 'Connection refused'

When that occurs, serf closes the socket and any other addresses are not tried.
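
The fallback that is skipped here can be sketched as follows. This is an illustration in Python, not serf's code: `connect_with_fallback` and its parameters are hypothetical names, and the sketch detects a failed non-blocking connect() by reading SO_ERROR once the socket becomes writable, rather than by waiting for a POLLERR event.

```python
import errno
import os
import select
import socket


def connect_with_fallback(candidates, timeout=5.0):
    """Try each (family, sockaddr) pair in order; return the first connected socket.

    Hypothetical helper for illustration only, not a serf or APR API.
    """
    last_error = None
    for family, sockaddr in candidates:
        sock = socket.socket(family, socket.SOCK_STREAM)
        sock.setblocking(False)
        err = sock.connect_ex(sockaddr)
        if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            # The attempt is still in progress; the socket turns writable
            # once it has either succeeded or failed.
            _, writable, _ = select.select([], [sock], [], timeout)
            if writable:
                err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            else:
                err = errno.ETIMEDOUT
        if err == 0:
            return sock
        # This address failed: close the socket and move on to the next one
        # instead of giving up.
        sock.close()
        last_error = OSError(err, os.strerror(err))
    if last_error is None:
        raise OSError("no candidate addresses")
    raise last_error
```

The candidate list could be built from `socket.getaddrinfo()`, which typically returns both IPv6 and IPv4 entries for a dual-stack host.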

The attached patch modifies apr to return what serf expects in the case of a non-blocking connect() failure.  Unfortunately, I did not see an easy way of handling ETIMEDOUT since that error can either be caused by connect failing or after the connection is established.  The poll emulation might need to differentiate between those two cases.
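
One possible reading of the change described above, sketched in Python rather than APR's C. It is not the attached patch: `kevent_to_poll` is a hypothetical helper, and the constants are the FreeBSD `<sys/event.h>` and `<poll.h>` values copied in so that the sketch runs on any platform. It relies on kqueue reporting a socket's pending error in `fflags` when `EV_EOF` is set. In the trace above, both events carry `fflags` 0x3d (printed as `NOTE_LOWAT|0x3c`), which is 61, the same ECONNREFUSED that read() later returns.

```python
# FreeBSD constant values, copied here so the sketch does not need kqueue.
EVFILT_READ = -1
EVFILT_WRITE = -2
EV_EOF = 0x8000
POLLIN = 0x0001
POLLOUT = 0x0004
POLLERR = 0x0008
POLLHUP = 0x0010


def kevent_to_poll(filter_, flags, fflags):
    """Translate one kevent result into poll()-style revents.

    Hypothetical helper for illustration; not APR's implementation.
    """
    revents = 0
    if filter_ == EVFILT_READ:
        revents |= POLLIN
    elif filter_ == EVFILT_WRITE:
        revents |= POLLOUT
    if flags & EV_EOF:
        # Assumption of this sketch: any pending socket error in fflags is
        # reported as POLLERR; a plain EOF stays POLLHUP.
        revents |= POLLERR if fflags != 0 else POLLHUP
    return revents
```

Mapping every non-zero `fflags` value to POLLERR is a simplification; it does nothing to resolve the ETIMEDOUT ambiguity mentioned above.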

This problem affects users of svn who don't have working IPv6 connectivity to the FreeBSD svn servers.

Comment 1 Olli Hauer freebsd_committer freebsd_triage 2016-07-31 11:09:01 UTC
Hi Don,

Thanks for the patch. It has been submitted upstream as APR PR 59914.

> This problem affects users of svn who don't have working IPv6 connectivity to the FreeBSD svn servers.

By this, do you mean users who have IPv6 connectivity, but not to the public network?

Comment 2 Don Lewis freebsd_committer freebsd_triage 2016-07-31 18:34:57 UTC
(In reply to Olli Hauer from comment #1)
> By this, do you mean users who have IPv6 connectivity, but not to the public network?

Yes.

Comment 3 Olli Hauer freebsd_committer freebsd_triage 2016-07-31 19:27:58 UTC
Hm, that's strange.
I asked because I have such a configuration in my office and haven't run into that issue (parts of the network run on IPv6, but outgoing IPv6 connections are rejected from everywhere except some test systems).

If you can give some further detail, I will try to set up a dedicated test network.

Comment 4 Don Lewis freebsd_committer freebsd_triage 2016-07-31 21:29:16 UTC
I originally stumbled across this while debugging another user of apr1.  The amd64 host I was using has both IPv4 and IPv6 connectivity and I was trying to access a web page on another of my hosts that also has IPv4 and IPv6 addresses.  I wasn't able to access the page and unfortunately the diagnostic message that I got wasn't at all helpful.  Accessing the same page from my i386 laptop worked fine.  Nothing was showing up in the server log when I was using the "broken" host, and I wasn't seeing any connect attempts in tcpdump.  I spent a long time looking for amd64 vs. i386 issues.

The problem turned out to be my nginx configuration.  I'd configured it to listen to IPv4 port 80, but neglected to also configure it for IPv6.  My tcpdump check was also only looking for IPv4.  When I looked at all connect attempts to port 80, I noticed that the client was only attempting to connect using IPv6 and was not trying IPv4 at all.  My laptop was working because it is configured to use DHCP and was only getting an IPv4 address.  Firefox and other tools didn't have any problem connecting to this server.

This problem is very easy to reproduce.  Try to do an svn http checkout from a server host that has both IPv4 and IPv6 addresses, but is not listening on port 80 on either.  You should see svn attempt to connect using IPv6 and then quit after it gets the RST response.  It won't try the IPv4 address at all.  That's how I got the strace output in my original comment.
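
A smaller local check along the same lines, as a rough sketch: it makes a non-blocking connect to a loopback port with no listener and reports what the operating system's native poll() returns. `probe_refused_connect` is a hypothetical helper, it uses IPv4 loopback rather than the two-host setup described above, and Python's `select.poll` is the native poll(), not APR's kqueue-based emulation, so it only shows the baseline to compare against.

```python
import errno
import select
import socket


def probe_refused_connect(host="127.0.0.1"):
    """Connect non-blocking to a port with no listener.

    Returns (revents, so_error). Hypothetical helper for illustration.
    """
    # Bind and close a throwaway socket to find a port nothing listens on.
    tmp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tmp.bind((host, 0))
    port = tmp.getsockname()[1]
    tmp.close()

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        revents = 0
        err = sock.connect_ex((host, port))
        if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            poller = select.poll()
            poller.register(sock, select.POLLIN | select.POLLOUT)
            events = poller.poll(2000)  # timeout in milliseconds
            if events:
                revents = events[0][1]
            # SO_ERROR holds the result of the finished connect attempt.
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return revents, err
    finally:
        sock.close()


if __name__ == "__main__":
    revents, err = probe_refused_connect()
    print("revents=%#x POLLERR=%s so_error=%d"
          % (revents, bool(revents & select.POLLERR), err))
```

The exact `revents` bits differ between platforms, and some systems fail the loopback connect immediately without going through poll() at all, so only the SO_ERROR value should be relied on.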

Comment 5 commit-hook freebsd_committer freebsd_triage 2016-08-04 18:44:56 UTC
A commit references this bug:

Author: ohauer
Date: Thu Aug  4 18:44:22 UTC 2016
New revision: 419646
URL: https://svnweb.freebsd.org/changeset/ports/419646

Log:
  - add patch to modify apr1 poll() emulation to match behavior expected by serf

    serf depends on the poll emulation in apr returning a POLLERR event if a
    non-blocking connect() attempt fails in order to trigger an IPv6 -> IPv4
    fallback, or a fallback to another address for a multi-homed host.  On
    FreeBSD, the poll emulation is done using kqueue, and the result returned by
    the poll() emulation is POLLIN + POLLHUP.

  - upstream apache PR:
    https://bz.apache.org/bugzilla/show_bug.cgi?id=59914

  PR:		211430
  Submitted by:	Don Lewis (truckman@)
  MFH:		2016Q3

Changes:
  head/devel/apr1/Makefile
  head/devel/apr1/files/patch-apr_poll_unix_kqueue.c

Comment 6 Olli Hauer freebsd_committer freebsd_triage 2016-08-04 18:51:02 UTC
Thanks for the detailed analysis and the patch!

Hopefully it will be adopted by the upstream apr developers.

Comment 7 commit-hook freebsd_committer freebsd_triage 2016-08-06 06:51:08 UTC
A commit references this bug:

Author: ohauer
Date: Sat Aug  6 06:50:09 UTC 2016
New revision: 419731
URL: https://svnweb.freebsd.org/changeset/ports/419731

Log:
  MFH: r419646

  - add patch to modify apr1 poll() emulation to match behavior expected by serf

    serf depends on the poll emulation in apr returning a POLLERR event if a
    non-blocking connect() attempt fails in order to trigger an IPv6 -> IPv4
    fallback, or a fallback to another address for a multi-homed host.  On
    FreeBSD, the poll emulation is done using kqueue, and the result returned by
    the poll() emulation is POLLIN + POLLHUP.

  - upstream apache PR:
    https://bz.apache.org/bugzilla/show_bug.cgi?id=59914

  PR:		211430
  Submitted by:	Don Lewis (truckman@)

  Approved by:	ports-secteam (junovitch)

Changes:
_U  branches/2016Q3/
  branches/2016Q3/devel/apr1/Makefile
  branches/2016Q3/devel/apr1/files/patch-apr_poll_unix_kqueue.c