Bug 140835 - [libfetch] fetchParseURL(3) returns success with invalid URLs
Summary: [libfetch] fetchParseURL(3) returns success with invalid URLs
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 8.0-PRERELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Dag-Erling Smørgrav
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-24 19:20 UTC by Jeremy Chadwick
Modified: 2018-05-28 19:42 UTC (History)

See Also:


Attachments
libfetch_URLparse2.patch.txt (1.13 KB, text/plain; charset=US-ASCII)
2011-09-09 05:30 UTC, Mark Johnston

Description Jeremy Chadwick 2009-11-24 19:20:02 UTC
libfetch contains a function, fetchParseURL(3), whose man page
states the following:

     fetchParseURL() takes a URL in the form of a null-terminated string and
     splits it into its components function according to the Common Internet
     Scheme Syntax detailed in RFC1738.  A regular expression which produces
     this syntax is:

         <scheme>:(//(<user>(:<pwd>)?@)?<host>(:<port>)?)?/(<document>)?

     If the URL does not seem to begin with a scheme name, the following syn-
     tax is assumed:

         ((<user>(:<pwd>)?@)?<host>(:<port>)?)?/(<document>)?

     Note that some components of the URL are not necessarily relevant to all
     URL schemes.  For instance, the file scheme only needs the <scheme> and
     <document> components.

     .....

     fetchParseURL() returns a pointer to a struct url containing the individ-
     ual components of the URL.  If it is unable to allocate memory, or the
     URL is syntactically incorrect, fetchParseURL() returns a NULL pointer.

But when passed a URL such as the one below (note that the delimiter
is colon-slash, not colon-slash-slash):

	http:/www.somesite.com/

fetchParseURL(3) returns a pointer to a struct with the following
data:

	url->scheme = http
	url->user   = <null>
	url->pwd    = <null>
	url->host   = <null>
	url->port   = 0
	url->doc    = /www.somesite.com/

Given the documentation, fetchParseURL(3) should return NULL in this
scenario. It was able to work out the scheme by itself, which
implies that the RFC 1738 compliance paragraph of the documentation
should apply strictly.

This issue came to light on freebsd-stable:

http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052969.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052971.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052972.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052973.html

Fix:

None known.

How-To-Repeat:

$ fetch http:/www.somesite.com/
fetch: http:/www.somesite.com/: No address record
$ fetch http:/localhost/
fetch: http:/localhost/: No address record
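
The parse result reported above can be illustrated with a small
self-contained sketch. This is not libfetch's code: mini_parse() and
struct mini_url are hypothetical names, user, password and port are
ignored, and the only assumption is that a parser treats a single slash
after the colon as "no host, the slash starts the document".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct url (not the libfetch type). */
struct mini_url {
	char scheme[17];
	char host[257];
	const char *doc;	/* points into the input string */
};

/*
 * Split "<scheme>:/..." or "<scheme>://<host>/..." into its parts.
 * Returns 0 on success and -1 if no "<scheme>:/" prefix is found or a
 * component does not fit in its buffer.
 */
int
mini_parse(const char *s, struct mini_url *u)
{
	const char *p, *q;
	size_t n;

	memset(u, 0, sizeof(*u));
	p = strstr(s, ":/");
	if (p == NULL)
		return (-1);
	n = (size_t)(p - s);
	if (n == 0 || n >= sizeof(u->scheme))
		return (-1);
	memcpy(u->scheme, s, n);

	p++;			/* skip ':'; p now points at the first '/' */
	if (p[1] == '/') {
		/* Two slashes: a host follows, up to the next '/' or the end. */
		p += 2;
		q = strchr(p, '/');
		if (q == NULL)
			q = p + strlen(p);
		n = (size_t)(q - p);
		if (n >= sizeof(u->host))
			return (-1);
		memcpy(u->host, p, n);
		p = q;
	}
	/* One slash: the host stays empty and the slash begins the document. */
	u->doc = (*p != '\0') ? p : "/";
	return (0);
}
```

With this sketch, "http:/www.somesite.com/" parses successfully with an
empty host and a document of "/www.somesite.com/", while
"http://www.somesite.com/" yields the host "www.somesite.com" and the
document "/". An empty host would be consistent with the
"No address record" errors shown in How-To-Repeat.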
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2011-02-23 22:30:37 UTC
Responsible Changed
From-To: freebsd-bugs->des

des, is this still your territory?
Comment 2 Mark Johnston 2011-09-09 05:30:01 UTC
It looks like this behaviour is intentional: fetchParseURL() seems to
treat <scheme>:/<stuff> as shorthand for <scheme>://localhost/stuff,
so that I can write "fetch file:/home/mark/foo", for example. It turns
out that Firefox and curl do this too, but only for the "file" scheme.
Since this special syntax doesn't seem to be mentioned anywhere in RFC
1738, I've made a patch which changes fetchParseURL() to accept this
syntax for only the file scheme (and return NULL otherwise). The patch
also fixes a couple of style bugs.

I'm not sure if the "proper" fix for this is to drop support for that
syntax completely, since it's not in the RFC and it's not documented
anywhere in fetch/libfetch. Then again, maybe having a
strictly-conforming RFC 1738 implementation isn't important - fetch(1)
already does things like treat "ftp.freebsd.org" as
"ftp://ftp.freebsd.org" and so on. Any thoughts?

-Mark
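
The attached patch is not reproduced here. As a rough sketch of the rule
described in the comment above (accept the single-slash shorthand only
for the file scheme, reject it otherwise), a check might look like the
following. The function name short_form_allowed() is hypothetical, and
the case-sensitive comparison against "file" is an assumption of this
sketch rather than something taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the URL is acceptable under the rule, 0 if it should be
 * rejected. URLs without a "<scheme>:/" prefix and URLs using "://"
 * are not affected by the rule and are always accepted here.
 */
int
short_form_allowed(const char *url)
{
	const char *p;
	size_t n;

	p = strstr(url, ":/");
	if (p == NULL)
		return (1);	/* no scheme prefix: rule does not apply */
	if (p[2] == '/')
		return (1);	/* "://": a host follows */

	/* Exactly one slash after the colon: only "file" may omit the host. */
	n = (size_t)(p - url);
	return (n == 4 && strncmp(url, "file", 4) == 0);
}
```

Under this rule "file:/home/mark/foo" is accepted and
"http:/www.somesite.com/" is rejected, so a caller could return NULL for
the latter before any further parsing.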
Comment 3 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:42:50 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.