
Re: [Libevent-users] bad request with evhttp libevent-2.0.10



On Wed, Jan 19, 2011, Nick Mathewson wrote:
> On Thu, Jan 13, 2011 at 12:34 AM, Adrian Chadd <adrian@xxxxxxxxxxxxxxx> wrote:
> > It's best to assume browsers don't do the right thing in 100% of cases.
> >
> > (Been there, done that with Squid..)
> 
> Ick.  Okay, what does Squid (or some other sensible application)
> assume constitutes a valid URI?

> (I wonder whether this argues for separate modes of
> standards-enforcement in evhttp_uri_parse.)

My hand-rolled parser (in src/url.cc in squid-3):

* find first :// - that's protocol
* find next / or EOL - that's [user@]host[:port]
* if !EOL, everything from that / to EOL is urlpath

Then protocol, host and urlpath are separately sanitised, but not rejected
outright.
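
A rough sketch of those three steps, for illustration only. This is not the actual code from src/url.cc; the UrlParts struct and the split_url name are made up here, and the sanitising pass is left out entirely:

```cpp
#include <cassert>
#include <string>

// Hypothetical result type; the field names are illustrative, not Squid's.
struct UrlParts {
    bool ok = false;        // false when no "://" was found
    std::string protocol;   // text before the first "://"
    std::string authority;  // [user@]host[:port], not yet sanitised
    std::string urlpath;    // from the first '/' after the authority to EOL; may be empty
};

UrlParts split_url(const std::string &url) {
    UrlParts parts;

    // Step 1: find the first "://"; everything before it is the protocol.
    const std::string::size_type sep = url.find("://");
    if (sep == std::string::npos)
        return parts;  // caller decides what to do with scheme-less input
    parts.protocol = url.substr(0, sep);

    // Step 2: find the next '/' or EOL; that span is [user@]host[:port].
    const std::string::size_type start = sep + 3;
    assert(start <= url.size());
    const std::string::size_type slash = url.find('/', start);
    if (slash == std::string::npos) {
        parts.authority = url.substr(start);
    } else {
        parts.authority = url.substr(start, slash - start);
        // Step 3: not at EOL, so the rest (including the '/') is the urlpath.
        parts.urlpath = url.substr(slash);
    }

    parts.ok = true;
    return parts;
}
```

Nothing is rejected here beyond a missing "://"; each of the three pieces would be cleaned up separately afterwards.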

Eg:

                /* RFC 2732 states IPv6 "SHOULD" be bracketed. allowing for times when its not. */
                /* RFC 3986 'update' simply modifies this to an "is" with no emphasis at all! */
                /* therefore we MUST accept the case where they are not bracketed at all. */

So even though RFC 3986 states that IPv6 addresses are bracketed, people may have
written code that assumes otherwise. The parser there tries to deal with both.
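
One way to deal with both forms might look like the sketch below. It is an assumption about how this could be done, not what Squid does: the HostPort struct and split_host_port name are invented, any user@ prefix is assumed to be stripped already, and the "more than one colon means a bare IPv6 address with no port" rule is a guess, since an unbracketed address followed by :port is ambiguous.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical result type for illustration.
struct HostPort {
    std::string host;  // brackets removed when they were present and closed
    std::string port;  // left as text; empty when absent
};

HostPort split_host_port(const std::string &authority) {
    HostPort hp;

    // Bracketed form, e.g. "[::1]:8080".
    if (!authority.empty() && authority[0] == '[') {
        const std::string::size_type close = authority.find(']');
        if (close == std::string::npos) {
            hp.host = authority;  // unclosed bracket: leave it for the sanitiser
            return hp;
        }
        hp.host = authority.substr(1, close - 1);
        if (close + 1 < authority.size() && authority[close + 1] == ':')
            hp.port = authority.substr(close + 2);
        return hp;
    }

    // Unbracketed form: exactly one colon is read as host:port.
    const auto colons = std::count(authority.begin(), authority.end(), ':');
    if (colons == 1) {
        const std::string::size_type colon = authority.find(':');
        assert(colon != std::string::npos);
        hp.host = authority.substr(0, colon);
        hp.port = authority.substr(colon + 1);
    } else {
        // Zero colons: a plain hostname. Several colons: assume a bare
        // IPv6 address without a port (a guess, the input is ambiguous).
        hp.host = authority;
    }
    return hp;
}
```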

There's code for stripping whitespace from the URL, but browsers have in the past
embedded whitespace in both the host and the urlpath. In either case, the
whitespace is handled in one of these ways:

* rejected;
* ignored (and passed up);
* converted to %20 in the urlpath;
* chopped at the first space in the urlpath.

It's not as obvious as you'd think, because plenty of badly-written useragents do
some pretty stupid stuff.
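
To make the four whitespace options concrete, here is a minimal sketch applied to the urlpath. The SpacePolicy enum and apply_space_policy function are hypothetical names, not Squid's; only the plain space character is handled, and tabs or other whitespace are ignored for brevity:

```cpp
#include <cassert>
#include <string>

// Hypothetical policy names mirroring the four options listed above.
enum class SpacePolicy { Reject, PassThrough, Encode, Chop };

// Applies the policy to urlpath in place.
// Returns false only when the policy is Reject and a space is present.
bool apply_space_policy(std::string &urlpath, SpacePolicy policy) {
    const std::string::size_type first = urlpath.find(' ');
    if (first == std::string::npos)
        return true;  // nothing to do

    switch (policy) {
    case SpacePolicy::Reject:
        return false;
    case SpacePolicy::PassThrough:
        return true;  // leave the spaces in and pass the path up unchanged
    case SpacePolicy::Encode: {
        std::string out;
        out.reserve(urlpath.size() + 2);
        for (char c : urlpath) {
            if (c == ' ')
                out += "%20";
            else
                out += c;
        }
        urlpath = out;
        assert(urlpath.find(' ') == std::string::npos);
        return true;
    }
    case SpacePolicy::Chop:
        urlpath.erase(first);  // drop the first space and everything after it
        return true;
    }
    return true;  // not reached for valid enum values
}
```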



Adrian

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $24/pm+GST entry-level VPSes w/ capped bandwidth charges available in WA -