[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #26227 [Core Tor/Stem]: Review existing stem.client code
#26227: Review existing stem.client code
---------------------------+------------------------------
Reporter: dmr | Owner: dmr
Type: task | Status: needs_review
Priority: Medium | Milestone:
Component: Core Tor/Stem | Version:
Severity: Normal | Resolution:
Keywords: client | Actual Points:
Parent ID: | Points:
Reviewer: atagar | Sponsor:
---------------------------+------------------------------
Comment (by dmr):
==== HTTP, potential bugs
**Summary**: HTTP parsing has a lot of edge cases
**Suggestion**: use a library for this (see architecture ticket, #26227)
There are a number of potential bugs with the HTTP handling.
HTTP is a nuanced spec. It's very possible that these bugs will never be
exercised due to tor's implementation, but it's also hard to know how tor
will evolve in the future.
Here's the potential bugs I saw:
* HTTP defines headers as case-insensitive, but
[[https://gitweb.torproject.org/stem.git/tree/stem/descriptor/remote.py?id=4306013ab6e868e99553fbb78ed14ef51a896b7d#n860|the
parsing code creates a dictionary]] that is case-sensitive. This is in
contrast to `_download_from_dirport()` which has tests to assure the
`reply_headers` attribute functions in a case-insensitive way. See
[[https://github.com/dmr-x/stem/commits/26227-client-code-review--HTTP-
headers-
case|branch]]/[[https://github.com/dmr-x/stem/commit/cba6a8ba508108e8dcbf6f7a0b2e3978524eb2aa|commit]]
with a demo test that shows this.
* HTTP allows headers to be
[[https://tools.ietf.org/html/rfc1945#page-21|split among continuation
lines]], as well as for multi-attribute headers to be specified in a CSV
or as `key: single_value` with different values on multiple lines. The
code currently assumes that
[[https://gitweb.torproject.org/stem.git/tree/stem/descriptor/remote.py?id=4306013ab6e868e99553fbb78ed14ef51a896b7d#n867|each
header is completely specified on a single line]] and that
[[https://gitweb.torproject.org/stem.git/tree/stem/descriptor/remote.py?id=4306013ab6e868e99553fbb78ed14ef51a896b7d#n863|there
are no continuation lines]].
* Both of the above may mean that the content encoding may not be
correctly passed to `_decompress()`.
* We're using HTTP/1.0. I'm not super-familiar with the specifics of
HTTP/1.0, but there may be some restrictions that we run into, as far as
e.g. [[https://tools.ietf.org/html/rfc1945#section-3.5|content encodings
allowed]]. Might want to be using HTTP/1.1 to be "safe". (I haven't looked
at the dir spec or tor implementation to know if there's any functional
impact.)
* fixed in
[[https://github.com/dmr-x/stem/commit/ba275219bd50dfc408befca81ef4d3116008dbcc|commit
`ba275219bd50dfc408befca81ef4d3116008dbcc`]]: the Reason-Phrase of the
HTTP response is only recommended to be `OK` for `200`, but that is not
required. The parsing code did require it to be `OK`.
So there are ways to fix all these cases, but my preference would be to
not reinvent the wheel here. This is discussed more in the architecture
ticket (#26227).
== Phew
That's a lot. Sorry - still working on my verbosity! Hopefully it's at
least well organized!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26227#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs