[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-bugs] #26227 [Core Tor/Stem]: Review existing stem.client code



#26227: Review existing stem.client code
---------------------------+------------------------------
 Reporter:  dmr            |          Owner:  dmr
     Type:  task           |         Status:  needs_review
 Priority:  Medium         |      Milestone:
Component:  Core Tor/Stem  |        Version:
 Severity:  Normal         |     Resolution:
 Keywords:  client         |  Actual Points:
Parent ID:                 |         Points:
 Reviewer:  atagar         |        Sponsor:
---------------------------+------------------------------

Comment (by dmr):

 Replying to [comment:3 dmr]:
 > [...] within the next day. Sorry for the delay!
 Sorry //again// for the delay!

 Here's comments I've collected from my code review.

 Some of them are pretty straightforward and probably need no discussion. I
 ran out of time with the commits I pushed, but these could probably be
 addressed without much review. I'll note that as **Straightforward**; you
 can probably skip over reading those ;).

 For others, where I note **Suggestion** or **Question**, I'd like to get
 some feedback on whether you think such a change is appropriate. Feel free
 to respond with a quick "Go ahead" or "Let's not do that" for various
 suggestions!

 I can then make tickets or commits to the code-review branch - whichever
 is most appropriate!

 == Maybe //slightly// out of `stem.client` scope, but mentioning
 anyways... (lest I forget)
 ==== stem.descriptor.remote `_download_from_*port()` signatures
 **Straightforward**
 Instead of `url`, this method should take an `endpoint` and `resource` -
 having the same signature as `_download_from_orport()`
 url creation should be thus done within the `_download_from_dirport()`
 method

 ==== Assume DirPort for `_pick_endpoint()`, or...?
 **Question**: Should we assume DirPort for the directory authorities by
 default, or should it switch to ORPort, or have it be specifiable?
 Right now it assumes DirPort.

 (Note that the last retry for `_download_descriptors()` will use directory
 authorities (and thus DirPort) if `Query.fall_back_to_authority` is
 `True`.)

 Suggestion, if have it be specifiable:
 We could allow `fall_back_to_authority` in the constructor to take in an
 `Endpoint` //class// to make it clear which type to use, and default to
 `DirPort` if specified as `True` (backwards-compatible).

 == stem.client and consumers
 ==== New `Relay` connection per `_download_from_orport()` call
 **Summary**: each `_download_from_orport()` call creates a new `Relay`
 connection; ideally these should be multiplexed
 **Suggestion (for future)**
 At a future point, we may want to refactor this to share connections and
 possibly circuits with a stem.client singleton.
 This is discussed more in the architecture ticket (#26227)
 At that point, we'd want to avoid hardcoding the `stream_id`.

 ==== Protocol constants: `stem.client.Relay`
 **Suggestion**
 Somewhere I think we want to define a `SUPPORTED_LINK_PROTOCOLS` constant
 We likely won't have full support for any protocol listed, so that could
 be a bit disingenuous for a while. But in order to move forward with
 development, it's a chicken-and-the-egg problem ;).

 **Suggestion**
 In similar fashion to defining the `SUPPORTED_LINK_PROTOCOLS` constant, we
 should define constants for all the subprotocols. See
 [[https://gitweb.torproject.org/torspec.git/tree/tor-
 spec.txt?id=419ba0c307106f7d719b947faf7e4235f95b37ae#n1817|tor-spec
 section 9]].

 ==== Parameter naming: `stem.client.Relay.connect()`, `link_protocols`
 **Summary**: The method takes a `link_protocols` parameter. This should be
 named a bit clearer for its purpose.
 **Suggestion**: It could be named something like
 `link_protocols_to_advertise` and default to `SUPPORTED_LINK_PROTOCOLs`.

 Rationale:
 I can see that it appears to already generate confusion with its current
 name, because `_download_from_orport()` attempts to use link_protocol
 values associated with the endpoint, i.e. the //remote// relay's
 advertised link protocols.

 ==== Constants/magics for `link_protocol` values
 **Straightforward**
 Based on the above, refactor these

 There's a few constants and a few magic numbers, e.g. `[3]`.

 ==== Naming: `stem.client.datatype.Size` subclasses/attributes
 **Suggestion**:
 It might be good to switch `CHAR`/`SHORT`/etc. to `UCHAR`/`USHORT`/etc.
 I don't know what the convention is here, but it may help for readability.

 I'm personally used to `U<size>` to signify unsigned and `<size>` to
 signify signed. I think switching to `U<size>` would make the code
 //potentially// easier to read for newcomers from various backgrounds.

 **Suggestion**:
 Similarly, it may help to put the bits length in it, too - for the most
 immediate readability.
 So, e.g.:
 * `UCHAR8`
 * `USHORT16`
 * etc.

 For reference, on a quick glance...
 * [[https://gitweb.torproject.org/trunnel.git/tree/README|trunnel]]
 [[https://gitweb.torproject.org/trunnel.git/tree/lib/trunnel/Grammar.py?id=c6e8a499f5a5f00113ea268cfcef9e7676c6ed96#n86|appears
 to use]] `u8`, `u16`, etc.
 * `tor`
 [[https://gitweb.torproject.org/tor.git/tree/src/common/torint.h?id=1b04dab60c549d9f0d621e1a115cab8a49c839f9|appears
 to use]] `uint8_t`, `uint16_t`, etc.

 ==== RelaySocket `recv()` may give us partial cells
 **Summary**: our socket recv(), by nature of network communication, may
 unblock at "incomplete" cell lengths.
 (I believe so, at least.)

 We don't account for mid-Cell reads/`recv()`s. We try to pop a Cell at a
 time, and I feel it's probably possible to get a buffer read that is
 between cell boundaries from our lower-level I/O, and thus error out.
 Variable-width cells could potentially exacerbate this.

 stem.client should be designed to handle that at a higher layer

 **Suggestion**: new exception type named* `IncompleteCell`, raised from
 `pop()`/`unpack()` instead of `ValueError`

 That would let us catch it specifically and re-block on the recv
 continuously, appending buffers until we have received the complete cell.
 `ValueError` then would be used for malformed cells in which the code
 believe it's seen all the content.

 //* exact naming TBD//

 ==== RelaySocket `recv()` and `send()` use/handling
 **Summary**: `recv()` is directly used by consumers at different layers.
 `send()` too

 We furthermore don't have an architecture with a centralized `recv()`. For
 instance, it looks like `Relay.__init__()`, `Relay.create_circuit()`, and
 `Circuit.send()` all directly use orport `recv()`, instead of e.g. having
 a thread delegated to do the recv and sorting of cells into buckets or
 whatnot.

 The problem with these disparate consumers doing `recv()` so directly is
 that they are dropping cells that don't pertain to what they're trying to
 do. (And per above, maybe dropping portions of a cell, corrupting the
 connection "framing".)

 There may also be something similar with `send()`, but since underneath it
 uses a socket //file// and flushes, I think it might be ok. (Though I've
 read a lot of different documentation on receive/send methods - I could be
 mixing things up.)

 **Suggestion**: Manage these buffers in a centralized fashion, in a thread
 More discussion thereon in the architecture ticket (#26227).

 ==== Decryption/encryption of `RelayCell` - wrong abstraction layer?
 **Summary**: it looks like our abstraction is breaking down a bit
 Much of what's happening in
 [[https://gitweb.torproject.org/stem.git/tree/stem/client/__init__.py?id=4306013ab6e868e99553fbb78ed14ef51a896b7d#n223|`stem.client.Circuit.send()`]]
 is more directly handling cell content than it should be.

 **Suggestion**: move much of this into `RelayCell` methods. The forward-
 key and backward-key data will need to reside in the Circuit, but much of
 the rest of this can be encapsulated in the Cell.

 ==== Nonuniform integrity/content/etc. checking within `stem.client.cell`
 **Summary**: we do cell content checks in a variety of methods, but not
 uniformly

 Often this is in `__init__` //or// `pack` //and/or// `unpack`.
 Mostly it looks like the validation only happens in a subset of these, for
 a given cell type. Sometimes the checks may be inconsistent.

 //(For example: `AuthChallengeCell` seems to allow for 0 methods to be
 specified, which would be bad (the unpacking expects at least 1 method,
 iiuc))//

 I think ideally we'd want it to happen for all of these.

 That would allow a cell's contents to be changed e.g.:
 * after it's unpacked, before it's packed
 * after it's constructed, before it's packed
 * etc.
 ... without bypassing the checks

 **Suggestion**: Cells define a more generic `validate()` method for each
 cell type; call this in each of `__init__`, `pack`, and `unpack`

 Side suggestion:
 To allow delayed validation and improper cell creation, we might want to
 provide some kwargs to the constructor and pack methods; maybe:
 * `__init__(..., delay_validation=False)`
 * `pack(..., allow_invalid=False)`
 And similarly, if the parsing allows for it, we might have something
 similar (`allow_invalid=False`) for the unpack method.

 Using these in non-default situations would be rare and probably mostly
 useful for integration tests of edge cases with tor, seeing how it
 responds to bad input.

 ==== Incongruency with Cell.unpack() and Field.unpack()
 **Summary**: `Cell.unpack()` creates a generator whereas `Field.unpack()`
 ensures that there is //no// remainder content.
 **Suggestion**: we might want to standardize these

 Perhaps to...:
 * `unpack()` creates a generator; `unpack_one()` ensures there is no
 remainder content
 * `unpack_many()` creates a generator; `unpack()` ensures there is no
 remainder content

 ==== `CircuitCell` subclass; what about `ConnectionCell` subclass?
 **Summary**: We have a `CircuitCell` subclass, for cells that must pertain
 to a circuit. Can we have a `ConnectionCell` subclass for cells that
 //must not// pertain to a circuit?

 I believe* for Cells, they would be either CircuitCell (circuit id must be
 > 0) or ConnectionCell (circuit id must be 0).
 Currently we enforce in `_pack()` that CircuitCells have a circuit id, but
 we don't enforce anything for the other cells.

 **Suggestion**: subclass all other Cells into a `ConnectionCell` subclass;
 refactor a bit for circuit id checks
 (Looking for your feedback here on //whether// you think we should do
 this. The //how// is very straightforward.)

 *This division is actually not completely clear to me in the spec, but I'm
 pretty confident about it. I can hunt down spec wording while
 implementing, and if it's not really not clear, track an edit to make this
 division clearer.

 ==== Define a max-payload-length security parameter?
 **Summary**: variable-length cell payloads have a length stored in a SHORT
 but no specified max size
 **(Open) Question**: does little-t `tor` define a security parameter here?

 Potentially this should have a security parameter.
 A SHORT allows a payload up to 65535 bytes, which isn't that bad (just
 under 64KBytes), but I'd wonder if little-t `tor` actually defines
 something smaller. There certainly isn't anything defined in the spec.

 ==== Not all the ValueError exceptions have tests
 **Suggestion**: employ a code coverage tool.
 **Question**: Possibly `coverage`?

 ==== No timeout specifiable for `_download_from_orport()`
 **Summary**: `_download_from_dirport()` takes a `timeout` parameter, but
 `_download_from_orport()` doesn't
 **Suggestion**: should have a timeout argument, ideally

 This in practice may be a bit harder to implement without rearchitecting a
 bit. But probably can be done with a different architecture (see #26227).
 Low priority in the interim?

 == ...
 //splitting here because the spam filter is complaining I have too many
 links//

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26227#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs