[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage

To: Rob Jansen <jansen@xxxxxxxxxx>, Nick Mathewson <nickm@xxxxxxxxxxxx>
Subject: Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
From: Karsten Loesing <karsten@xxxxxxxxxxxxxx>
Date: Thu, 28 Feb 2013 12:34:03 +0100
Cc: tor-dev@xxxxxxxxxxxxxxxxxxxx
Delivered-to: archiver@xxxxxxxx
Delivery-date: Thu, 28 Feb 2013 06:34:19 -0500
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; bh=VJYMDC9ditvsFTJLLJnnCn5LzqfJxztMxF5/BZkA41c=; b=wP6NnlIqC0d95XwhYhk038Yw0eFbFW9VhDoin427ap229kV91dSDOmpbbUAm/ky6F7 OYITb3UM7soA7sLuV3gsIoaXp7/Bh9M/1a74XbwFQj477oemtp+nFeuxgq7hDpHKTzez RqRHyEw87yFVJk/L/TMaLoYFv/+A92nL7tM4C42be4ZMzEJ9YmFiGqxOGjRxmZoVmEzp XeHZ29g8aiKCYLGNQ275SLAwCWUMCM5J6zuR091uNtpEixZ7t0JdA7tGIVq5OdRIniaF uifpXC6X68N4HYceR9wjXeV9IMPg68n71TBOwmOicmHM4HhKdgQazsJ6e8hXzU9fPt54 eajA==
In-reply-to: <CAEj1OKAEM6VmrCCrUbJVH9iWBDEL-XWJ7vq63u8kDyV3D0_J1g@xxxxxxxxxxxxxx>
List-archive: <http://lists.torproject.org/pipermail/tor-dev>
List-help: <mailto:tor-dev-request@lists.torproject.org?subject=help>
List-id: discussion regarding Tor development <tor-dev.lists.torproject.org>
List-post: <mailto:tor-dev@lists.torproject.org>
List-subscribe: <https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev>, <mailto:tor-dev-request@lists.torproject.org?subject=subscribe>
List-unsubscribe: <https://lists.torproject.org/cgi-bin/mailman/options/tor-dev>, <mailto:tor-dev-request@lists.torproject.org?subject=unsubscribe>
References: <51128154.6080902@xxxxxxxxxxxxxx> <5115FEF7.2090604@xxxxxxxxxxxxxx> <CAJdkzEOu4vgZ-fDguM0c_Fa0FrkCZ=KGje4_k4NX5v4eZGnqAA@xxxxxxxxxxxxxx> <CAEMWYqK-0DXY0oZM9C6UG6L-DgU8xS-3wHvdsAeFZtyqm8o_2A@xxxxxxxxxxxxxx> <CAJdkzEPJZZjgOX-+q2WG5hV0141stR9PNYYOnRqxKEytyVTUFA@xxxxxxxxxxxxxx> <511957DA.5050105@xxxxxxxxxxxxxx> <CAJdkzEOkqt+59KhOMCdO3kFqAv=B_0sLd_n_E4u3ycQP0ojkaQ@xxxxxxxxxxxxxx> <CAEj1OKBwKZpe=QgurDneFbrqe=zEVoSdfxMLc90vvKM1MQscuQ@xxxxxxxxxxxxxx> <CAJdkzENzOD2O0+VrOQyaVHXUAO7w2SSVSKVVfSQyevVYwrq6yg@xxxxxxxxxxxxxx> <5123983B.8070905@xxxxxxxxxxxxxx> <CAEj1OKABy3ArZ6y7wuEWi9RQEMjrJqMjHYKCmZ5n0Dt6BYkrGg@xxxxxxxxxxxxxx> <5127B202.3050903@xxxxxxxxxxxxxx> <CAEj1OKC6F+Nt+v2Fp=VS3a5nF_AyyoFYY1zEj1JyF0FppAOU0w@xxxxxxxxxxxxxx> <5129332A.7040100@xxxxxxxxxxxxxx> <CAEj1OKCLZg65eah1WmOdc_3X5p+JgYzMGM+ofRGB2OR9hhB_QQ@xxxxxxxxxxxxxx> <512B8304.3010104@xxxxxxxxxxxxxx> <CAEj1OKAEM6VmrCCrUbJVH9iWBDEL-XWJ7vq63u8kDyV3D0_J1g@xxxxxxxxxxxxxx>
Reply-to: tor-dev@xxxxxxxxxxxxxxxxxxxx
Sender: tor-dev-bounces@xxxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130216 Thunderbird/17.0.3

[Nick, can you merge my torspec proposal218 branch, please?  Thanks!]


On 2/28/13 2:08 AM, Rob Jansen wrote:
> On Mon, Feb 25, 2013 at 10:28 AM, Karsten Loesing <karsten@xxxxxxxxxxxxxx>
>  wrote:
> 
>> On 2/23/13 11:20 PM, Rob Jansen wrote:
>>> On Feb 23, 2013 4:22 PM, "Karsten Loesing" <karsten@xxxxxxxxxxxxxx>
>> wrote:
>>>> Your understanding of n_circ_id and p_circ_id matches mine, but are you
>>>> sure there's a UID for circuits other than origin circuits?  I think you
>>>> mean origin_circuit_t->global_identifier.  But there's no such field for
>>>> or_circuit_t or circuit_t.  Or do you mean something else?
>>>>
>>> Ok. Though I thought my original patch moved the global Id to the base
>>> circuit struct. Perhaps I didn't. Anyway, I'm not sure it matters...
>>
>> Your original patch did move the ID to circuit_t, but I thought we
>> wanted to avoid numbering non-origin circuits (mostly because that
>> affects relays in non-TestingTorNetwork mode and could lead to busy
>> relays running out of IDs at some point), which is why I took this
>> change out.
>>
> 
> Right, I remember this now. I could imagine ways to get around the 'running
> out of ids' problem, like resetting the id counter after a given timeframe.
> In fact, we could just let the counter overflow and loop back to 0 on its
> own (assuming its a unsigned int type) since we really only need them to be
> unique approximately for the life of a circuit. If we see the ID come up
> again after an hour, we can assume its a new circuit.

I'd rather not want to change behavior in non-simulation mode.  If we
really need a UID for non-origin circuits, we could add a separate
uint64_t global_identifier to circuit_t and only use that in simulating
mode.  I'm not sure that we need it though.

>> Also, even if we moved the field to circuit_t, the new CELL_STATS events
>> would be the only ones using these UIDs, because all other events are
>> for origin circuits only.  I don't see how these IDs would help us.  We
>> never learn what UID the next or previous node in a circuit picks for a
>> given circuit.
> 
> Not currently, but couldn't we implement the functionality where relays log
> or export their circuit UIDs and next/prev IDs over the control port?
> Though I'm not sure if you'd want this in mainline Tor if its only useful
> in simulation mode...

We could have relays log local circuit UIDs together with p_circ_id,
p_conn_id, n_circ_id, and n_conn_id in simulation mode.  But I don't see
how that would facilitate circuit tracking compared to the current
approach using ORCONN and CELL_STATS events.

Here's an example:

- fileclient creates a circuit with
  - UID 14,
  - n_conn_id 15, and
  - n_circ_id 19403.

- tokenglobal is the first hop in this circuit and reports
  - (new) UID 12345,
  - p_conn_id 32,
  - p_circ_id 19403,
  - n_conn_id 18, and
  - n_circ_id 6710.

- tokenrelay is the second hop and reports
  - (new) UID 23456,
  - p_conn_id 17,
  - p_circ_id 6710,
  - n_conn_id 16,
  - n_circ_id 34402.

- exit2 is the third and final hop and reports
  - (new) UID 34567,
  - p_conn_id 15, and
  - p_circ_id 34402.

We need ORCONN events to know that fileclient's n_conn_id 15 is the same
connection as tokenglobal's p_conn_id 32.  How would the new UIDs make
this easier?

>>>> tl;dr: I _think_ it's possible to reconstruct circuits from ORCONN and
>>>> CELL_STATS events as they are currently specified in proposal 218.
>>>>
>>> Great, but do we really expect every Tor controller parser to get this
>>> right? It seems complicated enough that there should be an easier way.
>>> Maybe it's just wishful thinking on my part.
>>
>> I agree that reconstructing circuits from ORCONN and CELL_STATS events
>> is far from trivial.  I don't really see how to make it simpler though.
>
> What about linking all of the IDs and UIDs as described above?

(See above.)

>> From an earlier mail in this thread:
>>>> Finally, Rob, should I look into CIRC_BW events you suggested a while
>>>> ago?  If yes, what did you have in mind how that event would look like,
>>>> and when/by whom would it be emitted?
>>>
>>> If we want to do this, it would likely be an aggregation of all STREAM_BW
>>> events for a circuit, but only during the time when those streams
>> belonged
>>> to that circuit. I don't think it makes sense to emit it for every
>>> STREAM_BW event though, so what if we aggregate and emit it once per
>>> second? A format similar to the STREAM_BW format should work fine.
>>
>> Done.  I specified and implemented such a CIRC_BW event.
>>
>>
> Great!
> 
> 
>>
>> Here's the updated proposal 218 (Nick, please don't merge this yet):
>>
>>
>> https://gitweb.torproject.org/user/karsten/torspec.git/blob/refs/heads/proposal218:/proposals/218-usage-controller-events.txt
> 
> 
> In section 5.3/5.4/5.5, these events are emitted in the
> second_elapsed_callback(), right? I wanted to verify that a relay who
> hasn't sent anything in a few seconds and then starts sending again will
> emit the event at the end of the second after which it resumed sending,
> rather than the first bytes after it resumed sending.

Yes, all these events are emitted in second_elapsed_callback().

> The last word in 5.4 should be 'reading' instead of 'read'.

Fixed.

> Is the specification of 5.5 and 5.6 complex enough to warrant including
> example outputs?

Sure, can't hurt.  Added a few examples.

> In 5.6, were we planning on explaining how buckets can go negative and how
> that affects the reporting of the TB_EMPTY events?

I tried a better explanation:

"""
  ReadBucketEmpty (WriteBucketEmpty) is the time in millis that the read
  (write) bucket was empty since the last refill.  LastRefill is the
  time in millis since the last refill.

  If a bucket went negative and if refilling tokens didn't make it go
  positive again, there will be multiple consecutive TB_EMPTY events for
  each refill interval during which the bucket contained zero tokens or
  less.  In such a case, ReadBucketEmpty or WriteBucketEmpty are capped
  at LastRefill in order not to report empty times more than once.
"""

> Here's the tor branch:
>>
>>
>> https://gitweb.torproject.org/karsten/tor.git/shortlog/refs/heads/morestats3
>>
>> Here's a Shadow log file containing all new events:
>>
>> https://people.torproject.org/~karsten/volatile/scallion.log.gz
>>
>> Here's the Java program that I used to parse the Shadow log file:
>>
>> https://people.torproject.org/~karsten/volatile/ParseProposal218Events.java
>>
>>
> Ugh. I really don't look forward to writing parsers for this. I guess there
> may be few projects that actually require this information, and those that
> do can use your code:)

I hope that some of the complexity goes away once we use a parsing
library like Stem.  And I think we should provide parsing code to other
people looking into this.

>> And finally, here's the output, which should be easier to understand now:
>>
>> https://people.torproject.org/~karsten/volatile/scallion.txt
>>
>> Search for "Circuit [fileclient-60.1.0.0]:14" to find the circuit I
>> mentioned earlier in this thread.
>>
>>
>> Can you review the proposal changes and tell me if they make sense to you?
>>
>> Best,
>> Karsten
>>
>>
> See comments above. Overall, it looks good. I'm still wondering about the
> UIDs / IDs issue. It may be that we don't want to include that for other
> reasons, in which case the current implementation is fine.

Okay, in that case let's consider proposal 218 done for the moment,
unless we come up with a better idea to solve circuit-tracking thing.

I asked Nick above to merge my changes, so that I can put proposal 218
on the sponsor F year 3 wiki page as one result of the February 28
milestone.  (That doesn't mean we cannot make it even better after
February.)

Thanks!
Karsten

_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

Follow-Ups:
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Nick Mathewson

References:
- [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Damian Johnson
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Damian Johnson
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Damian Johnson
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Damian Johnson
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing
- Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
  - From: Karsten Loesing

Prev by Author: Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
Next by Author: [tor-dev] Interest in Contributing to Tor
Previous by thread: Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
Next by thread: Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage
Index(es):
- Author
- Thread