Re: Proposal: GETINFO controller option for connection information

On Tue, Jun 1, 2010 at 8:47 AM, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:

I'll try to follow up to this whole thread and push things forward even more.

On April 14, Damian wrote:
[...]

>Also, could we move forward on the other (less controversial) items? For instance, bandwidth totals tend to be a very highly requested piece of information and pipe's already provided a nice patch to get it (http://www.mail-archive.com/or-talk@xxxxxxxxxxxxx/msg13085.html). For reference, here's the not-so-controversial GETINFO options I proposed:

I'm fine with most of these, but the names are no good. *Everything*
that's returned from GETINFO is "info" after all, so prefixing all of
these with "info" is redundant; they need to be put into a grouping
that actually says what they mean.

The only one I wouldn't want to add as-is is the ns/authorities option,
since it says that it uses the v2 directory format. That format is
obsolescent; we shouldn't be adding new things that use it. A
non-deprecated format would be fine.

[...]

>I'm not planning on converting the following to the customary 80-character width until it's at
> least past being a first draft for a couple reasons:
>1. I find editing fixed-width documents to be a time consuming pain in the ass.

Maybe use a text editor that will re-wrap paragraphs for you? There
are dozens of them.

> 2. I've yet to hear why we do this. Is it just to cater to mail clients too dumb to know how to line wrap?

Three reasons off the top of my head. First: That's the format that
RFCs use. Second: we like to be able to use diff to compare different
versions of a spec, and making every paragraph a single line makes
diff's output much less useful. Third: we like to version-control our
specs, and it's a lot easier to resolve conflicts when every paragraph
is not a single line.

There may be more benefits too.

On Fri, Apr 16, 2010 at 3:17 PM, Jacob Appelbaum <jacob@xxxxxxxxxxxxx> wrote:
> Damian Johnson wrote:
>> Yesterday Jake met with me to discuss this proposal, making the very
>> good points that both:
>> 1. It's completely ineffectual for the auditing purposes I've
>> mentioned since either (a) these results can be fetched from netstat
>> already or (b) the information would only be provided via tor and
>> can't be validated.
>> 2. The things I'm really interested in can be fetched with much less
>> (and safer) information.
>
> I still think that anything that can be used to track circuits (and the
> clients associated with them) is not a good idea - in Tor or using arm.
> We shouldn't encourage people to log, look or otherwise track Tor.
>
>>
>> In particular we discussed making the proposal circuit based rather
>> than connection based, being something like the following:
>>
>> "circ/<Circuit identity>" -- Provides entry for the associated circuit,
>> formatted as:
>> CIRC_ID IN_TYPE OUT_TYPE READ WRITE UPTIME
>>
>> none of the parameters contain whitespace, and additional results must be
>> ignored to allow for future expansion. Parameters are defined as follows:
>> CIRC_ID - Unique identifier for the circuit this belongs to.
>> IN_TYPE/OUT_TYPE - Single character flags indicating the purpose of the
>> inbound or outbound connection. If no connection is established then
>> this provides an empty string. Otherwise, it consists of one from each
>> of the following categories (this may become longer in future
>> expansion):
>> Usage Type:
>> C: client traffic, R: relaying traffic,
>> X: control, H: hidden service, D: directory
>> Destination:
>> I: inter-tor connection, O: outside the tor network, L: localhost
>> For instance, "RO" would indicate that this was an established
>> 1st-hop (or bridged) relay connection.
>> READ/WRITE - Total bytes read/written over the life of this connection.
>> UPTIME - Time the connection's been established in seconds.

This looks a lot better; I don't see a good way to cause problems with this.
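
For concreteness, here is a rough sketch of how a controller might
parse one of these lines. It is only an illustration of the draft
format quoted above: the struct, the function name, and the buffer
sizes are made up for this example, and it assumes both type fields
are non-empty (an empty IN_TYPE or OUT_TYPE would be ambiguous in a
whitespace-separated line, so the draft may want a placeholder there).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one "circ/<id>" reply line.  The fields
 * mirror the draft; the buffer sizes are arbitrary for this sketch. */
struct circ_entry {
  char circ_id[32];
  char in_type[8];
  char out_type[8];
  unsigned long long bytes_read;
  unsigned long long bytes_written;
  unsigned long uptime; /* seconds */
};

/* Parse "CIRC_ID IN_TYPE OUT_TYPE READ WRITE UPTIME" into *out.
 * Returns 0 on success, -1 if fewer than six fields could be read.
 * Anything after the sixth field is ignored, as the draft requires.
 * Assumes IN_TYPE and OUT_TYPE are both non-empty. */
int
parse_circ_entry(const char *line, struct circ_entry *out)
{
  int n;
  assert(line != NULL && out != NULL);
  memset(out, 0, sizeof(*out));
  n = sscanf(line, "%31s %7s %7s %llu %llu %lu",
             out->circ_id, out->in_type, out->out_type,
             &out->bytes_read, &out->bytes_written, &out->uptime);
  return n == 6 ? 0 : -1;
}
```

Ignoring trailing fields falls out of sscanf stopping after the sixth
conversion, which is the behavior the "future expansion" clause needs.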

>> "circ/all" -- Newline separated listing of all current circuits.

Do you mean their IDs, or their entries in the format specified above, or what?

[...]

>> SafeControlPort 0|1
>> Restricts access of the control port to only include read-only operations.
>> (Default: 0)
>>
>> Making this the default would be a no-go due to vidalia (though still
>> a nice option to have...). If this is implemented its setting should
>> be part of the PROTOCOLINFO response.

I agree with Jake that this probably wants to be another proposal of
its own, and get implemented independently.

>> Finally, the other proposed GETINFO options still seem useful (with
>> the possible exception of "info/uptime-reset"), and could be improved
>> with the addition of:
>>
>> "info/user" -- User under which the tor process is running, providing an
>> empty string if none exists.
>>
>
> You may also want something like the following:
>
> "info/uid"
> "info/euid"
> "info/gid"
> "info/egid"

Probably a "process-owner" notion is closer to what you want here.
Also remember that it needs to work on Windows. ;)

Also see above caveat on the "info/" prefix.
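
As a sketch of what I mean by a process-owner lookup, something like
the following would cover the POSIX side. The function name is
hypothetical, it is not existing Tor code, and it does not address
Windows at all; that platform would need a separate implementation
(the Win32 GetUserName call is the likely starting point). Following
the draft above, it yields an empty string when no user name exists.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the name of the user that owns this
 * process (by effective uid) into buf.  Writes an empty string if the
 * uid has no passwd entry.  Returns 0 on success, -1 if buf is too
 * small to hold the name.  POSIX only. */
int
get_process_owner(char *buf, size_t buflen)
{
  const struct passwd *pw;
  int n;
  assert(buf != NULL && buflen > 0);
  pw = getpwuid(geteuid());
  if (pw == NULL || pw->pw_name == NULL) {
    buf[0] = '\0';
    return 0;
  }
  n = snprintf(buf, buflen, "%s", pw->pw_name);
  return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

Returning a name rather than raw uid/gid numbers is what lets the same
option make sense on platforms that have no numeric ids.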

>> "info/pid" -- Process id belonging to the tor process, -1 if none exists for
>> the platform.
>>
>> * this one is both useful and surprisingly difficult for me to
>> retrieve at present (arm attempts to get it from pidof, ps, and
>> netstat yet still fails on some systems...)
>
> The good news is that it's pretty easy to do in C:
>
> #include <stdio.h>
> #include <unistd.h>
>
> pid_t pid;
> pid = getpid(); // see also getppid();
> printf("PID is: %ld\n", (long)pid); // pid_t width varies, so cast

Fine by me, modulo calling it "info".

At this point we're probably ready for another proposal revision, and
a draft patch to implement all of the above. :)

--
Nick