
Re: [tor-bugs] #28877 [Core Tor/Tor]: Paginate large controller commands like 'GETINFO desc/all-recent'



#28877: Paginate large controller commands like 'GETINFO desc/all-recent'
--------------------------+--------------------------
 Reporter:  wagon         |          Owner:  atagar
     Type:  defect        |         Status:  assigned
 Priority:  Medium        |      Milestone:
Component:  Core Tor/Tor  |        Version:
 Severity:  Normal        |     Resolution:
 Keywords:                |  Actual Points:
Parent ID:                |         Points:
 Reviewer:                |        Sponsor:
--------------------------+--------------------------

Comment (by atagar):

 > so how it can be "paginated"?

 Hi wagon. A paginated API is one that returns results in batches of a
 limited size. The caller then makes a series of calls to get the full
 listing. Ignoring the control interface for a minute, paginated interfaces
 look like...

 {{{
 First Request: {
   first_index: 0,
   size: 5,
 }

 First Response: [
   <descriptor 1>,
   <descriptor 2>,
   <descriptor 3>,
   <descriptor 4>,
   <descriptor 5>,
 ]

 Second Request: {
   first_index: 5,
   size: 5,
 }

 Second Response: [
   <descriptor 6>,
   <descriptor 7>,
 ]
 }}}

 The caller sees that its second request returned only two descriptors
 (rather than the five requested), so it knows it has received them all.
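
 As a rough illustration of that loop, here is a minimal Python sketch of
 the caller's side. Nothing in it is a real tor or Stem call: fetch_batch
 and make_fake_backend are hypothetical stand-ins for whatever paginated
 request we might end up with.

```python
from typing import Callable, List, Tuple


def fetch_all(fetch_batch: Callable[[int, int], List[str]], size: int = 5) -> List[str]:
    """Collect every item by requesting batches until a short one arrives.

    fetch_batch(first_index, size) is a hypothetical paginated request that
    returns at most `size` items starting at `first_index`.
    """
    if size <= 0:
        raise ValueError("size must be positive")

    results: List[str] = []
    first_index = 0

    while True:
        batch = fetch_batch(first_index, size)
        results.extend(batch)

        # A batch smaller than requested (possibly empty) means we hit the end.
        if len(batch) < size:
            return results

        first_index += len(batch)


def make_fake_backend(descriptors: List[str]) -> Tuple[Callable[[int, int], List[str]], List[Tuple[int, int]]]:
    """Build an in-memory backend that also records the requests it saw."""
    calls: List[Tuple[int, int]] = []

    def fetch_batch(first_index: int, size: int) -> List[str]:
        calls.append((first_index, size))
        return descriptors[first_index:first_index + size]

    return fetch_batch, calls


if __name__ == "__main__":
    backend, calls = make_fake_backend(["<descriptor %d>" % i for i in range(1, 8)])
    print(len(fetch_all(backend, size=5)), "descriptors in", len(calls), "requests")
```

 Note that when the total is an exact multiple of the batch size the caller
 needs one extra request, which comes back empty, to learn it is done.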

 The reason for a paginated API is to divide the fourteen-megabyte GETINFO
 response controllers presently get into a series of bite-sized responses.
 A massive GETINFO response like this saturates the control connection,
 preventing further commands and events from being transmitted.

 In fact, Nyx used to avoid commands like 'GETINFO ns/all' entirely in
 favor of reading cached descriptors from tor's data directory. This was
 far faster and avoided blocking the control socket (effectively all the
 command does is echo the file), but I've been cautioned that any direct
 use of tor's data directory is a bad idea.

 I'm not overly married to the idea of a paginated API. I'd be delighted to
 chat with the network team about design ideas, but as a first step let's
 be clear about the problem we're trying to address: **controllers need the
 ability to break up multi-megabyte responses into smaller replies so we
 avoid saturating the control connection.**

 Pagination might be a poor fit. In particular...

 * GETINFO commands are not designed to take keyword arguments. We could
 hack this together with positional arguments (**GETINFO desc/batch/0/5**
 then **GETINFO desc/batch/5/5** for the example above), but needless to
 say... ick.

 * Concurrency. If tor downloads new descriptors while we're in the middle
 of iterating over its prior ones, the caller will end up with an
 incorrect enumeration. Usually this would be dealt with by a consensus id
 argument so the caller can specify the set of descriptors it's iterating
 over, but this isn't really how tor is designed.
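
 To make both concerns concrete, here is a toy Python sketch. It is purely
 illustrative: 'desc/batch/<first>/<size>' is the hypothetical key from the
 first bullet (tor has no such command), DescriptorStore is an in-memory
 stand-in rather than tor code, and the version tag returned with each
 batch is an assumption standing in for the "consensus id" idea.

```python
class DescriptorStore:
    """Toy stand-in for tor's descriptor cache (not real tor code)."""

    def __init__(self, descriptors):
        self.descriptors = list(descriptors)
        self.version = 0  # assumed snapshot id, bumped on every change

    def replace(self, descriptors):
        """Simulate tor downloading a fresh set of descriptors."""
        self.descriptors = list(descriptors)
        self.version += 1

    def getinfo(self, key):
        """Answer the hypothetical 'desc/batch/<first>/<size>' key."""
        prefix, first, size = key.rsplit("/", 2)
        if prefix != "desc/batch":
            raise ValueError("unrecognized key: %s" % key)
        first, size = int(first), int(size)
        return self.version, self.descriptors[first:first + size]


def batch_key(first, size):
    """Encode the pagination arguments positionally in the GETINFO key."""
    return "desc/batch/%d/%d" % (first, size)


def fetch_consistent(store, size, max_attempts=3):
    """Page through the store, restarting if its contents change midway."""
    for _ in range(max_attempts):
        results, first, version = [], 0, None

        while True:
            batch_version, batch = store.getinfo(batch_key(first, size))

            if version is None:
                version = batch_version
            elif batch_version != version:
                break  # descriptors changed under us, start over

            results.extend(batch)

            if len(batch) < size:
                return results  # short batch: enumeration is complete

            first += len(batch)

    raise RuntimeError("descriptors kept changing during enumeration")
```

 Restarting on a version change keeps the enumeration correct, but at the
 cost of re-fetching everything, which is part of why this feels like a poor
 fit.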

 So TL;DR: We need some way of breaking up these responses. Pagination
 probably isn't a good fit, so ideas are welcome.

 > Maybe you can. I started from finding a source of this problem.

 I suspect we're talking about two different things. The above is a
 long-standing problem I've had with tor's 'get all descriptor' commands ('GETINFO
 desc/all-recent', 'GETINFO md/all', and 'GETINFO ns/all'). I'm tackling
 that topic in this ticket because that's the problem you cited originally
 ("GETINFO desc/all-recent returns very huge listing which interpreter
 cannot manage properly.").

 Nyx actually **avoids** making that particular query because doing so
 would temporarily hose the control connection in the way you describe. I
 just took a look and unless I'm missing something I'm not finding any
 'GETINFO desc/all-recent' calls in nyx.

 I suspect your initial hypothesis about the reason Nyx is freezing is
 inaccurate. Feel free to file a **separate** ticket with the 'nyx --debug'
 output when Nyx freezes so I can see what's up. But I'd like to hijack
 this ticket to brainstorm our long-term plan for these bulky GETINFO
 commands since they've been a long-standing pain point for me.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28877#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online