
Re: [tor-bugs] #20548 [Metrics]: Handle bad input more consistently in metrics code bases

From: "Tor Bug Tracker & Wiki" <blackhole@xxxxxxxxxxxxxx>
Date: Mon, 07 Nov 2016 08:13:52 -0000

#20548: Handle bad input more consistently in metrics code bases
-------------------------+---------------------
 Reporter:  karsten      |          Owner:
     Type:  enhancement  |         Status:  new
 Priority:  Medium       |      Milestone:
Component:  Metrics      |        Version:
 Severity:  Normal       |     Resolution:
 Keywords:               |  Actual Points:
Parent ID:               |         Points:
 Reviewer:               |        Sponsor:
-------------------------+---------------------

Comment (by iwakeh):

 Some thoughts:

 One step is unifying the parsing process by replacing all parsing code
 with metrics-lib provided parsing (which is already under way for
 CollecTor).  This addresses goal number one in the description above.

 Goal number two (of the bullet point list in the description above) is
 fine, too, as descriptors are separate data units and failure of parsing
 one should not influence parsing and storing of subsequent descriptors
 only because these happened to be stored in the same file temporarily.
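
 A minimal sketch of that isolation, not taken from metrics-lib: it assumes
 that each descriptor in a combined file starts with a line beginning with
 "@type ", and it uses a hypothetical stand-in parser (toy_parse) that only
 checks for a "router <nickname>" line.  A failure in one descriptor is
 recorded and the loop moves on to the next one.

```python
from typing import Callable, Dict, List, Tuple


def split_descriptors(blob: str, start_prefix: str = "@type ") -> List[str]:
    """Split file contents into raw descriptors.

    Assumption: every descriptor starts with a line beginning with
    start_prefix; real files may need a different delimiter.
    """
    chunks: List[str] = []
    current: List[str] = []
    for line in blob.splitlines(keepends=True):
        if line.startswith(start_prefix) and current:
            chunks.append("".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("".join(current))
    return chunks


def toy_parse(raw: str) -> Dict[str, str]:
    """Hypothetical stand-in parser: requires a 'router <nickname>' line."""
    for line in raw.splitlines():
        if line.startswith("router "):
            parts = line.split()
            if len(parts) < 2:
                raise ValueError("'router' line without nickname")
            return {"nickname": parts[1]}
    raise ValueError("missing 'router' line")


def parse_all(
    blob: str, parse_one: Callable[[str], Dict[str, str]]
) -> Tuple[List[Dict[str, str]], List[Tuple[str, str]]]:
    """Parse each descriptor on its own; collect failures instead of aborting."""
    parsed: List[Dict[str, str]] = []
    failed: List[Tuple[str, str]] = []
    for raw in split_descriptors(blob):
        try:
            parsed.append(parse_one(raw))
        except ValueError as exc:
            # Keep the raw bytes so the descriptor can be stored elsewhere.
            failed.append((raw, str(exc)))
    return parsed, failed
```

 Keeping the raw text of each failed descriptor next to the error message
 means nothing that was seen gets dropped, whatever is decided about where
 to store it.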

 Regarding the second list: privacy and client expectations, i.e., topics 3
 and 4, are the most important.

 One way to combine storing-of-all-that-is-seen with privacy and client
 expectations would be to store invalid descriptors separately.  The
 separate location can also be public for relay descriptors and sanitized
 bridge descriptors, i.e., public folders for download would be 'archive',
 'relay', and 'substandard' (or some better name).  All bridge descriptors
 that cannot be sanitized should be stored too, but not yet be offered to
 the public.
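
 The routing rule could be sketched as follows.  This is only an
 illustration of the idea, not existing CollecTor code; the Store names
 ('regular', 'withheld') and the function choose_store are placeholders
 made up for this example, and only 'substandard' comes from the proposal.

```python
from enum import Enum


class Store(Enum):
    REGULAR = "regular"          # placeholder for the existing public location
    SUBSTANDARD = "substandard"  # proposed public location for invalid descriptors
    WITHHELD = "withheld"        # placeholder: stored, but not offered to the public


def choose_store(is_bridge: bool, parses_cleanly: bool, sanitized: bool = False) -> Store:
    """Pick a storage location for one descriptor.

    Privacy comes first: a bridge descriptor that could not be sanitized is
    never placed in a public location, whether or not it parses.
    """
    if is_bridge and not sanitized:
        return Store.WITHHELD
    if parses_cleanly:
        return Store.REGULAR
    return Store.SUBSTANDARD
```

 Checking the privacy condition before the validity condition keeps the
 public 'substandard' folder free of unsanitized bridge data by
 construction.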

 Advantages:
 * privacy is ensured
 * clients can choose the quality of descriptors they're interested in
 * we'd get an overview of how many 'bad' descriptors show up every month
 and can analyze them
 * others can analyze the 'substandard' descriptors, too, or use them
 if they choose to
 * given that descriptors are not supposed to be altered other than for
 privacy reasons, some could still be integrated later into the 'normal'
 archives, for example when more robust parsing is available

 Disadvantages:
 * implementation of the third storage (all over, i.e., for 'recent', 'out',
 and 'substandard'), but the implementation should be easy
 * maintenance of the third storage location

 Concerning already archived data, there are two options:
 * leave them as they are
 * or re-parse and sort substandard historic descriptors into tarballs in
 the 'substandard' directory.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20548#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online