Re: [tor-dev] Proposal: Tor bandwidth measurements document format
- To: tor-dev@xxxxxxxxxxxxxxxxxxxx
- Subject: Re: [tor-dev] Proposal: Tor bandwidth measurements document format
- From: juga <juga@xxxxxxxxxx>
- Date: Mon, 30 Apr 2018 13:21:00 +0000
Hi,
After teor's review, the second version is pasted below.
The changes can be seen at:
https://github.com/juga0/torspec/commits/bandwidth-file-spec
Best,
juga
=================================================================
Tor Bandwidth Measurements Document Format
juga
teor
1. Scope and preliminaries
This document describes the format of Tor's bandwidth measurements
document, version 1.0.0 and later.
Since Tor version 0.2.4.12-alpha, the directory authorities have used
the bandwidth measurements document called "V3BandwidthsFile", which is
produced by Torflow [1] (format described in README.spec.txt [2]).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.
1.2. Acknowledgements
The original bandwidth measurement scanner (Torflow) and format were
created by mike. Teor suggested writing this specification while
contributing to pastly's new bandwidth scanner implementation.
This specification was revised after feedback from:
XXX
1.3. Outline
The bandwidth measurements mentioned in sections 3.4.1 and 3.4.2
of "Tor directory protocol" (dir-spec.txt) [3] are obtained
by bandwidth authorities, which generate a file storing information
on relays' measured bandwidth capacities.
1.4. Format Versions
1.0.0 - The legacy fallback bandwidth measurements document format
1.1.0 - Adds key_value lines to the header (the format version and
optional lines) and a section separator.
2. Format details
Bandwidth measurements MUST contain the following sections:
- Header (exactly once)
- Relay measurements (zero or more times)
2.1. Definitions
The following nonterminals are defined in dir-spec.txt, sections
1.2., 2.1.1., 2.1.3.:
Int
SP (space)
NL (newline)
Keyword
ArgumentChar
fingerprint (hexdigest)
nickname
Nonterminals defined in "Tor Directory List Format" (dir-list-spec.txt),
section 2.2.1.:
version_number
We define the following nonterminals:
value ::= ArgumentChar+
key_value ::= Keyword "=" value
line ::= ArgumentChar* NL
timestamp ::= Int
bandwidth ::= Int
relay_line ::= key_value (SP key_value)* NL
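As an illustration only, the key_value and relay_line nonterminals can be
approximated with a regular expression. This is a rough Python sketch, not
part of the specification: the character classes are simplifying
assumptions (dir-spec.txt holds the authoritative definitions of Keyword
and ArgumentChar), and the names KEY_VALUE_RE and split_key_values are
hypothetical.

```python
import re

# Assumption: keys are ASCII letters, digits, "-" or "_"; values are runs of
# printable, non-space ASCII characters. dir-spec.txt is authoritative.
KEY_VALUE_RE = re.compile(r"([A-Za-z0-9_-]+)=([\x21-\x7e]+)")


def split_key_values(relay_line):
    """Split a relay_line into a list of (key, value) tuples.

    Raises ValueError if any space-separated item is not a key_value.
    """
    pairs = []
    for item in relay_line.rstrip("\n").split(" "):
        match = KEY_VALUE_RE.fullmatch(item)
        if match is None:
            raise ValueError("not a key_value: %r" % item)
        pairs.append((match.group(1), match.group(2)))
    return pairs
```

For example, split_key_values("node_id=$AAAA bw=760\n") yields the two
pairs ("node_id", "$AAAA") and ("bw", "760").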
2.2. Header format
Some header lines MUST appear in specific positions, as documented below.
All other lines can appear in any order.
There MUST NOT be multiple key_value header lines with the same key.
The header consists of:
timestamp NL
[At start, exactly once.]
The Unix Epoch time in seconds when the file was created.
"version=" version_number NL
[In second position, zero or one time.]
The specification document format version.
It uses semantic versioning [5].
This line has been added in version 1.1.0 of this specification.
Version 1.0.0 documents do not contain this line, and the
version_number is considered to be "1.0.0".
"software=" value NL
[Zero or one time.]
The name of the software that created the document.
This line has been added in version 1.1.0 of this specification.
Version 1.0.0 documents do not contain this line, and the software is
considered to be "torflow".
"software_version=" value NL
[Zero or one time.]
The version of the software that created the document.
The version may be a version_number, a git commit, or some other
version scheme.
This line has been added in version 1.1.0 of this specification.
"scanner_started=" timestamp NL
[Zero or one time.]
The Unix Epoch time in seconds when the scanner that generates the
measurements document started.
This line has been added in version 1.1.0 of this specification.
"earliest_measurement=" timestamp NL
[Zero or one time.]
The Unix Epoch time in seconds when the first relay measurement
was obtained.
This line has been added in version 1.1.0 of this specification.
key_value NL
[Zero or more times.]
Future format versions may include additional key_value header lines.
Additional header lines will be accompanied by a minor version
increment.
Implementations MAY add additional header lines as needed. This
specification SHOULD be updated to avoid conflicting meanings for the
same header keys.
Parsers MUST NOT rely on the order of these additional lines.
Additional header lines MUST NOT use any keywords specified in the
relay measurements format.
If a header line does not conform to this format, the line SHOULD be
ignored by parsers.
NL
[Zero or one time.]
The header ends.
This line has been added in version 1.1.0 of this specification.
For version 1.0.0 documents, the header ends when the first relay
measurement line is found conforming to the next section.
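The header rules above could be read by a parser along the following
lines. This is a minimal Python sketch under stated assumptions, not a
reference implementation: the function name parse_header is hypothetical,
a line containing a space is assumed to be the first relay line (header
values cannot contain spaces), and a repeated key is simply ignored.

```python
import re


def parse_header(lines):
    """Return (header dict, index of the first line after the header)."""
    if not lines or not re.fullmatch(r"[0-9]+", lines[0].rstrip("\n")):
        raise ValueError("the first line must be a timestamp")
    # Defaults for version 1.0.0 documents, as described above.
    header = {
        "timestamp": int(lines[0]),
        "version": "1.0.0",
        "software": "torflow",
    }
    seen = set()
    index = 1
    while index < len(lines):
        line = lines[index].rstrip("\n")
        if line == "":
            # Version 1.1.0 separator: the header ends here.
            return header, index + 1
        if " " in line or line.startswith("node_id="):
            # Assumption: this is the first relay line (version 1.0.0).
            return header, index
        key, sep, value = line.partition("=")
        # Lines that do not conform, or repeat a key, are ignored.
        if sep and key and value and key not in seen:
            seen.add(key)
            header[key] = value
        index += 1
    return header, index
```

The returned index lets a caller continue with the relay measurement
lines, whether or not the document contains the separator line.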
2.3. Relay measurements format
This section consists of zero or more relay_lines containing the
measurement results of relays, in arbitrary order.
There can be at most one relay_line per relay identity (fingerprint).
There MUST NOT be multiple key_value pairs with the same key in the same
relay_line.
Each relay_line MUST include the following key_value pairs, in arbitrary
order:
"node_id=" fingerprint
[Exactly once.]
The fingerprint of the relay being measured.
"bw=" bandwidth
[Exactly once.]
The measured bandwidth of this relay.
Tor accepts zero bandwidths, but they trigger bugs in older Tor
implementations. Therefore, implementations SHOULD NOT produce zero
bandwidths. Instead, they SHOULD use one as their minimum bandwidth.
Multiple measurements can be aggregated using an averaging scheme, such
as a mean, median, or decaying average.
Torflow scales bandwidths to kilobytes per second. Other implementations
SHOULD use kilobytes per second for their initial bandwidth scaling.
If different implementations or configurations are used in votes for the
same network, their measurements MAY need further scaling. See
Appendix B for information about scaling, and one possible scaling
method.
key_value
[Zero or more times.]
Future format versions may include additional key_value pairs on a
relay_line.
Additional key_value pairs will be accompanied by a minor version
increment.
Implementations MAY add additional relay key_value pairs as needed. This
specification SHOULD be updated to avoid conflicting meanings for the
same relay keys.
Parsers MUST NOT rely on the order of these additional key_value pairs.
Additional key_value pairs MUST NOT use any keywords specified in the
header format.
If a relay line does not conform to this format, the line SHOULD be
ignored by parsers.
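A parser might apply the relay line rules above roughly as follows. This
Python sketch is an assumption-laden illustration: the function names
parse_relay_line and parse_relay_lines are hypothetical, and keeping the
first line seen for a repeated fingerprint is one possible choice that the
text above does not mandate.

```python
import re


def parse_relay_line(line):
    """Return a dict for one relay_line, or None if it should be ignored."""
    fields = {}
    for item in line.strip().split(" "):
        key, sep, value = item.partition("=")
        if not sep or not key or not value:
            return None  # not a key_value
        if key in fields:
            return None  # the same key appears twice on this line
        fields[key] = value
    if "node_id" not in fields or "bw" not in fields:
        return None  # both keys are required
    if not re.fullmatch(r"[0-9]+", fields["bw"]):
        return None  # bandwidth must be an Int
    fields["bw"] = int(fields["bw"])
    return fields


def parse_relay_lines(lines):
    """Return a dict mapping each fingerprint to its parsed fields."""
    relays = {}
    for line in lines:
        fields = parse_relay_line(line)
        if fields is None:
            continue
        # Assumption: if a fingerprint repeats, keep the first line seen.
        relays.setdefault(fields["node_id"], fields)
    return relays
```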
2.4. Implementation notes
2.4.1. Simple Bandwidth Scanner
Every relay measurement in sbws version 0.1.0 consists of:
"node_id=" fingerprint SP
As above.
"bw=" bandwidth SP
As above.
"nick=" nickname SP
[Exactly once.]
The relay nickname.
"rtt=" Int SP
[Exactly once.]
The Round Trip Time in milliseconds to obtain 1 byte of data.
"time=" timestamp NL
[Exactly once.]
The Unix Epoch time in seconds when the last measurement was performed.
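A line in this layout could be produced as in the following Python sketch.
It is not taken from sbws: the function name format_sbws_relay_line and
its parameters are hypothetical, the node_id is written exactly as passed
in (including any "$" prefix, as in the sample data in Appendix A), and
the minimum bandwidth of one follows the recommendation in section 2.3.

```python
def format_sbws_relay_line(node_id, bandwidth, nickname, rtt_ms, measured_at):
    """Return one relay line in the key order listed above."""
    # Section 2.3: implementations SHOULD NOT produce zero bandwidths.
    bandwidth = max(1, int(bandwidth))
    return "node_id={} bw={} nick={} rtt={} time={}\n".format(
        node_id, bandwidth, nickname, int(rtt_ms), int(measured_at)
    )
```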
2.4.2. Torflow
Torflow relay lines include node_id and bw, and other key_value pairs [2].
References:
1. https://gitweb.torproject.org/torflow.git
2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332
3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
4. https://metrics.torproject.org/onionoo.html#details
5. https://semver.org/
A. Sample data
The following has not been obtained from any real measurement.
A.1. Generated by Torflow
This is an example of a version 1.0.0 document:
1523911758
node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath
node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath
A.2. Generated by sbws version 0.1.0
1523911758
version=1.1.0
software=sbws
software_version=0.1.0
scanner_started=1523911756
earliest_measurement=1523911757
node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test rtt=380 time=1523911725
node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 rtt=378 time=1523911623
B. Scaling bandwidths
B.1. Scaling requirements
Tor accepts zero bandwidths, but they trigger bugs in older Tor
implementations. Therefore, scaling methods SHOULD perform the
following checks:
* If the total bandwidth is zero, all relays should be given equal
bandwidths.
* If the scaled bandwidth is zero, it should be rounded up to one.
Initial experiments indicate that scaling may not be needed for
torflow and sbws, because their measured bandwidths are similar
enough already.
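The two checks listed in B.1 can be sketched in Python as follows. The
function name apply_scaling_checks is hypothetical, and two details are
assumptions rather than requirements: the equal bandwidth given to every
relay when the total is zero is one, and non-integer scaled bandwidths
are rounded to the nearest integer.

```python
def apply_scaling_checks(scaled):
    """Apply the B.1 checks to a dict of fingerprint -> scaled bandwidth."""
    if sum(scaled.values()) == 0:
        # Total bandwidth is zero: give all relays equal bandwidths.
        # Assumption: the equal value is one.
        return {fingerprint: 1 for fingerprint in scaled}
    # A scaled bandwidth that would become zero is raised to one.
    return {
        fingerprint: max(1, round(bandwidth))
        for fingerprint, bandwidth in scaled.items()
    }
```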
B.2. A linear scaling method
If scaling is required, here is a simple linear bandwidth scaling
method, which ensures that all bandwidth votes contain approximately
the same total bandwidth:
1. Calculate the relay quota by dividing the total measured bandwidth
in all votes by the number of relays with measured bandwidth
votes. In the public Tor network, this is approximately 7500 as of
April 2018. The quota should be a consensus parameter, so it can be
adjusted for all scanners on the network.
2. Calculate a vote quota by multiplying the relay quota by the number
of relays this bandwidth authority has measured
bandwidths for.
3. Calculate a scaling factor by dividing the vote quota by the
total unscaled measured bandwidth in this bandwidth
authority's upcoming vote.
4. Multiply each unscaled measured bandwidth by the scaling
factor.
Now, the total scaled bandwidth in the upcoming vote is
approximately equal to the vote quota.
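Steps 2 to 4 can be sketched in Python as below. This is an illustration
under assumptions, not a prescribed implementation: the function name
linear_scale is hypothetical, the relay quota from step 1 is passed in as
a parameter (as a consensus parameter would be), and the zero-bandwidth
checks from B.1 are deliberately left to the caller.

```python
def linear_scale(measured, relay_quota):
    """Scale one bandwidth authority's measurements linearly.

    measured: dict of fingerprint -> unscaled measured bandwidth.
    relay_quota: the per-relay quota from step 1.
    Returns a dict of fingerprint -> scaled bandwidth (unrounded).
    """
    total = sum(measured.values())
    if total == 0:
        # No scaling factor exists; the B.1 checks cover this case.
        raise ValueError("total measured bandwidth is zero")
    # Step 2: vote quota for the relays this authority has measured.
    vote_quota = relay_quota * len(measured)
    # Step 3: scaling factor for the upcoming vote.
    factor = vote_quota / total
    # Step 4: scale each measured bandwidth.
    return {fp: bandwidth * factor for fp, bandwidth in measured.items()}
```

With a relay quota of 7500 and two relays measured at 100 and 300, the
vote quota is 15000 and the scaling factor is 37.5, so the scaled
bandwidths are 3750 and 11250, which sum to the vote quota.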
B.3. Quota changes
If all scanners are using scaling, the quota can be gradually
reduced or increased as needed. Smaller quotas decrease the size
of uncompressed consensuses, and may decrease the size of
consensus diffs and compressed consensuses. But if the relay
quota is too small, some relays may be over- or under-weighted.