[tor-bugs] #6064 [Metrics Website]: Bridge usage statistics on metrics website are broken
#6064: Bridge usage statistics on metrics website are broken
-----------------------------+----------------------------------------------
Reporter: karsten | Owner: karsten
Type: defect | Status: new
Priority: major | Milestone:
Component: Metrics Website | Version:
Keywords: | Parent:
Points: | Actualpoints:
-----------------------------+----------------------------------------------
The graph on [https://metrics.torproject.org/users.html#bridge-users
bridge users from all countries] recently went up from 10,000 to 50,000.
There was no event that could explain this increase, so I looked for a
possible bug.
Here's the bug: when we aggregate bridge users per day, we write single
observations to a file with lines like this:
{{{
bridge,date,time,??,a1,a2,...,all
0007BC3A0CFC768DB2FA1E3EB6FB4ABF4EBE2D13,2012-05-24,07:12:18,NA,1.12,NA,...,30.55
}}}
In the next step we aggregate these lines by summing up all observations
of a given day.
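A minimal sketch of that aggregation step, written in Python for illustration; it is not the code that runs on the metrics server. It assumes a header trimmed to four columns (the real file has further per-country columns) and sums only the `all` column, skipping `NA` values. The function name and the sample rows in the usage note are hypothetical.

```python
import csv
from collections import defaultdict


def sum_observations_per_day(lines):
    """Sum the 'all' column of single observations, grouped by date.

    `lines` is any iterable of CSV text lines whose first line is a header
    containing at least the columns 'date' and 'all' (an assumed, trimmed
    version of the format shown above).
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        value = row["all"]
        if value == "NA":
            # No observation for this bridge at this time: contributes nothing.
            continue
        totals[row["date"]] += float(value)
    return dict(totals)
```

For example, feeding it a header line `bridge,date,time,all` followed by a few observation lines returns a dict mapping each date to the sum of that day's non-NA `all` values. If lines are missing from the input, the sums silently come out different, which is why a truncated file shows up only as an odd-looking graph.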
Turns out the file with single observations was truncated and we didn't
notice. Whenever we add lines to that file, we read it into memory, add the
new observations, and write the whole file back to disk. The file is
always kept ordered by bridge fingerprint. Here's the distribution of
bridge fingerprints in the file, counted by their first hex digit:
{{{
0 24567
1 24623
2 11687
3 1526
4 1124
5 825
6 1352
7 1422
8 1271
9 1287
A 1336
B 1048
C 1525
D 1227
E 1497
F 994
}}}
Fingerprints are hashes, so we would expect roughly the same number of
bridges in each bucket. Looks like the file was truncated after writing
half of the fingerprints starting with 2. This could have happened due to
Java running out of memory, the server being restarted while writing the
file, etc.
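The bucket check above can be sketched as follows, again in Python and only as an illustration. The function names and the `ratio` threshold are assumptions made for this example; nothing like it is confirmed to exist in the metrics code.

```python
from collections import Counter

HEX_DIGITS = "0123456789ABCDEF"


def first_digit_counts(fingerprints):
    """Count fingerprints by their first hex digit (case-insensitive)."""
    return Counter(fp[0].upper() for fp in fingerprints if fp)


def looks_truncated(counts, ratio=5.0):
    """Return True if the buckets are too uneven to be plausible.

    `ratio` is a hypothetical threshold: the file is flagged when the
    largest bucket is more than `ratio` times the smallest one, or when
    any bucket is empty.
    """
    buckets = [counts.get(digit, 0) for digit in HEX_DIGITS]
    smallest = min(buckets)
    return smallest == 0 or max(buckets) > ratio * smallest
```

With the counts listed above, the largest bucket (24623 for digit 1) is roughly thirty times the smallest (825 for digit 5), so a check like this would have flagged the file before the numbers reached the graph.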
The quick fix is to aggregate bridge usage statistics again and replace
the single-observations file on yatei. I'm going to do that now.
The next fix is to avoid truncating the file by writing to a temp file and
replacing the original file with it once we're done writing. I'll look
into that next.
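A minimal sketch of the temp-file approach, in Python for illustration rather than the Java that would actually be changed. The function name is hypothetical. It relies on `os.replace`, which swaps the file in atomically on POSIX systems when the temp file and the target are on the same filesystem, hence the temp file is created in the target's directory.

```python
import os
import tempfile


def write_file_atomically(path, lines):
    """Write `lines` to `path` so that readers never see a partial file.

    The data goes to a temp file in the same directory first; only after
    it is fully written and synced does it replace the original.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for line in lines:
                tmp.write(line + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk
        os.replace(tmp_path, path)  # swap in the complete file
    except BaseException:
        # Writing failed part-way: drop the temp file, keep the original.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies or runs out of memory while writing, the original file is left untouched and at worst a stray `.tmp` file remains.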
The real fix is to stop using flat files for something that requires a
database. That's going to take me quite a bit longer.
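To give a rough idea of what the database variant could look like, here is a small sketch using SQLite from Python. Everything in it is an assumption made for illustration: the table layout, the column and function names, and the choice of SQLite itself, which the ticket does not specify.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store with one row per single observation."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS observations (
               bridge TEXT NOT NULL,
               date   TEXT NOT NULL,
               time   TEXT NOT NULL,
               users  REAL,            -- NULL stands in for NA
               PRIMARY KEY (bridge, date, time)
           )"""
    )
    return conn


def add_observations(conn, rows):
    """Insert (bridge, date, time, users) tuples in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?)", rows
        )


def daily_totals(conn):
    """Return {date: sum of users}; SUM ignores NULL values."""
    query = "SELECT date, SUM(users) FROM observations GROUP BY date ORDER BY date"
    return dict(conn.execute(query))
```

The point of the sketch is that new observations are added in a transaction, so an interrupted write leaves the previously stored rows intact instead of truncating them.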
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6064>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online