[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #17488 [ExoneraTor]: ExoneraTor hangs forever on old known-positive test
#17488: ExoneraTor hangs forever on old known-positive test
------------------------+--------------------------
Reporter: starlight | Owner: karsten
Type: defect | Status: accepted
Priority: Very High | Milestone:
Component: ExoneraTor | Version:
Severity: Major | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Sponsor: |
------------------------+--------------------------
Comment (by karsten):
I'm stuck, but I'll post my findings here in the hope that somebody else
can resolve this mystery.
The bottleneck seems to be the second select in function
`search_statusentries_by_address_date`, that ''"finds status entries of
relays having an exit list entry with the provided IP address as the exit
address."'' That select can be really slow depending on whether the
provided IP address is found in an exit list or not. Consider date
`2011-04-29` and IP addresses `98.113.149.36` (the address mentioned in
this ticket) and `86.59.21.38` (tor26, not an exit):
{{{
EXPLAIN ANALYZE
SELECT statusentry.rawstatusentry,
statusentry.descriptor,
statusentry.validafter,
statusentry.fingerprint,
HOST(statusentry.oraddress),
HOST(exitlistentry.exitaddress),
-- Pick only the last scan result that took place in the 24 hours
-- before the valid-after time.
MAX(exitlistentry.scanned)
FROM statusentry
JOIN exitlistentry
ON statusentry.fingerprint = exitlistentry.fingerprint
WHERE exitlistentry.exitaddress = '98.113.149.36'::INET
-- Focus on a time period from 1 day before and 1 day after the
-- given date. Also include a second day before the given date
-- for exit lists, because it can take up to 24 hours to scan a
-- relay again. We shouldn't miss exit list entries here.
AND DATE(exitlistentry.scanned) >= '2011-04-27'
AND DATE(exitlistentry.scanned) <= '2011-04-30'
AND DATE(statusentry.validafter) >= '2011-04-28'
AND DATE(statusentry.validafter) <= '2011-04-30'
-- Consider only exit list scans that took place in the 24 hours
-- before the relay was listed in a consensus.
AND statusentry.validafter >= exitlistentry.scanned
AND statusentry.validafter - exitlistentry.scanned <=
'1 day'::INTERVAL
GROUP BY 1, 2, 3, 4, 5, 6;
}}}
Query results are:
{{{
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=11062.95..11062.97 rows=1 width=342) (actual
time=100644.027..100644.202 rows=57 loops=1)
-> Nested Loop (cost=0.01..11062.94 rows=1 width=342) (actual
time=122.074..100643.033 rows=84 loops=1)
Join Filter: ((statusentry.validafter - exitlistentry.scanned) <=
'1 day'::interval)
-> Index Scan using exitlistentry_exitaddress_scanneddate on
exitlistentry (cost=0.01..8.82 rows=1 width=56) (actual
time=23.242..68.592 rows=4 loops=1)
Index Cond: ((exitaddress = '98.113.149.36'::inet) AND
(date(scanned) >= '2011-04-27'::date) AND (date(scanned) <=
'2011-04-30'::date))
-> Index Scan using statusentry_fingerprint_validafter on
statusentry (cost=0.00..11054.09 rows=1 width=327) (actual
time=26.442..25142.994 rows=60 loops=4)
Index Cond: ((fingerprint = exitlistentry.fingerprint) AND
(validafter >= exitlistentry.scanned))
Filter: ((date(validafter) >= '2011-04-28'::date) AND
(date(validafter) <= '2011-04-30'::date))
Total runtime: 100644.583 ms
(9 rows)
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=11062.95..11062.97 rows=1 width=342) (actual
time=11.782..11.782 rows=0 loops=1)
-> Nested Loop (cost=0.01..11062.94 rows=1 width=342) (actual
time=11.773..11.773 rows=0 loops=1)
Join Filter: ((statusentry.validafter - exitlistentry.scanned) <=
'1 day'::interval)
-> Index Scan using exitlistentry_exitaddress_scanneddate on
exitlistentry (cost=0.01..8.82 rows=1 width=56) (actual
time=11.766..11.766 rows=0 loops=1)
Index Cond: ((exitaddress = '86.59.21.38'::inet) AND
(date(scanned) >= '2011-04-27'::date) AND (date(scanned) <=
'2011-04-30'::date))
-> Index Scan using statusentry_fingerprint_validafter on
statusentry (cost=0.00..11054.09 rows=1 width=327) (never executed)
Index Cond: ((fingerprint = exitlistentry.fingerprint) AND
(validafter >= exitlistentry.scanned))
Filter: ((date(validafter) >= '2011-04-28'::date) AND
(date(validafter) <= '2011-04-30'::date))
Total runtime: 11.953 ms
(9 rows)
}}}
Note that I saw much worse total runtimes for the first query which took
about 10 times as long.
Also note that I created two new indexes that aren't yet contained in
[https://gitweb.torproject.org/exonerator.git/tree/db/exonerator.sql
ExoneraTor's database schema] yet:
{{{
CREATE INDEX statusentry_validafter_fingerprint
ON statusentry (validafter, fingerprint);
CREATE INDEX statusentry_fingerprint_validafter
ON statusentry (fingerprint, validafter);
}}}
Before having these indexes, the above query would use statusentry_pkey,
but overall runtimes were comparable. And 1.5 minutes or even 15 minutes
is obviously unacceptable for this query. 1.5 seconds would be
acceptable.
Hmm. What's the index that we're obviously missing? (I'd rather not want
to change the table structure, but if that's really, really the only way
to fix this, let's talk about it.)
Feedback much appreciated!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/17488#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs
- Prev by Author:
[tor-bugs] #17542 [- Select a component]: http://health786.com/cellumis/
- Next by Author:
[tor-bugs] [Tor Bug Tracker & Wiki] Batch modify: #10281, #5791, #12418, #12426, #12427, #12516, #12523, #12950, #12968, #13056, #15138, #16352, #16417, #17406, #17505, #17506, #17507, #17508, #17509, #17519, #17531, #17532
- Previous by thread:
Re: [tor-bugs] #17488 [ExoneraTor]: ExoneraTor hangs forever on old known-positive test
- Next by thread:
Re: [tor-bugs] #17488 [ExoneraTor]: ExoneraTor hangs forever on old known-positive test
- Index(es):