Re: [tor-bugs] #11350 [Onionoo]: Extend Onionoo's lookup parameter to give out relays/bridges that haven't been running in the past week
#11350: Extend Onionoo's lookup parameter to give out relays/bridges that haven't
been running in the past week
-----------------------------+---------------------
Reporter: karsten | Owner: karsten
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Onionoo | Version:
Resolution: | Keywords:
Actual Points: | Parent ID:
Points: |
-----------------------------+---------------------
Comment (by wfn):
This is not really relevant to the relay challenge task per se, so anyone
can safely skip this comment.
Maybe orthogonal, but it can't hurt, so fwiw, re:
{{{
+ /* TODO This is an evil hack to support looking up relays or bridges
+ * that haven't been running for a week without having to load
+ * 500,000 NodeStatus instances into memory. Maybe there's a better
+ * way? Or do we need to switch to a real database for this? */
}}}
Karsten, *if* you decide to do some benchmarking using a database (with
whatever database schema is appropriate), I'd very much advise looking
over the following document/tutorial:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
Note that this is not considered to be any kind of 'postgres hacking';
this can be done in a purely wheezy/stable setting, and is completely
normal practice. The postgres defaults on linux systems are somewhat
conservative: e.g. raising `effective_cache_size` to up to 75% of the
overall system's memory is normal, and the `shared_buffers` default on
linux is usually 32MB or so. (To raise the latter, you do need to edit
`/etc/sysctl.conf` (to raise `SHMMAX`), but again, this should not be
considered fringe/esoteric practice; if it is not done, postgres assumes
it can't pre-allocate more than 32MB of memory, and that's not a lot of
memory.)
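For concreteness, here's a minimal sketch of what that tuning might look
like; the values are purely illustrative (assuming a dedicated box with
~8GB of RAM), not a recommendation for the actual Onionoo host:
{{{
# postgresql.conf -- illustrative values for a dedicated ~8GB machine
shared_buffers = 2GB          # linux default is often 32MB or so
effective_cache_size = 6GB    # ~75% of RAM; a planner hint, not an allocation
work_mem = 64MB               # per-sort/per-hash working memory

# /etc/sysctl.conf -- raise the kernel shared memory cap so postgres
# can actually allocate shared_buffers (apply with `sysctl -p`)
kernel.shmmax = 4294967296    # bytes; must exceed shared_buffers + overhead
kernel.shmall = 1048576       # pages (usually 4kB each)
}}}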
You once mentioned cases of indexes not fitting into memory. Beyond not
using partial/functional indexes (LOWER(), SUBSTR()) and keeping
redundant indexes around, the primary reason for this is (as I've
somewhat painfully discovered) not allowing postgres to actually use
enough memory. (Fwiw, using pre-allocated shared memory is faster, too,
though I'd need to dig up references.)
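Checking whether the indexes actually fit is cheap, fwiw; a quick sketch
(the table name here is made up):
{{{
-- List indexes by on-disk size, largest first.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
 WHERE relkind = 'i'
 ORDER BY pg_relation_size(oid) DESC;

-- Or the total index footprint of one table ('statusentry' is made up):
SELECT pg_size_pretty(pg_indexes_size('statusentry'));
}}}
Comparing those numbers against `shared_buffers` / `effective_cache_size`
usually tells you right away whether the working set can stay cached.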
Sorry for the detour, but in case someone *does* end up experimenting with
less hacky database-based solutions, don't forget to take a good look at
your postgres configuration. :) (or, maybe you've already done that, and
this was all redundant!)
Also, it makes sense to use intermediary tables[1]: e.g. a 'fingerprint'
table for unique fingerprint lookup, which then joins with status entries
and whatnot. The fingerprints can just as well reside in memory, of
course, if they can be efficiently persisted, and so on. In-house
partial-nosql solutions. :)
[1]: e.g. https://github.com/wfn/torsearch/blob/master/db/db_create.sql
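A rough sketch of that idea (all table and column names here are made up
for illustration; they're not taken from Onionoo or torsearch):
{{{
-- Normalize fingerprints into their own table, then join status
-- entries against the small surrogate key.
CREATE TABLE fingerprint (
  fingerprint_id SERIAL PRIMARY KEY,
  fingerprint    CHARACTER(40) UNIQUE NOT NULL
);

CREATE TABLE statusentry (
  fingerprint_id INTEGER NOT NULL REFERENCES fingerprint,
  validafter     TIMESTAMP WITHOUT TIME ZONE NOT NULL,
  PRIMARY KEY (fingerprint_id, validafter)
);

-- A lookup resolves the fingerprint once; the join stays cheap:
SELECT s.validafter
  FROM fingerprint f
  JOIN statusentry s USING (fingerprint_id)
 WHERE f.fingerprint = '0123456789ABCDEF0123456789ABCDEF01234567';
}}}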
(hopefully this was not painful to read! Just wanted to share what I've
learned.)
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/11350#comment:2>