[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #15844 [Onionoo]: Develop database schema to support Onionoo's search parameter efficiently
#15844: Develop database schema to support Onionoo's search parameter efficiently
-----------------------------+-----------------
Reporter: karsten | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Onionoo | Version:
Resolution: | Keywords:
Actual Points: | Parent ID:
Points: |
-----------------------------+-----------------
Comment (by leeroy):
The problem '''is '''the `LIKE '%foo%'` . That's the only problem. All the
other queries will be fast because they are index-able using prefix
search, `LIKE 'foo%'` . Looking for a schema or de-normalizing from 3NF
won't fix it. It's not going to be solved by a trigram-match either.
Check out the documentation and look at how the trigram-match is
implemented. The trigram is faster only because it allows a form of
indexing, but the implementation is computation heavy. I was wondering--
what is the importance of a similarity metric? The trigram-match computes
similarity for ranking results. How important (or relevant) is this if
searching for example in fingerprints? If similarity isn't that important
maybe a user-defined function would work better.
Another possibility would be to opportunistically try to complete the
query using a prefix search first.
You might also do for nickname, and ip, what you do for fingerprint. Make
it a separate query type in the protocol and do search_fp, search_nn,
search_ip. Not ideal, I know, but it would avoid a multi-table search
combined with a scan of each. This would also make possible pattern length
adjustments. I mean a single character search might make sense for
nickname search, but does it equally make sense for ip or fingerprint?
I'm really interested in the user-defined function possibility because
that wouldn't require any compromise. Especially if similarity isn't that
important a metric. I'm sure there are other well known algorithms that
are suitable replacements.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/15844#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs