[tor-bugs] #1614 [EFF-HTTPS Everywhere]: list
#1614: list
----------------------------------+-----------------------------------------
Reporter: bee | Owner: pde
Type: enhancement | Status: new
Priority: trivial | Milestone:
Component: EFF-HTTPS Everywhere | Version:
Keywords: | Parent:
----------------------------------+-----------------------------------------
Hi!
So far, this is how I've understood HTTPS Everywhere to work, but I'm not
sure everything I've got is right.
It loads all rules into one array, "this.rules", of the "HTTPSRules" class.
Every rule is an object of the "RuleSet" class.
So, when you need to trigger a filter, you go through all the rules in
"this.rules" to test whether a selected address (which is saved in a URI
object) matches one of the filters of a "RuleSet".
That should be slow if you have plenty of filters, even if you have only a
few rules.
This is why you added one thing called "match_rule". Every rule can have
more than a single filter, and having a "generic" matching rule speeds up
the search a bit: it lets you do only one test per rule before trying its
filters.
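To make sure I'm describing the same thing, here is a rough sketch of the
lookup as I understand it. This is not the add-on's real code: the
simplified "RuleSet" constructor, its field names and the "rewrite" function
are made up for illustration.

```javascript
// Simplified stand-in for the add-on's RuleSet; the field names here are
// assumptions for illustration, not the real class.
function RuleSet(name, matchRule, filters) {
  this.name = name;
  this.matchRule = matchRule; // one "generic" regexp for the whole rule
  this.filters = filters;     // list of { from: RegExp, to: string }
}

// Linear scan: one cheap test per rule, and the individual filters are
// only tried when that test passes.
function rewrite(rules, url) {
  for (var i = 0; i < rules.length; i++) {
    var rule = rules[i];
    if (!rule.matchRule.test(url)) continue;
    for (var j = 0; j < rule.filters.length; j++) {
      var f = rule.filters[j];
      if (f.from.test(url)) return url.replace(f.from, f.to);
    }
  }
  return null; // no rule applies
}

var rules = [
  new RuleSet("EFF", /^http:\/\/(www\.)?eff\.org\//, [
    { from: /^http:\/\/(www\.)?eff\.org\//, to: "https://www.eff.org/" }
  ])
];
```

Even with the pre-test, every lookup still runs one regexp per rule, so the
cost grows with the number of rules.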
Anyway, I think you can improve it further.
I think the XML match_rule should contain only the host name of a server,
like "server.com", and that should also cover all subdomains, like
"this.server.com".
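A small sketch of what "also covers subdomains" could mean without any
regexp. The function names "hostSuffixes" and "hostMatches" are made up,
and comparing whole dot-separated labels is my assumption about the
intended behaviour.

```javascript
// Returns the host itself followed by each parent domain, e.g.
// "this.server.com" -> ["this.server.com", "server.com", "com"].
function hostSuffixes(host) {
  var labels = host.toLowerCase().split(".");
  var out = [];
  for (var i = 0; i < labels.length; i++) {
    out.push(labels.slice(i).join("."));
  }
  return out;
}

// True when `host` equals `ruleHost` or is a subdomain of it.
function hostMatches(ruleHost, host) {
  return hostSuffixes(host).indexOf(ruleHost.toLowerCase()) !== -1;
}
```

Comparing whole labels means "notserver.com" does not match "server.com",
which a plain "ends with" string test would get wrong.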
Then you create the list of rules and sort it by host name. So you'll have
"aaa.com", "bee.net", "eff.org", "paypal.com", "server.com", "zzz.com" and
so on.
When you need to look up one main rule, you don't need to execute any slow
regexp. All you have to do is a binary search on the list. I think it will
speed up the add-on a lot, at least once the rule set grows.
http://en.wikipedia.org/wiki/Binary_search_algorithm
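Here is a minimal sketch of that lookup, using the example host names from
above. The names "binarySearch" and "findRuleHost" are made up, and
stripping one leading label at a time to handle subdomains is my own
assumption about how the two ideas would fit together.

```javascript
// Sorted list of rule host names (the example names from above).
var hosts = ["aaa.com", "bee.net", "eff.org", "paypal.com",
             "server.com", "zzz.com"];

// Classic binary search; returns the index of `key` or -1.
function binarySearch(sorted, key) {
  var lo = 0, hi = sorted.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >>> 1;
    if (sorted[mid] === key) return mid;
    if (sorted[mid] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}

// Try the full host first, then strip one leading label at a time so
// that "this.server.com" also finds "server.com".
function findRuleHost(sorted, host) {
  var candidate = host.toLowerCase();
  while (candidate) {
    var idx = binarySearch(sorted, candidate);
    if (idx !== -1) return sorted[idx];
    var dot = candidate.indexOf(".");
    if (dot === -1) break;
    candidate = candidate.slice(dot + 1);
  }
  return null;
}
```

Each lookup costs a few string comparisons per label instead of one regexp
per rule. The list has to be sorted with the same ordering that "<" uses on
strings, which is what the default Array sort() gives you.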
I've even found some demo code:
http://www.nczonline.net/blog/2009/09/01/computer-science-in-javascript-binary-search/
It also links to an example of a so-called "binary search tree". You could
create a tree where each node is keyed by the length of a host name, so that
every node holds a list of only the host names of that length. This seems
too complex to me, though it could be useful for splitting lists with
millions of items into groups.
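A rough sketch of the grouping idea, just to show the shape. It uses a plain
object as a map from length to a sorted list instead of a real binary search
tree, and the names "groupByLength" and "hasHost" are made up.

```javascript
// Group host names by length; each group is kept sorted so it can still
// be binary searched. Plain object used as a map: length -> sorted array.
function groupByLength(hosts) {
  var groups = {};
  hosts.forEach(function (h) {
    (groups[h.length] = groups[h.length] || []).push(h);
  });
  Object.keys(groups).forEach(function (len) { groups[len].sort(); });
  return groups;
}

// Exact host lookup: pick the group by length, then binary search it.
function hasHost(groups, host) {
  var list = groups[host.length] || [];
  var lo = 0, hi = list.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >>> 1;
    if (list[mid] === host) return true;
    if (list[mid] < host) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}

var groups = groupByLength(["aaa.com", "bee.net", "eff.org",
                            "paypal.com", "server.com", "zzz.com"]);
```

With the six example names this gives one group of length 7 and one of
length 10, so a lookup only searches the names of the right length.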
~bee
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/1614>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online