
[tor-bugs] #1614 [EFF-HTTPS Everywhere]: list



#1614: list
----------------------------------+-----------------------------------------
 Reporter:  bee                   |       Owner:  pde
     Type:  enhancement           |      Status:  new
 Priority:  trivial               |   Milestone:     
Component:  EFF-HTTPS Everywhere  |     Version:     
 Keywords:                        |      Parent:     
----------------------------------+-----------------------------------------
 Hi!

 So far, this is how I understand HTTPS Everywhere to work, but I'm not
 sure I've got everything right.
 It loads all rules into one array, "this.rules", of the "HTTPSRules"
 class. Every rule is an object of the "RuleSet" class.
 So, when you need to trigger a filter, you go through all the rules in
 "this.rules" to test whether a selected address (which is stored in a URI
 object) matches one of the filters of a "RuleSet".
 That should be slow if you have plenty of filters, even if you have only
 a few rules.
 This is why you added a thing called "match_rule": every rule can have
 more than one filter, and having a "generic" matching rule speeds up the
 search a bit. It lets you reject a rule that cannot apply with only one
 test.
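
 To make sure we are talking about the same thing, here is a rough sketch
 of the lookup as I understand it. It is not the add-on's real code: only
 "HTTPSRules", "RuleSet", "this.rules" and "match_rule" come from what I
 described above, while the constructor arguments, the apply() and
 rewrite() methods and the two example rule sets are made up for
 illustration.

```javascript
// Sketch only: everything except HTTPSRules, RuleSet, rules and match_rule
// is a hypothetical name, not taken from the add-on's source.
class RuleSet {
  constructor(name, matchRule, rules) {
    this.name = name;
    // One "generic" pattern that is tested before any individual filter.
    this.match_rule = new RegExp(matchRule);
    this.rules = rules.map(([from, to]) => ({ from: new RegExp(from), to }));
  }

  // Returns the rewritten URL, or null if this rule set does not apply.
  apply(url) {
    if (!this.match_rule.test(url)) return null; // skip all filters at once
    for (const { from, to } of this.rules) {
      if (from.test(url)) return url.replace(from, to);
    }
    return null;
  }
}

class HTTPSRules {
  constructor(ruleSets) {
    this.rules = ruleSets;
  }

  // Linear scan: every rule set is tried in turn until one rewrites the URL.
  rewrite(url) {
    for (const ruleSet of this.rules) {
      const result = ruleSet.apply(url);
      if (result !== null) return result;
    }
    return null;
  }
}

const httpsRules = new HTTPSRules([
  new RuleSet("EFF", "^http://(www\\.)?eff\\.org/", [
    ["^http://(www\\.)?eff\\.org/", "https://www.eff.org/"],
  ]),
  new RuleSet("Example", "^http://server\\.com/", [
    ["^http://server\\.com/", "https://server.com/"],
  ]),
]);
```

 Even with match_rule, the cost in this sketch is still one regexp test
 per rule set for every address that matches nothing.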
 Anyway, I think you can improve it further.
 I think the match_rule in the XML should contain only the host name of a
 server, like "server.com", and it should also cover all subdomains, like
 "this.server.com".
 Then you create the list of rules and sort it by name, so you'll have
 "aaa.com", "bee.net", "eff.org", "paypal.com", "server.com", "zzz.com"
 and so on.
 When you need to look up a main rule, you don't need to execute any slow
 regexp. All you have to do is a binary search on the list. I think it'll
 speed up the addon a lot, at least once the rule set grows.
 http://en.wikipedia.org/wiki/Binary_search_algorithm
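
 A minimal sketch of the lookup I have in mind, assuming the rule list is
 just a sorted array of host names. The helpers findHost() and
 lookupRuleHost() are hypothetical names of mine, and stripping the
 leftmost label to handle subdomains is only one possible way to do it.

```javascript
// Sketch only: findHost and lookupRuleHost are hypothetical helper names.

// Classic binary search over a sorted array of strings.
// Returns the index of target, or -1 if it is absent.
function findHost(sortedHosts, target) {
  let low = 0;
  let high = sortedHosts.length - 1;
  while (low <= high) {
    const mid = (low + high) >>> 1;
    if (sortedHosts[mid] === target) return mid;
    if (sortedHosts[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return -1;
}

// Try the full host name first, then strip the leftmost label each round,
// so that "this.server.com" falls back to "server.com".
function lookupRuleHost(sortedHosts, host) {
  let candidate = host.toLowerCase();
  while (candidate.includes(".")) {
    if (findHost(sortedHosts, candidate) !== -1) return candidate;
    candidate = candidate.slice(candidate.indexOf(".") + 1);
  }
  return null;
}

// The list must be sorted once, in the same order the comparison uses.
const hosts = [
  "zzz.com", "eff.org", "server.com", "aaa.com", "paypal.com", "bee.net",
].sort();
```

 Each lookup then costs a few string comparisons per label of the host
 name, instead of one regexp test per rule.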
 I've even found some demo code:
 http://www.nczonline.net/blog/2009/09/01/computer-science-in-javascript-binary-search/
 This is fun! It also links to an example of a so-called "binary search
 tree". You could create a tree keyed on host name length, so that each
 node holds a list containing only host names of the same length. This
 seems too complex to me, though it could be useful for splitting lists
 with millions of items into groups.
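
 For what it's worth, the grouping idea could look roughly like this. It
 is only a sketch: I use a plain Map keyed by length instead of a real
 binary search tree, it only handles exact host names (no subdomains), and
 groupByLength() and hasHost() are hypothetical names.

```javascript
// Sketch only: groupByLength and hasHost are hypothetical names, and a Map
// stands in for the tree described above.

// Bucket host names by length; each bucket is kept sorted.
function groupByLength(hosts) {
  const groups = new Map();
  for (const host of hosts) {
    if (!groups.has(host.length)) groups.set(host.length, []);
    groups.get(host.length).push(host);
  }
  for (const bucket of groups.values()) bucket.sort();
  return groups;
}

// Pick the bucket for the host's length, then binary-search inside it.
function hasHost(groups, host) {
  const bucket = groups.get(host.length) || [];
  let low = 0;
  let high = bucket.length - 1;
  while (low <= high) {
    const mid = (low + high) >>> 1;
    if (bucket[mid] === host) return true;
    if (bucket[mid] < host) low = mid + 1;
    else high = mid - 1;
  }
  return false;
}

const groups = groupByLength([
  "aaa.com", "bee.net", "eff.org", "paypal.com", "server.com", "zzz.com",
]);
```

 With the six example hosts this gives two buckets, one for length 7 and
 one for length 10.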

 ~bee

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/1614>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online