Re: [tor-bugs] #9385 [BridgeDB]: bridgedb's email responder should fuzzy match email addresses within time periods
#9385: bridgedb's email responder should fuzzy match email addresses within time
periods
--------------------------+--------------------------------------------
Reporter: isis | Owner: isis
Type: defect | Status: new
Priority: normal | Milestone:
Component: BridgeDB | Version:
Resolution: | Keywords: bridgedb-email, bridgedb-0.2.x
Actual Points: | Parent ID:
Points: |
--------------------------+--------------------------------------------
Comment (by wfn):
Replying to [comment:2 mparte]:
>
> {{{
> However, going down the path of finding clever regexes to match things
> like the fake .onion address looking email addresses in addition to all
> the other things which are clearly patterns to a human sounds like a good
> way to either write unreadable code or accidentally block honest users.
> }}}
>
> Could you test for Kolmogorov Complexity?
This is venturing into crazy territory, but fwiw: it might be possible to
calculate similarity simply using Hamming distances:
https://en.wikipedia.org/wiki/Hamming_distance which at the binary/bit
level is the number of bit flips needed to turn $given_string into
$some_target_string (i.e. the number of 1 bits in the XOR of the two).
Note that it is only defined for strings of equal length.
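For what it's worth, a rough sketch of the bit-level distance in Python
might look like the following. This is illustration only, not anything
BridgeDB actually does; the function name and the example.com addresses
are made up, and it simply refuses inputs of unequal length.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Bit-level Hamming distance between two equal-length byte strings.

    Hypothetical helper for illustration; not part of BridgeDB.
    """
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    # XOR leaves a 1 bit wherever the two inputs differ; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two made-up addresses differing in a single character.
distance = hamming_distance(b"alice1@example.com", b"alice2@example.com")
```

A small distance would then be read as "these two addresses look
suspiciously alike", with the threshold left as an open question.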
Actually implementing such a distance metric on top of email address
storage (god forbid) would entail building an email address storage
mechanism (probably) based on binary trees. I fear it would be easy to
produce false positives, though; the mechanism could perhaps prefer
calculating distances starting from the right-hand side of the local part
of the email address (the part before the '@'), or something along those
lines.
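One possible reading of the right-hand-side idea, sketched in Python under
a pile of assumptions: compare only the local parts, line them up from
their last character, count mismatching positions, and charge one extra
per character of length difference. This works on characters rather than
bits, and both function names are hypothetical.

```python
def local_part(address: str) -> str:
    """Everything before the last '@' (hypothetical helper)."""
    return address.rsplit("@", 1)[0]


def right_aligned_distance(a: str, b: str) -> int:
    """Count differing characters with the local parts aligned at the right.

    Illustration only: each unmatched leading character of the longer
    local part counts as one mismatch.
    """
    la, lb = local_part(a), local_part(b)
    # zip() stops at the shorter string, so walk both from the right.
    mismatches = sum(x != y for x, y in zip(reversed(la), reversed(lb)))
    return mismatches + abs(len(la) - len(lb))


# Made-up addresses that differ only in a trailing counter.
distance = right_aligned_distance("bob1@example.com", "bob2@example.com")
```

It is easy to see how honest users with similar names would end up close
together under this, which is exactly the false-positive worry.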
In any case, I guess I agree that this does indeed sound like a losing
battle with some insane code to top it off. :(
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9385#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online