Thus spake NoName (antispam06@xxxxxxx):

> On 17.05.2013 23:26, Andrew Lewman wrote:
> >On Fri, 17 May 2013 21:54:40 +0200
> >NoName <antispam06@xxxxxxx> wrote:
> >
> >>I have been reading lately about the ability to fingerprint a user
> >>based on the particularities of writing. Each person has a
> >>preference for certain words, makes certain spelling mistakes, and
> >>so on. And the more text the better for the machine to identify the
> >>writer.
> >
> >See https://psal.cs.drexel.edu/index.php/JStylo-Anonymouth and
> >https://events.ccc.de/congress/2011/Fahrplan/events/4781.de.html
>
> Thank you! But if you have more, post it.

The above research is a great starting point (and comes with some open
source tools you can try out, albeit they are a bit slow), but this is a
very hard problem because language provides many, many ways for style
variations to differentiate people. Audio is of course even worse.

Conversely, while stylometry attacks are scary in theory, in practice
they tend to fall apart when thrown against "suspect lists" even as
large as tor-talk (I believe the current state-of-the-art is O(100)
suspects). This is a reflection of the difficulty in identifying
population-bisecting features that actually work in the general sense
without introducing false positives due to the natural tendency for
people to share and imitate elements of writing style.

At least when it comes to written text, the adversary really needs to
start with a short list of suspects using prior knowledge.

In both cases, what we really need are solid metrics to rank the
contribution of features to classification accuracy, so we can choose
the language features to obfuscate first:
https://en.wikipedia.org/wiki/Feature_selection

Ad-hoc techniques as simple as making a conscious effort to "sound" like
someone else have also been shown to be effective without requiring much
practice, but it can also be difficult to break certain key stylistic
habits.

-- 
Mike Perry
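
P.S. To make the feature-ranking idea above a bit more concrete, here is
a rough sketch in Python. It is not how JStylo or Anonymouth work; it
only illustrates one possible metric. The word list, the toy samples,
and the names feature_vector and rank_features are all made up for
illustration. The score is a simple Fisher-style ratio (variance of the
per-author means over the average within-author variance), so a feature
scores high when authors differ from each other but each author is
consistent with themselves.

```python
from collections import Counter
from statistics import mean, pvariance

# Hypothetical feature set: a handful of common English function words.
FUNCTION_WORDS = ["the", "of", "and", "but", "really", "albeit"]


def feature_vector(text):
    """Relative frequency of each function word in one text sample."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def rank_features(samples_by_author):
    """Rank features from most to least author-distinguishing.

    Score = variance of per-author means / mean within-author variance.
    A tiny constant keeps the division defined when authors are
    perfectly consistent.
    """
    scores = {}
    for i, word in enumerate(FUNCTION_WORDS):
        per_author = [
            [feature_vector(text)[i] for text in texts]
            for texts in samples_by_author.values()
        ]
        between = pvariance([mean(values) for values in per_author])
        within = mean(pvariance(values) for values in per_author)
        scores[word] = between / (within + 1e-9)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): two authors who differ only in "albeit" vs "but".
samples = {
    "alice": ["the tool is slow albeit useful", "the idea is old albeit sound"],
    "bob": ["the tool is slow but useful", "the idea is old but sound"],
}

ranked = rank_features(samples)
for word, score in ranked:
    print(f"{word:8s} {score:.3g}")
```

On this toy input, "the" scores zero because both authors use it at the
same rate, while "albeit" and "but" come out on top. Under this metric,
those would be the first habits to obfuscate. Real stylometry uses far
larger feature sets and proper classifiers, so treat this only as an
illustration of ranking features before deciding what to change.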