Re: [tor-dev] Mnemonic 80-bit phrases (proposal)

To: tor-dev@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [tor-dev] Mnemonic 80-bit phrases (proposal)
From: Sai <tor@xxxxxxxxxx>
Date: Tue, 20 Mar 2012 22:47:56 -0400

On Tue, Mar 20, 2012 at 20:11, Ken Takusagawa II
<ken.takusagawa.2@xxxxxxxxx> wrote:
> 1. You need 2^8=256 templates, not just 8, to reach 6*12+8=80 bits.

We won't know for sure how it hashes out until we make both the
dictionaries and the syntax generator. The ambiguity was intentional.

But yes, it may well use a number of generated templates. We're
thinking of making it symbolic expansion based, which is more
efficient on bits but also more complicated to describe before it's
fixed (and it'll require a parser library).
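
To make the 6*12+8=80 arithmetic concrete, here is a minimal sketch of
one possible layout, assuming six words from 4096-word dictionaries (12
bits each) plus an 8-bit template index. The function names and the bit
order are illustrative assumptions only; a symbolic-expansion scheme
would likely divide the bits differently.

```python
WORD_BITS = 12       # a 4096-word dictionary carries 12 bits per word
WORD_COUNT = 6       # assumed number of word slots per phrase
TEMPLATE_BITS = 8    # 2^8 = 256 templates
TOTAL_BITS = WORD_BITS * WORD_COUNT + TEMPLATE_BITS  # 6*12 + 8 = 80


def split_80_bits(value):
    """Split an 80-bit integer into (template_index, six word indices)."""
    if not 0 <= value < (1 << TOTAL_BITS):
        raise ValueError("value must fit in 80 bits")
    words = []
    for _ in range(WORD_COUNT):
        words.append(value & ((1 << WORD_BITS) - 1))  # take the low 12 bits
        value >>= WORD_BITS
    words.reverse()  # most significant word first
    return value, words  # what remains is the 8-bit template index


def join_80_bits(template, words):
    """Inverse of split_80_bits (hypothetical helper)."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template index out of range")
    if len(words) != WORD_COUNT:
        raise ValueError("expected six word indices")
    value = template
    for w in words:
        if not 0 <= w < (1 << WORD_BITS):
            raise ValueError("word index out of range")
        value = (value << WORD_BITS) | w
    return value
```

With only 8 templates (3 bits) the same layout would carry 75 bits,
which is the shortfall Ken is pointing at.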

> 2. Having toyed with this idea in the past, let me warn that forming a 4096
> word dictionary of memorable, non-colliding words for each word category is
> going to be very difficult. Too many words are semantically similar,
> phonetically similar, or just unfamiliar.

Our intention currently is to first take candidate dictionaries from
WordNet, and use a combination of WordNet and Google 1-gram frequency
data as part of the cutoff for whether words are adequately familiar.
(N-grams with n >= 2 are rather irrelevant to our needs, AFAICT.)
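
A rough sketch of what a frequency cutoff could look like, assuming the
candidate words and their 1-gram counts have already been loaded into
plain Python structures. The function name, the threshold and the toy
counts are invented for illustration; the real cutoff is undecided and
would combine more than one signal.

```python
def familiar_words(candidates, unigram_counts, min_count):
    """Keep candidates whose 1-gram count meets the cutoff.

    Results are ordered most frequent first, ties broken alphabetically.
    """
    kept = [w for w in candidates if unigram_counts.get(w, 0) >= min_count]
    kept.sort(key=lambda w: (-unigram_counts[w], w))
    return kept


# Toy data, invented for illustration; not real WordNet or Google counts.
candidates = ["water", "house", "coal", "quokka"]
counts = {"water": 900, "house": 800, "coal": 120, "quokka": 3}

print(familiar_words(candidates, counts, min_count=100))
# prints ['water', 'house', 'coal']
```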

> http://kenta.blogspot.com/2012/02/lefoezyy-some-notes-on-google-books.html

Thanks; that could be useful.

> Another way to go about it might be to first catalogue semantic categories
> (colors, animals, etc.) then list the most common (yet dissimilar) members
> of each category. An attempt at 64 words is here:

This is something that WordNet has already done.

> http://kenta.blogspot.com/2011/10/xpmqawkv-common-words.html

I think you omit far more common words, which you shouldn't, e.g. air,
water, coal, man, house, etc.

But quibbling at this level is pointless; we'll need to be dealing
with dictionaries mostly on the order of a few thousand words, sorted
by *constituent types*, not by semantic categories. (E.g. one
dictionary would be "nouns that can be the target of a transitive
verb".)
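
As a toy illustration of dictionaries keyed by constituent type, here is
a sketch with one fixed template and three four-word dictionaries (2
bits per slot). The slot names, word lists and template are all
invented for the example; real dictionaries would be in the thousands
of words and the templates would come from the syntax generator.

```python
# Hypothetical dictionaries, one per constituent type.
DICTIONARIES = {
    "subject_noun": ["dog", "nurse", "robot", "king"],
    "transitive_verb": ["paints", "follows", "lifts", "greets"],
    "object_noun": ["ladder", "piano", "river", "candle"],
}

# Hypothetical template: the ordered slot types of one sentence shape.
TEMPLATE = ["subject_noun", "transitive_verb", "object_noun"]


def encode(indices):
    """Turn one index per slot into a phrase."""
    if len(indices) != len(TEMPLATE):
        raise ValueError("need exactly one index per slot")
    return " ".join(DICTIONARIES[slot][i] for slot, i in zip(TEMPLATE, indices))


def decode(phrase):
    """Recover the indices; raises ValueError on a word unknown to its slot."""
    words = phrase.split()
    if len(words) != len(TEMPLATE):
        raise ValueError("phrase does not match the template length")
    return [DICTIONARIES[slot].index(w) for slot, w in zip(TEMPLATE, words)]


print(encode([1, 2, 0]))
# prints nurse lifts ladder
```

Because each slot only consults its own dictionary, a word that is
valid in one position is rejected in another, which is part of what
keeps parsing unambiguous.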

> I'd propose that the "right" way to do this is not just sentences, but
> entire semantically consistent stories, written in rhyming verse, with
> entropy of perhaps only a few bits per sentence. (Prehistoric oral
> tradition does prove we can memorize such poems.) However, synthesizing
> these seem extremely difficult, an AI problem.

I think it's currently impossible to do that, and furthermore, that
it's *not* Right even if you could, because it would violate a key
constraint: that it can be reasonably typed as a domain. It shouldn't
take longer than a few seconds to remember and type. It won't be as
fast as typing "google.com", and that's OK, but I think that level of
redundant expansion is way too much.

Creating unambiguously parseable syntaxes and dictionaries that meet
our stated constraints is already hard enough. ;-)

> 3. I presume people are familiar with Bubblebabble? It doesn't solve all
> the problems, but does make bit strings seem less "dense".

BubbleBabble produces nonwords; as such it fails a basic requirement.
Making something merely look phonotactically valid isn't enough; it
has to be grammatically valid and composed entirely of known terms.

- Sai