Re: [freehaven-dev] plausible deniability
On Sun, 29 Oct 2000 email@example.com wrote:
> These are interesting ideas, but the real question is, what happens in
> the context of the DMCA and NET act? If you publish a random pad, and
> you are informed by a copyright owner that your pad, when xor'd with five
> others, produces some infringing data, are you obligated to shut down?
It seems that you need a system which
1) ensures that the pad creator and pad holder do not know each other
2) has a high degree of "ambiguity" about whether a particular
pad is tied to a particular document.
We're covering oblivious transfer (OT) in the cryptography course I'm TAing
right now. This is a primitive in which a Sender S sends a message M to
a Receiver R. R receives the message with probability 1/2. S has no idea
whether R actually received the message.
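One classical construction with exactly this behaviour is Rabin's oblivious transfer, where the "message" is the factorization of the sender's modulus. A toy sketch follows; the primes P and Q are tiny illustrative values, the function names are made up for this example, and nothing here is meant as a secure implementation (it needs Python 3.8+ for the modular inverse).

```python
import math
import secrets

# Toy Blum primes (both are 3 mod 4). Assumed values, far too small for real use.
P, Q = 10007, 10039


def receiver_pick(n):
    """Receiver: pick a random x that is coprime to n."""
    while True:
        x = secrets.randbelow(n - 2) + 2
        if math.gcd(x, n) == 1:
            return x


def sender_reply(a, p, q):
    """Sender: return one of the four square roots of a mod p*q, chosen at random."""
    n = p * q
    # For primes that are 3 mod 4, a^((p+1)/4) is a square root of a mod p.
    rp = pow(a, (p + 1) // 4, p)
    rq = pow(a, (q + 1) // 4, q)
    # Flip each sign with probability 1/2 to choose among the four roots.
    if secrets.randbelow(2):
        rp = p - rp
    if secrets.randbelow(2):
        rq = q - rq
    # Combine the two residues with the Chinese Remainder Theorem.
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n


def receiver_extract(x, y, n):
    """Receiver: a root other than +x or -x reveals a factor of n."""
    if y in (x, n - x):
        return None  # nothing learned this time
    return math.gcd((x - y) % n, n)


def rabin_ot(p=P, q=Q):
    """Run one transfer; return a factor of p*q, or None if nothing arrived."""
    n = p * q
    x = receiver_pick(n)
    y = sender_reply(x * x % n, p, q)  # the sender only ever sees x^2 mod n
    return receiver_extract(x, y, n)


if __name__ == "__main__":
    results = [rabin_ot() for _ in range(1000)]
    received = sum(r is not None for r in results)
    print(f"receiver learned the factorization in {received} of 1000 runs")
```

Two of the four square roots are +x and -x, so R succeeds with probability 1/2, and S, who never sees x, cannot tell which case occurred.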
In one of the course meetings, this came up. The idea we were
kicking around was a system in which a content publisher does the following:
1. prepares the pads as Roger and David Madore have outlined
(or more generally -- secret sharing)
Destroys the original secret.
2. performs a broadcast OT with a collection of servers.
(whose identities may not necessarily be known to the publisher)
This was inspired by the "graduated mirroring" idea.
The twist is that the use of Oblivious Transfer means that the content
publisher does not know (cannot know) which servers pick up the pads and
which do not. So even if the content publisher is corrupted later, the
adversary still can't get its hands on the data.
You can work with the parameters and the OT probability to make it
exceedingly unlikely that any single server receives all of the shares.
Then no single server has the document by itself. Collections
of servers have some calculable probability of having the document.
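To make "calculable" concrete, here is a small sketch of the arithmetic. It assumes an n-of-n split (every share is needed, as with XOR pads), that every share is offered to every server, and that each transfer succeeds independently with probability p = 1/2. Those assumptions and the function names are mine, not part of the scheme as discussed.

```python
def p_server_has_all(n_shares, p=0.5):
    """Probability that one given server received every share."""
    return p ** n_shares


def p_some_server_has_all(n_shares, n_servers, p=0.5):
    """Probability that at least one of n_servers holds every share."""
    return 1 - (1 - p ** n_shares) ** n_servers


def p_group_covers_all(n_shares, group_size, p=0.5):
    """Probability that a group of servers jointly holds every share.

    A share is missing from the group only if every member missed it,
    which happens with probability (1 - p) ** group_size.
    """
    return (1 - (1 - p) ** group_size) ** n_shares


if __name__ == "__main__":
    n = 10
    print("one server has all 10 shares:", p_server_has_all(n))  # 1/1024
    print("some server out of 20 has all:", p_some_server_has_all(n, 20))
    for size in (2, 5, 10):
        print(f"group of {size} can reconstruct:", p_group_covers_all(n, size))
```

Raising the number of shares drives the single-server probability down exponentially, while a large enough group still covers all the shares with high probability.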
What still needs solving, however, is the retrieval part. Because if
retrieval identifies specific pads on specific servers, you're toast
under the DMCA. Also if retrieval is too cumbersome to be used for
"useful" things such as, oh, online backups, the entire system may
be vulnerable to challenge as a "pirate's tool."
> you are arguably working together in concert for this purpose, and that
> might be enough for a conspiracy charge. This will greatly increase
> the legal penalty.
It will be interesting to see what happens to Napster in this regard.
After all, "everyone knows" that it's primarily used for copyright
infringement. So does that make every Napster user a co-conspirator?
> Somehow you need to offer the judge a legitimate reason for continuing
> to publish the pad. You will be charged with doing it to help other
> people infringe copyright. You need to come up with a *convincing*
> story for why you needed to publish that data, without admitting that
> your intention was to help people break the law.
There seem to be at least two approaches to answering this:
1) Create a system in which a single pad is used for more
than one message. Pad X on server Y can be XORed with
pad X' to yield a Britney Spears album...but it can also be
XORed with X'' to yield the Declaration of Independence.
Revoking or unpublishing pad X causes not only the infringing
material to be removed, but lots of other material as well.
This is Roger's point about "what if someone comes along and
uses your pad for a bad document?" I think Madore mentions it
as well. This other material would have to be non-infringing
and valuable, it seems, or else we have a problem.
(Online backup services come to mind)
One way to create such a system would be to make a pad already
on a server part of your pad splitting scheme. i.e. set your
message M, pick the pad X, and then find R_1 .. R_n such that
R_1 XOR ... XOR R_n XOR X = M
There are at least three problems with this approach:
1) No straightforward way to tell what documents
you can obtain from a pad just by looking at it.
If the state comes to you and says "this pad gives
us Britney Spears," you need to say
"yes, but it also gives us Shakespeare" and quickly.
Uploading a new pad set which uses the target pad
to give Shakespeare AFTER being contacted doesn't
seem to cut it.
You could add a list of which documents are associated
with particular pads, but this is extra information.
2) If a particular pad can be identified as part of
an infringing set of pads, a court may decide
that the illegal infringement outweighs the
benefit of having the other documents which use
that pad available.
In particular, the pad server might be asked to
prove that "no other pad sets exist in the system
which yield the non-Britney Spears documents."
This seems to be difficult. If we end up with
that as the standard of proof, this approach is useless.
(and it's not such an unreasonable standard,
despite being impossible to meet -- you're
weighing demonstrated harm in infringement vs.
the possibility that this is the only pad which
can produce Shakespeare...in a system where
people create overlapping pad sets all the time)
3) I don't like pad X. I will use pad X to create
a pad set which yields a Britney Spears album.
And there's nothing you or anyone else can do about it.
Then I tip off the FBI. Oops.
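The pad-reuse construction in approach 1 (choose R_1 .. R_n so that R_1 XOR ... XOR R_n XOR X = M) can be sketched in a few lines. The function names and sample documents below are hypothetical, and the sketch assumes all pads and messages have the same length.

```python
import secrets
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pads must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_all(pads):
    """XOR a non-empty list of equal-length byte strings together."""
    return reduce(xor_bytes, pads)


def split_with_existing_pad(message, existing_pad, n):
    """Return R_1 .. R_n such that R_1 ^ ... ^ R_n ^ existing_pad == message."""
    if n < 1:
        raise ValueError("need at least one new pad")
    # The first n-1 pads are uniformly random.
    randoms = [secrets.token_bytes(len(message)) for _ in range(n - 1)]
    # The last pad is forced so that the whole set XORs to the message.
    last = xor_all([message, existing_pad, *randoms])
    return randoms + [last]


if __name__ == "__main__":
    pad_x = secrets.token_bytes(16)          # pad already sitting on a server
    album = b"infringing album"              # 16 bytes
    speech = b"We hold these tr"             # 16 bytes
    set_a = split_with_existing_pad(album, pad_x, 3)
    set_b = split_with_existing_pad(speech, pad_x, 3)
    assert xor_all(set_a + [pad_x]) == album
    assert xor_all(set_b + [pad_x]) == speech
    print("pad X participates in both documents")
```

Note that the function never needs the cooperation of whoever holds pad X, which is exactly what makes problem 3) possible.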
2) Create a system in which it is impractical to link individual
pads with particular documents. That is, a system in which it is
*not* possible for me to pick out pads X1, X2, X3 directly and
then demonstrate that their XOR is a Britney Spears album.
Maybe we could call this the "Britney Spears Existence Problem."
Instance: A set of pads X_1 ... X_n partitioned into subsets
S_1 ... S_k corresponding to k distinct servers.
Question: Does there exist a sequence X_i1... X_ij such that
X_i1 XOR ... XOR X_ij yields Britney Spears?
(and modify as needed for more complicated secret sharing schemes)
Note that if the contents of all the servers are known, then
this is in NP for sure - guess the XOR-combination and check.
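The "guess the XOR-combination and check" step can be illustrated as follows. This is only a sketch: it assumes the target is a single known byte string and that all pads are visible, and it uses exhaustive search purely for illustration. The names are hypothetical.

```python
from itertools import combinations


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def check_certificate(pads, indices, target):
    """The cheap 'check' half: do the chosen pads XOR to the target?"""
    acc = bytes(len(target))
    for i in indices:
        acc = xor_bytes(acc, pads[i])
    return acc == target


def find_combination(pads, target):
    """The 'guess' half, done by brute force over all non-empty subsets."""
    for size in range(1, len(pads) + 1):
        for indices in combinations(range(len(pads)), size):
            if check_certificate(pads, indices, target):
                return indices
    return None


if __name__ == "__main__":
    pads = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
    target = xor_bytes(pads[0], pads[2])
    print(find_combination(pads, target))       # (0, 2)
    print(find_combination(pads, b"\xff\xff"))  # None
```

For one fixed target string the search is really a linear system over GF(2), so Gaussian elimination would avoid trying every subset. The problem only stays hard if "yields Britney Spears" covers many possible encodings, or if the pads themselves are hidden, which is where the next idea comes in.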
The only thing which comes to mind here is a system in which
the servers keep their pads secret. Then reconstruction is some
kind of secure multi-party computation (in the BGW sense of a
circuit with private inputs evaluated by broadcast communication
between players), which outputs the final requested data from
It's more difficult than this, however, because we seem to want
also that the servers do not know which pads are involved in
reconstructing which documents. Otherwise if a server is
compromised (i.e. inspected by the state)
The adversary model would be that an adversary corrupts
one or more servers, with the goals of
1) discovering which data on the corrupted server
belongs to infringing material
- aimed at proving that a server holds infringing
data in order to shut it down.
2) discovering which other servers may be holding
- aimed at picking the next server to break into.
This sounds something like a secure multiparty computation problem
with an adaptive adversary which starts with 1 server
and then "grows" its reach by 1 more server on each "timestep."
The benefit of this approach, if it works, is then that
you can argue that it is not sufficient to merely remove one pad.
You need to remove an entire server!
If you remove an entire server, the potential for "good data" to
be lost b/c of being partially stored there becomes much higher.