
Re: [freehaven-dev] plausible deniability





On Sun, 29 Oct 2000 hal@finney.org wrote:

> These are interesting ideas, but the real question is, what happens in
> the context of the DMCA and NET act?  If you publish a random pad, and
> you are informed by a copyright owner that your pad, when xor'd with five
> others, produces some infringing data, are you obligated to shut down?

It seems that you need a system which

	1) ensures that the pad creator and pad holder do not know each
	other's identities,

	2) has a high degree of "ambiguity" about whether a particular
	pad is tied to a particular document. 


We're covering oblivious transfer (OT) in the cryptography course I'm TAing
right now. This is a primitive in which a Sender S sends a message M to
a Receiver R. R receives the message with probability 1/2, and S has no
idea whether R actually received it.

In one of the course meetings, this came up. The idea we were
kicking around was a system in which a content publisher does the
following:

	1. prepares the pads as Roger and David Madore have outlined
	(or, more generally, uses secret sharing), then destroys the
	original secret.

	2. performs a broadcast OT with a collection of servers
	(whose identities need not be known to the publisher;
	cf. Usenet).

This was inspired by the "graduated mirroring" idea.
The twist is that the use of Oblivious Transfer means that the content
publisher does not know (can not know) which servers pick up the pads and
which do not. So even if the content publisher is corrupted later, the
adversary still can't get its hands on the data.
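
A minimal sketch of the two publisher steps above, under stated assumptions: the pads are a plain n-of-n XOR split, and the OT is modeled only as an ideal coin flip per server per pad (this is a simulation of the primitive's behaviour, not a cryptographic OT protocol). The names `split_into_pads`, `combine`, and `broadcast_ot` are made up for illustration.

```python
import random
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_pads(message: bytes, n: int) -> list:
    """n-of-n XOR split: n-1 random pads plus one pad that fixes the XOR."""
    pads = [secrets.token_bytes(len(message)) for _ in range(n - 1)]
    last = reduce(xor_bytes, pads, message)
    return pads + [last]


def combine(pads) -> bytes:
    """XOR all pads together to recover the message."""
    return reduce(xor_bytes, pads)


def broadcast_ot(pads, num_servers: int, rng: random.Random) -> list:
    """Idealized broadcast OT: each server independently ends up with each
    pad with probability 1/2. In a real OT the sender would not learn the
    outcome; here the coin flips are simply simulated."""
    servers = [dict() for _ in range(num_servers)]
    for idx, pad in enumerate(pads):
        for store in servers:
            if rng.random() < 0.5:
                store[idx] = pad
    return servers


pads = split_into_pads(b"some document", 4)
servers = broadcast_ot(pads, num_servers=6, rng=random.Random())
```

After this runs, the publisher would discard `pads` and the original message; whether the document is still recoverable depends on which pads the servers happened to pick up.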

You can work with the parameters and the OT probability to make it
exceedingly unlikely that any single server receives all of the shares.
Then none of the servers by themselves have the document. Collections
of servers have some calculable probability of having the document. 
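
To put rough numbers on this, here is a small sketch. It assumes each of the n shares reaches each server independently with probability p (1/2 for the OT described above); that independence is an assumption of the sketch, and the function names are illustrative only.

```python
def p_single_server_has_all(n_shares: int, p: float = 0.5) -> float:
    """Probability that one given server received every share."""
    return p ** n_shares


def p_some_server_has_all(n_shares: int, n_servers: int, p: float = 0.5) -> float:
    """Probability that at least one of n_servers received every share."""
    return 1 - (1 - p ** n_shares) ** n_servers


def p_coalition_has_all(n_shares: int, k_servers: int, p: float = 0.5) -> float:
    """Probability that k given servers jointly hold every share:
    each share must reach at least one of the k servers."""
    return (1 - (1 - p) ** k_servers) ** n_shares


for k in (1, 2, 5, 10):
    print(k, p_coalition_has_all(n_shares=20, k_servers=k))
```

Raising the number of shares drives the single-server probability down quickly, while larger collections of servers still recover the document with reasonable probability.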

What still needs solving, however, is the retrieval part. If
retrieval identifies specific pads on specific servers, you're toast
under the DMCA. Also, if retrieval is too cumbersome to be used for
"useful" things such as, oh, online backups, the entire system may
be vulnerable to challenge as a "pirate's tool."

> you are arguably working together in concert for this purpose, and that
> might be enough for a conspiracy charge.  This will greatly increase
> the legal penalty.

It will be interesting to see what happens to Napster in this regard.
After all, "everyone knows" that it's primarily used for copyright
infringement. So does that make every Napster user a co-conspirator?

> 
> Somehow you need to offer the judge a legitimate reason for continuing
> to publish the pad.  You will be charged with doing it to help other
> people infringe copyright.  You need to come up with a *convincing*
> story for why you needed to publish that data, without admitting that
> your intention was to help people break the law.

There seem to be at least two approaches to answering this:

	1) Create a system in which a single pad is used for more
	than one message. Pad X on server Y can be XORed with 
	pad X' to yield a Britney Spears album...but it can also be
	XORed with X'' to yield the Declaration of Independence.
	Revoking or unpublishing pad X causes not only the infringing
	material to be removed, but lots of other material as well.
 
	This is Roger's point about "what if someone comes along and
	uses your pad for a bad document?" I think Madore mentions it
	as well. This other material would have to be non-infringing
	and valuable, it seems, or else we have a problem. 
	(Online backup services come to mind) 
		
	One way to create such a system would be to make a pad already
	on a server part of your pad splitting scheme. i.e. set your
	message M, pick the pad X, and then find R_1 .. R_n such that

	R_1 XOR ... XOR R_n XOR X = M 	

	There are at least three problems with this approach:

		1) No straightforward way to tell what documents
		you can obtain from a pad just by looking at it. 
		If the state comes to you and says "this pad gives
		us Britney Spears," you need to say
		"yes, but it also gives us Shakespeare" and quickly.
		Uploading a new pad set which uses the target pad
		to give Shakespeare AFTER being contacted doesn't
		seem to cut it. 	

		You could add a list of which documents are associated
		with particular pads, but this is extra information. 

		2) If a particular pad can be identified as part of
		an infringing set of pads, a court may decide
		that the illegal infringement outweighs the
		benefit of having the other documents which use
		that pad available. 

		In particular, the pad server might be asked to
		prove that "no other pad sets exist in the system
		which yield the non-Britney Spears documents."
		This seems to be difficult. If we end up with
		that as the standard of proof, this approach is useless.
		(and it's not such an unreasonable standard,
		despite being impossible to meet -- you're
		weighing demonstrated harm in infringement vs.
		the possibility that this is the only pad which
		can produce Shakespeare...in a system where
		people create overlapping pad sets all the time)


		3) I don't like pad X. I will use pad X to create
		a pad set which yields a Britney Spears album.
		And there's nothing you or anyone else can do about it.
		Then I tip off the FBI. Oops. 
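
The pad-reuse construction in approach 1 (find R_1 .. R_n with R_1 XOR ... XOR R_n XOR X = M) can be sketched as follows. It assumes the message and the existing pad X have the same length (padding is left out), and `pads_for` is a hypothetical helper name. Note that nothing in it requires the cooperation of whoever holds X, which is exactly problem 3.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pads_for(message: bytes, existing_pad: bytes, n: int) -> list:
    """Return R_1..R_n such that R_1 XOR ... XOR R_n XOR X = M,
    where X is a pad that already sits on some server."""
    if len(existing_pad) != len(message):
        raise ValueError("sketch assumes pad and message have equal length")
    rs = [secrets.token_bytes(len(message)) for _ in range(n - 1)]
    # The last pad cancels everything except the message.
    last = reduce(xor_bytes, rs + [existing_pad], message)
    return rs + [last]


X = secrets.token_bytes(16)                    # pad already published
album = b"britney spears!!"                    # 16 bytes
text = b"when in the cour"                     # 16 bytes
set_a = pads_for(album, X, 3)                  # X helps yield the album
set_b = pads_for(text, X, 3)                   # the same X helps yield the text
```

Both pad sets lean on the same X, so removing X breaks both documents at once.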

	2) Create a system in which it is impractical to link individual
	pads with particular documents. That is, a system in which it is
	*not* possible for me to pick out pads X1, X2, X3 directly and
	then demonstrate that their XOR is a Britney Spears album.
 
	Maybe we could call this the "Britney Spears Existence Problem."
	Instance: A set of pads X_1 ... X_n  partitioned into subsets
		  S_1 ... S_k  corresponding to k distinct servers.
	Question: Does there exist a sequence X_i1... X_ij such that
	X_i1 XOR ... XOR X_ij yields Britney Spears?

	(and modify as needed for more complicated secret sharing 
	schemes)

	Note that if the contents of all the servers are known, then
	this is in NP for sure - guess the XOR-combination and check. 

	The only thing which comes to mind here is a system in which
	the servers keep their pads secret. Then reconstruction is some
	kind of secure multi-party computation (in the BGW sense of a
	circuit with private inputs evaluated by broadcast communication
	between players), which outputs the final requested data from
	the pads. 

	It's more difficult than this, however, because we also seem to
	want the servers not to know which pads are involved in
	reconstructing which documents. Otherwise, if a server is
	compromised (i.e. inspected by the state), the adversary learns
	which of its pads belong to which documents.

	The adversary model would be that an adversary corrupts 
	one or more servers, with the goals of
	
		1) discovering which data on the corrupted server
		belongs to infringing material
			- aimed at proving that a server holds infringing
			data in order to shut it down.
			
		2) discovering which other servers may be holding
		infringing material. 
			- aimed at picking the next server to break into. 
	

	This sounds something like a secure multiparty computation problem
	with an adaptive adversary which starts with 1 server
	and then "grows" its reach by 1 more server on each "timestep." 	
	
	The benefit of this approach, if it works, is that you can
	argue that it is not sufficient to merely remove one pad.
	You need to remove an entire server!
	If you remove an entire server, the potential for "good data" to
	be lost because it is partially stored there becomes much higher.
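
On the "Britney Spears Existence Problem" with all pad contents known: if "yields Britney Spears" is taken to mean one fixed target bitstring (an assumption; the post does not pin this down), then the question is a linear system over GF(2). Besides guess-and-check, it can be decided by Gaussian elimination, which is consistent with the "in NP" remark and is part of why keeping the pads secret looks necessary. A minimal sketch, with pads modeled as Python integers and a hypothetical function name:

```python
def find_xor_subset(pads, target):
    """Return indices of pads whose XOR equals target, or None if no
    such subset exists. Pads and target are non-negative ints."""
    basis = {}  # leading bit position -> (value, bitmask of pad indices used)
    for i, pad in enumerate(pads):
        val, mask = pad, 1 << i
        while val:
            hb = val.bit_length() - 1
            if hb not in basis:
                basis[hb] = (val, mask)
                break
            bval, bmask = basis[hb]
            val ^= bval
            mask ^= bmask
    # Reduce the target against the basis, tracking which pads are used.
    val, mask = target, 0
    while val:
        hb = val.bit_length() - 1
        if hb not in basis:
            return None
        bval, bmask = basis[hb]
        val ^= bval
        mask ^= bmask
    return [i for i in range(len(pads)) if (mask >> i) & 1]


pads = [int.from_bytes(b, "big") for b in (b"\x0a", b"\x06", b"\x01")]
print(find_xor_subset(pads, 0x0C))
```

Byte-string pads can be converted with `int.from_bytes` as shown; the returned index list is the certificate that an accuser would present.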
	

-David