
Re: [tor-talk] Cryptographic social networking project



 

>No, I am just suggesting not to use Tor for something it wasn't built
>for. We have been working on a technology that combines anonymization
>with multicast distribution and is therefore a lot better suited for
>social use cases. I hoped you would see this point and maybe consider
>joining forces with us rather than developing something that may run
>into scalability limits.

>You want to use a shopping bag to deliver a cupboard.. and now you
>say that using more shopping bags will solve the problem. You can't
>solve an exponentially growing problem with a linearly growing
>solution. All technologies that address scalability have a multicast
>distribution strategy somewhere. In cloud technology it's the way
>the database replication is organized in distribution trees. In
>Bittorrent it's the way BT grows a tree with every further downloader.
>Tor doesn't have that and so far I have not heard of anyone being
>interested in changing this. In fact, it would be such a drastic 
>intrusion into its current operation mode that it would risk affecting
>the current way Tor operates. That is why it is good for everyone that
>other platforms like GNUnet, Tribler and I2P experiment with this
>challenge and Tor developers who think Tor has reached a sufficient
>degree of maturity could come and help the other platforms. I can
>imagine an integration happening at some point, since all of these
>platforms need a relay router network to perform well.

>Still whenever Alice uses those 167 circuits (example scenario) she is
>sending the exact same information to all of those people. If our
>anonymization network had native distribution trees rather than unicast
>circuits, then this task would be roughly the same as when Twitter
>delivers a tweet to all data centers in order to make it appear on
>potentially millions of recipient dashboards.

>Which again means that the same data is being delivered in hundreds of
>copies over the Tor network, rather than having a multicast strategy
>that ensures data travels each network node just once at maximum, or
>at least reduces redundancy to a scalable amount.

>You insist on only focusing on the cost of establishing circuits, but
>I don't believe Tor will be able and wanting to deal with an explosion
>of redundant data deliveries. There is a reason why Bittorrent is
>discouraged over Tor - because it is the same social use case. Tor
>scales for a steadily growing number of humanoids that make unicast
>exchanges with websites and other server-like applications. It's a
>linear challenge that a slow increase in efficiency and number of 
>relay nodes can tackle.
>The moment all of these users start interacting with each other like
>crazy, Tor has a problem. I don't understand why I have to tell and
>re-tell these basics of scalability as if it was my opinion. This
>is how scalability works, or rather doesn't work. If we want an
>anonymization platform that can scale socially, we have to make one.

>Now you have an assessment that your plan will likely not work out for 
>a relevant number of participants and you are free to find out the hard 
>way or teach me something about scalability after working with it for ...
>hmm.. when did I start working on IRC's multicast? That's 25 years ago now.
>So good luck proving me that I got it all wrong.

I think you got it all wrong; maybe that is because English is not my
native language. Let me demonstrate with technical details.

--ESTIMATING TOTAL COST--

We assume our network gets 10 million active users with 167 friends
each.

Hidden Service (hybrid scheme): besides classical hidden services, we
have shared secrets between Alice and Bob which, after minor changes (a
few lines of code) on the relay side, let them create hidden service
circuits for social networking at mass scale without overhead. There
will be a directory server (managed by stable parties) that every 24
hours generates a snapshot of all available onion routers (ORs), sorted
by row number, so that everyone around the world sees the same list of
ORs in the same order. Alice knows a SharedSecret and a CommonSecret for
Bob (Bob's shared secret is unique to Alice, but his common secret is
the same for all his friends).

In an undirected graph, number of edges = (vertices)*(degree)/2, so in
our network there are at most 835 million connections, but there aren't
835 million onion circuits between users. Each user needs only two
regular circuits to handle hidden services, which for 10 million users
adds up to 20 million. So Alice has only two 3-hop circuits and Bob has
two 3-hop circuits; there is no overhead here compared to what Tor users
generally do when browsing websites securely with Tor Browser Bundle,
hence I'm not going to calculate the cost of maintaining these regular
circuits. The sender circuit (SC) is used for sending Notifications to
friends' hidden services; the receiver circuit (RC) acts as the hidden
service itself and receives Notifications from all friends. In the RC,
the third hop is called the rendezvous point (RP). To send a
Notification to Bob, Alice needs to find out what his RP is, plus some
additional information for sending him packets through that RP.
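To make the counting above easy to check, here is a small sketch of the
arithmetic. The constant and function names are mine, for illustration
only; the numbers are the ones assumed in this mail.

```python
# Sketch of the counting argument: friendships vs. regular circuits.
USERS = 10_000_000
FRIENDS_PER_USER = 167
CIRCUITS_PER_USER = 2  # one sender circuit (SC) + one receiver circuit (RC)


def friendship_edges(users: int, degree: int) -> int:
    """Edges of an undirected graph: every friendship is shared by two users."""
    return users * degree // 2


def regular_circuits(users: int, per_user: int = CIRCUITS_PER_USER) -> int:
    """Circuits that must be kept open, independent of the number of friends."""
    return users * per_user


print(friendship_edges(USERS, FRIENDS_PER_USER))  # 835000000
print(regular_circuits(USERS))                    # 20000000
```

The point is that the number of open circuits grows with the number of
users, not with the number of friendships.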

In hybrid hidden services there is no need for asymmetric key agreement
to establish a secure channel between SC and RC. I also skip the cost of
symmetric cryptography on packets, as it is trivial with regular block
ciphers, so I won't estimate the CPU work required by ORs to handle
hidden services (check djb's benchmarks for aes_128 at cr.yp.to). All
the information needed to exchange Notifications securely at RPs is
derived from CommonSecret and SharedSecret.

Bob selects his RPs from the directory's snapshot at time intervals of
between 10 minutes and 12 hours, starting at the beginning of each day
at 00:00 UTC. The time interval is derived from
V_1=H(CommonSecret||mm/dd/year||EpochCounter), where EpochCounter is a
natural number running from 1 to n that resets to 1 at 00:00 UTC the
next day, and the row number of the RP in the directory's snapshot is
derived from V_2=H(H(CommonSecret||mm/dd/year||EpochCounter)). To
generate the time interval, Bob spins a wheel with 42600 slots using V_1
and encodes where it stops as a waiting time between 10 minutes and 12
hours. To generate the row number of each epoch's RP, he spins a wheel
with n slots using V_2 (here n = number of available ORs in the
directory's snapshot) and uses where it stops as the RP's row number. If
the row number of the RP is, for instance, 3907, Bob connects to ORs
#3907, #3908 and #3909 and keeps these RPs open, so that if Alice fails
to send her Notification to #3907 she can try the other RPs.

Bob starts opening RPs at 00:00 UTC, waits for the generated time
interval, then uses the next epoch counter to determine his next RP and,
from the newly generated time interval, how long he should stay there.
Hence the total number of epochs differs every day for each person.
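A rough sketch of how Bob (and any friend who knows CommonSecret) could
compute the day's RP schedule as described above. Several details are my
assumptions and not fixed by this mail: H is taken to be SHA-256,
"spinning a wheel" is taken to be a reduction modulo the number of
slots, row numbers are 0-based and wrap around at the end of the
snapshot, and the function names are made up for illustration.

```python
import hashlib
from datetime import date

MIN_WAIT = 10 * 60                  # 10 minutes, in seconds
MAX_WAIT = 12 * 60 * 60             # 12 hours, in seconds
WAIT_SLOTS = MAX_WAIT - MIN_WAIT    # 42600 slots on the time wheel
DAY = 24 * 60 * 60
RPS_PER_EPOCH = 3                   # e.g. #3907, #3908, #3909


def h(data: bytes) -> bytes:
    # Assumption: H is SHA-256; the mail does not name a hash function.
    return hashlib.sha256(data).digest()


def epoch_values(common_secret: bytes, day: date, epoch: int):
    """V_1 = H(CommonSecret||mm/dd/year||EpochCounter), V_2 = H(V_1)."""
    seed = common_secret + day.strftime("%m/%d/%Y").encode() + str(epoch).encode()
    v1 = h(seed)
    return v1, h(v1)


def spin(value: bytes, slots: int) -> int:
    # Assumption: the wheel is a plain modulo reduction of the hash value.
    return int.from_bytes(value, "big") % slots


def daily_schedule(common_secret: bytes, day: date, n_routers: int):
    """Return (epoch, start_second, wait_seconds, rp_rows) tuples for one day."""
    schedule, start, epoch = [], 0, 1
    while start < DAY:
        v1, v2 = epoch_values(common_secret, day, epoch)
        wait = MIN_WAIT + spin(v1, WAIT_SLOTS)
        first_row = spin(v2, n_routers)
        rows = [(first_row + i) % n_routers for i in range(RPS_PER_EPOCH)]
        schedule.append((epoch, start, wait, rows))
        start += wait
        epoch += 1
    return schedule


for entry in daily_schedule(b"example common secret", date(2015, 1, 1), 5000):
    print(entry)
```

Since everything is derived from CommonSecret and the date, Alice can run
the same computation locally and learn Bob's current RP without asking
anyone.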

Once Alice knows what Bob's RP is, she doesn't send anything to it until
she has a new Notification for him. She sends packets as
{CircuitID|Payload} over HTTP from her SC, without establishing a TLS
channel with the RP. CircuitID = first 4 bytes of
H(CommonSecret||mm/dd/year||EpochCounter||GenerateCommonID); Payload is
the ciphertext of {cookie|Notification} encrypted with RP_KEY, which is
H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateKey);
cookie is (cookie1)⊕(cookie2). When Bob opens an RP, he tells it the
cookies for all of his 167 friends (for each friend there is a
different cookie1 and cookie2 value in each epoch): cookie1 = first 4
bytes of
H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateCookie1)
and cookie2 = first 4 bytes of
H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateCookie2).
When Alice gives {cookie|Notification} to the RP, if
(cookie)=(cookie1)⊕(cookie2), the RP sends the packet to Bob, and then
the RP's OR replaces (cookie2) in its RAM with (H(cookie2)). When Alice
wants to send another Notification to Bob through the same RP, she has
to send (cookie1)⊕(H(cookie2)) as (cookie). The next Notification needs
(cookie1)⊕(H(H(cookie2))) as cookie, and so on.
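The cookie chain above can be sketched roughly like this. Again these
are my assumptions for illustration: H is SHA-256, the cookies are
combined with XOR, and H(cookie2) is truncated to 4 bytes so that it can
be combined with the 4-byte cookie1. The class and function names are
hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: H is SHA-256; the mail does not name a hash function.
    return hashlib.sha256(data).digest()


def derive(common: bytes, shared: bytes, day: str, epoch: int, label: str) -> bytes:
    """First 4 bytes of H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||label)."""
    return h(common + shared + day.encode() + str(epoch).encode() + label.encode())[:4]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class CookieChain:
    """Per-friend cookie state; Alice keeps one copy, the RP keeps one in RAM."""

    def __init__(self, cookie1: bytes, cookie2: bytes):
        self.cookie1 = cookie1
        self.cookie2 = cookie2

    def current(self) -> bytes:
        return xor(self.cookie1, self.cookie2)

    def advance(self) -> None:
        # Assumption: keep only 4 bytes of H(cookie2) to preserve the cookie length.
        self.cookie2 = h(self.cookie2)[:4]


def rp_accepts(state: CookieChain, cookie: bytes) -> bool:
    """RP side: forward the packet only if the cookie matches, then step the chain."""
    if cookie != state.current():
        return False
    state.advance()
    return True


day, epoch = "01/01/2015", 1
c1 = derive(b"common", b"shared", day, epoch, "GenerateCookie1")
c2 = derive(b"common", b"shared", day, epoch, "GenerateCookie2")
alice, rp = CookieChain(c1, c2), CookieChain(c1, c2)

first = alice.current()
print(rp_accepts(rp, first))            # first Notification is forwarded
alice.advance()
print(rp_accepts(rp, first))            # replaying the old cookie is refused
print(rp_accepts(rp, alice.current()))  # second Notification is forwarded
```

Because the RP steps cookie2 after every accepted packet, an observer
who replays an old packet gets nothing forwarded to Bob.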

Let's say each packet is approximately 60 bytes and Alice sends 50
Notifications to each of her friends every day. Alice then sends
50*60*167 bytes to her friends, and sending them via her 3-hop SC to
each friend's 3-hop RC multiplies the total by 6. Therefore Alice sends
about 3 MB through ORs every day in order to deliver Notifications for
different purposes to all her friends. If 10 million users send the same
amount of data to their friends, it will cost the onion network 30 TB
of data exchange per day.
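The Notification estimate can be reproduced with a few lines. The names
are mine; MB and TB are taken as decimal units (10^6 and 10^12 bytes),
which is an assumption.

```python
# Sketch of the Notification traffic estimate, using the figures from this mail.
PACKET_BYTES = 60
NOTIFICATIONS_PER_FRIEND_PER_DAY = 50
FRIENDS = 167
HOPS = 6            # 3-hop sender circuit + 3-hop receiver circuit
USERS = 10_000_000

MB = 10**6          # assumption: decimal megabytes
TB = 10**12         # assumption: decimal terabytes


def notification_bytes_per_user() -> int:
    """Bytes one user pushes through ORs per day, counting every hop."""
    return PACKET_BYTES * NOTIFICATIONS_PER_FRIEND_PER_DAY * FRIENDS * HOPS


def notification_bytes_network() -> int:
    """Same figure summed over all users."""
    return notification_bytes_per_user() * USERS


print(notification_bytes_per_user() / MB)  # about 3 MB per user per day
print(notification_bytes_network() / TB)   # about 30 TB per day in total
```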

PseudonymousServer: a public container for hosting blocks has 100%
efficiency. If each user sends/receives 10 MB of data (reading/posting)
to/from PseudonymousServer every day, the total traffic for 10 million
users would be 100 TB per day, which under our threat model has to be
routed through the onion network. But this is linear traffic, not an
exponential effect: for instance, if each Twitter.com user
downloads/uploads approximately 10 MB from/to Twitter.com servers every
day, then 10 million users would require exactly the same amount of
traffic (100*3 TB) to be routed through the onion network if they used
Tor Browser Bundle to access Twitter.com.
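To show that this part of the load is linear in the number of users,
here is the same estimate as a tiny function. The names are hypothetical
and MB/TB are assumed to be decimal units.

```python
# Sketch: PseudonymousServer traffic grows linearly with the number of users.
MB = 10**6
TB = 10**12
CIRCUIT_HOPS = 3    # an ordinary 3-hop Tor circuit to the server


def server_traffic_bytes(users: int, mb_per_user: int = 10) -> int:
    """Daily bytes exchanged with the server, before counting hops."""
    return users * mb_per_user * MB


def onion_traffic_bytes(users: int, mb_per_user: int = 10) -> int:
    """Daily bytes carried by ORs when every byte crosses three hops."""
    return server_traffic_bytes(users, mb_per_user) * CIRCUIT_HOPS


print(server_traffic_bytes(10_000_000) / TB)  # 100.0
print(onion_traffic_bytes(10_000_000) / TB)   # 300.0
print(onion_traffic_bytes(20_000_000) / TB)   # 600.0, twice the users, twice the load
```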

--SUMMARY-- 

For the presumed social networking scenarios, our paradigm with 10
million users requires, as extra load compared to classical hidden
services for linear applications, 30 TB of data exchange per day inside
the onion network. That is ~55% of the daily capacity of 5000 volunteer
onion routers with 1 Mbps of available bandwidth each, which is slight
compared to how much bandwidth 10 million users need for surfing their
favorite websites with Tor Browser Bundle, because simply refreshing a
graphical magazine like buzzfeed.com will cost more than 3 MB ...
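The ~55% figure comes out as follows if 1 Mbps is read as one megabit
per second and TB as 10^12 bytes. Both readings are my assumptions, and
the names are for illustration only.

```python
# Sketch: share of daily relay capacity used by the extra 30 TB of Notifications.
RELAYS = 5000
BITS_PER_SECOND_PER_RELAY = 1_000_000   # assumption: 1 Mbps = 10**6 bits per second
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12                             # assumption: decimal terabytes


def daily_capacity_bytes() -> int:
    """Bytes that all relays together can carry in one day."""
    return RELAYS * BITS_PER_SECOND_PER_RELAY // 8 * SECONDS_PER_DAY


def capacity_share(extra_load_bytes: int) -> float:
    """Fraction of the daily capacity consumed by the given load."""
    return extra_load_bytes / daily_capacity_bytes()


print(daily_capacity_bytes() / TB)          # 54.0
print(round(capacity_share(30 * TB), 3))    # 0.556
```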

In conclusion, the Tor network needs more relays if millions more users
who transfer megabytes of data per day try to use it.

>Our plan is completely different from what you write here. Pubsub
>distribution channels operate over the backbone, not the individual
>friend systems. It is the backbone ensuring that everyone gets a copy
>of the message she is supposed to get and the subscribers may not know
>of each other - who they are, how many they are. I don't know why you
>assume you can judge what we have been working on in the last decade,
>then talk about things that have nothing to do with us.

Now I did a search on your website and I'm not exactly sure what it is.
What I found seems to be an experimental mesh network. You criticized
Tor because when a global adversary monitors both the entry and exit
nodes of a circuit, metadata is compromised. In a mesh network (if
friends are using each other as mesh routers) even a local adversary
monitoring any part of the network can compromise metadata for that
part. Breaking onion routing needs 2 points of failure, but breaking a
mesh network needs only 1 point of failure. If you employ high delays,
padding etc. for more security, then why not apply the same defenses to
a parallel onion network managed by a comprehensive organization like
Tor inc?

In mesh networks, when a node routes someone else's traffic to its
destination, traffic analysis becomes harder for an observer, as they
can't tell whether the traffic comes from the node itself or from
someone else. But the exact same property applies to onion routing
networks too: if a user runs an onion router, it becomes harder for an
observer to tell whether intercepted traffic belongs to the user or to
someone else behind it. Onion routing is already implemented, widely
adopted, heavily supported and foils various further types of traffic
analysis attacks that mesh networks can't.

By the way, it's cool to replace ISPs with mesh networks to reduce the
radius of connections between identities and make dragnet SIGINT more
difficult. For instance, when I send a TCP packet from my home IP
address in Iran to a Tor entry guard located in Iceland, GCHQ really
collects metadata for my connection by intercepting Iran's optic fibers
in the Sea of Oman, and can probably deanonymize my Tor circuit if they
also control my selected Tor exit node in Japan. But Internet backbones
are beyond an application developer's scope; that is up to societies.

>You just described another one of the good reasons why Tor isn't the
>appropriate tool for the job we want to get done. Low latency is a
>client/server-paradigm requirement that unnecessarily reduces the
>anonymity for the use case of a distributed social network.

Our assumption is that anonymity works: when users retrieve something
from PseudonymousServer via Tor, the server can't tell whom the requests
coming out of the exit node are from. For instance, if Alice retrieves
block1 and then block2 through the same exit node, we assume the server
can't recognize that these retrievals are from the same person, as many
others are using the same exit node to retrieve blocks and the majority
of exit nodes are not colluding with the attacker at the same time. This
threat model is neither perfect nor broken. If we decide not to rely on
it, there is no alternative solution. High-latency networks might make
deanonymization harder, but if they are practical enough, I'm sure the
Tor network can easily add delays with a few lines of code for those who
want them, and if that happens in the future we can easily adopt it. The
only other solution that makes deanonymizing the connection between
Alice and Bob really hard is a PIR protocol based on homomorphic
encryption: Alice puts something in a database, and Bob later queries
the database to pick up her packet without telling the server what his
query is or what the server should give him in response! But the problem
with such a PIR protocol is that for 10 million users, the service
provider has to pay billions of dollars to cloud hosting providers every
month for computing astronomical cryptographic functions. Another PIR
protocol, which doesn't need to cryptographically process all records in
the database to produce the output, asks Alice to put something in the
database and Bob to later download all records from the database,
locally pick the record that is for him and delete the rest. But the
problem with such a PIR protocol is that the database grows larger every
day, so users have to download more and more data from it as the days go
by, which would eventually paralyze the Internet.

So we are on the right track... 
-- 
tor-talk mailing list - tor-talk@xxxxxxxxxxxxxxxxxxxx
To unsubscribe or change other settings go to
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk