
Re: [tor-dev] Proposal: Load-balancing hidden services by splitting introduction from rendezvous



Hi Tim,

Thanks for your great comments, very much appreciated!

Comments inline.



On 30/09/15 at 19:40, Tim Wilson-Brown - teor wrote:

On 30 Sep 2015, at 17:27, Tom van der Woerdt <info@xxxxxxx
<mailto:info@xxxxxxx>> wrote:

...

Filename: xxx-intro-rendezvous-controlsocket.txt
Title: Load-balancing hidden services by splitting introduction from
      rendezvous
Author: Tom van der Woerdt
Created: 2015-09-30
Status: draft

1. Overview and motivation

To address scaling concerns with the onion web, we want to be able to
spread the load of hidden services across multiple machines.
OnionBalance is a great stab at this, and it can currently give us 60x
the capacity by publishing 6 separate descriptors, each with 10
introduction points, but more is better. This proposal aims to address
hidden service scaling up to a point where we can handle millions of
concurrent connections.

The basic idea involves splitting the 'introduce' from the
'rendezvous', in the tor implementation, and adding new events and
commands to the control specification to allow intercepting
introductions and transmitting them to different nodes, which will then
take care of the actual rendezvous.
…
2.1. DisableAutomaticRendezvous configuration option

The syntax is:
   "DisableAutomaticRendezvous" SP [1|0] CRLF

This configuration option is defined to be a boolean toggle which, if
set, stops the tor implementation from automatically doing a rendezvous
when an INTRODUCE2 cell is received. Instead, an event will be sent to
the controllers. If no controllers are present, the introduction cell
should be dropped, as acting on it instead of dropping it could open a
window for a DoS.

For security reasons, the configuration should be made available only
in the configuration files, and not as an option settable by the
controller.

I’m not sure it’s necessary to prevent the controller setting this option.
We trust the controller, and might need it to be able to set this option
for compatibility with ephemeral hidden services.

What is the threat model where a controller could set this option, but
not do things that are much worse?

You're right, this addresses an irrelevant threat model.


2.2. The "INTRODUCE" event

The syntax is:
   "650" SP "INTRODUCE" SP RendezvousData CRLF

   RendezvousData = implementation-specific, but must not contain
                    whitespace, must only contain human-readable
                    characters, and should be no longer than 512 bytes

I don’t think 512 bytes is enough for the current implementation; I
recommend at least 2048 bytes. (See below.)


Agreed

The INTRODUCE event should contain sufficient data to allow continuing
the rendezvous from another Tor instance. The exact format is left
unspecified and left up to the implementation. From this follows that
only matching versions can be used safely to coordinate the rendezvous
of hidden service connections.

I would appreciate a list of the data needed by the current version of
the hidden service protocol to rendezvous, even if we don’t want to
specify the exact format, or specify data items for future
implementations. This helps ensure that the limits in the proposal are
sane, and that the proposal doesn’t have any unexpected implementation
issues.

From reading rend_service_receive_introduction, I think the data is at least:
* service_id - the hidden service address (16 base32 bytes)
* intro_key - the introduction-point specific key (128 binary bytes, 171
base64 bytes)
* request - the encrypted portion of the INTRODUCE2 cell (up to 476
binary bytes(?), 635 base64 bytes)
Therefore, I think the minimum for the current hidden service
implementation is around 830 bytes, at least if we want to offload the
maximum processing to the rendezvous instances by sending the entire
encrypted INTRODUCE2 cell. Therefore, I’d suggest that a limit of 2048
bytes is much more reasonable for future-proofing this proposal.
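
The size estimate above can be checked with a short calculation. This is only a sketch: the field sizes are the ones listed above (the 476-byte bound is marked as uncertain there), the base64 lengths assume no '=' padding, and the helper name base64_len is made up for illustration.

```python
def base64_len(n_bytes: int, padded: bool = False) -> int:
    """Number of characters needed to base64-encode n_bytes of binary data."""
    if padded:
        # With '=' padding the output is a whole number of 4-character groups.
        return 4 * ((n_bytes + 2) // 3)
    # Without padding: ceil(8 * n_bytes / 6) characters.
    return (n_bytes * 8 + 5) // 6


# Field sizes taken from the list above.
fields = {
    "service_id": 16,              # already base32 text
    "intro_key": base64_len(128),  # 128 binary bytes
    "request": base64_len(476),    # up to 476 binary bytes (uncertain bound)
}

total = sum(fields.values())
print(fields, total)
```

Under these assumptions the three fields add up to 822 characters before any separators, which is over the 512-byte limit and well under 2048.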

It also looks like you might need to split rend_service_t into:
* introduction point-specific data
* rendezvous-specific data
* shared data
Does any data need to be shared, and, if so, how do you intend to keep
the shared data synchronised?
(Putting it in the RendezvousData each time might blow out the size
considerably.)

I’d also appreciate an example of which parts of
rend_service_receive_introduction could be performed by each of the
cooperating tor instances. I assume that sending the data “as early as
possible” would offload the most processing to the rendezvous side. I
think that the split could happen right before the decryption of the
cell, at the lines:
   stage_descr = "decryption";
   /* Now try to decrypt it */

This would avoid having to share the intro point encrypted replay cache
(intro_point->accepted_intro_rsa_parts), but there’s still the hidden
service Diffie-Hellman handshake cache
(service->accepted_intro_dh_parts). If we don’t share that:
* two backend instances could accidentally compete for the same
rendezvous point if the client times out
* a client could more easily DoS the hidden service by using the same
Diffie-Hellman handshake
We’d have to decide if this security issue outweighs the benefit of
doing the decryption on multiple rendezvous-side instances.
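
To make the trade-off concrete, here is a toy model of the replay check. It is not tor's replay cache code; the class and function names are invented for illustration, and it only shows that per-backend caches each accept a repeated handshake once, while a shared cache accepts it once in total.

```python
class ReplayCache:
    """Toy stand-in for a replay cache: remembers every value it has seen."""

    def __init__(self):
        self._seen = set()

    def check_and_add(self, dh_part: bytes) -> bool:
        """Return True the first time dh_part is seen, False on a replay."""
        if dh_part in self._seen:
            return False
        self._seen.add(dh_part)
        return True


def accepted_count(dh_part: bytes, caches) -> int:
    """Offer the same handshake to each backend's cache; count acceptances."""
    return sum(1 for cache in caches if cache.check_and_add(dh_part))


# Two backends, each with its own cache: the replayed handshake is accepted twice.
separate = [ReplayCache(), ReplayCache()]
print(accepted_count(b"same-dh-handshake", separate))

# Two backends consulting one shared cache: it is accepted only once.
shared = ReplayCache()
print(accepted_count(b"same-dh-handshake", [shared, shared]))
```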


I just spent a little time trying to separate the functions as much as I can:

https://github.com/TvdW/tor/commit/115389e1659d400eb8fcb6c2d5db3c00fb4b80e2

The prototype of the new function is:

int rend_service_perform_rendezvous(rend_intro_cell_t *parsed_req,
                                    rend_service_t *service,
                                    rend_intro_point_t *intro_point,
                                    crypto_pk_t *intro_key,
                                    char *rend_pk_digest)

intro_point is only needed to count how many handshakes we've seen. That counting should probably move to a different location, but I couldn't find a way to do that without changing the behavior.

parsed_req is just the parsed cell; it could easily be serialized and deserialized later.

service would have to be synchronized across machines, but since it's basically a configuration struct, we can leave it up to the operator to make sure the configurations are similar or equal.

intro_key should probably just be transferred to the controller. The same goes for rend_pk_digest.


After proposal 224, a few things may need to change, as there are more keys to deal with, but IMHO we can just give those to the controller as well; I don't see that becoming a performance issue any time soon.

As for the DH replay cache: we still perform the normal replay checks. What's the worst thing that can happen if we see the same DH data twice?

In general, I’m concerned that we need to think through the
implementation of this proposal more carefully, because it will help us
decide whether it’s compatible with:
* Current Hidden Services
* Next-Generation Hidden Services
And perhaps make changes to any of these proposals to make them work
together.

Thoughts welcome! I don't think I'm the right person to address those.


I’d also note that it’s definitely not compatible with Single Onion
Services as specified in Proposal #252, as there is no rendezvous in
that protocol.

Indeed.

Tom




Draft 2:

Filename: TBD.txt
Title: Load-balancing hidden services by splitting introduction from
       rendezvous
Author: Tom van der Woerdt
Created: 2015-09-30
Status: draft

1. Overview and motivation

To address scaling concerns with the onion web, we want to be able to
spread the load of hidden services across multiple machines.
OnionBalance is a great stab at this, and it can currently give us 60x
the capacity by publishing 6 separate descriptors, each with 10
introduction points, but more is better. This proposal aims to address
hidden service scaling up to a point where we can handle millions of
concurrent connections.

The basic idea involves splitting the 'introduce' from the
'rendezvous', in the tor implementation, and adding new events and
commands to the control specification to allow intercepting
introductions and transmitting them to different nodes, which will then
take care of the actual rendezvous. External controller code could
relay the data to another node or a pool of nodes, all of which are run
by the hidden service operator, effectively distributing the load of
hidden services over multiple processes.

By cleverly utilizing the current descriptor methods, we could publish
up to sixty unique introduction points, which could translate to many
thousands of parallel tor workers. This should allow hidden services to
go multi-threaded, with a few small changes.


2. Specification

We propose two additions to the control specification, of which one is
an event and the other is a new command. We also introduce a new
configuration option.


2.1. HiddenServiceAutomaticRendezvous configuration option

The syntax is:
    "HiddenServiceAutomaticRendezvous" SP [1|0] CRLF

This configuration option is defined to be a boolean toggle which, if
set to 0, stops the tor implementation from automatically doing a rendezvous
when an INTRODUCE2 cell is received. Instead, an event will be sent to
the controllers. If no controllers are present, the introduction cell
should be dropped, as acting on it instead of dropping it could open a
window for a DoS.

This configuration option can be specified on a per-hidden service
level, and can be set through the controller for ephemeral hidden
services as well.


2.2. The "INTRODUCE" event

The syntax is:
    "650" SP "INTRODUCE" SP RendezvousData CRLF

    RendezvousData = implementation-specific, but must not contain
                     whitespace, must only contain human-readable
                     characters, and should be no longer than 2048 bytes

The INTRODUCE event should contain sufficient data to allow continuing
the rendezvous from another Tor instance. The exact format is left
unspecified and left up to the implementation. It follows that
only matching versions can be used safely to coordinate the rendezvous
of hidden service connections.


2.3. "PERFORM-RENDEZVOUS" command

The syntax is:
  "PERFORM-RENDEZVOUS" SP RendezvousData CRLF

This command allows a controller to perform a rendezvous using data
received through an INTRODUCE event. The format of RendezvousData is
not specified other than that it must not contain whitespace, and
should be no longer than 2048 bytes.
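
As a rough illustration of the controller side, the sketch below parses an INTRODUCE event line and builds the matching PERFORM-RENDEZVOUS command. Both exist only in this draft, not in any released control protocol. The function names are hypothetical, and treating "human-readable characters" as printable ASCII is an assumption, since the draft does not define the term.

```python
from typing import Optional

MAX_RENDEZVOUS_DATA = 2048  # limit given in sections 2.2 and 2.3


def is_valid_rendezvous_data(data: str) -> bool:
    """Check the constraints the draft places on RendezvousData.

    Assumption: "human-readable" is read as printable ASCII (0x21-0x7E),
    which also excludes all whitespace.
    """
    return 0 < len(data) <= MAX_RENDEZVOUS_DATA and all(
        33 <= ord(ch) <= 126 for ch in data
    )


def parse_introduce_event(line: str) -> Optional[str]:
    """Return RendezvousData from '650 INTRODUCE <data>', or None if malformed."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3 or parts[0] != "650" or parts[1] != "INTRODUCE":
        return None
    return parts[2] if is_valid_rendezvous_data(parts[2]) else None


def build_perform_rendezvous(data: str) -> bytes:
    """Build the command a controller would send to the rendezvous-side node."""
    if not is_valid_rendezvous_data(data):
        raise ValueError("invalid RendezvousData")
    return f"PERFORM-RENDEZVOUS {data}\r\n".encode("ascii")


data = parse_introduce_event("650 INTRODUCE QUJDREVGRw\r\n")
if data is not None:
    print(build_perform_rendezvous(data))
```

The controller never interprets RendezvousData; it only validates the outer framing and passes the opaque value on, which matches the draft's intent of keeping the format implementation-specific.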


3. Compatibility and security

The implementation of these methods should, ideally, not change
anything in the network, and all control changes are opt-in, so this
proposal is fully backwards compatible.

Controllers handling this data must be careful to not leak rendezvous
data to untrusted parties, as it could be used to intercept and
manipulate hidden services traffic.


4. Example

Let's take an example where a client (Alice) tries to contact Bob's
hidden service. To do this, Bob follows the normal hidden service
specification, except he sets up ten servers to do this. One of these
publishes the descriptor; the others have descriptor publishing disabled. When the
INTRODUCE2 cell arrives at the node which published the descriptor, it
does not immediately try to perform the rendezvous, but instead outputs
this to the controller. Through an out-of-band process this message is
relayed to a controller of another node of Bob's, and this transmits
the "PERFORM-RENDEZVOUS" command to that node. This node finally
performs the rendezvous, and will continue to serve data to Alice,
whose client will now not have to talk to the introduction point
anymore.


5. Other considerations

We have left the actual format of the rendezvous data in the control
protocol unspecified, so that controllers do not need to worry about
the various types of hidden service connections, most notably proposal
224.

The decision to not implement the actual cell relaying in the tor
implementation itself was taken to allow more advanced configurations,
and to leave the actual load-balancing algorithm to the implementor of
the controller. The developer of the tor implementation should not
have to choose between a round-robin algorithm and something that could
pull CPU load averages from a centralized monitoring system.
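
To show what leaving the algorithm to the controller could look like, here is a minimal sketch of two interchangeable backend pickers. The backend names, the load table, and the function names are all made up for illustration; a real controller would obtain load figures from its own monitoring.

```python
import itertools
from typing import Callable, Dict, Sequence


def round_robin(backends: Sequence[str]) -> Callable[[], str]:
    """Return a picker that cycles through the backends in order."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


def least_loaded(
    backends: Sequence[str], get_load: Callable[[str], float]
) -> Callable[[], str]:
    """Return a picker that chooses the backend with the lowest reported load."""
    return lambda: min(backends, key=get_load)


backends = ["node-a", "node-b", "node-c"]  # hypothetical rendezvous nodes

pick = round_robin(backends)
print([pick() for _ in range(4)])

# Hypothetical load averages, e.g. pulled from a monitoring system.
loads: Dict[str, float] = {"node-a": 0.9, "node-b": 0.2, "node-c": 0.5}
pick = least_loaded(backends, loads.__getitem__)
print(pick())
```

Either picker can be dropped into the same relay loop, since both are plain callables returning the node that should receive the next PERFORM-RENDEZVOUS command.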