Re: [tor-relays] inet_csk_bind_conflict



Also see this patch, which introduces net.ipv4.ip_autobind_reuse:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4b01a9674231a97553a55456d883f584e948a78d

Enabling net.ipv4.ip_autobind_reuse allows the kernel to bind SO_REUSEADDR-enabled sockets (which I think tor's sockets are) to the same <addr, port>, but only once all ephemeral ports are exhausted. So it should fix the "resource exhausted" bugs, but we will still spend far too much time in the kernel searching for a free port before it gives up and checks whether net.ipv4.ip_autobind_reuse is set.

It is only safe to use when you know that you'll not have tons of connections to the same <dstip:dstport> (so not safe to use in the haproxy setup), which is why it "should only be set by experts", and they suggest using IP_BIND_ADDRESS_NO_PORT instead:
> ip_autobind_reuse - BOOLEAN
>  By default, bind() does not select the ports automatically even if
>  the new socket and all sockets bound to the port have SO_REUSEADDR.
>  ip_autobind_reuse allows bind() to reuse the port and this is useful
>  when you use bind()+connect(), but may break some applications.
>  The preferred solution is to use IP_BIND_ADDRESS_NO_PORT and this
>  option (i.e ip_autobind_reuse) should only be set by experts.
>  Default: 0

I've set it on the dotsrc exits for now (`sysctl -w net.ipv4.ip_autobind_reuse=1`), while we wait for IP_BIND_ADDRESS_NO_PORT.
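
If you want to check what a given box is currently set to, here is a minimal Python sketch. The helper name `read_sysctl` is my own, and it assumes the usual Linux mapping from sysctl names to files under /proc/sys. On kernels that predate the patch the file does not exist, so the helper returns None.

```python
import os

def read_sysctl(name, root="/proc/sys"):
    """Return the value of a sysctl as a string, or None if it does not exist.

    Assumes the usual Linux layout, where net.ipv4.foo lives at
    /proc/sys/net/ipv4/foo. `root` is a parameter only to make testing easy.
    """
    path = os.path.join(root, *name.split("."))
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        # The kernel does not know this sysctl.
        return None

if __name__ == "__main__":
    for key in ("net.ipv4.ip_autobind_reuse", "net.ipv4.ip_local_port_range"):
        print(key, "=", read_sysctl(key))
```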

- Anders

On Sat, Dec 10, 2022 at 9:59 AM Anders Trier Olesen <anders.trier.olesen@xxxxxxxxx> wrote:
Hi David

IP_BIND_ADDRESS_NO_PORT did not fix the somewhat similar problem in your Haproxy setup because all the connections go to the same destination tuple <ip, port> (i.e. 127.0.0.1:ExtORPort).
The connect() system call looks for a unique 5-tuple <protocol, srcip, srcport, dstip, dstport>. In the Haproxy setup the only free variable is srcport, as in <tcp, 127.0.0.1, srcport, 127.0.0.1, ExtORPort>, so toggling IP_BIND_ADDRESS_NO_PORT makes no difference.
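
To put rough numbers on the 5-tuple argument, here is a small Python sketch. The function name and the example counts are made up for illustration, and the port range used is the common Linux default of 32768-60999, which is an assumption about your system (check net.ipv4.ip_local_port_range). The bound only applies when the source port is chosen at connect() time.

```python
def max_concurrent_connections(n_src_ips, n_src_ports, n_dst_ips, n_dst_ports):
    """Upper bound on simultaneous TCP connections: one connection per
    unique <tcp, srcip, srcport, dstip, dstport> tuple."""
    return n_src_ips * n_src_ports * n_dst_ips * n_dst_ports

# Assumed ephemeral range 32768-60999 (inclusive).
EPHEMERAL_PORTS = 60999 - 32768 + 1

# Haproxy -> tor over localhost: one source IP, one destination IP,
# one ExtORPort. Only srcport can vary.
single = max_concurrent_connections(1, EPHEMERAL_PORTS, 1, 1)

# Hypothetical spread: 4 source IPs, 2 listen IPs, 3 ExtORPorts.
spread = max_concurrent_connections(4, EPHEMERAL_PORTS, 2, 3)

print(single, spread)
```

With a single destination the ceiling is just the size of the ephemeral port range; every extra source IP, listen IP or listen port multiplies it.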

The following should help (unless you have found a bug in Linux):
  1. Let tor listen on a bunch of different ExtORPorts
  2. Let tor listen on a bunch of IPs for the ExtORPort (so we have #ExtORPort * #ExtORPortListenIPs unique combinations)
  3. Connect from different src IPs (what you already implemented)
  4. sysctl -w net.ipv4.ip_local_port_range="1024 65535"
For 1 and 2 to make a difference when you also do 3 (i.e. bind before connect), you need IP_BIND_ADDRESS_NO_PORT enabled on the socket.
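
Bind-before-connect with IP_BIND_ADDRESS_NO_PORT could look roughly like this in Python. This is only a sketch of the socket calls, not what tor or Haproxy actually do; `connect_from` is a hypothetical helper, and the fallback value 24 is the Linux constant from <linux/in.h>, used for Python builds that do not export it.

```python
import socket

# Linux-only socket option; fall back to the numeric value from <linux/in.h>
# if this Python build does not export the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)

def connect_from(src_ip, dst_ip, dst_port, timeout=5.0):
    """Connect to dst from a chosen source IP, leaving the source port
    to be picked by the kernel at connect() time."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.settimeout(timeout)
        # Ask bind() to fix only the address and not reserve a port yet.
        s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        s.bind((src_ip, 0))
        # The port is chosen here, where the kernel knows the destination
        # and only needs the full 5-tuple to be unique.
        s.connect((dst_ip, dst_port))
    except OSError:
        s.close()
        raise
    return s
```

Without the setsockopt() call, bind() with port 0 reserves a source port immediately, before the destination is known.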

Tor relays already connect to many different dstip:dstport pairs, so enabling IP_BIND_ADDRESS_NO_PORT should solve our problem.

I rest my case ;)

Best regards
Anders Trier Olesen


On Sat, Dec 10, 2022 at 5:41 AM David Fifield <david@xxxxxxxxxxxxxxx> wrote:
On Fri, Dec 09, 2022 at 09:47:07AM +0000, Alexander Færøy wrote:
> On 2022/12/01 20:35, Christopher Sheats wrote:
> > Does anyone have experience troubleshooting and/or fixing this problem?
>
> Like I wrote in [1], I think it would be interesting to hear if the
> patch from pseudonymisaTor in ticket #26646[2] would be of any help in
> the given situation. The patch allows an exit operator to specify a
> range of IP addresses for binding purposes for outbound connections. I
> would think this could split the load wasted on trying to resolve port
> conflicts in the kernel amongst the set of IP's you have available for
> outbound connections.

This sounds similar to a problem we faced with the main Snowflake
bridge. After usage passed a certain threshold, we started getting
constant EADDRNOTAVAIL, not on the outgoing connections to middle nodes,
but on the many localhost TCP connections used by the pluggable
transports model.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40198
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201

Long story short, the only mitigation that worked for us was to bind
sockets to an address (with port number unspecified, and with
IP_BIND_ADDRESS_NO_PORT *unset*) before connecting them, and use
different 127.0.0.0/8 addresses or ranges of addresses in different
segments of the communication chain.
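
For readers who want to picture the mitigation, a minimal Python sketch follows. It is not the code from the merge requests linked below; the helper names and the 127.0.1.0/24 range are hypothetical, and it assumes a Linux loopback interface where every address in 127.0.0.0/8 is local.

```python
import ipaddress
import random
import socket

def pick_source_address(cidr, rng=random):
    """Pick a random host address from an IPv4 range such as 127.0.1.0/24.

    Assumes a prefix of /30 or shorter, so that there is at least one
    address between the network and broadcast addresses.
    """
    net = ipaddress.ip_network(cidr)
    # Skip the first (network) and last (broadcast) address of the range.
    offset = rng.randrange(1, net.num_addresses - 1)
    return str(net[offset])

def connect_bound(src_cidr, dst_ip, dst_port):
    """Bind to an address from src_cidr with port 0, then connect.

    IP_BIND_ADDRESS_NO_PORT is deliberately left unset, so bind() assigns
    the source port before connect() is called.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((pick_source_address(src_cidr), 0))
        s.connect((dst_ip, dst_port))
    except OSError:
        s.close()
        raise
    return s
```

Each segment of the chain would call something like `connect_bound` with its own source range, e.g. a hypothetical "127.0.1.0/24" for one hop and "127.0.2.0/24" for the next.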

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/120
https://gitlab.torproject.org/dcf/extor-static-cookie/-/commit/a5c7a038a71aec1ff78d1b15888f1c75b66639cd

IP_BIND_ADDRESS_NO_PORT was mentioned in another part of the thread
(https://lists.torproject.org/pipermail/tor-relays/2022-December/020895.html).
For us, this bind option *did not help* and in fact we had to apply a
workaround for Haproxy, which has IP_BIND_ADDRESS_NO_PORT hardcoded.
*Why* that should be the case is a mystery to me, as is why it is true
that bind-before-connect avoids EADDRNOTAVAIL even when the address
manually bound to is the very same address the kernel would have
automatically assigned. I even spent some time reading the Linux 5.10
source code trying to make sense of it. In the source code I found, or
at least think I found, code paths for the behavior I observed; but the
behavior seems to go against how bind and IP_BIND_ADDRESS_NO_PORT are
documented to work.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201#note_2839472

> Although my understanding of what Linux is doing is very imperfect, my
> understanding is that both of these questions have the same answer:
> port number assignment in `connect` when called on a socket not yet
> bound to a port works differently than in `bind` when called with a
> port number of 0. In case (1), the socket is not bound to a port
> because you haven't even called `bind`. In case (2), the socket is not
> bound to a port because haproxy sets the `IP_BIND_ADDRESS_NO_PORT`
> sockopt before calling `bind`. When you call `bind` *without*
> `IP_BIND_ADDRESS_NO_PORT`, it causes the port number to be bound
> before calling `connect`, which avoids the code path in `connect` that
> results in `EADDRNOTAVAIL`.
>
> I am confused by these results, which are contrary to my understanding
> of what `IP_BIND_ADDRESS_NO_PORT` is supposed to do, which is
> precisely to avoid the problem of source address port exhaustion by
> deferring the port number assignment until the time of `connect`, when
> additional information about the destination address is available. But
> it's demonstrable that binding to a source port before calling
> `connect` avoids `EADDRNOTAVAIL` errors in our use cases, whatever the
> cause may be.
_______________________________________________
tor-relays mailing list
tor-relays@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays