Hi again
I took another look at this problem, and now I'm even more convinced that what we really need is IP_BIND_ADDRESS_NO_PORT. Here's why.
If torrc OutboundBindAddress is configured, tor calls bind(2) on every outgoing connection:
with sockaddr_in.sin_port set to 0 on #L2438.
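To make the effect visible from userspace, here is a minimal sketch (not tor's code; `port_after_plain_bind` is a name I made up for illustration). It binds a TCP socket to a given address with sin_port set to 0 and reports the port the kernel picked. On Linux the port is already chosen when bind(2) returns, before any connect(2).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to `ip` with port 0 and return
 * the local port the kernel assigned at bind() time, or -1 on error. */
static int port_after_plain_bind(const char *ip)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = 0; /* "any port": the kernel must search for one now */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel which local port it reserved for this socket. */
    struct sockaddr_in bound;
    socklen_t len = sizeof(bound);
    int port = -1;
    if (getsockname(fd, (struct sockaddr *)&bound, &len) == 0)
        port = ntohs(bound.sin_port);

    close(fd);
    return port;
}
```

Calling `port_after_plain_bind("127.0.0.1")` should return a nonzero port from the ephemeral range, which shows that the port search has already happened inside bind(2).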
The kernel doesn't know that we won't be using this socket for listen(2), so it tries to find an unused local two-tuple (according to [1]; actually a three-tuple: <protocol, source ip, source port>):
The bind syscall is handled by inet_bind:
which calls __inet_bind that in turn calls sk->sk_prot->get_port on #L531 (notice the if on #L529).
get_port is implemented by inet_csk_get_port in inet_connection_sock.c:
On #L375, we call inet_csk_find_open_port (defined on #L190) to find a free port.
inet_csk_find_open_port gets the local port range on #L206 (i.e. net.ipv4.ip_local_port_range), selects a random starting point (#L222), and loops through the ports until it finds one that is free (#L230). For every port candidate that is already in use (#L240), it calls inet_csk_bind_conflict (#L241), which is defined on #L133. As far as I understand, inet_csk_bind_conflict's job is to determine whether it is safe to bind to the port anyway (e.g., the existing connection could be in TCP_TIME_WAIT and SO_REUSEPORT set on the socket). This is where your server spends so much time. Increasing net.ipv4.ip_local_port_range doesn't solve the problem; it only makes it more likely that a free port is found quickly.
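The search described above can be modelled roughly like this. It is a simplified userspace sketch, not the kernel code: `find_open_port`, the `in_use` array and the `probes` counter are all my own inventions, the starting offset is passed in instead of being random, and every in-use port is treated as a conflict (the real function asks inet_csk_bind_conflict and has more detail).

```c
#include <stdbool.h>

/* Simplified model of the port search, NOT kernel code.
 * in_use[p - low] is true if port p is already taken.
 * Returns a free port in [low, high], or -1 if none is free.
 * *probes is set to the number of candidates examined. */
static int find_open_port(const bool *in_use, int low, int high,
                          int start_offset, int *probes)
{
    int remaining = high - low + 1;
    int offset = start_offset % remaining;

    *probes = 0;
    for (int i = 0; i < remaining; i++) {
        /* Walk the range from the starting point, wrapping around. */
        int port = low + (offset + i) % remaining;
        (*probes)++;
        if (!in_use[port - low])
            return port;
        /* The kernel would check here whether the in-use port can be
         * shared anyway; this model always treats it as a conflict. */
    }
    return -1; /* every port in the range is taken */
}
```

The point of the model is the `probes` count: the fuller the range, the more candidates each bind(2) has to examine, and a wider range only lowers the odds of hitting a taken port.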
Let's trace back to the "if" in __inet_bind on #L529:
Since we call bind with sockaddr_in.sin_port set to 0, snum is 0, and we can avoid the whole call chain by setting inet->bind_address_no_port to 1, i.e. with this patch:
That should allow the kernel to reuse source ports that are already in use, as long as the TCP 4-tuple is unique.
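For anyone who wants to check the behaviour outside tor, here is a small standalone sketch (not the patch itself; `port_after_no_port_bind` is a made-up name). It assumes Linux 4.2 or later, where the IP_BIND_ADDRESS_NO_PORT socket option exists, and falls back to the Linux value 24 if the libc headers don't define it. With the option set, bind(2) with port 0 should leave the socket without a local port, and the port is picked later at connect(2) time.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_BIND_ADDRESS_NO_PORT
#define IP_BIND_ADDRESS_NO_PORT 24 /* Linux value, for older libc headers */
#endif

/* Hypothetical helper: set IP_BIND_ADDRESS_NO_PORT, bind to `ip` with
 * port 0, and return the local port reported right after bind().
 * Returns -1 on any error (e.g. the kernel lacks the option). */
static int port_after_no_port_bind(const char *ip)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Tell the kernel not to reserve a port during bind(). */
    int one = 1;
    if (setsockopt(fd, IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT,
                   &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = 0;
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    /* With the option set, the reported port should still be 0 here. */
    struct sockaddr_in bound;
    socklen_t len = sizeof(bound);
    int port = -1;
    if (getsockname(fd, (struct sockaddr *)&bound, &len) == 0)
        port = ntohs(bound.sin_port);

    close(fd);
    return port;
}
```

If the option is supported, `port_after_no_port_bind("127.0.0.1")` should return 0, meaning the source address is fixed but no port search has been done yet.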
Please include it in the next tor release! :)
- Anders