[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[tor-bugs] #32108 [Core Tor/Tor]: tor can overrun its accountingmax if it enters soft hibernation first
#32108: tor can overrun its accountingmax if it enters soft hibernation first
------------------------------+--------------------------------
Reporter: arma | Owner: (none)
Type: defect | Status: new
Priority: Medium | Milestone:
Component: Core Tor/Tor | Version: Tor: 0.4.0.1-alpha
Severity: Normal | Keywords: network-health
Actual Points: | Parent ID:
Points: | Reviewer:
Sponsor: |
------------------------------+--------------------------------
I'll put the punchline first: second_elapsed_callback(), which is where we
check if it's time to go dormant for hibernation, is no longer called when
we've entered soft hibernation.
I assume this is because of the new "periodic event flag" feature, where
we try to avoid calling callbacks when we're in a state that won't need
them. See the "online and active" note here:
{{{
/* This is a legacy catch-all callback that runs once per second if
* we are online and active. */
CALLBACK(second_elapsed, NET_PARTICIPANT,
FL(NEED_NET)|FL(RUN_ON_DISABLE)),
}}}
The impact is limited, since we stop accepting new connections and new
circuits when we enter soft hibernation, but it can still be bad: existing
connections and circuits can last for a long time and use a lot of
bandwidth.
A secondary impact is that accounting_run_housekeeping() never gets
called, which means that the state file never gets updated after we've
entered soft hibernation, which means these bandwidth overspends are never
recorded to disk.
I think the bug went in during commit 4bf79fa4f which is part of Tor
0.4.0.1-alpha.
The PERIODIC_EVENT_FLAG_NEED_NET flag (what FL(NEED_NET) expands into)
checks net_is_disabled(), but there is another function right after
net_is_disabled in netstatus.c called net_is_completely_disabled(), and
the only difference is that it checks we_are_fully_hibernating() vs
we_are_hibernating().
I confirmed the overall bug happens in practice: there's a relay operator
in #tor who hit soft hibernation, and then saw his tor proceed to use more
bytes than expected. I had him do a 'gdb attach' to his tor and break on
'second_elapsed_callback' and the function never got called.
It seems like the immediate fix, and best backport plan, would be to
resume calling second_elapsed_callback even when net_is_disabled(). The
longer term plan can be to audit our calls to net_is_disabled() and
net_is_completely_disabled() and we_are_hibernating(), with an eye towards
"should we be doing this behavior while soft hibernating", and see what
other bugs we find.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32108>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs