Re: [tor-relays] Flooding of unbound via resolve attempts
On Thu, Mar 10, 2022 at 08:33:07AM +0000, Georg Koppen wrote:
> Hello!
>
> As you might know we are doing regular (at the moment weekly) scans of
> exit nodes to find and help with misconfigurations or errors that have
> potentially serious effects for Tor network usability and performance.
> The results we got so far after over a year of scanning are roughly
> single digit numbers of exit relays per week having mostly DNS
> configuration issues (unbound crashed etc.)
>
> However, this week we suddenly found almost 80 exit relays with
> malfunctioning DNS resolution[1] which was surprising. Additionally,
> after some of the servers got fixed, the issue returned. DrWhax (thanks!)
> pointed us to a possible explanation tweeted by the unredacted folks:
>
> https://twitter.com/unredacted_org/status/1501458345219215363
>
> It seems that someone (intentionally or not) is overwhelming unbound
> leading to DNS resolution issues for those exit operators that do run
> this local resolver, which we currently recommend.
>
I find it interesting that it is possible to crash/DoS unbound through
Tor circuits to an exit relay. I would have assumed that some other
limit would be hit before unbound became the bottleneck. They posted
some CPU graphs on Twitter, but it would be interesting to see
requests/s numbers as well, if anyone has some to share.
> We've opened a ticket[2] for further investigation, but I hope this
> email raises some awareness so that exit operators can keep an eye on
> the situation.
>
> Feel free to add insights you have to the ticket. Additionally, I bet if
> someone would share how they do monitoring for such a problem on their
> exits then a lot of exit operators would be happily picking up that
> setup and the Tor network would win. :)
>
I'm using Grafana + Prometheus + node_exporter to monitor my relays.
Grafana is a web UI for visualising data. Prometheus is a data
collector that scrapes data from node_exporter and stores it for
Grafana to fetch. node_exporter is a service that collects a bunch of
host data and presents it in the same format as the new Tor metrics
function.
(When I eventually get Tor daemons recent enough to get anything but
emptiness out of the metrics port, I'll add them to Prometheus for
scraping as well.)
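For anyone who has not looked at that format: it is line-based text,
with "#" comment lines and one "name{labels} value" sample per line.
A minimal Python sketch of reading it, assuming samples carry no
trailing timestamp. The function name and the sample text are made up
for illustration; node_load1 and node_network_receive_bytes_total are
metric names that node_exporter exposes.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    Assumption: no optional timestamp follows the value, so the value
    is always the last space-separated field on the line.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


# Hypothetical excerpt of what an exporter might return.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_network_receive_bytes_total{device="eth0"} 1234567
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

In practice Prometheus does this parsing itself; the sketch is only
meant to show how little is needed to eyeball an exporter by hand.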
Grafana is great and one can build dashboards that show pertinent
information and give a good overview. It is also possible to configure
alerts if metrics go outside of specified bounds. I have alerts
configured to mail me for a few statistics.
When it comes to unbound monitoring, I use unbound_exporter from the
letsencrypt project on GitHub[3]. It works the same way node_exporter
does, but exports unbound metrics and can be scraped by Prometheus. To
visualise the data, I use a pre-made dashboard for Grafana[4] that I
have tweaked a bit.
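To get from those exported counters to a requests/s figure, one takes
two readings of a query counter and divides the difference by the
elapsed time. Prometheus has a rate() function for this; the Python
sketch below only illustrates the arithmetic. The function name and
the numbers are hypothetical, and treating a decreasing counter as a
restart from zero is a simplifying assumption.

```python
def per_second_rate(prev, curr):
    """Return the per-second increase between two counter samples.

    prev and curr are (unix_time, counter_value) tuples, oldest first.
    """
    (t0, v0), (t1, v1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Assumption: a counter that went down was reset to zero by a
    # restart, so everything counted since then is the new value.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / elapsed


if __name__ == "__main__":
    # Hypothetical readings of a query counter taken 60 seconds apart.
    print(per_second_rate((1000, 52000), (1060, 58000)))
```

A sudden jump in this number on an exit would be the kind of thing
worth alerting on.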
Cordially,
Andreas Kempe
>
> [1] https://gitlab.torproject.org/tpo/network-health/team/-/issues/197
> [2] https://gitlab.torproject.org/tpo/network-health/analysis/-/issues/30
[3]: https://github.com/letsencrypt/unbound_exporter
[4]: https://grafana.com/grafana/dashboards/9604