
[tor-dev] Proposal 328: Make Relays Report When They Are Overloaded



Greetings,

Attached is a proposal from Mike Perry and me. The merge request is here:

https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/22

Cheers!
David

```
Filename: 328-relay-overload-report.md
Title: Make Relays Report When They Are Overloaded
Author: David Goulet, Mike Perry
Created: November 3rd 2020
Status: Draft
```

# 0. Introduction

Many relays are likely sometimes under heavy load in terms of memory, CPU or
network resources, which in turn diminishes their ability to efficiently relay
data through the network.

Having the capability to learn whether a relay is overloaded would allow us to
make better-informed load balancing decisions. For instance, our bandwidth
scanners could be made more intelligent about how they allocate bandwidth
based on such metrics from relays.

We could furthermore improve our network health monitoring and pinpoint relays
that are possibly misbehaving or under DDoS attack.

# 1. Metrics to Report

We propose that relays start collecting several metrics (see section 2)
reflecting their load from the different components of tor.

We then propose that 3 new lines be added to the extra-info document (see
dir-spec.txt, section 2.1.2), but only if an overload case arises.

The following describes a series of metrics to collect; more might be added in
the future, so this is not an exhaustive list.

## 1.1. General Overload

The general overload line indicates that a relay has reached an "overloaded
state", which can be triggered by one or more of the following load metrics:

   - Any OOMkiller invocation due to memory pressure
   - Any onionskins are dropped
   - CPU utilization of Tor's mainloop CPU core above 90% for 60 sec
   - TCP port exhaustion

The format of the overload line added to the extra-info document is as
follows:

```
"overload-reached" YYYY-MM-DD HH:MM:SS NL
   [At most once.]
```

The timestamp indicates when at least one of these metrics was detected. It
should always be rounded down to the hour; as an example, "2020-01-10 13:00:00"
is an expected timestamp. Because this is a binary state, if the line is
present, we consider that the overload state was hit at least once somewhere
between the provided timestamp and the "published" timestamp of the document,
which is when the document was generated.

The overload field should remain in place for 72 hours since last triggered.
If the limits are reached again in this period, the timestamp is updated, and
this 72 hour period restarts.
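
For illustration, a relay that first entered the overloaded state during the
13:00 hour on January 10th 2020 would report the following line (using the
example timestamp above):

```
overload-reached 2020-01-10 13:00:00
```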

## 1.2. Token Bucket Size

Relays should report the 'BandwidthBurst' and 'BandwidthRate' limits in their
descriptor, as well as the number of times these limits were reached, for read
and write, in the past 24 hours starting at the provided timestamp rounded
down to the hour.

```
"overload-ratelimits" SP YYYY-MM-DD SP HH:MM:SS
                      SP rate-limit SP burst-limit
                      SP read-rate-count SP read-burst-count
                      SP write-rate-count SP write-burst-count NL
  [At most once.]
```

The "rate-limit" and "burst-limit" are the raw values from the BandwidthRate
and BandwidthBurst found in the torrc configuration file.

The "{read|write}-rate-count" and "{read|write}-burst-count" are the counts of
how many times the reported limits were exhausted and thus the maximum between
the read and write count occurances.
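
As an illustration, a relay configured with a BandwidthRate of 1 MByte/s
(1048576) and a BandwidthBurst of 2 MBytes (2097152) that hit those limits a
handful of times in the reporting period might report the following line (all
counts here are hypothetical):

```
overload-ratelimits 2020-01-10 13:00:00 1048576 2097152 15 5 12 3
```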

## 1.3. File Descriptor Exhaustion

Not having enough file descriptors in this day and age is really a sign of a
misconfiguration or of a too-old operating system. Reporting it lets us very
quickly notice which relays have a value that is too small so we can notify
their operators.

This should be published in this format:

```
"overload-fd-exhausted" YYYY-MM-DD HH:MM:SS NL
  [At most once.]
```

As with the overload-reached line, the timestamp indicates that the maximum
was reached at least once between this timestamp and the "published" timestamp
of the document.

This overload field should remain in place for 72 hours since last triggered.
If the limits are reached again in this period, the timestamp is updated, and
this 72 hour period restarts.
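
As with the general overload line, an example of the reported line would be:

```
overload-fd-exhausted 2020-01-10 13:00:00
```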

# 2. Load Metrics

This section proposes a series of metrics that should be collected and
reported to the MetricsPort. The Prometheus format (the only one supported for
now) is described for each metric, and each of them is prefixed with the
following in order to have a proper namespace for "load" events:

`tor_load_`

## 2.1 Out-Of-Memory (OOM) Invocation

Tor's OOM handler manages caches and queues of all sorts. Relays have many of
these, so any invocation of the OOM handler should be reported.

```
# HELP Total number of bytes the OOM has cleaned up
# TYPE counter
tor_load_oom_bytes_total{<LABEL>} <VALUE>
```

A running counter of how many bytes were cleaned up by the OOM handler for a
tor component identified by a label (see list below). To make sense, this
should be visualized with the rate() function.

Possible LABELs for which the OOM was triggered:
  - `cell`: Circuit cell queue
  - `dns`: DNS resolution cache
  - `geoip`: GeoIP cache
  - `hsdir`: Onion service descriptors
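
As an example, the MetricsPort output for a relay whose OOM handler has
cleaned up circuit cell queues and DNS cache entries might look like the
following; note that the `subsys` label key and the byte values are
hypothetical here, since the proposal only specifies the label values:

```
# HELP Total number of bytes the OOM has cleaned up
# TYPE counter
tor_load_oom_bytes_total{subsys="cell"} 12582912
tor_load_oom_bytes_total{subsys="dns"} 1048576
```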

## 2.2 Onionskin Queues

Onionskin handling is one of the few tasks that tor processes in parallel, but
onionskins can be dropped for various reasons when under load. For this metric
to make sense, we also need to gather how many onionskins we are processing,
so that one can provide a total processed versus dropped ratio:

```
# HELP Total number of onionskins
# TYPE counter
tor_load_onionskin_total{<LABEL>} <NUM>
```

Possible LABELs are:
  - `processed`: Indicating how many were processed.
  - `dropped`: Indicating how many were dropped due to load.
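
As an illustration, a relay that processed many onionskins and dropped a small
fraction under load might expose the following (the `state` label key and the
counts are hypothetical):

```
tor_load_onionskin_total{state="processed"} 52000
tor_load_onionskin_total{state="dropped"} 130
```

The processed versus dropped ratio can then be derived from these two counters.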

## 2.3 File Descriptor Exhaustion

Relays can hit the "ulimit" cap (on Linux), which is the maximum number of
open file descriptors allowed. In Tor's use case, these are mostly sockets.
File descriptors should be reported as follows:

```
# HELP Total number of file descriptors
# TYPE gauge
tor_load_fd_total{<LABEL>} <NUM>
```

Possible LABELs are:
  - `remaining`: How many file descriptors remain that can still be opened.

Note: since tor already tracks that value in order to reserve a block of
descriptors for critical ports such as the Control Port, it can easily be
exported.
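
For example, a relay with roughly 7000 file descriptors still available to be
opened might expose (the `state` label key and the value are hypothetical):

```
tor_load_fd_total{state="remaining"} 7168
```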

## 2.4 TCP Port Exhaustion

The TCP protocol is capped at 65535 ports, so if the relay is ever unable to
open more outbound sockets, that is an overloaded state. It should be
reported:

```
# HELP Total number of opened outbound connections.
# TYPE gauge
tor_load_socket_total{<LABEL>} <NUM>
```

Possible LABELs are:
  - `outbound`: Sockets used for outbound connections.
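
For example, a relay with most of the port range in use for outbound
connections might expose (the `type` label key and the count are
hypothetical):

```
tor_load_socket_total{type="outbound"} 58340
```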

## 2.5 Connection Bucket Limit

Rate-limited connections track bandwidth using a token bucket system. Once the
bucket limit is reached and tor wants to send more, it pauses until the bucket
is refilled a second later. Whenever that limit is hit, it should be reported:

```
# HELP Total number of global connection bucket limit reached
# TYPE counter
tor_load_global_rate_limit_reached_total{<LABEL>} <NUM>
```

Possible LABELs are:
  - `read`: Read side of the global rate limit bucket.
  - `write`: Write side of the global rate limit bucket.
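
For example, a relay whose global write bucket runs out more often than its
read bucket might expose (the `side` label key and the counts are
hypothetical):

```
tor_load_global_rate_limit_reached_total{side="read"} 42
tor_load_global_rate_limit_reached_total{side="write"} 137
```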
