Re: [tor-bugs] #31244 [Internal Services/Tor Sysadmin Team]: long term prometheus metrics
#31244: long term prometheus metrics
-------------------------------------------------+---------------------
Reporter: anarcat | Owner: tpa
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Internal Services/Tor Sysadmin Team | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------------------+---------------------
Comment (by anarcat):
in #29388, i said:
{{{
> (1.3byte/(15s)) * 15 d * 2500 * 80 to Gibyte
((1.3 * byte) / (15 * second)) * (15 * day) * 2500 * 80 =
approx. 20.92123 gibibytes
}}}
If we expand this to 30d (the current retention policy), we get:
{{{
> 30d×1.3byte/(15s)×2500×80 to Gibyte
(((30 * day) * (1.3 * byte)) / (15 * second)) * 2500 * 80 = approx.
41.842461 gibibytes
}}}
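For anyone who wants to redo this arithmetic without a units calculator, here's a rough Python sketch of the same estimate. The function name `estimate_gib` and its parameters are made up for illustration. It assumes, like the calculations above, 1.3 bytes per sample and one time series per (metric, host) pair.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def estimate_gib(retention_days, interval_seconds, metrics, hosts,
                 bytes_per_sample=1.3):
    """Estimated on-disk size in GiB (hypothetical helper, not a Prometheus API)."""
    # Number of samples one series accumulates over the retention period.
    samples_per_series = retention_days * 86400 / interval_seconds
    # Assumption: every host exposes every metric, so series = metrics * hosts.
    return samples_per_series * bytes_per_sample * metrics * hosts / GIB

print(round(estimate_gib(15, 15, 2500, 80), 2))  # 20.92, the figure from #29388
print(round(estimate_gib(30, 15, 2500, 80), 2))  # 41.84, the 30-day figure
```

This is only the back-of-the-envelope model; real usage depends on how well the samples compress and on how many series each host really exposes.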
In other words, the current server should take about 40 GiB of storage.
It's actually taking much less:
{{{
21G /var/lib/prometheus/metrics2/
}}}
There are a few reasons for this:
1. we don't have 2500 metrics, we have 1289
2. we don't have 80 hosts, we have 75
3. each host doesn't necessarily expose all metrics
Regardless of 3, stripping down to 1300 metrics over 75 hosts gives an
estimate that actually matches the current consumption, more or less:
{{{
> 30d×1.3byte/(15s)×1300×75 to Gibyte
(((30 * day) * (1.3 * byte)) / (15 * second)) * 1300 * 75 = approx.
20.3982 gibibytes
}}}
So let's play with those schedules a bit. Here's the same data, but with
hourly pulls for a year:
{{{
> 365d×1.3byte/(1h)×1300×75 to Gibyte
(((365 * day) * (1.3 * byte)) / (1 * hour)) * 1300 * 75 = approx.
1.0340754 gibibytes
}}}
Holy macaroni! Only 1 GiB per year! We could keep 20 years of data in about 20 GiB!
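The same formula can be turned around to ask how many years fit in a given disk budget. A minimal sketch, assuming the 1300 metrics, 75 hosts and 1.3 bytes per sample used above, and reading the `21G` reported for the current data directory as 21 GiB. The function name `retention_years` is hypothetical.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def retention_years(budget_gib, interval_seconds, metrics=1300, hosts=75,
                    bytes_per_sample=1.3):
    """Years of samples that fit in budget_gib (hypothetical helper)."""
    # Bytes written per second across all series at this scrape interval.
    bytes_per_second = bytes_per_sample * metrics * hosts / interval_seconds
    seconds = budget_gib * GIB / bytes_per_second
    return seconds / (365 * 86400)  # 365-day years, as in the estimates above

# Hourly samples in the space the server uses today (assumed to be 21 GiB).
print(round(retention_years(21, 3600), 1))  # about 20.3 years
```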
Let's see 15-minute increments:
{{{
> 365d×1.3byte/(15min)×1300×75 to Gibyte
(((365 * day) * (1.3 * byte)) / (15 * minute)) * 1300 * 75 = approx.
4.1363016 gibibytes
}}}
Still very reasonable! And a 5-minute frequency will, of course, give us:
{{{
> 365d×1.3byte/(5min)×1300×75 to Gibyte
(((365 * day) * (1.3 * byte)) / (5 * minute)) * 1300 * 75 = approx.
12.408905 gibibytes
}}}
So, basically, we have this:
|| Frequency || Retention period || Storage used ||
|| 15 second|| 30 days|| 20 GiB||
|| 5 min|| 10 year|| 120 GiB||
|| 5 min|| 5 year|| 60 GiB||
|| 5 min|| 1 year|| 12 GiB||
|| 15 min|| 10 year|| 40 GiB||
|| 15 min|| 5 year|| 20 GiB||
|| 15 min|| 1 year|| 4 GiB||
|| 1 hour|| 10 year|| 10 GiB||
|| 1 hour|| 5 year|| 5 GiB||
|| 1 hour|| 1 year|| 1 GiB||
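The long-term rows of that table can be regenerated with a short loop, which makes it easy to try other frequencies or retention periods. This is a sketch under the same assumptions (1300 metrics, 75 hosts, 1.3 bytes per sample, 365-day years). The names `INTERVALS` and `storage_gib` are made up for illustration, and the output is not rounded as aggressively as the table.

```python
GIB = 2 ** 30  # bytes in one gibibyte

# Scrape intervals from the table, in seconds.
INTERVALS = {"5 min": 300, "15 min": 900, "1 hour": 3600}

def storage_gib(years, interval_seconds, metrics=1300, hosts=75,
                bytes_per_sample=1.3):
    """Estimated storage in GiB for a retention period given in years."""
    samples_per_series = years * 365 * 86400 / interval_seconds
    return samples_per_series * bytes_per_sample * metrics * hosts / GIB

# Print rows in the same wiki table syntax as above.
for label, seconds in INTERVALS.items():
    for years in (10, 5, 1):
        print(f"|| {label}|| {years} year|| {storage_gib(years, seconds):.1f} GiB||")
```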
So how long do we want to keep that stuff anyway? I like the 15-minute,
5-year plan, personally (20 GiB), although I *also* like the idea of just
shoving in samples every 5 minutes like we were doing with Munin, which
gives us 12 GiB per year, or 60 GiB over five years...
Thoughts?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31244#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online