Re: [tor-bugs] #4407 [Metrics Website]: Create a basic monitoring infrastructure for large scale events
#4407: Create a basic monitoring infrastructure for large scale events
-----------------------------+----------------------------------------------
Reporter: atagar | Owner: karsten
Type: task | Status: new
Priority: normal | Milestone:
Component: Metrics Website | Version:
Keywords: | Parent:
Points: | Actualpoints:
-----------------------------+----------------------------------------------
Comment(by karsten):
Replying to [comment:2 atagar]:
> > My suggestion is to continue using a Java application that gets
> > executed once per hour by cron.
>
> The first question that comes to mind is: do we need monitors to have
> historical data? This was the reason I avoided the metrics codebase for
> my consensus tracker script. Once you add its Java and DB prereqs, the
> installation and complexity of the system get much worse with, I think,
> little benefit.
We don't need historical data for the monitoring infrastructure. Or
rather, we'll want to keep our own state files, but we don't really need
to have access to past descriptors. I agree with you that the monitoring
infrastructure should be independent of the metrics database.
I came to a similar conclusion a few weeks ago, but for a slightly
different reason. We had a single cronjob to download descriptors, import
them into the metrics database, and run the consensus-health script. This
approach turned out to be terribly error-prone. Whenever the database
import got stuck, the download stopped and the consensus-health script
didn't work anymore. That's why I made the consensus-health script a
separate component that is independent of the metrics database.
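As a minimal sketch of what "own state files, no metrics database" could
look like in practice: the monitor remembers only a few values from its
previous run in a small JSON file. The file layout and the names
load_state and save_state are assumptions made for illustration, not
part of any existing metrics code.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the state saved by the previous run, or {} on the first run."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_state(path, state):
    """Write the state to a temporary file, then rename it into place.

    The rename keeps a run that dies halfway from leaving a
    half-written state file behind for the next run.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)
```

An hourly cron run would call load_state, compare the saved values with
the current consensus, raise alarms if needed, and then call save_state.
No past descriptors and no database are involved.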
But hey, Java is not a prereq, it's a programming language. Whether we
require a certain JVM and Java libraries or a certain Python version and
Python APIs makes no difference. Well, besides the personal developer
preferences that have an influence on development speed.
> What I'd like to see is for the alarm infrastructure to use a metrics
> service API, but itself be a separate and distinct component.
I like the idea of such a metrics service API. For way too many months
I've had a TODO list item to extract the common parts of metrics-web
and metrics-db that handle relay descriptors and put them in a separate
API. In the meantime, ExoneraTor copies that code, the consensus-health
script copies it, the extra-info descriptor health script would copy it,
and the monitoring infrastructure is going to copy it, too. Let's finally
make an API. I'm going to open a ticket today once I have a rough idea
of what the API could look like. I'll post the ticket number here.
> That said, this decision is really up to whoever codes it. If it's
> something like the above then I'd be happy to mentor, and if it's an
> expansion of the metrics codebase then I guess the ball's in your court.
> If no one gets to it first then I might hack on it later as a client for
> stem.
We could also discuss what the API is supposed to do, and then implement
it both in Java and Python. There are a few Java metrics programs that
would make use of it, and I think you have a few Python applications which
would use it, too.
> > But I already know 1 person who won't like that suggestion. ;)
>
> Bold accusation! Actually, if you'd proposed a Java project when I first
> joined the community I would have been all over it - I have far more Java
> development experience than Python.
Doh! ;)
> > We can implement trivial things like "we just lost more than 25% of
> > the relays in one hour." But what we really need is someone to sit down
> > with the descriptor archives and look at which changes are expected and
> > which changes would be unusual.
>
> Right. What I'd like to see first is alarms for when the sky is falling.
> After that it becomes a question of tuning and pattern matching which
> could then easily lead to interesting research projects - hint hint,
> researchy people. :)
Agreed. This research project might even turn out to be quite
interesting!
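For reference, the trivial "lost more than 25% of the relays in one
hour" check mentioned above could be sketched as follows. The function
name relay_drop_alarm and its parameters are hypothetical. The relay
counts are assumed to come from two consecutive hourly consensuses, and
the 25% threshold is only the example figure from this thread, not a
tuned value.

```python
def relay_drop_alarm(previous_count, current_count, threshold=0.25):
    """Return True if more than `threshold` of the relays disappeared.

    previous_count: number of relays in the previous hourly consensus
    current_count:  number of relays in the current consensus
    threshold:      fraction of lost relays above which we raise an alarm
    """
    if previous_count <= 0:
        # No usable baseline (e.g. first run), so there is nothing to compare.
        return False
    lost = previous_count - current_count
    return lost / previous_count > threshold


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    if relay_drop_alarm(2000, 1400):
        print("ALARM: lost more than 25% of the relays in one hour")
```

This only covers the "sky is falling" case. Deciding which smaller
changes are expected and which are unusual still needs the study of the
descriptor archives discussed above.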
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/4407#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs