
Re: [tor-bugs] #20380 [Metrics/CollecTor]: Expand INSTALL.md to a more complete operator's guide



#20380: Expand INSTALL.md to a more complete operator's guide
-------------------------------+---------------------------------
 Reporter:  karsten            |          Owner:
     Type:  enhancement        |         Status:  needs_review
 Priority:  Medium             |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+---------------------------------

Comment (by karsten):

 Thanks for the detailed feedback!  Please take a look at
 [https://gitweb.torproject.org/karsten/metrics-db.git/log/?h=task-20380 my
 updated task-20380 branch] for changes discussed below.

 Replying to [comment:10 iwakeh]:
 > Replying to [comment:9 karsten]:
 > > ...
 > > A few thoughts:
 > >
 > >  - When you say that closer monitoring will be needed when disk space
 drops below a given number, do you mean 200G or 20G or a different number?
 >
 > I was referring to the disk space available when starting,  i.e. very
 close to 150G and logging to the same disk requires more attention than a
 terabyte setup.  Hmm, but if that is confusing just discard it.

 Ah, now I understand.  Hmm, I think I'd rather pick a different number
 than 150G than go into more detail there.  After all, a CollecTor
 instance that doesn't download and serve the full tarball archive will
 need a lot less than 150G, and an instance that does serve tarballs might
 run out of disk space in a year or two even with 150G.  Let's just change
 it to 200G to have some more room to breathe.
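
 As a rough illustration of the kind of check an operator could script
 before starting, here is a minimal sketch.  It is not part of CollecTor;
 the function names are hypothetical, and the threshold is simply the 200G
 figure proposed above, taken as 200 * 10^9 bytes.

```python
import shutil

# Assumption: the 200G recommendation from the discussion above,
# interpreted as 200 * 10^9 bytes. Adjust to the instance's needs.
RECOMMENDED_BYTES = 200 * 10**9


def has_recommended_space(path, required=RECOMMENDED_BYTES):
    """Return True if the file system holding `path` has at least
    `required` bytes free."""
    return shutil.disk_usage(path).free >= required


def describe_space(path, required=RECOMMENDED_BYTES):
    """Return a one-line, human-readable summary of free space at `path`."""
    free = shutil.disk_usage(path).free
    status = "ok" if free >= required else "below recommendation"
    return "{}: {:.1f}G free ({})".format(path, free / 10**9, status)


if __name__ == "__main__":
    # "." stands in for the (hypothetical) CollecTor working directory.
    print(describe_space("."))
```

 An instance that does not serve the full tarball archive could pass a
 much smaller `required` value.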

 > >  - We shouldn't add new section headers easily.  The chosen section
 headers and even paragraphs in this document (will) have equivalents in
 the other operator's guides for other metrics tools.  If we want to add
 new sections, we'll also have to add those sections to the other manuals.
 The current sections are:
 > >
 > > {{{
 > > $ grep "^#" INSTALL.md
 > > # CollecTor Operator's Guide
 > > ## Setting up the host
 > > ## Setting up the service
 > > ## Maintaining the service
 > > }}}
 > >
 >
 > It's important to have a consistent structure, but it would be helpful
 for readers to have sub-headings, which are application dependent.
 Scrolling through a document with only generic headings when looking for
 particular information takes longer (of course, there is a search).
 > So, maybe keep the top level consistent and allow for application
 dependent headlines below?

 I admit that there could be more sections, though I have not yet given up
 on keeping them independent of the application.  I added a few more
 section headers.

 > >  - (continued) What other sections or even subsections should we
 include, and what instructions would go into those vs. the existing
 sections?
 >
 > I see two more sections.
 > * 'Planning the Service' contrasts those sections giving a to-do list.
 People running instances will have different needs that can be better
 covered this way.
 > * and even more important, a section 'Bootstrapping' or similar.  What
 data to download before a first run etc.  Again this is not a to-do list
 as it depends what data should be processed.

 Added the first but not yet the second.  Let me know if anything is still
 missing.

 > > > The idea behind my changes is that I think the service shouldn't be
 run from the unpacked tar
 > > > folder.  The tar contains a development environment, so the jar
 would disappear after 'ant clean' or changed etc.
 > > > The runtime directory should only contain files that are really
 necessary for the application or which were created by the application.
 > > > Hope this doesn't make the description too complicated.
 > >
 > > Yes, makes sense, let's change that.  There are still a few paths left
 where we refer to files in `collector-<version>/` and where we should tell
 the user to copy those files to the working directory and run them from
 there.  I can update those places.
 > >
 > > > I would also like to have even less description of tools from the OS,
 because such things should be decided by the operator.
 > >
 > > Which parts would that include?  The crontab, `@reboot`, `screen`,
 etc.?  Can you make a list?
 >
 > When we avoid mentioning any such tools and methods, we avoid getting
 out of date and stay platform independent.  People operating servers have
 their favorite tools for these tasks and know what to do when told
 >
 > * run this script every three days
 > * provide an http server for serving data and files in folders X, Y, Z.
 > * for continuous operation ensure start-up on reboot and
 > * monitoring of logs as well as running service is important
 > etc.
 >
 > CollecTor does not depend on Apache or crontab, only on the services
 they provide.  Even the suggested install of openjdk could be left
 out.  Also apt-get.  An attempt at a list:
 >
 > * apt-get
 > * apache2
 > * crontab
 > * gpg
 > * openjdk, only Java 7
 > * screen
 > * ...

 Agreed with almost all changes mentioned here, except for Apache.  I
 believe that CollecTor depends on Apache to put together its
 `header.html`, `footer.html`, and to create directory listings.  I haven't
 tried out other HTTP servers, but unless somebody has, I don't want to
 recommend an arbitrary HTTP server if what we really need is Apache.  (Note that
 this is different for Metrics, Onionoo, and ExoneraTor which can all work
 with any HTTP server that can forward requests to Tomcat/Jetty.)

 > Another thing would be to use `<OutPath>/recent` and similar instead of
 the default choices provided.  So, it is clear which option is referred
 to.

 Good idea.

 > The backup recommendation I would also leave out.  It depends on the
 setup and the kind of data collected.  Or, move it to 'Planning the
 service'?

 I'm not sure.  This seems like a question that new operators might have,
 though maybe not during the setup process when they're not yet certain
 that they will succeed.  That recommendation would probably benefit from a
 section header, so that people who don't care can skip it more easily.
 Changed.

 Please take another look.  Thanks!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20380#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs