
Re: [tor-bugs] #23243 [Metrics/Website]: write a spec for web-server-access log descriptors



#23243: write a spec for web-server-access log descriptors
-----------------------------+-----------------------------------
 Reporter:  iwakeh           |          Owner:  metrics-team
     Type:  enhancement      |         Status:  needs_information
 Priority:  Medium           |      Milestone:
Component:  Metrics/Website  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:                   |         Points:
 Reviewer:                   |        Sponsor:
-----------------------------+-----------------------------------

Comment (by karsten):

 Replying to [comment:34 iwakeh]:
 > There are two open questions:
 >
 > 1. Should it be mentioned in section 2 of the spec that log files come
 in directories named as the physical host, i.e.,
 meronense.torproject.org/metrics.torproject.org-access.log.20170707.log?

 Wait, there's no `.log` at the end of the file name. Example (from the
 server):

 `metrics.torproject.org-access.log-20170912.gz`

 Also note the `-` between `access.log` and the date.

 > 2. As already visible in 1.: the files are expected to have ending
 '.log' or '.log.bz2' or some other compression?
 >
 > Especially a clear answer for 2. is important for the implementation.

 I'd say the exact compression type is an implementation detail. See also
 the very last paragraph in the spec where we said: "Sanitized log files
 are typically compressed before publication. In particular the sorting
 step allows for highly efficient compression rates. We typically use XZ
 for compression, which is indicated by appending ".xz" to log file names,
 but this is subject to change." -- We could say something similar for logs
 that are provided to the sanitizer.

 How about we add a new first paragraph to Section 3.1 (Discarding
 non-matching files):

 """
 Log files are made available to the sanitizer in a separate directory per
 physical web server host. Log files are typically gz-compressed, which is
 indicated by appending ".gz" to log file names, but this is subject to
 change. Overall, the sanitizer expects log files to use the following path
 format:

 <physical-host>/<virtual-host>.torproject.org-access.log-YYYYMMDD[.gz]
 """
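
 As a rough illustration of that path format, here is a minimal Python
 sketch of how an implementation might match it. This is not the actual
 sanitizer code: the names `LOG_PATH_RE`, `LogPath` and `parse_log_path`
 are made up for this example, and the set of accepted compression
 suffixes (gz, xz, bz2) is an assumption, given that the compression type
 is subject to change.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Hypothetical pattern for:
#   <physical-host>/<virtual-host>.torproject.org-access.log-YYYYMMDD[.gz]
# Assumption: besides ".gz" we also tolerate ".xz" and ".bz2".
LOG_PATH_RE = re.compile(
    r"^(?P<physical_host>[^/]+)/"
    r"(?P<virtual_host>[^/]+)\.torproject\.org-access\.log-"
    r"(?P<date>\d{8})"
    r"(?:\.(?P<compression>gz|xz|bz2))?$"
)


class LogPath(NamedTuple):
    physical_host: str
    virtual_host: str
    log_date: date
    compression: Optional[str]  # None if the file is not compressed


def parse_log_path(path: str) -> Optional[LogPath]:
    """Return the parsed path components, or None for a non-matching path."""
    match = LOG_PATH_RE.match(path)
    if match is None:
        return None
    try:
        # Reject syntactically valid but impossible dates such as 20171340.
        log_date = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        return None
    return LogPath(
        match.group("physical_host"),
        match.group("virtual_host"),
        log_date,
        match.group("compression"),
    )
```

 With this sketch, a path using the server's file name from above would be
 accepted, whereas the file name from question 1 (with `.` before the date
 and a trailing `.log`) would be treated as non-matching.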

 And while we're at it, let's change
 "''<hostname>''.torproject.org-access.log-YYYYMMDD" in the last paragraph
 of Section 2 to "''<virtual-host>''.torproject.org-access.log-YYYYMMDD".

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23243#comment:35>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs