Re: [tor-bugs] #23243 [Metrics/Website]: write a spec for web-server-access log descriptors
#23243: write a spec for web-server-access log descriptors
-----------------------------+-----------------------------------
Reporter: iwakeh | Owner: metrics-team
Type: enhancement | Status: needs_information
Priority: Medium | Milestone:
Component: Metrics/Website | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------+-----------------------------------
Comment (by karsten):
Replying to [comment:34 iwakeh]:
> There are two open questions:
>
> 1. Should it be mentioned in section 2 of the spec that log files come
> in directories named as the physical host, i.e.,
> meronense.torproject.org/metrics.torproject.org-access.log.20170707.log?
Wait, there's no `.log` at the end of the file name. Example (from the
server):
`metrics.torproject.org-access.log-20170912.gz`
Also note the `-` between `access.log` and the date.
> 2. As already visible in 1.: the files are expected to have ending
> '.log' or '.log.bz2' or some other compression?
>
> Especially a clear answer for 2. is important for the implementation.
I'd say the exact compression type is an implementation detail. See also
the very last paragraph in the spec where we said: "Sanitized log files
are typically compressed before publication. In particular the sorting
step allows for highly efficient compression rates. We typically use XZ
for compression, which is indicated by appending ".xz" to log file names,
but this is subject to change." -- We could say something similar for logs
that are provided to the sanitizer.
How about we add a new first paragraph to Section 3.1 (Discarding
non-matching files):
"""
Log files are made available to the sanitizer in a separate directory per
physical web server host. Log files are typically gz-compressed, which is
indicated by appending ".gz" to log file names, but this is subject to
change. Overall, the sanitizer expects log files to use the following path
format:
<physical-host>/<virtual-host>.torproject.org-access.log-YYYYMMDD[.gz]
"""
And while we're at it, let's change
"''<hostname>''.torproject.org-access.log-YYYYMMDD" in the last paragraph of
Section 2 to "''<virtual-host>''.torproject.org-access.log-YYYYMMDD".
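
To make the proposed path format concrete, here is a minimal Python sketch of how an implementation might match it. This is only an illustration under assumptions, not the sanitizer's actual code: the names `LOG_PATH` and `parse_log_path` are made up for this example, and the choice to accept any single alphanumeric suffix after the date (rather than only ".gz") is one possible reading of "subject to change".

```python
import re
from datetime import datetime

# Hypothetical pattern for the path format
#   <physical-host>/<virtual-host>.torproject.org-access.log-YYYYMMDD[.gz]
# The compression suffix is optional and not limited to "gz", because the
# compression type is described as subject to change (an assumption here).
LOG_PATH = re.compile(
    r"^(?P<physical_host>[^/]+)/"
    r"(?P<virtual_host>[^/]+)\.torproject\.org-access\.log-"
    r"(?P<date>\d{8})"
    r"(?:\.(?P<compression>[A-Za-z0-9]+))?$"
)


def parse_log_path(path):
    """Return the parts of a matching log path, or None if it does not match."""
    match = LOG_PATH.match(path)
    if match is None:
        return None
    try:
        # Reject eight-digit strings that are not real calendar dates.
        date = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        return None
    return {
        "physical_host": match.group("physical_host"),
        "virtual_host": match.group("virtual_host"),
        "date": date,
        "compression": match.group("compression"),  # None if uncompressed
    }
```

With this sketch, the server example `meronense.torproject.org/metrics.torproject.org-access.log-20170912.gz` would be accepted, while the earlier variant ending in `.log.20170707.log` would be discarded as non-matching.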
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23243#comment:35>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online