On Fri, Apr 29, 2011 at 02:36:47PM +0200, tagnaq wrote:
> if I understand it correctly metrics-db does not fetch all
> descriptors[1] so the server-descriptor archives on metrics[2] does not
> contain all descriptors.

It's correct that metrics-db does not fetch non-referenced descriptors.
But it does parse the cached-descriptors[.new] files from gabelmoo, one of
the directory authorities.  So, the archives at least contain the
descriptors published to gabelmoo.  But there may be descriptors that are
published only to one or more of the other directory authorities, and
those would be missing from the metrics archive.  I can't say how many
descriptors we're talking about.
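
For a rough idea of how many descriptors in a cached-descriptors file are
not referenced by a given consensus, something like the following Python
sketch might help.  This is not what metrics-db does.  The function names
are made up for illustration.  It assumes that the fourth field of each
"r" line in a v3 consensus is the unpadded base64 descriptor digest, and
that the digest is the SHA-1 of the descriptor from "router " through the
newline after "router-signature".

```python
import base64
import hashlib


def referenced_digests(consensus_text):
    """Collect hex descriptor digests from the "r" lines of a consensus.

    Assumes: r <nickname> <identity> <digest> <date> <time> <IP> ...
    with <digest> as base64 without trailing "=" padding.
    """
    digests = set()
    for line in consensus_text.splitlines():
        parts = line.split(" ")
        if parts[0] == "r" and len(parts) >= 4:
            b64 = parts[3]
            b64 += "=" * (-len(b64) % 4)  # restore stripped padding
            digests.add(base64.b64decode(b64).hex())
    return digests


def cached_digests(cached_bytes):
    """Compute hex SHA-1 digests of all descriptors in a cached file.

    Annotation lines (starting with "@") and the signature block are
    skipped, because hashing runs from "router " to the end of the
    "router-signature" line only.
    """
    end_token = b"\nrouter-signature\n"
    digests = set()
    pos = 0
    while True:
        if cached_bytes.startswith(b"router ", pos):
            start = pos
        else:
            newline = cached_bytes.find(b"\nrouter ", pos)
            if newline < 0:
                break
            start = newline + 1
        sig = cached_bytes.find(end_token, start)
        if sig < 0:
            break
        end = sig + len(end_token)
        digests.add(hashlib.sha1(cached_bytes[start:end]).hexdigest())
        pos = end
    return digests


def unreferenced_digests(consensus_text, cached_bytes):
    """Digests present in the cached file but absent from the consensus."""
    return cached_digests(cached_bytes) - referenced_digests(consensus_text)
```

To use it, read the consensus as text and the cached-descriptors file in
binary mode, then pass both to unreferenced_digests().  The result only
says "not referenced by this one consensus"; a descriptor may well be
referenced by an earlier or later consensus, so a real count would need to
check a whole series of consensuses.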

See #2763 and #2282 for more discussion about this issue.  The current
status of these tasks is that we need to extend Tor to make the rejected
descriptors and those not contained in the consensus available via the
directory protocol.  Once that's done, I can extend metrics-db to archive
these descriptors, too.  I'd like to turn off rsyncing the
cached-descriptors* files from gabelmoo sooner rather than later.

> If my assumption is correct:
> Are there also archives that contain all descriptors? (referenced +
> unreferenced)
> Does the directory-archive script[3] archive/fetch all descriptors?

The directory-archive script downloads only referenced descriptors, too.


