[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors accross CollecTor instances
#18910: distributing descriptors accross CollecTor instances
-------------------------------+---------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: enhancement | Status: needs_review
Priority: High | Milestone: CollecTor 1.1.0
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+---------------------------------
Comment (by iwakeh):
Thanks for working through all that code so quickly!
Replying to [comment:56 karsten]:
> Alright! Please find [https://gitweb.torproject.org/karsten/metrics-
db.git/log/?h=task-18910-3 branch task-18910-3 in my repository]. I
created fixup/squash commits for most things I found. Please take a close
look and challenge anything you don't like to be squashed into your
commit. In particular, please take a look why the unit tests don't pass
anymore. ;)
Fine. I'll start testing after finishing this reply.
>
> More comments below:
>
> - I believe that index.json would include the `.tmp` files written
during sync, because we're only skipping files starting with `.`. And
that's not even a new issue, because the `.tmp` files written during
normal updating would also be included. The issue will only become more
prevalent. I'd say we adapt the indexer, which can happen in a separate
commit. Or we could change temporary files to start with `.`, but that
seems a bit more error-prone. Suggestion: '''fix'''
That's ok. Eventually the regular modules should use the classes in
o.t.c.persist for storing descriptors (that will reduce the line count in
them tremendously and finally unify writing descriptors). So, using the
"." prefix here now is fine.
>
> - As mentioned before somewhere, we're appending server/extra-info
descriptors to different files in the synchronization run than in the
earlier update run. The reason is that `SyncManager` creates its own
`collectionDate` in the constructor. However, we cannot just pass a
`Date` instance to both `SyncManager` and `ArchiveWriter`, because then
`SyncManager` would append to a file that we're already serving, in which
case the file size would change once we're done synchronizing, and we
should try to avoid that. Let's fix this as soon as the current updaters
are modules just like the synchronization module, and let's keep in mind
not to close and rename temporary files to their final names between
modules. Suggestion: '''postpone'''
Agreed.
>
> - As said earlier, let's release metrics-lib and use that new release
as basis for this patch. Then we won't have to import and instantiate
`org.torproject.descriptor.index.DescriptorIndexCollector` anymore, and we
can change defaults for the `*SyncOrigins` properties in
`collector.properties` back to `https://collector.torproject.org`.
Suggestion: '''change'''
Fine!
All these changes are reasons for metrics-lib release 1.4.1.
>
> - Regarding comment 25 above, I'm still a fan of using `printf` as
opposed to creating `SimpleDateFormat` instances and concatenating
strings. I also believe it would be more efficient. But I'm happy to
postpone that change. Suggestion: '''postpone'''
Yes, it might be good to think a little about that. The timestamp to
string functionality might also be a candidate for 'metrics-tools' (cf.
#20405), b/c it's used throughout Metrics' products.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:57>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs