[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors accross CollecTor instances



#18910: distributing descriptors accross CollecTor instances
-------------------------------+---------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_review
 Priority:  High               |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  ctip               |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+---------------------------------

Comment (by iwakeh):

 Thanks for working through all that code so quickly!

 Replying to [comment:56 karsten]:
 > Alright!  Please find [https://gitweb.torproject.org/karsten/metrics-
 db.git/log/?h=task-18910-3 branch task-18910-3 in my repository].  I
 created fixup/squash commits for most things I found.  Please take a close
 look and challenge anything you don't like to be squashed into your
 commit.  In particular, please take a look why the unit tests don't pass
 anymore. ;)

 Fine. I'll start testing after finishing this reply.

 >
 > More comments below:
 >
 >  - I believe that index.json would include the `.tmp` files written
 during sync, because we're only skipping files starting with `.`.  And
 that's not even a new issue, because the `.tmp` files written during
 normal updating would also be included.  The issue will only become more
 prevalent.  I'd say we adapt the indexer, which can happen in a separate
 commit.  Or we could change temporary files to start with `.`, but that
 seems a bit more error-prone.  Suggestion: '''fix'''

 That's ok.  Eventually the regular modules should use the classes in
 o.t.c.persist for storing descriptors (that will reduce the line count in
 them tremendously and finally unify writing descriptors).  So, using the
 "." prefix here now is fine.

 >
 >  - As mentioned before somewhere, we're appending server/extra-info
 descriptors to different files in the synchronization run than in the
 earlier update run.  The reason is that `SyncManager` creates its own
 `collectionDate` in the constructor.  However, we cannot just pass a
 `Date` instance to both `SyncManager` and `ArchiveWriter`, because then
 `SyncManager` would append to a file that we're already serving, in which
 case the file size would change once we're done synchronizing, and we
 should try to avoid that.  Let's fix this as soon as the current updaters
 are modules just like the synchronization module, and let's keep in mind
 not to close and rename temporary files to their final names between
 modules.  Suggestion: '''postpone'''

 Agreed.

 >
 >  - As said earlier, let's release metrics-lib and use that new release
 as basis for this patch.  Then we won't have to import and instantiate
 `org.torproject.descriptor.index.DescriptorIndexCollector` anymore, and we
 can change defaults for the `*SyncOrigins` properties in
 `collector.properties` back to `https://collector.torproject.org`.
 Suggestion: '''change'''

 Fine!
 All these changes are reasons for metrics-lib release 1.4.1.

 >
 >  - Regarding comment 25 above, I'm still a fan of using `printf` as
 opposed to creating `SimpleDateFormat` instances and concatenating
 strings.  I also believe it would be more efficient.  But I'm happy to
 postpone that change.  Suggestion: '''postpone'''

 Yes, it might be good to think a little about that.  The timestamp to
 string functionality might also be a candidate for 'metrics-tools' (cf.
 #20405), b/c it's used throughout Metrics' products.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:57>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs