Re: [tor-bugs] #19170 [Metrics/CollecTor]: make parsing more robust (extra-info)
#19170: make parsing more robust (extra-info)
-------------------------------+--------------------------
Reporter: iwakeh | Owner: iwakeh
Type: defect | Status: accepted
Priority: Medium | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+--------------------------
Comment (by karsten):
Replying to [comment:9 iwakeh]:
> Replying to [comment:8 atagar]:
> > > Another question that would need to be investigated: how will
> > > CollecTor clients deal with the additional non-compliant data?
> >
> > This seems an odd question. CollecTor serves tarballs of the published
> > descriptor data. If the authorities publish it then CollecTor should
> > provide it, malformed or not.
There may be special cases where that statement doesn't hold, but in
general, I agree that CollecTor should aim for providing all data that the
directory authorities published.
> Of course, clients should deal with the data as it is collected.
> Currently, they are 'shielded' from non-conformant extra-info descriptors,
> b/c CollecTor drops them. After the change some might trip over that newly
> available data. I intended to find out what the change would trigger, for
> example what additional work we'd have with clients like Onionoo etc.
Onionoo et al. shouldn't be affected, because they're using metrics-lib to
parse descriptors which shields them from malformed descriptors. Other
clients not using metrics-lib might be affected, but those clients would
also break when parsing Tor data directly, so I don't think that we have
to take special care there.
Regarding the `LenientParser` idea, I wonder whether we should just skip
the metrics-lib check to see whether we can parse a descriptor before
writing it to disk. See `ArchiveWriter#store()`. At that point we
already parsed all relevant fields that we need for storing the descriptor
without using metrics-lib, and that check is only there to make sure that
metrics-lib will be able to parse the descriptor later. But if we want to
take that check out, which I think we should, then let's just change that
code to print out an informational log statement and store the file
anyway. What do you think?
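
To make the suggestion concrete, here is a rough Java sketch of "log at
info level and store anyway". It is not the actual `ArchiveWriter#store()`
code: the class name `LenientStoreSketch`, the method signature, and the
`parseCheck` predicate (a stand-in for the metrics-lib check) are all
made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;
import java.util.logging.Logger;

/** Hypothetical sketch only; not CollecTor's real ArchiveWriter. */
final class LenientStoreSketch {

  private static final Logger LOG =
      Logger.getLogger(LenientStoreSketch.class.getName());

  /**
   * Writes the raw descriptor bytes to the target path whether or not
   * the parse check succeeds.  The parseCheck argument is an assumed
   * stand-in for "can metrics-lib parse this descriptor?".
   *
   * @return the outcome of the parse check, so that callers could
   *     count descriptors that were stored despite failing it.
   */
  static boolean store(byte[] rawDescriptor, Path target,
      Predicate<byte[]> parseCheck) throws IOException {
    boolean parseable;
    try {
      parseable = parseCheck.test(rawDescriptor);
    } catch (RuntimeException e) {
      // A throwing check is treated like a failed check.
      parseable = false;
    }
    if (!parseable) {
      // Informational only; the descriptor is no longer dropped.
      LOG.info("Descriptor for " + target
          + " did not pass the parse check; storing it anyway.");
    }
    Path parent = target.getParent();
    if (parent != null) {
      Files.createDirectories(parent);
    }
    Files.write(target, rawDescriptor);
    return parseable;
  }
}
```

The only behavioral difference from a strict variant is that a failed
check no longer prevents the write; returning the check result is an
extra that the real code may not need.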
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/19170#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online