[tor-bugs] #8815 [Stem]: Stem's DescriptorReader should handle relative paths in processed files when given a target with a relative path
#8815: Stem's DescriptorReader should handle relative paths in processed files
when given a target with a relative path
--------------------+-------------------------------------------------------
Reporter: wfn | Owner: atagar
Type: defect | Status: new
Priority: minor | Milestone:
Component: Stem | Version:
Keywords: | Parent:
Points: | Actualpoints:
--------------------+-------------------------------------------------------
A bugfix for DescriptorReader._handle_file() for the case where one or
more of the target descriptor directories is given as a relative path. We
need to make sure the target is converted to an absolute path before
comparing it to the (always absolute) paths in _processed_files. Please
find the linked commit and attached git diff.
A (probably unnecessarily) longer explanation: suppose
stem.descriptor.reader.DescriptorReader is initialized with a relative
path for a target, e.g.:
{{{
from stem.descriptor.reader import DescriptorReader
reader = DescriptorReader(['server-descriptors'],
                          persistence_path='./used_desc')
}}}
The DescriptorReader._handle_file() method (which is used when the reader
is accessed as an iterator, etc.) will effectively ignore the loaded
_processed_files: the lookup key for a given file ('target', which will be
a relative path) never matches the keys in the processed files dictionary
('_processed_files', where the paths are always absolute). The lookup in
question is in stem/descriptor/reader.py, line 462, which attempts to get
the 'previously last used' timestamp for a given target file:
{{{
last_used = self._processed_files.get(target)
}}}
Here, 'target' would in our example be something of the following kind:
'server-descriptors/402619c25024fb360f88992437242b8938b99e5d'
However in _processed_files (and in the 'used_desc' file), the
corresponding key would be e.g.
'/home/kostas/priv/tordev/data/recent/relay-descriptors/server-descriptors/402619c25024fb360f88992437242b8938b99e5d'
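The mismatch can be reproduced without Stem. The following is only a
minimal sketch using a plain dict as a stand-in for _processed_files; the
names 'processed_files', 'relative_target', 'miss' and 'hit', as well as
the timestamp value, are made up for illustration:

```python
import os

# Relative path of the kind _handle_file() receives in the example above.
relative_target = os.path.join(
    'server-descriptors', '402619c25024fb360f88992437242b8938b99e5d')

# Hypothetical stand-in for _processed_files: keys are absolute paths,
# values are timestamps (this value is invented for the example).
processed_files = {os.path.abspath(relative_target): 1367000000}

# Lookup with the relative path misses, so the file looks unprocessed.
miss = processed_files.get(relative_target)

# Lookup with the normalized (absolute) path finds the stored timestamp.
hit = processed_files.get(os.path.abspath(relative_target))

print(miss, hit)
```

Since the relative string is never equal to its absolute form, the first
lookup always returns None, whatever the current working directory is.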
We need to make 'target' always be an absolute path to avoid this kind of
issue. The same applies to 'new_processed_files', which is used when the
iterator is called again (e.g. when we re-iterate over the reader to see
if anything new has come up): it should also store absolute paths.
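Roughly, the idea is the following. This is a simplified sketch, not the
actual code from the linked commit or from Stem's _handle_file(); the
function name 'should_skip' and its arguments are hypothetical:

```python
import os


def should_skip(target, processed_files, new_processed_files):
    """Hypothetical helper: decide whether 'target' was already processed.

    'processed_files' maps absolute paths to the timestamp of the last
    read; 'new_processed_files' collects entries for the next iteration.
    """
    # Normalize first, so relative and absolute targets use the same key.
    target = os.path.abspath(target)

    last_modified = int(os.stat(target).st_mtime)
    last_used = processed_files.get(target)

    # Always record the absolute path for the next pass.
    new_processed_files[target] = last_modified

    # Skip only if we have seen the file and it has not changed since.
    return last_used is not None and last_used >= last_modified
```

With this normalization in place, calling should_skip() with
'server-descriptors/...' and with the corresponding absolute path hits the
same dictionary entry.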
Here is a link to a commit that makes sure the relevant paths are always
absolute:
https://github.com/wfn/stem/commit/18a92836fac436b7fdd7f5d3ab10786f55b82c99
I ran the Stem unit tests, including those for reader.py, and all pass.
Please also find attached a sample script which exercises this
functionality by supplying a relative path to DescriptorReader. (I
rsync'd 'relay-descriptors' in 'recent' for my Stem experiments.) See the
attached sample_output.txt for its output.
I'm also attaching the git diff output (git diff
1773ebaab470206653ce6d84c3ef1276f81c5d0a, the last commit in
git.torproject.org/stem.git) just in case.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8815>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online