On Tue, Oct 02, 2007 at 05:36:02PM +0800, Li-Hui Zhou wrote: > I've noticed the bugfix and yeah, it's been a LOT better. > > Old bug? Why it not been triggered until recent svn? Short version: There was an optimization that worked around it, but that optimization no longer applied. Long version: The format of a "router list," as we parsed it used to be: (ExtraInfo | RouterDesc)* In other words, it had any number of extrainfo and routerdesc documents, possibly scrambled up. The old code did: - If we start with the word extra-info, it's an extra-info and we're done. - If we start with the word router, it's a routerdesc and we're done. - Otherwise, scan for the first instance of the word router and scan for the first instance of the word extra-info. Whichever comes first is the next document. This was fine until we added annotations around r11680. The format became: (Annotation* (ExtraInfo | RouterDesc))* Now, the first two cases no longer applied when there were annotations, since the point where we were looking never started with the word router or the word extra-info, so we always did case 3. But in a list like the cached-descriptors list that has no extra-info documents, we wound up scanning the entire list looking for an extra-info that never existed, and we did this scan for every router in the cache. That's O(n^2) in cache size, and that's no good. For the fix, see the patch. :) yrs, -- Nick Mathewson
Attachment:
pgpMIRmleQdHf.pgp
Description: PGP signature