> On 16 Apr 2016, at 05:47, David Goulet <dgoulet@xxxxxxxxx> wrote:
>
> Hi!
>
> (For the following, I'm only talking about the HS directory.)
>
> Here is my conundrum. I was working on plugging the new cache for
> proposal 224 into our OOM handling, and I stumbled upon a fun problem.
>
> We plan to run both the current and the new HS protocol in parallel.
> This means two caches, one for version 2 (current) and one for
> version 3 (prop224). So far so good.
>
> However, here is the problem. Right now, our OOM handler is called in
> cell_queues_check_size(). The heuristic is basically:
>
>   /* If we're spending over 20% of the memory limit on hidden service
>    * descriptors, free them until we're down to 10%. */
>
> Considering we now have _two_ caches, we somehow have to clean up both
> of them "evenly": if we have to remove 20kb, we either remove 10kb
> from each (if possible), or we clean them by entry expiry time,
> regardless of version.
>
> I think the latter is what we want, but it complicates things a bit,
> since the two caches contain _different_ data structures indexed with
> different keys (DIGEST vs DIGEST256). One option is to go over all
> entries in both caches, create temporary objects, and put them in a
> list sorted by expiry time. Each "temp object" would contain: 1) the
> cache version, 2) the expiry time, and 3) the index key. Once done, we
> go over that list and delete entries until we reach the number of
> bytes we need to remove.
>
> This could work, since we would only do it when we are at 20% memory,
> but it still requires allocating some small objects in order to
> _clear_ memory... Does someone have something _better_ to propose? :)
>
> (Keep in mind that minimal change to the current HS code is
> preferable. I know that unifying the two caches would help, but that's
> a lot more work! And the new cache for prop224 will support
> versioning, which our current one doesn't; it's baked for v2.)
>
> I'm asking because this will require some refactoring, and I want to
> make sure we agree on how to proceed so it doesn't end up being the
> wrong solution.
>
> Basically, my goal is to move the OOM handling for both v2 and v3 into
> one single function, hs_cache_handle_oom(), which will do the
> gymnastics described above. After that, we'll stop calling
> rend_cache_clean_v2_descs_as_dir() explicitly in
> cell_queues_check_size().

Having code that is only executed under rare conditions is error-prone,
and allocating memory at OOM time is error-prone.

Instead, we could add cache entries to the unified, time-sorted list
when they are created, rather than building it at OOM time. This also
has the benefit of being faster than creating and sorting a large list
at OOM time.

Alternately, we could iterate through each cache like this:

  Find ta, the oldest expiry time in Cache A
  Find tb, the oldest expiry time in Cache B
  Set Cache X to the cache with the oldest descriptor
  Set tx to the minimum of ta and tb
  Set ty to the maximum of ta and tb
  While tx <= ty and we are above the desired memory limit:
    Deallocate the oldest descriptor from Cache X
    Set tx to the oldest expiry time in Cache X
  If we are still above the desired memory limit, repeat this entire
  process

We should add entries to a time-sorted list when they are created;
otherwise, this algorithm is O(n^2), because each "find the oldest
expiry time" becomes a full scan.
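A rough C sketch of that loop, under some assumptions: none of these
helpers exist in tor today; hs_cache_oldest_expiry() and
hs_cache_evict_oldest() are hypothetical functions that each cache
would implement for its own entry type, and hs_cache_handle_oom() is
the unified entry point David proposed. For brevity it re-picks the
global minimum on every eviction instead of using the tx/ty inner loop;
the eviction order is the same.

  #include <stddef.h>
  #include <time.h>

  /* Hypothetical cache type and helpers. Each cache reports its oldest
   * entry's expiry time ((time_t)-1 if empty) and can evict that
   * entry, returning the number of bytes freed. With a time-sorted
   * entry list maintained at creation time, both are O(1); otherwise
   * each call is an O(n) scan and the loop below is O(n^2). */
  typedef struct hs_cache_t hs_cache_t;
  time_t hs_cache_oldest_expiry(const hs_cache_t *cache);
  size_t hs_cache_evict_oldest(hs_cache_t *cache);

  /* Evict the oldest descriptor from whichever cache holds it, until
   * at least bytes_to_remove bytes have been freed or both caches are
   * empty. Returns the number of bytes actually freed. */
  size_t
  hs_cache_handle_oom(hs_cache_t *cache_v2, hs_cache_t *cache_v3,
                      size_t bytes_to_remove)
  {
    size_t freed = 0;
    while (freed < bytes_to_remove) {
      time_t t2 = hs_cache_oldest_expiry(cache_v2);
      time_t t3 = hs_cache_oldest_expiry(cache_v3);
      if (t2 == (time_t)-1 && t3 == (time_t)-1)
        break; /* Both caches are empty; nothing left to free. */
      /* Evict from whichever cache holds the older descriptor. */
      if (t3 == (time_t)-1 || (t2 != (time_t)-1 && t2 <= t3))
        freed += hs_cache_evict_oldest(cache_v2);
      else
        freed += hs_cache_evict_oldest(cache_v3);
    }
    return freed;
  }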
A third alternative is to iterate through each time period:

  Set K to the oldest expected descriptor age in hours, minus 1 hour
  Deallocate all entries from Cache A that are older than K hours
  Deallocate all entries from Cache B that are older than K hours
  Set K to K - 1 and repeat, until we are under the desired memory
  limit or K reaches 0

This algorithm is O(Kn), which is fine as long as K is small. It
carries a slight risk of over-deallocating cache entries, which is
acceptable at OOM time.

I like this one, because it's simple, performant, and doesn't need any
extra memory allocations.
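A minimal sketch of that sweep, under the same assumptions (the cache
type and hs_cache_free_all_before() are hypothetical; each cache would
implement the latter for its own entry type and key size):

  #include <stddef.h>
  #include <time.h>

  typedef struct hs_cache_t hs_cache_t;

  /* Hypothetical helper: free every entry in the cache created before
   * 'cutoff', returning the number of bytes freed. */
  size_t hs_cache_free_all_before(hs_cache_t *cache, time_t cutoff);

  /* Sweep both caches one hour at a time, oldest entries first, until
   * enough memory has been freed. Works at whole-hour granularity, so
   * it may free slightly more than needed -- acceptable at OOM time. */
  size_t
  hs_cache_handle_oom(hs_cache_t *cache_v2, hs_cache_t *cache_v3,
                      time_t now, int max_age_hours,
                      size_t bytes_to_remove)
  {
    size_t freed = 0;
    int k;
    for (k = max_age_hours - 1;
         k >= 0 && freed < bytes_to_remove;
         k--) {
      /* Entries created before this cutoff are older than k hours. */
      time_t cutoff = now - ((time_t)k * 3600);
      freed += hs_cache_free_all_before(cache_v2, cutoff);
      freed += hs_cache_free_all_before(cache_v3, cutoff);
    }
    return freed;
  }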
Tim

Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP 968F094B
ricochet:ekmygaiu4rzgsk6n