Hi Nick, On 21 Mar 2020, at 05:38, Nick Mathewson <nickm@xxxxxxxxxxxxxx> wrote:
There's a shorter encoding for Raw: for each [i, pos1, pos2] in index_ranges: w = pos2 - pos1 j = the index of pos1 among all sorted pos1s new_encoding = [i, j, w] [i, j, w] is an efficient encoding if index_ranges is sparse compared with ENDIVERouterData, because: * j has the same cardinality as I * w is smaller than pos1 and pos2 If index_ranges is dense, there may be a more efficient encoding: add missing i with weight 0 drop j With this encoding, you can drop a few of the constraints. There's a faster calculation and more efficient encoding for Weighted, because there's a common factor in each POS() calculation: (1 << 32) / total_weight If we remove that factor, we end up with a much simpler algorithm, that's also more flexible: Algorithm: Expanding a "Weighted" indexspec. Each weighted indexspec also has a multiplier, which may vary between indexspecs. Let total_weight = SUM(indexspec.index_weights) Verify total_weight * multiplier <= UINT64_MAX. Let total_so_far = 0. Let result_idx = {} (an empty mapping). Define POS(b) = b * multiplier For 0 <= i < LEN(indexspec.indexweights): Let w = indexspec.indexweights[i]. Let lo = POS(total_so_far). Let total_so_far = total_so_far + w. Let hi = POS(total_so_far) - 1. Append (lo, hi) => i to result_idx. Return result_idx. If multiplier is large, then the weights can be quite small. (Small weights sacrifice some accuracy.) And if the multiplier is 1, you effectively have a "Raw" index. If you make those changes, you should be able to use a similar process to expand all the different index types. (After multiplying, truncating, or hashing, you either end up with a delta, or an absolute position. You can turn deltas into absolute positions, and then feed them all into the same base algorithm.) There are also a few things I think might be bugs: Was there meant to be a constraint that the Weighted total is UINT64_MAX? Or close to UINT64_MAX? The fixed parameters don't make much sense otherwise. I think v2 and v3 onion services assign descriptors to the next highest HSDir RSA id (or ed25519 hash). But INDEX_FROM_RING_KEYS() uses the relay input as the lowest value. There is no next member for the last member in INDEX_FROM_RING_KEYS(), but the code asks for one. (Perhaps there are some typos here that make the code hard to understand.) We'll need special treatment for ring wrapping. (That is, 0xFF… is not a real upper limit, it actually means, "use the lowest relay".) It's weird to call a middle value a "suffix", but "infix" is also a bit of an unusual word. There are also a bunch of typos, let me know when you're ready for copy-editing. T |
_______________________________________________ tor-dev mailing list tor-dev@xxxxxxxxxxxxxxxxxxxx https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev