[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: Proposal: More robust consensus voting with diverse authority sets
On Tue, Apr 29, 2008 at 04:14:08PM +0200, Peter Palfrader wrote:
> On Wed, 02 Apr 2008, Nick Mathewson wrote:
> 
> >                                    Is there a simpler approach that
> > does what we want, or do we really need to do the NP-hard thing?
> 
> Did one occur to you yet?  If not, should we go forward with this
> proposal?
All I can think of is providing explicit "flag day" options in the
authority configuration: something to the effect of "before May 1, the
authority list is X; after May 1, the authority list is Y."  This
isn't as powerful as proposal 134 though, and I think it might be less
good than we need.
I decided that I'd eel better about the NP-hard thing if we codde up
the sample C implementation and found out that it scaled well to a
large number of authorities.  ("Scaling well" here means something
like "not replacing RSA as our biggest time-sink in voting", and
"large number of authorities" meaning something far larger than
anything we're hoping to do now.)
To test this, I hacked up a quick and dirty recursive max-clique
implemention in python, to get a handle on scaling issues and
efficiency concerns.  With a little tweaking and a few standard AI
search hacks, I got something that performed fine up to even 50
authorities (which is more than we ever plan to do).  I've attached
the code, in case anybody wants to adopt the approach: basically, it's
a greedy depth-first search with pruning.
So, afaik, the sole remaining big problem in this proposal is
migration.  IIUC, the current consensus-method thing won't work, for
reasons explained in an earlier message of mine on this thread:
   >The 'consensus method' thing is only meant to determine which
   >algorithm the voting authorities use to produce the consensus from the
   >votes, not which authorities' votes count.
Once we solve this, I think building proposal 134 would be a fine
thing to do.
peace,
-- 
Nick
import time
import random
random.seed()
# So, here's the algorithm, *APPROXIMATELY*:
#
#  To find the biggest clique of some graph:
#     Find the biggest clique of that graph that has size greater than 0.
#
#  To find the biggest clique in some graph with size greater than N:
#     If there are N or fewer points in the graph, return no clique.
#     Else, if there are 3 or fewer points in the graph, manually
#        find the biggest clique.
#     Else:
#        Let P0 = the most connected point in the graph.
#        Let G' = the subgraph of all points connected to P0.
#        Recursively find C, the biggest clique in G' of size greater than N-1.
#        If Size(C) == Size(G'), return C.
#        Let Best = the larger of N and Size(C)+1.
#        Let G'' = the subgraph of all points in the graph other than P0.
#        Recursively find C', the biggest clique in G'' of size greater
#           than Best.
#        Return the larger of C and C'.
#
def maxp(points, edges, onlyIfBetterThan):
    """given a list of points and a symmetric set of (point,point) pairs,
       return the list of points in the largest clique of points in
       'points' such that for all distint p1,p2 in the result, (p1,p2)
       is in edges.  If no clique has more than onlyIfBetterThan elements,
       possibly return an empty list."""
    nPoints = len(points)
    if onlyIfBetterThan >= nPoints:
        # There's no way we can get an answer with more than onlyIfBetterThan
        # points without having more than that number of points to work with.
        return []
    if nPoints == 3:
        # Only 3 points? Stop recursing
        p0,p1,p2 = points
        if (p0,p1) in edges:
            if (p0,p2) in edges and (p1,p2) in edges:
                return points
            else:
                return points[0:2]
        elif (p1,p2) in edges:
            return points[1:]
        elif (p0,p2) in edges:
            return [ p0, p2 ]
        else:
            return [ p0 ]
    elif nPoints > 3:
        # First, find o2, the biggest clique that contains point 0.
        # Only consider points that are directly connected to point 0,
        # obviously.
        p0 = points[0]
        pRest = [p for p in points[1:] if (p0,p) in edges]
        o2 = maxp(pRest, edges, onlyIfBetterThan-1)
        o2.append(p0)
        if len(o2) == nPoints:
            # o2 is the same size as the entire subgraph under consideration;
            # we can't do better than that.
            return o2
        # Now find the biggest clique that _doesn't_ contain point 0.
        # We just found o2, so we don't need to consider any result
        # with size <= len(o2).
        onlyIfBetterThan = max(len(o2), onlyIfBetterThan)
        o1 = maxp(points[1:], edges, onlyIfBetterThan)
        # Return whichever of o1 or o2 was the largest.
        if len(o1) > len(o2):
            if len(o2) < onlyIfBetterThan:
                return []
            return o1
        else:
            if len(o2) < onlyIfBetterThan:
                return []
            return o2
    # The rest of this function is just corner cases for small
    # numbers of points.
    elif nPoints == 0:
        return points
    elif nPoints == 1:
        return points
    elif nPoints == 2:
        if (points[0], points[1]) in edges:
            return points
        else:
            return [ points[0] ]
def maxclique(aset):
    # make that aset is symmetric: delete asymmetric bits.
    for a,b in list(aset):
        if (b,a) not in aset:
            del aset[a,b]
    count = {}
    for k,k2 in aset:
        try:
            count[k] += 1
        except KeyError:
            count[k] = 1
    points = [p for c, p in sorted((c,p) for p,c in count.iteritems())]
    return maxp(points, aset, 0)
def randGraph(nP, nE):
    points = range(1,nP+1)
    edges = set()
    if nE > (nP*(nP-1))/2:
        nE = nP*(nP-1)/2
    for _ in xrange(nE):
        while 1:
            a = random.choice(points)
            b = random.choice(points)
            if a != b and (a,b) not in edges:
                edges.add((a,b))
                edges.add((b,a))
                break
    return edges
def bench(nodes,edges,iters):
    e = randGraph(nodes, edges)
    t = time.time()
    for _ in xrange(iters):
        maxclique(e)
    t2 = time.time()
    print "%s nodes, %s edges: %s msec" % (nodes,edges,((t2-t)/iters)*1000)
if __name__ == '__main__':
    print bench(50,800,30)