
Re: [tor-dev] [RFC] Proposal: A First Take at PoW Over Introduction Circuits



On 9/22/20 07:10, George Kadianakis wrote:
> George Kadianakis <desnacked@xxxxxxxxxx> writes:
>
>> tevador <tevador@xxxxxxxxx> writes:
>>
>>> Hi all,
>>>
> Hello,
>
> I have pushed another update to the PoW proposal here:
>   https://github.com/asn-d6/torspec/tree/pow-over-intro
> I also (finally) merged it upstream to torspec as proposal #327:
>   https://github.com/torproject/torspec/blob/master/proposals/327-pow-over-intro.txt
>
> The most important improvements are:
> - Add tevador as an author.
> - Update PoW algorithms based on tevador's Equix feedback.
> - Update effort estimation algorithm based on tevador's simulation.
> - Include hybrid attack section.
> - Remove a bunch of blocker tags.
>
> Two things I'd like to work more on:
>
> - I'd like people to take tevador's Equix PoW function and run it on
>   their boxes and post back benchmarks of how it performed.

I shared some results privately with George and he suggested including
the list. Results below.

> Particularly
>   so if you have a GPU-enabled box, so that we can get some benchmarks
>   from GPUs as well. That will help us tune the proposal even more.

For anyone else following along or also contributing benchmarks, George
clarified for me that the equix benchmark isn't capable of utilizing the
GPU.

My results:

The first results are from my W530 laptop (i7, 4 cores, hyperthreaded to
8), with moderate activity in the background.

I stumbled across a weird artifact when using more threads than
processors: the solutions/sec figure reported by the benchmark keeps
increasing linearly with the number of threads. The wall-clock time of
the benchmark itself (measured with `time`) shows the expected trend,
though: linear scaling only up to 4 threads (the number of physical
cores), a small bump at 8 (using the hyperthreaded virtual cores), and
no improvement past that.
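
One way to quantify the mismatch is to recompute a throughput from the
wall-clock time and compare it with the figure the benchmark prints. The
sketch below does that for the laptop runs listed further down. It is only
a rough check, not part of equix-bench: the helper name `wall_clock_rate`
is mine, and it assumes the whole `real` time was spent solving (startup
and the verification pass are included), so the implied rate is a slight
underestimate.

```python
# Figures copied from the i7-3740QM runs in this message:
# threads -> (reported solutions/sec, wall-clock "real" seconds)
runs = {
    1: (227.714446, 4.242),
    2: (450.100153, 2.184),
    4: (876.343564, 1.154),
    8: (1089.198671, 0.981),
    16: (2183.232035, 1.025),
    32: (4397.259598, 1.026),
}

NONCES = 500                # "Solving nonces 0-499"
SOLUTIONS_PER_NONCE = 1.91  # "1.910000 solutions/nonce"


def wall_clock_rate(real_seconds):
    """Solutions per second implied by the total wall-clock time.

    Assumption: all of `real` is attributed to solving, which slightly
    underestimates the true solving rate.
    """
    return NONCES * SOLUTIONS_PER_NONCE / real_seconds


for threads, (reported, real) in runs.items():
    implied = wall_clock_rate(real)
    print(f"{threads:2d} threads: reported {reported:8.1f}/s, "
          f"implied {implied:7.1f}/s, ratio {reported / implied:.2f}")
```

Up to 4 threads the two rates agree to within a few percent. At 32
threads the reported rate is more than four times the implied one.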

Further below are results on my pinephone.
   
$ time ./equix-bench --threads 1
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ...
    1.910000 solutions/nonce
    227.714446 solutions/sec. (1 thread)
    20301.439170 verifications/sec. (1 thread)

    real    0m4.242s
    user    0m4.230s
    sys    0m0.012s

$ time ./equix-bench --threads 2
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ... 
    1.910000 solutions/nonce
    450.100153 solutions/sec. (2 threads)
    17925.519934 verifications/sec. (1 thread)

    real    0m2.184s
    user    0m4.294s
    sys    0m0.004s

$ time ./equix-bench --threads 4
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ... 
    1.910000 solutions/nonce
    876.343564 solutions/sec. (4 threads)
    18863.079719 verifications/sec. (1 thread)

    real    0m1.154s
    user    0m4.400s
    sys    0m0.012s

$ time ./equix-bench --threads 8
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ... 
    1.910000 solutions/nonce
    1089.198671 solutions/sec. (8 threads)
    17808.857809 verifications/sec. (1 thread)

    real    0m0.981s
    user    0m7.019s
    sys    0m0.052s

$ time ./equix-bench --threads 16
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 16) ... 
    1.910000 solutions/nonce
    2183.232035 solutions/sec. (16 threads)
    18936.014118 verifications/sec. (1 thread)

    real    0m1.025s
    user    0m7.021s
    sys    0m0.032s

$ time ./equix-bench --threads 32
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 32) ... 
    1.910000 solutions/nonce
    4397.259598 solutions/sec. (32 threads)
    17754.229411 verifications/sec. (1 thread)

    real    0m1.026s
    user    0m6.961s
    sys    0m0.049s

$ cat /proc/cpuinfo
    <snip>
    processor    : 7
    vendor_id    : GenuineIntel
    cpu family    : 6
    model        : 58
    model name    : Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz
    stepping    : 9
    microcode    : 0x21
    cpu MHz        : 1856.366
    cache size    : 6144 KB
    physical id    : 0
    siblings    : 8
    core id        : 3
    cpu cores    : 4
    apicid        : 7
    initial apicid    : 7
    fpu        : yes
    fpu_exception    : yes
    cpuid level    : 13
    wp        : yes
    flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault
epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid
fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
    bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
l1tf mds swapgs itlb_multihit srbds
    bogomips    : 5387.48
    clflush size    : 64
    cache_alignment    : 64
    address sizes    : 36 bits physical, 48 bits virtual
    power management:

Similar behavior on the (4-core aarch64) pinephone:
   
$ time ./equix-bench --threads 1
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ... 
    1.910000 solutions/nonce
    23.920219 solutions/sec. (1 thread)
    4477.199102 verifications/sec. (1 thread)

    real    0m 40.35s
    user    0m 40.12s
    sys    0m 0.01s

$ time ./equix-bench --threads 2
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ... 
    1.910000 solutions/nonce
    47.683428 solutions/sec. (2 threads)
    4384.937853 verifications/sec. (1 thread)

    real    0m 20.45s
    user    0m 40.20s
    sys    0m 0.06s


$ time ./equix-bench --threads 4
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ... 
    1.910000 solutions/nonce
    94.149494 solutions/sec. (4 threads)
    4359.695415 verifications/sec. (1 thread)

    real    0m 10.47s
    user    0m 40.71s
    sys    0m 0.08s

$ time ./equix-bench --threads 8
    Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ... 
    1.910000 solutions/nonce
    188.808873 solutions/sec. (8 threads)
    4348.479398 verifications/sec. (1 thread)

    real    0m 10.50s
    user    0m 40.61s
    sys    0m 0.07s
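
The scaling on the pinephone can be summarized by the speedup relative to
the single-threaded run, computed from the `real` times above. This is a
small sketch rather than anything the benchmark reports: the `speedup` and
`efficiency` helpers are my own names, and efficiency is measured against
the 4 cores the device has.

```python
# Wall-clock "real" times in seconds from the pinephone runs in this message.
real_seconds = {1: 40.35, 2: 20.45, 4: 10.47, 8: 10.50}

PHYSICAL_CORES = 4  # the pinephone is a 4-core aarch64 device


def speedup(threads):
    """Wall-clock speedup relative to the single-threaded run."""
    return real_seconds[1] / real_seconds[threads]


def efficiency(threads):
    """Speedup divided by the number of cores that can actually be used."""
    return speedup(threads) / min(threads, PHYSICAL_CORES)


for threads in sorted(real_seconds):
    print(f"{threads} threads: speedup {speedup(threads):.2f}x, "
          f"efficiency {efficiency(threads):.0%}")
```

By this measure, going from 4 to 8 threads gives no further speedup, even
though the reported solutions/sec doubles.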


$ cat /proc/cpuinfo
    <snip>
    processor    : 3
    BogoMIPS    : 48.00
    Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
    CPU implementer    : 0x41
    CPU architecture: 8
    CPU variant    : 0x0
    CPU part    : 0xd03
    CPU revision    : 4



