[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [tor-bugs] #16943 [Tor]: Implement prop250 (Random Number Generation During Tor Voting)



#16943: Implement prop250 (Random Number Generation During Tor Voting)
-------------------------+------------------------------------
 Reporter:  asn          |          Owner:  dgoulet
     Type:  enhancement  |         Status:  needs_revision
 Priority:  High         |      Milestone:  Tor: 0.2.8.x-final
Component:  Tor          |        Version:
 Severity:  Normal       |     Resolution:
 Keywords:  tor-hs       |  Actual Points:
Parent ID:  #8244        |         Points:  large
  Sponsor:  SponsorR     |
-------------------------+------------------------------------
Changes (by teor):

 * status:  needs_review => needs_revision


Comment:

 Replying to [comment:31 dgoulet]:
 > Replying to [comment:27 teor]:
 > > This is my 3rd comment in a row. Please read all 3 comments.
 > >
 >
 > Ok, FYI I've autosquash the fixup from last comment. It took me a while
 to have
 > it done cleanly but now it's done. You have a series of fixup now for
 this
 > comment.
 >
 > See branch: `prop250_final_v5`

 The last commit I reviewed was c219b23d47e9c742532354eedae10ca0509a1d78 in
 prop250_final_v4.
 This appears to correspond to 4f5e212cbd34c8cfb701b25a007e88eacbac1f1f
 "Rename shared random files to use underscore" in prop250_final_v5, which
 includes an autosquash and a rebase to master.

 I'm only going to comment on new commits, and any issues I still have with
 the old code.
 (I'll remove any questions, fixes, and explanations I'm ok with.)

 > > There are missed opportunities for const pointers in this commit.
 >
 > All fixed (hopefully).

 Except for the new smartlist_get_most_frequent_srv, which can take `const
 smartlist_t *sl`.
 (It was added in 3b04ebdb94e29c0b2569ff8c64256f37f1389180.)

 > > Nitpick: get_majority_srv_from_votes() uses `unsigned int current` for
 a boolean value. Tor typically uses int for boolean values.
 >
 > Should we really continue that "tradition"? :)

 I'm in favour of consistently using int, but it doesn't matter that much.
 (We could decide to move to `stdbool.h`, and assert that our booleans are
 actually 0 or 1, but I think that's a topic for another ticket.)

 > > Why pick the microdesc consensus to update the state? Any particular
 reason?
 > > Should we fall back to the NS consensus if there's no microdesc
 consensus or SR values?
 > > {{{
 > >   /* A new consensus has been created: pass it to the shared random
 subsystem
 > >      to update the SR state. */
 > >   sr_act_post_consensus(pending[FLAV_MICRODESC].consensus);
 > > }}}
 >
 > It's the consensus that client uses so we use that consensus to set our
 SRV
 > value post consensus thus having the same view as client. Does that make
 sense?
 > You think we should use `FLAV_NS`?

 Most clients use the microdesc consensus, some still use NS.

 I don't mind which consensus we use, but:
 * I think we should have a comment explaining the arbitrary choice, and
 * I wonder what happens if one consensus fails, but the other doesn't (can
 this even happen?).

 > > I get the following crashes when I run the unit tests on OS X 10.11.3,
 Apple LLVM version 7.0.2 (clang-700.1.81), x86_64.
 > >
 > > Five times:
 > > {{{
 > > shared-random/utils: [forking] Feb 10 16:13:46.122 [err]
 tor_assertion_failed_: Bug: src/or/shared-random.c:854:
 sr_generate_our_commit: Assertion my_rsa_cert failed; aborting. (on Tor
 0.2.8.1-alpha-dev )
 > > }}}
 > >
 > > {{{
 > > 2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
 > > 3   test                              0x0000000101a8af29
 sr_generate_our_commit + 825 (shared-random.c:854)
 > > 4   test                              0x0000000101a8cf11
 sr_state_update + 177 (shared-random-state.c:1056)
 > > 5   test                              0x0000000101a8d56d sr_state_init
 + 349 (shared-random-state.c:1231)
 > > 6   test                              0x00000001018e43c3
 test_a_networkstatus + 339 (test_dir.c:1800)
 > > 7   test                              0x00000001019dfadd
 testcase_run_one + 845 (tinytest.c:106)
 > > 8   test                              0x00000001019dffd3 tinytest_main
 + 499 (tinytest.c:432)
 > > 9   test                              0x00000001019df43f main + 639
 (testing_common.c:298)
 > > }}}
 >
 > I think it's fixed. Since I can't reproduce, I hope you can confirm.

 I still see 5 of these. One example is:

 {{{
 3   test                                0x00000001053d4045
 sr_generate_our_commit + 709 (shared_random.c:870)
 4   test                                0x00000001053d60f1 sr_state_update
 + 177 (shared_random_state.c:1068)
 5   test                                0x00000001053d67d3 sr_state_init +
 483 (shared_random_state.c:1243)
 6   test                                0x00000001052c9573
 test_sr_get_majority_srv_from_votes + 51 (test_shared_random.c:811)
 }}}

 > > One time:
 > > {{{
 > > shared-random/state_transition: [forking] Feb 10 16:13:46.389 [err]
 tor_assertion_failed_: Bug: src/or/shared-random-state.c:210:
 commit_add_to_state: Assertion saved_commit == NULL failed; aborting. (on
 Tor 0.2.8.1-alpha-dev )
 > > }}}
 > >
 > > {{{
 > > 2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
 > > 3   test                              0x0000000101a8e00f
 commit_add_to_state + 175 (shared-random-state.c:210)
 > > 4   test                              0x0000000101a8c970
 sr_state_add_commit + 48 (shared-random-state.c:941)
 > > 5   test                              0x0000000101982dfe
 test_state_transition + 574 (test_shared_random.c:976)
 > > 6   test                              0x00000001019dfadd
 testcase_run_one + 845 (tinytest.c:106)
 > > 7   test                              0x00000001019dffd3 tinytest_main
 + 499 (tinytest.c:432)
 > > 8   test                              0x00000001019df43f main + 639
 (testing_common.c:298)
 > > }}}
 >
 > This one is a bit more confusing to me. I don't see any possible way how
 that
 > assert can be triggered apart from _already_ having a commit in our
 state when
 > adding it. You'll have to tell me if you hit it again after the fixes
 and
 > probably we'll have to dig deeper.

 I still see this one, same location, line numbers have changed slightly:

 {{{
 3   test                                0x00000001053d6f4f
 commit_add_to_state + 175 (shared_random_state.c:212)
 4   test                                0x00000001053d5b50
 sr_state_add_commit + 48 (shared_random_state.c:953)
 5   test                                0x00000001052cac6e
 test_state_transition + 574 (test_shared_random.c:1000)
 }}}

 > > And I also see `12/720 TESTS FAILED. (1 skipped)`.

 Now I see `7/724 TESTS FAILED. (1 skipped)`.

 6 of these are the crashes above, the final one is:

 {{{
 shared-random/state_update: [forking]
   FAIL src/test/test_shared_random.c:1190: assert(state->n_commit_rounds
 == 1): 2 vs 1
   [state_update FAILED]
 }}}

 > > I'd still like to run the SRV code in chutney, is there a chutney
 template for that?
 > > (I think I'll need to wait for the unit tests to be fixed first.)
 >
 > You can with that branch. We'll simply use the default values for the
 super
 > majority. grep for prefix "SR:" logs and see sr-state file.

 OK, I'll wait to use chutney until the unit tests pass with no crashes.

 I'm going to look into those crashes myself. If I find anything, I'll post
 another comment.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16943#comment:34>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs