
Re: [tor-bugs] #29787 [Metrics/Onionperf]: Enumerate possible failure cases and include failure information in .tpf output



#29787: Enumerate possible failure cases and include failure information in .tpf
output
-------------------------------+------------------------------
 Reporter:  karsten            |          Owner:  metrics-team
     Type:  enhancement        |         Status:  new
 Priority:  Medium             |      Milestone:
Component:  Metrics/Onionperf  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+------------------------------

Comment (by karsten):

 Alright, I finally made some progress here!

 Last things first, I made the following plot:

 [[Image(op_errors-2019-04-24.png, 500px)]]

 This plot uses your script with a minor extension:

 {{{
 diff --git a/op_errors.py b/op_errors.py
 index 1c8b278..7169e4d 100644
 --- a/op_errors.py
 +++ b/op_errors.py
 @@ -131,6 +131,7 @@ def main():
              #if there are no failures at all in the circuit data then the csv column will simply be left empty
              pass
          header = [
 +            'unix_ts_end', 'hostname_local',
              'transfer_id', 'is_error', 'error_code', 'state_failed',
              'total_seconds', 'endpoint_remote', 'total_bytes_read',
              'circuit_id', 'stream_id','buildtime_seconds', 'failure_reason_local',
 }}}

 I ran it on all the OnionPerf .json files that we have.

 Then I combined the three fields `error_code`, `failure_reason_local` (if
 present), and `failure_reason_remote` (if present, and only if
 `failure_reason_local` is present, too) into a combined error code.
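
 As a rough sketch of that combination rule (not the code actually used for
 the plot): the function names and the `/` separator are assumptions, since
 the joining format isn't stated here, and rows are assumed to be dicts
 keyed by the CSV column names.

```python
from collections import Counter


def combine_error_code(error_code, failure_reason_local="",
                       failure_reason_remote="", sep="/"):
    """Join the three fields into one combined error code.

    The "/" separator is an assumption made for this sketch.
    """
    parts = [error_code]
    if failure_reason_local:
        parts.append(failure_reason_local)
        # The remote reason only counts when a local reason is present, too.
        if failure_reason_remote:
            parts.append(failure_reason_remote)
    return sep.join(parts)


def count_combined_codes(rows):
    """Count combined error codes over rows (dicts keyed by column name)."""
    return Counter(
        combine_error_code(
            row["error_code"],
            row.get("failure_reason_local", ""),
            row.get("failure_reason_remote", ""),
        )
        for row in rows
        if row.get("error_code")  # skip rows without an error code
    )
```

 Rows could come from `csv.DictReader` over the script's output. A remote
 reason without a local one is dropped, per the rule above.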

 The result is that we have 11 combined error codes now, which are all in
 the graph.

 The next step will be to understand in more detail what causes these
 errors. For example:
  - `READ` is a fun one. The cases I looked at (all from op-ab) were all
 onion service cases. The server had completed sending the response, and
 all data was "in flight". Yet, some time later, the client had its
 connection closed shortly before receiving the last remaining bytes. This
 could be a bug, but it needs closer investigation.

 acute, if you'd like to take a look, too, maybe write down which combined
 error codes you're going to look at, so that we can avoid duplicating
 effort. (Thanks for all your efforts so far!)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29787#comment:20>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online