[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #25688 [Obfuscation/Snowflake]: proxy-go is still deadlocking occasionally
#25688: proxy-go is still deadlocking occasionally
--------------------------------------------+------------------------------
Reporter: dcf | Owner: cohosh
Type: defect | Status: needs_review
Priority: Low | Milestone:
Component: Obfuscation/Snowflake | Version:
Severity: Normal | Resolution:
Keywords: network-team-roadmap-2019-Q1Q2 | Actual Points:
Parent ID: | Points: 3
Reviewer: | Sponsor:
--------------------------------------------+------------------------------
Comment (by dcf):
I was going to quibble about `makePeerConnectionFromOffer` becoming
blocking, which through `runSession`, will block the main polling loop in
`main` (if I understand right). But I think this is an architectural
problem unrelated to the deadlock fix, so let's ignore it. (It seems like
`makePeerConnectionFromOffer` and `sendAnswer` should run in their own
goroutine, responsible for a single PeerConnection.)
Replying to [comment:22 cohosh]:
> Replying to [comment:21 dcf]:
> I am ok with this as well, but we should probably be tearing down the
peer connections properly after a timeout anyway (though maybe go handles
this on its own eventually?)
I'm looking at the code and I don't quite get how it's supposed to work.
The error handlers
([https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L302 here]
[https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L313 here]
[https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L318 here]) call `pc.Destroy()`, and `retToken()`
[https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L355 in the caller]. The timeout checker
[https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L334 here] calls both `pc.Destroy()` and also
`retToken()`, which makes sense because it doesn't have a caller to call
`retToken()`. So that looks good.
When a PeerConnection ends "naturally", I suppose what happens is that
[https://github.com/cohosh/snowflake/blob/08f5205461573bf8a6e8961540ac620865a3b45c
/proxy-go/snowflake.go#L334 dc.OnClose()] gets called, which calls
`pc.DeleteDataChannel(dc)`, but not `pc.Destroy()` nor `retToken()`. I can
understand why `pc.DeleteDataChannel(dc)` gets called here and not in the
other cases--because in the other cases we don't have a DataChannel yet--
but then I'm wondering why it doesn't call the other two functions. Are we
losing a token in this case too?
I was thinking, what we need is an OnError handler so we can get called
back when a DataChannel fails to establish. I found
[https://github.com/keroserene/go-
webrtc/blob/0c5ebb10a5dd7990a4962b78de27c2a8c735dac0/datachannel.go#L47-L50
this comment]:
{{{
OnError - is not implemented because the underlying Send
always returns true as specified for SCTP, there is no reasonable
exposure of other specific errors from the native code, and OnClose
already covers the bases.
}}}
I was curious about what happens in browser WebRTC so I hacked
[https://developer.mozilla.org/en-
US/docs/Web/API/WebRTC_API/Simple_RTCDataChannel_sample this demo] and
[https://github.com/mdn/samples-
server/tree/49c605fbda926a2dce9955b362233eef673e6090/s/webrtc-simple-
datachannel code] to comment out [https://github.com/mdn/samples-
server/blob/49c605fbda926a2dce9955b362233eef673e6090/s/webrtc-simple-
datachannel/main.js#L60-L62 the onicecandidate callback] in the remote.
What happens is you get a browser-produced line in the console:
{{{
⚠️ ICE failed, add a STUN server and see about:webrtc for more details
}}}
but none of the error callbacks get called, so the application is never
aware of the failure. (There is a [https://developer.mozilla.org/en-
US/docs/Web/API/RTCPeerConnection/onicecandidateerror onicecandidateerror]
callback but apparently nothing implements it.) So it looks like a browser
is not doing anything fundamentally different, and checking after a
timeout seems like a reasonable way to do it.
So I think your patch looks good, if you can
1. answer whether the `OnClose` path needs to call `retToken()`
2. move the `time.Minute` value into a commented constant
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25688#comment:23>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs