[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #25985 [Obfuscation/Snowflake]: Snowflake rendezvous using AMP cache
#25985: Snowflake rendezvous using AMP cache
-----------------------------------+--------------------------------
Reporter: twim | Owner: (none)
Type: project | Status: needs_revision
Priority: Medium | Milestone:
Component: Obfuscation/Snowflake | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------------+--------------------------------
Changes (by dcf):
* status: needs_review => needs_revision
Comment:
This is some review of commit
[https://github.com/nogoegst/snowflake/commit/b75253ba55bde140be733c4cf5425fc1aab161c4
b75253ba55bde140be733c4cf5425fc1aab161c4].
I had trouble testing this change locally because `amper.Client` was
changing my http into https. This is what I did:
{{{
broker$ ./broker -disable-tls -addr 127.0.0.1:8080
proxy-go$ ./proxy-go -broker http://127.0.0.1:8080/
client$ TOR_PT_MANAGED_TRANSPORT_VER=1 TOR_PT_CLIENT_TRANSPORTS=snowflake
./client -codec amp -url http://127.0.0.1:8080/
}}}
The error I get from the client is a result of attempting to parse HTTP as
TLS:
{{{
BrokerChannel Error: tls: oversized record received with length 20527
}}}
(On further reflection, what I was trying to do wouldn't have worked
anyway, because the AMP client code would have sent a URL path of
`/c/s/.../amp/client/...`, not `/amp/client/...`, but I could have worked
around that for testing.) It looks like the cause is
[https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L55
here in amper/client.go], with a hardcoded `"https"`. Looking at that, I
also noticed a
[https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L23
CDNDomain = "cdn.ampproject.org"]. I have to say, it skeeves me a bit to
see hardcoded constants like that in a library; it seems like the kind of
thing that belongs in a configuration file.
----
I like the idea behind
[https://github.com/nogoegst/snowflake/commit/f358a7696045336ef5e15ff6f8ee2467d8d5f53a
the commit that adds the -codec flag]: generalize a currently hardcoded
interface, then provide alternate implementations in a later commit. I
must say, though, that I don't like the layering of
[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L222
clientOffersPOST]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L186
ctx.handleClientOffers]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L169
DecodeError]. Instead of returning an `error` directly,
`ctx.handleClientOffers` writes a JSON representation of the error into a
buffer provided by the caller, which the caller must then convert back
into an `error` by calling `DecodeError`. It seems unnecessarily
convoluted just to enable the side effect of writing directly to the
response body in the AMP case. This is a case where I would prefer to see
duplicated code first, because I suspect we can find a better factoring.
In my testing, the `DecodeError` error recovery didn't work. It decodes to
an `Error` that has the same `E` string value but is unequal to
`ErrNoSnowflakes`, for example. When I run a
[https://gitweb.torproject.org/pluggable-
transports/snowflake.git/tree/client?id=25b304a9a856f8c791882ad523df26ffc8fa629c
master branch client] (with POST) against an
[https://github.com/nogoegst/snowflake/tree/b75253ba55bde140be733c4cf5425fc1aab161c4/broker
amp branch broker], it fails because the broker returns status 200 when it
should return status 503. The culprit seems to be
[https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/broker/broker.go#L229-L234
this error-matching switch]. If I add a `default` case with a panic, I see
that the `ErrNoSnowflakes` case isn't being matched, even though that is
the intention:
{{{
http: panic serving 127.0.0.1:56174: {"error":"No snowflakes proxies
available"}
}}}
----
Here's an example of what the AMP responses look like:
{{{
$ wget -q -O- http://127.0.0.1:8080/client/amp
<!doctype html>
<html amp>
<head>
<meta charset="utf-8">
<title>amp</title>
<link rel="canonical" href="https://ampproject.org" />
<meta name="viewport" content="width=device-width,minimum-scale=1
,initial-scale=1">
<style>body {opacity: 0}</style><noscript><style>body {opacity:
1}</style></noscript>
<script async src="https://cdn.ampproject.org/v0.js"></script>
</head>
<body>
<p>In varietate concordia</p>
<pre id="data">eyJlcnJvciI6Ik5vIHNub3dmbGFrZXMg
cHJveGllcyBhdmFpbGFibGUifQo</pre>
</body>
</html>
[The base64 decodes to '{"error":"No snowflakes proxies available"}'.]
}}}
The overall AMP container encoding looks pretty reasonable. A few tiny
questions:
* [https://www.ampproject.org/docs/getting_started/create/basic_markup
basic_markup] says the `<script async>` has to be the second child of
`<head>`, right after `<meta charset="utf-8">`.
* The `<style>body {opacity: 0}</style><noscript><style>body {opacity:
1}</style></noscript>` doesn't match the documented
[https://www.ampproject.org/docs/reference/spec/amp-boilerplate.html AMP
boilerplate code], which is much longer. Is there a reason?
* `<link rel="canonical" href="https://ampproject.org" />`: it would be
better not to refer to any third parties. The AMP guide says this can be a
relative URL, so I would prefer one of these if they work:
* `<link rel="canonical" href="">`
* `<link rel="canonical" href="#">`
* `<link rel="canonical" href="?">`
----
About the encoded data inside the AMP HTML container: I'm wondering if we
should always put the data payload ''inside'' the same JSON object that
can signal an error, rather than making the payload be sent ''instead of''
the error object when there is no error. This is how it currently works:
1. Try to parse the response body as JSON with an `"error"` field. If the
JSON parsing is successful, handle the error and stop.
2. Otherwise (JSON parsing failed), return the entire response body as a
blob of bytes.
That is, under the base64 encoding, a message can look either like this:
{{{
{"error": "No snowflake proxies available"}
}}}
or like this:
{{{
{"type": "answer", "sdp": "..."}
}}}
One problem with this design, as far as amper is concerned, is that it's
impossible for the server to send a payload that happens to parse as JSON
with an `"error"` field—it will be interpreted as an error, not a payload.
This consideration doesn't affect Snowflake, because Snowflake's payloads
don't have an `"error"` field even though they are JSON. But another
consideration is that you can't tell the difference between a non-error
response and an improperly encoded error response (if the body is
truncated, for example).
I propose to remove this ambiguity by including the data payload inside
the JSON object that represents an error. Just like in an HTTP response,
we have two things—a status code and a message—inside one package. A
message would either look like this:
{{{
{"error": "No snowflake proxies available", "data": null}
}}}
or like this:
{{{
{"error": null, "data": "eyJ0eXBlIjogImFuc3dlciIsICJzZ..."}
}}}
(The `null` fields are included for clarity and could be omitted in
practice.) To handle this, a client needs to `json.Unmarshal` on a `struct
{ Error, Data: string }`. Any JSON decoding error signifies an error in
the AMP transport, not a Snowflake-level error. Assuming JSON decoding is
successful, the `"error"` field indicates whether there was a Snowflake-
level error.
About signaling errors, I would prefer to use machine-readable strings
(e.g. `"ErrNoSnowflakes"`) rather than human-readable strings (e.g. `"No
snowflake proxies available"`). The latter seems more likely to get broken
accidentally by someone adding punctuation, for example. The variable
names you've chosen are good, but I would just them as the actual values,
not just the identifiers. We can set up a canonical mapping, e.g.
200⇒`""`, 503⇒`"ErrNoSnowflakes"`, 504⇒`"ErrAnswerTimeout"`, to map the
status codes from POST into the error strings for common handling.
----
Does
[https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/client/rendezvous.go#L69
NewBrokerChannel] need to instantiate an `amper.Client` when it's not
using `-codec amp`?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25985#comment:24>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs