
Re: [tor-dev] Proposal 285: Directory documents should be standardized as UTF-8



Quoting teor (2018-01-10 00:19:54)
> These are called "Unicode Scalar Values".
> https://www.unicode.org/glossary/#unicode_scalar_value
> 
> Let's reference that.

"Unicode Scalar Value" includes U+0000 (NUL), which I think we probably
want to exclude.

> >        * each encoded with the shortest possible encoding.
> >        * without any BOM
> > 
> > Are there other restrictions we should make?  If so, how should we phrase them?
> 
> These seem fine, and not tied to a particular unicode version.
> 
> But I don't know enough about Unicode to know if there is anything else we should
> specify.

Skimming through
https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt, I think it
might be good to additionally forbid the noncharacter code points listed
at the end: U+nFFFE and U+nFFFF for n = 0x0..0x10 (hexadecimal, i.e. the
last two code points of each of the 17 planes), and U+FDD0 through U+FDEF.
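
To make the combined restrictions concrete, here is a rough Python sketch
of a check along these lines. It is only an illustration, not what tor or
the proposal specifies: the names is_noncharacter and check_document are
made up for this example, and it assumes that "without any BOM" means no
BOM at the start of the document. It relies on Python's strict UTF-8
decoder to reject overlong encodings and surrogates.

```python
def is_noncharacter(cp: int) -> bool:
    """True for U+nFFFE/U+nFFFF (n = 0x0..0x10) and U+FDD0..U+FDEF."""
    # The low 16 bits are FFFE or FFFF exactly when masking off the
    # lowest bit leaves FFFE.
    return (cp & 0xFFFE) == 0xFFFE or 0xFDD0 <= cp <= 0xFDEF


def check_document(data: bytes) -> bool:
    """Hypothetical validator for the restrictions discussed above."""
    # Assumption: only a leading BOM is checked for.
    if data.startswith(b"\xef\xbb\xbf"):
        return False
    try:
        # Strict decoding rejects non-shortest-form sequences and
        # encoded surrogates (U+D800..U+DFFF).
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    # Exclude U+0000 and the noncharacter code points.
    return not any(ch == "\x00" or is_noncharacter(ord(ch)) for ch in text)
```

A document containing only ordinary text passes, while a leading BOM, an
embedded NUL, an overlong sequence such as C0 80, or an encoded U+FFFE is
rejected.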