Re: [tor-bugs] #22410 [Core Tor/Tor]: ensure that uint8_t is unsigned char
#22410: ensure that uint8_t is unsigned char
--------------------------+------------------------------------
 Reporter:  catalyst      |          Owner:  catalyst
     Type:  defect        |         Status:  needs_review
 Priority:  Medium        |      Milestone:  Tor: 0.3.2.x-final
Component:  Core Tor/Tor  |        Version:
 Severity:  Normal        |     Resolution:
 Keywords:                |  Actual Points:
Parent ID:  #6877         |         Points:
 Reviewer:                |        Sponsor:
--------------------------+------------------------------------
Comment (by catalyst):
Replying to [comment:4 cypherpunks]:
> Comparing the length of the keywords is a bad way of choosing data
> types.
In the abstract, I agree. I also think the length of a type name does
affect readability by humans, and we should consider that in our
decisions.
> Replying to [comment:3 catalyst]:
> > If uint8_t is an extended integer type rather than unsigned char
> > (which is admittedly unlikely), it won't have the privileged aliasing
> > properties of unsigned char so code that casts pointers to other types
> > to pointers to uint8_t might violate the strict aliasing rules and
> > produce undefined behavior.
> I agree with the possibility of violating strict aliasing. However, i
> assume (yes, i know i should never) these pointers are dereferenced at
> some point which is always undefined behavior when the old and new type
> differs so the point is moot.
C99 §6.5 paragraph 7 explicitly says that it's always valid to use a
character lvalue to access the stored value of any object. This means
it's always valid to dereference a pointer to a character type as long as
it points into an object. If we detect that uint8_t is a character type,
then we know that it will also have these privileged aliasing properties.
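As a minimal sketch of what that paragraph permits (the helper name
copy_u32_bytes is hypothetical, not something from the Tor tree): reading
the object representation of a uint32_t through an unsigned char lvalue
is well defined, because character types are on the list of allowed
accessing types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: copy the object
 * representation of a uint32_t into a byte buffer by reading it through
 * an unsigned char lvalue, which C99 6.5p7 always allows. */
void
copy_u32_bytes(unsigned char out[sizeof(uint32_t)], const uint32_t *in)
{
  const unsigned char *p = (const unsigned char *)in;
  size_t i;
  for (i = 0; i < sizeof(uint32_t); ++i)
    out[i] = p[i];
}
```

If uint8_t were an extended integer type instead of unsigned char, the
same loop written with a `const uint8_t *` would no longer be covered by
that exemption.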
> I don't care about the data type names (any renaming can easily be done
> using `typedef` if preferred). IMO it's more important that the data type
> matches the type of data it holds and the code handling these data types
> is built around these data types in order to keep casting to a minimum
> (preferably none).
I think the best data type for handling arbitrary byte data on a platform
with CHAR_BIT==8 is unsigned char. This is also an advantage when
encoding or decoding a larger type, because of the privileged aliasing
properties of character types in C.
A lot of existing code in the tree uses uint8_t. It's easier to check at
configure time whether uint8_t is a character type than to check each use
of uint8_t for strict aliasing violations that could occur on (presumably
rare) platforms where uint8_t is not a character type.
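One possible shape for such a probe, as a sketch only (the identifier
uint8_is_uchar_probe is made up, and this is not necessarily what the
patch on this ticket does): two declarations of the same object must
have compatible types, so the fragment below compiles only if uint8_t is
unsigned char. It assumes the build system treats a failed compile of
this fragment as "uint8_t is not unsigned char".

```c
#include <stdint.h>

/* Hypothetical probe: redeclaring an object with an incompatible type is
 * a constraint violation, so this pair of declarations is accepted only
 * when uint8_t and unsigned char are the same type. */
extern unsigned char uint8_is_uchar_probe;
extern uint8_t uint8_is_uchar_probe;
```

Because only declarations are involved, the probe needs to be compiled
but never linked or run, which also keeps it usable when cross-compiling.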
Casts will often still be needed even when using unsigned char or
uint8_t, because values of those types are promoted to signed int on most
platforms. This can cause problems with bitwise shifts if the appropriate
casts aren't done. (Left-shifting a 1 bit into the sign bit is undefined
behavior.)
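To illustrate the shift problem, here is a small sketch of decoding a
big-endian 32-bit value from bytes (load_be32 is a hypothetical name
chosen for this example). Each byte is cast to uint32_t before shifting
so the shift is done in an unsigned type rather than in the promoted
int.

```c
#include <stdint.h>

/* Hypothetical helper: decode a big-endian 32-bit value from 4 bytes. */
uint32_t
load_be32(const unsigned char *p)
{
  /* Without the casts, p[0] would be promoted to int, and p[0] << 24
   * would be undefined for p[0] >= 0x80 where int is 32 bits wide. */
  return ((uint32_t)p[0] << 24) |
         ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) |
         (uint32_t)p[3];
}
```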
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22410#comment:5>