Re: RFC/proposal for Thandy changes

On 01/01/2011 10:14 PM, Nick Mathewson wrote:
> On Sun, Oct 17, 2010 at 10:46 PM, Justin Samuel <js@xxxxxxxxxxxxxxxx> wrote:
>> Hi all,
> Hi, and sorry about the delay!  I think I like most of it, dislike a
> couple of details, and have some points where I don't get it.  More
> detail follows.

No worries about the delay. I was pretty bogged down until mid-December.

> [ lines re-wrapped]
>> 0. Proposed Thandy Changes
>> ==========================
>> This is a set of proposals that includes a section of simple changes
>> that can be considered on their own (Section 1) as well as a more
>> fundamental Thandy restructuring proposal (Section 2).
>> This isn't meant to be at the level of detail needed for a spec and
>> subsequent implementation. This is to get feedback and promote
>> discussion. It's not an official proposal at this point but more of a
>> request for comment.
> Okay.  I'm going to omit all comments of the form "Could be okay, but
> needs more detail", then. :)
>> A few relevant documents for reference:
>>  * Thandy spec:
>>    https://gitweb.torproject.org/thandy.git/blob_plain/HEAD:/specs/thandy-spec.txt
>>  * TUF spec: https://www.updateframework.com/browser/specs/tuf-spec.txt
>>  * High-level differences between Thandy and TUF:
>>    https://www.updateframework.com/wiki/ThandyDifferences
>>  * Paper on TUF: http://www.freehaven.net/~arma/tuf-ccs2010.pdf
>> 1. Individual Thandy Changes
>> ============================
>> These are changes that could be made to Thandy without major overhaul
>> and can be considered separately from the restructuring proposal (Section
>> 2).

Your comments on these individual changes make sense, but I'm going to
skip replying to these for the moment. I think the more pressing issue
is deciding whether it's worthwhile to further consider the
restructuring proposal (below).

>> 2. Thandy Restructuring Proposal
>> ================================
>> Primary goal: Keep Thandy's concepts of bundles and packages but overlay
>> them on top of the generic 'targets' approach of TUF.
> I don't get this as a goal.  Obviously, you're not advocating that we
> should use TUF's 'targets' approach for its own sake: you're advocating
> that we use it in order to get some concrete benefit that it provides.
> _That benefit_ is the real goal, not mere use of TUF's approach for its
> own sake.  (I know this sounds like nitpicking, but unless we're clear
> about what actual benefits a change is meant to provide, it's harder to
> evaluate it.)
> If I had to guess, I would say that the real goal is probably something
> like, "separate Thandy's notion of packages and bundles from Thandy's
> notion of authenticated downloads."

Right, separation is the goal. The motivations behind this goal could
really use some discussion.

The first motivation was to see if this split-layer design would be easy
to explain, understand, and reason about.

The second motivation for this goal was to make Thandy's design and
implementation as useful to other projects as possible. I feel (with no
evidence and little experience to back this up) that an approach which
tightly integrates the concepts of bundles and packages would be less
likely to be used by highly diverse projects than one which offered a
more generic system for authenticated downloads. Of course, there are
probably plenty of projects that would use Thandy in the same way that
Tor uses it even if the concepts of bundles and packages didn't exactly
fit their project's or organization's model.

Regarding the first motivation, I'd love to be able to make the claim that this
separation makes it easier to reason about the security of Thandy.
However, especially after writing the initial informal proposal, I'm not
sure that's true. My impression is that this separation makes Thandy's
design more complex. It seems to trade simplicity for flexibility, where
Tor may not be the direct beneficiary of the flexibility down the road.

>> Note: This proposal is not advocating using/maintaining/relying on TUF
>> as a separate project. That depends on factors such as the future of TUF
>> according to the current TUF maintainers, whether Python is an
>> appropriate choice for Windows clients, etc.
>> 2.1 Approach
>> ------------
>> Two separate layers:
>>   1. An authentication layer that downloads and authenticates opaque
>>      'target' files according to metadata it understands that lists
>>      hashes and sizes of the target files. This layer doesn't understand
>>      what bundles and packages are.
>>   2. A decision/installation layer that uses the authentication layer to
>>      download bundle/package info and associated files. This layer
>>      doesn't know the details of the authentication mechanisms or roles;
>>      it gets files from the authentication layer that the authentication
>>      layer has already authenticated.
>>      * Note that the update decision and installation code are probably
>>        separate, but for the sake of this proposal all that matters is
>>        that the Thandy authentication layer is logically separate from
>>        the rest of Thandy.
> Hm.  It would help to know what exactly the interface to layer 1 should
> be.  I'm guessing it's something like, "Update the metadata", "Tell me
> what files are available", "Download the following files".

That's pretty much it. There could be some ability to filter the "tell
me what files are available" by file path or by signing roles. The
"download the following files" call wouldn't make the files available to
the calling code until signatures and hashes had been verified.
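
As a rough illustration of that interface, here is a minimal sketch in
Python. Everything in it is an assumption made for the example: the
class and method names, the shape of the trusted metadata dictionary,
and the use of SHA-256. Signature checking and the "update the metadata"
call are left out; the metadata passed in is assumed to have been
authenticated already.

```python
import hashlib
from fnmatch import fnmatchcase


class VerificationError(Exception):
    """Raised when a downloaded file does not match its signed metadata."""


class AuthLayer:
    """Toy authentication layer (hypothetical names throughout).

    Signature checking is assumed to have happened before `trusted_meta`
    was built; only length and hash checks are shown here.
    """

    def __init__(self, trusted_meta, fetch):
        # trusted_meta: {path: {"length": int, "sha256": hex digest}}
        # fetch: callable(path) -> bytes, standing in for a mirror download
        self._meta = trusted_meta
        self._fetch = fetch

    def list_targets(self, pattern="*"):
        """Return the known target paths, optionally filtered by a glob."""
        return sorted(p for p in self._meta if fnmatchcase(p, pattern))

    def download(self, paths):
        """Return {path: bytes}, but only if every requested file verifies."""
        verified = {}
        for path in paths:
            expected = self._meta[path]
            data = self._fetch(path)
            if len(data) != expected["length"]:
                raise VerificationError("wrong length: " + path)
            if hashlib.sha256(data).hexdigest() != expected["sha256"]:
                raise VerificationError("wrong hash: " + path)
            verified[path] = data
        return verified
```

The point of the sketch is only that download() hands nothing back to
the caller unless every file checks out, so the calling code never sees
unverified bytes.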

>> For the authentication layer, we start with the following roles (the
>> same as TUF uses):
>>   * Root
>>     o Root of trust for the entire PKI. Indicates through signed
>>       metadata which keys are trusted for the Release, Targets,
>>       Timestamp, and Mirror roles.
>>   * Timestamp
>>     o Signs a frequently regenerated timestamp file with a short
>>       expiration indicating the most recent release metadata.
>>   * Release
>>     o Signs the release metadata which lists the hashes and sizes of all
>>       other metadata files (other than the timestamp file). Note that
>>       bundleinfo and pkginfo are not considered metadata at the
>>       authentication layer.
>>   * Targets
>>     o Signs a metadata file that lists the hashes and sizes of target
>>       files: the files that the decision layer ultimately wants to
>>       obtain.
>>     o Can delegate to sub-roles the responsibility for providing target
>>       files from specific paths on the repository (e.g. Role A is
>>       trusted to provide files from the /targets/role_a/ directory).
> It sounds like you're combining the roles of signing code (which the
> targets key can do, and delegates) with the role of deciding who can
> sign code.  Is that wise?  Nowhere else in the Thandy design is this
> done.

Whether it is a bad idea to combine code-signer roles with delegator
roles depends on how they are used. The extent of delegation could also
be limited by having the delegating role indicate whether the delegated
role can delegate further.

This may be an example of how trying to make a Thandy
download/authentication layer highly flexible ultimately just makes
security harder to reason about.
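
To make the "can delegate further" idea concrete, here is a small
sketch. The field names ("parent", "can_delegate") and the function
name are hypothetical and appear in neither the Thandy nor the TUF
metadata format. It assumes each role records who delegated to it and
whether it may delegate in turn.

```python
def chain_allows_role(role, roles):
    """Check that every ancestor of `role` was permitted to delegate.

    roles: {name: {"parent": name or None, "can_delegate": bool}}
    (hypothetical layout, for illustration only)
    """
    current = roles[role]["parent"]
    while current is not None:
        if not roles[current]["can_delegate"]:
            return False
        current = roles[current]["parent"]
    return True
```

A client would reject metadata from any role for which this returns
False, which caps the depth of the delegation tree wherever a delegator
sets the flag to false.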

> In practice, I'd assume that the Targets role should be pretty much
> *only* used for delegation.  But in that case, what's the benefit of
> separating this from the root role?

One benefit is that it allows more choices in the balance between how
often certain keys are used vs. what level of privilege those keys have.
In general, any time an organization has a project that could benefit
from being split into subprojects or components with different
authors/maintainers, allowing delegation by roles other than the root
role can decrease how often the root keys need to be used. There may not
be much benefit in Tor's case. I think it's safe to say that this is
oriented more towards flexibility in potential use cases rather than
practicality with Tor's current use case.

>>   * Mirror
>>     o Signs a metadata file that lists the locations and details of
>>       repository mirrors.
>> From here we use delegation by the Targets role to create the roles for
>> bundlers and packagers. The top-level Targets role delegates a separate
>> role for each bundle and each package.
>> The targets role hierarchy looks like this (with many more bundle and
>> package roles):
>> Root
>> `-- Targets
>>     |-- bundles/tor-browser-stable
>>     |-- bundles/tor-browser-beta
>>     `-- pkgs/openssl
>> Each bundle version and package version that bundlers and packagers
>> release has a separate bundleinfo and pkginfo file, respectively. These
>> bundleinfo and pkginfo files are opaque to the authentication layer: it
>> considers them target files like any other. However, the decision layer
>> understands the contents of these files and uses them to make subsequent
>> download and installation decisions (with the downloads always being
>> done through the authentication layer).
>> 2.2. Repository Structure
>> -------------------------
>> Top-level metadata files are:
>> /meta/root.txt
>> /meta/release.txt
>> /meta/timestamp.txt
>> /meta/targets.txt
>> /meta/mirrors.txt
>> The /meta/targets.txt file would include a delegations section such as:
>> delegations : {
>>     keys : {
>>         'ABC...' : { details },
>>         '123...' : { details },
>>         ...
>>       },
>>     roles : {
>>         'bundles/tor-browser-stable' : {
>>             keys : ['ABC...', '123...'],
>>             threshold : 2,
>>             paths : ['bundles/tor-browser-stable/**'],
>>           },
>>         'pkgs/openssl' : {
>>             keys : ['DEF...', '456...'],
>>             threshold : 2,
>>             paths : ['pkgs/openssl/**'],
>>           },
>>         ...
>>       }
>>   }
> To be clear, are you proposing that *every* role be able to delegate
> itself in its particular file, or that a single level of delegation
> exist in the targets.txt file?

With what I proposed, the root role and all targets roles (that is,
including delegated targets roles) would be allowed to delegate. I
attempted to arrange the metadata file names and formats such that there
would be a clean and consistent way to allow other roles to delegate in
the future, if needed. However, I find that thinking of cases where the
mirrors role, release role, or timestamp role might need to delegate
starts to feel like an academic exercise and an attempt to solve
problems that don't exist.

>> The above would mean that the top-level Targets role had delegated a
>> role whose full name would be targets/bundles/tor-browser-stable (as it
>> is delegated by the targets role, the prepended targets/ is implicit in
>> the delegated role's name). This role for the tor-browser-stable bundle
>> would be trusted for the specified paths relative to the repository's
>> targets/ directory. Thus, a specific version's bundleinfo file created
>> by the bundler could be placed on the repository at, for example:
>>   /targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.bundleinfo
>> (Note that this bundle role is trusted for all targets files matching
>> the path 'bundles/tor-browser-stable/**' under the repository's targets/
>> directory, as specified when this role was created through the above
>> delegation.)
>> The bundle maintainer would sign a metadata file listing the hash and
>> size of this bundleinfo. This metadata would be placed on the repository
>> at:
>>   /meta/targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.txt
>> (Note that the basename of these files isn't crucial to this aspect of
>> the design. They don't need to repeat the path info, though that's
>> probably helpful for humans.)
>> More generally, the metadata location is:
>>   /meta/ROLE_NAME/[ANY_PATH/]ANY_NAME.txt
>> Packages are similar to bundles with the difference that there are one
>> or more target files in addition to the pkginfo file. A package
>> maintainer may supply the following files to be placed on the
>> repository:
>>   /targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.pkginfo
>>   /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll
>>   /targets/pkgs/openssl/win32/0.9.8m/ssleay32.dll
>> The hashes and sizes of these files are listed in metadata signed by the
>> targets/pkgs/openssl role (that is, the openssl package maintainer's
>> role).  This metadata would be placed on the repository at:
>>   /meta/targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.txt
> So to see if I have it right:
>   - Every target file corresponds to exactly one target metadata file,
>     though any target metadata file can in principle correspond to one
>     or more target files.

Correct for Tor's usage of the download/authentication layer. In other
(non-Tor) usages, it would be possible for separate targets metadata
files to list the same target file, potentially having the same target
file described with conflicting sets of hashes+filesize (I'm not saying
this would be a good thing, but just that it would be possible unless
this was prohibited by imposing restrictions on delegation, see below).

>   - It is trivial, given a target metadata file, to learn which target
>     files it authenticates.  It is not trivial, given a target file, to
>     learn which target metadata file authenticates it; ideally, it will
>     be in a corresponding location in the metadata.  (Is this required?)

Correct. The contents of targets metadata files always state which
targets files that role has directly authenticated.

For going the other direction (given a target file, which metadata files
might it be listed in), if the delegation is done in such a way that:

  a) there isn't overlap in the paths for which delegated targets roles
are responsible (e.g. "a/**" and "b/**" are fine because they don't
overlap), and

  b) targets roles that perform delegation don't authenticate target
files in those same delegated paths (e.g. authenticating "foo.rpm" and
delegating "a/**" is fine, but not authenticating "a/foo.rpm" and
delegating "a/**"),

then an individual target file can only be listed in a single targets
metadata file. This doesn't make it trivial to know which targets
metadata file listed a given target file, but it does ensure that there
is at most one targets metadata file that listed it.
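
Conditions (a) and (b) can be checked mechanically. The sketch below is
only an illustration under simplifying assumptions: it looks at a single
delegating role, it handles only patterns of the form "dir/**", and the
function names are made up for the example.

```python
def prefix_of(pattern):
    """Turn a pattern such as 'a/**' into the directory prefix 'a/'."""
    if not pattern.endswith("/**"):
        raise ValueError("only 'dir/**' patterns are handled: " + pattern)
    return pattern[:-2]


def delegation_problems(own_targets, delegated_patterns):
    """Return a list of violations of conditions (a) and (b).

    own_targets: paths the delegating role authenticates itself
    delegated_patterns: path patterns it hands to delegated roles
    """
    problems = []
    prefixes = [prefix_of(p) for p in delegated_patterns]
    # (a) no delegated prefix may contain another delegated prefix
    for i, first in enumerate(prefixes):
        for second in prefixes[i + 1:]:
            if first.startswith(second) or second.startswith(first):
                problems.append("overlap: %s** and %s**" % (first, second))
    # (b) the delegating role must not list targets under a delegated prefix
    for target in own_targets:
        for prefix in prefixes:
            if target.startswith(prefix):
                problems.append("%s is inside delegated %s**" % (target, prefix))
    return problems
```

With the examples above, authenticating "foo.rpm" while delegating
"a/**" and "b/**" yields no problems, while authenticating "a/foo.rpm"
and delegating "a/**" is reported.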

>   - Both layers of the updater (the authentication layer and the
>     decision layer) need to be able to verify hashes and signatures.

This doesn't have to be the case. The requirement is that the targets
role delegation be done in such a way that a target file's path limits
the roles that could have provided the file. If that is the case, then
successfully obtaining the target file through the authentication layer
means that the authentication layer was able to verify signatures by
whichever targets roles are trusted to provide files at that path.

For example, if delegated targets roles "linux" and "win" are only
trusted to provide files from paths "linux/**" and "win/**",
respectively, then the decision layer can use the authentication layer
to download a target file at "linux/foo.rpm" without having to know
details of signatures, hashes, or even the underlying roles. Properly
restricting the paths that the various delegated targets roles are
trusted for is an absolute requirement with this approach.
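
A sketch of the lookup the authentication layer would do internally for
this example, again assuming only "dir/**" patterns and using made-up
names:

```python
def roles_trusted_for(path, delegations):
    """Return the delegated role names whose path patterns cover `path`.

    delegations: {role_name: [patterns]}, each pattern of the form 'dir/**'
    (a simplification made for this sketch)
    """
    matches = []
    for role, patterns in sorted(delegations.items()):
        for pattern in patterns:
            if pattern.endswith("/**") and path.startswith(pattern[:-2]):
                matches.append(role)
                break
    return matches
```

For the delegation above, "linux/foo.rpm" maps to the "linux" role only.
The decision layer never calls anything like this itself; it simply asks
for the path and either gets a verified file or an error.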

>> 2.3. Update Procedure
>> ---------------------
>> The update procedure is:
>>   * The decision layer uses the authentication layer to retrieve a list
>>     of all available bundleinfo files.
>>     o Implementation: the decision layer asks the authentication layer
>>       for a list of all available metadata file paths/names. The
>>       authentication layer obtains this information from the release
>>       metadata.
>>   * Looking at the paths/names of available bundleinfo files, the
>>     decision layer identifies whether there is a newer version of a
>>     bundle it is interested in.
>>     o Implementation: the bundle names, OS, arch, and bundle version are
>>       all contained in paths of the available bundle metadata files.
> This seems to add a requirement that you can do a mapping from bundle
> name to bundle version.  Specifying string-to-version mappings in a
> reliable way can be really nasty.  Sure you want to do that?

That does seem potentially problem-prone. A solution would be to include
additional information for each item listed in the release.txt file.
Keeping in the spirit of over-generality, I've always envisioned an
optional "custom" field that could be used in various places in the
authentication layer metadata. For example, the targets metadata for a
bundle file could look like this when listed in release.txt:

     "hashes": { ... },
     "length": 680,
     "custom": {
       "type": "bundle",
       "name": "tor-browser-stable",
       "OS": "win32",
       "version": "0.1",
       ...
     }

Format-wise, that could even be:

       "version": [0, 1],

Either way, this version information isn't authoritative (because the
release.txt file is signed by the release role, not the individual
bundle role) and the decision layer would need to verify that the
version information listed in the bundle file, once downloaded, is the same.
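
That cross-check could be as simple as the sketch below. It assumes,
purely for illustration, that the downloaded bundleinfo carries fields
with the same names as the "custom" entry; the real bundleinfo format
may well differ.

```python
def hint_matches_bundleinfo(custom, bundleinfo):
    """Compare the release-signed hint with the bundler-signed bundleinfo.

    Both arguments are plain dicts. The field names are assumptions
    made for this sketch.
    """
    for field in ("name", "OS", "version"):
        if field not in custom or field not in bundleinfo:
            return False
        if custom[field] != bundleinfo[field]:
            return False
    return True
```

If the check fails, the decision layer should treat the bundle as
unusable rather than trust either value.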

The Thandy spec should probably describe the allowed version formats and
how ordering is determined or refer to an existing Tor spec if that's
already described somewhere.
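
As one possible ordering rule, and not a description of what the Thandy
spec currently says, the list form of the version could be compared
element by element. The sketch deliberately refuses anything that is not
a list of non-negative integers, so strings such as "0.9.8m" would need
a separate, explicitly specified rule.

```python
def is_newer(candidate, installed):
    """True if `candidate` orders strictly after `installed`.

    Versions are sequences of non-negative ints, e.g. [0, 1].
    Comparison is lexicographic, so [0, 1, 0] counts as newer than [0, 1].
    """
    for version in (candidate, installed):
        if not all(isinstance(part, int) and part >= 0 for part in version):
            raise ValueError("unsupported version: %r" % (version,))
    return tuple(candidate) > tuple(installed)
```

Integer comparison avoids the classic string-ordering trap where "0.10"
sorts before "0.9".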

>>   * The decision layer notices a bundle version in the list that it
>>     wants and uses the authentication layer to retrieve the bundleinfo
>>     file for that version.
>>   * The decision layer reads the contents of the bundleinfo file which
>>     indicate the necessary package versions and any other info the
>>     decision layer needs.
>>   * The decision layer uses the authentication layer to retrieve the
>>     pkginfo files for each of the package versions that it wants.
>>   * The decision layer understands the contents of the pkginfo
>>     files. These files indicate the individual files that are part of
>>     this version of the package.
>>   * The decision layer uses the authentication layer to retrieve the
>>     individual files (e.g. /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll)
>>     that are needed.
>>   * The decision layer hands off the relevant installation instructions
>>     (from the bundleinfo and pkginfo files) and individual package files
>>     to the code that performs the installation/upgrade.
>> 2.4. bundleinfo and pkginfo
>> ---------------------------
>> As the contents of the bundleinfo and pkginfo are opaque to the
>> authentication layer, essentially there are two completely separate sets
>> of metadata in this design. It would make sense to have them use the
>> same format (e.g. Canonical JSON) and be parsed/generated by the same
>> code.
> This argues for three components, then: the two you described, plus a
> generic data-format layer that they both could use.

That makes sense.

> peace & happy new year,

Happy new year. Hopefully 2011 will be the year I actually help with Thandy.

I'm not sure if I've provided a clear enough idea of how this
split-layer design would work in order for people to form opinions about
it. At the moment, the way it looks to me is that this split-layer
approach increases complexity for what could be considered theoretical
and debatable future benefits. If a tested, used, and supported
download/authentication layer already existed in the form Thandy needed
it and it could be essentially dropped in as a library, the increased
design complexity might be offset by decreased development time and
increased reliability. Does this seem like a fair assessment?

I defer to those with more experience to decide whether the ideas and
remaining questions/concerns of the split-layer approach should be
fleshed out further, possibly into an actual proposal. Otherwise, I
should probably instead focus on addressing Nick's comments on the
simpler changes.