Re: [tor-dev] gitian replacement proposal



Hi,

On Mon, 13 Jan 2014, Georg Koppen wrote:

> Hi,
> 
> Nicolas Vigier:
> > Hello,
> > 
> > You can find at this URL a proposal to refactor the tor browser bundle
> > build process, using an other tool to replace gitian:
> > https://people.torproject.org/~boklm/automation/tor-automation-proposals.html#build-tool
> > (also added as attached file to this email)
> 
> it seems to be much more than a proposal for refactoring the TBB build
> process as building packages of components is not relevant in the
> latter. So, I won't comment on the details but like to get consensus on
> the big picture first. Taking the main improvements you listed below as
> a starting point seems therefore fine to me:
> 
> > The main improvements in this prototype from the current build process
> > are:
> > 
> > - all components are built separately, and include in their output file
> >   name the commit hash or version, architecture and OS (if architecture
> >   dependent). This allows us to keep previous builds if the
> >   commit/version/architecture/OS didn't change. So we can rebuild a
> >   bundle very quickly when the browser didn't change.
> 
> Yes, we start with #10120 (which I'd like to start working on in the
> next weeks). But avoiding to rebuild other parts of the bundle
> (torbutton etc.) should be easily doable as well (but keep the
> starting/updating/stopping-the-VM-overhead in mind).
> 
> > - the gitian replacement has features to download tarballs and verify them
> >   with sha256sum or gpg signature, so this can replace the fetch-inputs.sh
> >   script.
> 
> Yes, but we already have fetch-inputs.sh. So the advantage of burps
> seems not to be so big here.

The advantage is that it is simpler, and better integrated with the
rest of the tool.

For instance, the Python source tarball is currently defined across the
following files:
 - versions: version and download URL
 - fetch-inputs.sh: gpg signature verification, and creation of a
   symlink Python-$VERSION.tar.bz2 -> python.tar.bz2
 - descriptors/linux/gitian-firefox.yml: python.tar.bz2 is defined in
   the list of files to be copied inside the build VM

In burps the equivalent can be defined in one place, in the file
projects/python/config, with the following lines:

  version: 2.7.5
  input_files:
    - name: python
      filename: 'Python-[% c("version") %].tar.bz2'
      URL: 'http://www.python.org/ftp/python/[% c("version") %]/[% c("filename") %]'
      file_gpg_id: 1
      sig_ext: asc

In this definition we have:
 - the name of the file that should be copied to the build VM (filename)
 - the URL to download the file if it is missing (URL)
 - the 'file_gpg_id' option to indicate that a gpg signature file should
   be downloaded too, and used to verify the file (using keyring python.gpg)
 - no symlink needed. I think it is needed in the gitian build process
   because there is no easy way to access the python version defined in
   the versions file from the gitian descriptor, so a symlink is created
   to avoid updating the filename in the gitian descriptor each time
   the version changes. In burps we can access this filename, so we don't
   need a symlink.
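
To make the substitution above concrete, here is a minimal Python sketch
of how the filename and URL values could be resolved. This is not how
burps is implemented (burps uses a real template system); the expand()
helper and its regular expression are hypothetical and only handle the
simple c("name") lookups used in this example.

```python
import re

# Hypothetical stand-in for the template engine: it only understands
# placeholders of the form [% c("name") %].
PLACEHOLDER = re.compile(r'\[%\s*c\("([^"]+)"\)\s*%\]')


def expand(template, values):
    """Replace every [% c("name") %] with values["name"]."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


values = {"version": "2.7.5"}
# filename only depends on version, so it is resolved first.
values["filename"] = expand('Python-[% c("version") %].tar.bz2', values)
# URL refers to both version and the already resolved filename.
values["URL"] = expand(
    'http://www.python.org/ftp/python/[% c("version") %]/[% c("filename") %]',
    values,
)

print(values["filename"])
print(values["URL"])
```

Changing the version value is then the only edit needed for a new
release, which is why no symlink is required.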

So I think this is simpler. Another advantage is that the files
are only downloaded if they are needed: if we make the python
build optional, and we run a build that doesn't need it, the tarball
won't be downloaded.
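
As a rough illustration of "download only what is needed", the Python
sketch below walks the dependencies of the project being built and
collects only the input files it reaches. The project names, the "deps"
key and the file names are placeholders made up for this example; they
are not taken from the prototype's configuration.

```python
def needed_input_files(projects, root):
    """Collect the input files of root and of everything it depends on."""
    seen = set()
    files = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        project = projects[name]
        files.update(project.get("input_files", []))
        stack.extend(project.get("deps", []))
    return sorted(files)


# Hypothetical project graph: one bundle needs python, the other does not.
projects = {
    "bundle-with-python": {"deps": ["python", "libfoo"]},
    "bundle-without-python": {"deps": ["libfoo"]},
    "python": {"input_files": ["Python-2.7.5.tar.bz2"]},
    "libfoo": {"input_files": ["libfoo.tar.gz"]},
}

with_python = needed_input_files(projects, "bundle-with-python")
without_python = needed_input_files(projects, "bundle-without-python")
print(with_python)
print(without_python)
```

Only the files in the returned list would have to be fetched and
verified, so a build that skips python never touches its tarball.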

> 
> > - we can remove the linux/windows/macosx descriptors duplication, and
> >   instead use template directives for the parts that differ between
> >   those builds (it's still possible to use separate files if they differ
> >   completely).
> 
> Yes, that is good. Although I am not sure how much this buys us in a
> full-fledged gitian-like setup. And I guess the gitian people would be
> happy to take patches. :)

The prototype I have made does not support Windows and Mac OS X builds
yet, but I have looked at how it can be implemented.

Instead of having 3 separate descriptor files:
 gitian/descriptors/linux/gitian-tor.yml
 gitian/descriptors/windows/gitian-tor.yml
 gitian/descriptors/mac/gitian-tor.yml

We have only one, but we make changes like this for the parts that
need to differ between Linux / Windows / Mac OS X builds:

  diff --git a/burps.conf b/burps.conf
  index 60d2868be16f..8941b0c7ed94 100644
  --- a/burps.conf
  +++ b/burps.conf
  @@ -36,3 +36,9 @@ targets:
     include_pt:
       var:
          include_pt: 1
  +  win32:
  +    var:
  +       crosscompile_host: i686-w64-mingw32
  +  osx32:
  +    var:
  +       crosscompile_host: i686-apple-darwin11
  diff --git a/projects/tor/build b/projects/tor/build
  index b8cd9f805922..42ac7cf9c67b 100644
  --- a/projects/tor/build
  +++ b/projects/tor/build
  @@ -12,6 +12,9 @@ mkdir "$INSTDIR"
   ./autogen.sh
   [% c('var/touch_directory', { directory => '.' }) %]
   ./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
  +	    [% IF c('var/crosscompile_host') -%]
  +		--host=[% c('var/crosscompile_host') -%]
  +	    [%- END -%]
   	    --prefix="$INSTDIR"
   make -l[% c('var/max_load') %] -j
   make install

We can then see the different build scripts with:

  $ burps showconf tor build --target dev | grep -A1 configure
  ./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
                      --prefix="$INSTDIR"
  $ burps showconf tor build --target dev --target win32 | grep -A1 configure
  ./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
                          --host=i686-w64-mingw32     --prefix="$INSTDIR"

If in the future we want to build for Windows 64 or Mac OS X 64, we can
easily add new targets in burps.conf that set a different value for the
option var/crosscompile_host.
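
The effect of selecting targets can be modelled with a few lines of
Python. This is only a sketch of the idea, not burps code: the
resolve_vars() and configure_line() helpers are hypothetical, the rule
that later targets override earlier ones is an assumption, and the
win64 target (with the mingw-w64 triplet x86_64-w64-mingw32) is an
example of a target that could be added, not one that exists in the
prototype.

```python
# Illustrative model only: mirrors the burps.conf excerpt from the diff,
# plus a hypothetical win64 target that is not in the prototype.
CONF = {
    "targets": {
        "include_pt": {"var": {"include_pt": 1}},
        "win32": {"var": {"crosscompile_host": "i686-w64-mingw32"}},
        "osx32": {"var": {"crosscompile_host": "i686-apple-darwin11"}},
        "win64": {"var": {"crosscompile_host": "x86_64-w64-mingw32"}},
    }
}


def resolve_vars(conf, selected_targets):
    """Merge the var sections of the selected targets (later ones win)."""
    merged = {}
    for name in selected_targets:
        merged.update(conf["targets"].get(name, {}).get("var", {}))
    return merged


def configure_line(variables):
    """Build the configure command, adding --host only when it is set."""
    args = [
        "./configure",
        "--disable-asciidoc",
        '--with-libevent-dir="$rootdir/libevent"',
    ]
    host = variables.get("crosscompile_host")
    if host:
        args.append("--host=" + host)
    args.append('--prefix="$INSTDIR"')
    return " ".join(args)


native = configure_line(resolve_vars(CONF, []))
win32 = configure_line(resolve_vars(CONF, ["win32"]))
win64 = configure_line(resolve_vars(CONF, ["win64"]))
print(native)
print(win32)
print(win64)
```

The build script itself stays the same for every platform; only the
variables selected on the command line change.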

If we want to do the same using gitian, I think the following features
are currently missing:

- possibility to use some templating instructions in the script
  definition. Or some other way to access the info about which OS we
  should build for, from the script.

- possibility to define some options in a common configuration file so
  we can avoid duplicating them in all descriptors.

- possibility to select some OS target on the command line

> 
> > - we can define variables based on selected OS. This allows for instance
> >   to build python 2.7 when building on Ubuntu Lucid, but avoid building
> >   it on other distros which already provide python 2.7.
> 
> Well, building Python already ourselves is actually a feature as we are
> no longer dependent on some Python package. In the very loooooooooong
> run the aim is to build everything ourselves. A better example might
> therefore be building Binutils which we only need for Windows. But that
> boils basically down to #10120 again (we might even be able to be so
> smart to build the platform dependent tools only if the user really
> wants to build for that particular platform: there is no need to build
> Binutils if the user only wants to build Linux bundles).
> 
> > - we can define variables based on "targets" that are set on command
> >   line. For instance in the prototype using "--target enable_pt" instructs
> >   to include the pluggable transports (only pyptlib in this prototype) in
> >   the bundle.
> 
> The pluggable transports are supposed to get included into the stable TBB
> rather sooner than later. Thus, that feature is not needed either.

Maybe that won't be needed for pluggable transports, but we can imagine
having in the future other types of bundles with experimental components
that we only want to have in a separate bundle.

Another use is to enable or disable the build with the randomized
readdir library that was discussed before.

https://lists.torproject.org/pipermail/tor-dev/2013-December/005925.html

> 
> > - we can easily switch from building in a VM to building locally
> 
> True, but I am not sure why that is a feature compared to Gitian as we
> need the VM for creating reproducible builds. Thus, this one does not
> count here as an improvement IMO.

I think it can be useful to be able to easily disable the use of a VM
in some cases:

- we want to do an experimental build with clang instead of GCC, or with
  a different version of GCC / glibc, to understand whether a problem is
  caused by the Ubuntu GCC / glibc version. An easy way to disable the use
  of an Ubuntu VM would allow that.

- in the future, if we're building the whole toolchain ourselves, we will
  want to check that building on Ubuntu and on other distros produces the
  same result.
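
Checking that two builds produce the same result comes down to comparing
checksums of the output files. Here is a small Python sketch of that
check, using only the standard library; the file names and contents are
dummy data created for the demonstration, and the helper names are made
up for this example.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(path_a, path_b):
    """True when both build outputs are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)


# Demonstration with dummy files standing in for real bundles.
with tempfile.TemporaryDirectory() as workdir:
    ubuntu = os.path.join(workdir, "bundle-ubuntu.tar")
    other = os.path.join(workdir, "bundle-other-distro.tar")
    changed = os.path.join(workdir, "bundle-changed.tar")
    for path, content in ((ubuntu, b"same bytes"),
                          (other, b"same bytes"),
                          (changed, b"different bytes")):
        with open(path, "wb") as handle:
            handle.write(content)
    identical = builds_match(ubuntu, other)
    different = builds_match(ubuntu, changed)

print(identical, different)
```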

> 
> > - build descriptors can depend on the result of another build descriptor.
> >   This remove needs for scripts like mkbundle-*.sh.
> 
> Good idea. And looking at
> https://github.com/bitcoin/bitcoin/tree/master/contrib/gitian-descriptors the
> Bitcoin people might be interested in this as one.
> 
> > And I think those improvements should make it easier to rebuild a new
> > bundle automatically when any of the components of the bundle receives
> > a new commit, and then run tests on this bundle.
> 
> I understand why the first improvement makes rebuilding the bundles
> easier. But why does that hold for the other features as well?
> 
> To get the discussion properly started I think we should ask
> additionally why there has to be a new tool for building TBB. Why not
> improving Gitian? Is it broken beyond repair? Others using Gitian could
> benefit as well and it would save maintenance costs (due to creating yet
> another tool doing similar things *and* maintaining that one etc.). As
> far as I can see none of your improvements is so specific to your tool
> that they can't get included into Gitian. This point is especially worth
> considering as you don't want to get rid of Gitian's functionality
> entirely but only of Gitian for driving the build process if I
> understood that correctly. All those tricky things concerning VM
> creation/handling are kept (and improved :) )

Yes, improving Gitian to have similar features would be another
solution. However, I think it would require significant changes in how
Gitian works, and reimplementing the features that are already
available in burps would be more work. But it should be possible.

If you're wondering why I did not improve Gitian instead of creating
burps, I can explain that. The main reason is that initially I did not
intend to make a Gitian replacement. I only wanted a tool that would
allow me to automate the creation of an rpm package for software that
has its sources in a git repository: git clone/fetch a repository, make a
tarball from a selected commit, create an rpm spec file from a template,
and run rpmbuild to generate the package. I tried to do that in a generic
way so it could be extended to support Debian and other types of packaging.
I'm not running Debian but I wanted to be able to make Debian packages,
so I added an option to build inside a VM/chroot. I also added other
options to make it easy to configure and extend.

Later I looked at Gitian more closely, and realized that it was doing
something quite similar to the tool I had been making, but much more
limited. So I started wondering if the tool I made could be used to
build tor browser bundles and quickly made a prototype to check that it
was possible, and see what it would look like and whether it would make
sense to do that.

Now I think burps has all gitian features, with some improvements, and
is much easier to extend with its configuration system and use of
templates. So I think that would be a good change.

> 
> "- creation and start/stop of the Ubuntu build VMs. We can keep the
>   gitian scripts for that, and improve them later."
> 
> So, from my current understanding I tend to think there should be a
> couple of bugs get filed against gitian-builder (and that are good bugs
> you point out, I think!). They should get fixed then and upstreamed.

I think there are two different ways to do it:

- Improve gitian to add all missing features. I think this is difficult
  to do while keeping compatibility with previous versions of gitian, so
  I don't know if upstream will accept the patches. The final result
  will be something similar to the burps-based build system.

- I continue working on a burps-based prototype, and have it rebuilt
  automatically by a Jenkins instance. When this prototype is able to
  produce the same bundles as the gitian ones, we use it for the next
  releases and stop updating the gitian-based one.

I'm not sure that it is easy to make those changes incrementally. So my
favorite solution is the second one.

> 
> That said, maybe having the whole packaging in the same tool as well
> changes things, I don't know (it might be worth thinking about the
> additional complexity due to burps being a packaging tool, too: e.g.
> does the lsb_release/release + lsb_release/id combination not matter for
> TBBs). But that is probably a different discussion (or is it not?).

The lsb_release/* is just a way to identify the distribution used for
the build. If you don't want the build to work differently depending on
the distribution, you can ignore that.

Having the packaging in the same tool is also interesting, I think. In
gitian the build script is defined in the 'script' option inside the
descriptor. In burps the equivalent of that is the 'build' option. But
in the same descriptor file (or in files included from it), we can also
have an rpm spec file and Debian package files. If later we want to
create docker images (www.docker.io), we can easily add docker files
too.
