How to verify LibreSSL release files

2016-10-02 08:05:01 +0200

Like many other sites, waf.io is now using LibreSSL to provide the "s" (secure) in https. Since LibreSSL is still a relatively new project, few Linux distribution packages exist for it, so we are building it from the source code ourselves. Using OpenBSD directly might become an option for us once SNI is finally supported in HTTPD and binary patches make system updates easier (maybe in OpenBSD 6.1).

Although obtaining the source code is easy, both the LibreSSL project and download pages are served over http only, so the data can be tampered with during transfer. Do the LibreSSL developers not trust their own library sufficiently to dare to use it? And where are the download verification instructions?

The release files are at least signed with GPG, but the corresponding public key is only available over http. That key does not seem to be used to sign the project commits either. While signing commits does not by itself make a signing key valid, it increases the amount of data an attacker would have to tamper with, and thus increases the confidence in the usage of the signing key and in the state of the source tree. This is why most commits in the Waf source tree are signed with the release key, for instance:

$ git log --show-signature
commit 9ed7d41488a88935e1f6f5fccbce6397a8ac1fed
gpg: Signature made Thu 15 Sep 2016 09:36:02 PM CEST
gpg:                using RSA key 0x49B4C67C05277AAA
gpg: Good signature from "Thomas Nagy <noreply@waf.io>" [ultimate]
Primary key fingerprint: 8AF2 2DE5 A068 22E3 474F  3C70 49B4 C67C 0527 7AAA
Author: Thomas Nagy <noreply@waf.io>
Date:   Thu Sep 15 21:36:02 2016 +0200

    Expand '--foo=' with shell=False - Issue #1814
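
For reference, signing commits in this fashion only takes a couple of git settings; the key id below is the one shown in the log above:

$ git config user.signingkey 0x49B4C67C05277AAA
$ git config commit.gpgsign true   # sign every commit by default
$ git commit -S -m "A signed commit"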

Fortunately for LibreSSL, the release files are also co-signed with the new Signify tool from the OpenBSD project, and the corresponding public key is present in the project source tree on Github, so it can at least be obtained over https (it would be best if the GPG public key were also signed with Signify like the other release files, by the way).

The Signify manual page is way shorter than the GPG one, yet it fails to describe the expected file formats and the underlying algorithms that would make one want to trust it. It is also a pity that the newly-invented signature format prevents embedding signatures as comment lines in Python/Ruby/Perl scripts, that keeping signatures inside archive files is not supported (as jarsigner does for Jar files), and that chains of trust for verifying files are nonexistent (certificates).

The Signify application is also a bit difficult to find on Linux distributions as there is another, unrelated application named signify that generates random email signatures; on Debian, however, the package signify-openbsd can be installed directly. In the end I deemed the public key (RWQg/nutTVqCUVUw8OhyHt9n51IC8mdQRd1b93dOyVrwtIXmMI+dtGFe) sufficiently trustworthy and ran the following commands to fetch and verify the latest LibreSSL release:

$ sudo apt-get install signify-openbsd
$ wget http://ftp.openbsd.org/pub/OpenBSD/LibreSSL/libressl-2.5.0.tar.gz
$ wget http://ftp.openbsd.org/pub/OpenBSD/LibreSSL/SHA256.sig
$ wget https://raw.githubusercontent.com/libressl-portable/portable/master/libressl.pub
$ signify-openbsd -C -p libressl.pub -x SHA256.sig libressl-2.5.0.tar.gz
Signature Verified
libressl-2.5.0.tar.gz: OK
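
For those who prefer the GPG route despite the http-only key distribution, and assuming a detached .asc signature is published next to the tarball, the verification would look like this (with the signing key already imported):

$ wget http://ftp.openbsd.org/pub/OpenBSD/LibreSSL/libressl-2.5.0.tar.gz.asc
$ gpg --verify libressl-2.5.0.tar.gz.asc libressl-2.5.0.tar.gz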

Let us hope that the download and verification instructions will be easier to follow in the future.

Techniques for stripping binaries

2016-09-26 08:05:01 +0200

Programs and shared libraries can contain symbol information that increases the size of the data to redistribute and facilitates reverse-engineering. The linkers/compilers do not provide a way of stripping symbols while creating the files, so in practice the strip program needs to be executed on the resulting binary files. The build process must then place barriers to prevent usage of such binaries before they are ready.
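
As a reminder of what the manual operation looks like on Linux (libfoo.so is just a placeholder name here):

$ nm libfoo.so | head   # inspect the symbol table
$ strip libfoo.so       # remove the symbol table in place
$ nm libfoo.so
nm: libfoo.so: no symbols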

A fairly common approach consists in running strip only on the files that are installed. Overriding the method copy_fun on the installation class provides a coarse-grained way of achieving this. In the following example, the strip command is called on files that come from a link task:

import shutil, os
from waflib import Build, Context

def copy_fun(self, src, tgt):
    # perform the normal installation copy
    shutil.copy2(src, tgt)
    os.chmod(tgt, self.chmod)

    # strip the installed copy, but only if the file was created by a link task
    if getattr(self.generator, 'link_task', None):
        if self.generator.link_task.outputs[0] in self.inputs:
            self.generator.bld.cmd_and_log('strip %s' % tgt, quiet=Context.BOTH)

# override the default installation behaviour
Build.inst.copy_fun = copy_fun

If stripping is required during the build phase instead, then the build order must be set so that dependent tasks wait for the binaries to be ready. A reliable implementation may be difficult to achieve for partial builds if the strip operation is modeled as a Task object, as the task does not know the full list of downstream dependencies beforehand. The following hack provides a rather easy way to force a particular order though. In the following example, both inputs and outputs of the strip task are set to the same file:

from waflib import TaskGen

@TaskGen.feature('cshlib', 'cxxshlib', 'cprogram', 'cxxprogram')
@TaskGen.after('apply_link')
def add_strip_task(self):
    if getattr(self, 'link_task', None):
        exe_node = self.link_task.outputs[0]
        # special case: same inputs and outputs for a task!
        strip_task = self.create_task('strip', exe_node, exe_node)
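
The snippet above assumes that a task class named strip has been registered somewhere; a minimal sketch, assuming STRIP is defined in the configuration environment, could be:

from waflib import Task

class strip(Task.Task):
    # run the strip program on the input file
    run_str = '${STRIP} ${SRC[0].abspath()}'
    color   = 'BLUE'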

The main drawback of this solution is that a deadlock will be observed if several post-processing operations are declared for the same file. Besides that, the strip task object is not really necessary in the first place; removing it can significantly reduce the amount of console logs for a whole build.

My favorite approach consists in chaining the run methods through inheritance. In the example below, the function wrap_compiled_task creates subclasses with the same name as the original class. A Python metaclass bound to the parent Task class translates the run_str attribute into a run method, so that a long Python function does not need to be written down. That metaclass also registers the last class created, so that cls3 replaces cls1 as the default in Task.classes[classname]. One must be careful to load the C/C++/Fortran Waf tools first, else the code will not find any classes to subclass:

from waflib import Task

def wrap_compiled_task(classname):
    # create subclasses and override the method 'run'
    cls1 = Task.classes[classname]
    # the Task metaclass turns run_str into a 'run' method on cls2
    cls2 = type(classname, (cls1,), {'run_str': '${STRIP} ${TGT[0].abspath()}'})
    # cls3 is created last, so it replaces cls1 in Task.classes[classname]
    cls3 = type(classname, (cls2,), {})

    def run_all(self):
        if self.env.NO_STRIPPING:
            return cls1.run(self)
        ret = cls1.run(self)   # compile/link first
        if ret:
            return ret
        return cls2.run(self)  # then strip the result
    cls3.run = run_all

for k in 'cprogram cshlib cxxprogram cxxshlib fcprogram fcshlib'.split():
    if k in Task.classes:
        wrap_compiled_task(k)
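
Both variants above expect the STRIP variable to be present in the configuration set; a typical way of providing it would be:

def configure(conf):
    conf.load('compiler_c')
    conf.find_program('strip', var='STRIP')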

The techniques described above are not exclusively for stripping binaries though; they can be precious in situations where built files need to be modified after a compiler was executed. The complete examples for this post can be found in the following folder.

How to build static libraries from shared objects

2016-09-25 18:05:01 +0200

It is sometimes necessary to build and redistribute static libraries that are meant for future use in shared libraries. On Linux, this usually means passing -fPIC to the compiler. The first attempt at writing a portable wscript file typically resembles the following:

def build(bld):
    bld.stlib(source='x.c', target='foo', use='cshlib')
    bld.program(source='main.c', target='app', use='foo')

The problem with this approach is that both the -fPIC and -shared flags are then propagated to all downstream targets, with the annoying side effect of silently turning programs into shared libraries. Flag propagation can be easily stopped by replacing use with uselib:

def build(bld):
    bld.stlib(source='x.c', target='foo', uselib='cshlib')
    bld.program(source='main.c', target='app', use='foo')

While this works, the code above is more difficult to understand as the uselib keyword is less frequently used. The best approach may be to declare the flags explicitly. In this case it is sufficient to specify the cflags (cxxflags in C++ or fcflags in Fortran):

def build(bld):
    bld.stlib(source='x.c', target='foo', cflags=bld.env.CFLAGS_cshlib)
    bld.program(source='main.c', target='app', use='foo')

Five useful command-lines

2015-07-15 08:05:01 +0200

Command-line tools such as ls can have almost too many options. Fortunately, flags can be aggregated instead of being passed one by one. For example, a long command such as ls -l -r -t can be shortened to ls -lrt. And as repeated letters usually have no side effects, amusing words can be formed, such as netstat -satan.

Here are five useful and easy-to-memorize command lines that have been used for the creation of waf.io:

diff -burN    # compare two files or folders, ignoring whitespace
du -sm *      # compute folder/file sizes, in megabytes
ls -lart      # display a detailed file listing, ordered by time
gpg -bass     # sign a file with gpg
netstat -evil # similar to ifconfig -a

Stop blaming the Chinese, the web is broken

2015-04-03 00:05:00 +0200

Github was under attack at the very time we were deciding to move the Waf project to it (informal poll results). The source of this particular attack is thought to be in China, and such "cyber" problems are gradually taking on a political dimension.

Yet, one may wonder why such attacks can happen in the first place. Why is the web so fragile? Some suggest encrypting traffic as a measure to mitigate the problem, but such a mitigation would only work if virtually all sites and all users used https.

A much more realistic solution for this particular issue would be for web browsers to disallow background requests to external sites unless the URL is explicitly meant for that purpose. For example, if such requests were only permitted to domain names containing keywords such as "tracker" (tracker.site.domain) or "api" (api.site.domain), then such attacks would be prevented by design. The only drawback would be for advertisers, as their requests would become a little too easy to filter. We can bet that the idea will be rejected for backward-compatibility reasons.

The Waf site is using https at least, and the new padlock in the URL bar definitely looks nice. The confidentiality benefits are actually lower than we would like to believe though. Certificate authorities have been found providing certificates to impersonate sites for a long time. Sometimes the matter becomes apparent, and fingers start pointing at people again.

Now, such basic impersonation would be easy to detect if, for example, site fingerprints were readily accessible instead of being buried behind layers of GUIs in web browsers. Certificate authorities would then be less tempted to allow fake certificates to be created.

But even with easier access to the fingerprints and even with certificate pinning, the whole certificate system should be considered broken at its very core: trust only works when several pairs of eyes are involved. It is well known that sensitive operations in banks require two people to open a secured door, for example. This principle has been rediscovered in airplanes as well; they may require two people in the cockpit at all times.

A more robust scheme for the whole web would be to have site certificates signed by several authorities. For example, site certificates could be signed by one (or two) of the current certificate authorities (Comodo, Verisign, etc.) and by another key whose certificate is published at the registrar level. Everyone would then obtain this second authority through the whois information (elliptic-curve cryptography does not require the keys to be very long). This second certificate could be used to sign DNS records as well: DNSSEC is not widely used yet, and no one actually seems to care.

Yet, web browsers seem to do their best to keep the web as insecure as possible, focusing on features that add little for security while remaining essentially insecure and still featuring classes of bugs that we thought long gone.

It is high time we stopped blaming the Chinese or the Russians: our systems are broken and we must redesign them.