Bundled dependencies

This page collects information on dependency bundling and static linking, so that upstream developers can be referred to it instead of having the same points explained repeatedly by e-mail.

When is code bundled?

Code is considered bundled in a piece of software if any of the following conditions occur:

  • Statically linking against a system library
  • Shipping and using your own copy of a library
  • Including and (unconditionally) using snippets of code copied from a library

In other words, code bundling occurs whenever a program or library ends up containing code that does not belong to it.

Temptations

Bundling dependencies and static linking happen for a reason: they offer certain benefits. To counter bundling, it is important to understand why it appeals to some upstream projects.

Comforting non-Linux users

Especially on Windows, shipping dependencies can be a favour to end users: it saves them from having to install dependencies or additional libraries manually. Without a package manager, there is no real alternative on Windows anyway.

When code is already bundled on Windows, it is tempting to bundle on GNU/Linux too: it feels consistent and fits together nicely in the mind of the software author.

Easing up adoption despite odd dependencies

If a software package foomatic has some dependency libbar that is not yet packaged for major distributions, libbar makes it harder for foomatic to be packaged: the new maintainer of foomatic has to either package libbar themselves or wait for someone else to package it.

Bundling libbar hides the dependency on libbar in a way: if the packager is not paying close attention, foomatic may even get into the distribution with the bundled dependency still in place. (It is, however, only a matter of time until someone notices the bundling.)

Private forks

If foomatic uses a library libbar, the developers of foomatic may wish to make some changes to libbar, for example to add a new feature, modify the API, or change the default behavior. If the developers of libbar for whatever reason are opposed to these changes, the developers of foomatic may want to fork libbar.

But publishing and properly maintaining a fork takes time and effort, so the developers of foomatic could be tempted to take the easy road, bundle their patched version of libbar with foomatic, and maybe occasionally update it for upstream libbar changes.

Problems

Why are bundling dependencies and static linking bad after all?

Security implications

Consider the perspective of a baz maintainer where baz uses libbar.

Now, a critical security flaw has been found in libbar (say, a remote privilege escalation). The problem is serious enough that the developers of libbar release a fixed version right away, and distributions package it quickly to keep the window in which users' systems can be broken into as small as possible.

If a particular distribution has an efficient security upgrade system, the patched library can reach users in less than 24 hours. But that is of no use to baz users, who will still be running the earlier, vulnerable library.

Now, depending on how bad things are:

  • If baz is statically linked against libbar, then either the users have to rebuild baz themselves to make it use the fixed library, or distribution developers have to make a new package for baz and make sure it reaches user systems along with libbar (assuming they are aware that the package is statically linked)
  • If baz bundles a local copy of libbar, then users have to wait until the baz maintainer discovers the vulnerability, updates the bundled libbar sources, and releases a new version, and until distributions package that new version

In the meantime, users probably won't even know they are running a vulnerable application, simply because they won't know that a vulnerable library is statically linked into the executables.

Examples:

Note further that in general, nobody investigates whether vulnerabilities impact old versions of packages, which means a stale bundled copy in a project may be vulnerable without any attention being paid to it. Further, MITRE does not issue CVEs for bundled dependencies.

Waste of hardware resources

Say a media player bundles the libvorbis library. If libvorbis is also installed system-wide, the two copies of libvorbis:

  1. occupy twice as much space on disk
  2. occupy (up to) twice as much RAM (of the page cache)

Waste of development time downstream

Due to the consequences of bundled dependencies, many hours of downstream developer time are wasted that could have been put to more useful work.

Potential for symbol collisions

If a program foomatic uses a system-installed library A and also uses another library B which bundles library A, there is a potential for symbol collisions.

This means that foomatic might use an interface such as my_function(), and the my_function() symbol would be present both in A and in the version of A bundled inside library B.

If the system-installed copy of A and the copy of A compiled into library B come from different releases of library A, then my_function() might behave differently in each copy.

The program foomatic was compiled against the system-installed copy of A. If, for whatever reason, foomatic ends up using the my_function() interface from the version of A bundled in library B instead of the one in the system-installed copy, this can result in crashes or strange, unpredictable behavior.

This sort of problem can be prevented if library B restricts symbol visibility when it links against library A, so that library B does not export library A's interfaces.
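As a rough illustration of the collision described above, the following Python sketch reports symbol names that more than one loaded object exports. The function name find_duplicate_symbols and the export lists are hypothetical; how the lists are obtained from real shared objects is outside the scope of this sketch.

```python
from collections import defaultdict


def find_duplicate_symbols(exports: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return every symbol exported by more than one object.

    `exports` maps an object name to the set of symbol names it exports.
    The result maps each duplicated symbol to the sorted list of objects
    that export it.
    """
    providers = defaultdict(list)
    for obj, symbols in exports.items():
        for symbol in symbols:
            providers[symbol].append(obj)
    return {
        symbol: sorted(objs)
        for symbol, objs in sorted(providers.items())
        if len(objs) > 1
    }


# Hypothetical export lists: library B bundles A and re-exports A's interfaces.
leaky = {
    "libA.so": {"my_function", "a_init"},
    "libB.so": {"b_run", "my_function", "a_init"},
}

# Same situation, but library B keeps the bundled copy of A hidden.
hidden = {
    "libA.so": {"my_function", "a_init"},
    "libB.so": {"b_run"},
}

print(find_duplicate_symbols(leaky))
print(find_duplicate_symbols(hidden))
```

In the first case both my_function and a_init are reported as exported twice, which is the situation in which foomatic may bind to the wrong copy. In the second case nothing is reported, because library B no longer exports library A's interfaces.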

Examples:

Possibly broken at runtime

It is uncommon for bundled dependencies to be wired up in the build system such that their testsuite is built and executed as part of the parent package.

Missing testsuites mean that real incompatibilities with newer toolchains, dependencies, or changes in the consuming application can go unnoticed.

Meson subprojects mitigate this by running the testsuite of any bundled dependencies ('subprojects') by default.

Downstream consequences

When a bundled dependency is discovered downstream, this has a number of bad consequences.

Analysis

Suppose there is a copy of libvorbis bundled with a media player. Which version is it? Has it been modified?

Separating forks from copies

Before the bundled dependency can be replaced by the system-wide installation, one must know whether it has been modified: is it a fork?

If it is a fork, it may or may not be possible to replace it without breaking something.

That's something to find out: more time wasted. If the code says which version it is, we at least know what to run diff against, but that is not always the case.
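When the version is known, the comparison can be automated. The following is a minimal Python sketch, assuming the matching upstream release has already been unpacked next to the bundled copy; the function names classify_bundled_tree and looks_like_fork are made up for this illustration.

```python
import filecmp
from pathlib import Path


def _relative_files(root: Path) -> set[str]:
    """Collect all regular files below `root` as POSIX-style relative paths."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def classify_bundled_tree(bundled: Path, pristine: Path) -> dict[str, list[str]]:
    """Compare a bundled copy against a pristine upstream tree, file by file."""
    report = {"identical": [], "modified": [], "only_bundled": [], "only_pristine": []}
    bundled_files = _relative_files(bundled)
    pristine_files = _relative_files(pristine)
    for rel in sorted(bundled_files | pristine_files):
        if rel not in pristine_files:
            report["only_bundled"].append(rel)
        elif rel not in bundled_files:
            report["only_pristine"].append(rel)
        # shallow=False forces a byte-by-byte content comparison.
        elif filecmp.cmp(bundled / rel, pristine / rel, shallow=False):
            report["identical"].append(rel)
        else:
            report["modified"].append(rel)
    return report


def looks_like_fork(report: dict[str, list[str]]) -> bool:
    """Heuristic (an assumption of this sketch): changed or added files hint at a fork."""
    return bool(report["modified"] or report["only_bundled"])
```

Files that exist only in the pristine tree are deliberately not counted as a sign of a fork, since bundled copies often leave out documentation or tests. A modified file still needs a human to read the diff: the sketch only narrows down where to look.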

Determining versions

If a bundled dependency doesn't state its version, one has to find it out somehow. Mailing upstream could work; comparing against the contents of a number of release tarballs may work too. Lots of opportunities to waste time.
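The tarball comparison can be sketched in Python as follows, assuming several upstream releases have been unpacked into one directory each. The names tree_digests and rank_candidate_versions, and the idea of scoring by the fraction of unchanged files, are assumptions of this sketch rather than an established tool.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def rank_candidate_versions(
    bundled: Path, candidates: dict[str, Path]
) -> list[tuple[str, float]]:
    """Score each candidate release by the fraction of bundled files it
    contains unchanged at the same relative path, best match first."""
    bundled_digests = tree_digests(bundled)
    scores = []
    for version, root in candidates.items():
        upstream = tree_digests(root)
        matches = sum(
            1 for rel, digest in bundled_digests.items() if upstream.get(rel) == digest
        )
        score = matches / len(bundled_digests) if bundled_digests else 0.0
        scores.append((version, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A score of 1.0 for some release suggests an unmodified copy of that release. If the best score is below 1.0, the copy is either based on a release that was not among the candidates or has been modified, and the best-scoring release is the natural one to run diff against.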

Patching

Once it is clear that a bundled dependency can be ripped out, a patch is written, applied, and tested (more waste of time). If upstream is willing to co-operate, the patch may be dropped later. If not, the patch will need porting to each new version downstream.

What to do upstream

  • Remove bundled dependency:

    Ideally, remove the bundled dependency and allow compilation against libbar from either a system-wide installation or a local one at any user-defined location.

    That gives flexibility to users on systems without libbar packaged and makes it easy to compile against the system copy downstream: cool!

  • Keep bundled dependency: make its use completely optional:

    With a build time option to disable use of the bundled dependency, it is possible to bypass it downstream without patching: nice!

    When keeping a dependency libbar bundled, follow the upstream of libbar closely and update your copy to a recent version of libbar on every minor (and major) release, to at least reduce the damage done to people using your bundled version.

    Clearly document whether a bundled dependency is a fork or an unmodified copy, and which version of the bundled software it is.

    Use a Meson subproject which allows clean fallback from a system copy to a local bundled one.