2008-06-05

Linux package differences

I put a long comment under the article Ubuntu 7.10 to PCLinuxOS 2008. Here's a copy of it:

It's not so easy to create a unified, centralized repository for all Linux distros as you might think.

The distros exist because all these people have different opinions on how a Linux system should work. I'd say this is a real-time experiment, a Darwinian effort to see which approach works best. This is the price of innovation. The alternative is stagnation, and I don't think any Linux user would like it to stagnate.

Reasons why it's hard to create universal packages:
1) Improvements/differences in the GCC compiler and the so-called "toolchain".
2) Different init systems and service management.
3) Different config file locations and syntax.
4) Different package formats and package management systems.

Explanations:

1) All binary programs that constitute a working system have to be compiled (translated from human-readable form into a form suitable for the CPU to execute). In the case of Linux (the kernel, the basic GNU tools and other useful programs, be it the X Window System, GNOME/KDE/XFCE/etc, Firefox, K3B, and so on), this is done by the GCC suite of compilers and a set of tools such as automake, linkers, loaders, binutils and so on.

These tools are not set in stone: they are improved, changed and restructured over time, not to mention the occasional bug fix. Sometimes these changes are INCOMPATIBLE, like the move of GCC from version 2.95 to the 3.x line and then to the 4.x series. The same goes for the accompanying tools. They all change as their requirements change (according to users' demands).

So when a particular distro or package developer decides to use a specific set of tools and libraries, with their specific versions and behaviour, these decisions can be a source of incompatibility across distros and package maintainers. For example, an mplayer package from distro A may *NOT* work at all in distro B because of so-called ABI (Application Binary Interface) differences. Compatibility is maintained at the source level instead. It's the distro maintainers who worry about this: they just recompile a problematic package and publish a new version.

2) Even when the package is compatible at the binary level (which is true for most packages and distros), distro creators may have different views on how their systems should work internally and how they handle system services. I mean the locations of service scripts (/etc/rc.d versus /etc/rc0.d versus /etc/init.d), the locations (/etc/default versus /etc/conf.d) and structure of their configuration files, where the services keep their working directories, and so on.
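To see which of these conventions a given system follows, one can simply check which directories exist. The following is only a rough sketch: it looks for the paths named above and nothing else, and the function names and the `root` parameter are made up for illustration. It does not try to guess the distro.

```python
from pathlib import Path

# Paths taken from the paragraph above. Which distro uses which one
# is deliberately not encoded here.
SERVICE_SCRIPT_DIRS = ["/etc/rc.d", "/etc/rc0.d", "/etc/init.d"]
SERVICE_CONFIG_DIRS = ["/etc/default", "/etc/conf.d"]


def existing_dirs(candidates, root="/"):
    """Return the candidate directories that exist under root."""
    base = Path(root)
    return [c for c in candidates if (base / c.lstrip("/")).is_dir()]


def describe_layout(root="/"):
    """Report which of the known service locations are present."""
    return {
        "service_scripts": existing_dirs(SERVICE_SCRIPT_DIRS, root),
        "service_configs": existing_dirs(SERVICE_CONFIG_DIRS, root),
    }


if __name__ == "__main__":
    for kind, found in describe_layout().items():
        print(kind, "->", found or "none of the known locations")
```

A package that hardcodes just one of these locations will only fit the distros that happen to share it.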

This may not apply to end-user applications like a web browser, a multimedia player or a CD/DVD burning application. But for things like cups (printing service) or apache (web server) it is a serious concern. Each distro may offer a unique approach here, because people disagree on the details. So it may happen that a specific package from distro A will not work in distro B or may even break its installation. That's why users are advised *NOT* to install foreign packages on their own without thinking about the consequences.

Compiling a package manually under a given distro typically yields a working package that's tailored to this distro (and honestly it's not as hard as one might imagine). But from then on it is the user who is responsible for maintaining such a package and upgrading it when bugs are found.

3) Sometimes even the basic structure of config files may be radically different. In my experience the network configuration varies wildly between distros (e.g. Debian, Mandriva, RedHat and Gentoo do it completely differently). Apache configuration files are also well known for being split up differently and placed in different directories under different names across distros. All this just because people have different opinions and visions of how it should be done.

[Note: this section has been added here] Also, KDE installations may vary between distros: there can be different directory names, a different number of config files, different application launchers and so on. These are parameters that can be set at compile time. So when people pick different values here, incompatibilities arise. For instance, Debian/(K)Ubuntu and PCLinuxOS do it completely differently.

4) Package formats such as DEB, RPM, TGZ, .recipe and .ebuild are different enough to be incompatible. Some packages contain little programs (maintainer scripts) that are run at the beginning of the installation or after the package is installed.

Each package format has a different way of describing dependencies (which other packages need to be installed before a given package will run correctly). Even RPM or DEB packages from different distros may describe dependencies incompatibly. Even packages from different releases of the same distro may be incompatible!
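To make the dependency point a bit more concrete, here is a rough sketch of two tiny parsers: one for a Debian-style `Depends:` value and one for an RPM-style `Requires:` value. Both are heavily simplified (no alternatives with `|`, no architecture qualifiers, comma-separated entries only), and the function names and the sample package names are just for illustration. The information is the same in both cases, yet each syntax needs its own code.

```python
import re

# Debian style: name, optionally followed by "(operator version)".
_DEB_ITEM = re.compile(r"([^\s(]+)(?:\s*\(\s*(<<|<=|=|>=|>>)\s*([^)\s]+)\s*\))?")
# RPM style: name, optionally followed by "operator version" without parentheses.
_RPM_OPERATORS = {"<", "<=", "=", ">=", ">"}


def parse_deb_depends(field):
    """Parse e.g. 'libc6 (>= 2.7), libfoo1' into (name, op, version) tuples."""
    deps = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        match = _DEB_ITEM.fullmatch(item)
        if match is None:
            raise ValueError(f"unsupported Debian-style entry: {item!r}")
        deps.append((match.group(1), match.group(2), match.group(3)))
    return deps


def parse_rpm_requires(field):
    """Parse e.g. 'glibc >= 2.7, libfoo' into (name, op, version) tuples."""
    deps = []
    for item in field.split(","):
        parts = item.split()
        if not parts:
            continue
        if len(parts) == 1:
            deps.append((parts[0], None, None))
        elif len(parts) == 3 and parts[1] in _RPM_OPERATORS:
            deps.append((parts[0], parts[1], parts[2]))
        else:
            raise ValueError(f"unsupported RPM-style entry: {item!r}")
    return deps


if __name__ == "__main__":
    print(parse_deb_depends("libc6 (>= 2.7), libfoo1"))
    print(parse_rpm_requires("glibc >= 2.7, libfoo"))
```

Note that even after parsing, the names do not line up: the same C library is called libc6 in one world and glibc in the other, so a syntax converter alone does not make the packages interchangeable.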

Package formats also evolve. For example, Ubuntu and Debian recently introduced the concept of package triggers, which are separate from the pre-inst and post-inst scripts. Yes, this mechanism is backward compatible, but you get the idea.

To conclude, there are initiatives to minimize the differences, like the noble FHS (Filesystem Hierarchy Standard) or LSB (Linux Standard Base). So the problems are being worked on, but for now it's really hard to create a package that would run unchanged on a number of distros. I'm not saying this won't change in the future, but I hope you now understand the problems here.

2 comments:

Unknown said...

Thanks, this was just what I was looking for. Internally divided is one thing, divided for the outer world is something different. Do you think a consensus on a standard can be expected in the short term? Even without the universal package system?

SirYes said...

Instead of standardizing the package format (which I suppose will not happen), I think the effort will be put into package management mechanisms.

Essentially a binary package is a set of files grouped into subdirectories (which could easily be put into a .ZIP or .TAR.GZ archive) with added management information. This way, after installing a package, the system "knows" where the files belonging to the package are, and can safely modify or remove them (when the package is upgraded or uninstalled).
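As a toy illustration of this idea (not any real package format), the sketch below builds a .tar.gz archive in memory and adds a plain-text list of the files it ships. The `MANIFEST` name and both function names are made up for this example. Real formats carry much more metadata, such as dependencies and maintainer scripts.

```python
import io
import tarfile

MANIFEST_NAME = "MANIFEST"  # hypothetical name for the management information


def build_package(files):
    """Pack a dict of {path: bytes} into .tar.gz bytes, with a file list added."""
    manifest = ("\n".join(sorted(files)) + "\n").encode()
    entries = [(MANIFEST_NAME, manifest)] + sorted(files.items())
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in entries:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def owned_files(package_bytes):
    """Read the file list back, i.e. what the system 'knows' about the package."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:gz") as tar:
        manifest = tar.extractfile(MANIFEST_NAME)
        return manifest.read().decode().splitlines()


if __name__ == "__main__":
    package = build_package({
        "usr/bin/hello": b"#!/bin/sh\necho hello\n",
        "etc/hello.conf": b"greeting=hello\n",
    })
    print(owned_files(package))
```

With such a list recorded at install time, removing or upgrading the package is a matter of walking the list instead of guessing which files belong to it.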

Even for Windows there are .EXE installers as well as "Windows Installer" formats, like .MSI files. However, installer functions HAVE BEEN standardized there since Windows 95! But in Windows there are no package repositories and it's impossible to know which file belongs to which installed program. Each installer copies files to their destination folders (application and system folders), possibly recording the installed files for a later uninstallation, modifies the registry, may install system services, device drivers and so on. And since it runs under a privileged account, it may do many things (it's an .EXE file after all).

In short: I mostly prefer the Linux way of installing packages. It is much cleaner and more controllable.