It's all about the package

Posted on Tue 14 June 2011 by alex in geek

Apologies in advance for this tech heavy post. I was moved to write it as some friends are doing a crash course in learning Linux. Consider this remote teaching for those that are interested.

Back in the early history of computing, software was often built in situ. Programs were generally distributed as source code and compiled on the machine that needed them. This neatly solves a lot of problems: if a program doesn't compile, it is usually because some prerequisite doesn't exist on the system. However, compiling does have a few disadvantages, including needing a development system on your machine and having to send everyone a copy of your source code. For the growing market in computer software where the customers never saw the code, a new method was needed. Thus was born the idea of binary software distribution.

Distributing binaries raises a whole new set of interesting problems. It's very rare that an executable exists in a vacuum. Programs generally require the services of libraries and system daemons to which they hand off work. While these may all exist on the machine that the software was built on, they may not be available on the machine you install to. As a result all sorts of hair-pulling problems can arise. Early Unix software was often delivered as a tarball wrapped in a bunch of shell scripts. You would extract the tarball into a sub-directory (usually under /opt - see FHS) and then run a shell script. That script would spend considerable effort checking the live system and attempting to shim it into a state as close as possible to the system the software was compiled on, so that the software could finally run. A lot of software is still installed this way, and it's not that dissimilar to the approach both Macs and Windows PCs take to installing software.

However, the raison d'être of a modern Linux distribution is distributing software written by other people. The solution to these problems (and many more) is a package manager. As an aspiring sysadmin of a Linux machine it is well worth familiarising yourself with the package manager of your chosen distribution. The package manager should certainly be the first port of call for finding out which package a file belongs to and what other files are associated with that package. While there are many competing systems out there, the two main ones are RPM and DEB.

The RPM Package Manager was originally developed by Red Hat and is unsurprisingly used by them and by the many distributions that used Red Hat as a base or adopted its package format. This includes RHEL, SuSE, CentOS, Fedora, and Mandriva. It's sufficiently well known that it was even added to the Linux Standard Base as the preferred method for installing 3rd party software.

The DEB file format is used by Debian's dpkg tools. Although Debian is probably better known to old Linux hackers, it has an excellent reputation as a truly stable server operating system. While RPM systems were still dealing with "RPM Hell" and having to manually resolve dependencies, Debian had the "Advanced Packaging Tool", apt, which makes installing any piece of software a simple one-line affair at the command line. While Debian may only have a niche audience, its popular derivative Ubuntu is probably the distribution most often seen by newcomers to the Linux experience. Thanks to Debian's long history there is probably more software packaged in the DEB file format than in any other packaging system.
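
To make the dependency problem a little more concrete, here is a minimal Python sketch of the core idea behind automatic dependency resolution: given a package, work out everything it needs and install the prerequisites first. The package names, the DEPENDS table and the install_order function are all made up for illustration. This is not how apt is actually implemented; a real package manager also has to deal with versions, conflicts and alternatives, which this toy ignores.

```python
# Hypothetical package metadata: each package lists what it depends on.
# These names are illustrative only, not real package database entries.
DEPENDS = {
    "webapp": ["libssl", "libpng"],
    "libssl": ["libc"],
    "libpng": ["zlib", "libc"],
    "zlib": ["libc"],
    "libc": [],
}


def install_order(package, depends, installed=None):
    """Return the packages that must be installed, dependencies first."""
    if installed is None:
        installed = set()
    order = []        # result, built up so prerequisites come first
    visiting = set()  # packages on the current path, used to spot loops

    def visit(name):
        if name in installed or name in order:
            return  # already on the system or already scheduled
        if name in visiting:
            raise ValueError(f"dependency loop involving {name}")
        if name not in depends:
            raise KeyError(f"unknown package: {name}")
        visiting.add(name)
        for dep in depends[name]:
            visit(dep)
        visiting.discard(name)
        order.append(name)

    visit(package)
    return order


print(install_order("webapp", DEPENDS))
# prints ['libc', 'libssl', 'zlib', 'libpng', 'webapp']
```

Doing this walk by hand, one missing library at a time, is roughly what "RPM Hell" felt like. Passing an installed set (for example installed={"libc", "zlib"}) trims the list down to just what is missing.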

In my next package-related post I'll go through some of the questions you might ask your package manager and give some real-life command line examples.