“Speed up updates by downloading less!”
I have been using delta RPMs with the yum-presto plugin for a couple of years, starting with the experimental repository for Fedora 8. The experience has been very satisfying. It has enabled me to keep my Fedora installation updated with reasonable effort given the bandwidth constraints. Now, even though the bandwidth is better, the updates seem more frequent! So, the delta RPM repository continues to be very valuable.
There were two ideas I wanted to explore. The first was whether deltarpm could be used across Fedora versions. The second was whether the same or a similar solution is possible for Ubuntu.
Let us first consider upgrades across distribution versions. There were over 2000 RPMs installed on Fedora 13 on my system. Of these, almost 1800 had a version in Fedora 12. As I had updated Fedora 12 just prior to upgrading to Fedora 13, I was curious to know what the impact would have been had delta packages been available.
The deltarpm package contains a program, makedeltarpm, which is just what I needed. A short Python program matched each RPM installed on Fedora 13 with one from the cache I had saved from Fedora 12. For each matching pair, the program ran makedeltarpm.
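The matching step described above can be sketched roughly as follows. This is not the original program: it assumes the old and new packages sit in two directories (old_dir and new_dir are hypothetical names), that file names follow the usual name-version-release.arch.rpm pattern, and that makedeltarpm is called as "makedeltarpm oldrpm newrpm deltarpm". It ignores the case of the same package name being installed for two architectures.

```python
import os
import subprocess


def rpm_name(filename):
    """Return the package name from 'name-version-release.arch.rpm'."""
    stem = filename[:-len(".rpm")]
    stem = stem.rsplit(".", 1)[0]   # drop the architecture
    return stem.rsplit("-", 2)[0]   # drop version and release


def match_pairs(old_files, new_files):
    """Pair each new RPM file name with an old one of the same package name."""
    old_by_name = {rpm_name(f): f for f in old_files if f.endswith(".rpm")}
    pairs = []
    for new in sorted(new_files):
        if not new.endswith(".rpm"):
            continue
        old = old_by_name.get(rpm_name(new))
        if old is not None and old != new:   # skip unchanged packages
            pairs.append((old, new))
    return pairs


def build_deltas(old_dir, new_dir, out_dir):
    """Run makedeltarpm for every matching pair (directories are assumptions)."""
    for old, new in match_pairs(os.listdir(old_dir), os.listdir(new_dir)):
        delta = os.path.join(out_dir, new[:-len(".rpm")] + ".drpm")
        subprocess.run(
            ["makedeltarpm",
             os.path.join(old_dir, old),
             os.path.join(new_dir, new),
             delta],
            check=True,
        )
```

Calling build_deltas with the Fedora 12 cache directory, a directory of Fedora 13 packages and an output directory would leave one .drpm file per matched package in the output directory.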
In summary, about a quarter of the delta RPMs were less than 20% of the size of the original package. Overall, the delta RPMs were about a third of the original size. So, the download size for these packages would have decreased from 1.3GB to a little more than 400MB, saving about 900MB in download effort.
Obviously, a delta repository would be of little use for installing or updating from the DVD. But, if one is upgrading using the pre-upgrade or an online upgrade option, the benefit would be considerable.
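Figures of this kind can be computed from a list of (delta size, full size) pairs. The sketch below is only an illustration: the function name summarize is hypothetical and the numbers in the usage note are invented, not the measurements reported here.

```python
def summarize(sizes, threshold=0.20):
    """Summarize delta savings.

    sizes is a list of (delta_bytes, full_bytes) tuples, one per package.
    """
    total_delta = sum(d for d, _ in sizes)
    total_full = sum(f for _, f in sizes)
    # Count packages whose delta is smaller than threshold * full size.
    small = sum(1 for d, f in sizes if d < threshold * f)
    return {
        "overall_ratio": total_delta / total_full,
        "share_below_threshold": small / len(sizes),
        "bytes_saved": total_full - total_delta,
    }
```

For instance, summarize([(10, 100), (50, 100), (30, 100), (90, 100)]) reports an overall ratio of 0.45, with a quarter of the packages below the 20% threshold and 220 bytes saved.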
An RPM package is essentially an archive with packaging and control information. For example, we can extract the files using the rpm2cpio utility. makedeltarpm creates a binary diff of each file in the archive using the bsdiff algorithm. The difference files are repackaged as a drpm archive. Control information is also needed in the archive, e.g. to ensure that unchanged files and deleted files in the package are properly handled.
The utility applydeltarpm will recreate the rpm using the data from the package already installed on the disk and the delta rpm.
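As a rough sketch of the reconstruction step, the helper below builds the applydeltarpm command line. It assumes the usual invocation "applydeltarpm [-r oldrpm] deltarpm newrpm", where -r points at an old RPM file instead of the installed files; the helper names apply_cmd and rebuild are hypothetical.

```python
import subprocess


def apply_cmd(delta, new_rpm, old_rpm=None):
    """Build the applydeltarpm argument list.

    Without old_rpm, applydeltarpm works from the files installed on disk;
    with old_rpm, it is passed via -r and used as the base instead.
    """
    cmd = ["applydeltarpm"]
    if old_rpm is not None:
        cmd += ["-r", old_rpm]
    cmd += [delta, new_rpm]
    return cmd


def rebuild(delta, new_rpm, old_rpm=None):
    """Recreate new_rpm from the delta; raises if applydeltarpm fails."""
    subprocess.run(apply_cmd(delta, new_rpm, old_rpm), check=True)
```

Keeping the command construction separate from the call makes it easy to print or log the command before running it.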
The same logic should be implementable for Debian packages as well.
A little searching showed that there is a debdelta package, which has not been used much. There is a delta repository available for Debian – http://www.bononia.it/debian-deltas. The timestamps indicate that this repository is current; however, I have not tried it.
A posting by Onkar Shinde on the Ubuntu India mailing list http://email@example.com/msg05960.html indicated an effort to create one for Ubuntu. Sadly, the people who would find it most useful are also the ones constrained for server resources, disk space and bandwidth in particular.
Using debdelta is as simple as using makedeltarpm. I experimented with the packages on the Ubuntu 10.04 CD, about 240 of which had since been updated on my system. The size of the update was reduced from around 200MB to merely 12MB – just 6% of the original size. About 70% of the delta packages were less than 10% of the original. This may be unusually small. Still, it is obvious that the utility of debdelta would be as great for Ubuntu as for Fedora.
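A comparable sketch for Debian packages is shown below. It assumes the command forms "debdelta FROMFILE TOFILE DELTA" and "debpatch DELTA FROMFILE TOFILE", and file names of the form name_version_arch.deb; the function names are hypothetical.

```python
import subprocess


def deb_name(filename):
    """Return the package name from 'name_version_arch.deb'."""
    return filename.split("_", 1)[0]


def debdelta_cmd(old_deb, new_deb, delta):
    """Arguments for creating a delta from old_deb to new_deb."""
    return ["debdelta", old_deb, new_deb, delta]


def debpatch_cmd(delta, old_deb, new_deb):
    """Arguments for recreating new_deb from old_deb and the delta."""
    return ["debpatch", delta, old_deb, new_deb]


def make_delta(old_deb, new_deb, delta):
    """Run debdelta; raises if the command fails."""
    subprocess.run(debdelta_cmd(old_deb, new_deb, delta), check=True)
```

Pairing packages from the CD with their updated versions can then be done by comparing deb_name values, in the same way as for the RPM file names.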
It just needs someone with the server resources to set up a suitable delta repository with suitable scripts to ensure that it remains in sync with the official repository.
It was nice to come across interest in delta repositories in other distributions. As I use Arch Linux as well, the following post was a welcome one – https://bbs.archlinux.org/viewtopic.php?pid=724617. As expected, one person, with the username 'sabooky', has taken the lead and created a delta repository for i686 systems. It uses xdelta3 instead of bsdiff at present. It seems very promising and is very easy to use, as the developers of pacman (the package manager for Arch Linux) had already integrated the option of using delta repositories.
There is bound to be more activity in this area. All delta repositories should benefit from the work being done for Google Chrome updates (http://dev.chromium.org/developers/design-documents/software-updates-courgette). The statistics shown there for an example Chromium update are remarkable.
I can't wait to see which distribution implements a repository with Courgette first!