Idea: Faster Metadata Downloads With Yum and Git

The presto plugin for yum has worked great for me so far.  It's been very useful, not so much because of download limits as for the time saved in getting the bits downloaded.  The time saved is significant when the bandwidth is not too good (it never is).

However, I've observed that in some cases the presto metadata is larger than the package itself, e.g., for a font.  If a font package of, say, 21KB has a deltarpm of 3KB, the delta saves 18KB of downloads.  That is a very impressive 85% saving.  However, the presto metadata itself could be more than 400KB, nullifying the advantage of the drpm.  In this corner case we're effectively downloading 403KB (3KB of delta plus 400KB of metadata) instead of 21KB.  That is about 19 times the actual package size.
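The arithmetic for this corner case can be checked with a few lines of Python.  This is only a sketch: the sizes are the ones from the font example, 400KB is taken as a lower bound for the presto metadata, and the function names are made up for illustration.

```python
# Sizes in KB, taken from the font example in the text.
FULL_RPM_KB = 21
DELTA_RPM_KB = 3
PRESTO_METADATA_KB = 400  # lower bound: the text says "more than 400KB"


def delta_savings_fraction(full_kb, delta_kb):
    """Fraction of the package download saved by fetching the delta instead."""
    return (full_kb - delta_kb) / full_kb


def total_with_presto_kb(delta_kb, metadata_kb):
    """Total download when the presto metadata has to be fetched as well."""
    return delta_kb + metadata_kb


savings = delta_savings_fraction(FULL_RPM_KB, DELTA_RPM_KB)
total = total_with_presto_kb(DELTA_RPM_KB, PRESTO_METADATA_KB)
ratio = total / FULL_RPM_KB

print(f"delta alone saves {savings:.1%} of the package download")
print(f"delta plus presto metadata: {total} KB, {ratio:.1f}x the full package")
```

The delta only pays off when the metadata cost is spread over enough packages in the same transaction; for a single small package it dominates everything else.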

So here's an idea: why not let git handle the metadata for us?  The metadata is a text (or SQLite) file that lists package names, their dependencies, version numbers and so on.  Since text is very easily handled by git, fetching metadata updates from a git server should be a breeze.  At install time (or upgrade time), the metadata git repository for a particular Fedora version can be cloned, and on each update all yum has to do is invoke 'git pull' to get the latest metadata.  Downloads: a few KB each day instead of a few MB.
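The clone-once, pull-afterwards flow described above could look roughly like this in Python.  It is a minimal sketch, not yum code: the function names, the repository URL and the cache path are all hypothetical, and it assumes the git command-line client is installed.

```python
import os
import subprocess


def sync_command(repo_url, cache_dir):
    """Return the git argv that brings cache_dir up to date.

    No checkout yet: do a full clone (the install-time case).
    Checkout present: do a fast-forward-only pull, which fetches
    only the objects added since the last update.
    """
    if os.path.isdir(os.path.join(cache_dir, ".git")):
        return ["git", "-C", cache_dir, "pull", "--ff-only"]
    return ["git", "clone", repo_url, cache_dir]


def sync_metadata(repo_url, cache_dir):
    """Clone or update the metadata checkout.

    Raises subprocess.CalledProcessError if git exits with an error.
    """
    subprocess.run(sync_command(repo_url, cache_dir), check=True)


# Example call (hypothetical URL and cache path, for illustration only):
# sync_metadata("https://example.org/fedora-metadata.git",
#               "/var/cache/yum/metadata-git")
```

Using --ff-only makes the pull fail loudly rather than create a merge commit if the local checkout has somehow diverged from the server, which seems the safer behaviour for a cache that should only ever mirror upstream.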

The advantages are numerous:

There are some challenges to be considered as well:

I've filed an RFE for this feature.  For someone looking for a weekend hack on yum in Python, this should be a good opportunity to jump right in!  If you intend to take this up, get in touch with the developers, make sure no one else is already working on it (or collaborate with them) and update the details on the Fedora Feature Page.