FUDCon Pune: My session on ‘Learning Git’

My session on learning git (slides) was scheduled right after the lunch break on the first day of FUDCon Pune 2011.

I had targeted the session at beginners; however, I had some help from Shakthi, who conducted a session on git during the 2nd FAD, and from Ramky, who spoke on version control systems in the talk before mine.  So I could skip a few basic things and get right to the demo.

I didn’t really get the luxury to prepare in advance; I had in my mind what I would do in general, but got the slides and the flow ready just the night prior to the talk.  Organising FUDCon wasn’t too taxing, but there are a few last-minute things that have to be done, well, at the last minute.  And the presentation, etc., had to wait.

I have seen students attend sessions without really following up on what they were taught.  So I thought I’d make this an interactive session and invite people from the audience to take part: someone would come up on stage and write a .c program, someone else would create a git repo, someone else would modify the code and do another commit, and so on.

While I thought about this, I recalled Rusty’s session at foss.in a few years back where he did such a thing successfully.  Now emulating that feat would be really difficult.  People who have attended Rusty’s talks would know what I mean.  He puts in hours and days for such talks.  I’m sure he’d have thought about how to pull it off even if the person to come up on stage wouldn’t know how to type.

There were about 50 to 60 people attending the talk.  So what I did, instead, was to ask the attendees who knew how to write C programs, and who knew how to type fast.  I called up one such attendee and asked him to write a simple ‘Hello, World!’ program.

I then called up someone else (Aditya) to commit the first version.  Thankfully, the original C file did not have any punctuation in the ‘Hello, World!’ string, so the idea for the 2nd commit was ready.  Once Aditya initialised the git repo and did the first commit, I modified the program output to add the comma and exclamation point, and made that the 2nd commit in the git repo.  I then moved on to create a new C program that prints out ‘Goodbye, World’ (we had dedicated the conference to Dennis Ritchie).  This was done in a new branch called ‘goodbye’.  Next, I created another branch, called ‘fudcon’, and wrote another C program that prints ‘Hello, FUDCon’.  A few lessons on merging, switching branches, and viewing commits and logs from other branches followed.  The slides have the list of commands that were shown.
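For anyone who wants to replay the flow, here is a rough sketch that drives git from Python.  It is not the exact command list from the slides: the file names, commit messages and the demo identity are made up, and I'm assuming ‘fudcon’ was branched off the main branch rather than off ‘goodbye’.

```python
import subprocess
import tempfile
from pathlib import Path

# Minimal C program; MESSAGE is swapped in by c_program() below.
C_TEMPLATE = '''#include <stdio.h>

int main(void)
{
    printf("MESSAGE\\n");
    return 0;
}
'''


def c_program(message):
    """Return C source that prints the given message."""
    return C_TEMPLATE.replace("MESSAGE", message)


def git(repo, *args):
    """Run one git command inside repo with a throwaway demo identity."""
    cmd = ["git", "-c", "user.name=Demo User",
           "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def replay_demo(repo):
    """Replay the session's commits, branches and merges in repo."""
    repo = Path(repo)
    git(repo, "init")

    # 1st commit: the greeting without punctuation (file name is assumed).
    (repo / "hello.c").write_text(c_program("Hello World"))
    git(repo, "add", "hello.c")
    git(repo, "commit", "-m", "Add hello world program")
    main_branch = git(repo, "rev-parse", "--abbrev-ref", "HEAD")

    # 2nd commit: add the comma and the exclamation point.
    (repo / "hello.c").write_text(c_program("Hello, World!"))
    git(repo, "commit", "-am", "Add punctuation to the greeting")

    # One branch per new program (branching from main is an assumption).
    for branch, message in (("goodbye", "Goodbye, World"),
                            ("fudcon", "Hello, FUDCon")):
        git(repo, "checkout", main_branch)
        git(repo, "checkout", "-b", branch)
        (repo / (branch + ".c")).write_text(c_program(message))
        git(repo, "add", branch + ".c")
        git(repo, "commit", "-m", "Add " + branch + " program")

    # Merge both branches back: a fast-forward first, then a real merge.
    git(repo, "checkout", main_branch)
    git(repo, "merge", "goodbye")
    git(repo, "merge", "-m", "Merge branch 'fudcon'", "fudcon")

    refs = git(repo, "for-each-ref", "--format=%(refname:short)", "refs/heads")
    return refs.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(replay_demo(workdir))
```

Running it in a scratch directory leaves all three .c files on the main branch, which can then be inspected with ‘git log’.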

The last step was to clone this repo into another local one, commit a few things there, do a push into the original repo, make some other pulls here and there, and the session participants were ready with hands-on git lessons that they could use.
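The clone, push and pull round trip can be sketched the same way.  This is only an illustration under my own assumptions: the directory, file and branch names are invented, and the push goes to a new branch (‘from-clone’) because git refuses, by default, to push into the branch that is checked out in a non-bare repo.

```python
import subprocess
import tempfile
from pathlib import Path


def git(cwd, *args):
    """Run one git command in cwd with a throwaway demo identity."""
    cmd = ["git", "-c", "user.name=Demo User",
           "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    done = subprocess.run(cmd, cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def commit_file(repo, name, text, message):
    """Write a file into repo and commit it."""
    (Path(repo) / name).write_text(text)
    git(repo, "add", name)
    git(repo, "commit", "-m", message)


def clone_push_pull(workdir):
    """Clone a repo, push a branch back to it, then pull a new commit."""
    workdir = Path(workdir)
    original = workdir / "original"
    clone = workdir / "clone"

    # Stand-in for the repo built during the session.
    original.mkdir()
    git(original, "init")
    commit_file(original, "README", "demo repo\n", "Initial commit")
    main_branch = git(original, "rev-parse", "--abbrev-ref", "HEAD")

    # Clone it locally and commit on a new branch in the clone.
    git(workdir, "clone", str(original), str(clone))
    git(clone, "checkout", "-b", "from-clone")
    commit_file(clone, "notes.txt", "written in the clone\n",
                "Add notes from the clone")

    # Push the new branch back into the original repo.
    git(clone, "push", "origin", "from-clone")

    # Commit in the original, then pull that commit into the clone.
    commit_file(original, "extra.txt", "written in the original\n",
                "Add extra file")
    git(clone, "checkout", main_branch)
    git(clone, "pull", "--ff-only")

    pushed = git(original, "branch", "--list", "from-clone")
    return pushed, (clone / "extra.txt").exists()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(clone_push_pull(workdir))
```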

I had quite a few questions during and after the session, and I even heard of people trying out the examples after the talk. So I’d call the talk/demo a success.

Idea: Faster Metadata Downloads With Yum and Git

The presto plugin for yum has worked great for me so far.  It’s been very useful, not for the lack of download limits, but for the time saved in getting the bits downloaded.  The time saved is significant if the bandwidth is not too good (it never is).

However, I’ve observed that in some cases the presto metadata is larger than the actual package, e.g., a font.  If a font package, say 21KB in size, has a deltarpm of 3KB in size, it results in a savings of 18KB of downloads.  This is a very impressive 85% of savings.  However, the presto metadata itself could be more than 400KB, nullifying the advantage of the drpm.  We’re effectively downloading, in this corner case, about 403KB (the 3KB deltarpm plus 400KB of metadata) instead of 21KB.  That is about 19 times the actual package size.
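The arithmetic for this corner case is easy to check.  The sketch below uses the example sizes from above, takes the metadata as exactly 400KB, and counts the total download as the deltarpm plus the presto metadata; the function name is just for illustration.

```python
def download_costs(package_kb, drpm_kb, metadata_kb):
    """Compare a deltarpm download (plus its metadata) with the full package."""
    saved_kb = package_kb - drpm_kb              # what the drpm alone saves
    saved_pct = 100.0 * saved_kb / package_kb    # savings as a percentage
    total_kb = drpm_kb + metadata_kb             # what actually gets downloaded
    ratio = total_kb / package_kb                # total versus the full package
    return saved_kb, saved_pct, total_kb, ratio


if __name__ == "__main__":
    saved_kb, saved_pct, total_kb, ratio = download_costs(21, 3, 400)
    print("drpm saves %dKB (%.1f%%)" % (saved_kb, saved_pct))
    print("total download: %dKB, %.1f times the package" % (total_kb, ratio))
```

With these numbers the drpm saves 18KB (about 85.7%), but the total download comes to 403KB, roughly 19 times the size of the package.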

So here’s an idea: why not let git handle the metadata for us?  The metadata is a text (or sqlite) file that lists package names, their dependencies, version numbers and so on.  Since text can be very easily handled by git, it should be a breeze fetching metadata updates from a git server.  At install-time (or upgrade-time), the metadata git repository for a particular Fedora version can be cloned, and on each update, all that’s necessary for yum to do is invoke ‘git pull’ and it gets all the latest metadata.  Downloads: a few KB each day instead of a few MBs.
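To make the idea concrete, here is a small sketch of the decision yum would have to make on each run: clone the metadata repo for the release if there is no local copy yet, otherwise just pull.  Nothing here exists in yum today; the URL layout, the cache directory naming and the function name are all hypothetical, and the function only builds the git command instead of running it.

```python
from pathlib import Path

# Hypothetical URL layout: one metadata repo per Fedora release.
METADATA_URL_TEMPLATE = "https://git.example.org/yum-metadata/fedora-{release}.git"


def metadata_sync_command(cache_root, release):
    """Return the git command that brings the local metadata up to date."""
    repo_dir = Path(cache_root) / ("fedora-%s" % release)
    if (repo_dir / ".git").is_dir():
        # A local copy exists: fetch only what changed since the last run.
        return ["git", "-C", str(repo_dir), "pull", "--ff-only"]
    # First run (install or upgrade time): clone the release's repo.
    url = METADATA_URL_TEMPLATE.format(release=release)
    return ["git", "clone", url, str(repo_dir)]


if __name__ == "__main__":
    # The cache path is only an example.
    print(" ".join(metadata_sync_command("/var/cache/yum-git", 16)))
```

The first call for a release returns a ‘git clone’ command; once the clone exists, every later call returns a ‘git pull’.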

The advantages are numerous:

  • Saves server bandwidth
  • Uses very few server resources when using the git protocol
  • Scales really well
  • Compresses really well
  • Makes yum faster for users
    • I think this is the biggest win: not having to wait ages for a ‘yum search’ to finish every day has to get anyone interested.  Makes old-time Debian users like me very happy.

There are some challenges to be considered as well:

  • Should the yum metadata be served by just one canonical git server, while the packages get served by mirrors?  Not every mirror may have the git protocol enabled, nor can the Fedora project ask each mirror to configure git on its server.
    • Doing this can leave slow mirrors unable to service package download requests for the latest metadata
    • This can be mitigated by using git over HTTP on the server
  • The metadata can keep growing
    • This can be mitigated by having a separate git repository for the metadata belonging to each release.  Multiple git repos can be set up easily for extra repositories (e.g., for external repos or for multiple version repos while doing an upgrade).
  • The mirror list has to be updated to also include git repositories that can be worked on with ‘git remote’.

I’ve filed an RFE for this feature.  For someone looking for a weekend hack for yum in Python, this should be a good opportunity to jump right in!  If you intend to take this up, get in touch with the developers, make sure no one else is working on this yet (or collaborate with others), and update the details on the Fedora Feature Page.