CPAN Mirrors

Creating Local Mirrors (Part One)

Tue Feb 23 10:11:50 2016

Wow, it's been a long time since my last post, sorry about that! Anyway...

Something I had an issue with a while ago: when setting up a lot of testing machines for clients or internal projects, the CPAN dependencies were taking a long time to download and install - mainly because our Internet connection here at Castle Shadowcat is... well, a bit tight on both the upstream and the downstream. So I decided to do something about it.

This first post is going to go over the various CPAN mirror clients available (although I may have missed one or two - let me know in the comments!). So without further ado, let's get to it!

The first part was finding out what the current tools for mirroring CPAN are. According to cpan.org there are several different methods for creating a mirror: ftp, rsync, CPAN::Mini, rrr-client and iim.

However, each of these has its own benefits and drawbacks. So let's have a closer look.

ftp

Probably the simplest of all the clients, this is just using an FTP client to copy the files to your local machine. It does not have any special features (at least none pertaining to CPAN), and is not a recommended way of creating a mirror - it is only recommended if no other method is possible.

rsync

A favoured tool of system admins, developers, and basically anyone who needs to copy files reliably from one place to another. If you have never used it directly, then you have probably used something which depends on it - it's a tool well worth knowing about and using.

In terms of using it for a CPAN mirror, it's the recommended tool for setting up a mirror which syncs once or twice a day. It is useful if you aren't in need of the very latest modules from today, and it is relatively light on resources. However, it does need to be run every time you want to synchronise - or you could use your crontab.
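
As a rough sketch of what that could look like, here is a small wrapper around a one-shot rsync that is suitable for calling from cron. The rsync endpoint is the one I understand cpan.org to publish, so do check it before relying on it; the local directory, the script path in the crontab comment and the `DRY_RUN` switch are all made up for illustration. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch of a cron-friendly CPAN sync. Nothing is transferred unless
# DRY_RUN=0 is set, so it is safe to try out first.

# Assumption: public rsync endpoint as listed on cpan.org; check before use.
CPAN_SRC="cpan-rsync.perl.org::CPAN"
# Hypothetical local mirror location.
CPAN_DIR="${CPAN_DIR:-$HOME/CPAN}"

sync_cpan() {
    # -a keeps timestamps and permissions, -v lists what was transferred,
    # --delete drops local files that have been removed upstream.
    set -- rsync -av --delete "$CPAN_SRC" "$CPAN_DIR/"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"      # just show the command
    else
        "$@"           # actually run it
    fi
}

sync_cpan

# Example crontab entry (hypothetical script path), syncing twice a day:
# 17 3,15 * * * DRY_RUN=0 /usr/local/bin/sync-cpan.sh
```

Picking an odd minute rather than the top of the hour is just a courtesy to the upstream server, as plenty of other mirrors will be syncing on the hour.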

CPAN::Mini

Written in Perl, this is probably the one most people are familiar with. It is very useful for creating a local mirror in your home directory for when you are away from an internet connection: this module and its included CLI app minicpan allow you to create a 'latest only' CPAN mirror. It is great for conferences and hacking around on the move, and can probably be made to create a full mirror. You can probably also combine this with a plackup static server, and have a network-wide minicpan for use across VMs on your machine.

Now, the issue with this is that it's a manual sync process - if you want to update your CPAN mirror, you run minicpan again. That is not much of an issue for a mobile development machine, but if you need an always up-to-date CPAN mirror, then it starts to show its limits. You can always run it from a crontab, however. Also, as it by default only fetches the latest versions of modules, you may run into version requirements that cannot be fulfilled, so it may not be all that useful if you are working on old codebases.
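
To make that a bit more concrete, here is a minimal sketch of refreshing a minicpan mirror and then installing from it. The `-l` (local directory) and `-r` (remote mirror) options are minicpan's own; everything else is an assumption on my part: the paths, the remote URL, the example module name, the `DRY_RUN` switch, and the use of cpanm as the installer (any client that can point at a local mirror would do). By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: refresh a 'latest only' mirror with minicpan, then install from it.
# Hypothetical paths and module name; commands are printed unless DRY_RUN=0.

MINICPAN_DIR="${MINICPAN_DIR:-$HOME/minicpan}"
# Assumption: substitute whichever upstream mirror is closest to you.
REMOTE="${REMOTE:-https://www.cpan.org/}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# -l is the local mirror directory, -r the remote mirror to copy from.
update_mirror() {
    run minicpan -l "$MINICPAN_DIR" -r "$REMOTE"
}

# Install a module using only the local mirror (cpanm assumed installed).
install_from_mirror() {
    run cpanm --mirror "file://$MINICPAN_DIR" --mirror-only "$1"
}

update_mirror
install_from_mirror Moose

# Example crontab entry (hypothetical script path), refreshing nightly:
# 41 2 * * * DRY_RUN=0 /usr/local/bin/update-minicpan.sh
```

The `--mirror-only` flag is what exposes the 'latest only' limitation: if a codebase pins an older version that the mirror no longer carries, the install fails rather than quietly going out to the network.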

rrr-client

Another client written in Perl, this is the official client used on the master CPAN host and all Tier 1 mirrors. The underlying tool for this is rsync, however it first checks the list of recent changes to work out what actually needs fetching, which reduces network and processing overhead.

The main benefit of this tool is that, once you have the correct incantation, you can just set it running and leave it - as long as your shell session stays alive, it will continue to run and update itself.

The recommended way of using this is alongside a daily rsync, which is useful to gather things it may miss, and also to remove cache files it no longer needs from the CPAN directory.
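
The pairing described above might be sketched like this. Treat it as a guess at the shape rather than a tested recipe: the `--source` and `--target` options and the `RECENT.recent` path are taken from my reading of the rrr-client documentation and should be verified against your installed version, and the local path and `DRY_RUN` switch are made up. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: continuous mirroring with rrr-client plus a daily rsync clean-up.
# Assumptions: the RECENT.recent source path and the --source/--target
# options follow the rrr-client documentation; verify against your version.

CPAN_SRC="cpan-rsync.perl.org::CPAN"
CPAN_DIR="${CPAN_DIR:-$HOME/CPAN}"   # hypothetical local path

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Long-running: follows the recent-changes index and keeps syncing.
follow_recent() {
    run rrr-client --source "$CPAN_SRC/RECENT.recent" --target "$CPAN_DIR/"
}

# Once a day (e.g. from cron): full rsync to pick up anything missed and
# delete files that no longer belong in the mirror.
daily_cleanup() {
    run rsync -a --delete "$CPAN_SRC" "$CPAN_DIR/"
}

follow_recent
daily_cleanup
```

In real use the two would not run back to back like this: `follow_recent` would be left running in its own session, and `daily_cleanup` would be called from a crontab entry.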

iim

This client is the one I am least familiar with, however it claims to have features which rrr-client lacks, as well as having a lighter footprint. I'll leave investigation of this one up to the reader, although if there is enough interest I will give it a look and a writeup - leave a comment if interested!

Next Time...

So that's a quick overview of what's available according to cpan.org - I know there are similar tools (such as pinto), which may get covered sometime in the future. For now though, tune in next time for a walkthrough of how I set up a dedicated VM to run rrr-client, and what pitfalls I managed to fall into along the way...