Ant Power, Cruise under Control and Selenium Fuel (part 2) - Removing old artifacts from CruiseControl.

One thing I had to do when using CruiseControl was remove old artifacts. My CruiseControl starts a build on every commit, so in times of fast development it creates lots of artifacts. You probably don't want to keep all of them, because sooner or later your server disk will fill up.

Looking at the CruiseControl configuration I found a parameter for deleting artifacts, but it only checks the age of an artifact, and people said that it assumes your artifacts live inside the log directory. When I set up CruiseControl, my artifacts directory ended up outside logs/ (I used the example config.xml that comes with CruiseControl, where the artifactpublisher is configured to use an outside directory), so using deleteartifacts wasn't an option.

There's a wiki entry about this, but it also deletes artifacts based only on age. This is not what I want: if a project isn't built for a long time, all of its artifacts will be removed and not even the latest version will be available. So I coded a simple Perl script that removes artifacts based on age but always keeps some of them.

Removing old artifacts from CruiseControl

This script should be run by cron at whatever interval you want. There are three configurable parameters. The first (MAX_AGE_IN_DAYS) is straightforward: artifacts older than this many days are candidates for removal. The second (MIN_ARTIFACTS_TO_KEEP) tells the script to keep that many of the newest artifacts even if they're old enough to be deleted. The third (ARTIFACTS_DIR) points to your artifacts directory.
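
To make the interaction between the first two parameters concrete, here is a minimal sketch of the selection rule on its own, with no disk access. The helper name removal_candidates and the directory names and ages are made up for illustration; they are not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the real script): given a maximum age,
# a minimum number to keep, and a map of directory name => age in days,
# return the names that would be removed.
sub removal_candidates {
  my ($max_age, $min_keep, %age_of) = @_;

  # Names are timestamps, so a plain string sort puts the oldest first.
  my @sorted = sort keys %age_of;

  # The newest $min_keep entries are protected regardless of age.
  my $last = $#sorted - $min_keep;
  return () if $last < 0;

  return grep { $age_of{$_} > $max_age } @sorted[0 .. $last];
}

# Made-up data: seven builds, every one older than 10 days.
my %age_of = (
  '20081001120000' => 40,
  '20081006120000' => 35,
  '20081011120000' => 30,
  '20081016120000' => 25,
  '20081021120000' => 20,
  '20081026120000' => 15,
  '20081029120000' => 12,
);

# Prints only the two oldest names: the other five are kept
# even though they are past the age limit.
print "$_\n" for removal_candidates(10, 5, %age_of);
```

With an age-only rule all seven directories would be gone; here the five newest survive.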

The script is pretty simple but it solves my problem: keep some of the artifacts while removing the ones that are too old. Here's the code:

#!/usr/bin/perl
# vim: sw=2 ts=2 expandtab bg=dark
# this script removes old artifacts from cruisecontrol
# to avoid a full disk. You can configure a minimum number
# of artifacts to keep and how many days old an artifact
# must be to be considered too old and removed.
# Marco Valtas - 2008.
use strict;
use warnings;
use File::stat;
use File::Path;

################
# configuration section.

# artifacts older than this will be removed.
use constant MAX_AGE_IN_DAYS  => 10;
# keep at least this number of artifacts.
use constant MIN_ARTIFACTS_TO_KEEP => 5;
# where do we find the artifacts?
use constant ARTIFACTS_DIR => "/var/cc-sandbox/artifacts";

# end of configuration
################


# open artifacts dir to list all projects.
opendir(ART_DIR,ARTIFACTS_DIR) or die $!;
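 # list all project directories, skipping dot entries and plain files
 # (a stray file here would otherwise make the opendir below die).
 my @projects = grep { /^[^.]/ && -d join('/', ARTIFACTS_DIR, $_) } readdir(ART_DIR);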
closedir(ART_DIR);

# exit if no project was found.
exit if $#projects == -1;

# foreach project, we dig to find
# all artifacts directories.
foreach my $project (@projects) {

 # project artifacts directory.
 my $W_DIR = ARTIFACTS_DIR."/$project";

 opendir(PROJECT_DIR,$W_DIR) or die $!;
   # list all dirs, sorted so old dirs are first.
   my @artifacts = sort(grep { /^\d{14}$/ } readdir(PROJECT_DIR));
 closedir(PROJECT_DIR);

 # skip this project if not enough artifacts were found.
 next if $#artifacts < MIN_ARTIFACTS_TO_KEEP;

 # for each artifact found (skipping the ones we want to keep)
 # check the age and remove it if necessary.
 foreach my $idx (0..$#artifacts - MIN_ARTIFACTS_TO_KEEP) {

   # how many days old is this directory?
   # (skip it if it vanished or cannot be read.)
   my $stat = stat($W_DIR."/$artifacts[$idx]") or next;
   my $days_old = int((time - $stat->mtime) / 86400);

   # remove the artifact dir if it is too old.
   if($days_old > MAX_AGE_IN_DAYS) {
     rmtree($W_DIR."/$artifacts[$idx]");
   }
 }
}
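
The script sorts the directories by name but measures their age through the directory's mtime. A possible variation, sketched below, is to read the age from the name itself. This assumes the 14 digits are a local-time build timestamp in yyyyMMddHHmmss form, which the script does not check; the helper name days_old_from_name is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local;

# Hypothetical helper: derive the age in days from a directory name,
# assuming the name is a local-time yyyyMMddHHmmss timestamp.
sub days_old_from_name {
  my ($name, $now) = @_;

  my ($year, $mon, $mday, $hour, $min, $sec) =
    $name =~ /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/
    or return;   # not a timestamp-named directory

  # timelocal() expects a zero-based month and croaks on impossible
  # dates, so trap that and treat it as "unknown age".
  my $built = eval { timelocal($sec, $min, $hour, $mday, $mon - 1, $year) };
  return unless defined $built;

  return int(($now - $built) / 86400);
}

# Made-up directory name, measured against the current time.
print days_old_from_name('20081108093000', time), "\n";
```

Whether this is preferable depends on your setup: mtime reflects the last change inside the directory, while the name reflects when the build ran.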

That's it.

Published on Nov 18, 2008