Acumen Development

Working with Corrupt Subversion Repositories

January 20, 2010, by Leo Brown
categories: Development Processes

Subversion is a great piece of version control software, and has been at the core of all our development projects for many years. Sadly, there are a few weaknesses. Key among them is lack of obliterate support, but one that’s stung us a few times now is corrupt SVN databases.

Here’s a quick run-through of the solution I developed for the common ‘bad transaction’ issue, which stops your repository from checking out or modifying certain files (typically blobs).

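1. Use svnadmin verify to find your bad revision. Verification stops at the first revision it cannot read, so the bad revision is the one after the last verified revision (in this example, 195)

svnadmin verify myrepo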
* Verified revision 0.
...
* Verified revision 194.
svnadmin: Malformed representation header
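If you'd rather not read through the verify output by hand, something like the following could pick out the bad revision for you. It's only a sketch: first_bad_revision is a name I've made up for illustration, and it assumes verify stopped at the first failure, so the bad revision is the one after the last "Verified" line. If the repository verifies cleanly, the number it prints is simply HEAD + 1 and means nothing.

```shell
#!/bin/sh
# first_bad_revision (hypothetical helper): reads `svnadmin verify` output
# on stdin and prints the revision after the last "* Verified revision N."
# line. Returns non-zero if no revision was verified at all.
first_bad_revision() {
    awk '
        /^\* Verified revision [0-9]+\./ {
            n = $4              # e.g. "194."
            sub(/\.$/, "", n)   # strip the trailing full stop
            last = n
        }
        END {
            if (last == "") exit 1
            print last + 1
        }
    '
}

# Usage (progress may go to stdout or stderr, so capture both):
#   svnadmin verify myrepo 2>&1 | first_bad_revision
```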

2. Dump the revisions before the bad one to a dumpfile, then append the revisions after it as an incremental dump, so that the bad revision (195) is skipped

svnadmin dump myrepo -r0:194 > my.dmp
svnadmin dump myrepo --incremental -r196:HEAD >> my.dmp
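The two dump commands can be wrapped up so that the ranges are worked out from the bad revision number. This is a rough sketch rather than a finished tool: dump_skipping and the SVNADMIN override variable are names I've invented, and it assumes the bad revision is neither revision 0 nor HEAD (if it is HEAD, there is nothing after it to dump and the second command will fail).

```shell
#!/bin/sh
# SVNADMIN is a hypothetical override, handy for dry runs (SVNADMIN=echo).
SVNADMIN=${SVNADMIN:-svnadmin}

# dump_skipping REPO BAD_REV OUTFILE (hypothetical helper)
# Dumps revisions 0..BAD_REV-1, then appends BAD_REV+1..HEAD as an
# incremental dump, leaving BAD_REV out of OUTFILE.
dump_skipping() {
    ds_repo=$1
    ds_bad=$2
    ds_out=$3
    if [ "$ds_bad" -lt 1 ]; then
        echo "dump_skipping: bad revision must be 1 or greater" >&2
        return 1
    fi
    "$SVNADMIN" dump "$ds_repo" -r "0:$((ds_bad - 1))" > "$ds_out" || return 1
    "$SVNADMIN" dump "$ds_repo" --incremental -r "$((ds_bad + 1)):HEAD" >> "$ds_out"
}

# Usage:
#   dump_skipping myrepo 195 my.dmp
```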

3. Try to rebuild the repository by creating a new repository and using svnadmin load to read in the newly created dumpfile.

svnadmin create newrepo
cat my.dmp | svnadmin load newrepo

4. It’s likely that some paths cannot be rebuilt, and you will see errors naming them. Use svndumpfilter to exclude each problematic path, writing a new dumpfile for each exclusion, then repeat step 3 with a fresh repository and the filtered dumpfile (my_filtered.dmp).

cat my.dmp | svndumpfilter exclude "My/Bad/Path" > my_filtered.dmp
Excluding prefixes:
   '/My/Bad/Path'

Revision 0 committed as 0. (etc)
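Since svndumpfilter exclude accepts several path prefixes at once, one option is to keep a growing list of bad paths and always re-filter from the original my.dmp, rather than chaining filtered files together. A minimal sketch of that idea follows; refilter and the SVNDUMPFILTER override variable are hypothetical names of mine, not part of Subversion.

```shell
#!/bin/sh
# SVNDUMPFILTER is a hypothetical override, useful for testing.
SVNDUMPFILTER=${SVNDUMPFILTER:-svndumpfilter}

# refilter SRC_DUMP DST_DUMP PATH_PREFIX... (hypothetical helper)
# Writes DST_DUMP as SRC_DUMP with every listed path prefix excluded.
refilter() {
    rf_src=$1
    rf_dst=$2
    if [ "$#" -lt 3 ]; then
        echo "usage: refilter SRC_DUMP DST_DUMP PATH_PREFIX..." >&2
        return 2
    fi
    shift 2
    "$SVNDUMPFILTER" exclude "$@" < "$rf_src" > "$rf_dst"
}

# Usage: add one more prefix each time svnadmin load complains.
#   refilter my.dmp my_filtered.dmp "My/Bad/Path"
#   refilter my.dmp my_filtered.dmp "My/Bad/Path" "Another/Bad/Path"
```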

5. Replace the rebuilt repository’s UUID with that of the old repository, by using svnlook uuid and svnadmin setuuid.

svnadmin setuuid newrepo `svnlook uuid myrepo`

6. Use svn switch --relocate to point a checked-out working copy at your new repository. (Subversion 1.7 and later also provide svn relocate for this.)

svn switch --relocate http://host/myrepo http://host/newrepo .

7. If all goes to plan, replace your corrupt repository with the rebuilt one, and svn switch any working copies back to their original URL. Then copy back in the data that you lost in the exclusion process, commit, and get on with your work.

There you go, loads of messing around to get back to square one. Once that’s done, consider moving away from FSFS as your SVN backend; in our experience it has been much more buggy than the alternative, Berkeley DB!

Basic Scraping

November 22, 2008
categories: Development Processes, Open Source
tags: mashups, php, scraping
A short introduction to Web Page Scraping

While in production applications we all favour use of an API, there are a lot of situations, such as in ‘Mashups’ (I love how that term has been reappropriated from Jungle music) where you need to do some page scraping.

It’s occurred to me how these very easy techniques seem inaccessible to many people, so I thought I’d post a few bits and bobs about some basic scraping methods.

Here’s a bit of code I wrote that uses PHP’s DOMDocument class to treat an HTML page as XML and fetch, in this case, the incredibly useful current world population… fantastic!

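<?php
 // where to find population data...
 $location = array(
  'url' => 'http://www.census.gov/ipc/www/popclockworld.html',
  'id'  => 'worldnumber',
 );

 // fetch the page, stopping early if the request fails
 $file = file_get_contents($location['url']);
 if ($file === false) {
  die('Could not fetch ' . $location['url']);
 }

 // initialise a new document; real-world HTML is rarely valid,
 // so collect libxml parse warnings instead of printing them
 libxml_use_internal_errors(true);
 $d = new DOMDocument();
 $d->loadHTML($file);

 // get and print current world population
 $e = $d->getElementById($location['id']);
 if ($e === null) {
  die('No element with id ' . $location['id']);
 }
 print $e->nodeValue;
?>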

Sample output: 6,738,610,278