As one of the last things I do at the bank, I’ve been asked to look at how we might import our vob’s substantial history into Subversion, should we decide to move from Clearcase. It’s a bit of a hairy problem, since Clearcase and Subversion have completely different versioning schemes (eg, changesets vs. status for individual files) and also because we are heavily dependent on branching and labelling regularly for our development workflow. The currently available tools for converting versioned history between SCM systems are fairly rudimentary, with the most complete ones being those that convert from systems similar to Subversion in philosophy — RCS and CVS. There is a Clearcase-to-Subversion converter, but as the author notes, it has many limitations, not least of which is that it pulls only a single branch.
Since we branch and label religiously, I decided it would be plenty for us to pick up all the ‘release’ labelled versions, treating each like a changeset and importing them to Subversion. One way to do this might be to simply copy all the files over for each release label, dump them into Subversion and commit, but this would be clunky at best and would likely involve some trial and error, so I’m leaving it as a last resort.
Instead, I’m looking at dumping each Clearcase label as a Subversion ‘dump file’. Not only is this the ‘right way’, it would give us a lot of control over how much history we import, which attributes come along, etc. It would also be a more extensible mechanism, so that branches could be dumped too if someone wants that later.
Unfortunately, there’s very little documentation on the dump format that Subversion expects. The best bit of info I’ve found is that it’s very similar to the format described in RFC 822. So rather than work from a spec, I dumped the HEAD revision of one of my repositories and am using it as a guide to the format. Here’s a somewhat long, but interesting, chunk:
    Node-path: trunk/tools
    Node-kind: dir
    Node-action: add
    Prop-content-length: 10
    Content-length: 10

    PROPS-END

    Node-path: trunk/tools/compare_errors
    Node-kind: file
    Node-action: add
    Prop-content-length: 36
    Text-content-length: 13
    Text-content-md5: bcf5a1a11da0d1a25494a4be5798e8bd
    Content-length: 49

    K 14
    svn:executable
    V 1
    *
    PROPS-END
    #!/bin/ksh -p
Each directory and file (a ‘node’) in the revision being dumped starts with a block of headers that give the action performed in the revision (added, changed, deleted), the lengths of the sections that follow, and a checksum to detect corruption. Next comes a properties section listing any properties attached to the node. K and V stand for ‘key’ and ‘value’ respectively, and the numbers that follow them are the lengths of the property name and value (in characters, I’m assuming; I haven’t looked at how they treat binary files yet). The sample bears this out: svn:executable is 14 characters long and its value, *, is 1. After the PROPS-END marker, the file is dumped, character-for-character.
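To check my reading of the length fields, here’s a rough Python sketch that builds one node record the way the sample appears to be laid out. The function names (props_block, node_record) are my own, the lengths are counted in bytes, and the blank lines I put between records are guesses from the sample rather than anything documented, so treat it as a starting point and not a reference implementation.

```python
import hashlib


def props_block(props):
    """Serialize properties as K/V pairs, terminated by PROPS-END."""
    out = b""
    for key, value in props.items():
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        # Each K/V line carries the byte length of the name or value that follows.
        out += b"K %d\n%s\nV %d\n%s\n" % (len(k), k, len(v), v)
    return out + b"PROPS-END\n"


def node_record(path, kind, action, props=None, text=None):
    """Build one node record: headers, a blank line, then props and file text.

    Assumption: a blank line separates headers from content, and blank lines
    follow the content. This is inferred from the sample dump only.
    """
    props_bytes = props_block(props or {})
    headers = [
        "Node-path: %s" % path,
        "Node-kind: %s" % kind,
        "Node-action: %s" % action,
        "Prop-content-length: %d" % len(props_bytes),
    ]
    body = props_bytes
    if text is not None:
        headers.append("Text-content-length: %d" % len(text))
        headers.append("Text-content-md5: %s" % hashlib.md5(text).hexdigest())
        body += text
    # Content-length covers the properties section plus the file text.
    headers.append("Content-length: %d" % len(body))
    return "\n".join(headers).encode("utf-8") + b"\n\n" + body + b"\n\n"


if __name__ == "__main__":
    record = node_record(
        "trunk/tools/compare_errors",
        "file",
        "add",
        props={"svn:executable": "*"},
        text=b"#!/bin/ksh -p",
    )
    print(record.decode("utf-8"))
```

Fed the same path, property and first line as the sample, this comes out with the same three lengths (36, 13 and 49), which suggests Content-length is simply the properties section plus the file text. The real script would loop over every file under a Clearcase label and write the records out one after another.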
So, as of now, I’m thinking it’s do-able, though it’s going to take a bit of time.



This is pushing the limits of a geeky post, even for you. But then again you were the one who liked using the vi editor so that no one could understand what you were doing.