Staring down the Bazooka

Posted on 30 September 2007 by jose

I've been using Drupal heavily for a couple of months now, and I've finally absorbed enough to start troubleshooting the various migration issues that cropped up when I moved from WordPress to what I'm playfully calling The Bazooka. The name fits because there's just so much you can do, and so many modules for when you can't do something. It's amazing and, as I originally assessed years ago when I wanted a blogging platform, almost too powerful. Almost.

So. Migration issues. When I migrated, my WordPress categories came across twice: once as terms in the vocabulary called Topics, and a second time as terms associated with no vocabulary at all. I should preface this by mentioning that the first migration attempt using the migration module failed, but the second one got my posts and comments over, more or less. It seems that it also tried to bring the categories over, but with a default Topics vocabulary already created, I imagine the migration tool ran into a vocabulary collision the second time around.

This is how all my WordPress posts ended up with (apparently) no categories. However, in playing around with Views, I could clearly tell that there were duplicate categories, even if I couldn't see them in the categories admin area (they were not assigned to any vocabulary, having failed assignment to Topics in the migration collision). I thought about just assigning them to a shell vocabulary and then trying to fix things from there, but I realized that the category duplication meant that all my posts were assigned to the second set of 15 categories. I went into the database, exported the node-to-category relation table, pasted it into Calc (a spreadsheet, for those keeping score at home) and Textpad, and then subtracted 15 from the term ID of every term above 15 associated with any node. A TRUNCATE and INSERT later, and my WordPress categories are back!
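For anyone who would rather script the subtract-15 step than do it in a spreadsheet, here is a minimal sketch in Python. It assumes the exported relation table is just pairs of node ID and term ID; the table name `term_node`, the column names `nid` and `tid`, and both function names are my assumptions for illustration, so check them against your own schema before running anything.

```python
# Sketch only: remap duplicated term IDs back onto the original set.
# Assumption: the duplicates are exactly the original IDs plus 15.
DUPLICATE_OFFSET = 15


def remap_term_ids(rows, offset=DUPLICATE_OFFSET):
    """Shift every term ID above `offset` down by `offset`.

    `rows` is an iterable of (nid, tid) pairs. Pairs that become
    identical after the shift are emitted only once.
    """
    seen = set()
    remapped = []
    for nid, tid in rows:
        new_tid = tid - offset if tid > offset else tid
        pair = (nid, new_tid)
        if pair not in seen:
            seen.add(pair)
            remapped.append(pair)
    return remapped


def to_insert_sql(rows, table="term_node"):
    """Build one multi-row INSERT for the remapped pairs.

    The table and column names are hypothetical defaults.
    """
    values = ",\n".join(f"({nid}, {tid})" for nid, tid in rows)
    return f"INSERT INTO {table} (nid, tid) VALUES\n{values};"


if __name__ == "__main__":
    exported = [(1, 16), (2, 30), (3, 5)]
    print(to_insert_sql(remap_term_ids(exported)))
```

Dropping repeated pairs is a precaution on my part: if a node somehow pointed at both copies of a category, the shifted rows would otherwise collide on insert.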

This was a piece of cake compared to the database hacking I had to do to migrate my old poll data. The formats are entirely different, and the migration module has nothing to do with any of this, as the polls in WordPress were not part of the "core" functionality. Of course, in good Drupal fashion, although Drupal core comes with polling functionality, I had to upgrade the poll to the Advanced Poll module. That made the migration a little trickier, since the whole point of the migration was to preserve my votes across the platforms. Advanced Poll uses the Voting API module to manage its poll data, and there are two tables of interest there: the votes, and a cached tally of the votes. Without looking at the code, I knew that messing with the cached tally would be tricky. When the cached tally is generated, rows are added to and removed from the table, and I didn't want to spend a whole lot of time working out the algorithm just so I could match it and fool it into accepting the old data (something the system is manifestly not designed to do).

The migration is still a bit buggy, but it's getting there. I basically grabbed the tallies and the IP addresses from WordPress, and built the SQL INSERT statement first in Calc, and then in Textpad (to use its RegEx search and replace to put single quotes around strings and IP addresses, for example, and to get rid of the tabs that come from using a spreadsheet for generating and editing credible timestamps). I did the poll with the most results first, and it seems to be working now, although I had an indexing collision issue that I brute-forced out of the way. Let's just say Refresh is your friend. I'm sure if I looked at the code, I'd be able to figure out where it remembers what the vote index for a particular poll should be, but I'm not sure whether this is in the Voting API module or the Advanced Poll module, and it's faster, if tedious, to do it the way I did. And if there are further problems, I can fall back to code review/Drupal support forums/IRC (heavens!).
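The Calc-then-Textpad routine (quote the strings, strip the tabs) could also be sketched in a few lines of Python. This is only an illustration of the idea, not what I actually ran: the table name and column names in the example are made up, and the real Voting API columns depend on the module version, so look them up before using anything like this.

```python
import csv
import io
import re


def sql_literal(value):
    """Leave plain integers bare; single-quote everything else.

    Embedded single quotes are doubled, the standard SQL escape.
    """
    value = value.strip()
    if re.fullmatch(r"-?\d+", value):
        return value
    return "'" + value.replace("'", "''") + "'"


def tsv_to_insert(tsv_text, table, columns):
    """Turn tab-separated spreadsheet rows into one INSERT statement.

    `table` and `columns` are supplied by the caller; nothing here
    knows the real schema.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    tuples = []
    for row in reader:
        if not row:
            continue  # skip blank lines left over from the paste
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}")
        tuples.append("(" + ", ".join(sql_literal(cell) for cell in row) + ")")
    column_list = ", ".join(columns)
    return f"INSERT INTO {table} ({column_list}) VALUES\n" + ",\n".join(tuples) + ";"


if __name__ == "__main__":
    pasted = "7\t1\t192.168.0.1\t1191110400\n"
    # Hypothetical table and column names, for illustration only.
    print(tsv_to_insert(pasted, "example_votes",
                        ["content_id", "value", "hostname", "timestamp"]))
```

IP addresses end up quoted because they are not plain integers, while Unix timestamps stay bare, which is the same split I was making by hand with the RegEx replace.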