I migrated my content from dasBlog to WordPress. Here are some things I learned in the process.
If this helped you out in your migration I’d really like to hear about it. Please let me know by posting a comment below.
My upgrade process
- Mirror dasBlog directory to local disk
- Download dasBlog to BlogML exporter (http://code.msdn.microsoft.com/DasBlogML)
- Run exporter (potential issue: your content has errors). Note: the next five steps are basically the WordPress “five minute install”.
- Download the latest version of WordPress and unpack it on local drive
- Set up a MySQL database to host the WordPress content
- Make changes to wp-config.php
- Upload WordPress files into the same server directory as dasBlog (my host has default.aspx set to higher priority than index.php, so dasBlog is still functional at this point)
- Run WordPress installation script
- Navigate to WordPress Dashboard (http://[dasBlogPath]/wp-admin/)
- Download BlogML import plug-in (http://www.kavinda.net/content/other/BlogML-WordPress-Import.zip)
- Install the BlogML import plug-in: put the files from the previous step into the WordPress import directory (/wp-admin/import/). Note: on WordPress 3.0 you may need to create the import directory.
- Install Redirection plugin (navigate to /wp-admin/plugin-install.php and search for “Redirection”, find the “Redirection” plugin by John Godley and install)
- Navigate to import page (/wp-admin/import.php)
- Click the BlogML link and follow the steps (potential issue: your content may be too big)
- Duplicate the permalinkmap.csv file and reformat to look like .htaccess file (see: don’t lose traffic)
- Import .htaccess file into Redirection plugin (from import section of options page: /wp-admin/tools.php?page=redirection.php&sub=options)
- Import or manually add the regex dasBlog to WordPress post mappings (see: mapping traffic…)
- Insert redirection code into dasBlog item template (see: getting people to leave…)
Here is a quick summary of the issues I encountered:
- Exporting to BlogML had content errors
- BlogML file was too large to import
- Posts with duplicate titles get dropped on import into WordPress
- Redirection plugin is case sensitive
- Some static links to site are based on dasBlog GUIDs
Some suggestions for your dasBlog to WordPress migration.
Up front I decided to try to upgrade the blog in place, keeping the same basic path of http://www.little.org/blog. With the new blog in the same location as the old one, I can use a WordPress plug-in to catch requests for old links and redirect them to the new links. The Redirection plug-in will import the CSV file output by the dasBlog export, but only as a “pass-through”. With a few tweaks to the CSV file, however, it can be imported as an Apache .htaccess file, which will return a 301 redirect for all requests. The pass-through keeps the URLs intact, while the 301 will cause search engines to update their references to your page so future traffic lands directly on the target page. To import as a pass-through you just need to delete the first row of the permalinkmap file output by the export to BlogML. To import as .htaccess you need to do a little more work, changing it to a space-delimited file with “redirect 301” at the start of each line. My process was:
- Duplicate file (so you keep an original copy of the permalink map)
- Open the duplicate with your favorite spreadsheet program (I used Excel)
- Remove the first row (headers)
- Insert a new first column filled with “redirect 301”
- Save and exit
- Rename file and change file type to .txt
- Open the file with your favorite text editor (I used Notepad)
- Find and replace “,http://” with “ http://” (replace the commas with a space, and only the commas before the start of each URL)
- Remove the extra blank line from the end of the file
- Save and exit
- Import into the Redirection plugin
- Rename default.aspx to something else to disable most dasBlog traffic
My limited research indicated that the .htaccess format for a redirect should be:
redirect 301 /blog/2003/12/31/Edgeamuhkashun.aspx http://www.little.org/blog/2003/12/31/edge-a-muh-ka-shun/
The first path is supposed to be relative, but I found that the import was perfectly happy taking the fully qualified URLs from the exported permalinkmap.csv (which saved me from having to do more involved tweaking of the CSV file).
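If you would rather skip the spreadsheet and find-and-replace steps, the same reshaping can be scripted. This is only a minimal sketch: it assumes the permalink map is a header row followed by two-column rows (old URL, new URL), and the header names and the function name `permalink_csv_to_redirects` are made up for illustration. Check your own export before relying on it.

```python
import csv
import io


def permalink_csv_to_redirects(csv_text):
    """Turn permalinkmap-style CSV text into 'redirect 301 OLD NEW' lines.

    Assumption: a header row, then rows of old URL, new URL.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # drop the header row
    lines = []
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed rows, such as a trailing empty line
        old_url, new_url = row[0].strip(), row[1].strip()
        lines.append(f"redirect 301 {old_url} {new_url}")
    return lines


sample = (
    "OldUrl,NewUrl\n"  # hypothetical header names
    "http://www.little.org/blog/2003/12/31/Edgeamuhkashun.aspx,"
    "http://www.little.org/blog/2003/12/31/edge-a-muh-ka-shun/\n"
    "\n"
)
redirect_lines = permalink_csv_to_redirects(sample)
print("\n".join(redirect_lines))
```

Write the resulting lines to a text file and import that file the same way as the hand-edited version.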
The following items can be added to your .htaccess import to provide translation for dasBlog links people may have stored in their favorites (or that search engines have cached by crawling your site). After importing these you will need to go in and check the regex box for each one. There aren’t many, though, so it should be quick. A note about the matches below: you may have noticed that I flagged some of the matches as case insensitive (using “(?i)”). While I added it to most entries, the place I suspect it’s really needed is the Category matching. Since it is working, I didn’t bother with any more fine tuning (if it ain’t broke…).
Tip 3: Maintaining the traffic
I figured out which incoming links needed repair by spending time watching the Redirection log. From the Redirection plugin menu, click on “Modules”, then look at the number of hits the 404 module has received. Click that link to find all the pages returning 404 errors to your visitors. The Redirection plugin lets you create a new redirection from any entry via a handy “+” link at its far right. Go back there periodically to see if there are any links that aren’t working or traffic coming to pages you’ve moved. Oh, and head over to the donation pages for any of the plugins you used. Encourage the authors to keep building cool stuff.
I started my blog on Blogger, migrated to dasBlog and went through several version upgrades of dasBlog. Not all of my content was in exactly the same format, so even though dasBlog would happily serve it up, there were a couple of different errors on the export. The first error I hit was: “Error: Object reference not set to an instance of an object”.
This I tracked down to the formatting of empty trackback data in the comments for a couple of really old posts. When I opened up the dayfeedback.xml file I found empty trackbacks formatted like:
<Trackings> <Tracking> </Tracking> </Trackings>
I compared this to more recent feedback entries and found they should be formatted like:
The second error in export was: “Error: url Parameter name: value”
This second error I tracked down to an empty href in one of my posts (for some reason I had a 0px image with no HREF set). Once I corrected this second error I was able to export all of my content without incident.
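Hunting for the empty trackbacks by hand gets tedious if you have years of feedback files. Here is a rough sketch of a check you could run before exporting. It only looks for `Tracking` elements with no content, matching the empty shape shown above. The function name `count_empty_trackings` is my own, and real feedback files may be structured differently from this tiny sample.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def count_empty_trackings(feedback_xml):
    """Count Tracking elements that have no children, attributes, or text."""
    root = ET.fromstring(feedback_xml)
    empty = 0
    for element in root.iter():
        if local_name(element.tag) != "Tracking":
            continue
        has_text = bool((element.text or "").strip())
        if len(element) == 0 and not element.attrib and not has_text:
            empty += 1
    return empty


# Same shape as the empty trackback shown above.
sample = "<Trackings> <Tracking> </Tracking> </Trackings>"
empty_count = count_empty_trackings(sample)
print(empty_count)
```

Any file that reports a count above zero is worth opening before you run the exporter.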
I later found a post that wouldn’t import… it just failed silently. When I tried to re-import just the troublesome post on its own, it still failed silently. I ended up having to transfer the post manually. The BlogML export has the post content decoded, while dasBlog has the XML encoded. I used the Coder’s Toolbox encode/decode tool to change the format before manually pasting it into an XML file for import. It may not have been necessary, but it imported, so I was happy.
Note: I recommend trying the import before going through the work of breaking up your content. For many this isn’t necessary. If your file is too large, the content simply will not import at all (i.e. you don’t end up with garbage posts, so you can just keep trying until you succeed).
The BlogML import plug-in for WordPress says the max BlogML import size is 8MB but I found I needed to keep my BlogML file sizes under 300K or it would error on import into WordPress. The way I limited the file size was by only putting a year of posts at a time into my DasBlog content folder. When doing multiple exports make sure you give a unique file name to each export AND rename the permalink map file. Another gotcha with multiple exports: quit the export program after each export. I was getting duplication of content when I left the app running.
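If you have many years of content, working out which files belong in each batch can be scripted. The sketch below groups file names by year. It assumes the content files are named with a leading yyyy-MM-dd date (for example 2003-12-31.dayentry.xml), which is an assumption you should check against your own content folder. `group_by_year` is a hypothetical helper, not part of dasBlog or the exporter.

```python
import re
from collections import defaultdict

# Assumption: content files start with a yyyy-MM-dd date, for example
# "2003-12-31.dayentry.xml". Check your own content folder first.
DATED_FILE = re.compile(r"^(\d{4})-\d{2}-\d{2}\..+\.xml$")


def group_by_year(filenames):
    """Map each year to the sorted list of dated content files for that year."""
    batches = defaultdict(list)
    for name in filenames:
        match = DATED_FILE.match(name)
        if match:  # files without a leading date are left out
            batches[match.group(1)].append(name)
    return {year: sorted(names) for year, names in sorted(batches.items())}


files = [
    "2003-12-31.dayentry.xml",
    "2003-12-31.dayfeedback.xml",
    "2007-02-19.dayentry.xml",
    "notes.txt",  # hypothetical undated file
]
batches = group_by_year(files)
for year, names in batches.items():
    print(year, names)
```

Each batch would then be copied into an otherwise empty content folder for its own export run, with the exporter restarted in between.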
After all was said and done I ended up with an XML (the BlogML) and a CSV (the permalinkmap) file for each year my blog was active.
With over eight years of blogging it’s bound to happen: some of my posts have the same title. While there is a unique date in the permalink path, WordPress still treats duplicate post titles as conflicts. When importing a lot of content the BlogML importer won’t give you an error for duplicate post titles, so you may not know it happened. I caught the missing content by monitoring the redirection plugin’s 404 logs. The log told me that there should be a post titled “Woo Woo” on 2/19/2007 but there was nothing there. I located the post in my dasBlog content, did an export of just that post, modified the exported XML file to change the title and then imported it into WordPress.
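Since the importer stays quiet about this, it can help to look for repeated titles in the export before importing. This is a minimal sketch that assumes each post is a `post` element with a `title` child. It ignores XML namespaces so it does not depend on the exact schema, and the sample document is a made-up, heavily trimmed stand-in for a real export.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def duplicate_titles(blogml_text):
    """Return the titles that appear on more than one post element."""
    root = ET.fromstring(blogml_text.strip())
    titles = []
    for element in root.iter():
        if local_name(element.tag) != "post":
            continue
        for child in element:
            if local_name(child.tag) == "title":
                titles.append((child.text or "").strip())
    counts = Counter(titles)
    return sorted(title for title, count in counts.items() if count > 1)


# Hypothetical, heavily trimmed export; real files carry many more elements.
sample = """
<blog>
  <posts>
    <post id="1"><title>Woo Woo</title></post>
    <post id="2"><title>Something else</title></post>
    <post id="3"><title>Woo Woo</title></post>
  </posts>
</blog>
"""
dupes = duplicate_titles(sample)
print(dupes)
```

Because I exported one year at a time, a check like this would need to run across all of the yearly files together to catch titles repeated in different years.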
WordPress URLs are case sensitive while dasBlog URLs are not. This means that the permalinkmap I imported only redirects the inbound links whose case matches exactly. The solution is to turn the mapping into a regex comparison and add the case insensitive flag (“(?i)”) to the link you’re trying to match.
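To see what the flag changes, here is a small illustration. Redirection is a PHP plugin, so Python’s `re` module is only a stand-in here; it happens to accept the same “(?i)” prefix. The path comes from the example earlier in this post, and the lowercased incoming link is made up.

```python
import re

# Pattern for one old post URL, with and without the inline flag.
case_sensitive = re.compile(r"/blog/2003/12/31/Edgeamuhkashun\.aspx")
case_insensitive = re.compile(r"(?i)/blog/2003/12/31/Edgeamuhkashun\.aspx")

incoming = "/blog/2003/12/31/edgeamuhkashun.aspx"  # same link, lowercased

sensitive_hit = case_sensitive.search(incoming) is not None
insensitive_hit = case_insensitive.search(incoming) is not None
print(sensitive_hit, insensitive_hit)
```

Only the pattern with the flag matches the lowercased link, which is the situation the exact-match redirects miss.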
As far as I know there isn’t a way to import regex entries into Redirection, so I’ve been waiting for links to show up in my 404 log and editing them manually as needed. I figure that only a fraction of my posts are actually saved by someone, and I’ll just fix those links.
There are a lot of GUID-based permalinks out there for my dasBlog site. Since I couldn’t find a way to map the GUIDs to the post title without dasBlog I used dasBlog to do the work.
In the regex mapping I send any GUID-based traffic to the comment view of the dasBlog post. Since this would just land the user back in dasBlog I’ve added a redirect into the dasBlog item template which bounces the user to the post title (which then gets caught by the Redirection mapping).
It’s hacky, but it works. Here’s the bit of JS I inserted into the item template in the itemTitleStyle DIV:
I was taking a look at some of the pingback links for this post, and for folks working on making this same transition I’d encourage you to do the same. Bob’s post is notable for the amount of helpful detail he added covering issues you may encounter in the upgrade.