Update Imported Markdown Posts
As you probably know, I’ve migrated from WordPress to GitHub Pages, as I blogged here.
The migration itself was fairly easy, but migrating the actual content proved to be trickier. There are lots of resources on using Jekyll’s importers; I found this one useful. Just export everything to an XML file and run the converter to get the posts in Markdown. The problem is that the YAML front matter it generates is a bit messy:
---
layout: post
title: Blind Password Masking
date: 2011-06-14 03:46:18.000000000 +00:00
categories:
- Off the Top of My Head
tags: []
status: publish
type: post
published: true
meta:
  _edit_last: '1'
author:
  login: blogadmin
  email: admin@myvirtualhome.net
  display_name: Volkan
  first_name: ''
  last_name: ''
---
I don’t want or need most of this stuff anyway! I also had two main issues:
- Images didn’t work because the imported posts lost the full image path. I use S3 to host all images, but the imported posts were converted to use a local assets folder. There may be a configuration setting for that, but in my case I decided to convert all my posts from HTML to Markdown anyway (which was a great way to practice Markdown).
- The main issue was with Disqus. It’s not like people are racing to submit comments on my ramblings, but I’d still like to have Disqus enabled on all my posts. Apparently, to enable comments you need to specify it in the front matter like this:
comments: true
Manual vs. programmatic
At first I resisted the temptation to write a small application to convert the front matter, but manual conversion soon proved to be very time-consuming with 100+ posts. So I developed the simple console application below. It scans the folder you specify (filtering for *.markdown files), reads each post’s existing front matter and converts it to the format I wanted:
---
layout: post
title: @TITLE
date: @DATE
categories: [@CATEGORIES]
comments: true
---
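As a rough illustration of the conversion described above, here is a minimal Python sketch (the actual tool is a C# console application, so this is not its code). It assumes the imported front matter looks like the sample earlier in the post, with block-style category lists and indented nested keys. The names `TEMPLATE`, `parse_fields` and `convert_post` are made up for this example, and the date is copied through unchanged because the post doesn't say how `@DATE` is formatted.

```python
import re

# Target front matter, mirroring the @TITLE / @DATE / @CATEGORIES template above.
TEMPLATE = """---
layout: post
title: {title}
date: {date}
categories: [{categories}]
comments: true
---
"""

# Front matter is everything between the first two '---' lines.
FRONT_MATTER = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)(.*)\Z", re.DOTALL)


def parse_fields(fm_lines):
    """Pull out title, date and categories; ignore every other key."""
    fields = {"title": "", "date": "", "categories": []}
    in_categories = False
    for line in fm_lines:
        stripped = line.strip()
        if in_categories and stripped.startswith("- "):
            fields["categories"].append(stripped[2:].strip())
            continue
        in_categories = False
        if line[:1].isspace():
            continue  # nested key (e.g. under 'author:'), not needed
        key, sep, value = line.partition(":")  # split at the first colon only
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "categories":
            in_categories = True  # assumption: block-style list follows
        elif key in ("title", "date"):
            fields[key] = value
    return fields


def convert_post(text):
    """Return the post with its front matter replaced by the short template."""
    match = FRONT_MATTER.match(text)
    if match is None:
        raise ValueError("post has no front matter block")
    fields = parse_fields(match.group(1).splitlines())
    header = TEMPLATE.format(
        title=fields["title"],
        date=fields["date"],
        categories=", ".join(fields["categories"]),
    )
    return header + match.group(2)  # body is kept verbatim
```

Calling `convert_post` on the text of an imported post returns the same body with only the short front matter on top. Inline lists such as `categories: [a, b]` are not handled in this sketch.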
I wanted to keep the front matter simple and clean. Also, as all my posts are now in pure Markdown, I can easily loop through them and update elements (like converting H3 to H2 or adding tags to the front matter).
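To give an idea of what such a follow-up pass might look like, here is a small Python sketch of the H3-to-H2 case. It is only an illustration under a few assumptions: headings use the `###` prefix style, fenced code blocks are marked with three backticks, and the function name `h3_to_h2` is hypothetical.

```python
FENCE = "`" * 3  # marker that opens and closes a fenced code block


def h3_to_h2(markdown):
    """Turn '### Heading' lines into '## Heading', leaving code blocks alone."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code block
        elif not in_fence and (line.startswith("### ") or line == "###"):
            line = line[1:]  # drop one '#'
        out.append(line)
    return "\n".join(out)
```

Deeper headings such as `####` are left untouched, since only lines starting with exactly three `#` characters are rewritten.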
Source Code
Usage
It probably won’t apply to most situations, but it helped me out, so why not publish it just in case, right? To use it, change the ROOT_FOLDER value in the Program class at the bottom. (Do NOT forget to back up your posts first!)
As I wanted to revisit all my posts, I wanted the converted files to be marked rather than have them automatically replace the originals. So when you run the program, it deletes the original post and creates the updated one with “.output” appended to the file name. That way you can easily find which files were modified by checking the extension. If you want it to replace the original post instead, uncomment this line at the end of the ConvertFile method:
// File.Move(outputFilePath, inputFilePath);
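The file handling described in this section can be sketched in Python roughly as follows. This is not the C# program itself. The function name `convert_folder` and the `replace_original` flag are invented for the example, the flag stands in for uncommenting the `File.Move` line, and scanning only the top level of the folder is an assumption.

```python
from pathlib import Path


def convert_folder(root_folder, convert, replace_original=False):
    """Apply `convert` (text -> text) to every *.markdown file in root_folder.

    Each result is written next to the original with '.output' appended,
    and the original is deleted. With replace_original=True the output is
    moved back over the original name instead.
    """
    written = []
    # sorted() collects the matches before any file is created or deleted.
    for input_path in sorted(Path(root_folder).glob("*.markdown")):
        output_path = input_path.with_name(input_path.name + ".output")
        text = input_path.read_text(encoding="utf-8")
        output_path.write_text(convert(text), encoding="utf-8")
        input_path.unlink()  # original is removed, as described above
        if replace_original:
            output_path.replace(input_path)  # counterpart of File.Move
            written.append(input_path)
        else:
            written.append(output_path)
    return written
```

Leaving `replace_original` off keeps the ".output" marker, so converted posts are easy to spot while reviewing them. As with the original tool, run it on a backup copy first.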