Ruby-Based Pebble2WordPress Importer

Vamsi Krishna


We recently migrated from Pebble to WordPress and looked for tools to convert Pebble blogs to WordPress. In the end I wrote a simple application based on Ruby and ActiveRecord to do the migration.
We thought we would share the code so that other people can contribute and improve the application.
Some basics:

  • Pebble stores the blog data (post author, post content, comments, etc.) in XML files
  • WordPress needs that data in a MySQL database
  • Ruby is a cool language and I thought I would use it for this task


  • Use Pebble to export the posts in XML format
  • Read the XML data and store it in the WordPress database format

Using the Application

  1. Go to Google Code and check out the source from SVN
  2. Configure db.yml
  3. Put all your Pebble XML files in the inputXML directory
  4. Set the basedir variable (line 5 of main.rb) to the absolute path of your Pebble XML files
  5. Run main.rb
  6. Check the database for your blog data
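
As a rough illustration of step 2, here is a minimal sketch of what the database settings might look like and how Ruby reads them. The key names follow ActiveRecord's usual connection options, and every value (including the adapter name) is a placeholder assumption, not taken from the project.

```ruby
require 'yaml'

# Hypothetical contents of the config file; every value is a placeholder.
SAMPLE_CONFIG = <<~YAML
  adapter: mysql2
  host: localhost
  database: wordpress
  username: wp_user
  password: secret
YAML

# Parse the YAML text into a plain Hash of connection settings.
def load_db_config(yaml_text)
  YAML.safe_load(yaml_text)
end

config = load_db_config(SAMPLE_CONFIG)
puts config['adapter']
puts config['database']
```

A Hash like this is the kind of argument that ActiveRecord::Base.establish_connection accepts.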

Now, if you are interested, here is some explanation of the application.
1. Language – Ruby, ORM framework – ActiveRecord
2. The directory structure is
3. Sample Pebble exported XML structure:
[sourcecode language='xml']

Title for the blog

12 Jan 2006 07:33:24:301 +0100

truetrue /sd


23 Jan 2006 16:01:37:839 +0100
[/sourcecode]
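
The timestamps in the sample use a layout that differs from the DATETIME values WordPress stores in columns such as post_date and post_date_gmt. The following is a minimal sketch of one way to convert them; the helper name pebble_to_wp_dates is hypothetical and is not part of the application.

```ruby
require 'time'

# Timestamp layout seen in the Pebble export: day, month name, year,
# time with milliseconds after a colon, then the UTC offset.
PEBBLE_FORMAT = '%d %b %Y %H:%M:%S:%L %z'
# DATETIME layout used by the WordPress date columns.
WP_FORMAT = '%Y-%m-%d %H:%M:%S'

# Returns [local, gmt] strings, e.g. for post_date and post_date_gmt.
def pebble_to_wp_dates(stamp)
  t = Time.strptime(stamp, PEBBLE_FORMAT)
  [t.strftime(WP_FORMAT), t.getutc.strftime(WP_FORMAT)]
end

local, gmt = pebble_to_wp_dates('12 Jan 2006 07:33:24:301 +0100')
puts local
puts gmt
```

The milliseconds are dropped on output because the WordPress columns only keep whole seconds.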

4. Database table structure (wp_comments)
This is the wp_comments table. wp_posts and wp_comments are linked by comment_post_ID (a foreign key); one post can have multiple comments.
Let's look at the application in more detail.
This Ruby application has six files in total (excluding test cases).
1. main.rb
This is the main file that we run to do the migration. It calls the methods for parsing the XML and inserting the data into MySQL.
Here is a code snippet from my main.rb file:
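[sourcecode language='ruby']
require_relative 'xml_parsing'
require_relative 'process_data'

# Please give the absolute path to your Pebble exported XML files.
basedir = "inputXML/"
# Look for the XML files inside basedir rather than the current directory.
files = Dir.glob("*.xml", base: basedir)
files.each do |file|
  begin
    if File.file?(basedir + file)
      parsedPostData = parse_xml(basedir + file)
      puts "Processed postdata for the file: " + file
      parsedCommentData = parse_comments_nodelist(basedir + file)
      puts "Processed commentdata for the file: " + file
      # The parsed objects are then handed to the insert methods in process_data.rb.
    else
      puts "Error!.. Not a file. Add unit test!.."
    end
  rescue StandardError => e
    # Report the failure and carry on with the next file.
    puts "Error: #{e}"
  end
end
puts "Processed " + files.length.to_s + " xml files successfully!"
[/sourcecode]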
2. post_data.rb
This file contains a class called PostData, which holds the properties of a blog post.
3. comment_data.rb
This file contains a class called CommentData, which holds the properties of the comments. In CommentData all instance variables are arrays.
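
To make the two data classes more concrete, here is a minimal sketch of how they could look. The attribute names and the add and size helpers are assumptions for illustration; the real classes may expose different properties.

```ruby
# Hypothetical field names; the real PostData may hold other properties.
class PostData
  attr_accessor :title, :body, :date, :author
end

# One CommentData object holds every comment of a post: each attribute is
# an array, and index i across the arrays describes the i-th comment.
class CommentData
  attr_reader :authors, :bodies, :dates

  def initialize
    @authors = []
    @bodies = []
    @dates = []
  end

  # Append one comment, keeping the parallel arrays aligned.
  def add(author, body, date)
    @authors << author
    @bodies << body
    @dates << date
  end

  # Number of comments stored so far.
  def size
    @authors.length
  end
end

comments = CommentData.new
comments.add('alice', 'Nice post', '23 Jan 2006 16:01:37:839 +0100')
comments.add('bob', 'Thanks', '24 Jan 2006 09:00:00:000 +0100')
puts comments.size
puts comments.authors.inspect
```

Keeping one array per property means a post with no comments is simply a CommentData whose arrays are empty.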
4. xml_parsing.rb
This file parses the XML file. It has two functions. parse_xml takes the XML filename as input, parses the main blog post data, wraps it in a PostData object, and returns that object. parse_comments_nodelist parses all the comment nodes (one post can have zero or more comments) and returns a CommentData object.
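
The parsing step can be sketched with REXML, which ships with Ruby. This is not the application's own code: the element names (blogEntry, title, body, date, comment, author) are assumptions about the Pebble export, the helpers parse_post and parse_comments are hypothetical, and they take XML text rather than a filename to keep the sketch self-contained.

```ruby
require 'rexml/document'

PostData = Struct.new(:title, :body, :date)
CommentData = Struct.new(:authors, :bodies, :dates)

# Assumed element names; the tags in a real Pebble export may differ.
SAMPLE_XML = <<~XML
  <blogEntry>
    <title>Title for the blog</title>
    <body>Post body</body>
    <date>12 Jan 2006 07:33:24:301 +0100</date>
    <comment>
      <author>alice</author>
      <body>Nice post</body>
      <date>23 Jan 2006 16:01:37:839 +0100</date>
    </comment>
  </blogEntry>
XML

# Read the post-level fields from the direct children of the root element.
def parse_post(xml_text)
  root = REXML::Document.new(xml_text).root
  PostData.new(root.elements['title'].text,
               root.elements['body'].text,
               root.elements['date'].text)
end

# Collect every comment node into parallel arrays (zero or more comments).
def parse_comments(xml_text)
  root = REXML::Document.new(xml_text).root
  data = CommentData.new([], [], [])
  root.elements.each('comment') do |comment|
    data.authors << comment.elements['author'].text
    data.bodies << comment.elements['body'].text
    data.dates << comment.elements['date'].text
  end
  data
end

post = parse_post(SAMPLE_XML)
comments = parse_comments(SAMPLE_XML)
puts post.title
puts comments.authors.length
```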
5. process_data.rb
This file inserts the post data and comment data into the database.
It has two functions. process_post_data inserts the post data and takes a PostData object as input (the one returned by parse_xml). process_comment_data inserts the comment data.
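
Before anything is written to the database, the parsed values have to be mapped onto WordPress columns. Here is a minimal sketch of that mapping without a database connection. The column names are real wp_posts and wp_comments columns, but the helpers build_post_row and build_comment_rows, the fixed author ID, and the status values are assumptions for illustration.

```ruby
# Build a column => value Hash for one wp_posts row.
# author_id defaults to a fixed value (assumed here to be 1).
def build_post_row(title, content, date, date_gmt, author_id = 1)
  {
    'post_title' => title,
    'post_content' => content,
    'post_date' => date,
    'post_date_gmt' => date_gmt,
    'post_author' => author_id,
    'post_status' => 'publish'
  }
end

# Turn the parallel comment arrays into one Hash per wp_comments row.
# comment_post_ID ties every comment back to its post.
def build_comment_rows(post_id, authors, bodies, dates)
  authors.zip(bodies, dates).map do |author, body, date|
    {
      'comment_post_ID' => post_id,
      'comment_author' => author,
      'comment_content' => body,
      'comment_date' => date,
      'comment_approved' => '1'
    }
  end
end

post = build_post_row('Title for the blog', 'Post body',
                      '2006-01-12 07:33:24', '2006-01-12 06:33:24')
rows = build_comment_rows(42, %w[alice bob], ['Nice post', 'Thanks'],
                          ['2006-01-23 16:01:37', '2006-01-24 09:00:00'])
puts post['post_title']
puts rows.length
```

With ActiveRecord, hashes like these could be passed to a model's create method once the post's ID is known.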
6. database.yml
This file holds the database configuration (database name, username, etc.).
The application can be used to parse Pebble XML files and store the data in a database (any database).
You can find the complete source code of the application in
Developing this application in Ruby took little time, though a lot still has to be implemented.
For example:

  1. Make post_author dynamic
  2. Read category and tag details from the XML file and store them in the WordPress database
  3. Add Unit tests and code coverage
  4. Add documentation

If you have any suggestions or improvements for this application, post a comment.
