Ruby Based Pebble2WordPress Importer

Vamsi Krishna


We recently migrated from Pebble to WordPress and looked for tools to convert Pebble blogs to WordPress. In the end I wrote a simple application based on Ruby and ActiveRecord to do the migration.
We thought we would share the code so that other people can contribute and improve the application.
Some Basics:

  • Pebble stores the blog data (post author, post content, comments, etc.) in XML files
  • WordPress needs that data in a MySQL database
  • Ruby is a cool language, so I decided to use it for this task

Tasks

  • Use Pebble to export the posts in XML format
  • Read the XML data and store it in the WordPress database format

Using the Application

  1. Go to Google Code and check out the source with svn: http://code.google.com/p/pebble2wordpress/source/checkout
  2. Configure db.yml
  3. Put all your Pebble XML files in the inputXML directory
  4. Set the basedir variable (line 5 of main.rb) to the absolute path of that directory
  5. Run main.rb
  6. Check the database for your blog data
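
For step 2, here is a minimal sketch of what reading and checking a db.yml could look like. The keys and values are assumptions based on the usual ActiveRecord connection settings, not a copy of the file in the repository, and `load_db_config` is a hypothetical helper.

```ruby
require 'yaml'

# Hypothetical contents of db.yml; every value is a placeholder.
SAMPLE_DB_YML = <<~YAML
  adapter: mysql2
  host: localhost
  database: wordpress
  username: wp_user
  password: secret
YAML

# Settings this sketch treats as mandatory (an assumption).
REQUIRED_KEYS = %w[adapter database username].freeze

# Parses the YAML text and raises if a required setting is missing.
def load_db_config(yaml_text)
  config = YAML.safe_load(yaml_text)
  missing = REQUIRED_KEYS - config.keys
  raise ArgumentError, "db.yml is missing: #{missing.join(', ')}" unless missing.empty?

  config
end

config = load_db_config(SAMPLE_DB_YML)
puts "Would connect to #{config['database']} on #{config['host']} as #{config['username']}"
```

With ActiveRecord loaded, a hash like this is what would normally be handed to `ActiveRecord::Base.establish_connection`.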

Now, if you are interested, here is some explanation of the application.
1. Language: Ruby. ORM framework: ActiveRecord
2. The directory structure is:
[Image: directory structure]
3. Sample Pebble exported XML structure:
[sourcecode language='xml']


Title for the blog



12 Jan 2006 07:33:24:301 +0100
approved
Krishna

truetrue /sd


Comment1

anonymous
anonymous@abc.com

127.0.0.1
23 Jan 2006 16:01:37:839 +0100
approved


[/sourcecode]
4. Database table structure (wp_comments)
[Image: wp_comments table structure]
This is the wp_comments table. wp_posts and wp_comments are linked by comment_post_ID (a foreign key), so one post can have multiple comments.
Let's look at the application in more detail.
This Ruby application has 6 files in total (excluding test cases):
1. main.rb
This is the main file that we run to do the migration. It calls the methods that parse the XML and insert the data into MySQL.
Here is a code snippet from my main.rb file:
[sourcecode language='ruby']
require_relative 'xml_parsing'
require_relative 'process_data'

# Set this to the absolute path of the directory with your Pebble exported XML files.
basedir = File.expand_path('inputXML')
files = Dir.glob(File.join(basedir, '*.xml'))
processed = 0
files.each do |path|
  # Skip anything named *.xml that is not a regular file (for example a directory).
  unless File.file?(path)
    puts "Error: not a file: #{path}"
    next
  end
  begin
    # Insert the post first, then its comments.
    process_post_data(parse_xml(path))
    puts "Processed post data for the file: #{File.basename(path)}"
    process_comment_data(parse_comments_nodelist(path))
    puts "Processed comment data for the file: #{File.basename(path)}"
    processed += 1
  rescue StandardError => e
    puts "Error: #{e.message}"
  end
end
puts "Processed #{processed} of #{files.length} XML files successfully."
[/sourcecode]
2. post_data.rb
This file contains a class called PostData, which holds the properties of a blog post.
3. comment_data.rb
This file contains a class called CommentData, which holds the properties of the comments. In CommentData every instance variable is an array, so that one object can hold all the comments of a post.
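
To make the two data holders concrete, here is a rough sketch of how they could look. It is not the code from the repository: the attribute names and the `add` and `count` methods are assumptions chosen for illustration.

```ruby
# Sketch only: attribute names are assumptions, not the project's.

# One blog post: a plain value object with one field per property.
PostData = Struct.new(:title, :body, :author, :date, :state)

# All comments of one post, stored as parallel arrays:
# index i in every array describes the i-th comment.
class CommentData
  attr_reader :authors, :emails, :bodies, :dates

  def initialize
    @authors = []
    @emails  = []
    @bodies  = []
    @dates   = []
  end

  # Appends one comment by pushing one value onto each array.
  def add(author, email, body, date)
    @authors << author
    @emails  << email
    @bodies  << body
    @dates   << date
    self
  end

  # Number of comments stored so far.
  def count
    @authors.length
  end
end

post = PostData.new('Title for the blog', 'Post body', 'Krishna',
                    '12 Jan 2006 07:33:24:301 +0100', 'approved')
comments = CommentData.new
comments.add('anonymous', 'anonymous@abc.com', 'Comment1',
             '23 Jan 2006 16:01:37:839 +0100')

puts "#{post.title} by #{post.author}: #{comments.count} comment(s)"
```

Parallel arrays keep the class simple, at the cost that every array has to be appended to together. An array of per-comment objects would be the more common alternative.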
4. xml_parsing.rb
This file parses the XML file. It has two functions. parse_xml takes the XML filename as input, parses the main blog post data, encapsulates it in a PostData object, and returns that object. parse_comments_nodelist parses all the comment nodes (a post can have zero or more comments) and returns a CommentData object.
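
The parsing step could be sketched with REXML roughly as follows. The element names (`blogEntry`, `title`, `comment`, `ipAddress`, and so on) are guesses for illustration and should be checked against a real Pebble export. `parse_post` and `parse_comments` are hypothetical stand-ins that take an XML string and return hashes, unlike the real functions, which take a filename and return PostData and CommentData objects.

```ruby
require 'rexml/document'

# Tag names are assumptions; the values come from the sample export.
SAMPLE_XML = <<~XML
  <blogEntry>
    <title>Title for the blog</title>
    <date>12 Jan 2006 07:33:24:301 +0100</date>
    <state>approved</state>
    <author>Krishna</author>
    <comment>
      <title>Comment1</title>
      <author>anonymous</author>
      <email>anonymous@abc.com</email>
      <ipAddress>127.0.0.1</ipAddress>
      <date>23 Jan 2006 16:01:37:839 +0100</date>
      <state>approved</state>
    </comment>
  </blogEntry>
XML

# Reads the text of each named child element into a hash (nil if absent).
def fields(node, tags)
  tags.to_h { |tag| [tag.to_sym, node.elements[tag]&.text] }
end

# Stand-in for parse_xml: post-level fields only.
def parse_post(xml_text)
  root = REXML::Document.new(xml_text).root
  fields(root, %w[title date state author])
end

# Stand-in for parse_comments_nodelist: one hash per comment node,
# and an empty array when the post has no comments.
def parse_comments(xml_text)
  root = REXML::Document.new(xml_text).root
  root.get_elements('comment').map do |node|
    fields(node, %w[title author email ipAddress date state])
  end
end

post = parse_post(SAMPLE_XML)
comments = parse_comments(SAMPLE_XML)
puts "#{post[:title]} by #{post[:author]}, #{comments.length} comment(s)"
```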
5. process_data.rb
This file inserts the post data and comment data into the database.
It has two functions. process_post_data inserts the post data and takes a PostData object as input (the one returned by parse_xml). process_comment_data inserts the comment data returned by parse_comments_nodelist.
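
As an illustration of the mapping these functions have to do, the sketch below turns parsed data into hashes keyed by WordPress column names and converts Pebble's date format into a MySQL DATETIME string. The helper names, the input hash keys, and the state-to-status mapping are assumptions, and the actual insert through ActiveRecord is left out.

```ruby
require 'time'

# Date format seen in the sample export, e.g. "12 Jan 2006 07:33:24:301 +0100".
PEBBLE_DATE_FORMAT = '%d %b %Y %H:%M:%S:%L %z'

# Converts a Pebble date string to a MySQL DATETIME string in UTC.
def to_mysql_gmt(pebble_date)
  Time.strptime(pebble_date, PEBBLE_DATE_FORMAT).utc.strftime('%Y-%m-%d %H:%M:%S')
end

# Builds a wp_posts row. post_author is hard-coded, and mapping
# 'approved' to 'publish' is an assumption.
def wp_post_row(post)
  {
    'post_author'   => 1,
    'post_title'    => post[:title],
    'post_content'  => post[:body],
    'post_date_gmt' => to_mysql_gmt(post[:date]),
    'post_status'   => post[:state] == 'approved' ? 'publish' : 'draft'
  }
end

# Builds a wp_comments row; comment_post_ID links it to its post.
def wp_comment_row(post_id, comment)
  {
    'comment_post_ID'      => post_id,
    'comment_author'       => comment[:author],
    'comment_author_email' => comment[:email],
    'comment_author_IP'    => comment[:ip],
    'comment_content'      => comment[:body],
    'comment_date_gmt'     => to_mysql_gmt(comment[:date]),
    'comment_approved'     => comment[:state] == 'approved' ? '1' : '0'
  }
end

post_row = wp_post_row(title: 'Title for the blog', body: 'Post body',
                       date: '12 Jan 2006 07:33:24:301 +0100', state: 'approved')
comment_row = wp_comment_row(42, author: 'anonymous', email: 'anonymous@abc.com',
                                 ip: '127.0.0.1', body: 'Comment1',
                                 date: '23 Jan 2006 16:01:37:839 +0100',
                                 state: 'approved')
puts post_row['post_date_gmt']
puts comment_row['comment_date_gmt']
```

In the real flow the post id passed to the comment rows would be the id that the database assigns when the post row is inserted.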
6. database.yml
This file holds the database configuration (database name, username, etc.).
Because it goes through ActiveRecord, this application can parse Pebble XML files and store the data in any database that ActiveRecord supports, not only MySQL.
You can find the complete source code of the application at
http://code.google.com/p/pebble2wordpress/source/checkout
Conclusion
It took little time to develop this application using Ruby, though a lot still has to be implemented.
For example:

  1. Make post_author dynamic
  2. Read category and tag details from the XML file and store them in the WordPress db
  3. Add unit tests and code coverage
  4. Add documentation

If you have any suggestions or improvements for this application, post a comment.
