Guy Shahine's Blog

Moving my WordPress blog without losing any content

Today I decided to move my blog from GoDaddy.com to Windows Azure Websites. In this blog post I’ll explain the goals, challenges, and solutions.

Goals

1. Move all the content of my previous blog from GoDaddy hosting to Windows Azure Websites.

2. Implement URL redirects, because I wanted to change the blog address from http://gshahine.com/blog to http://blog.gshahine.com without breaking any URLs already indexed by search engines or referenced by other websites.

3. Shorten the URL path from /archives/{year}/{month}/{day}/{post name} to the post name only.

Solutions

1. I started by setting up a new WordPress blog through the Windows Azure dashboard, which was super easy. Here’s a detailed blog post that explains the process step by step: http://sunithamk.wordpress.com/2013/11/06/migrate-your-existing-wordpress-site-to-windows-azure/

[Image: wordpress-import]

2. Choosing the best approach for this one was a bit tricky. A couple of months ago, I moved my main page (http://gshahine.com) to ASP.NET MVC 4 hosted on Azure Websites. So I searched online for URL rewriting in ASP.NET and found this article: http://msdn.microsoft.com/en-us/library/ms972974.aspx (I only skimmed through it). I decided to write a custom HTTP module that handles the BeginRequest event and permanently redirects the request to the new blog address when the path starts with “/blog” (code shared below, which also includes the solution for goal #3).

[Image: fiddler]

3. When I initially set up my blog in 2009, I picked a path that looks like this: http://gshahine.com/blog/archives/2012/11/22/dont-be-a-turkey/ (my SEO, Search Engine Optimization, knowledge back then was pretty limited). Recently, I got interested in learning more about SEO (here’s a fantastic beginner’s guide: http://static.seomoz.org/files/SEOmoz-The-Beginners-Guide-To-SEO-2012.pdf), so I wanted my new URLs to look like http://blog.gshahine.com/dont-be-a-turkey instead. WordPress lets you easily change the path under Settings->Permalinks, where there is a predefined option for a path with the post name only. But once you update the permalinks, all the old URLs stop working and return a Not Found page. So I had to update my URL rewrite logic to keep only the last part of the path when applicable.

[Image: wordpress-permalinks]
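The rule from step 3 (keep only the last path segment) can be sketched on its own as a small helper, which makes it easy to try out without IIS. This is only an illustration: the names LegacyBlogUrlMapper and MapToNewUrl are hypothetical, and the stricter prefix check (so that a path like /blogroll is left alone) is an extra assumption of the sketch.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration: it isolates the old-path-to-new-URL
// rule so it can be exercised outside of an HTTP module.
public static class LegacyBlogUrlMapper
{
    private static readonly Uri NewBlogRoot = new Uri("http://blog.gshahine.com");

    // Returns the new absolute URL for an old "/blog/..." path,
    // or null when the path does not belong to the old blog.
    public static string MapToNewUrl(string path)
    {
        // Assumption of this sketch: match "/blog" and "/blog/..." only.
        bool isBlogPath = path.Equals("/blog", StringComparison.OrdinalIgnoreCase)
            || path.StartsWith("/blog/", StringComparison.OrdinalIgnoreCase);

        if (!isBlogPath)
        {
            return null;
        }

        string[] segments = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);

        // The last segment is the post name; "/blog" alone maps to the new root.
        string postName = segments.Length > 1 ? segments.Last() : string.Empty;

        return new Uri(NewBlogRoot, postName).AbsoluteUri;
    }
}
```

For example, MapToNewUrl("/blog/archives/2012/11/22/dont-be-a-turkey/") gives http://blog.gshahine.com/dont-be-a-turkey, while a path outside the old blog such as "/about" gives null.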

Here’s my ASP.NET custom HTTP module code:

namespace Gshahine.com.HttpModules
{
    using System;
    using System.Linq;
    using System.Web;

    public class CustomHttpModule : IHttpModule
    {
        private static readonly Uri BlogBaseUri = new Uri("http://blog.gshahine.com");

        public void Init(HttpApplication context)
        {
            // The redirect does no asynchronous work, so a plain synchronous handler is enough.
            context.BeginRequest += OnBeginRequest;
        }

        public void Dispose()
        { }

        private static void OnBeginRequest(object sender, EventArgs e)
        {
            HttpApplication app = (HttpApplication)sender;
            string path = app.Request.Path;

            // Match "/blog" and "/blog/..." only, not unrelated paths such as "/blogroll".
            bool isBlogPath = path.Equals("/blog", StringComparison.OrdinalIgnoreCase)
                || path.StartsWith("/blog/", StringComparison.OrdinalIgnoreCase);

            if (isBlogPath)
            {
                var splitPath = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);

                // Keep only the last segment (the post name); "/blog" alone maps to the new blog root.
                string postName = splitPath.Length > 1 ? splitPath.Last() : string.Empty;

                var newUrl = new Uri(BlogBaseUri, postName);

                // Send a 301 so search engines update their index; "true" ends the response right away.
                app.Response.RedirectPermanent(newUrl.AbsoluteUri, true);
            }
        }
    }
}

You also need to register the custom module in your web.config:

<system.webServer>
    <!-- ... removed for brevity ... -->
    <modules>
        <remove name="FormsAuthentication" />
        <add name="CustomHttpModule" type="Gshahine.com.HttpModules.CustomHttpModule, Gshahine.com" preCondition="managedHandler" />
    </modules>
</system.webServer>

I hope you find this interesting!

-guy