Migrating Content Managed Websites

Content management systems (CMS) are powerful tools that can simplify the management of content and workflow for complex websites. They make it quick to create, edit, and relate web content without needing a developer or engineer to perform these tasks. With so many content management systems available, chances are that, as a web software engineer, you will at some point need to migrate a website from one CMS to another.
While a CMS makes creating content easier, life gets much more difficult when you need to migrate the data from one system to another. The data structure, relationships, programming language, and API (Application Programming Interface) will vary between systems. So you will need knowledge not only of the CMS for the new site you are developing, but also of the legacy system that holds all of the source data.
Get To Know the Legacy System
The first step of a migration is getting to know the data, beginning with the data stored in the current (legacy) CMS. This is easier if you have a technical contact among the developers of the system who can walk you through the details. However, this will not always be the case, and you may be on your own. A good starting point is to log into the CMS and explore how the content is managed in the administration site. This shows you how the data is defined by the user and how content is added, and it lets you start to define the data models and relationships. If the CMS has an easy way to export content through its administration site or API, you may be able to shortcut this part of the migration process.
If there is no easy way to export the data from the CMS, you are going to need to pull the data from the database. The main difficulty is that you most likely will not know how the data is stored or how the tables relate to each other. One page of content displayed on a website may draw on data from multiple tables in the database. To figure out which tables you need, you can write a query that shows when each table in the database was last updated. Then log into the CMS administration tool, add a piece of content, and run the query again: the tables whose last-updated time has changed are the ones that store that content. This process will help you discover which tables you need to reference when you build your migration scripts.
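Some database engines expose a last-update timestamp in their catalog views, but not all do. The sketch below shows a portable variant of the same before-and-after idea: it snapshots the row count of every table, and after a piece of content is added it reports which tables changed. It uses SQLite from the Python standard library so that it runs anywhere; the three-table schema is hypothetical and only stands in for a legacy CMS database. Note that row counts catch inserts and deletes but not in-place updates, so treat the result as a starting point.

```python
import sqlite3

def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    # Table names come from the catalog itself, so quoting them is enough here.
    return {name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names}

def changed_tables(before, after):
    """Return the sorted names of tables whose row count differs between snapshots."""
    return sorted(t for t in after if after[t] != before.get(t))

# Hypothetical schema standing in for the legacy CMS database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE page_bodies (page_id INTEGER, html TEXT);
    CREATE TABLE site_settings (name TEXT, value TEXT);
""")
before = table_row_counts(conn)

# Stand-in for "add a piece of content in the administration tool".
conn.execute("INSERT INTO pages (id, title) VALUES (1, 'About us')")
conn.execute("INSERT INTO page_bodies (page_id, html) VALUES (1, '<p>Hello</p>')")

after = table_row_counts(conn)
touched = changed_tables(before, after)
print(touched)
```

Against a real legacy database you would take the first snapshot, add the content through the CMS administration site, and then take the second snapshot.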
Build a Content Inventory
Even if you know the structure of the legacy data, you are going to need to know how much content there is and how it is organized. Larger sites can contain hundreds or thousands of pages arranged in a navigational hierarchy. Not only do you need to know the scope of the content migration, but you will also need a baseline to compare the new site against to ensure a successful migration. Also, depending on the CMS, you may not be able to get the full picture of the site hierarchy using the CMS administration tools.
To build a content inventory, you will need to use a web crawler, spider, or link checker. These tools can help build a profile of a website by crawling each page and building an inventory of links and content. In addition, you can generate reports on broken links, which will come in handy when you want to verify the new site.
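To make the idea concrete, here is a minimal sketch of what such a crawler does, using only the Python standard library. The function and class names are illustrative, and the `fetch` argument is an assumption: it is any function that returns a page's HTML, or `None` when the page cannot be retrieved. In the demo a dictionary plays the role of the website, so no network access is needed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def build_inventory(start_url, fetch):
    """Breadth-first crawl of one site.

    Returns (inventory, broken): inventory maps each retrieved page URL to
    the sorted internal links found on it; broken lists (page, target)
    pairs where the target could not be retrieved.
    """
    host = urlparse(start_url).netloc
    inventory, failed = {}, set()
    queue, seen = deque([start_url]), {start_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            failed.add(url)
            continue
        parser = LinkExtractor()
        parser.feed(html)
        links = set()
        for href in parser.links:
            target = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(target).netloc == host:         # stay on this site
                links.add(target)
        inventory[url] = sorted(links)
        for target in inventory[url]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    broken = sorted((page, target) for page, links in inventory.items()
                    for target in links if target in failed)
    return inventory, broken

# A dictionary stands in for the legacy site (hypothetical URLs).
SITE = {
    "http://legacy.example/": '<a href="/about">About</a> <a href="/news#top">News</a>',
    "http://legacy.example/about": '<a href="/">Home</a> <a href="/missing">Old</a>',
    "http://legacy.example/news": '<a href="http://other.example/">External</a>',
}
inventory, broken = build_inventory("http://legacy.example/", SITE.get)
print(len(inventory), "pages;", "broken links:", broken)
```

A dedicated crawler or link checker also handles concerns this sketch leaves out, such as robots rules, rate limiting, and non-HTML resources.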
Develop the New Content Models
Before you begin to migrate data you need to define the content models for the new site. A content model defines the data that is displayed for a specific page type. For example, a site may use Category Landing pages. These pages will have header text, a header image, a small intro description, and display a list of content pages. Each Category Landing page will have different content, but they will all share the same data structure.
The content models will define each unique content type used on the site. These models will then be used as a blueprint to create the web pages. More complex sites will have several content types.
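As an illustration, the Category Landing page described above could be written down as a small data structure. This is only a sketch: the class and field names are assumptions chosen to mirror the description, not the schema of any particular CMS.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContentPage:
    """A content page listed on a landing page (hypothetical fields)."""
    title: str
    url: str

@dataclass
class CategoryLandingPage:
    """Every Category Landing page shares this structure; only the values differ."""
    header_text: str
    header_image: str   # path or URL of the header image
    intro: str          # small intro description
    content_pages: List[ContentPage] = field(default_factory=list)

    def missing_fields(self):
        """Return the names of text fields that were left empty."""
        values = {
            "header_text": self.header_text,
            "header_image": self.header_image,
            "intro": self.intro,
        }
        return [name for name, value in values.items() if not value.strip()]

page = CategoryLandingPage(
    header_text="Hiking Gear",
    header_image="/images/hiking-header.jpg",
    intro="",
    content_pages=[ContentPage("Choosing boots", "/hiking/boots")],
)
print(page.missing_fields())
```

Writing the models down this way also gives the migration scripts a simple check for records that arrive from the legacy system with required fields missing.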
This step is usually a collaboration between multiple departments within a company and will most likely be started before the engineering migration effort begins. During the process of migrating the data, you may find that the content models can be adjusted or combined, or that new models need to be created.
Migrate the Data
Many modern CMS solutions provide an API that developers can use to programmatically add or manipulate the content and workflow of the system. The benefit of using the API to migrate content is that you don’t need to worry about the database relationships, or keeping table indexes synchronized. As an added benefit of using the API, you may find that you can reuse some of this code when you start to build the new site. If the CMS does not have a robust API, you will need to write SQL to migrate the data straight into the new database. You can repurpose much of the same strategy used to profile the legacy system to determine which tables you need to update.
There are some things you should keep in mind when developing your migration strategy:
Back up the data often. It is good to take a backup of the database before you run each migration script. If there is an error in the script, you can restore the database to the previous version instead of needing to start from scratch. This will help when developing and troubleshooting your migration scripts.
Keep a reference table of the legacy content IDs and the new content IDs. When creating the content in the new system, you will want to build a relationship table you can use to tie the new content to the old content. This will be used to update content links in the HTML (to point the links at the new content IDs).
Track how long it takes for the migration scripts to run. You are going to run these scripts multiple times during the development process and you will want to know how long the process takes.
Try not to write “throw away” code. In theory, this code will only be used to migrate the data for this website once. However, you may find that future development projects benefit from these existing scripts, especially if you use the same CMS again.
You will need to parse HTML so you can update existing links and images in the content to reference the correct IDs and URLs in the new site. There are parsers available, such as the HTML Agility Pack for .NET, that can assist with this.
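The ID reference table and the link update can be sketched together. In the example below, the URL patterns `/legacy/item/<id>` and `/content/<id>` are hypothetical, and the ID mapping is a plain dictionary standing in for the reference table. For brevity it uses a regular expression rather than a full HTML parser, so it only handles double-quoted `href` and `src` attributes; a real parser is more robust against messy markup.

```python
import re

# Hypothetical legacy URL pattern: href="/legacy/item/123" or src="/legacy/item/123"
LEGACY_REF = re.compile(r'\b(href|src)="/legacy/item/(\d+)"')

def rewrite_links(html, id_map):
    """Point legacy references at new content IDs.

    Returns (new_html, unmapped), where unmapped lists the legacy IDs that
    had no entry in id_map; those references are left unchanged.
    """
    unmapped = []

    def swap(match):
        attr, old_id = match.group(1), int(match.group(2))
        if old_id not in id_map:
            unmapped.append(old_id)
            return match.group(0)
        return f'{attr}="/content/{id_map[old_id]}"'

    return LEGACY_REF.sub(swap, html), unmapped

# Legacy ID -> new ID, as recorded while creating content in the new system.
id_map = {101: 7, 102: 8}
body = ('<p><a href="/legacy/item/101">Intro</a> '
        '<img src="/legacy/item/102"> '
        '<a href="/legacy/item/999">Gone</a></p>')
new_body, unmapped = rewrite_links(body, id_map)
print(new_body)
print("unmapped legacy IDs:", unmapped)
```

Reporting the unmapped IDs rather than silently skipping them gives you a list of content that has not been migrated yet.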
Migrating Membership Users
Migrating membership users (end users that can log in to access restricted areas of the site) will throw another wrinkle into your migration plan. Many systems store private data such as passwords and secret question answers in the database using “one way” encryption (hashing). This means you cannot recover the original values and may not be able to use the stored data in your system. If this is the case, you will need to provide a mechanism for the users to update their password and other sensitive data on their first login to the new site. You need to ensure that whatever method you choose is secure and that all of the edge cases are thought through and addressed.
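One possible shape for such a mechanism is a single-use reset token issued to each migrated user. The sketch below is an assumption about how that could look, not a complete or audited design: the `MigratedUser` fields are hypothetical, and token expiry, delivery of the token, and storage of the new password are deliberately left out.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Optional

@dataclass
class MigratedUser:
    """Hypothetical user record in the new system."""
    email: str
    must_reset_password: bool = True        # set for every migrated account
    reset_token_hash: Optional[str] = None  # only the hash of the token is stored

def issue_reset_token(user):
    """Create a single-use token; the caller sends it to the user's email."""
    token = secrets.token_urlsafe(32)
    user.reset_token_hash = hashlib.sha256(token.encode()).hexdigest()
    return token

def redeem_reset_token(user, token):
    """Return True and clear the reset flag if the token matches."""
    if user.reset_token_hash is None:
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    if not secrets.compare_digest(candidate, user.reset_token_hash):
        return False
    user.reset_token_hash = None      # a token can be used only once
    user.must_reset_password = False
    return True

user = MigratedUser(email="member@example.com")
token = issue_reset_token(user)
first_try = redeem_reset_token(user, token)
second_try = redeem_reset_token(user, token)
print(first_try, second_try)
```

Storing only the hash of the token means that someone who can read the user table cannot use it to take over accounts.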
Maintaining URLs
When migrating to a new site, keep in mind that the URLs used in the legacy site have been out in the wild (most likely for years) and used in linkbacks, bookmarks, and advertising. In some cases it is possible to keep the same URL structure for the new site. However, you may need to provide 301 redirects for some or all of the legacy URLs to ensure that existing links to the site will not break.
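At its core, a redirect layer is a lookup from legacy paths to new paths. The sketch below shows that lookup in isolation; the paths are hypothetical, and how the result is turned into an actual HTTP response depends on the web server or framework in use.

```python
from urllib.parse import urlsplit

# Hypothetical legacy path -> new path mapping.
REDIRECTS = {
    "/products.aspx": "/products/",
    "/about-us.html": "/about/",
}

def resolve_redirect(request_url, redirects):
    """Return (301, new_path) for a known legacy URL, otherwise None.

    Only the path is matched, so query strings such as campaign tracking
    parameters do not prevent a match.
    """
    path = urlsplit(request_url).path
    target = redirects.get(path)
    if target is None:
        return None
    return 301, target

print(resolve_redirect("http://www.example.com/products.aspx?ref=ad", REDIRECTS))
print(resolve_redirect("http://www.example.com/contact", REDIRECTS))
```

The legacy content inventory is a natural source for the keys of this mapping, since it lists every URL that may have been linked or bookmarked.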
Quality Assurance Testing
Depending on the number of pages and complexity of the site, the Quality Assurance testing of the migration can be a large undertaking in itself. Web crawler or link checking tools can greatly assist in finding broken links and images as well as orphaned pages. It is a good idea to use these tools to create a profile of the legacy site that can be used for comparison after migration.
Migrating data from one CMS to another is quite a challenge that requires a great deal of detective work and preparation to be done correctly. A successful migration starts with a deep dive into the data and finishes with a carefully planned and executed migration strategy.
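The comparison itself can be as simple as a set difference between the two profiles. The sketch below assumes each profile has been reduced to a list of URL paths and that legacy paths may have been redirected; all of the paths shown are hypothetical.

```python
def compare_inventories(legacy_paths, new_paths, redirects):
    """Compare a legacy site profile with the new site's profile.

    Returns (missing, unexpected): missing lists legacy paths that neither
    exist on the new site nor redirect to a page that does; unexpected
    lists new pages with no legacy counterpart.
    """
    legacy, new = set(legacy_paths), set(new_paths)
    missing = sorted(p for p in legacy
                     if p not in new and redirects.get(p) not in new)
    unexpected = sorted(new - legacy - set(redirects.values()))
    return missing, unexpected

legacy_paths = ["/", "/about-us.html", "/news", "/old-promo"]
new_paths = ["/", "/about/", "/news", "/blog/"]
redirects = {"/about-us.html": "/about/"}

missing, unexpected = compare_inventories(legacy_paths, new_paths, redirects)
print("missing from new site:", missing)
print("new pages without a legacy counterpart:", unexpected)
```

Pages reported as unexpected are not necessarily errors, since the new site may add content on purpose, but each one is worth a quick review.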

Perficient Author
