Converting a CMS Made Simple website to WordPress

06 January 2015 18:54

The Harwell Parish Council website has been running on CMS Made Simple since 2009 and has collected about 300 news items and several comments. A large portion of the site was a copy of the book Village for a 1000 years, which was moved in Oct 2014 to a separate website, www.village4a1000years.com, together with a lot of photos and other archives. In general, Village 4 a 1000 years will contain historical information about Harwell, and the PC website will contain current information, split between strictly Parish Council and Village community pages.

There are no suitable conversion services: one will convert pages only but not news items. So the approach is to export the site to static HTML files, and then import to WordPress from those files, using HTML Import 2.

Progress notes:

First run Xenu's Link Sleuth on the existing site to look for any obvious problems, and to capture a list of all the site pages, ready for setting up htaccess redirection.
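A rough sketch of how the captured page list could later be turned into htaccess redirect lines, assuming each old path has been paired by hand with its new WordPress path. The paths and the example.org domain below are made up for illustration; only the Apache mod_alias `Redirect 301` directive itself is real.

```python
def redirect_lines(mapping, new_base):
    """Build Apache mod_alias 'Redirect 301' lines from {old_path: new_path}."""
    base = new_base.rstrip("/")
    lines = []
    for old_path, new_path in sorted(mapping.items()):
        # One permanent redirect per old page.
        lines.append(f"Redirect 301 {old_path} {base}{new_path}")
    return lines


if __name__ == "__main__":
    # Hypothetical old/new paths, for illustration only.
    example = {
        "/about.html": "/about/",
        "/news/12/example-item.html": "/example-item/",
    }
    for line in redirect_lines(example, "https://www.example.org/"):
        print(line)
```

Paths containing spaces would need quoting, and `Redirect` matches on the path only, not on query strings, so this only suits old URLs that are plain paths.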

Then use HTTrack to export all the pages.

Found the setting to put all the HTML files into one folder (CMS maintains a complex folder structure for news items)

Failed to find settings to:

  • keep the original links to images, docs etc. as full URLs
  • retain the case of URLs (they were changed to lower case)
  • avoid scanning archives
  • stop downloading images (I want to keep them all in the same place on the server)

As a result the export took several hours (allow overnight!).

Manually identified all the HTML pages, leaving a folder of News items (to become posts). Moved the folder into the local copy of the website.

Now work on the site with Dreamweaver. Resist the temptation to do mass deletion of code, e.g. the menus. The importer will focus on just one DIV (=main) and import that content.

Date line has a start and end date/time.  Importer needs a single div only.

Cleaned the date line in Dreamweaver with two find-and-replace passes:

Find: Start date:

Replace with: (empty)

Find: End date: ([0-9\-\s:]*)

Replace with: (empty)

So

Start date: 2014-03-13 11:12:18  End date: 2014-04-04 16:00:00

Became

2014-03-13 11:12:18
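For anyone without Dreamweaver, the same two replacements can be sketched in Python. This is only an equivalent illustration, not what was actually run; the function name is made up, and the final `strip()` is an extra step to tidy the leftover spaces.

```python
import re


def clean_date_line(line):
    """Drop the 'Start date:' label and the whole 'End date: ...' part."""
    line = re.sub(r"Start date:", "", line)
    line = re.sub(r"End date: ([0-9\-\s:]*)", "", line)
    # Extra step, not part of the original find/replace: trim stray spaces.
    return line.strip()


if __name__ == "__main__":
    before = "Start date: 2014-03-13 11:12:18  End date: 2014-04-04 16:00:00"
    print(clean_date_line(before))  # prints 2014-03-13 11:12:18
```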

Failed to get a multi-line regex to work on

<!-- Start words-->

Lots of lines of code to be removed

<!-- End words -->

So gave up, edited to make sure all the rubbish code was commented out, uploaded to the server (the importer only reads files already on the server; there is no upload option) and then imported using HTML Import 2.
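For the record, a multi-line removal of that kind can be sketched outside Dreamweaver, e.g. in Python, where the `re.DOTALL` flag lets `.` match across line breaks. This is an untried-on-the-real-site illustration: "Start words" and "End words" stand in for whatever the real marker comments say, and the function name is made up.

```python
import re

# Non-greedy match from the start marker to the nearest end marker.
# re.DOTALL makes '.' match newlines as well.
MARKED_BLOCK = re.compile(
    r"<!-- Start words\s*-->.*?<!-- End words\s*-->",
    re.DOTALL,
)


def strip_marked_blocks(html):
    """Remove every block between the two marker comments, markers included."""
    return MARKED_BLOCK.sub("", html)


if __name__ == "__main__":
    page = (
        "<p>keep</p>\n"
        "<!-- Start words-->\n"
        "Lots of lines of code to be removed\n"
        "<!-- End words -->\n"
        "<p>also keep</p>"
    )
    print(strip_marked_blocks(page))
```

The `.*?` is deliberately non-greedy, so that a page with two marked blocks loses each block separately rather than everything between the first start and the last end.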

Brilliant  - 294 posts created with appropriate dates.

But many links need correcting, so use a combination of plugins: Broken Link Checker and Search Regex.

Link checker doesn’t know to avoid links inside comments, so it was flagging four broken links for each post.  More cleaning therefore needed.
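To see which links are really live, as opposed to sitting inside commented-out code, a small check can be sketched with Python's standard `html.parser`, which hands comments to a separate handler so their contents are never seen as tags. This is only an illustration of the idea, not how Broken Link Checker works; the class and function names are made up.

```python
from html.parser import HTMLParser


class LiveLinkCollector(HTMLParser):
    """Collect href values of <a> tags; text inside <!-- --> is not parsed as tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def live_links(html):
    """Return the links that appear outside HTML comments, in document order."""
    collector = LiveLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    sample = '<a href="a.html">x</a><!-- <a href="b.html">y</a> --><a href="c.html">z</a>'
    print(live_links(sample))
```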

Search Regex has a multi-line option, but I’ve so far failed to get it to work, so gave up and did it one line at a time.

(text.*) finds everything from text to the end of the line, so a complete line starting with text can be replaced with nothing.
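The same one-line-at-a-time idea, sketched in Python rather than in Search Regex (which uses PHP's regex engine, so details may differ). Here `^` with `re.MULTILINE` pins the match to the start of a line, and the function name is just for illustration.

```python
import re


def drop_lines_starting_with(prefix, content):
    """Delete every line that begins with the given literal prefix."""
    # '.' does not match a newline by default, so '.*' stops at the line end;
    # the optional '\n' removes the line break too, leaving no empty line.
    pattern = re.compile(r"^" + re.escape(prefix) + r".*\n?", re.MULTILINE)
    return pattern.sub("", content)


if __name__ == "__main__":
    sample = "keep\ntext to remove\nsome text stays\n"
    print(drop_lines_starting_with("text", sample))
```

Without the `^` anchor, the pattern would also chop the tail off any line that merely contains the word somewhere in the middle.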

Job just about done.  I’m just left with crap like

<!--
<h3>Start of remains of cmsms comment section</h3>
</div>
&nbsp;
<table>
<tr></td>
</tr>
<tr></td>
</tr>
</table>
</div>
</div>
&nbsp;

</div>
</div>
-->

At the bottom of each page.  Not the end of the world to keep it there.  Can’t risk removing line by line because I have some genuine tables in there.
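One possible way to clear this without touching the genuine tables is to remove only the comment that contains the tell-tale heading, as a whole. A minimal Python sketch follows, assuming the page content is available as a string (e.g. from an export); it has not been run against the real posts, and the function name is made up.

```python
import re

# Match one whole <!-- ... --> comment, but only if it contains the heading text.
# The (?:(?!-->).)*? part stops the match from running through the end of an
# earlier, unrelated comment on its way to the heading.
LEFTOVER_COMMENT = re.compile(
    r"<!--(?:(?!-->).)*?Start of remains of cmsms comment section.*?-->",
    re.DOTALL,
)


def strip_leftover_comment(html):
    """Remove the commented-out cmsms remains; everything else is left alone."""
    return LEFTOVER_COMMENT.sub("", html)


if __name__ == "__main__":
    page = (
        "<!-- note -->\n"
        "<table><tr><td>real table</td></tr></table>\n"
        "<!--\n"
        "<h3>Start of remains of cmsms comment section</h3>\n"
        "<table>\n<tr></td>\n</tr>\n</table>\n"
        "-->\n"
    )
    print(strip_leftover_comment(page))
```

Because the match is tied to the comment delimiters and the heading, tables outside the comment are never candidates for removal.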

Broken Link Checker is a good tool for reviewing and changing all the links. If it's an old news link, e.g. to a rubbish collection timetable from 4 years ago, you can either remove the link or dismiss it, and it will be styled to show up on the page with strikethrough. Then later perhaps review the dismissed links, or just purge old, irrelevant news items. Email alerts of links not working will probably be invaluable.

 

I’ve installed Post Expirator. Most people seem happy with it, although there are a few “doesn't work” reports. It can set an expiry date, at which point the post category will change, e.g. from current_news to old_news. This will be good with new posts. All imported posts have the category news. Could see if they can all be changed to oldnews, but it might be safer to use currentnews for "current news".

Next create a few pages.  Simplest to just use copy and paste to bring the contents over from the live site.

Now to concentrate on themes and customising appearance.

Will be using Genesis, and will try the Outreach theme.  Both installed.

Changed default page layout to full width.  (use columns if you want "sidebars" with unique content on each page.)

Start by seeing what I've done on other sites (Village 4 a 1000 years and the wrc event sites) and install/configure appropriately.

Plugins to be installed

Jetpack and connect to wordpress.com.  And activate multi-site management https://wordpress.com/plugins

Activate: Custom CSS, JSON API, Jetpack Comments, Notifications, Post by Email, Sharing, Shortcode Embed, Spelling and Grammar, Subscriptions, Stats

Genesis Simple Edits

Now to build a menu and put some widgets on the front page.

 

 


