
This is a how-to guide for converting a dynamic WordPress website into a static HTML version. It is aimed at WordPress administrators who wish to serve their WordPress deployments in static format for security, scaling and performance purposes.

In this tutorial, we will show you how to run a staging WordPress server used for development (which could very well be kept offline), along with a scripted method for generating a static version of the development website. The generated static HTML can then be pushed, using Puppet, to a production web server that serves it.

Below is a quick overview of the platform and some common security techniques. If you’re only interested in the nuts and bolts of making this work, skip ahead to The Script.

Overview

WordPress is hands down one of the most popular content management systems out there. It’s a modular platform that makes it simple for website owners to publish web content, and it is very easy to use.

On the flip side, a major concern is the poor security track record of the WordPress platform and its installable plugins. The plugins are of even greater concern, as it is nearly impossible to truly gauge the security posture and coding practices of the developers who maintain them. This has led to countless website and web server compromises, and has resulted in the publication of many Common Vulnerabilities and Exposures (CVE) identifiers.

Common Mitigation Techniques

It can be quite cumbersome to properly and proactively secure an average WordPress installation. Here are a few of the more commonly used techniques:

Apache ModSecurity

One known method for mitigating a great many attack vectors against WordPress is Apache ModSecurity, a web application firewall (WAF) implemented as an Apache module that provides security features and request filtering for the Apache web server.

ModSecurity is a great product, but it carries a substantial performance cost for the web server running it. One must be very fluent with the module to configure it properly, and hands-on with its ongoing maintenance. It also does nothing for those running WordPress on other web server platforms, such as Nginx, unless they also run Apache on the backend.

Log Monitoring & Active Response

There are other common software and scripting suites that WordPress site owners implement for attack mitigation, such as OSSEC and Fail2Ban. These two components have great capabilities, but one of the problems I have with relying solely on an active log monitoring solution is that the attack may already have taken place long before the software catches it. This doesn’t do much good for those who do not read the alerts around the clock, and who aren’t prepared to do a forensic attack analysis every time an alert comes in.

In a scenario where an attacker was able to pull off a remote file inclusion (RFI) attack or some sort of cross-site scripting (XSS) attack, and the alert went unnoticed by administrators (if there even was such an alert), the attacker could cause considerable damage and headaches for the organization. It isn’t fun having to deal with threatening correspondence from your ISP stating that your web server is serving up malware or other unsightly files, when it was due to a compromise of your WordPress deployment.

The fact of the matter is there are many options out there for protecting WordPress deployments, but it’s a lot of work and isn’t 100% effective in all scenarios.

The Solution: Static HTML

There really is no better way to protect a resource-hungry and historically dangerous content management system than serving up its generated HTML content statically… period. With static HTML, there is comparatively little to defend against.

Bonus #1: Performance!

Since WordPress is database-driven, requiring PHP and MySQL out of the box, the platform can be quite resource intensive. There are ways to mitigate performance issues, such as using various server-side caching components like Memcached, APC, Varnish, etc. These do come at an extra cost in required system resources, software and overall maintenance.

Bonus #2: Scalability!

Earlier in my career, I architected and deployed fairly complex load-balanced, high-performance and high-availability WordPress deployments that allowed for running multiple WordPress installations on multiple servers, using some custom rsync scripts, MySQL replication, Memcached replication (Memcachedrep), and a small laundry list of other components.

That said, there is obviously nothing easier than scaling out a static clone of WordPress.

Bonus #3: Cost Savings!

The economics are simple:

  • Fewer system resources required = Less money spent on system resources
  • Less time spent on security and maintenance = Less money spent on administration costs

…and more time for focusing on the stuff that needs your attention (presuming you value your time, or the time you’re paying your admins to spend).

The Script

There were a few reasons I decided to write this script, the first being that none of the WordPress static HTML generation plugins work! (Also, some of them are quite spammy in their WordPress admin pages.)

The second primary reason I wrote it is that I would much rather have the flexibility of using a shell script for cloning (e.g. copying files/directories outside of WordPress that exist on the same vhost), and there isn’t a need to log in to the web interface to perform the task.

Things to Note

A couple of things to keep in mind: the server running this script must have a hosts file entry in /etc/hosts pointing to the right IP address. If you are going to use a staging setup where the staging server and the production server both use the same DNS hostname for WordPress (www.yourdomain.tld, that is), you will also need to ensure that your local machine has a hosts file entry set up properly for viewing either the development or the production server.
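
As a rough illustration of that prerequisite, the sketch below checks whether a hosts file maps a given hostname before a clone is attempted. The function name has_hosts_entry is hypothetical and is not part of the original script; the address 192.0.2.10 in the comment is a documentation placeholder, not a real server.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): succeeds if HOSTNAME
# appears as a name on a non-comment line of the hosts file.
# A matching entry would look like:  192.0.2.10  www.yourdomain.tld
# Usage: has_hosts_entry HOSTNAME [HOSTS_FILE]
has_hosts_entry() {
    host=$1
    hosts_file=${2:-/etc/hosts}
    awk -v h="$host" '
        { sub(/#.*/, "") }                      # drop trailing comments
        { for (i = 2; i <= NF; i++)             # field 1 is the IP address
              if ($i == h) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}
```

Calling has_hosts_entry www.yourdomain.tld || echo "add a hosts entry first" before cloning would catch a missing entry early.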

I have a variation of this script that allows me to run WordPress on an internal server with the hostname: staging.mydomain.tld, whereby the staging/development WordPress can be cloned, and the script will automatically swap out the hostname with www.myserver.tld upon cloning.
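
That variation is not shown here, but the hostname swap it describes could look something like the following sketch. The function name swap_hostname and the list of file extensions are assumptions made for illustration; the sketch also assumes both arguments are plain hostnames (letters, digits, hyphens and dots only).

```shell
#!/bin/sh
# Hypothetical sketch (not the author's variation): rewrite every occurrence
# of one hostname to another in the text files of a cloned site.
# Usage: swap_hostname CLONE_DIR FROM_HOST TO_HOST
swap_hostname() {
    root=$1
    from=$2
    to=$3
    # Escape dots so "staging.mydomain.tld" is matched literally by sed.
    from_re=$(printf '%s' "$from" | sed 's/\./\\./g')
    find "$root" -type f \( -name '*.html' -o -name '*.css' \
        -o -name '*.js' -o -name '*.xml' \) |
    while IFS= read -r file; do
        # Write to a temporary file, then replace the original on success.
        sed "s/$from_re/$to/g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    done
}
```

For the hostnames mentioned above, the call would be swap_hostname "$SAVETO" staging.mydomain.tld www.myserver.tld, run after the clone has finished.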

You may wish to implement this instead of using the same DNS hostname for both servers (or use our contact page to contract the work to us if you need any timely customization).

Also, you will lose the default comment functionality with a static HTML version of your WordPress site. There are alternatives (such as Disqus) that will fill this gap.

Here is the script. It is copy-and-paste friendly, but it can alternatively be downloaded here: wordpress-static-html.sh

Yep, that’s it! It’s actually quite a simple script. In a nutshell, here’s what it does:

  • Creates a base directory for housing the cloned files in $SAVETO and uses it for the prefix (-P)
  • Uses wget to recursively mirror a copy of the entire WordPress website (--mirror, --html-extension, -p)
  • Wget excludes the files and directories defined in $EXCLUDE (-X)
  • Loops through the pages, and creates directories named after them (e.g. /blog/blogpost/)
  • Moves the pages into those directories as index files (e.g. /blog/blogpost/index.html)
  • Copies any additional files to the WordPress $SAVETO base directory, defined in $OTHERSTUFFTOCOPY
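
The steps above can be sketched roughly as follows. This is a reconstruction from the description, not the script itself: the variable names $SAVETO, $EXCLUDE and $OTHERSTUFFTOCOPY and the wget options come from the list above, while the function names and the exact loop logic are assumptions.

```shell
#!/bin/sh
# Sketch only: function names and loop details are assumptions.

# Mirror URL into SAVETO, skipping the comma-separated directories in EXCLUDE.
# Usage: mirror_site URL SAVETO EXCLUDE
mirror_site() {
    url=$1
    saveto=$2
    exclude=$3
    mkdir -p "$saveto"
    # Note: by default wget creates a host-named subdirectory under -P.
    wget --mirror --html-extension -p -P "$saveto" -X "$exclude" "$url"
}

# Turn every page.html (except index.html) into page/index.html, so that
# /blog/blogpost.html is served as /blog/blogpost/.
pages_to_directories() {
    root=$1
    find "$root" -type f -name '*.html' ! -name 'index.html' |
    while IFS= read -r page; do
        dir=${page%.html}
        # Leave the page alone if its directory already has an index file.
        [ -e "$dir/index.html" ] && continue
        mkdir -p "$dir"
        mv "$page" "$dir/index.html"
    done
}

# Copy any additional files or directories into the clone's base directory.
# Usage: copy_extras SAVETO PATH...
copy_extras() {
    saveto=$1
    shift
    for item in "$@"; do
        cp -R "$item" "$saveto/"
    done
}
```

A run would then chain the three: mirror_site http://www.yourdomain.tld/ "$SAVETO" "$EXCLUDE", followed by pages_to_directories "$SAVETO" and copy_extras "$SAVETO" $OTHERSTUFFTOCOPY. Keeping the page restructuring in its own function makes it possible to try it on a local copy without touching the network.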

The user-configurable variables are commented and should be changed to match your requirements.

Puppet

The Puppet configuration would be pretty simple. You would just need to ensure you have something like the example below (adapted for your Puppet environment, of course):

As always, use at your own risk as this site assumes no responsibility!
