Over the last year, we’ve occasionally been posting about page deployments from the old Lane site to the new Lane Drupal site. Today, we dramatically improved our testing infrastructure, and took a step towards a better, more accessible website.
Before we can explain that step, you need at least a grossly simplified idea of what it looks like for us to move pages. Essentially all we’re doing is manually copy and pasting over content from the old site to pages on the new site, then properly formatting and linking everything. Easy with one or two pages, but when doing a chunk of a hundred pages, it’s a tedious and time consuming process. And, as with any tedious process, it’s prone to errors.
So how to we ensure that everything worked right after a deployment? Our principle tool is a Link Checker, which is a type of Web Spider. A spider, once given a page to start on, follows all the links on that page, and then all the links on those pages, and so on. Eventually, it establishes a kind of web of a website.
When we first tried to find a spider, we found several that almost worked. The one we got furthest with was, appropriately enough, called LinkChecker. But it wasn’t quite right for our needs, and we had a lot of difficulty trying to extend it. So we did what any self respecting, overly confident programmer would do, and wrote our own.
Our first pass worked pretty well – checking some 13,000 pages and all their hundreds of thousands of links in about half an hour. But like any tool, before long, you want more power!
We experimented with adding some rudimentary spell checking, but found that doing a complete check of our site could take as long as 4 hours! And there were so many other things to add.
Over the last couple days, we’ve done a complete rewrite and realized significant speed gains. Instead of 30 minutes to check the site, it now takes 7. And adding in spell checking only makes it take 12. We also added a few other features:
- Hotlinking Checks:
It’s possible to include an image on your page that’s technically stored on another server. This practice is called “hotlinking” and is generally discouraged (some things, like Facebook icons or Google Docs embedded images are ok), since it can lead to awkward situations where the hosting server either removes or changes the image – effectively controlling content that’s displayed on your site. We’re now checking to make sure that all the images you include are local (or are part of a list of allowable sites)
- Alt Tag Checking:
According to accessibility rules, images are supposed to have alt tags to help visually impaired people identify what’s in the image. They’re really easy to add, so we really have no excuse not to include them every time we put an image on one of our pages. We’re now logging images that are missing alt tags.
- External Link Checking:
Previously, we were only checking links within the Lane website. We’ve now expanded that to check links off site.
- Page Title Checking:
It’s still in experimental mode, but we’re adding page title checking to make sure that none of our pages have redundant titles for SEO reasons.
- Phone number formatting:
We added a few checks to make sure that all phone numbers are formatted appropriately, so that you can click them to make a phone call.
- Email address formatting:
Similarly, we’re now making sure that all mailto: links have a properly formatted email address after them.
The broader, more important thing we did was to make sure the framework for our link checker is easier to extend – simplifying new tests in the future. And, because its so much faster, it’s “cheap” to run a test, meaning we can do them more often to catch problem areas sooner.
Now, on to the broader problem of actually fixing all the new problems on the site we’ve uncovered….