Encrypt all the things!

Over the last few weeks, we’ve been battling a problem where the web server would sometimes forget its own name. Some days it would want to go by www.lanecc.edu, other days it would want to go by 163.41.113.43 (our public IP address), and other days it would use our internal server name. We gave it a stern talking to, but it refused to cooperate.

The solution is to specify Drupal’s base_url variable. Normally, Drupal tries to identify what server name to use and it does a pretty good job. But clearly our server isn’t so great at that anymore. Specifying the base_url forces Drupal to use what we tell it.

But the base_url needs to be a full URL, complete with protocol. So it needs to be “http://www.lanecc.edu” or “https://www.lanecc.edu” – we’re not allowed to just say “www.lanecc.edu”. Why does that matter? Because even though the difference is just one letter – an “s” – that turns out to be one of the most important letters on the Internet.

HTTP is the protocol that defines how a lot of content on the internet moves around. It’s part of how this page got to you. But it’s a completely unencrypted format. When you’re browsing the web in HTTP, you’re sending everything in clear text – anyone that can listen in (for example, on an unencrypted WiFi connection) can read whatever you’re sending. But if we add the “s”, and browse via HTTPS, then everything we do is encrypted, and no one can listen in*.

But there’s some gotchas with HTTPS pages. For instance, most webpages actually consist of multiple requests – the Lane homepage has 34. If even one of those requests is made over HTTP instead of HTTPS, then we have a “mixed mode content error”, and the browser hides that content.

And that’s kept us from specifying our base_url so far. If we set it to “http://www.lanecc.edu”, then on pages that are HTTPS, like webforms, then all the styles and javascript will break, since those would be sent over HTTP. And if we went the other way, and set the base_url to “https://www.lanecc.edu”, then our caching infrastructure, which is built assuming most connections are over HTTP, would break, significantly slowing down the site. So we’ve been stuck running a mixed-mode site – most people use HTTP, but authenticated people and webform users use HTTPS.

There’s a number of reasons that isn’t ideal, which are well outside the scope of this already too long blog post. And the wider Internet is moving forward with using HTTPS only everywhere. So yesterday, we deployed new caching infrastructure which will allow us to go with using HTTPS only. Going forward, all connections with www.lanecc.edu will beĀ  encrypted.

This should be a almost completely transparent transition, but if you notice any problems, email us at webmaster@lanecc.edu and let us know!

* strictly speaking, this isn’t true, and there’s a a whole category of attacks that can still work on HTTPS. But there’s a fix for that too, and we’re working on rolling that out too some time in the future.