In early 2012, when I was asked to take over as Service Owner for the Web Hosting service, I asked “What is hosting?”. At the time, I asked that question because I had no knowledge or experience of hosting technology, but it’s a question I’ve come back to a few times in the 4+ years I’ve now been running the service.
The University of Edinburgh has a very large web presence: not just the main University website, www.ed.ac.uk, but also a very large number of smaller websites for individual business areas, affiliated teaching/research organisations, personal projects/blogs, internal documentation, support tools and the list goes on… As with any website, these all need a place to live, so Information Services (IS) run an internal Web Hosting service. I like to call it our own little internal version of GoDaddy.
Running such a service within a higher education institution can be challenging given the wide variety of use cases and skill levels within our very diverse user community. So, focusing on the very successful Linux-based side of the service, I thought I would share some of my experience so far and some details of what we’ve done with the service over the past few years. I’ve split those into a few themes: Consolidation, Technology, Security and Usability, and we’ll publish this behemoth blog post in 2 parts.
Consolidation
One of the biggest misconceptions about a web hosting service is that it basically runs itself. Technically, that’s true: a server will run, run and run if you don’t interfere with it. What about security updates, improvements and new technology though? That’s not to say hosting needs the same constant attention as a huge business-critical application, like MyEd (our student/staff portal), but it still needs to be maintained and improved regularly. The result of that sort of misconception is that when time and resources get tight, hosting is the kind of service that finds itself on the back burner while resources are focused on higher-priority services.
Back in 2012, we had a number of Unix and Linux servers that fell under the umbrella of the hosting service, seven in total (including DR and test servers). These were all running different technology and had very different use cases. Five of them had been inherited from other places in IS as part of restructures. The other two were newer machines that had been brought in to replace those five, but never had.
The first challenge, then, was to consolidate all of the websites running on these servers onto one ‘standard’ platform (more on that platform later). This painstaking activity was carried out manually, auditing every website we hosted and migrating it where required. We had a great deal of help from the web teams in various areas around the University who made use of the service. The most complicated opponent, deserving of a mention, was Morse: a server running websites dating back to when brand new episodes starring the famed detective of the same name, and his beautiful Jag Mk2, were on TV. The process has taken the last four years, with the final few websites due to be migrated in the coming weeks. Those seven machines have been replaced with five new servers, each identical to the next.
I can hear the questions already. “Five? That’s not much of a decrease!” Ah, but those five machines are hosting nearly twice as many websites as the seven old, rickety servers did four years ago. Most are resource-intensive, database-driven sites. We also have plenty of capacity for growth.
As part of the migration process, we devolved reseller rights to the various business areas and web teams throughout the organisation who use the service. This lets them create/edit/delete hosting accounts to suit themselves, making things much faster for them and reducing the workload on our team, giving us more time to spend on service improvement.
Technology
Now for the interesting part, the techy stuff. Four years ago, I could be heard in meetings saying phrases like “out of date”, “unreliable” and “not fit for purpose”. I don’t think I need to go into much more detail on that. So, when faced with the choice of whether to upgrade the existing (newer) servers, I decided resources were better focused on designing and building an entirely new service. That brings me back to my initial question: “What is hosting?”.
I mentioned above that we have a wide variety of use cases: from standard blogs, to content proxied into central services, to feeds driving LCD displays in certain schools, we host it all. Needless to say, we had issues with resource contention, issues with security and issues with compatibility. There was talk of setting ‘rules’ for what was and wasn’t allowed on the service, but I disagreed, because my answer to the question of what web hosting is remains: a service where people can host *anything they like*, provided the technology can support it. Enter cPanel and CloudLinux.
cPanel is the web hosting management platform that everyone knows. It does everything for you other than make your coffee in the morning. You define a set of hosting packages, then simply start provisioning accounts. cPanel takes care of Apache and PHP, MySQL and SSH, right down to access permissions. It has a great, intuitive UI that allows site owners to manage their own hosting and makes it nice and easy for us service managers to provision accounts. We were even able to write our own ‘hooks’ to carry out custom tasks triggered by cPanel events/actions.
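To give a flavour of those hooks, here’s a minimal sketch of a standardised hook script, written in Python and registered against WHM’s Accounts::Create event. The registration command follows cPanel’s standardised hooks documentation, but the payload field names, log path and behaviour here are assumptions for illustration, not a copy of our actual hooks.

```python
#!/usr/bin/env python3
"""Minimal sketch of a cPanel standardised hook script.

Registered (hypothetically) with something like:
  /usr/local/cpanel/bin/manage_hooks add script /usr/local/bin/log_new_account.py \
      --manual --category Whostmgr --event Accounts::Create --stage post

cPanel passes the hook a JSON payload on STDIN; the field names
below are assumptions based on the Accounts::Create event.
"""
import json
import sys


def main() -> None:
    payload = json.load(sys.stdin)          # {"context": ..., "data": ..., "hook": ...}
    data = payload.get("data", {})
    user = data.get("user", "unknown")      # new cPanel account name (assumed key)
    domain = data.get("domain", "unknown")  # primary domain (assumed key)

    # Custom task: record the new account in our own audit log
    # (path is purely illustrative).
    with open("/var/log/hosting-account-audit.log", "a") as log:
        log.write(f"account created: {user} ({domain})\n")

    # Standardised hook scripts report success as "1" (plus an
    # optional message) on STDOUT.
    print("1 logged")


if __name__ == "__main__":
    main()
```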
CloudLinux, though, is where things get really impressive: a CentOS-based Linux distribution designed specifically for shared hosting. Anyone familiar with shared hosting will know that resource contention is one of the biggest issues we face: individual sites using too much of the server’s available resources. On a shared server, one small site doing something silly, or being the innocent victim of a DDoS attack, can bring down the entire server. CloudLinux limits each individual user’s share of those resources to prevent that from happening. That includes: CPU %, processes, IO, RAM and even MaxClients. As if that wasn’t impressive enough, it has something called CageFS (caged filesystem). CageFS puts each user in their own ‘cage’, which isolates them from all other users and from the system files they shouldn’t be able to see. It also lets users choose the version of PHP they want to run and, more recently, adds support for Python and Ruby.
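To make those limits concrete, here’s a hedged sketch of how a single account might be capped by hand using CloudLinux’s lvectl utility. The flag names follow the lvectl documentation but can vary between CloudLinux versions, and the account name and values are invented; in a cPanel setup these limits are normally managed through the WHM integration rather than scripts like this.

```python
#!/usr/bin/env python3
"""Hedged sketch: capping one account's share of a CloudLinux server.

Shells out to the `lvectl` CLI that ships with CloudLinux. Treat the
exact option names and all values as assumptions for illustration.
"""
import subprocess


def cap_account(user: str) -> None:
    # Cap the account at 100% of one core, 1 GB of physical RAM,
    # 1024 KB/s of IO, 100 processes and 20 entry processes (the
    # "MaxClients"-style cap on concurrent web requests into the LVE).
    subprocess.run(
        ["lvectl", "set-user", user,
         "--speed=100%",
         "--pmem=1G",
         "--io=1024",
         "--nproc=100",
         "--ep=20"],
        check=True,
    )


if __name__ == "__main__":
    cap_account("exampleuser")  # hypothetical account name
```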
So there you have it. Looking at all of the above, we’ve given users a service where they can run pretty much anything, without worrying about any negative impact on the service itself or on any of the other customers’ applications and websites.
Part 2 coming soon…