Over the last few months we'd experienced two fairly lengthy outages on our web server. It was a dedicated server with a UK host, and we're not exactly sure of the reason for the downtime - it could have been network failure, or it could have been the server crashing. It had become pretty annoying for us, and we realized that for a company touting the use of load balancers for High Availability, it is important that our own website stays up! Also, as Loadbalancer.org receives traffic from every corner of the globe, we wanted to see what we could do to reduce latency to the farther-flung continents.
Enter Amazon Web Services. You've probably been living under a rock if you haven't heard of AWS - it's the new vogue of the IT world - "Cloud Computing". Amazon certainly aren't the only proponents; there's Google App Engine, Rackspace's "Mosso", 3tera, and probably numerous others. So what's so great about the cloud? I guess the main advantage is flexibility. Essentially we're talking about virtualization, so if you want to launch a clone of your current server, no one has to haul a physical machine over to your rack and replicate your data onto it. In the cloud the capacity is already there; you just have to concern yourself with creating enough demand. But scalability isn't our main concern - we "just" want HA and decent global delivery of our website's content, and for this we chose to leverage Amazon's EC2, S3 and CloudFront services.
I think of these three services as the following:
- EC2 is your server - it allows you to provide dynamic content via whatever platform you prefer (e.g. LAMP). Your server can be in Europe, or the US.
- S3 is just storage. You can host your static files on here, but there is no logic layer. Give the requester what they asked for, and nothing else. It's geographically redundant, and pretty cheap per GB.
- CloudFront - a global CDN for your S3 files, serving files via US, Europe, Hong Kong, or Japan, depending on the location of the request.
All of Amazon's services are accessed through their API, which exists in command-line form with a Java backend, and also as a web service, via SOAP requests, or straightforward HTTP(S). Other APIs which leverage the web service have cropped up, written in PHP, Perl, Python, C++ etc. For EC2 our favorite tool is ElasticFox, a pretty robust and feature-rich Firefox plugin.
So how do they all fit together? Well, we have some dynamic content on our website, such as this blog, and we need a scripting language to send emails. So clearly we need a server; we can't just dump all our web files on S3 and serve them through CloudFront. There are a few features that make the EC2 platform attractive for use as a web server:
Elastic IPs - These are permanent public IPs assigned to your account. If your server (an instance in AWS lingo) crashes or becomes unavailable for any reason, you can reassign your elastic IP to another instance. Or launch a new instance and assign the elastic IP to it. So in theory you should never need to repoint your DNS entries to a different IP.
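As a rough illustration of the reassignment step, here is a minimal Python sketch. It assumes a boto3-style EC2 client, which is not the tooling described in this post, and the allocation and instance IDs in the usage note are made up. The function takes the client as an argument so it can be tried against a stub.

```python
def fail_over_elastic_ip(ec2, allocation_id, standby_instance_id):
    """Move an Elastic IP to a standby instance and return the association ID.

    Assumption: `ec2` behaves like boto3.client("ec2"), and the address is
    identified by its allocation ID.
    """
    response = ec2.associate_address(
        AllocationId=allocation_id,
        InstanceId=standby_instance_id,
        # Allow the address to be taken away from the failed instance.
        AllowReassociation=True,
    )
    return response.get("AssociationId")
```

Calling something like `fail_over_elastic_ip(boto3.client("ec2"), "eipalloc-0example", "i-0standby")` would move the address, and because the public IP itself does not change, the DNS records stay as they are.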
Elastic Block Store (EBS) - A similar principle to the elastic IP, but for storage devices. An instance has its normal storage device, but an EBS volume is more redundant and can be reattached to another instance. Essentially the EBS is a SAN in the cloud, whereas normal instance storage is vulnerable to individual disk failure. However, you can't avoid storing your essential OS files on the instance's disk, since networking must be up before you can mount your EBS. But you can store as much of everything else as you like on it: web files, databases, Subversion repositories, config files, etc. So if your instance goes bang, you can very quickly launch another one and re-attach your EBS. For an extra layer of redundancy you can take snapshots of your EBS, which are stored on S3. RightScale have an excellent article regarding the ins and outs of the EBS.
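The re-attach step could look roughly like the sketch below. Again this assumes a boto3-style EC2 client rather than the tools we actually used, and the device name `/dev/sdf` is only an illustrative default. Mounting the filesystem still has to happen on the instance itself afterwards.

```python
def move_volume(ec2, volume_id, new_instance_id, device="/dev/sdf"):
    """Attach an existing EBS volume to a replacement instance.

    Assumption: `ec2` behaves like boto3.client("ec2"). The volume must be
    in the "available" state (detached from the dead instance) before it
    can be attached elsewhere, so we wait for that first.
    """
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    return ec2.attach_volume(
        VolumeId=volume_id,
        InstanceId=new_instance_id,
        Device=device,
    )
```

The volume and the new instance need to live in the same availability zone for the attach call to succeed.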
System Images - So what about the rest of your files, the essential OS files called upon during boot? If your instance dies, won't you have to waste time recreating your config? The simple answer is no. Once you have an instance just the way you like it, you take an image of it and transfer it to S3. You can register it as public or private. There is a huge number of free public images to choose from, covering all flavors of Linux as well as Windows, Solaris and DB2. Just pick a clean image, make your changes, and register your own image. In some cases, you may want to just mount an EBS and serve your content from there.
Straightforward Security Maintenance - When you launch an instance, by default none of its ports are open to the outside world. Opening a port simply involves a one-line call to the AWS API. And obviously closing one is just as easy.
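To give a feel for how small those calls are, here is a sketch in Python. It assumes a boto3-style EC2 client (not the command-line tools or ElasticFox mentioned earlier); the security group ID in the usage note is hypothetical, and opening the port to `0.0.0.0/0` is just an example default.

```python
def open_port(ec2, group_id, port, cidr="0.0.0.0/0"):
    """Allow inbound TCP traffic on `port` from `cidr` for a security group.

    Assumption: `ec2` behaves like boto3.client("ec2").
    """
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpProtocol="tcp",
        FromPort=port, ToPort=port, CidrIp=cidr,
    )


def close_port(ec2, group_id, port, cidr="0.0.0.0/0"):
    """Remove the matching inbound rule again."""
    ec2.revoke_security_group_ingress(
        GroupId=group_id, IpProtocol="tcp",
        FromPort=port, ToPort=port, CidrIp=cidr,
    )
```

For a web server, `open_port(ec2, "sg-0example", 80)` would be the one line that matters; SSH would typically be opened only to a narrower address range.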
So how did we fit all this around our scenario? Well, currently AWS instances can only be located in the US or Europe. So it struck us that the optimal thing to do was to host as much of our static content as possible on CloudFront (images, CSS, downloads etc.). We store our blog database (and a couple of others) on an EBS, attached to an instance in Europe. The instance is running Ubuntu 8.04 and also serves our Subversion repository. We set up an elastic IP and pointed the www and blog A records to it. Then we set up some cron jobs to take snapshots of the EBS. If our instance fails we can launch another one and it will be up and running and serving the same content within 30 seconds.
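A cron-driven snapshot job along these lines might look like the following sketch. It is not our actual script: it assumes a boto3-style EC2 client, and the "keep the newest 7" retention policy is an invented example. The pruning logic is kept as a pure function so it can be checked without touching AWS.

```python
from datetime import datetime, timezone


def take_snapshot(ec2, volume_id, now=None):
    """Snapshot an EBS volume and return the new snapshot ID.

    Assumption: `ec2` behaves like boto3.client("ec2").
    """
    now = now or datetime.now(timezone.utc)
    description = f"cron backup of {volume_id} at {now:%Y-%m-%dT%H:%MZ}"
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return response["SnapshotId"]


def snapshots_to_prune(snapshots, keep=7):
    """Return IDs of all but the `keep` most recent snapshots.

    Each snapshot is a dict with "SnapshotId" and "StartTime" keys, the
    shape returned by a describe-snapshots style call.
    """
    newest_first = sorted(snapshots, key=lambda s: s["StartTime"], reverse=True)
    return [s["SnapshotId"] for s in newest_first[keep:]]
```

The script would be run from cron at whatever interval matches how much data you can afford to lose, with the IDs from `snapshots_to_prune` passed on to a delete call.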
But what about the load balancer, I hear you ask? Can't you ensure zero downtime and well-distributed traffic amongst a cluster of web servers? Actually you can. AWS does include a load balancing facility, which essentially acts as a proxy, allowing you to forward traffic on a certain port to various servers in the same availability zone. This is a very sensible thing to do, but in the case of our website it is probably overkill. Plus the static content is effectively geographically load balanced via CloudFront. Although their load balancer is a valuable addition to the service, we do think it would be a massive improvement to be able to span multiple availability zones and even regions (i.e. Europe and US). In theory this is doable with HAProxy, and this is something we are experimenting with at the moment. Another downside is that the load balancing is only available in the US region at the moment.
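For readers new to the idea, the core of what such a proxy does can be sketched in a few lines of Python. This is a toy model of round-robin selection that skips backends marked as down; it is not how the AWS load balancer or HAProxy is implemented, and the class and backend names are invented for illustration.

```python
class RoundRobinPool:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

A real load balancer would drive `mark_down` and `mark_up` from periodic health checks, so a crashed web server simply stops receiving requests until it recovers.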
All in all, we are happy with the level of availability our setup provides, and (touch wood) we have had no issues as yet. But for those hosting more critical services in the cloud, we suggest taking a look at the load balancing service, and for greater flexibility, running an instance with an HAProxy setup.