Why does Global Server Load Balancing (GSLB) suck?

OK, before I start to fan the flames, let me state the usual caveat, "GSLB doesn't ALWAYS suck.  Just more often than you'd think".

It's 2011, and here at Loadbalancer.org we've toyed with the idea of selling a Global Server Load Balancer. After all, it wouldn't take long to hack a decent PowerDNS interface onto one of our appliances. But every time we look at how it might work, we keep coming back to the fact that it doesn't always work as the customer would expect.

Let me continue this rant by describing what customers probably want, then moving on to what GSLB actually does, and finally suggesting some simple alternatives.

What do most customers want when they talk about GSLB?

There are normally two things customers ask for:

  1. Active-Passive failover between two Internet sites (disaster recovery and high availability).
  2. Active-Active load balancing between two or more geographically dispersed sites, e.g. Europe and the USA (this uses the closest site for speed and high availability).

Sounds simple enough, doesn't it?

But let's briefly step back to what we should have done first, which is make sure our primary site (or all sites for that matter) is as indestructible as possible.

Have you already got the following?

  • 2 x Internet feeds
  • 2 x Switch fabrics
  • 2 x Firewalls
  • 2 x Load balancers  (no persistence/sticky here please, we want high availability after all)
  • 3+ x Web Servers
  • 2+ x Database Servers (gee, I wonder if we could put the persistence here?)

Done that?

No? Then go and do that first before you think about GSLB.

Assuming you've done that already, then great, by all means explore GSLB.

Next, I'm going to explain what a GSLB is (and it's quite simple)...

What is a Global Server Load Balancer really?

GSLB (Global Server Load Balancer) = DNS (Domain Name System) server

That wasn't difficult, was it?

When a client requests "www.myGSLBsite.com", your DNS/GSLB replies saying "sure go to X.X.X.X".
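To see how little the client actually gets back, here is a minimal sketch using Python's standard `socket.getaddrinfo`. The helper name `resolve_ipv4` is my own invention for illustration; the point is simply that the answer is a list of IP addresses and nothing more.

```python
import socket
from typing import List


def resolve_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the resolver hands back for `name`."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address. Deduplicate while keeping order.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

Calling `resolve_ipv4("www.myGSLBsite.com")` would return whatever addresses the DNS/GSLB decided to hand out at that moment. The client has no idea why it got them.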

Now this can all quickly get very complicated, with GSLB vendors saying "but we can do all this cool stuff as well". But I say "Hogwash", and agree with everything Pete Tenereillo says about GSLB (well almost).

So going back to customer request number 1...

How do I get Active-Passive failover between two Internet sites?

Err... Just change your DNS record?
Or write a simple script to do it?
Or get your DNS provider to do it?
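The "simple script" option could look something like this. It is a rough sketch under stated assumptions, not a finished tool: the `publish` callback is hypothetical and stands in for whatever your DNS provider's update mechanism is, and the health check is just a TCP connect.

```python
import socket
from typing import Callable


def tcp_check(ip: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_active(primary: str, standby: str,
                  is_up: Callable[[str], bool] = tcp_check) -> str:
    """Active-Passive: use the primary while it is healthy, else the standby."""
    return primary if is_up(primary) else standby


def failover_once(current: str, primary: str, standby: str,
                  publish: Callable[[str], None],
                  is_up: Callable[[str], bool] = tcp_check) -> str:
    """Run one check and call `publish` only if the A record should change.

    `publish` is a hypothetical hook: plug in your DNS provider's update call.
    """
    target = choose_active(primary, standby, is_up)
    if target != current:
        publish(target)
    return target
```

Run `failover_once` from cron or a loop every few seconds. Bear in mind that clients keep using a cached record until its TTL expires, so the switch is only as fast as the TTL you set.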

Wasn't too hard, was it? On to request number 2...

How do I get Active-Active load balancing between two or more geographically dispersed sites?

Make sure ANY user can hit ANY server at ANY time (session replication/database replication), and then configure multiple A records in your DNS...

Who spotted the deliberate mistake?

Err... OK, so it doesn't give you the local provider, shortest hops, etc., but if you read Pete's blog (above) you would realize that that is impossible.
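For the record, the multiple A records approach really is that dumb. Here is a minimal sketch of a name server rotating its answers. The class name `RoundRobinRecord` and the addresses are made up for illustration, and real DNS servers differ in how (and whether) they rotate.

```python
from collections import deque
from typing import Iterable, List


class RoundRobinRecord:
    """A name with several A records, answered in rotating order."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._addresses = deque(addresses)

    def answer(self) -> List[str]:
        """Return all addresses, then rotate so the next reply starts elsewhere."""
        reply = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the back
        return reply


# Placeholder addresses from the documentation ranges, one per site.
www = RoundRobinRecord(["192.0.2.10", "198.51.100.10"])
```

Each call to `www.answer()` leads with a different site, so traffic gets spread across all of them with no knowledge at all of where the client is.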

Why is that impossible? Well, geolocation data like the MaxMind database is pretty flaky at best, and most companies hide behind VPNs these days. Even if it did give accurate locations, sometimes New York can talk to Germany faster than it can to Washington. So unless you are going to watch the latency of every single connection... oh wait, no, you can't, because you are not an in-path load balancer, you are an out-of-path DNS server.
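To show what a geography-based choice boils down to, here is a sketch that picks the "closest" site by great-circle distance. The coordinates are hard-coded assumptions standing in for a GeoIP lookup (I've picked Frankfurt for Germany), and `nearest_site` is a name I've made up.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Dict, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

# Assumed site locations, standing in for a real GeoIP database.
SITES: Dict[str, Coord] = {
    "washington": (38.9072, -77.0369),
    "frankfurt": (50.1109, 8.6821),
}


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def nearest_site(client: Coord, sites: Dict[str, Coord] = SITES) -> str:
    """Pick the site with the smallest geographic distance to the client."""
    return min(sites, key=lambda name: haversine_km(client, sites[name]))
```

A client in New York, `nearest_site((40.7128, -74.0060))`, gets sent to Washington every time. The function knows nothing about the actual network path, so it will never notice the days when Germany would have been faster.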

So how can we get around that issue about fast access to local stuff?


Ever heard of a Content Delivery Network (CDN)?

Akamai, Cloudflare and others have made a lot of money by making sure your big files like pictures and videos are replicated to edge networks around the world. And guess what? They know what they're doing!

BTW Amazon CloudFront does this and it's dirt cheap... Who gives a monkey which application server they hit if the large datasets are served from an edge network?

Now, I have rushed this a bit (no, really?) and I've probably missed a lot of things (feel free to enlighten me!) but that's my personal point of view.

On a more positive note though, if you are still serious about wanting to do this GSLB-thingy then right at the bottom of Pete's famous rant you will find the possible answer. I'm sure it's the method that Google et al. use and it's this:

  • Use a global network of top level domain name servers with proper BGP agreements with all the other top level DNS providers, plus a simple health checking framework with a least-hop selection criterion.
  • Then rapidly change BOTH the DNS entries and, more importantly, the physical IPs (BGP), depending on your geographical algorithms etc.

Hang on though, isn't that a managed service provided by Neustar...?

So my conclusion is that GSLB sucks....

Even though Cloudflare does a pretty good job at what people think they want from GSLB...

But, oh, don't you just love one last "But"?


But, there are definitely a couple of situations where GSLB is awesome!

The primary reason to use a GSLB is for multi-site load balancing, when you need site failover or, more importantly, locality-based routing, i.e. use the local site first to save on bandwidth costs and latency:

GSLB – Why Global Server Load Balancers don't always suck?

The second reason to use GSLB is when you need seriously high-performance load balancing, for example for large multi-site object storage systems:

GSLB direct-to-node: How it works, and when to use it for your object storage deployment
