We explore the concept of load balancing, and explain what a load balancer does — and how.
Table of contents
- What does a load balancer do?
- How does a load balancer work?
- What is a network load balancer?
- What is an application load balancer?
- What is a load balancer scheduling algorithm?
- Who needs a load balancer?
- What are the different load balancing methods?
- What load balancer types are available?
- What are the benefits of using a load balancer?
- Why do load balancers come in pairs?
What does a load balancer do?
In a nutshell, a load balancer facilitates effective application delivery. No surprise then that a load balancer is also known as an Application Delivery Controller, or ADC.
To understand the purpose of a load balancer, you first need to understand a bit about how applications work. Because, ultimately, there are limits to what an application can do without one.
How does application delivery work WITHOUT a load balancer?
With critical applications, the user experience is everything. If the user’s experience is interrupted or a connection is unreliable, then the services that application is trying to deliver are fundamentally undermined. And that’s a problem for the software company that designed the application, as much as for the person relying on that application.
Whether the user is an employee trying to communicate with colleagues via an email application, or a doctor trying to access a patient's health records through an EHR (Electronic Health Record) application, every user expects content to be served to them quickly, reliably, and without interruption.
Applications rely on servers to deliver the requested content successfully to the end user. Application servers work closely with web servers: the web servers deliver static content like images, HTML pages, and videos to users in different locations, while the application servers provide the ecosystem for dynamic web applications to run, and handle the crucial operations between the user interface and the back end. So applications effectively rely on servers both to deliver their central functionality and to execute the tasks that users request.
But servers are fallible, and can only handle so many user requests before they become overwhelmed, compromising application delivery.
How does application delivery work WITH a load balancer?
Critical applications need to be Highly Available (HA). That is, they need to run without interruption, 24 hours a day, 7 days a week, no matter what's thrown at them. And load balancers make that possible. So, hidden in data centers around the world are millions of load balancers doing just that.
A load balancer sits between the user (otherwise known as the ‘client’) and the server cluster, distributing all the content requests from users (like a video, text, application data, or images) across all servers capable of fulfilling those requests. The users and the servers might be local to one another, i.e. within the same data center, or geographically dispersed across the internet or private networks.
How does a load balancer work?
Load balancers distribute connections among healthy servers based on the algorithm of your choice. The end result is that incoming application and network traffic is distributed across a group of backend servers, making sure no single server is overloaded and ensuring traffic is only sent to healthy ones. This means user requests are delivered successfully, with far less risk of dropped or interrupted connections.
If a server in the application cluster goes down for any reason, or is taken offline for maintenance, the load balancer immediately redirects traffic to the remaining healthy servers.
Without a load balancer, the services running on that server would simply go offline.
For more on why servers can fail, check out this blog: Plan for the worst when it comes to critical IT systems.
But load balancers don't just help applications stay always on; they can also help applications scale and adapt to a growing number of user requests. When new servers are added to the server pool, the load balancer automatically starts sending requests to them, meaning the application can be scaled easily while remaining highly available.
And, depending on the rule set, or algorithms used, load balancers can also accelerate application performance by reducing latency, i.e. the amount of time it takes a server to respond to the user. For example, by directing user requests to the closest server to the user, response times are accelerated, meaning the user receives the desired information in the shortest amount of time possible.
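The behavior described in this section (skip unhealthy servers, pick up newly added ones) can be illustrated with a minimal Python sketch. The `Backend` and `Pool` names are hypothetical, the `healthy` flag stands in for the result of a real health check, and a plain rotation is assumed in place of a configurable algorithm; this is not how any particular product implements it.

```python
class Backend:
    """A hypothetical backend server with a health flag."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # assumed to be updated by a separate health check


class Pool:
    """A hypothetical server pool that only hands out healthy backends."""

    def __init__(self, backends=None):
        self.backends = list(backends or [])
        self._next = 0  # position in the rotation

    def add(self, backend):
        # A newly added server becomes eligible on the very next pick.
        self.backends.append(backend)

    def pick(self):
        # Only healthy servers are candidates for new connections.
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            raise RuntimeError("no healthy backends available")
        backend = healthy[self._next % len(healthy)]
        self._next += 1
        return backend
```

Setting `healthy = False` on one backend removes it from the rotation without touching the others, and calling `pool.add(...)` brings a new server into service without any downtime for the rest of the pool.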
What is a load balancer node?
A load balancer node is the physical or virtual server which runs the load balancing software that intelligently distributes the application traffic.
The following node states and roles together form a high availability (HA) cluster:
- Node roles - Primary and secondary
- Node states - Active and passive
Load balancer node roles: Primary and secondary
The Primary node is the appliance assigned the primary role in the high availability pair. It contains the primary settings for load balancing, such as the Virtual Services’ IP addresses and all associated configurations. Hence, this is the node on which users make all their configuration changes to the Virtual Services. It's also the node that assumes the Active state by default.
The Secondary node, as the name suggests, is the appliance that assumes a secondary role. While it might start in a Passive state, that doesn't mean it's incapable of becoming Active. It's ready to step in as the Active node if/when the Primary node (appliance) encounters issues. The Secondary node's configuration mirrors the Primary node to ensure a smooth transition when it takes over.
Having a Primary and Secondary appliance in a clustered pair provides redundancy and resilience for critical applications by removing the single point of failure. So even if the Primary node or appliance fails, the Secondary node can take over, achieving high availability of the application.
Load balancer node states: Active and passive
The Active node in a high availability load balancer pair is the one that handles traffic distribution. This means it is responsible for intelligently routing incoming requests to a pool of backend servers. In essence, it is the 'workhorse' that balances the load being routed to the available servers in a pool.
Conversely, the Passive node remains on standby, ready to take over if the Active node encounters any issues. While it doesn't actively route traffic, it constantly monitors the health and status of the Active node and stands ready to assume the Active role should the need arise. The Passive node is crucial for maintaining uninterrupted service in case of a failure or during maintenance.
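To make the distinction between roles and states concrete, here is a minimal Python sketch of an HA pair. The `Node` and `HAPair` classes, the `alive` flag, and the `heartbeat` method are all hypothetical names chosen for illustration; a real appliance would detect failure over the network rather than by reading a flag.

```python
class Node:
    """A hypothetical load balancer node in an HA pair."""

    def __init__(self, name, role):
        self.name = name
        self.role = role        # "primary" or "secondary": assigned once
        self.state = "passive"  # "active" or "passive": changes at failover
        self.alive = True       # stand-in for a real heartbeat result


class HAPair:
    """A hypothetical active-passive pair of load balancer nodes."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        # The Primary node assumes the Active state by default.
        primary.state = "active"
        secondary.state = "passive"

    def active(self):
        if self.primary.state == "active":
            return self.primary
        return self.secondary

    def heartbeat(self):
        """The Passive node checks the Active node and takes over if it is down."""
        active = self.active()
        standby = self.secondary if active is self.primary else self.primary
        if not active.alive and standby.alive:
            active.state, standby.state = "passive", "active"
        return self.active()
```

Note that after a failover the Secondary node is Active, but its role is still "secondary": the role describes the configuration relationship, while the state describes which node is currently handling traffic.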
More information on the difference between node roles and states.
What is a network load balancer?
The term network load balancer is confusing for two reasons. Firstly, a network load balancer is not just for network traffic! Secondly, a network load balancer in fact operates at the Transport Layer (Layer 4) of the Open Systems Interconnection (OSI) model, and not at the Network Layer (Layer 3) as the name implies.
The following video demonstrates how a Layer 4 DR mode load balancer works at a high level, transporting traffic as efficiently and quickly as possible to a healthy server:
N.B. This video relates to Layer 4 DR mode only. Other methods of Layer 4 load balancing are available, including Layer 4 NAT, SNAT, and TUN, although these are rarely used. For more, refer to the section below: “What are the different load balancer methods?”.
A network load balancer refers to the way the traffic is managed, rather than the type of traffic being managed. So if you hear someone talking about a network load balancer, what they’re really referring to is Layer 4 network load balancing. Layer 4 load balancing can be great for supporting the delivery of large files to users (such as videos) because it is more efficient than the alternative load balancing method called Layer 7 load balancing. But the trade-off is that Layer 4 load balancing is not as intelligent as Layer 7 load balancing, which allows you to apply more complex rules. For a comparison of the different load balancing methods including Global Server Load Balancing (GSLB), check out this blog: Compare Layer 4, 7, and Global Server Load Balancing (GSLB) techniques.
What is an application load balancer?
An application load balancer, otherwise known as a Layer 7 load balancer, is one that load balances traffic at the Application Layer. Unlike a Layer 4 load balancer, which is focused only on delivering traffic to a server, a Layer 7 load balancer looks at the actual content being delivered and responds accordingly, providing what is effectively smart routing.
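The difference between the two layers can be sketched in a few lines of Python. This is an illustration under stated assumptions: `layer4_pick`, `layer7_pick`, and the `RULES` table are hypothetical names, hashing the client address is just one possible way to choose a server without reading the request, and URL path prefixes are just one kind of content a Layer 7 rule might inspect.

```python
import zlib

# Hypothetical Layer 7 rules: (URL path prefix, name of the pool to use).
RULES = [
    ("/video/", "video_pool"),
    ("/api/", "api_pool"),
]


def layer4_pick(servers, client_ip, client_port):
    """Layer 4: choose a server from connection details only.

    The request content is never inspected. Hashing the client address
    is an assumption made for this sketch.
    """
    key = f"{client_ip}:{client_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def layer7_pick(path, rules, default_pool):
    """Layer 7: inspect the request itself (here, the URL path) and apply rules."""
    for prefix, pool in rules:
        if path.startswith(prefix):
            return pool
    return default_pool
```

With these rules, `layer7_pick("/video/intro.mp4", RULES, "web_pool")` routes to the video pool, while a request for `/index.html` falls through to the default. The Layer 4 function cannot make that distinction, because it never sees the path.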
The video below demonstrates how a Layer 7 load balancer works at a high-level by applying a set of rules:
For more on the differences between a Layer 4 load balancer and a Layer 7 load balancer, check out this blog: Difference Between Layer 4 vs. Layer 7 Load Balancing, or check out the high-level video summary below:
Layer 7 is the most flexible method of load balancing, providing high availability and scale for applications, while being easy to implement. The downsides are that it can be hard to make transparent (i.e. to let the backend servers see the client's source IP), and that it operates as a full proxy, so it adds slightly more latency (although it still offers wire speed).
In contrast, with Layer 4 DR mode, the reply traffic flows from the servers straight back to the clients, bypassing the load balancer, maximizing the throughput of return traffic and allowing for near endless scalability. However, some applications will not support Layer 4 DR mode and it can't be used for SSL offloading.
So the method needs to fit the use case. For more on network versus application load balancers, check out this resource from AWS.
What is a load balancer scheduling algorithm?
A scheduling algorithm is the logic load balancers use to decide which server to use for the next new connection.
Algorithms include the following:
Round-robin / Weighted round-robin
With this method, incoming requests are distributed to backend servers in a sequential manner relative to each backend server's weight. Servers with a higher weight receive more requests. A server with a weight of 200 will receive four times as many requests as a server with a weight of 50. Weightings are relative, so it makes no difference whether backend servers #1 and #2 have weightings of 50 and 100 respectively, or 5 and 10.
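As a rough illustration of the weighting arithmetic, here is a minimal Python sketch of weighted round-robin. The function name and the dictionary of weights are hypothetical, weights are assumed to be positive integers, and the schedule is built naively (each server's turns are grouped together); production schedulers typically interleave servers more smoothly.

```python
from functools import reduce
from itertools import cycle
from math import gcd


def weighted_round_robin(weights):
    """Return an endless iterator of server names, in proportion to their weights.

    `weights` maps a server name to a positive integer weight.
    """
    # Weights are relative, so reduce them by their greatest common divisor:
    # 200 and 50 become 4 and 1, and both 50/100 and 5/10 become 1 and 2.
    divisor = reduce(gcd, weights.values())
    schedule = [
        name
        for name, weight in weights.items()
        for _ in range(weight // divisor)
    ]
    return cycle(schedule)
```

For example, with `{"a": 200, "b": 50}` every cycle of five requests sends four to `a` and one to `b`, which is the 4:1 ratio described above.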
Least connection / Weighted least connection
With this method, incoming requests are distributed to the backend servers with the fewest connections relative to each backend server's weight. Servers with a higher weight receive more requests. A server with a weight of 200 will receive four times as many requests as a server with a weight of 50. Again, weightings are relative, so it makes no difference whether backend servers #1 and #2 have weightings of 50 and 100 respectively, or 5 and 10. This is the default method for new virtual IPs (VIPs).
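A minimal Python sketch of the idea follows. It assumes that "fewest connections relative to weight" means choosing the server with the lowest ratio of open connections to weight; the `Server` class and `pick_least_connection` function are hypothetical names, and a real scheduler may break ties or track connections differently.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Server:
    """A hypothetical backend with a weight and a count of open connections."""
    name: str
    weight: int          # assumed to be a positive integer
    connections: int = 0


def pick_least_connection(servers):
    """Choose the server with the fewest open connections per unit of weight."""
    # Fraction keeps the comparison exact; ties go to the first server listed.
    return min(servers, key=lambda s: Fraction(s.connections, s.weight))
```

If connections are opened one at a time and none are closed, a server with weight 200 ends up holding four times as many connections as a server with weight 50, matching the ratio described above. When connections do close at different rates, the busier or slower server naturally receives fewer new ones.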
Who needs a load balancer?
If applications are vital to your business, if uninterrupted service is crucial, or if your user traffic experiences sudden spikes or drops, a load balancer is a must-have.
Without one, application reliability, performance, and scalability will become compromised.
In order to meet the expectations of application users today, businesses and organizations that deliver critical services must use a load balancer to ensure their digital experiences are truly ‘zero friction’.
What load balancer types are available?
The type of load balancer required will depend on the use case and infrastructure being leveraged. The most common types of load balancer include:
- Hardware devices that physically sit in IT departments or data centers.
- Software solutions that can be installed on physical or virtual machines in data centers, on-premise, or in the cloud.
There are pros and cons to each. For more on hardware versus software load balancers, check out this blog: What is the difference between a hardware and virtual load balancer?
What are the different load balancer methods?
There are four main load balancing methods at Layer 4: DR mode, NAT, SNAT, and TUN.
And one at Layer 7, which operates as a full proxy.
Layer 4 and Layer 7 load balancing can also be used in conjunction with Global Server Load Balancing (GSLB). Demand for GSLB has grown significantly in the last three years, as large numbers of organizations have migrated away from traditional on-premise systems and have instead created hybrid cloud and hosted environments. Many have also made the strategic decision to split their data resources across multiple locations to improve business resilience and reduce costs. For more on GSLB, check out this comprehensive guide:
In all these instances, GSLB allows organizations to deliver a high-quality, reliable experience for users, no matter where they are in the world — no matter where their applications and data are located.
For a direct comparison of each, check out this blog: Compare Layer 4, 7, and Global Server Load Balancing (GSLB) techniques. This blog by our Founder, Malcolm Turnbull, is also a useful resource for anyone wanting to research the best method for their use case: Load balancing methods.
What are the benefits of using a load balancer?
There are three main advantages to application users of having a load balancer in the mix:
- High availability: load balancers protect applications from downtime, ensuring an uninterrupted user experience in the event of server failure. They also make it possible to do server maintenance without having to take critical applications offline.
- Performance: load balancers reduce latency, improving application response times and overall performance.
- Scalability: load balancers enable systems to handle sudden spikes in traffic and scale resources up or down as needed. They also make it easy to scale server infrastructure without the need for downtime.
For more on the benefits of load balancing, check out this blog: Advantages of Load Balancing for Enterprises.
Why do load balancers come in pairs?
In short, a second load balancer provides additional redundancy. Two load balancers provide high availability, helping you avoid the dreaded single point of failure. And redundancy matters because it is this that protects you from downtime (which was probably the primary reason for buying a load balancer in the first place).
With a single load balancer, you’re effectively merely shifting the single point of failure from the server to the load balancer. It gives you some protection against too many demands being made on your backend servers, but there’s still a limit to what it can actually protect you against because the load balancer itself then becomes the weakest link in the chain.
However, with two load balancers (otherwise known as a clustered HA pair), you’re adding a second, standby load balancer, ready to take over if the primary appliance fails for any reason, giving you an extra layer of redundancy and protection against downtime.
Hence, load balancers are most commonly deployed in active-passive pairs, meaning traffic can be redirected to the redundant device in the event of a failure or during essential maintenance downtime.
30-day load balancer free trial
Curious about load balancers? Sometimes the best way to understand something is to see it for yourself. If this is you, feel free to download our virtual appliance, free for 30 days, to see what it can do. And if you want to talk to a human, or have questions, our technical experts are always here to help. They're obsessed with load balancers so really you'd be doing the rest of us a favor by taking them off our hands for a few minutes ; )