Load balancing object storage for increased availability, performance, scalability and resilience
Load balancing an object storage deployment will increase its performance, make it more resilient, improve scalability and increase its uptime.
While some object storage vendors will supply their product with simple load balancing features, these are frequently only adequate for small and single site deployments, often lacking the features to provide a truly optimized experience. Conversely, using a high-end load balancer, like F5 or Citrix NetScaler, can result in overly complex and difficult to maintain solutions that are a drain on your resources.
Loadbalancer.org have created another way: smart, flexible and unbreakable ADC/load balancing solutions tailored specifically to meet the needs of object storage. Powerful enough to supercharge your object storage environment, but without the complexity that puts a strain on your team.
Why load balance object storage?
The need for load balancing
There has been a dramatic shift away from traditional storage to cloud-based object storage solutions that are scalable and secure. Augmenting these applications by ensuring their high availability and performance is a key role of the load balancer, as is the provision of scalable IT architecture that can cope with the inevitable explosion in data, and the uncompromising demands of client-side applications.
Storage application drivers
1. Exponential data growth
Data is big and getting bigger. Exponential data growth has led to increasingly complex storage environments which need to be able to store, retain and maintain data securely.
2. A shift away from traditional storage solutions
For many years, nearly all software applications were designed to read and write data using file and block storage. File storage manages data organized into hierarchical file systems, while block storage manages data as blocks within sectors and tracks. These platforms fail to scale sufficiently to meet growing storage needs, simply because conventional storage techniques weren’t designed to handle the data tsunami that the business world is heading towards. They work well for smaller data sets but fall apart at scale. Moreover, as more and more applications are designed to talk directly to the storage nodes, without additional file system layers in between, the older concepts of data storage are slowly becoming obsolete.
Therefore, as data explodes, organizations are forced to relook at their storage systems and invest in solutions that are sustainable and long-term. As a result, they are making a move from traditional storage options to cloud-based object storage, which is a reliable, efficient and affordable way of storing, archiving, backing up, and managing huge volumes of static or unstructured data.
3. Client-side applications
Storage feeds into a number of different ecosystems and key client-side applications. It is important that these end-to-end workflows are load balanced so that data access, storage and analysis remain smooth.
How load balancers enhance object storage
To ensure a robust object storage system for their customers, storage vendors must implement an effective load balancing solution into their architecture. Load balancing improves responsiveness and increases the availability of applications by distributing network or application traffic across a cluster of servers. Thus, it is the key to running a successful object storage system. While most object-based data storage vendors promise unlimited scalability as one of their biggest strengths, load balancing is the driving force behind it. It helps object storage scale better by adding health checks and failover. Without a load balancer, most data storage vendors would use a simple round-robin DNS solution which no doubt can provide scalability, but is not completely reliable when a storage node fails. Therefore, in order to scale their solutions adequately to meet customers’ growing data demands, it is important for storage vendors to add load balancers to their storage cluster.
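The difference between plain round-robin DNS and a health-checked pool can be illustrated with a minimal Python sketch. The node names and the `HealthCheckedPool` class are hypothetical and exist only for illustration; a real load balancer would run active health checks against each storage node rather than being told the result.

```python
class HealthCheckedPool:
    """Round-robin over storage nodes, skipping nodes marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = {node: True for node in self.nodes}
        self._next = 0  # index of the next node to try

    def mark(self, node, is_healthy):
        # In practice this would be driven by a periodic health check.
        self.healthy[node] = is_healthy

    def pick(self):
        # Try each node at most once, starting from the rotation pointer.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if self.healthy[node]:
                return node
        raise RuntimeError("no healthy storage nodes available")


def plain_round_robin(nodes, count):
    """Round-robin with no health awareness, as simple DNS rotation behaves."""
    return [nodes[i % len(nodes)] for i in range(count)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical storage nodes
pool = HealthCheckedPool(nodes)
pool.mark("node-b", False)  # simulate a failed storage node

dns_answers = plain_round_robin(nodes, 6)    # still hands out node-b
balanced = [pool.pick() for _ in range(6)]   # never hands out node-b
```

In this sketch the plain rotation keeps sending a third of requests to the failed node, while the health-checked pool spreads them across the two remaining nodes.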
Load balancing methodologies
Layer 4 and/or layer 7
The simplest option is Layer 4 and/or Layer 7 load balancing, in which all traffic flows through the load balancer. Failover is automatic, immediate and not even noticeable to end users, and most storage appliances use this method by default. Vendors must, however, choose a load balancing solution large enough to match the throughput requirement.
Direct server return (DSR/DR) mode
Another option is Direct Server Return (DSR/DR) mode, in which return traffic bypasses the load balancer on its way back to the client. Because responses travel directly from the storage nodes, read throughput is not limited by the capacity of the load balancer. Writes still pass through the load balancer, so this mode does less for write speed, but it works well where fast read performance needs to be guaranteed.
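To see why DSR mode favours reads, it can help to count the bytes that actually cross the load balancer in each mode. The sketch below is a simplified model, not a measurement: the request and response sizes are illustrative assumptions, and `bytes_through_load_balancer` is a hypothetical helper.

```python
def bytes_through_load_balancer(request_bytes, response_bytes, mode):
    """Return how many bytes of one exchange traverse the load balancer."""
    if mode == "proxy":
        # Layer 4 NAT / Layer 7: both directions pass through the load balancer.
        return request_bytes + response_bytes
    if mode == "dsr":
        # Direct Server Return: only the client-to-server direction passes through.
        return request_bytes
    raise ValueError(f"unknown mode: {mode}")


# Assumed sizes: a read is a small request with a large response,
# a write is a large request with a small acknowledgement.
READ = {"request_bytes": 1_000, "response_bytes": 100_000_000}
WRITE = {"request_bytes": 100_000_000, "response_bytes": 1_000}

read_proxy = bytes_through_load_balancer(**READ, mode="proxy")
read_dsr = bytes_through_load_balancer(**READ, mode="dsr")
write_proxy = bytes_through_load_balancer(**WRITE, mode="proxy")
write_dsr = bytes_through_load_balancer(**WRITE, mode="dsr")
```

Under these assumptions a read in DSR mode puts only the small request through the load balancer, while a write puts almost the same load on it in either mode.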
Global server load balancing (GSLB)
Global Server Load Balancing (GSLB) is an upgraded version of the traditional round-robin DNS technique.
Use cases include:
- Multi-site – GSLB adds health checks, location awareness and site failover, making it the best choice for organizations looking at multi-site deployments. Rather than using a simple round-robin DNS, storage vendors can also use a GSLB to handle the entire load balancing requirement. This removes bottlenecks, but failover can be affected as it happens at the DNS level and is mostly reliant on the client software accessing the storage system.
- Heavy workloads – GSLB is direct-to-node, allowing the client to talk directly to the storage node. This takes the load balancer out of the path of the traffic, making it suitable for particularly heavy workloads that don’t need some of the intelligent load balancing functionality found at Layers 4 and 7.
Benefits of load balancing object storage
1. Load balancers facilitate zero downtime
The ability to copy data in multiple locations is another major benefit of using object storage. Data can easily be replicated within nodes and clusters among distributed data centers for additional back up, off-site and even across geographies. The storage system can be configured in such a way that if a particular disk within a cluster fails, a duplicate disk is always available, ensuring the system continues to run without interruption or performance degradation. What makes this possible? Again, load balancing does the magic! Load balancers allow storage vendors to distribute (spread out) and store data in multiple locations to facilitate zero downtime in case of failover. For example, if data center A fails, a load balancer allows users of that locality to access the same data in data center B. Load balancers can use a myriad of Access Control Lists (ACLs), rules and topology information to direct users to the correct location to access storage. Typically, for multi-site deployments, storage vendors can use the GSLB with its topology feature which allows matching the source subnets to locations, helping users access their resources locally unless a failover occurs. When a site failover occurs, users are rerouted to a different location, thus ensuring uninterrupted data access.
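The topology behaviour described above (match the client's source subnet to a site, stay local unless that site has failed) can be sketched in a few lines of Python. The subnets, site names and virtual IP addresses below are made-up examples, and `resolve` is a hypothetical function rather than any product's API.

```python
import ipaddress

# Hypothetical topology: which client subnets belong to which site.
TOPOLOGY = {
    "site-a": [ipaddress.ip_network("10.1.0.0/16")],
    "site-b": [ipaddress.ip_network("10.2.0.0/16")],
}

# Hypothetical virtual IPs for the storage cluster at each site.
SITE_VIPS = {"site-a": "192.0.2.10", "site-b": "198.51.100.10"}


def resolve(client_ip, site_up):
    """Return the VIP a client should use, preferring its local site."""
    client = ipaddress.ip_address(client_ip)
    local = next(
        (site for site, nets in TOPOLOGY.items()
         if any(client in net for net in nets)),
        None,
    )
    # Serve the local site while it is passing health checks.
    if local is not None and site_up.get(local, False):
        return SITE_VIPS[local]
    # Otherwise fail over to any site that is still healthy.
    for site, vip in SITE_VIPS.items():
        if site_up.get(site, False):
            return vip
    return None  # no healthy site to answer with


all_up = {"site-a": True, "site-b": True}
a_down = {"site-a": False, "site-b": True}

local_answer = resolve("10.1.5.20", all_up)     # client in site A's subnet
failover_answer = resolve("10.1.5.20", a_down)  # same client after site A fails
```

Here the same client is answered with its local site while that site is healthy, and with the other site once the local one is marked down.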
tl;dr
Benefits of load balancing object storage include:
- Facilitate zero downtime
- Support the integrity of storage infrastructure
- Offer consistent access to storage
- Enhance data management applications
- Play a role in ransomware protection
Adjusting server weights and distributing traffic evenly helps storage vendors manage client object storage systems efficiently and prevent complete data center failures. Load balancing facilitates this. Using a “least connection” algorithm, a load balancer monitors the number of sessions sent to each node and schedules new sessions to the least loaded node. Load balancers can also work with Feedback Agents on the real servers, which report CPU and memory load so that the resources left on each server can be taken into account. If a data center fails, load balancers can still give users uninterrupted data access. How? Failover occurs at the IP level and does not rely on client software. Traffic is rebalanced without the client typically needing to know that anything has changed: the load balancer quickly removes the failed backend node from the pool of available servers and redistributes traffic among the remaining healthy nodes, resulting in very fast failover. For vendors using GSLB alone, failover occurs at the DNS level where, as with IP failover, the failed location is no longer served up to new users. The benefit of GSLB is maximum throughput, although failover is slower because it is not handled at the IP level.
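A weighted least-connection scheduler with a feedback-driven weight can be sketched as follows. This is a minimal illustration under stated assumptions: the `Backend` fields, the 0 to 100 weight scale and the `apply_feedback` rule (weight equals reported idle percentage) are all hypothetical, not the behaviour of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 100     # relative capacity; feedback can lower it under load
    active: int = 0       # sessions currently open on this node
    healthy: bool = True  # result of the most recent health check


def pick_least_connection(backends):
    """Send the new session to the healthy node with the fewest sessions per unit of weight."""
    candidates = [b for b in backends if b.healthy and b.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    chosen = min(candidates, key=lambda b: b.active / b.weight)
    chosen.active += 1
    return chosen


def apply_feedback(backend, idle_percent):
    """Assumed feedback rule: weight follows the idle CPU/memory percentage reported by the node."""
    backend.weight = max(0, min(100, int(idle_percent)))


backends = [Backend("node-1"), Backend("node-2"), Backend("node-3")]

# With equal weights, sessions are spread evenly.
first_three = [pick_least_connection(backends).name for _ in range(3)]

# When node-2 fails its health check, new sessions go only to the others.
backends[1].healthy = False
after_failure = [pick_least_connection(backends).name for _ in range(4)]
```

After `apply_feedback(backends[0], 50)`, node-1 counts as half as capable as node-3, so the next session in this sketch would be scheduled to node-3.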
2. Load balancers support the integrity of storage infrastructure
Overall, a load balancer helps object storage architecture maintain system stability, improve system performance and protect against system failures. To leverage the benefits of load balancing, it is important that storage applications have load balancers installed as a key architectural component of their infrastructure. Building a load balancing solution into the storage infrastructure reduces downtime and increases scalability, flexibility, redundancy and data protection. Data storage vendors should partner with service providers offering proven, easy-to-install, results-driven load balancing solutions that are tightly integrated into their data storage infrastructure, running concurrently in the same environment as the application resources. Installing standardized load balancing solutions across client sites also helps vendors achieve consistent results with every installation. Ideally, therefore, they should look for service providers that understand their object storage requirements and will work with them to design flexible load balancing solutions for specific storage needs.
3. Load balancers offer consistent access to storage
GSLB and multi-site failover ensure consistent access to storage solutions.
4. Load balancers enhance data management applications
Data management applications such as Splunk and Weka play a crucial role in analytics, machine learning and artificial intelligence. In order to offer reliable runtime, load balancers need to be added to the workflows of these applications.
5. Load balancers play a role in ransomware protection
Load balancing increases durability across multiple data centres (availability zones), protecting the integrity of data stored using cross-site replication. It guarantees business continuity in a crisis by ensuring that the desired Recovery Time and Recovery Point Objectives are met. For industries where the acceptable amount of downtime following a disruption such as a ransomware attack is zero, the load balancer failover happens seamlessly to avoid any disruption to the end-user.
When a failure event occurs, no backup data is being ingested into the local storage. The load balancer detects the failure and redirects traffic to the additional data centres meaning both storage and retrieval of data can continue, offering maximum redundancy. The recovery point objectives can be as low as seconds, minimizing the amount of data loss. In this scenario, the load balancer facilitates immediate retrieval of immutable data backups, held in alternative locations, to offer maximum redundancy. The net effect should be that failover and failback are therefore seamless. Hence load balancing mitigates the risks associated with the potential loss of data resulting from a ransomware attack.
Deployment guides
How can load balancers integrate with your storage architecture?
For detailed step-by-step instructions on the deployment of a load balancer with your existing object storage solution, please refer to the relevant deployment guide below:
- How to load balance Ceph Object Gateway
- How to load balance Cloudian HyperFile
- How to load balance Cloudian HyperStore
- How to load balance Dell EMC
- How to load balance Hitachi Vantara
- How to load balance IBM Cloud object storage
- How to load balance iRODS
- How to load balance MinIO
- How to load balance NetApp Storage Grid
- How to load balance Qumulo
- How to load balance Scality Ring
- How to load balance Panzura Cloud
- How to load balance Storage Made Easy
- How to load balance Swiftstack
- How to load balance Zadara
Load balancing Cloudian HyperStore
How do you load balance Cloudian HyperStore?
Object and file storage solutions such as Cloudian’s may help in a number of ways, for example:
- By allowing businesses to take the benefits of cloud and Amazon S3 into their own, on-premise data centres.
- By connecting to an Amazon rack in the customer’s own data centre.
The following use case looks at how Cloudian HyperStore provides enterprise object storage to serve a research University’s complex storage needs.
Challenges faced
One of the UK’s leading research universities needed to safeguard over 1.5 petabytes of research data and intellectual assets generated by around 21,000 students and 5,000 staff. It decided to replace its aging storage area network with Cloudian’s HyperStore object storage system and needed to deploy compatible load balancers to help it keep this vital backup system running. At the time, the only load balancers used within the university were ‘home-grown’ solutions, built many years before using open source software by technicians who no longer worked at the university. As these solutions were difficult to manage and maintain, the university wanted to move to fully-supported, commercial load balancers.
Solutions and results
The university started by setting up a proof-of-concept for Cloudian HyperStore and evaluated load balancers from Loadbalancer.org in this test environment. They were very impressed with the way the Loadbalancer.org solutions reacted to storage node failures during testing, with the failover occurring so quickly that storage jobs didn’t even know that an incident had occurred.
The university now uses two Loadbalancer.org Enterprise 40G appliances, installed as a high availability pair across two data centers. These solutions balance traffic across 15 HPE Apollo servers, and back up or ‘churn’, on average, 120 terabytes of data per week. As such the Loadbalancer.org appliances play a vital role in ensuring the high availability of the university’s Cloudian HyperStore solution, handling exceptionally high throughput with ease. At peak times, the load balancers handle 3.3 gigabytes of storage per second with no impact on performance.
The university also configured the Loadbalancer.org appliances to provide a secure gateway from the private Cloudian network to external services on the university’s public network. This improved their security and saved them a huge amount of configuration time.
Object vs other storage types
What is file storage?
File storage stores data as a single piece of information inside a folder. Data is organized and retrieved using metadata that tells the computer where to find it, so you need to know the path to reach a file. File storage is difficult to scale because capacity cannot simply be added to an existing system; instead, more systems have to be added. It is therefore best suited to small data files and data that is accessed and stored by individual users or small businesses.
What is block storage?
Block storage doesn’t rely on a single path to data, so data can be retrieved more easily. Instead, it separates the data into blocks and stores them as separate pieces. Using unique identifiers, the data can therefore be stored wherever is most convenient, which might be across a number of different environments (e.g. in a Windows or Linux environment). In this way, blocks can be decoupled to better manage the data and reassembled only when needed.
The benefit of block storage is that data can be retrieved and accessed more easily; however, it can be expensive and offers little support for metadata (data about other data that might be necessary for analysis).
What is object storage?
Designed to be massively scalable, object storage is a new storage paradigm with a very simple interface. It organizes information into containers of flexible sizes, referred to as objects. Each object includes the data itself and its associated metadata and has a globally unique identifier name. It is a “flat” structure and naming convention.
The biggest benefit of such an architecture and interface is its ability to achieve enormous scale dynamically. Besides storing enormous amounts of data, object storage also allows access to large amounts of disparate data sources for analytics and advanced reporting which traditional storage fails to offer. No matter where a particular type of data is stored, object storage is intelligent enough to find that data whenever a related query is fed in. It also ensures improved efficiency while managing very large quantities of data, thus making it a high-performance, cost-effective solution ideal for long-term data retention.
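The object model described above (data plus metadata under a globally unique identifier, in a flat namespace) can be sketched with a small in-memory Python example. `FlatObjectStore` and its methods are hypothetical names used for illustration only; real object stores expose this model through an HTTP API and distribute objects across many nodes.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str                    # globally unique identifier
    data: bytes                       # the payload itself
    metadata: dict = field(default_factory=dict)  # descriptive key/value pairs


class FlatObjectStore:
    """Toy object store: one flat dictionary, no folders or paths."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(object_id, data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return the IDs of all objects whose metadata matches every criterion."""
        return [
            obj.object_id
            for obj in self._objects.values()
            if all(obj.metadata.get(key) == value for key, value in criteria.items())
        ]


store = FlatObjectStore()
scan_id = store.put(b"image bytes", modality="MRI", department="radiology")
report_id = store.put(b"report bytes", department="radiology")

mri_objects = store.find(modality="MRI")
radiology_objects = store.find(department="radiology")
```

Because objects are located by identifier or metadata rather than by path, the store can place them anywhere, which is what makes the flat structure easy to scale out.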
Sector-specific object storage needs
While the object storage solutions available may cut across all industries, the type of data being stored may vary considerably. The focus for banks may be on big data and immutable backups to protect against ransomware, while for hospitals medical imaging and patient data will be the priority. Similarly, universities will be focused on keeping student records highly available, and media organizations are likely to focus on high-capacity storage for new image formats.
For more sector specific information, refer to Chapters Six, Seven and Eight.
Cloud storage considerations
What needs to be considered when load balancing storage in the cloud?
Cloud object storage systems help solve pebibyte storage challenges for enterprises worldwide. They are an innovative and cost-effective approach for storing large volumes of unstructured data while still ensuring scalability, security, availability, reliability, manageability, and flexibility. There are two options: use a cloud-native load balancer, or opt for a platform-agnostic load balancer. The appropriate load balancing strategy will be dependent on the use case and desired outcomes.
AWS cloud object storage
With AWS object storage solutions, storage is managed in one place with an easy-to-use application interface, making it easier to perform analysis, gain insights and make faster, better decisions. There are a number of considerations when choosing the best load balancer for AWS.
AWS has released three types of load balancer – CLB (Classic Load Balancer), ALB (Application Load Balancer) and NLB (Network Load Balancer). As a customer, you are likely to buy one of these for its basic functionality, and then realize pretty soon that you need another one to do something else. In fact, as your business grows and your needs change, you will almost inevitably end up with all three load balancers. This ‘catch ‘em all’ approach is completely unnecessary when you realize that one Loadbalancer.org product, Enterprise AWS, easily fulfils the function of all three. Just take a look at the comparison table below.
While there are many benefits to AWS cloud load balancing, there are also a number of drawbacks.
Azure cloud object storage
Microsoft Azure offers a number of different cloud storage solutions. The appeal of this cloud platform is that you get highly scalable, secure storage from which you can run your Microsoft business applications.
Find out here how Loadbalancer.org can support Azure cloud object storage.
Google cloud object storage
Google provides high performance, scalable load balancing on their Google Cloud Platform (GCP). They have a video for cloud developers called ‘Best Practices for Storage Classes, Reliability, Performance, and Scalability’ which may be helpful if you’re considering this solution.
Find out here how to load balance Google Cloud GCP with our platform-agnostic load balancer.
Other cloud platforms
The three examples cited above are the ones we get asked about the most, but if you would like to discuss load balancing other cloud object storage platforms let us know. For example, we are working with an increasing number of organizations using data lakes, as well as storage vendors looking at managed service private cloud solutions.
IBM cloud object storage
IBM’s storage platform supports exponential data growth and cloud-native workloads with built-in, high-speed file transfer capabilities, cross-region offerings and integrated services.
Find out here how IBM Cloud Object Storage can be load balanced.
Object storage in healthcare
How do you load balance object storage in healthcare?
Load balancing storage architecture has a number of key benefits:
- It ensures the data being stored is protected
- It helps make sure that data remains highly available and accessible at all times
- It enables businesses to meet growing data demands through scalability
Here we explain how that works in a healthcare context.
Load balance medical imaging PACS and VNA
For healthcare, the greatest storage demands arguably come from medical imaging PACS and VNA.
A picture archiving and communication system (PACS) is a medical imaging technology which provides economical storage and convenient access to images from multiple imaging modalities. Electronic images and reports are transmitted digitally via PACS; this eliminates the need to manually file, retrieve, or transport film jackets. The universal format for PACS image storage and transfer is DICOM (Digital Imaging and Communications in Medicine). Non-image data, such as scanned documents, may be incorporated using consumer industry standard formats like PDF (Portable Document Format), once encapsulated in DICOM. PACS has a reputation for being out of date and proprietary, although Loadbalancer.org has seen some great PACS implementations.
The aptly named VNA (Vendor Neutral Archiving) has taken us one step closer to the goal of an open standard for medical archiving. VNA has encouraged many hospitals to improve their existing solution or even replace it entirely – to take advantage of new modalities and better service availability.
A VNA is an archival system that can be used to store virtually any type of digital data irrespective of the original source of the data. The VNA will also serve that data to any requesting system (with proper authentication and authorization) without regard to the vendor of the system requesting the data. It is the independence from the vendors that provide the source data or the data request that renders it “vendor neutral.” VNAs are also sometimes referred to as a PACS Neutral Archive. VNAs are distinguished from picture archiving and communication systems by functioning more as a central store for images from many sources and diverse vendors.
Failure of any of these medical imaging systems can be costly to the healthcare system and to patients, so achieving zero downtime is a key consideration. Loadbalancer.org appliances monitor the status of medical imaging storage and application servers (health checks) and seamlessly direct traffic to the healthy, online or least loaded servers, providing high availability.
Load balancing enterprise imaging
Enterprise imaging (EI) brings meaningful data from disparate PACS together, transforms it and makes it accessible to all, helping healthcare organizations unlock, analyze and share information and ultimately become less department-centric and more patient-centric.
For more detail on how load balancing enhances EI, refer to the examples below:
White paper
Read our white paper on the pivotal role of the load balancer in PACS migrations
Deep learning forecasting
Data from healthcare applications such as medical imaging is sent to AI platforms for decision support. This generates large amounts of data, the handling of which can be enhanced by load balancing the entire data ecosystem.
Load balancing immutable backups
Load balancing is part of a healthcare data protection strategy. The following case study shows how our load balancers helped optimize a US health sciences university’s storage system, supporting its backups and data safeguarding.
Learn how loadbalancer.org helps a U.S. federal government health sciences university optimize its Dell ECS storage system
Load balancing big data
Big data such as medical images are so large, fast or complex that it’s difficult to process these using traditional methods. Harnessing this data however has the potential to dramatically increase levels of accuracy in detecting and diagnosing disease.
For example, conventional mammogram screening misses 1 in 5 breast cancer cases, whereas Google’s AI-powered Lymph Node Assistant (LYNA) can detect breast cancer metastasis with 99% accuracy. While human pathologists miss metastases as much as 62% of the time, AI algorithms evaluate exhaustively for extremely high accuracy. This could lead to earlier detection and treatment of breast cancer, saving lives and improving the treatment options for thousands of women. But how should this data be stored, retrieved, managed and analyzed? Using load balancers to support object storage architecture means that big data remains highly available, secure and scalable.
For full details of how we load balance medical images and other clinical data refer to the guide below:
Load balancing to support the healthcare IT agenda
Advanced load balancers (or Application Delivery Controllers – ADCs) are playing a vital role within this transformational environment, ensuring high availability of core applications, assisting with interoperability challenges and smoothing the transition to a hybrid future with functionality like GSLB (Global Server Load Balancing) creating multi-site and multi-cloud resilience.
Object storage in finance
How do you load balance object storage in finance?
Load balancing storage architecture has a number of key benefits:
- It ensures the data being stored is protected
- It helps make sure that data remains highly available and accessible at all times
- It enables businesses to meet growing data demands through scalability
Load balancing finance applications for secure and highly available customer services
Open banking, regulatory changes, consumer expectations, cyberattacks, legacy system failure, data safeguarding, skill shortages. Just a small sample of the myriad of factors driving banking and financial services institutions to fast-track digital transformation.
Immutable backups and ransomware
Load balancing is part of a finance data protection strategy.
For a more detailed explanation of the role of object storage and load balancing in protecting against ransomware attacks, refer to the blog below:
Blog post
How immutable object storage can help fight ransomware in the financial sector and protect data.
Deep learning forecasting
Unlike traditional machine learning, deep learning algorithms are able to learn and determine whether their predictions are accurate without human intervention. This automation means learning can happen faster, which can make forecasts more accurate. Credit risk, anomaly and fraud detection processes are also more efficient.
All of these activities require the safe storage, retrieval and processing of large amounts of internal and external data. By embedding load balancers in your storage architecture this data remains highly available, scalable and secure.
Object storage in education
How do you load balance object storage in education?
Load balancing storage architecture has a number of key benefits:
- It ensures the data being stored is protected
- It helps make sure that data remains highly available and accessible at all times
- It enables businesses to meet growing data demands through scalability
Here we explain how that works in the context of higher education.
Immutable backups and ransomware
Ransomware attacks were predicted to occur every 11 seconds in 2021 at a cost of $20 billion, which includes the cost of restoring data following an attack and not just ransomware payments. Increasingly, these attacks are aimed at universities, perhaps as banks adopt increasingly sophisticated cybersecurity measures and criminals look to target more vulnerable and under-resourced sectors. In 2021 alone there were two major attacks, at the University of Portsmouth and the University of Sunderland, resulting in significant disruption and taking student and staff records, as well as websites and key IT systems, offline. For critical systems such as these, a load balancer can be used to make sure that failover happens seamlessly in the event of an attack, to avoid any disruption to the end-user.
When a failure event occurs, no backup data is being ingested into the local storage. The load balancer detects the failure and redirects traffic to the additional data centres meaning both storage and retrieval of data can continue, offering maximum redundancy. The recovery point objectives can be as low as seconds, minimizing the amount of data loss. In this scenario, the load balancer facilitates immediate retrieval of immutable data backups, held in alternative locations, to offer maximum redundancy. The net effect should be that failover and failback are therefore seamless. Hence load balancing mitigates the risks associated with the potential loss of data resulting from a ransomware attack.
Case study
Read how a leading UK university uses load balancing to help store and secure over 120 terabytes of data every week.
Big data
Universities are increasingly using big data not just for research purposes, but also to spot students who are struggling and reduce dropout rates. Retrieving historical, archived data, analysing it and spotting trends makes it possible to create tailored intervention and support plans for current students. Using load balancers to support object storage architecture ensures their big data remains secure, highly available and scalable.
Object storage in media
Why use object storage for media archiving and backups?
Media and entertainment organizations have until recently relied on Linear Tape-Open (LTO) storage. This long-term magnetic tape data archiving and backup technology offered an open-standard alternative to the earlier proprietary magnetic tape formats available in the early 1990s.
But LTO has its limitations, namely the increasing costs of expanding and maintaining tape libraries, and an inability to easily access or manage these files. Hence object storage has now emerged as the storage option of choice.
Object storage applications help to facilitate data center modernization, and on-prem, hybrid or cloud storage data solutions, meeting growing archive demands. Nodes can be distributed across multiple geographic locations, with media copies at each location – ensuring content collaboration and disaster recovery at all times.
Why load balance media object storage?
Load balancing object storage architecture has a number of key benefits:
- It ensures the media files being stored are protected
- It helps make sure that creative assets remain highly available and accessible at all times
- It enables agencies and organizations to meet growing data demands through scalability
For more information on load balancing in media and entertainment, check out our detailed guide:
Guide
Find out more about load balancing for intelligent asset management, and uninterrupted video streaming