Ensuring High Availability- A White Paper

                    Monitoring Clusters and Load Balancers




Introduction

Enterprises employ several mechanisms to optimize the performance of their networks
in order to ensure high availability. Clustering for failover (Active/Passive mode) and
load balancing (Active/Active mode) are commonly adopted techniques that support
redundancy, session or database replication, and the distribution of requests across the
servers in the cluster. Some networks also place software or hardware load-balancers
outside the cluster to increase horizontal scalability.

Most business today is conducted over the Internet, and enterprises prefer clusters or invest in
load-balancers because of their fault-tolerant architecture. Critical servers and
applications, such as database servers and Exchange servers, are hosted in clustered
environments for high availability. Further, load balancers like Big IP are plugged into the
network to distribute the load across servers and address the primary need for uninterrupted
availability at all times. Service interruptions, planned or unplanned, are costly
and are unacceptable for businesses of any size. Administrators exhaust almost every
resource within their means to keep their networks healthy and available.


Service outage and its business impact

We are all heavy consumers of the services the Internet supports. Imagine a situation
where you are accessing your bank account to transfer funds to a friend in an
emergency, and the site simply says, 'Oops! Sorry, the service is unavailable!' The
message concludes by politely asking you to try again after a little while. As end
users, we would hate to be in this situation. That you can call customer service
to complain, or look for an ATM or a branch office, is only some respite. The
damage caused to the bank may be beyond economic repair, and the person behind the scenes,
the administrator or network engineer, takes the blame for the outage.
All this despite having invested in a cluster for a reliable service!


Forewarned is forearmed

The job is only half done by employing clustering for failover or load balancing. Unless
an administrator has clear visibility into the good, the bad, and the ugly components of the
network, including the ones in that critical cluster and the important resources on the
expensive load-balancers, it's impossible to have an alert-free holiday!




Example 1: A two-node SQL cluster in an enterprise:




An Active node can fail over to a standby because of a system failure, an
application failure, or both. An administrator must have clear visibility into system
and application performance, which is possible only with proactive monitoring. In the
scenario discussed above, it is possible that the clustering controller instance has failed,
causing the whole system to fall apart! Or, despite the Active node successfully failing
over to the Passive, the Passive too fails due to insufficient resources! Monitoring the
basic resources on these systems could have prevented this situation, or at least
identified it earlier and reduced the damage.
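
As a rough illustration of the second failure mode, the sketch below projects whether the Passive node would have enough headroom to absorb the Active node's load. It is a minimal sketch under stated assumptions: both nodes have identical hardware, utilization is sampled as a percentage per resource, and the function name `failover_risks` and the 90% limit are hypothetical choices, not part of any product.

```python
from typing import Dict, List


def failover_risks(
    active: Dict[str, float],
    passive: Dict[str, float],
    limit: float = 90.0,
) -> List[str]:
    """Return the resources on which the Passive node would exceed `limit`
    (percent) if it took over the Active node's current load.

    Assumption: identical hardware, so utilization percentages simply add up.
    """
    risks = []
    for resource, active_pct in active.items():
        projected = active_pct + passive.get(resource, 0.0)
        if projected > limit:
            risks.append(resource)
    return sorted(risks)


if __name__ == "__main__":
    # Hypothetical samples: the Passive node is already short on memory.
    active_node = {"cpu": 60.0, "memory": 70.0}
    passive_node = {"cpu": 10.0, "memory": 35.0}
    print(failover_risks(active_node, passive_node))
```

A non-empty result is an early warning that a failover would probably not hold, which is the kind of problem worth knowing about before the Active node fails.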

Example 2: A load-balancer distributing requests across a few servers:




Despite redundant servers set up for load balancing, imagine a hardware resource failure
on the load-balancer making the service unavailable! The user requests never make
it to the servers, even though the service itself is up and running fine!
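
One way to tell a load-balancer fault apart from a server fault is to probe the service both through the load-balancer and directly on each server, then compare the results. The sketch below shows only the comparison step; the function name `locate_fault`, the server names, and the wording of the returned messages are illustrative assumptions.

```python
from typing import Dict


def locate_fault(via_balancer_ok: bool, servers_ok: Dict[str, bool]) -> str:
    """Classify an outage from probe results taken at roughly the same time.

    via_balancer_ok: did a request sent through the load-balancer succeed?
    servers_ok: per-server result of a request sent directly to that server.
    """
    healthy = [name for name, ok in servers_ok.items() if ok]
    if via_balancer_ok:
        if len(healthy) == len(servers_ok):
            return "ok"
        return "degraded: some servers are down, load-balancer still serving"
    if healthy:
        return "load-balancer fault: servers are up but unreachable through it"
    return "service fault: no server is responding"


if __name__ == "__main__":
    # Hypothetical probe results matching the scenario described above.
    print(locate_fault(False, {"web1": True, "web2": True}))
```

The case described in this example is the third branch: every server answers directly, yet nothing gets through the load-balancer.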



Reduce the damage

The purpose of clustering is lost if the resources are not constantly monitored. Even as
administrators try to ensure that end users do not 'feel' any service failure, they
must quickly identify the cause of the failover from the active to the passive node, or why
the load on a particular server is high. So, all the components or
resources that must run for the clustering to work well need monitoring. This
includes the Cluster service on the nodes, the dependent services, the system resources
on the load-balancer, the response time of the individual devices, and so on. Constant automated
monitoring of key components helps reduce the damage and helps realize the goal of
ensuring high availability at all times.

The key resources include:

Availability of the nodes: A detailed availability report indicating if the node is
unavailable due to a dependent device failure or if the node is pulled down for
maintenance.
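
The distinction drawn here can be sketched as a small classification rule. This is a minimal illustration, not how any particular tool reports availability: the status names and the three boolean inputs (`node_up`, `in_maintenance`, `parent_up`) are hypothetical.

```python
from enum import Enum


class NodeStatus(Enum):
    UP = "up"
    MAINTENANCE = "down (planned maintenance)"
    DEPENDENCY_DOWN = "unreachable (dependent device down)"
    DOWN = "down"


def classify(node_up: bool, in_maintenance: bool, parent_up: bool) -> NodeStatus:
    """Explain why a node is unavailable, not just that it is unavailable.

    parent_up refers to the device the node depends on for reachability,
    for example the switch or router in front of it.
    """
    if node_up:
        return NodeStatus.UP
    if in_maintenance:
        return NodeStatus.MAINTENANCE
    if not parent_up:
        return NodeStatus.DEPENDENCY_DOWN
    return NodeStatus.DOWN


if __name__ == "__main__":
    print(classify(node_up=False, in_maintenance=False, parent_up=False).value)
```

Only the last status points to a fault on the node itself, which keeps planned work and upstream failures from being counted against the node.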




Response time of the nodes: The response time of each node at any given time, and its
average response time, which indicates the load on it.
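
A minimal sketch of tracking both figures for one node follows. It assumes TCP connect time as a stand-in for response time and a fixed-size window for the average; a real monitoring tool may use a different probe. The names `measure_connect_ms` and `ResponseTimeTracker` are hypothetical.

```python
import socket
import time
from collections import deque
from typing import Deque, Optional


def measure_connect_ms(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the node
    could not be reached within the timeout."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return None
    return (time.perf_counter() - start) * 1000.0


class ResponseTimeTracker:
    """Keeps the most recent samples for one node."""

    def __init__(self, window: int = 5) -> None:
        self.samples: Deque[float] = deque(maxlen=window)

    def record(self, millis: float) -> None:
        self.samples.append(millis)

    def current(self) -> Optional[float]:
        return self.samples[-1] if self.samples else None

    def average(self) -> Optional[float]:
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)


if __name__ == "__main__":
    tracker = ResponseTimeTracker(window=3)
    for sample in (10.0, 20.0, 30.0, 40.0):  # hypothetical readings in ms
        tracker.record(sample)
    print(tracker.current(), tracker.average())
```

A current value well above the windowed average suggests a sudden spike, while a steadily rising average suggests growing load on the node.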




Services availability and response time: Availability of the cluster service and its
related services on the nodes.




System resource utilization: A constant check on the performance of the hardware
resources, because the last thing you want is insufficient resources rendering a critical
service unavailable!




Service Parameters: Critical parameters of a service that can lead to a potential
failure.




System Events pertaining to the cluster: Keeping tabs on the system events,
including the application events, so that there are no surprises and every possible
source of a fault is watched.




Availability and Performance of the load balancer: Ensuring the basic availability
and responsiveness of the load balancer.




System resources on the load balancer: Monitoring the critical resources on the
load-balancers to identify any problem indicators well in advance.




Cluster Groups (Business Views): A holistic view of the nodes in a cluster with an
ability to drill down to the root cause. This provision to visualize a cluster helps to
understand the health of the cluster at a glance.





Summary

ManageEngine OpManager is network monitoring software that monitors all the
resources on your LAN and WAN. The performance and fault management capability of
OpManager helps identify performance bottlenecks quickly. Its ability to drill down to the
root cause of a fault and its extensive customization capability make OpManager a preferred
solution among thousands of network administrators worldwide. A few useful plug-ins
and add-ons, such as the NCM Plug-in, NetFlow Plug-in, and VoIP add-on, and the provision to
easily integrate with other applications in the management suite, such as ServiceDesk
Plus, make it a one-stop shop for all your network and IT management needs.
Visit www.opmanager.com to test-drive the 30-day free trial version of ManageEngine
OpManager.





From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Monitoring Clusters and Load Balancers

  • 2. Ensuring High Availability: A White Paper
    Monitoring Clusters and Load Balancers

    Introduction

    Enterprises employ several mechanisms to optimize the performance of their networks in order to ensure high availability. Clustering for failover (Active/Passive mode) and for load balancing (Active/Active mode) is a commonly adopted technique that supports redundancy, session or database replication, and the distribution of requests across the servers in the cluster. Some networks also place software or hardware load balancers outside the cluster to increase horizontal scalability.

    Most business is conducted over the Internet, and enterprises prefer clusters or invest in load balancers because of their fault-tolerant architecture. Critical servers and applications, such as database servers and Exchange servers, are hosted in clustered environments for high availability. In addition, load balancers such as BIG-IP are plugged into the network to distribute the load across servers and meet the primary need for uninterrupted availability. Service interruptions, planned or unplanned, are costly and unacceptable for businesses of any size. Administrators use almost every resource within their means to keep their networks healthy and available.

    Service outage and its business impact

    We are all heavy consumers of the services the Internet supports. Imagine that you are accessing your bank account to transfer funds to a friend in an emergency, and the site simply says, 'Oops! Sorry, the service is unavailable!' The message concludes by politely asking you to try again after a little while. As end users, we would hate to be in this situation. Being able to call customer service to vent, or to look for an ATM or a branch office, is only some respite. The damage to the bank is hard to repair, and the person behind the scenes, the administrator or network engineer, takes the blame for the failure. All this despite having invested in a cluster for a reliable service!

    Forewarned is forearmed

    Employing clustering for failover or load balancing is only half the job. Unless an administrator has clear visibility into the good, the bad, and the ugly components of the network, including the ones in that critical cluster and the important resources on the expensive load balancers, it is impossible to have an alert-free holiday!
  • 3. Example 1: A two-node SQL cluster in an enterprise
    An Active node may fail over to a standby because of a system failure, an application failure, or both. An administrator must have clear visibility into system and application performance, which is possible only with proactive monitoring. In the outage scenario discussed above, the clustering controller instance may have failed, bringing the whole system down. Or the Active node may have failed over successfully, only for the Passive node to fail as well because it had insufficient resources. Monitoring the basic resources on these systems could have prevented this, or at least identified it early enough to reduce the damage.

    Example 2: A load balancer distributing requests across a few servers
    Despite redundant servers set up for load balancing, a hardware resource failure on the load balancer itself can make the service unavailable. The user requests never reach the servers, even though the service on them is up and running fine.
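The second failure in Example 1, a Passive node that cannot carry the load it inherits, can be caught before a failover happens. The following is a minimal sketch, not how any particular product does it. It assumes both nodes run on identical hardware so that utilization percentages can simply be added, and the names `NodeStats`, `can_absorb_failover`, and the 90% limit are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Latest utilization readings for one cluster node (hypothetical shape)."""
    name: str
    cpu_percent: float     # current CPU utilization, 0-100
    memory_percent: float  # current memory utilization, 0-100


def can_absorb_failover(active: NodeStats, passive: NodeStats,
                        limit: float = 90.0) -> bool:
    """Return True if the passive node could take over the active node's load.

    Assumption: both nodes have identical hardware, so the projected
    utilization after failover is the sum of the two current readings.
    """
    projected_cpu = active.cpu_percent + passive.cpu_percent
    projected_memory = active.memory_percent + passive.memory_percent
    return projected_cpu <= limit and projected_memory <= limit


if __name__ == "__main__":
    active = NodeStats("sql-node-a", cpu_percent=70.0, memory_percent=60.0)
    passive = NodeStats("sql-node-b", cpu_percent=30.0, memory_percent=15.0)
    if not can_absorb_failover(active, passive):
        print(f"Warning: {passive.name} lacks headroom to take over from {active.name}")
```

Running a check like this on every polling cycle turns "the Passive node failed too" from a surprise during an outage into a warning raised while the Active node is still healthy.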
  • 4. Reduce the damage
    The purpose of clustering is lost if the resources are not constantly monitored. Even as the administrator tries to ensure that end users do not 'feel' any service failure, he must quickly identify why the active node failed over to the passive node, or why the load on a particular server is high. All the components and resources that the cluster depends on therefore need monitoring. This includes the Cluster service on the nodes, the dependent services, the system resources on the load balancer, and the response time of the individual devices. Constant, automated monitoring of key components helps reduce the damage and helps realize the goal of ensuring high availability at all times.

    The key resources include:
    - Availability of the nodes: A detailed availability report indicating whether a node is unavailable because a dependent device failed or because the node was taken down for maintenance.
    - Response time of the nodes: The response time of each node at any given time, and its average response time as an indicator of the load on it.
    - Services availability and response time: Availability of the cluster service and its related services on the nodes.
    - System resource utilization: A constant check on the performance of the hardware resources, because the last thing you want is insufficient resources rendering a critical service unavailable!
  • 5. - Service parameters: Critical parameters of a service that can point to a potential failure.
    - System events pertaining to the cluster: Keeping tabs on system events, including application events, so that there are no sudden surprises and all avenues of fault are watched.
    - Availability and performance of the load balancer: Ensuring the basic availability and responsiveness of the load balancer.
    - System resources on the load balancer: Monitoring the critical resources on the load balancer to identify problem indicators well in advance.
  • 6. - Cluster Groups (Business Views): A holistic view of the nodes in a cluster, with the ability to drill down to the root cause. Visualizing a cluster this way helps in understanding its health at a glance.

    Summary

    ManageEngine OpManager is network monitoring software that monitors all the resources on your LAN and WAN. Its performance and fault management capabilities help identify performance bottlenecks quickly. Its ability to drill down to the root cause of a fault, together with its extensive customization options, makes OpManager a preferred solution among thousands of network administrators worldwide. Useful plug-ins and add-ons such as the NCM Plug-in, NetFlow Plug-in, and VoIP add-on, along with easy integration with other applications in the management suite such as ServiceDesk Plus, make it a one-stop shop for all your network and IT management needs. Visit www.opmanager.com to try a free 30-day trial of ManageEngine OpManager.
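Several of the key resources listed above (node availability, response time, and an at-a-glance cluster view) can be illustrated with a short sketch. This is a generic example under stated assumptions, not OpManager's implementation: it treats a successful TCP connection as "available", and the function names `probe`, `summarize`, and `cluster_health`, as well as the status labels, are hypothetical.

```python
import socket
import time
from statistics import mean


def probe(host, port, timeout=2.0):
    """Return the TCP connect time in milliseconds, or None if unreachable.

    Assumption: a completed TCP handshake counts as the node being available.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return None


def summarize(samples):
    """Turn a list of probe results into (availability %, average response ms)."""
    successful = [s for s in samples if s is not None]
    availability = 100.0 * len(successful) / len(samples) if samples else 0.0
    average_ms = mean(successful) if successful else None
    return availability, average_ms


def cluster_health(node_is_up):
    """Roll per-node status (name -> bool) up into one label for the cluster."""
    up_count = sum(1 for is_up in node_is_up.values() if is_up)
    if up_count == len(node_is_up) and up_count > 0:
        return "clear"
    if up_count > 0:
        return "degraded"  # service still runs, but redundancy is reduced
    return "down"


if __name__ == "__main__":
    # Hypothetical node addresses; replace with real cluster members.
    nodes = {"sql-node-a": ("192.0.2.10", 1433), "sql-node-b": ("192.0.2.11", 1433)}
    status = {name: probe(host, port) is not None for name, (host, port) in nodes.items()}
    print(status, cluster_health(status))
```

The "degraded" state is the one worth alerting on early: users notice nothing yet, but the cluster is one failure away from the outage described in the introduction.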