SlideShare uma empresa Scribd logo
1 de 24
Scaling Nagios At A
Giant Insurance Company
            Daniel Wittenberg

           dwittenberg2008@gmail.com
   https://github.com/dwittenberg2008/nagios
Personal Background

   Certified for HP-UX in mid 90's, then RHCE in '99, and AIX in
   early 2000's.
   Worked on lots of different technologies and solutions
   including HA, SAN/iSCSI, Forensics/Security,
   Backups/Disaster Recovery, Performance Tuning, Capacity
   Planning, Monitoring/Trending, Networking/Protocol Analysis,
   Virtualization and Cloud Computing.
   Consulted and worked in many industries include insurance,
   banking, accounting, construction, embedded hardware
   design, printing/publishing early education, higher education,
   and ISP/hosting providers.




                                2012                                2
Topics

    Hardware
    Operating System
    Nagios Core
    Plugins
    Other Add-ons
    Event Brokers
    Other Software
    Performance Monitoring
    General

                        2012   3
Overview




           2012   4
Highest Counts Seen




                      2012   5
Hardware

  Hardware vs VMware
             High forking rate not good fit for VMware (livecheck/4.0)
  CPU Requirements
             Quantity vs Quality
  Memory
             Typically memory efficient, but have enough for ramdisk(s)
             Affected by your plugins if using active checks
  Disk I/O
             Faster the better!




                                    2012                                  6
VMware Performance Comparison
                                   Isolated VMWare ESX 4
Procs   Memory   Hosts    Avg Svc Lat   Avg CPU Util   Avg CPU Load    # Act Checks   # Pass Checks

 4      8 GB     1000        182            65              5            12084            6042
 8      8 GB     1000        162            47              4            12084            6042
 8      8 GB     600          87            60              8             7272            3636


                         Physical Dell PowerEdge R710 (new)
Procs   Memory   Hosts    Avg Svc Lat   Avg CPU Util   Avg CPU Load   # Act Checks    # Pass Checks

  4     16 GB    1000        0.19           10            1.25          12084            6042
  8     8 GB     1000        0.38           25            1.15          12084            6042

                 Physical HP Proliant DL380 G4 (~ 8 years old)
Procs   Memory   Hosts    Avg Svc Lat   Avg CPU Util   Avg CPU Load   # Act Checks    # Pass Checks

  4     4 GB     800         0.29           32            1.95           9684            4842
  8     4 GB     1000        0.47           37            4.43          12084            6042


                                                  2012                                                7
Operating System

     CentOS / RHEL 6.3
     Strip down the running services
     Create ramdisk in Nagios RC script
     - first one for status.dat, checkresults, temp_file
     - nagios.rc on github for full rc script – will be default in 4.0

 ramdisk=`mount |grep "/var/nagios/ramcache"`
 if [ "$ramdisk"X == "X" ]; then
    mkdir -p -m 755 /var/nagios/ramcache
    mount -t tmpfs -o size=128m tmpfs /var/nagios/ramcache
    mkdir -p -m 755 /var/nagios/ramcache/checkresults
    chown -R nagios:nagios /var/nagios/ramcache
 fi




                                                2012                     8
Operating System



   Make sure no ulimit restrictions
   ulimit -a

   Renice daemons and services
   daemon -15 --user=$user $exec -ud $config

   perfdata_file_run_cmd =/bin/nice -n 20 /usr/libexec/pnp4nagios/process_perfdata.pl

   puppet runs also re-niced (/etc/sysconfig/puppet – NICELEVEL=19)



   Watch your other running services and cron jobs interactively for awhile to see
   what spikes, you might be surprised!




                                       2012                                             9
Nagios Core

   Currently using Nagios 3.4.1 / 4.0
          Stock with the exception of custom rc script in 3.4.1
   Large Scale Suggestions Doc
          Pre-caching objects
          Re-write RC script to optimize restart time (use -vx)
          Don't allow restart/stop if config broken
          Limit use of macros (resources.cfg)




                                  2012                            10
Nagios Core

  Remove use of CGI's, disable in Apache
           Using Livestatus/Multisite/livestatus-slave
  Limit use of OS backups (crazy huh?)
  Keep logging level low in all core and plugins/brokers
  Keep comments limited, delete if X # or Y days old
  status_update_interval=20 (default is 10)
  (how often to update the status.dat in seconds)

  enable_environment_macros=0 (default is 1)
  (pass macros as ENV variables)




                                          2012             11
Plugins

   check_nrpe
   check_logfiles
   check_hpasm / check_dell_sensors / check_dell_omreport
   check_oracle_health – check_mysql_health
   check_ps.sh (re-written for perf data, correct calculations)
   nagios_auto_service
   Return perf data whenever possible
   Many other custom and one-up plugins




                                2012                              12
Other Add-ons
   NRPE
          Patched to allow large buffer size (20480 bytes)

   NSCA – (NRDP future ?)
          Patched MAX_PLUGINOUTPUT_LENGTH to 4096
          max_packet_age=60, forward and back time patch
          Run from xinetd to allow larger/faster connections/hang protection
          MUST use instances = UNLIMITED
          Recommend per_source = UNLIMITED
          Recommend cps = 5000 3

   NSClient++/NSCP
          Many updates for buffering, data truncation, queueing

   PNP4Nagios
   rrdcached

                                      2012                                     13
Event Brokers
   DNX
   Mod-Gearman
   MK Livestatus
   Performance Data Splunker (custom)
   Log separator (reduces grepping for messages) (custom)




                              2012                          14
Other Software
   Puppet
            Manage entire server, from OS to .cfg

   Splunk
            Log files, performance data, sampled from servers (25GB/day+)

   Cacti
            Nagiostats template, updated to use livestatus instead of CGI

   Custom Control Panel
            Build host groups based on templates, auto-config based on host info

   ConSol Labs
            check_logfiles, check_hpasm, mod_gearman, check_mysql_health,
               check_oracle_health




                                         2012                                      15
Performance Monitoring


   How to watch your system to determine bottlenecks
   vmstat
   iostat
   top
   iptraf
   sar
   strace
   esxtop (if have to use VM)



                           2012                        16
General Configs

   Host config files are standalone configurations that tell everything about a host.
   Hosts are tied to a hostgroup
   Hostgroups are tied to a servicegroup
   Services are tied to a servicegroup
   host.cfg → hostgroups → service ← servicegroups
   This allows for easy drop-in and removal of hosts, but also requires at least 1 host be
   assigned to a management server
   Limitations – harder to make per-server per-service customizations
   Hosts are built/assigned from control panel (round-robin distribution)
   Parents built automatically from topology database, updated nightly, ESX hourly
   Parent's only ping once a day unless there are problems, uses fping
   Some alerts do trigger eventhandlers – automate fixes as much as possible



                                           2012                                              17
Example Template Config




                     2012   18
General Configs

   Types of things being monitored:
   cpu load, cpu stats (idle/wait/user/system), disk space, log files/Event Log, hardware,
   processes, swap, memory usage, service ports, NTP drift, cron job completion, UPS
   Nagios configtest, livestatus connectivity
   PNP4Nagios/check_results directory size (keeping up on processing)
   Performance (cpu/memory) usage on certain processes
   Puppet update time to make sure doesn't get behind
   DB Response times and health (oracle/mysql/postgresql)
   Apache Stats
   Custom app status (user accounts, response times, loads, etc.)
   Various SNMP/WMI values (most network related stats)
   ActiveMQ/Mule ESB



                                           2012                                              19
Links – where to find this stuff
   My Stuff – https://github.com/dwittenberg2008/nagios
   MK Livestatus - http://mathias-kettner.de/checkmk_livestatus.html
   LivestatusSlave - http://nagios.larsmichelsen.com/livestatusslave/
   PNP4Nagios - http://docs.pnp4nagios.org/pnp-0.6/start
   ConSol Labs - http://labs.consol.de/
   Puppet - http://puppetlabs.com/
   Cacti Template (Base) - http://forums.cacti.net/about33806.html




                                          2012                          20
Future ?




   Nagios 4.0 will save the world!




                 2012                21
Nagios 4.0 Initial Specs

  Memory usage wasn't too good during initial testing....




                                2012                        22
Nagios 3.4.1 vs 4.0 -v Times




Final Numbers: 1,423,345 Services - 36,254 hosts – 255,108 service dependencies
NEVER would have done a complete -v, now completes in 1:51:00 !!!


                                          2012                                    23
Questions ?

Suggestions ?

Mais conteúdo relacionado

Mais procurados

Transforming the Ceph Integration Tests with OpenStack
Transforming the Ceph Integration Tests with OpenStack Transforming the Ceph Integration Tests with OpenStack
Transforming the Ceph Integration Tests with OpenStack Ceph Community
 
Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Community
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems Baruch Osoveskiy
 
Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016StackIQ
 
Salesforce at Stacki Atlanta Meetup February 2016
Salesforce at Stacki Atlanta Meetup February 2016Salesforce at Stacki Atlanta Meetup February 2016
Salesforce at Stacki Atlanta Meetup February 2016StackIQ
 
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph clusterCeph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph clusterCeph Community
 
Enterprise manager cloud control 12c(12.1) &agent安装图文指南
Enterprise manager cloud control 12c(12.1) &agent安装图文指南Enterprise manager cloud control 12c(12.1) &agent安装图文指南
Enterprise manager cloud control 12c(12.1) &agent安装图文指南maclean liu
 
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-AsibleTommy Lee
 
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Danielle Womboldt
 
Automated Out-of-Band management with Ansible and Redfish
Automated Out-of-Band management with Ansible and RedfishAutomated Out-of-Band management with Ansible and Redfish
Automated Out-of-Band management with Ansible and RedfishJose De La Rosa
 
Building a Two Node SLES 11 SP2 Linux Cluster with VMware
Building a Two Node SLES 11 SP2 Linux Cluster with VMwareBuilding a Two Node SLES 11 SP2 Linux Cluster with VMware
Building a Two Node SLES 11 SP2 Linux Cluster with VMwaregeekswing
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph Community
 
제3회난공불락 오픈소스 인프라세미나 - lustre
제3회난공불락 오픈소스 인프라세미나 - lustre제3회난공불락 오픈소스 인프라세미나 - lustre
제3회난공불락 오픈소스 인프라세미나 - lustreTommy Lee
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Denish Patel
 
NoSQL атакует: JSON функции в MySQL сервере.
NoSQL атакует: JSON функции в MySQL сервере.NoSQL атакует: JSON функции в MySQL сервере.
NoSQL атакует: JSON функции в MySQL сервере.Sveta Smirnova
 
Ceph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephCeph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephDanielle Womboldt
 
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库maclean liu
 
Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响maclean liu
 
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...Severalnines
 
High Availability in 37 Easy Steps
High Availability in 37 Easy StepsHigh Availability in 37 Easy Steps
High Availability in 37 Easy StepsTim Serong
 

Mais procurados (20)

Transforming the Ceph Integration Tests with OpenStack
Transforming the Ceph Integration Tests with OpenStack Transforming the Ceph Integration Tests with OpenStack
Transforming the Ceph Integration Tests with OpenStack
 
Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems
 
Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016
 
Salesforce at Stacki Atlanta Meetup February 2016
Salesforce at Stacki Atlanta Meetup February 2016Salesforce at Stacki Atlanta Meetup February 2016
Salesforce at Stacki Atlanta Meetup February 2016
 
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph clusterCeph Day KL - Delivering cost-effective, high performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
 
Enterprise manager cloud control 12c(12.1) &agent安装图文指南
Enterprise manager cloud control 12c(12.1) &agent安装图文指南Enterprise manager cloud control 12c(12.1) &agent安装图文指南
Enterprise manager cloud control 12c(12.1) &agent安装图文指南
 
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible
제4회 한국IBM과 함께하는 난공불락 오픈소스 인프라 세미나-Asible
 
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and...
 
Automated Out-of-Band management with Ansible and Redfish
Automated Out-of-Band management with Ansible and RedfishAutomated Out-of-Band management with Ansible and Redfish
Automated Out-of-Band management with Ansible and Redfish
 
Building a Two Node SLES 11 SP2 Linux Cluster with VMware
Building a Two Node SLES 11 SP2 Linux Cluster with VMwareBuilding a Two Node SLES 11 SP2 Linux Cluster with VMware
Building a Two Node SLES 11 SP2 Linux Cluster with VMware
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-Gene
 
제3회난공불락 오픈소스 인프라세미나 - lustre
제3회난공불락 오픈소스 인프라세미나 - lustre제3회난공불락 오픈소스 인프라세미나 - lustre
제3회난공불락 오픈소스 인프라세미나 - lustre
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)
 
NoSQL атакует: JSON функции в MySQL сервере.
NoSQL атакует: JSON функции в MySQL сервере.NoSQL атакует: JSON функции в MySQL сервере.
NoSQL атакует: JSON функции в MySQL сервере.
 
Ceph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephCeph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and Ceph
 
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库
图文详解安装Net backup 6.5备份恢复oracle 10g rac 数据库
 
Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响
 
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...
Webinar slides: The Holy Grail Webinar: Become a MySQL DBA - Database Perform...
 
High Availability in 37 Easy Steps
High Availability in 37 Easy StepsHigh Availability in 37 Easy Steps
High Availability in 37 Easy Steps
 

Semelhante a Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at Fortune 50 Company

Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...Nagios
 
Experience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewExperience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewPhuwadon D
 
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDSAccelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDSCeph Community
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Community
 
Ceph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in CephCeph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in CephCeph Community
 
Introduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System AdministratorsIntroduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System AdministratorsJignesh Shah
 
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』Insight Technology, Inc.
 
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...Equnix Business Solutions
 
Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Intel® Software
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Community
 
Achieving maximum performance in microsoft vdi environments - Jeff Stokes
Achieving maximum performance in microsoft vdi environments - Jeff StokesAchieving maximum performance in microsoft vdi environments - Jeff Stokes
Achieving maximum performance in microsoft vdi environments - Jeff StokesJeff Stokes
 
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1Nuno Alves
 
Testing Delphix: easy data virtualization
Testing Delphix: easy data virtualizationTesting Delphix: easy data virtualization
Testing Delphix: easy data virtualizationFranck Pachot
 
Introducing resinOS: An Operating System Tailored for Containers and Built fo...
Introducing resinOS: An Operating System Tailored for Containers and Built fo...Introducing resinOS: An Operating System Tailored for Containers and Built fo...
Introducing resinOS: An Operating System Tailored for Containers and Built fo...Balena
 
Performance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBPerformance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBSeveralnines
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data DeduplicationRedWireServices
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화OpenStack Korea Community
 
Planning for-high-performance-web-application
Planning for-high-performance-web-applicationPlanning for-high-performance-web-application
Planning for-high-performance-web-applicationNguyễn Duy Nhân
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceAshok Modi
 
Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Shay Hassidim
 

Semelhante a Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at Fortune 50 Company (20)

Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
 
Experience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewExperience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's View
 
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDSAccelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS
Accelerating Cassandra Workloads on Ceph with All-Flash PCIE SSDS
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK
 
Ceph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in CephCeph Day Beijing - SPDK in Ceph
Ceph Day Beijing - SPDK in Ceph
 
Introduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System AdministratorsIntroduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System Administrators
 
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
 
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
 
Persistent Memory Programming with Java*
Persistent Memory Programming with Java*Persistent Memory Programming with Java*
Persistent Memory Programming with Java*
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
 
Achieving maximum performance in microsoft vdi environments - Jeff Stokes
Achieving maximum performance in microsoft vdi environments - Jeff StokesAchieving maximum performance in microsoft vdi environments - Jeff Stokes
Achieving maximum performance in microsoft vdi environments - Jeff Stokes
 
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1
Wp intelli cache_reduction_iops_xd5.6_fp1_xs6.1
 
Testing Delphix: easy data virtualization
Testing Delphix: easy data virtualizationTesting Delphix: easy data virtualization
Testing Delphix: easy data virtualization
 
Introducing resinOS: An Operating System Tailored for Containers and Built fo...
Introducing resinOS: An Operating System Tailored for Containers and Built fo...Introducing resinOS: An Operating System Tailored for Containers and Built fo...
Introducing resinOS: An Operating System Tailored for Containers and Built fo...
 
Performance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBPerformance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDB
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data Deduplication
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
 
Planning for-high-performance-web-application
Planning for-high-performance-web-applicationPlanning for-high-performance-web-application
Planning for-high-performance-web-application
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performance
 
Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014
 

Mais de Nagios

Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best PracticesNagios
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewNagios
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The HoodNagios
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsNagios
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionNagios
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsNagios
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceNagios
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksNagios
 
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationMike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationNagios
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Nagios
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosNagios
 
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Nagios
 
Eric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosEric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosNagios
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Nagios
 
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Nagios
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios
 
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNagios
 
Nagios Log Server - Features
Nagios Log Server - FeaturesNagios Log Server - Features
Nagios Log Server - FeaturesNagios
 
Nagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios
 
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios
 

Mais de Nagios (20)

Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best Practices
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture Overview
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The Hood
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient Notifications
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios Plugins
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical Experience
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service Checks
 
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationMike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With Nagios
 
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
 
Eric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosEric Loyd - Fractal Nagios
Eric Loyd - Fractal Nagios
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
 
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson Opening
 
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
 
Nagios Log Server - Features
Nagios Log Server - FeaturesNagios Log Server - Features
Nagios Log Server - Features
 
Nagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios Network Analyzer - Features
Nagios Network Analyzer - Features
 
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at Fortune 50 Company

  • 1. Scaling Nagios At A Giant Insurance Company Daniel Wittenberg dwittenberg2008@gmail.com https://github.com/dwittenberg2008/nagios
  • 2. Personal Background Certified for HP-UX in mid 90's, then RHCE in '99, and AIX in early 2000's. Worked on lots of different technologies and solutions including HA, SAN/iSCSI, Forensics/Security, Backups/Disaster Recovery, Performance Tuning, Capacity Planning, Monitoring/Trending, Networking/Protocol Analysis, Virtualization and Cloud Computing. Consulted and worked in many industries include insurance, banking, accounting, construction, embedded hardware design, printing/publishing early education, higher education, and ISP/hosting providers. 2012 2
  • 3. Topics Hardware Operating System Nagios Core Plugins Other Add-ons Event Brokers Other Software Performance Monitoring General 2012 3
  • 4. Overview 2012 4
  • 6. Hardware Hardware vs VMware High forking rate not good fit for VMware (livecheck/4.0) CPU Requirements Quantity vs Quality Memory Typically memory efficient, but have enough for ramdisk(s) Affected by your plugins if using active checks Disk I/O Faster the better! 2012 6
  • 7. VMware Performance Comparison Isolated VMWare ESX 4 Procs Memory Hosts Avg Svc Lat Avg CPU Util Avg CPU Load # Act Checks # Pass Checks 4 8 GB 1000 182 65 5 12084 6042 8 8 GB 1000 162 47 4 12084 6042 8 8 GB 600 87 60 8 7272 3636 Physical Dell PowerEdge R710 (new) Procs Memory Hosts Avg Svc Lat Avg CPU Util Avg CPU Load # Act Checks # Pass Checks 4 16 GB 1000 0.19 10 1.25 12084 6042 8 8 GB 1000 0.38 25 1.15 12084 6042 Physical HP Proliant DL380 G4 (~ 8 years old) Procs Memory Hosts Avg Svc Lat Avg CPU Util Avg CPU Load # Act Checks # Pass Checks 4 4 GB 800 0.29 32 1.95 9684 4842 8 4 GB 1000 0.47 37 4.43 12084 6042 2012 7
  • 8. Operating System CentOS / RHEL 6.3 Strip down the running services Create ramdisk in Nagios RC script - first one for status.dat, checkresults, temp_file - nagios.rc on github for full rc script – will be default in 4.0 ramdisk=`mount |grep "/var/nagios/ramcache"` if [ "$ramdisk"X == "X" ]; then mkdir -p -m 755 /var/nagios/ramcache mount -t tmpfs -o size=128m tmpfs /var/nagios/ramcache mkdir -p -m 755 /var/nagios/ramcache/checkresults chown -R nagios:nagios /var/nagios/ramcache fi 2012 8
  • 9. Operating System Make sure no ulimit restrictions ulimit -a Renice daemons and services daemon -15 --user=$user $exec -ud $config perfdata_file_run_cmd =/bin/nice -n 20 /usr/libexec/pnp4nagios/process_perfdata.pl puppet runs also re-niced (/etc/sysconfig/puppet – NICELEVEL=19) Watch your other running services and cron jobs interactively for awhile to see what spikes, you might be surprised! 2012 9
  • 10. Nagios Core Currently using Nagios 3.4.1 / 4.0 Stock with the exception of custom rc script in 3.4.1 Large Scale Suggestions Doc Pre-caching objects Re-write RC script to optimize restart time (use -vx) Don't allow restart/stop if config broken Limit use of macros (resources.cfg) 2012 10
  • 11. Nagios Core Remove use of CGI's, disable in Apache Using Livestatus/Multisite/livestatus-slave Limit use of OS backups (crazy huh?) Keep logging level low in all core and plugins/brokers Keep comments limited, delete if X # or Y days old status_update_interval=20 (default is 10) (how often to update the status.dat in seconds) enable_environment_macros=0 (default is 1) (pass macros as ENV variables) 2012 11
  • 12. Plugins check_nrpe check_logfiles check_hpasm / check_dell_sensors / check_dell_omreport check_oracle_health – check_mysql_health check_ps.sh (re-written for perf data, correct calculations) nagios_auto_service Return perf data whenever possible Many other custom and one-up plugins 2012 12
  • 13. Other Add-ons NRPE Patched to allow large buffer size (20480 bytes) NSCA – (NRDP future ?) Patched MAX_PLUGINOUTPUT_LENGTH to 4096 max_packet_age=60, forward and back time patch Run from xinetd to allow larger/faster connections/hang protection MUST use instances = UNLIMITED Recommend per_source = UNLIMITED Recommend cps = 5000 3 NSClient++/NSCP Many updates for buffering, data truncation, queueing PNP4Nagios rrdcached 2012 13
  • 14. Event Brokers DNX Mod-Gearman MK Livestatus Performance Data Splunker (custom) Log separator (reduces grepping for messages) (custom) 2012 14
  • 15. Other Software Puppet Manage entire server, from OS to .cfg Splunk Log files, performance data, sampled from servers (25GB/day+) Cacti Nagiostats template, updated to use livestatus instead of CGI Custom Control Panel Build host groups based on templates, auto-config based on host info ConSol Labs check_logfiles, check_hpasm, mod_gearman, check_mysql_health, check_oracle_health 2012 15
  • 16. Performance Monitoring How to watch your system to determine bottlenecks vmstat iostat top iptraf sar strace esxtop (if have to use VM) 2012 16
  • 17. General Configs Host config files are standalone configurations that tell everything about a host. Hosts are tied to a hostgroup Hostgroups are tied to a servicegroup Services are tied to a servicegroup host.cfg → hostgroups → service ← servicegroups This allows for easy drop-in and removal of hosts, but also requires at least 1 host be assigned to a management server Limitations – harder to make per-server per-service customizations Hosts are built/assigned from control panel (round-robin distribution) Parents built automatically from topology database, updated nightly, ESX hourly Parent's only ping once a day unless there are problems, uses fping Some alerts do trigger eventhandlers – automate fixes as much as possible 2012 17
  • 19. General Configs Types of things being monitored: cpu load, cpu stats (idle/wait/user/system), disk space, log files/Event Log, hardware, processes, swap, memory usage, service ports, NTP drift, cron job completion, UPS Nagios configtest, livestatus connectivity PNP4Nagios/check_results directory size (keeping up on processing) Performance (cpu/memory) usage on certain processes Puppet update time to make sure doesn't get behind DB Response times and health (oracle/mysql/postgresql) Apache Stats Custom app status (user accounts, response times, loads, etc.) Various SNMP/WMI values (most network related stats) ActiveMQ/Mule ESB 2012 19
  • 20. Links – where to find this stuff My Stuff – https://github.com/dwittenberg2008/nagios MK Livestatus - http://mathias-kettner.de/checkmk_livestatus.html LivestatusSlave - http://nagios.larsmichelsen.com/livestatusslave/ PNP4Nagios - http://docs.pnp4nagios.org/pnp-0.6/start ConSol Labs - http://labs.consol.de/ Puppet - http://puppetlabs.com/ Cacti Template (Base) - http://forums.cacti.net/about33806.html 2012 20
  • 21. Future ? Nagios 4.0 will save the world! 2012 21
  • 22. Nagios 4.0 Initial Specs Memory usage wasn't too good during initial testing.... 2012 22
  • 23. Nagios 3.4.1 vs 4.0 -v Times Final Numbers: 1,423,345 Services - 36,254 hosts – 255,108 service dependencies NEVER would have done a complete -v, now completes in 1:51:00 !!! 2012 23