NonStop monitoring and automation
Wolfgang Breidbach

Page 1 | 29.01.2014 | Bank-Verlag GmbH
Bank-Verlag
■ Founded in 1961 as the publishing house of the magazine „Die Bank“
■ In 1986 the first authorisation centre in Germany for ATM transactions was founded at Bank-Verlag, running on IBM Systems /1 and /370
■ In 1988 authorisation was migrated to Tandem, creating the first active-active application
■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000 and S72000, and finally to S86000
■ In 2005 we moved to Integrity NonStop
■ In 2010 the secondary datacentre was moved to a new location
■ In 2012 we migrated our production systems to NonStop blades
■ Today we are the IT service provider for the private banks in Germany
The start
■ Bank-Verlag was using a commercial monitoring tool
■ Management decided to replace that tool with open-source Nagios for all Windows, Unix and Linux systems
■ Nagios should be used for NonStop systems as well

■ Problem: No open-source monitoring tool for NonStop was available that fulfilled our needs
■ Decision: We will have to create something ourselves!

Some basic decisions
■ The main purpose is monitoring our NonStop systems
■ Feeding Nagios with information should be a result of that
■ The open-source world changes quickly, so we should be able to support any other tool with few changes

■ The NonStop monitoring should not depend on any external tool
■ The messages should not require in-depth NonStop knowledge
■ Avoid manual configuration wherever possible

Our approach
■ We have a bunch of „subsystems“ like CPU, Pathway, Lines, NetBatch and so on
■ Every subsystem has its own monitoring module
■ Every module collects all available configuration information automatically, for example:
■ The NetBatch module collects all information concerning NetBatch jobs and calendars
■ The Line module collects all lines
■ Some modules need additional configuration data:
■ File module needs the filesets to check
■ EMS module needs the messages to look for

Our approach
■ Every module has a „refresh configuration“ function
■ Every module is configurable with parameters, every parameter has a default
■ If an event is found that can be handled by the toolbox, it should be handled by the toolbox:
■ File is getting full => perform a reload or increase maxextents
■ A static Pathway server is down => issue a START command
■ A process is consuming too many CPU cycles => reduce priority
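The event-to-action rule above can be sketched as a small dispatch table. This is only an illustration in Python, not the toolbox's actual implementation: the event names (`FILE_FULL`, `SERVER_DOWN`, `CPU_HOG`), the `Event` type and the handler functions are hypothetical, and the handlers merely describe the action instead of issuing real NonStop commands.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical event type, e.g. "FILE_FULL"
    target: str  # object the event refers to (file, server, process)


# Each handler only describes the action; a real module would execute it.
def handle_file_full(event: Event) -> str:
    return f"reload or increase maxextents for {event.target}"


def handle_server_down(event: Event) -> str:
    return f"issue START for Pathway server {event.target}"


def handle_cpu_hog(event: Event) -> str:
    return f"reduce priority of process {event.target}"


# Hypothetical mapping of event types to automatic actions.
HANDLERS: Dict[str, Callable[[Event], str]] = {
    "FILE_FULL": handle_file_full,
    "SERVER_DOWN": handle_server_down,
    "CPU_HOG": handle_cpu_hog,
}


def dispatch(event: Event) -> Optional[str]:
    """Return the automatic action for an event, or None if no handler exists."""
    handler = HANDLERS.get(event.kind)
    return handler(event) if handler else None
```

An event without a registered handler returns `None`, which in this sketch stands for "report a message only, take no automatic action".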

Our approach
■ Another goal was avoiding manual tasks we do not like
■ Regular reloads
■ Checking Backups
■ Checking database contents
■ Collect statistical data
■ Line usage
■ File sizes
■ CPU usage
■ TMF rate
■ Create documentation about the configuration of the system
Our approach

■ We want to make information available to people not familiar with NonStop systems

■ The X.25 line with the calling address 12345678 is connected to the SWAN box with the „S77“ sticker on Clip 1 line 0
■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in slot 2.4 on „D“ and the port has the MAC address 08.00.12.34.56
■ This information should be held in the database, accessible and usable without any detailed NonStop knowledge
■ Reports of installed hardware should be understandable without knowledge of HP product numbers

The Start
■ First subsystem was „CPU and processes“
■ Development based on some already available programs
■ The CPU and process monitoring program should not write any disk files
■ Create the tools to maintain the appropriate tables including the long-term data collection
■ Create a central message collector reading the tables and formatting the messages
■ Continue with the other subsystems

The next steps
■ Decision to build the software like a product
■ Great advantages when distributing the software to our 4 (at the moment 6) systems
■ Design of a central message handling program
■ Avoid any hard-coded messages
■ A side-effect: The toolbox supports multiple languages
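Why avoiding hard-coded messages leads to multi-language support can be seen in a small sketch: if every message is a template looked up by message ID and language, adding a language only means adding template rows. The Python below is an illustration only. The in-memory `TEMPLATES` dictionary stands in for the toolbox's message template tables, and the message ID, placeholder names and texts are hypothetical.

```python
# Hypothetical stand-in for the message template table:
# (message id, language) -> template text with named placeholders.
TEMPLATES = {
    ("FILE_FULL", "en"): "File {name} is {percent}% full",
    ("FILE_FULL", "de"): "Datei {name} ist zu {percent}% gefüllt",
}


def format_message(message_id, language, default_language="en", **values):
    """Format a message in the requested language, falling back to the default."""
    template = TEMPLATES.get((message_id, language))
    if template is None:
        # No template for this language: use the default language instead.
        template = TEMPLATES[(message_id, default_language)]
    return template.format(**values)
```

The monitoring modules would then only emit a message ID plus values, and the central message handling decides on the wording.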

Available subsystems
■ CPU and processes (incl. automatic restart of processes *)
■ Lines
■ Pathway
■ Files incl. automatic reload *
■ TMF
■ RDF
■ NetBatch
■ Devices
■ TCP/IP
■ Spooler
■ EMS messages *
■ Message collector
■ Backups *
* = configuration required
[Architecture diagram showing: CPU and process monitoring, restart monitor, subsystem modules, database interface, configuration tables, message templates, event tables, message collector, message table, TCP/IP interface]

Some additional information
■ The original monitoring toolbox is based on SQL tables
■ An Enscribe version is in progress
■ The toolbox does not depend on Measure; Measure is used only to find the originator of a heavy disk load
■ The toolbox causes very little CPU load
■ The collected statistical data allows many reports to be built with standard tools such as Excel

Advantages
■ Keep track of hardware changes like exchange of disks
■ No need for additional software like Measure
■ Software is running „out of the box“ without a need for additional configuration
■ Lots of parameters and table entries for configuration available
■ The software supports multiple languages; at the moment the messages are available in German and English
■ Bank-Verlag is not a vendor but a user; we use the software ourselves
■ Very limited commercial interest in selling the software

Advantages during daily life
■ Reloads are carried out automatically if needed
■ Processes causing heavy disk load are found (Measure required!)
■ The priority of processes using too many CPU cycles can be automatically reduced
■ Pathway servers can be automatically restarted
■ Missing processes can be restarted automatically
■ Existence of required processes can be checked

■ The whole system including all the applications can be started this way!

Advantages during daily life
■ Batch jobs and calendars are checked periodically
■ If a calendar is expiring, a message is issued a few days before expiration
■ The outcome of all backup jobs is checked
■ Disk problems are checked periodically, including
■ Number of ZZSA files
■ Status of OSS filesets

Advantages during daily life
■ Files matching predefined filesets are checked for running full
■ If a file is too full, it is automatically checked for a possible reload, or its maxextents are increased
■ All configured files are periodically reloaded if necessary
■ Whether a reload is necessary is decided based on slack and fragmentation
■ All needed parameters can be defined globally, for a fileset or even for a single file
■ The need for manual reloads has been reduced to zero
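The layered parameters (global, per fileset, per file) and the reload decision can be sketched as follows. This Python fragment is an assumption-laden illustration: the parameter names, the threshold values, the file names and the rule that either limit alone triggers a reload are all hypothetical. The slides only state that slack and fragmentation drive the decision.

```python
from typing import Dict, Optional

# Hypothetical parameter names and values; the most specific level wins.
GLOBAL_DEFAULTS = {"max_slack_percent": 30, "max_fragmentation_percent": 20}
FILESET_PARAMS = {"$DATA.APP.*": {"max_slack_percent": 25}}
FILE_PARAMS = {"$DATA.APP.JOURNAL": {"max_fragmentation_percent": 10}}


def resolve_params(file_name: str, fileset: Optional[str]) -> Dict[str, int]:
    """Merge parameters: global defaults, then fileset overrides, then file overrides."""
    params = dict(GLOBAL_DEFAULTS)
    if fileset is not None:
        params.update(FILESET_PARAMS.get(fileset, {}))
    params.update(FILE_PARAMS.get(file_name, {}))
    return params


def needs_reload(slack_percent: float, fragmentation_percent: float,
                 params: Dict[str, int]) -> bool:
    """Assumed rule: reload if either slack or fragmentation exceeds its limit."""
    return (slack_percent > params["max_slack_percent"]
            or fragmentation_percent > params["max_fragmentation_percent"])
```

With these sample tables, `$DATA.APP.JOURNAL` inherits the fileset's slack limit of 25 and its own fragmentation limit of 10.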

Interesting problems
■ The status of TCP/IP connections can be checked
■ Example: you need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88 port 1234
■ If at least one of these connections is down, a message is created
■ The cause of that might be an erroneously changed firewall configuration
■ The same feature has been implemented for X.25 connections
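The connection check described above amounts to counting established connections that match a configured requirement. The following Python sketch shows the idea under stated assumptions: the `Connection` and `Requirement` types and the message text are hypothetical, and the connection list is passed in as plain data, not read from the NonStop TCP/IP subsystem.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Connection:
    local_addr: str
    remote_addr: str
    remote_port: int
    state: str  # e.g. "ESTABLISHED"


@dataclass(frozen=True)
class Requirement:
    local_addr: str
    remote_addr: str
    remote_port: int
    required: int  # number of established connections expected


def check(req: Requirement, connections: Iterable[Connection]) -> Optional[str]:
    """Return a message text if fewer connections than required are established."""
    established = sum(
        1
        for c in connections
        if c.state == "ESTABLISHED"
        and (c.local_addr, c.remote_addr, c.remote_port)
        == (req.local_addr, req.remote_addr, req.remote_port)
    )
    if established < req.required:
        return (f"Only {established} of {req.required} connections from "
                f"{req.local_addr} to {req.remote_addr} port {req.remote_port} "
                f"are established")
    return None
```

Using the addresses from the slide, a requirement of `Requirement("192.168.77.77", "192.168.88.88", 1234, 2)` yields a message as soon as one of the two connections is missing.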

A real life case concerning TCP/IP
■ Our NonStop is accessing another server through a firewall
■ There have to be 2 established connections on port 4711
■ A rule within the firewall was erroneously changed
■ The NonStop could no longer establish a new connection to the server
■ The already established connections were not affected
■ The real problem appeared weeks later, when one of the connections had to be re-established

■ The monitoring tool found the missing connection immediately

Another problem
■ We have a leased line to another provider
■ The line uses the X.25 protocol
■ During peak hours we had some problems on the line
■ Using the statistical data we found out that the capacity of the line was exceeded
■ Increasing the speed immediately solved all problems

Security issues
■ Safeguard reports erroneous logons
■ Safeguard does not report the external origin of such a logon, such as the IP address
■ We read the Safeguard log and add that information
■ So the question „Where did the Administrator logon to the NonStop come from?“ can be answered by a look at our table

Application monitoring
■ There are 2 kinds of application monitoring:
■ Checking database contents
■ Checking application messages
■ The database contents are checked using SQL statements of the type „SELECT COUNT(*) FROM … WHERE … BROWSE ACCESS;“
■ The result is compared against given values and a message is created if necessary
■ The severity of the messages can be set depending on the result like:
■ 1 found => Warning
■ 2 found => Error
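The mapping from a query result to a severity can be sketched as a threshold lookup. The Python below is a minimal illustration, assuming the count has already been obtained from the SQL statement. Treating each threshold as "this many or more" is an assumption, since the slide only lists the exact values 1 and 2.

```python
from typing import Dict, Optional

# Thresholds from the slide: 1 found => Warning, 2 found => Error.
THRESHOLDS: Dict[int, str] = {1: "Warning", 2: "Error"}


def severity_for(count: int, thresholds: Dict[int, str]) -> Optional[str]:
    """Return the severity of the highest threshold reached, or None if none is."""
    reached = [limit for limit in thresholds if count >= limit]
    return thresholds[max(reached)] if reached else None
```

A result of `None` means that no message needs to be created for this check.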

Checking EMS messages
■ Our applications are using EMS collectors to report any errors
■ We are able to check the number of messages per type per time period
■ A sample message would be „Timeout process $ABCD“, where process $ABCD is routing messages to XY-Bank
■ We define messages containing „Timeout“ and „$ABCD“ as „Timeout to XY-BANK“ and count those messages per period
■ A message is created depending on the configured threshold for this type of message
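The classification and counting of EMS messages can be sketched like this. It is an illustrative Python fragment, not the toolbox's code: matching by plain substrings, the `RULES` and `THRESHOLDS` structures and the threshold value 3 are assumptions, and the input is simply the list of message texts seen in one period.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# Hypothetical rule table: all text fragments must occur -> message type.
RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("Timeout", "$ABCD"), "Timeout to XY-BANK"),
]
# Hypothetical threshold: messages of this type per period before reporting.
THRESHOLDS = {"Timeout to XY-BANK": 3}


def classify(text: str) -> Optional[str]:
    """Return the configured message type for an EMS text, or None if no rule matches."""
    for fragments, label in RULES:
        if all(fragment in text for fragment in fragments):
            return label
    return None


def over_threshold(messages: Iterable[str]) -> List[str]:
    """Return the message types whose count in this period reaches the threshold."""
    counts = Counter(label for label in map(classify, messages) if label is not None)
    return [label for label, n in counts.items() if n >= THRESHOLDS.get(label, 1)]
```

Three occurrences of „Timeout process $ABCD“ in one period would report „Timeout to XY-BANK“, while two would not.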

An idea for EMS message handling
■ We are handling authorisation requests for credit and debit cards; most of these requests are sent to the card-issuing banks

■ We are creating minute-based statistics of those requests per issuer
■ If an issuer has problems, we can create a message like „60% of the requests unsuccessful“
■ Now the message handling gets this information and handles it according to the configuration:
■ 1 message within 10 minutes => no need for action
■ 10 messages within 10 minutes => create an alarm
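The "n messages within 10 minutes" rule can be sketched as a sliding time window. The Python class below is only an illustration under assumptions: the class name `RateAlarm` is hypothetical, timestamps are plain seconds, and they are expected to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque


class RateAlarm:
    """Raise an alarm once `threshold` messages fall within `window_seconds`."""

    def __init__(self, threshold: int = 10, window_seconds: float = 600.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Record one message; return True if an alarm should be created."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop messages that are no longer inside the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```

With the defaults, a single message never triggers an alarm, while the tenth message within 600 seconds does.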

Our main Nagios screen for NonStop

Our main Nagios screen for NonStop with error message

Any questions?
Wolfgang Breidbach
Bank-Verlag GmbH
IT-Services
Wendelinstr. 1
50933 Köln
E-Mail: Wolfgang.Breidbach@Bank-Verlag.de
www.Bank-Verlag.de


Mais conteúdo relacionado

Destaque

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007kylebb7
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureNordic Infrastructure Conference
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김 피디
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewardsmatrikrewards
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewardsmatrikrewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaAngela Berardinelli
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewardsmatrikrewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKeelestuudio
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиСергей Вассерман
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesLiz Williams
 

Destaque (20)

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azure
 
Geopolitica stefanelli
Geopolitica stefanelliGeopolitica stefanelli
Geopolitica stefanelli
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
 
Research Into Digipaks
Research Into DigipaksResearch Into Digipaks
Research Into Digipaks
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewards
 
Can i get covered outside of open enrollment
Can i get covered outside of open enrollmentCan i get covered outside of open enrollment
Can i get covered outside of open enrollment
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivselt
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудниками
 
Uk assignments
Uk assignmentsUk assignments
Uk assignments
 
My Music Video Timeline
My Music Video TimelineMy Music Video Timeline
My Music Video Timeline
 
Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory Updates
 
Evaluation Question 6
Evaluation Question 6Evaluation Question 6
Evaluation Question 6
 
Hardware luis suarez 3
Hardware luis suarez 3Hardware luis suarez 3
Hardware luis suarez 3
 
My Life
My Life My Life
My Life
 
Question 2
Question 2Question 2
Question 2
 

Semelhante a Non stop monitoring and automation

Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013MattKilner
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...Daniel Reimann
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostChristoph Adler
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryKaren Broughton-Mabbitt
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Christoph Adler
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++vikram mahendra
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administratornking821
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1SanthiNivas
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14p6academy
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examplesMartinRoss_IBM
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...ICS User Group
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...panagenda
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!Christoph Adler
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceMoises Silva
 

Semelhante a Non stop monitoring and automation (20)

c programming 1-1.pptx
c programming 1-1.pptxc programming 1-1.pptx
c programming 1-1.pptx
 
Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance Boost
 
Apache flink
Apache flinkApache flink
Apache flink
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Mini Project- USB Temperature Logging
Mini Project- USB Temperature LoggingMini Project- USB Temperature Logging
Mini Project- USB Temperature Logging
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administrator
 
Chapter 1 - Prog101.ppt
Chapter 1 - Prog101.pptChapter 1 - Prog101.ppt
Chapter 1 - Prog101.ppt
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examples
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
 
3 types of monitoring for 2020
3 types of monitoring for 20203 types of monitoring for 2020
3 types of monitoring for 2020
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH Performance
 

Último

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business

NonStop monitoring and automation

  • 1. NonStop monitoring and automation
    Wolfgang Breidbach
    Page 1 | 29.01.2014 | Bank-Verlag GmbH
  • 2. Bank-Verlag
    ■ Founded in 1961 as the publishing house of the magazine „Die Bank“.
    ■ Running on IBM Systems /1 and /370, the first authorisation centre in Germany for ATM transactions was founded at Bank-Verlag in 1986.
    ■ In 1988 authorisation was migrated to Tandem, creating the first active-active application.
    ■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000 and S72000, finally reaching the S86000.
    ■ In 2005 we moved to Integrity NonStop.
    ■ In 2010 the secondary datacentre was moved to a new location.
    ■ In 2012 we migrated our production systems to NonStop blades.
    ■ Today we are the IT service provider for the private banks in Germany.
  • 3. The start
    ■ Bank-Verlag was using a commercial monitoring tool
    ■ Management decided to replace that tool with open source Nagios for all Windows, Unix and Linux systems
    ■ Nagios should be used for NonStop systems as well
    ■ Problem: No open source monitoring tool for NonStop was available that fulfilled our needs
    ■ Decision: We will have to create something ourselves!
  • 4. Some basic decisions
    ■ The main purpose is monitoring our NonStop systems
    ■ Feeding Nagios with information should be a result of that
    ■ The open source world is changing quickly, so we should be able to support any other tool with few changes
    ■ The NonStop monitoring should not depend on any external tool
    ■ The messages should not require in-depth NonStop knowledge
    ■ Avoid manual configuration wherever possible
  • 5. Our approach
    ■ We have a number of „subsystems“ like CPU, Pathway, Lines, NetBatch and so on
    ■ Every subsystem has its own monitoring module
    ■ Every module collects all available configuration information automatically, for example:
      ■ The NetBatch module collects all information concerning NetBatch jobs and calendars
      ■ The Line module collects all lines
    ■ Some modules need additional configuration data:
      ■ The File module needs the filesets to check
      ■ The EMS module needs the messages to look for
  • 6. Our approach
    ■ Every module has a „refresh configuration“ function
    ■ Every module is configurable with parameters, and every parameter has a default
    ■ If an event is found that could be handled by the toolbox, it should be handled by the toolbox:
      ■ File is getting full => perform a reload or increase maxextents
      ■ A static Pathway server is down => issue a START command
      ■ A process is consuming too many CPU cycles => reduce priority
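The event-to-action rules on this slide could be sketched as a simple lookup table. This is only an illustration in Python, not the toolbox's actual implementation: the `Event` type, the subsystem and event names, and the action strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    subsystem: str  # hypothetical subsystem name, e.g. "FILE", "PATHWAY", "CPU"
    kind: str       # hypothetical event type
    target: str     # the affected file, server or process


# Hypothetical mapping of (subsystem, event type) to the automatic action
# named on the slide. Anything not listed here is only reported.
ACTIONS = {
    ("FILE", "FILE_FULL"): "RELOAD_OR_INCREASE_MAXEXTENTS",
    ("PATHWAY", "STATIC_SERVER_DOWN"): "START_SERVER",
    ("CPU", "PROCESS_CPU_HOG"): "REDUCE_PRIORITY",
}


def handle(event: Event) -> str:
    """Return the automatic action for an event, or a plain report."""
    action = ACTIONS.get((event.subsystem, event.kind))
    if action is None:
        return f"REPORT {event.kind} {event.target}"
    return f"{action} {event.target}"
```

Keeping the rules in a table rather than in code mirrors the stated goal of avoiding hard-coded behaviour: adding a new automatic reaction would then be a configuration change.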
  • 7. Our approach
    ■ Another goal was avoiding manual tasks we do not like:
      ■ Regular reloads
      ■ Checking backups
      ■ Checking database contents
    ■ Collect statistical data:
      ■ Line usage
      ■ File sizes
      ■ CPU usage
      ■ TMF rate
    ■ Create documentation about the configuration of the system
  • 8. Our approach
    ■ We want to make information available to people not familiar with NonStop systems:
      ■ The X.25 line with the calling address 12345678 is connected to the SWAN box with the „S77“ sticker on Clip 1 line 0
      ■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in slot 2.4 on „D“ and the port has the MAC address 08.00.12.34.56
    ■ This should be database information that is accessible and usable without any detailed NonStop knowledge
    ■ Reports of installed hardware should be understandable without knowledge of HP product numbers
  • 9. The Start
    ■ The first subsystem was „CPU and processes“
    ■ Development was based on some already available programs
    ■ The CPU and process monitoring program should not write any disk files
    ■ Create the tools to maintain the appropriate tables, including the long-term data collection
    ■ Create a central message collector that reads the tables and formats the messages
    ■ Continue with the other subsystems
  • 10. The next steps
    ■ Decision to build the software like a product
    ■ Great advantages when distributing the software on our 4 (at the moment 6) systems
    ■ Design of a central message handling program
    ■ Avoid any hard-coded messages
    ■ A side effect: The toolbox supports multiple languages
  • 11. Available subsystems
    ■ CPU and processes (incl. automatic restart of processes *)
    ■ Lines
    ■ Pathway
    ■ Files incl. automatic reload *
    ■ TMF
    ■ RDF
    ■ NetBatch
    ■ Devices
    ■ TCP/IP
    ■ Spooler
    ■ EMS messages *
    ■ Message collector
    ■ Backups *
    * = configuration required
  • 12. [Architecture diagram. Components shown: CPU and process monitoring, Restart monitor, Subsystem modules, Database interface, Configuration tables, Message templates, Event tables, Message collector, Message table, TCP/IP interface]
  • 13. Some additional information
    ■ The original monitoring toolbox is based on SQL tables
    ■ An Enscribe version is in progress
    ■ The toolbox does not depend on Measure; Measure is only used to find the originator of a heavy disk load
    ■ The toolbox causes very little CPU load
    ■ The collected statistical data allows lots of reports using standard tools like Excel
  • 14. Advantages
    ■ Keep track of hardware changes like the exchange of disks
    ■ No need for additional software like Measure
    ■ Software is running „out of the box“ without a need for additional configuration
    ■ Lots of parameters and table entries for configuration available
    ■ The software supports multiple languages; at the moment the messages are available in German and English
    ■ Bank-Verlag is not a vendor but a user; we are using the software ourselves
    ■ Very limited commercial interest in selling the software
  • 15. Advantages during daily life
    ■ Reloads are carried out automatically if needed
    ■ Processes causing heavy disk load are found (Measure required!)
    ■ The priority of processes using too many CPU cycles can be automatically reduced
    ■ Pathway servers can be automatically restarted
    ■ Missing processes can be restarted automatically
    ■ Existence of required processes can be checked
    ■ The whole system including all the applications can be started this way!
  • 16. Advantages during daily life
    ■ Batch jobs and calendars are checked periodically.
    ■ If a calendar is expiring, a message is issued a few days before expiration
    ■ The outcome of all backup jobs is checked
    ■ Disk problems are checked periodically, including:
      ■ Number of ZZSA files
      ■ Status of OSS filesets
  • 17. Advantages during daily life
    ■ Files matching predefined filesets are checked for files running full
    ■ If a file is too full, it is automatically checked for a possible reload, or the maxextents are increased
    ■ All configured files are periodically reloaded if necessary
    ■ Whether a reload is necessary is decided depending on slack and fragmentation
    ■ All needed parameters can be defined globally, for a fileset or even for a single file.
    ■ The need for manual reloads has been reduced to zero
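The layered parameters described on this slide (global, per fileset, per file) and the reload decision could look roughly like the following Python sketch. The parameter names, the threshold values, the file names and the "either limit exceeded" rule are assumptions for illustration; the slides only say that the decision depends on slack and fragmentation.

```python
from fnmatch import fnmatchcase

# Hypothetical parameter names and values; the real toolbox keeps these in tables.
GLOBAL_PARAMS = {"max_slack_pct": 30, "max_fragmentation_pct": 20}
FILESET_PARAMS = {"$DATA1.APPL.*": {"max_slack_pct": 40}}
FILE_PARAMS = {"$DATA1.APPL.LOGFILE": {"max_fragmentation_pct": 50}}


def effective_params(filename: str) -> dict:
    """Resolve parameters: global defaults, then fileset, then single file."""
    params = dict(GLOBAL_PARAMS)
    for pattern, overrides in FILESET_PARAMS.items():
        if fnmatchcase(filename, pattern):
            params.update(overrides)
    params.update(FILE_PARAMS.get(filename, {}))
    return params


def needs_reload(filename: str, slack_pct: float, fragmentation_pct: float) -> bool:
    """Assumed rule: reload when either measured value exceeds its limit."""
    params = effective_params(filename)
    return (slack_pct > params["max_slack_pct"]
            or fragmentation_pct > params["max_fragmentation_pct"])
```

The most specific setting wins, so a single file can be exempted from a strict global limit without touching the rest of its fileset.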
  • 18. Interesting problems
    ■ The status of TCP/IP connections can be checked
    ■ You need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88 port 1234.
    ■ If at least one of these connections is down, a message is created
    ■ The cause for that might be an erroneously changed firewall configuration
    ■ The same feature has been implemented for X.25 connections
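A check of this kind boils down to counting established connections for a given local address, remote address and port, and comparing the count against the required number. The Python sketch below assumes the connection list has already been obtained from the system; the `Connection` type, the state names and the message text are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Connection:
    local_ip: str
    remote_ip: str
    remote_port: int
    state: str  # assumed state names such as "ESTABLISHED"


def check_connections(observed: Iterable[Connection], local_ip: str,
                      remote_ip: str, remote_port: int,
                      required: int) -> Optional[str]:
    """Return a message if fewer than `required` connections are established."""
    established = sum(
        1 for c in observed
        if c.local_ip == local_ip
        and c.remote_ip == remote_ip
        and c.remote_port == remote_port
        and c.state == "ESTABLISHED"
    )
    if established < required:
        return (f"Only {established} of {required} required connections from "
                f"{local_ip} to {remote_ip} port {remote_port} are established")
    return None
```

Checking the number of existing connections, rather than probing with a new one, leaves the monitored application's own connections undisturbed.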
  • 19. A real-life case concerning TCP/IP
    ■ Our NonStop is accessing another server through a firewall
    ■ There have to be 2 established connections on port 4711
    ■ A rule within the firewall was erroneously changed
    ■ The NonStop could no longer establish a new connection to the server
    ■ The already established connections were not affected
    ■ The real problem only appeared weeks later, when one of the connections had to be reestablished
    ■ The monitoring tool found the missing connection immediately
  • 20. Another problem
    ■ We have a leased line to another provider
    ■ The line is using the X.25 protocol
    ■ During peak hours we had some problems on the line
    ■ Using the statistical data we found out that the capacity of the line was exceeded
    ■ Increasing the speed immediately solved all problems
  • 21. Security issues
    ■ Safeguard reports erroneous logons
    ■ Safeguard does not report the external origin of such a logon, such as the IP address
    ■ We read the Safeguard log and add that information
    ■ So the question „From where did the logon with Administrator to the NonStop come?“ can be answered by a look at our table
  • 22. Application monitoring
    ■ There are 2 kinds of application monitoring:
      ■ Checking database contents
      ■ Checking application messages
    ■ The database contents are checked using SQL statements of the type „SELECT COUNT(*) from … WHERE… BROWSE ACCESS;“
    ■ The result is compared against given values and a message is created if necessary
    ■ The severity of the messages can be set depending on the result, like:
      ■ 1 found => Warning
      ■ 2 found => Error
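The count-and-compare check on this slide can be illustrated with a small Python sketch. It uses an in-memory SQLite database purely as a stand-in for the NonStop SQL tables (SQLite has no `BROWSE ACCESS` clause, so it is omitted here), and the table, column and threshold names are hypothetical. The thresholds are read as "at least 1 => Warning, at least 2 => Error", which is an assumption about the slide's example.

```python
import sqlite3
from typing import Optional

# Assumed reading of the slide: thresholds are minimum counts, highest first.
THRESHOLDS = [(2, "ERROR"), (1, "WARNING")]


def severity_for(count: int) -> Optional[str]:
    """Map a row count to a severity, or None if no message is needed."""
    for minimum, severity in THRESHOLDS:
        if count >= minimum:
            return severity
    return None


def run_check(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Run a SELECT COUNT(*) statement and classify its result."""
    (count,) = conn.execute(sql).fetchone()
    return severity_for(count)


def demo() -> Optional[str]:
    """Build a tiny hypothetical table and run one check against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE requests (id INTEGER, status TEXT)")
    conn.executemany("INSERT INTO requests VALUES (?, ?)",
                     [(1, "OK"), (2, "STUCK")])
    return run_check(
        conn, "SELECT COUNT(*) FROM requests WHERE status = 'STUCK'")
```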
  • 23. Checking EMS messages
    ■ Our applications are using EMS collectors to report any errors
    ■ We are able to check the number of messages per type per time period
    ■ A sample message would be „Timeout process $ABCD“, where process $ABCD is routing messages to XY-Bank
    ■ We define the message containing „Timeout“ and „$ABCD“ as „Timeout to XY-BANK“ and count those messages per period
    ■ A message is created depending on the configured threshold for this type of message
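Classifying messages by the text fragments they contain and counting them per type might be sketched as follows in Python. The rule table and function names are hypothetical; only the example message and its label are taken from the slide.

```python
from collections import Counter
from typing import Iterable, Optional

# Each rule: all fragments must occur in the message text to assign the label.
RULES = [
    (("Timeout", "$ABCD"), "Timeout to XY-BANK"),
]


def classify(text: str) -> Optional[str]:
    """Return the configured label for a message, or None if no rule matches."""
    for fragments, label in RULES:
        if all(fragment in text for fragment in fragments):
            return label
    return None


def count_by_type(messages: Iterable[str]) -> Counter:
    """Count the classified messages of one time period per label."""
    labels = (classify(message) for message in messages)
    return Counter(label for label in labels if label is not None)
```

The resulting per-period counts are what would then be compared against the configured threshold for each message type.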
  • 24. An idea for EMS message handling
    ■ We are handling authorisation requests for credit and debit cards; most of these requests are sent to the card-issuing banks
    ■ We are creating minute-based statistics of those requests per issuer
    ■ If an issuer has problems, we can create a message like „60% of the requests unsuccessful“
    ■ Now the message handling gets this information and handles it according to the configuration:
      ■ 1 message within 10 minutes => no need for action
      ■ 10 messages within 10 minutes => create an alarm
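The "N messages within 10 minutes" rule can be expressed as a sliding time window. The Python sketch below is one possible way to do it, not the toolbox's implementation; the class name, the return values and the use of plain seconds as timestamps are assumptions.

```python
from collections import deque


class MessageWindow:
    """Count messages inside a sliding time window and decide on an alarm."""

    def __init__(self, window_seconds: int = 600, alarm_threshold: int = 10):
        self.window_seconds = window_seconds
        self.alarm_threshold = alarm_threshold
        self._times = deque()  # timestamps of messages still inside the window

    def record(self, timestamp: float) -> str:
        """Register one message; timestamps must arrive in ascending order."""
        self._times.append(timestamp)
        # Drop messages that have fallen out of the window.
        while self._times and self._times[0] <= timestamp - self.window_seconds:
            self._times.popleft()
        if len(self._times) >= self.alarm_threshold:
            return "ALARM"
        return "NO_ACTION"
```

With the defaults, a single message in ten minutes yields `NO_ACTION`, while the tenth message inside the same ten minutes yields `ALARM`, matching the two cases on the slide.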
  • 25. Our main Nagios screen for NonStop
  • 26. Our main Nagios screen for NonStop with error message
  • 27. Any questions?
    Wolfgang Breidbach
    Bank-Verlag GmbH, IT-Services
    Wendelinstr. 1, 50933 Köln
    E-Mail: Wolfgang.Breidbach@Bank-Verlag.de
    www.Bank-Verlag.de