NonStop monitoring and automation
Wolfgang Breidbach

Seite 1 | 29.01.2014 | Bank-Verlag GmbH
Bank-Verlag
■ Founded in 1961 as the publishing house of the magazine „Die Bank“.
■ In 1986 the first Authorisation Center in Germany for ATM transactions was founded at Bank-Verlag, running on IBM Systems /1 and /370.
■ In 1988 authorisation was migrated to Tandem, creating the first active-active application.
■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000, S72000 and finally S86000.
■ In 2005 we moved to Integrity NonStop.
■ In 2010 the secondary datacentre was moved to a new location.
■ In 2012 we migrated our production systems to NonStop blades.
■ Today we are the IT service provider for the private banks in Germany.

The start
■ Bank-Verlag was using a commercial monitoring tool
■ Management decided to replace that tool with open source Nagios for all Windows, Unix and Linux systems
■ Nagios should be used for NonStop systems as well

■ Problem: No open source monitoring tool for NonStop was available that fulfilled our needs
■ Decision: We will have to create something ourselves!

Some basic decisions
■ The main purpose is monitoring our NonStop systems
■ Feeding Nagios with information should be a result of that
■ The open source world is changing quickly, so we should be able to support any other tool with few changes

■ The NonStop monitoring should not depend on any external tool
■ The messages should not require in-depth NonStop knowledge
■ Avoid manual configuration wherever possible
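
The slides do not say how the messages reach Nagios. As a minimal sketch of keeping the monitoring tool-neutral, the following Python example formats one internal event either as a Nagios passive check result (the `PROCESS_SERVICE_CHECK_RESULT` external command with the standard state codes 0 to 3) or as plain text for some other tool. The `Event` fields, the severity names and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Nagios service state codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
# Mapping the toolbox severity "error" to CRITICAL is an assumption.
NAGIOS_STATES = {"ok": 0, "warning": 1, "error": 2, "unknown": 3}


@dataclass
class Event:
    """Tool-neutral event; all field names are hypothetical."""
    system: str      # name of the monitored NonStop system
    subsystem: str   # e.g. "Pathway"
    severity: str    # "ok", "warning", "error" or "unknown"
    text: str        # readable message text
    timestamp: int   # seconds since the epoch


def to_nagios_passive_check(event: Event) -> str:
    """Format an event as a Nagios PROCESS_SERVICE_CHECK_RESULT command."""
    state = NAGIOS_STATES.get(event.severity, NAGIOS_STATES["unknown"])
    return (f"[{event.timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{event.system};{event.subsystem};{state};{event.text}")


def to_plain_text(event: Event) -> str:
    """A second formatter, standing in for any other tool."""
    return f"{event.system} {event.subsystem} {event.severity.upper()}: {event.text}"


event = Event("NSK1", "Pathway", "error", "Server SRV1 is down", 1390989600)
print(to_nagios_passive_check(event))
print(to_plain_text(event))
```

Because only the formatter knows about Nagios, supporting another tool would mean adding one more formatting function while the monitoring itself stays unchanged.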

Our approach
■ We have a bunch of „subsystems“ like CPU, Pathway, Lines, NetBatch and so on
■ Every subsystem has its own monitoring module
■ Every module automatically collects all available configuration information, for example:
  ■ The NetBatch module collects all information concerning NetBatch jobs and calendars
  ■ The Line module collects all lines
■ Some modules need additional configuration data:
  ■ The File module needs the filesets to check
  ■ The EMS module needs the messages to look for

Our approach
■ Every module has a „refresh configuration“ function
■ Every module is configurable with parameters, every parameter has a default
■ If an event is found that could be handled by the toolbox, it should be handled by the toolbox
  ■ File is getting full => perform a reload or increase maxextents
  ■ A static Pathway server is down => issue a START command
  ■ A process is consuming too many CPU cycles => reduce priority
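
The rule "handle it automatically if possible, otherwise report it" can be sketched as a small dispatch table. This is only an illustration in Python, not the toolbox's implementation: the event type names and handler functions are hypothetical, and each handler merely describes the action a real module would issue on the NonStop system.

```python
from typing import Callable, Dict, List


# Each handler returns the action it would take as text; a real handler
# would issue the corresponding command on the NonStop system.
def handle_file_full(target: str) -> str:
    return f"reload {target} or increase its maxextents"


def handle_server_down(target: str) -> str:
    return f"issue a START command for Pathway server {target}"


def handle_cpu_hog(target: str) -> str:
    return f"reduce the priority of process {target}"


# Hypothetical event type names mapped to their automatic handlers.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "FILE_FULL": handle_file_full,
    "PATHWAY_SERVER_DOWN": handle_server_down,
    "CPU_HOG": handle_cpu_hog,
}


def process_event(event_type: str, target: str, log: List[str]) -> bool:
    """Handle the event automatically if possible; otherwise only report it."""
    handler = HANDLERS.get(event_type)
    if handler is None:
        log.append(f"{event_type} on {target}: no automatic action, operator needed")
        return False
    log.append(f"{event_type} on {target}: {handler(target)}")
    return True


log: List[str] = []
process_event("PATHWAY_SERVER_DOWN", "SRV1", log)
process_event("DISK_DOWN", "$DATA1", log)
print("\n".join(log))
```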

Our approach
■ Another goal was avoiding manual tasks we do not like
  ■ Regular reloads
  ■ Checking backups
  ■ Checking database contents
■ Collect statistical data
  ■ Line usage
  ■ File sizes
  ■ CPU usage
  ■ TMF rate
■ Create documentation about the configuration of the system

Our approach

■ We want to make information available to people not familiar with NonStop systems

  ■ The X.25 line with the calling address 12345678 is connected to the SWAN box with the „S77“ sticker on clip 1, line 0
  ■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in slot 2.4 on „D“ and the port has the MAC address 08.00.12.34.56
■ This should be database information that is accessible and usable without any detailed NonStop knowledge
■ Reports of installed hardware should be understandable without knowledge of HP product numbers

The Start
■ First subsystem was „CPU and processes“
■ Development based on some already available programs
■ The CPU and process monitoring program should not write any disk files
■ Create the tools to maintain the appropriate tables, including the long-term data collection
■ Create a central message collector reading the tables and formatting the messages
■ Continue with the other subsystems

The next steps
■ Decision to build the software like a product
■ Great advantages when distributing the software to our 4 (at the moment 6) systems
■ Design of a central message handling program
■ Avoid any hard-coded messages
■ A side-effect: The toolbox supports multiple languages
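
Why avoiding hard-coded messages leads to multi-language support can be shown with a short Python sketch: modules pass only a message id and values, and the text comes from a template table. The table layout, the message ids, the sample texts and the fallback to English are assumptions for illustration; the slides do not describe the actual template format.

```python
# Hypothetical message templates keyed by message id and language code.
TEMPLATES = {
    ("SERVER_DOWN", "en"): "Pathway server {server} is down",
    ("SERVER_DOWN", "de"): "Pathway-Server {server} ist nicht aktiv",
    ("FILE_FULL", "en"): "File {file} is {percent}% full",
}


def format_message(message_id: str, language: str, **values: object) -> str:
    """Look up the template for the language, falling back to English."""
    template = TEMPLATES.get((message_id, language))
    if template is None:
        template = TEMPLATES[(message_id, "en")]
    return template.format(**values)


print(format_message("SERVER_DOWN", "de", server="SRV1"))
print(format_message("FILE_FULL", "de", file="$DATA1.APP.LOG", percent=92))
```

Adding a language then only means adding rows to the template table; no monitoring module has to change.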

Available subsystems
■ CPU and processes (incl. automatic restart of processes *)
■ Lines
■ Pathway
■ Files incl. automatic reload *
■ TMF
■ RDF
■ NetBatch
■ Devices
■ TCP/IP
■ Spooler
■ EMS-messages *
■ Message collector
■ Backups *
* = configuration required

[Diagram of the toolbox components: CPU and process monitoring, restart monitor, subsystem modules, database interface, configuration tables, message templates, event tables, message collector, message table, TCP/IP interface]

Some additional information
■ The original monitoring toolbox is based on SQL tables
■ An Enscribe version is in progress
■ The toolbox does not depend on Measure; Measure is only used to find the originator of a heavy disk load
■ The toolbox causes very little CPU load
■ The collected statistical data allows many reports using standard tools like Excel

Advantages
■ Keep track of hardware changes like exchange of disks
■ No need for additional software like Measure
■ Software is running „out of the box“ without a need for additional configuration
■ Lots of parameters and table entries for configuration available
■ The software supports multiple languages; at the moment the messages are available in German and English
■ Bank-Verlag is not a vendor but a user, we are using the software ourselves
■ Very limited commercial interest in selling the software

Advantages during daily life
■ Reloads are carried out automatically if needed
■ Processes causing heavy disk load are found (Measure required!)
■ The priority of processes using too many CPU cycles can be automatically reduced
■ Pathway servers can be automatically restarted
■ Missing processes can be restarted automatically
■ Existence of required processes can be checked

■ The whole system including all the applications can be started this way!

Advantages during daily life
■ Batch jobs and calendars are checked periodically
■ If a calendar is expiring, a message is issued a few days before expiration
■ The outcome of all backup jobs is checked
■ Disk problems are checked periodically, including
  ■ Number of ZZSA files
  ■ Status of OSS filesets

Advantages during daily life
■ Files matching predefined filesets are checked for running full
■ If a file is too full, it is automatically checked for a possible reload, or the maxextents are increased
■ All configured files are periodically reloaded if necessary
■ Whether a reload is necessary is decided depending on slack and fragmentation
■ All needed parameters can be defined globally, for a fileset or even for a single file.
■ The need for manual reloads has been reduced to zero
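
The three parameter levels (global, fileset, single file) and the reload decision can be sketched as follows in Python. All parameter names, limits and file names are hypothetical, and the rule "reload when either slack or fragmentation exceeds its limit" is an assumption; the slides only say that the decision depends on both values.

```python
from fnmatch import fnmatchcase
from typing import Dict

# All names and numbers here are illustrative assumptions.
GLOBAL_DEFAULTS = {"max_slack_percent": 40, "max_fragmentation_percent": 30}
FILESET_PARAMS: Dict[str, Dict[str, int]] = {
    "$DATA1.APP.*": {"max_slack_percent": 25},
}
FILE_PARAMS: Dict[str, Dict[str, int]] = {
    "$DATA1.APP.JOURNAL": {"max_fragmentation_percent": 10},
}


def effective_params(filename: str) -> Dict[str, int]:
    """Merge parameters: global defaults, then filesets, then the single file."""
    params = dict(GLOBAL_DEFAULTS)
    for pattern, overrides in FILESET_PARAMS.items():
        if fnmatchcase(filename, pattern):
            params.update(overrides)
    params.update(FILE_PARAMS.get(filename, {}))
    return params


def needs_reload(filename: str, slack_percent: float,
                 fragmentation_percent: float) -> bool:
    """A reload is due when either measurement exceeds its limit."""
    params = effective_params(filename)
    return (slack_percent > params["max_slack_percent"]
            or fragmentation_percent > params["max_fragmentation_percent"])


print(effective_params("$DATA1.APP.JOURNAL"))
print(needs_reload("$DATA1.APP.JOURNAL", slack_percent=20, fragmentation_percent=15))
print(needs_reload("$DATA2.OTHER.FILE", slack_percent=20, fragmentation_percent=15))
```

The most specific setting wins, so a single critical file can get a stricter limit without touching the defaults that apply to everything else.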

Interesting problems
■ The status of TCP/IP connections can be checked
  ■ Example: you need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88 port 1234
  ■ If at least one of these connections is down, a message is created
  ■ The cause might be an erroneously changed firewall configuration
■ The same feature has been implemented for X.25 connections
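
The connection check described above compares the number of established connections with a required number. A minimal Python sketch, assuming the current connection list has already been obtained from the TCP/IP subsystem (the data structures and state names are hypothetical):

```python
from typing import List, NamedTuple


class Connection(NamedTuple):
    local_addr: str
    remote_addr: str
    remote_port: int
    state: str


class Requirement(NamedTuple):
    local_addr: str
    remote_addr: str
    remote_port: int
    required: int


def check_connections(req: Requirement, observed: List[Connection]) -> List[str]:
    """Return a message if fewer established connections exist than required."""
    established = sum(
        1 for c in observed
        if c.state == "ESTABLISHED"
        and c.local_addr == req.local_addr
        and c.remote_addr == req.remote_addr
        and c.remote_port == req.remote_port
    )
    if established >= req.required:
        return []
    return [f"{established} of {req.required} required connections from "
            f"{req.local_addr} to {req.remote_addr} port {req.remote_port} established"]


req = Requirement("192.168.77.77", "192.168.88.88", 1234, required=2)
observed = [
    Connection("192.168.77.77", "192.168.88.88", 1234, "ESTABLISHED"),
    Connection("192.168.77.77", "192.168.88.88", 1234, "TIME_WAIT"),
]
print(check_connections(req, observed))
```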

A real life case concerning TCP/IP
■ Our NonStop is accessing another server through a firewall
■ There have to be 2 established connections on port 4711
■ A rule within the firewall was erroneously changed
■ The NonStop could no longer establish a new connection to the server
■ The already established connections were not affected
■ The real problem arose weeks later, when one of the connections had to be re-established

■ The monitoring tool found the missing connection immediately

Another problem
■ We have a leased line to another provider
■ The line uses the X.25 protocol
■ During peak hours we had some problems on the line
■ Using the statistical data we found out that the capacity of the line was exceeded
■ Increasing the speed immediately solved all problems

Security issues
■ Safeguard reports erroneous logons
■ Safeguard does not report the external origin of such a logon, for example the IP address
■ We read the Safeguard log and add that information
■ So the question „Where did the Administrator logon to the NonStop come from?“ can be answered by a look at our table

Application monitoring
■ There are 2 kinds of application monitoring:
  ■ Checking database contents
  ■ Checking application messages
■ The database contents are checked using SQL statements of the type „SELECT COUNT(*) FROM … WHERE … BROWSE ACCESS;“
■ The result is compared against given values and a message is created if necessary
■ The severity of the messages can be set depending on the result, for example:
  ■ 1 found => Warning
  ■ 2 found => Error
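
The step from a query result to a severity can be sketched in Python as a threshold lookup. The count is assumed to come from a query like the one above, and treating "2 found" as "2 or more" is an assumption; the threshold list itself mirrors the example on the slide.

```python
from typing import List, Optional, Tuple

# (minimum count, severity) pairs; the values mirror the slide's example.
THRESHOLDS: List[Tuple[int, str]] = [(1, "Warning"), (2, "Error")]


def severity_for_count(count: int,
                       thresholds: List[Tuple[int, str]]) -> Optional[str]:
    """Return the severity of the highest threshold reached, or None."""
    result = None
    for minimum, severity in sorted(thresholds):
        if count >= minimum:
            result = severity
    return result


for count in (0, 1, 2, 5):
    print(count, severity_for_count(count, THRESHOLDS))
```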

Checking EMS-messages
■ Our applications are using EMS collectors to report any errors
■ We are able to check the number of messages per type per time period
■ A sample message would be „Timeout process $ABCD“, where process $ABCD is routing messages to XY-Bank
■ We define a message containing „Timeout“ and „$ABCD“ as „Timeout to XY-BANK“ and count those messages per period
■ A message is created depending on the configured threshold for this type of message
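
Classifying and counting messages per period, as described above, might look like this in Python. The rule table follows the slide's example; the threshold of 3, the substring matching and the report text are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# A rule names a message type and lists substrings that must all occur.
RULES: Dict[str, Tuple[str, ...]] = {
    "Timeout to XY-BANK": ("Timeout", "$ABCD"),
}
# Hypothetical threshold: messages of a type per period before reporting.
THRESHOLDS: Dict[str, int] = {"Timeout to XY-BANK": 3}


def classify(text: str) -> List[str]:
    """Return the names of all rules whose substrings all occur in the text."""
    return [name for name, parts in RULES.items()
            if all(part in text for part in parts)]


def check_period(messages: Iterable[str]) -> List[str]:
    """Count classified messages for one period and report reached thresholds."""
    counts = Counter(name for text in messages for name in classify(text))
    return [f"{name}: {n} messages in period (threshold {THRESHOLDS[name]})"
            for name, n in counts.items() if n >= THRESHOLDS[name]]


period = ["Timeout process $ABCD"] * 3 + ["Timeout process $WXYZ"]
print(check_period(period))
```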

An idea for EMS message handling
■ We are handling authorisation requests for credit and debit cards; most of these requests are sent to the card-issuing banks

■ We are creating minute-based statistics of those requests per issuer
■ If an issuer has problems we can create a message like „60% of the requests unsuccessful“
■ Now the message handling gets this information and handles it according to the configuration:
  ■ 1 message within 10 minutes => no need for action
  ■ 10 messages within 10 minutes => create an alarm
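
The "messages within 10 minutes" rule can be sketched as a sliding time window. This Python example is one possible reading of the idea, not the actual message handling: the class name and the action strings are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque


class WindowCounter:
    """Counts messages inside a sliding time window (times in seconds)."""

    def __init__(self, window_seconds: int, alarm_count: int) -> None:
        self.window_seconds = window_seconds
        self.alarm_count = alarm_count
        self.times: Deque[float] = deque()

    def add(self, timestamp: float) -> str:
        """Record one message and return the resulting action."""
        self.times.append(timestamp)
        # Drop messages that have fallen out of the window.
        while self.times and self.times[0] <= timestamp - self.window_seconds:
            self.times.popleft()
        return "alarm" if len(self.times) >= self.alarm_count else "no action"


counter = WindowCounter(window_seconds=600, alarm_count=10)
# One message per minute: the tenth message within 10 minutes raises the alarm.
actions = [counter.add(minute * 60) for minute in range(10)]
print(actions[0], actions[-1])
```

A single bad minute for one issuer therefore stays quiet, while a problem that persists across the window escalates to an alarm.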

Our main Nagios screen for NonStop

Our main Nagios screen for NonStop with error message

Any questions???
Wolfgang Breidbach
Bank-Verlag GmbH
IT-Services
Wendelinstr. 1
50933 Köln
E-Mail: Wolfgang.Breidbach@Bank-Verlag.de
www.Bank-Verlag.de

Seite 27 | 29.01.2014 | Bank-Verlag GmbH

Mais conteúdo relacionado

Destaque

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007kylebb7
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureNordic Infrastructure Conference
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김 피디
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewardsmatrikrewards
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewardsmatrikrewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaAngela Berardinelli
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewardsmatrikrewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKeelestuudio
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиСергей Вассерман
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesLiz Williams
 

Destaque (20)

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azure
 
Geopolitica stefanelli
Geopolitica stefanelliGeopolitica stefanelli
Geopolitica stefanelli
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
 
Research Into Digipaks
Research Into DigipaksResearch Into Digipaks
Research Into Digipaks
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewards
 
Can i get covered outside of open enrollment
Can i get covered outside of open enrollmentCan i get covered outside of open enrollment
Can i get covered outside of open enrollment
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivselt
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудниками
 
Uk assignments
Uk assignmentsUk assignments
Uk assignments
 
My Music Video Timeline
My Music Video TimelineMy Music Video Timeline
My Music Video Timeline
 
Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory Updates
 
Evaluation Question 6
Evaluation Question 6Evaluation Question 6
Evaluation Question 6
 
Hardware luis suarez 3
Hardware luis suarez 3Hardware luis suarez 3
Hardware luis suarez 3
 
My Life
My Life My Life
My Life
 
Question 2
Question 2Question 2
Question 2
 

Semelhante a Non stop monitoring and automation

Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013MattKilner
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...Daniel Reimann
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostChristoph Adler
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryKaren Broughton-Mabbitt
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Christoph Adler
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++vikram mahendra
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administratornking821
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1SanthiNivas
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14p6academy
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examplesMartinRoss_IBM
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...ICS User Group
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...panagenda
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!Christoph Adler
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceMoises Silva
 

Semelhante a Non stop monitoring and automation (20)

c programming 1-1.pptx
c programming 1-1.pptxc programming 1-1.pptx
c programming 1-1.pptx
 
Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance Boost
 
Apache flink
Apache flinkApache flink
Apache flink
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Mini Project- USB Temperature Logging
Mini Project- USB Temperature LoggingMini Project- USB Temperature Logging
Mini Project- USB Temperature Logging
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administrator
 
Chapter 1 - Prog101.ppt
Chapter 1 - Prog101.pptChapter 1 - Prog101.ppt
Chapter 1 - Prog101.ppt
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examples
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
 
3 types of monitoring for 2020
3 types of monitoring for 20203 types of monitoring for 2020
3 types of monitoring for 2020
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH Performance
 

Último

Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!Memoori
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfFIDO Alliance
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxFIDO Alliance
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024Stephen Perrenod
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...ScyllaDB
 
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...FIDO Alliance
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Hiroshi SHIBATA
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireExakis Nelite
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPTiSEO AI
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jNeo4j
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch TuesdayIvanti
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...FIDO Alliance
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...FIDO Alliance
 

Último (20)

Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Working together SRE & Platform Engineering

NonStop monitoring and automation

  • 1. NonStop monitoring and automation Wolfgang Breidbach Page 1 | 29.01.2014 | Bank-Verlag GmbH
  • 2. Bank-Verlag ■ Founded in 1961 as the publishing house of the magazine „Die Bank“. ■ In 1986 the first Authorisation Center in Germany for ATM transactions was founded at Bank-Verlag, running on IBM Systems /1 and /370. ■ In 1988 authorisation was migrated to Tandem, creating the first active-active application. ■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000 and S72000, and finally to S86000. ■ 2005: we moved to Integrity NonStop. ■ 2010: the secondary datacentre was moved to a new location. ■ 2012: we migrated our production systems to NonStop blades. ■ Today we are the IT service provider for the private banks in Germany.
  • 3. The start ■ Bank-Verlag was using a commercial monitoring tool ■ Management decided to replace that tool with open source Nagios for all Windows, Unix and Linux systems ■ Nagios was to be used for the NonStop systems as well ■ Problem: No open source monitoring tool for NonStop was available that fulfilled our needs ■ Decision: We will have to create something ourselves!
  • 4. Some basic decisions ■ The main purpose is monitoring our NonStop systems ■ Feeding Nagios with information should be a result of that ■ The open source world is changing quickly, so we should be able to support any other tool with few changes ■ The NonStop monitoring should not depend on any external tool ■ The messages should not require in-depth NonStop knowledge ■ Avoid manual configuration wherever possible
  • 5. Our approach ■ We have a number of „subsystems“ such as CPU, Pathway, Lines, NetBatch and so on ■ Every subsystem has its own monitoring module ■ Every module collects all available configuration information automatically, for example: ■ The NetBatch module collects all information concerning NetBatch jobs and calendars ■ The Line module collects all lines ■ Some modules need additional configuration data: ■ The File module needs the filesets to check ■ The EMS module needs the messages to look for
  • 6. Our approach ■ Every module has a „refresh configuration“ function ■ Every module is configurable with parameters, and every parameter has a default ■ If an event is found that can be handled by the toolbox, it should be handled by the toolbox ■ A file is getting full => perform a reload or increase maxextents ■ A static Pathway server is down => issue a START command ■ A process is consuming too many CPU cycles => reduce its priority
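The event-to-action rule on slide 6 can be pictured as a small dispatch table. The following is only a minimal sketch: the event kinds, the handler functions and the returned strings are hypothetical names chosen for illustration and are not taken from the toolbox.

```python
from typing import Callable, Dict, NamedTuple, Optional


class Event(NamedTuple):
    kind: str    # hypothetical event kind, e.g. "FILE_FULL"
    target: str  # file, server class or process the event refers to


# Each handler only describes the corrective action as text; a real
# implementation would issue the corresponding system command instead.
def reload_or_extend(target: str) -> str:
    return f"reload or increase maxextents for {target}"


def start_server(target: str) -> str:
    return f"START {target}"


def reduce_priority(target: str) -> str:
    return f"reduce priority of {target}"


# Hypothetical mapping of event kinds to the three actions named on the slide.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "FILE_FULL": reload_or_extend,
    "SERVER_DOWN": start_server,
    "CPU_HOG": reduce_priority,
}


def handle(event: Event) -> Optional[str]:
    """Return the corrective action, or None if the event can only be reported."""
    handler = HANDLERS.get(event.kind)
    return handler(event.target) if handler else None
```

An event without a registered handler returns `None`, which corresponds to the case where the toolbox can only create a message.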
  • 7. Our approach ■ Another goal was avoiding manual tasks we do not like ■ Regular reloads ■ Checking backups ■ Checking database contents ■ Collect statistical data: line usage, file sizes, CPU usage, TMF rate ■ Create documentation about the configuration of the system
  • 8. Our approach ■ We want to make information available to people not familiar with NonStop systems ■ The X.25 line with the calling address 12345678 is connected to the SWAN box with the „S77“ sticker on Clip 1 line 0 ■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in slot 2.4 on „D“ and the port has the MAC address 08.00.12.34.56 ■ This should be database information that is accessible and usable without any detailed NonStop knowledge ■ Reports of installed hardware should be understandable without knowledge of HP product numbers
  • 9. The Start ■ The first subsystem was „CPU and processes“ ■ Development was based on some already available programs ■ The CPU and process monitoring program should not write any disk files ■ Create the tools to maintain the appropriate tables, including the long-term data collection ■ Create a central message collector reading the tables and formatting the messages ■ Continue with the other subsystems
  • 10. The next steps ■ Decision to build the software like a product ■ Great advantages when distributing the software to our 4 (at the moment 6) systems ■ Design of a central message handling program ■ Avoid any hard-coded messages ■ A side effect: the toolbox supports multiple languages
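Why avoiding hard-coded messages makes multi-language support a side effect can be shown with a template lookup keyed by message id and language. This is a sketch under assumptions: the message id, the template texts and the file name are invented for illustration, and the toolbox keeps its templates in tables rather than in a Python dictionary.

```python
# Hypothetical message templates keyed by (message id, language code).
TEMPLATES = {
    ("FILE_FULL", "en"): "File {name} is {pct}% full",
    ("FILE_FULL", "de"): "Datei {name} ist zu {pct}% gefüllt",
}


def format_message(msg_id: str, lang: str, default_lang: str = "en", **values) -> str:
    """Fill the template for the requested language, falling back to the default."""
    template = TEMPLATES.get((msg_id, lang)) or TEMPLATES[(msg_id, default_lang)]
    return template.format(**values)
```

Adding a language then means adding template rows, not changing the monitoring modules.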
  • 11. Available subsystems ■ CPU and processes (incl. automatic restart of processes *) ■ Lines ■ Pathway ■ Files (incl. automatic reload *) ■ TMF ■ RDF ■ NetBatch ■ Devices ■ TCP/IP ■ Spooler ■ EMS messages * ■ Message collector ■ Backups * (* = configuration required)
  • 12. [Architecture diagram. Components shown: CPU and process monitoring, restart monitor, subsystem modules, database interface, configuration tables, message templates, event tables, message collector, message table, TCP/IP interface]
  • 13. Some additional information ■ The original monitoring toolbox is based on SQL tables ■ An Enscribe version is in progress ■ The toolbox does not depend on Measure; Measure is only used to find the originator of a heavy disk load ■ The toolbox causes very little CPU load ■ The collected statistical data allows many reports using standard tools like Excel
  • 14. Advantages ■ Keep track of hardware changes such as the exchange of disks ■ No need for additional software like Measure ■ The software runs „out of the box“ without any need for additional configuration ■ Lots of parameters and table entries are available for configuration ■ The software supports multiple languages; at the moment the messages are available in German and English ■ Bank-Verlag is not a vendor but a user, we are using the software ourselves ■ Very limited commercial interest in selling the software
  • 15. Advantages during daily life ■ Reloads are carried out automatically if needed ■ Processes causing heavy disk load are found (Measure required!) ■ The priority of processes using too many CPU cycles can be reduced automatically ■ Pathway servers can be restarted automatically ■ Missing processes can be restarted automatically ■ The existence of required processes can be checked ■ The whole system including all the applications can be started this way!
  • 16. Advantages during daily life ■ Batch jobs and calendars are checked periodically ■ If a calendar is expiring, a message is issued a few days before expiration ■ The outcome of all backup jobs is checked ■ Disk problems are checked periodically, including ■ Number of ZZSA files ■ Status of OSS filesets
  • 17. Advantages during daily life ■ Files matching predefined filesets are checked for running full ■ If a file is too full, it is automatically checked for a possible reload or the maxextents are increased ■ All configured files are periodically reloaded if necessary ■ Whether a reload is necessary is decided based on slack and fragmentation ■ All needed parameters can be defined globally, for a fileset or even for a single file ■ The need for manual reloads has been reduced to zero
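The three parameter levels on slide 17 (global, fileset, single file) suggest a most-specific-wins lookup. The sketch below assumes that order. The parameter names, the threshold values, the file names and the simple "exceeds either limit" reload rule are all hypothetical, since the slides do not state how slack and fragmentation are weighed.

```python
from fnmatch import fnmatchcase

# Hypothetical parameters and values at the three levels named on the slide.
GLOBAL = {"max_slack_pct": 30, "max_fragments": 100}
FILESET = {"$DATA.APP.*": {"max_slack_pct": 20}}
FILE = {"$DATA.APP.LOG": {"max_fragments": 50}}


def effective_params(filename: str) -> dict:
    """Start from the global defaults, then apply fileset and file overrides."""
    params = dict(GLOBAL)
    for pattern, overrides in FILESET.items():
        if fnmatchcase(filename, pattern):
            params.update(overrides)
    params.update(FILE.get(filename, {}))
    return params


def needs_reload(filename: str, slack_pct: float, fragments: int) -> bool:
    """Assumed rule: reload when either measured value exceeds its limit."""
    p = effective_params(filename)
    return slack_pct > p["max_slack_pct"] or fragments > p["max_fragments"]
```

With these sample values, a file in `$DATA.APP` with 25% slack would be reloaded, while a file elsewhere with the same slack would not.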
  • 18. Interesting problems ■ The status of TCP/IP connections can be checked ■ You need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88 port 1234. ■ If at least one of these connections is down, a message is created ■ The cause for that might be an erroneously changed firewall configuration ■ The same feature has been implemented for X.25 connections
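The connection check on slide 18 amounts to counting established connections that match a configured endpoint and comparing the count with the required number. A minimal sketch, assuming the connection list has already been collected from the system; the `Conn` record and the message text are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class Conn(NamedTuple):
    local_ip: str
    remote_ip: str
    remote_port: int
    state: str  # e.g. "ESTABLISHED"


def check_connections(conns: Iterable[Conn], local_ip: str, remote_ip: str,
                      remote_port: int, required: int) -> Optional[str]:
    """Return a message if fewer than `required` matching connections are established."""
    established = sum(
        1 for c in conns
        if c.local_ip == local_ip and c.remote_ip == remote_ip
        and c.remote_port == remote_port and c.state == "ESTABLISHED"
    )
    if established < required:
        return (f"{established} of {required} connections from {local_ip} "
                f"to {remote_ip} port {remote_port} established")
    return None
```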
  • 19. A real life case concerning TCP/IP ■ Our NonStop is accessing another server through a firewall ■ There have to be 2 established connections on port 4711 ■ A rule within the firewall was erroneously changed ■ The NonStop could no longer establish a new connection to the server ■ The already established connections were not affected ■ The real problem came weeks later, when one of the connections had to be reestablished ■ The monitoring tool found the missing connection immediately
  • 20. Another problem ■ We have a leased line to another provider ■ The line uses the X.25 protocol ■ During peak hours we had some problems on the line ■ Using the statistical data we found out that the capacity of the line was exceeded ■ Increasing the speed immediately solved all problems
  • 21. Security issues ■ Safeguard reports erroneous logons ■ Safeguard does not report the external origin of such a logon, for example the IP address ■ We read the Safeguard log and add that information ■ So the question „Where did the Administrator logon to the NonStop come from?“ can be answered by a look at our table
  • 22. Application monitoring ■ There are 2 kinds of application monitoring: ■ Checking database contents ■ Checking application messages ■ The database contents are checked using SQL statements of the type „SELECT COUNT(*) from … WHERE… BROWSE ACCESS;“ ■ The result is compared against given values and a message is created if necessary ■ The severity of the message can be set depending on the result, for example: ■ 1 found => Warning ■ 2 found => Error
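The count-to-severity mapping on slide 22 can be sketched as follows. This is an illustration under assumptions: it uses an in-memory SQLite table as a stand-in (the NonStop SQL `BROWSE ACCESS` clause is not available there and is omitted), the table and its contents are invented, and the thresholds are read as "at least 1" and "at least 2".

```python
import sqlite3

# (minimum count, severity), checked from the highest threshold down.
# Assumed reading of the slide: 1 found => Warning, 2 or more found => Error.
THRESHOLDS = [(2, "ERROR"), (1, "WARNING")]


def severity_for(count: int):
    """Return the severity for a row count, or None if no message is needed."""
    for minimum, severity in THRESHOLDS:
        if count >= minimum:
            return severity
    return None


def check(conn: sqlite3.Connection, query: str):
    """Run a SELECT COUNT(*) query and map its single result to a severity."""
    (count,) = conn.execute(query).fetchone()
    return severity_for(count)


# Hypothetical sample table standing in for an application database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "OK"), (2, "STUCK"), (3, "STUCK")])
result = check(conn, "SELECT COUNT(*) FROM orders WHERE status = 'STUCK'")
```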
  • 23. Checking EMS messages ■ Our applications are using EMS collectors to report any errors ■ We are able to check the number of messages per type per time period ■ A sample message would be „Timeout process $ABCD“, where process $ABCD is routing messages to XY-Bank ■ We define messages containing „Timeout“ and „$ABCD“ as „Timeout to XY-BANK“ and count those messages per period ■ A message is created depending on the configured threshold for this type of message
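The classification and counting described on slide 23 can be sketched with keyword rules and a counter per period. The rule below reuses the slide's own example; the threshold value and the wording of the resulting message are hypothetical.

```python
from collections import Counter

# A rule matches when every keyword occurs in the message text.
RULES = [(("Timeout", "$ABCD"), "Timeout to XY-BANK")]
# Hypothetical threshold: messages of this type per period before reporting.
THRESHOLDS = {"Timeout to XY-BANK": 3}


def classify(text: str):
    """Return the message type for a raw EMS text, or None if no rule matches."""
    for keywords, label in RULES:
        if all(keyword in text for keyword in keywords):
            return label
    return None


def evaluate_period(messages):
    """Count classified messages of one period and report types at or over threshold."""
    counts = Counter(label for label in map(classify, messages) if label)
    return [f"{label}: {n} messages in period"
            for label, n in counts.items()
            if n >= THRESHOLDS.get(label, 1)]
```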
  • 24. An idea for EMS message handling ■ We are handling authorisation requests for credit and debit cards; most of these requests are sent to the card-issuing banks ■ We are creating minute-based statistics of those requests per issuer ■ If an issuer has problems we can create a message like „60% of the requests unsuccessful“ ■ Now the message handling gets this information and handles it according to the configuration: ■ 1 message within 10 minutes => no need for action ■ 10 messages within 10 minutes => create an alarm
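The escalation rule on slide 24 (one message in 10 minutes needs no action, ten messages in 10 minutes raise an alarm) can be sketched as a sliding time window. The class name, the return values and the use of plain seconds as timestamps are assumptions made for this example.

```python
from collections import deque


class Escalator:
    """Track message timestamps and escalate when too many fall into one window."""

    def __init__(self, window_seconds: int = 600, alarm_count: int = 10):
        self.window = window_seconds
        self.alarm_count = alarm_count
        self.times = deque()

    def record(self, timestamp: float) -> str:
        """Register one message (timestamps in seconds, non-decreasing)."""
        self.times.append(timestamp)
        # Drop messages that are older than the window.
        while self.times and self.times[0] <= timestamp - self.window:
            self.times.popleft()
        return "ALARM" if len(self.times) >= self.alarm_count else "NO_ACTION"
```

One message per minute reaches the alarm on the tenth message, while two messages more than 10 minutes apart never escalate.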
  • 25. Our main Nagios screen for NonStop
  • 26. Our main Nagios screen for NonStop with error message
  • 27. Any questions??? Wolfgang Breidbach Bank-Verlag GmbH IT-Services Wendelinstr. 1 50933 Köln E-Mail: Wolfgang.Breidbach@Bank-Verlag.de www.Bank-Verlag.de