SolarWinds Scalability for the Enterprise
2. A Few Notes about Today’s Session
» Welcome from Austin, TX and Brno, Czech Republic
Francois Caron, SolarWinds Product Manager
Rob Hock, SolarWinds Product Manager
Brad Hale, SolarWinds Product Marketing Principal
» Ask questions!!!
No attendee left behind
Don’t wait until the end – ask questions using the chat box and we will
do our best to cover them all
» Today’s Session is being recorded
solarwinds.com
slideshare.com
© 2013 SOLARWINDS WORLDWIDE, LLC. ALL RIGHTS RESERVED.
3. Agenda
» The Orion® portfolio
» Scalability: a multi-faceted problem
Network & user growth
Managing Distributed environments
MSP-type environments
Keeping your administration costs under control as you scale with automation
» How SolarWinds scales
» Recommended practices
» Q&A
4. What is the Orion Platform?
[Diagram: the Orion platform stack]
Main Web Server
Scalability Engines (Enterprise Operations Console, Polling Engines, Web Server, Fail Over)
Products and their functions:
Network Performance Monitor: Fault & Perf Monitoring
NetFlow Traffic Analyzer: Traffic Analysis
Network Configuration Manager: Network Configuration Management
IP Address Manager: IP Address Management
VoIP & Network Quality Manager: VoIP/Network Quality Analysis
User Device Tracker: Network User and Port Tracking
Server & Application Monitor: Servers & App Management
Web Performance Monitor: Web Experience Monitoring
FSM, Virt. Mgr, Patch: Integration Modules
“Orion Core”: Main Polling Engine, Common Services (Alerts, Reports, Discovery, Syslog, Traps)
Core DB
» Suite of IT management products sharing common services, web interface and
database
» Also provides integration points with non-Orion products
6. Large Scale Networks
» Scalability and deployment are influenced by:
Network size
Network topology (e.g. monitoring across WAN/low-bandwidth links)
Sampling frequency (1h to 1min or less)
Variety of managed objects (products installed, custom pollers such as UnDP)
» Designed to scale from hundreds of nodes to tens of thousands
Based on the use of Additional Polling Engines
[Diagram: the Main Web Server sits on top of the product modules (NPM: interfaces, wireless, virtualization, UCS; NTA: IP flows; IPAM: IP address management; SAM: servers and applications; plus UDT, VNQM, WPM, NCM), the Core (main polling engine) with its common services (Nodes, Volumes, Events, Reports, Discovery, Syslogs & Traps), and the Core DB. Each Additional Polling Engine takes on more network, servers and applications.]
7. Distributed Environments
» Additional Polling Engines are more tolerant of remote deployments since v10.2
Orion Core, NPM, SAM
NTA, NCM, VNQM, UDT to follow
[Diagram: the main Orion server (Main Web Server, product modules, Core main polling engine with common services, Core DB) polls the local network, servers and applications. Additional Polling Engines sit across the WAN and poll the remote network, servers and applications.]
8. Multi-level management
» Orion instances in regions, Enterprise Operations Console (EOC) at the top level for roll-ups
» Up to 600K elements
» EOC consolidates real-time statuses at the top level (no centralization of platform admin):
Alerts, Events, Syslog, Traps (last 24 hours’ worth of data)
Core Node, Volume, Interface and Wireless data
» Limited historical data
[Diagram: two regional Orion instances, each with its own Main Web Server, product modules, Core (main polling engine) and Core DB, rolled up into the Enterprise Operations Console (EOC).]
9. User growth
» Additional Web Server recommended above 20-30 concurrent web users
» Load Balancer friendly
[Diagram: one or more Additional Web Servers alongside the Main Web Server, in front of the product modules, the Core (main polling engine) with its common services, the Core DB, and the Additional Polling Engines.]
10. MSP-type environments
» Multi-tenancy is
supported in all
modes. More here
» EOC-based
deployment
Cost effective
Customers have full
management and
capabilities, NOC has
roll-ups
11. MSP-type environments
» NAT-based
Eliminates the overlapping IP address issue
Makes identification of managed devices more complex because the translated IPs don’t make sense to report readers. This can be addressed by populating custom properties with IPs or names that will not be affected by any translation.
SNMP traps carry IPs in the payload; these are not translated and won’t make sense
» Hybrid
Remote Polling Engine is possible and eliminates the overlapping IP address issue
Cost effective for large customers
VPN recommended
12. Automation
» Keeping your admin costs under control as you scale: automation
» Capacity to scale is often limited more by automation (the need to reduce manual config tasks) than by raw performance or HW costs
» Orion database can be “accessed” via its API and SDK
» Data can also be directly addressed from SQL DB
» API gives you access to SWIS: the data access layer used by Orion-based products
It supports integration via a SQL-like language
Now also provides a REST/JSON interface
Create custom automation scripts using PowerShell® or Perl® (for example)
» All you want to know about the SDK, API and SWIS is in the video posted here
Remember that it’s not a supported product feature
It does however have an active thwack® group
It’s well documented and very actively developed
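As a rough illustration of the REST/JSON interface mentioned above, the sketch below only builds (and does not send) a query request. The port 17778, the `/SolarWinds/InformationService/v3/Json/Query` path, the `{"query": ...}` body shape, and the `Orion.Nodes` entity are assumptions drawn from the public Orion SDK conventions rather than from this deck, and the host name and helper name are hypothetical. Authentication and certificate handling are left out.

```python
import json
from urllib.request import Request


def build_swis_query_request(host: str, swql: str, port: int = 17778) -> Request:
    """Build (but do not send) a POST request carrying a SWQL query.

    Assumption: SWIS exposes a JSON endpoint at the path below; verify it
    against the SDK documentation for your Orion version.
    """
    url = f"https://{host}:{port}/SolarWinds/InformationService/v3/Json/Query"
    body = json.dumps({"query": swql}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host; the query uses the SQL-like language SWIS accepts.
    request = build_swis_query_request(
        "orion.example.com", "SELECT NodeID, Caption FROM Orion.Nodes"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect or log what a script would send before pointing it at a production server.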
13. Orion API
» Can (not an exhaustive list; see the documentation for the full list)
Add a node, trigger discovery, create and populate custom properties
Un-manage and re-manage nodes, interfaces, applications
» Cannot (known feature requests)
Trigger a custom poller (UnDP poller)
Read all discovered interfaces and selectively import them (e.g. remove loopbacks)
» Examples
NPM, NCM: How to automate the creation of Orion platform (aka Core) nodes from the API
SAM: How can I deploy application monitors using PowerShell and the Orion SDK?
All: The API technical reference available after you install
» Need more help?
The SDK “forum” is here
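To make the "un-manage and re-manage nodes" capability more concrete, here is a minimal sketch that prepares the URL and JSON arguments for an un-manage call without sending anything. The `Invoke/Orion.Nodes/Unmanage` path, the argument order (net object ID, start, end, relative flag), the `N:<id>` identifier format, and the port are assumptions based on the public Orion SDK conventions; the helper name, host and node ID are hypothetical. Check the API technical reference before relying on any of it.

```python
import json
from datetime import datetime, timedelta, timezone


def build_unmanage_call(node_id: int, start: datetime, hours: float,
                        host: str, port: int = 17778) -> tuple[str, str]:
    """Return (url, json_body) for un-managing a node for a maintenance window.

    Assumption: the verb takes [net object id, start, end, is_relative].
    """
    end = start + timedelta(hours=hours)
    url = (f"https://{host}:{port}"
           "/SolarWinds/InformationService/v3/Json/Invoke/Orion.Nodes/Unmanage")
    args = [f"N:{node_id}", start.isoformat(), end.isoformat(), False]
    return url, json.dumps(args)


if __name__ == "__main__":
    # Hypothetical two-hour maintenance window for node 42.
    window_start = datetime(2013, 1, 1, 22, 0, tzinfo=timezone.utc)
    url, body = build_unmanage_call(42, window_start, 2, "orion.example.com")
    print(url)
    print(body)
```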
15. Rough Product Guidelines
» Main factors impacting scalability
Rough guidelines: we can’t test every combination of products, sizes, HW…
We strongly encourage testing in your own environment. Lab instances are a great practice.
Prod. | Scalability impacted by | Limit | Notes
NPM | Throughput = Elements (interfaces, volumes, nodes) x polling frequency | 12K elements(1) per poller; 100K elements(1) per instance |
SAM | Throughput = Component Monitors x polling frequency | 8-10K monitors(1) per poller; 50K monitors(1) per instance | How do SNMP and WMI polling compare?
IPAM | Number of managed IPs | 3M IPs per instance | Does not support polling engines
NTA | Flows per sec received | 10K flows/sec (DB limitation) per instance | NTA 4.0: 40K flows per sec; multi-poller possible but does not increase the 40K flows/sec
NCM | Throughput = Nodes x frequency of Inventory and Config downloads | 10K nodes per poller; 30K nodes maximum | NCM performing 2 NCM operations (inventory update, configuration download) per day on all 30K nodes
UDT | Number of managed ports | 150K ports per poller; 500K ports or more, maximum |
VNQM | IP SLA operations and CDR volume | 5K IP SLA operations and 200K calls per day per poller; 15K IP SLA operations and 200K calls per day per instance | 20K calls per hour as maximum in peak/rush hours
WPM | Throughput = Number of transactions played back x frequency | Dozens of recordings per player | Complexity of transactions determines limits per player and requires specific testing
(1) At default polling frequency
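The NPM guideline (about 12K elements per poller and 100K per instance, at default polling frequency) can be turned into a quick back-of-the-envelope estimate. The sketch below is only that: the linear `frequency_factor` scaling is an assumption derived from the "throughput = elements x polling frequency" relationship, and the function and constant names are made up for illustration. It is no substitute for testing in your own lab.

```python
import math

# Rough guidelines from the slide, valid at default polling frequency.
NPM_ELEMENTS_PER_POLLER = 12_000
NPM_ELEMENTS_PER_INSTANCE = 100_000


def npm_pollers_needed(elements: int, frequency_factor: float = 1.0) -> int:
    """Estimate how many polling engines an NPM deployment needs.

    frequency_factor is a hypothetical multiplier: 1.0 means default polling
    frequency, 2.0 means polling twice as often (assumed to double the load).
    """
    if elements < 0 or frequency_factor <= 0:
        raise ValueError("elements must be >= 0 and frequency_factor > 0")
    load = elements * frequency_factor
    if load > NPM_ELEMENTS_PER_INSTANCE:
        raise ValueError("load exceeds the per-instance guideline; "
                         "consider several instances rolled up under EOC")
    return max(1, math.ceil(load / NPM_ELEMENTS_PER_POLLER))


if __name__ == "__main__":
    for count in (5_000, 30_000, 100_000):
        print(count, "elements:", npm_pollers_needed(count), "poller(s)")
```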
16. Hardware sizing
» Separate the database and primary poller.
» Database is physical. “Need More Power” should be your request. Mine has 12 CPUs and 128 GB RAM.
» Pollers can be virtual. I tend to run with 8 CPUs and 12 GB RAM to start.
» Disk is a big deal. You want lots of spindles. You want RAID 10, or you want the SAN team to tell you they can load your entire database into the memory of the storage array.
» Everything has to be in the same time zone.
» Primary poller and database must be in the same location.
17. The vital signs
» Database
Available space (especially with NetFlow data)
Disk I/O latency. Seen SAM’s AppInsight for SQL?
» Main server
CPU bound
» WAN links
To sites polled remotely (ICMP / SNMP / WMI traffic)
To sites equipped with remote Poller
• ICMP (Kbits/sec) = (0.0823 * y + 0.6774) * 8
• SNMP (Kbits/sec) = (0.3949 * y + 2.7756) * 8
• y = number of nodes, with 12 interfaces and 2 volumes per node
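The two WAN formulas above are easy to evaluate for a given site. The sketch below simply codes them as written, so it inherits their assumption of 12 interfaces and 2 volumes per node; the function names are illustrative.

```python
def icmp_kbits_per_sec(nodes: int) -> float:
    """ICMP polling traffic in Kbits/sec for `nodes` nodes (slide formula)."""
    return (0.0823 * nodes + 0.6774) * 8


def snmp_kbits_per_sec(nodes: int) -> float:
    """SNMP polling traffic in Kbits/sec for `nodes` nodes (slide formula)."""
    return (0.3949 * nodes + 2.7756) * 8


if __name__ == "__main__":
    # Example: a remote site with 100 nodes polled across the WAN.
    site_nodes = 100
    total = icmp_kbits_per_sec(site_nodes) + snmp_kbits_per_sec(site_nodes)
    print(f"ICMP:  {icmp_kbits_per_sec(site_nodes):.1f} Kbits/sec")
    print(f"SNMP:  {snmp_kbits_per_sec(site_nodes):.1f} Kbits/sec")
    print(f"Total: {total:.1f} Kbits/sec")
```

Comparing the total against the capacity of the WAN link gives a first indication of whether polling the site remotely is reasonable or whether a remote poller deserves a look.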
18. The vital signs (cont’d)
» Pollers
Polling Completion
• Reflects delays in polling; should be close to 100%.
• If lower than 100%, check the poller’s CPU and memory
Polling Rate:
• 85% means that you are approaching the maximum throughput that a poller is designed to handle
• >100% means that the poller will slow down the polling frequency to stay within its throughput limits
More details here
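The polling rate thresholds above can be folded into a small health check, for example inside a script that reviews poller statistics. This is a minimal sketch: the function name and status labels are invented here, and treating exactly 85% as the start of the warning band is an assumption.

```python
def polling_rate_status(rate_percent: float) -> str:
    """Classify a poller's polling rate using the thresholds from the slide."""
    if rate_percent < 0:
        raise ValueError("polling rate cannot be negative")
    if rate_percent > 100:
        # The poller slows its polling frequency to stay within limits.
        return "over capacity"
    if rate_percent >= 85:
        # Approaching the maximum throughput the poller is designed for.
        return "approaching limit"
    return "ok"


if __name__ == "__main__":
    for rate in (40, 85, 99.5, 120):
        print(f"{rate}% -> {polling_rate_status(rate)}")
```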
19. High Availability
» Fail-Over-Engine (FOE)
Licensed by number of passive components: main server, Additional Polling Engine, Additional Web Server
No licensing needed for the Passive instance
» Active / passive clustering
Based on a heartbeat between the 2 systems
Failure of the active member switches over to the passive member of the cluster
No need to change DNS or IP address
» Database replication
Needs to be handled at the SQL level
20. Important processes
» Role-based access controls
» Device Lifecycle
– Who, how, where, when devices are added
– Ditto for elements
– Ditto for SAM items
» Devices (and volumes, and interfaces) missing key custom property information
» Decommissioned devices
» Down devices
» Devices not SNMP polling
» Applications in “unknown” status
» Duplicate nodes
» Bad application report
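Several of the hygiene checks listed above (duplicate nodes, devices missing key custom properties) lend themselves to a scheduled script. The sketch below works on plain dictionaries, a hypothetical record format chosen for illustration rather than the actual Orion schema; in practice the records would come from an API query or a report export.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes: list[dict]) -> dict[str, list[str]]:
    """Group node names by IP address and keep IPs that appear more than once."""
    names_by_ip = defaultdict(list)
    for node in nodes:
        names_by_ip[node["ip"]].append(node["name"])
    return {ip: names for ip, names in names_by_ip.items() if len(names) > 1}


def missing_custom_properties(nodes: list[dict],
                              required: list[str]) -> dict[str, list[str]]:
    """Map each node name to the required custom properties it lacks."""
    report = {}
    for node in nodes:
        props = node.get("custom_properties", {})
        missing = [key for key in required if not props.get(key)]
        if missing:
            report[node["name"]] = missing
    return report


if __name__ == "__main__":
    # Hypothetical inventory records.
    inventory = [
        {"name": "core-sw-1", "ip": "10.0.0.1",
         "custom_properties": {"Site": "Austin"}},
        {"name": "core-sw-1-dup", "ip": "10.0.0.1",
         "custom_properties": {"Site": ""}},
        {"name": "edge-rtr-1", "ip": "10.0.0.2"},
    ]
    print(find_duplicate_nodes(inventory))
    print(missing_custom_properties(inventory, ["Site"]))
```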
21. Summary
» SolarWinds products scale from tens of devices to tens of thousands
» Support for central and distributed deployment scenarios
» Support for MSP-like deployments
» High availability
22. Summary and Q & A
» Download a free fully functional 30-day trial at solarwinds.com
» Francois Caron, Product Manager
francois.caron@solarwinds.com
» Rob Hock, Product Manager
rob.hock@solarwinds.com
» Brad Hale, Product Marketing Principal
brad.hale@solarwinds.com
» Join our community of 150,000+ IT pros at www.thwack.com
Thank you for attending!
23. Thank You!!
The SOLARWINDS and SOLARWINDS & Design marks are the exclusive property of SolarWinds
Worldwide, LLC, are registered with the U.S. Patent and Trademark Office, and may be registered or
pending registration in other countries. All other SolarWinds trademarks, service marks, and logos
may be common law marks, registered or pending registration in the United States or in other
countries. All other trademarks mentioned herein are used for identification purposes only and
may be or are trademarks or registered trademarks of their respective companies.
Editor’s Notes
Welcome to today’s webcast: Advanced Network Monitoring with SolarWinds Network Performance Monitor.
My name is Brad Hale and I am the product marketing principal for SolarWinds network management products, and I am based in Austin, TX. Joining me from Brno, Czech Republic is Michal Hrncirik, the product manager for Network Performance Monitor.
Just a few housekeeping notes before we jump into the webcast. Please ask questions. On the right side of your screen you should see a chat box that will allow you to ask questions. We will do our best to either reply back or answer the question throughout the webcast.
Lastly, today’s session is being recorded and will be available at solarwinds.com or slideshare.com. Each participant will receive a follow-on email with links to the recorded session as well as some other handy resources.
So, that’s NPM version 10.5, and Michal and I would like to thank you for joining us today. At this time, we’ll go ahead and open up for a Q&A. If you have any questions that haven’t already been answered, then please type them into the go-to-meeting chat box and we’ll do our best to answer all that we can.
One last reminder: you will each receive an email in the next day or so with a link to the recorded webcast.
Thanks again for joining us.