The document discusses the IBM Netezza data warehouse appliance. It is a purpose-built analytics engine with an integrated database, server, and storage. It provides speed that is 10-100x faster than traditional systems, simplicity with minimal administration, scalability to handle peta-scale user data capacity, and intelligence to perform advanced analytics. The appliance is used across industries like digital media, financial services, government, health and life sciences, retail, telecom, and others.
1. Turn Information into Insights
The IBM Netezza data warehouse appliance
- simplicity with maximum performance.
Dai Clegg
2. The IBM Netezza Appliance: Revolutionizing Analytics
Purpose-built analytics engine
Integrated database, server & storage
Standard interfaces
Low total cost of ownership
Speed: 10-100x faster than traditional systems
Simplicity: Minimal administration
Scalability: Peta-scale user data capacity
Smart: High-performance advanced analytics
3. The Netezza Appliance – Loading
Data in through standard interfaces: OLE-DB, JDBC, ODBC, SQL
Data Integration
IBM Information Server
Ab Initio
Business Objects/SAP
Composite Software
Expressor Software
GoldenGate Software (Oracle)
Informatica
Sunopsis (Oracle)
WisdomForce
4. The Netezza Appliance – Querying
Data out through standard interfaces: OLE-DB, JDBC, ODBC, SQL
Reporting & Analysis
Cognos (IBM)
SPSS (IBM)
Unica (IBM)
Actuate
Business Objects/SAP
Information Builders
Kalido
KXEN
MicroStrategy
Oracle OBIEE
QlikTech
Quest Software
SAS
5. Digital Media
Financial Services
Government
Health & Life Sciences
Retail / Consumer Products
Telecom
Other
6. Speed
15,000 users running 800,000+ queries per day
50X faster than before
“…when something took 24 hours I could only do so much with it,
but when something takes 10 seconds, I may be able to
completely rethink the business process…”
- SVP Application Development, Nielsen
Source: http://www.youtube.com/watch?v=yOwnX14nLrE&feature=player_embedded
7. Simplicity
Up and running 6 months before having any training
200X faster than Oracle system
ROI in less than 3 months
“Allowing the business users access to the Netezza box was what sold it.”
Steve Taff,
Executive Dir. of IT Services
8. Scalability
1 PB on Netezza
7 years of historical data
100-200% annual data growth
“NYSE … has replaced an Oracle 10 relational database with a data
warehousing appliance from Netezza, allowing it to conduct rapid searches
of 650 terabytes of data.”
ComputerWeekly.com
Source: http://www.computerweekly.com/Articles/2008/04/14/230265/NYSE-improves-data-management-with-datawarehousing.htm
9. Smart
Predicts what shoppers are likely to buy in future visits
Coupon redemption rates as high as 25%
“Because of (Netezza’s) in-database technology, we believe we'll
be able to do 600 predictive models per year (10X as many as
before) with the same staff.”
Eric Williams,
CIO and executive VP
10. IBM Netezza True Appliance Massively Parallel Processing™
[Architecture diagram. Recoverable labels:
Clients: Solaris, AIX, Tru64, HP-UX, Windows, Linux; also ETL server, source systems, high-performance loader, DBA CLI, 3rd-party apps
SMP host: front end, SQL compiler, query plan, optimize, admin, execution engine, high-speed loader/unloader, DBOS
Network fabric: carries SQL snippets from the host to the S-Blades
S-Blades (labelled 1, 2, 3 ... 960): processor & streaming DB logic; high-performance database engine for streaming joins, aggregations, sorts
Massively parallel intelligent storage]
11. Our Secret Sauce
select DISTRICT,
       PRODUCTGRP,
       sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
group by DISTRICT, PRODUCTGRP
[Diagram: a compressed slice of table MTHLY_RX_TERR_DATA streams through the FPGA core (Uncompress, Project, Restrict, Visibility) and then the CPU core (complex joins, aggregations, etc.). Project handles the select list (DISTRICT, PRODUCTGRP, sum(NRX)), Restrict handles the where clause (MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'), and the CPU core computes sum(NRX).]
12. IBM Netezza True Appliance Massively Parallel Processing™
[Same architecture diagram as slide 10, with the "Snippets" label replaced by "Consolidate": the SMP host consolidates the results returned by the S-Blades.]
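The scatter-and-consolidate flow on the two architecture slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Netezza code: the sample rows, the round-robin distribution, and the names `distribute`, `blade_snippet`, `consolidate` and `run_query` are all hypothetical. Each simulated S-Blade filters and partially aggregates its own slice of the table from the "Our Secret Sauce" query, and the host adds the partial sums together.

```python
from collections import Counter

# Hypothetical sample rows: (DISTRICT, PRODUCTGRP, NRX, MONTH, MARKET, SPECIALTY).
ROWS = [
    ("D1", "P1", 10, "20091201", 509123, "GASTRO"),
    ("D1", "P1", 5, "20091201", 509123, "GASTRO"),
    ("D2", "P1", 4, "20091201", 509123, "GASTRO"),
    ("D1", "P2", 7, "20091201", 509123, "CARDIO"),  # wrong specialty
    ("D2", "P1", 6, "20091101", 509123, "GASTRO"),  # wrong month
    ("D2", "P2", 8, "20091201", 509123, "GASTRO"),
]


def distribute(rows, n_slices):
    """Spread rows across slices round-robin (an assumed distribution policy)."""
    slices = [[] for _ in range(n_slices)]
    for i, row in enumerate(rows):
        slices[i % n_slices].append(row)
    return slices


def blade_snippet(data_slice):
    """Work done next to one slice: apply the WHERE clause, then sum NRX per group."""
    partial = Counter()
    for district, productgrp, nrx, month, market, specialty in data_slice:
        if month == "20091201" and market == 509123 and specialty == "GASTRO":
            partial[(district, productgrp)] += nrx
    return partial


def consolidate(partials):
    """Host-side step: merge the partial sums returned by every blade."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return dict(total)


def run_query(rows, n_slices):
    """Scatter the rows, run the snippet on each slice, consolidate on the host."""
    return consolidate(blade_snippet(s) for s in distribute(rows, n_slices))


if __name__ == "__main__":
    print(run_query(ROWS, n_slices=3))
```

Because each partial sum depends only on the rows in its own slice, the consolidated answer is the same whatever the number of slices; only the amount of work per blade changes.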
13. The IBM Netezza Appliance: Revolutionizing Analytics
Digital Media
Financial Services
Government
Health & Life Sciences
Retail / Consumer Products
Telecom
Other
Speed: 10-100x faster than traditional systems
Simplicity: Minimal administration
Scalability: Peta-scale user data capacity
Smart: High-performance advanced analytics
Editor's Notes
That’s exactly what the Netezza TwinFin is in the data warehousing and analytics world: a true appliance, which sets it apart from the competition. It is engineered from the ground up for data warehousing and analytics, and offers a complete solution that integrates database, server and storage. It supports standard interfaces such as ODBC, JDBC and ANSI SQL, making it very easy to deploy. It takes 2 days to get up and running, versus weeks for other solutions.
The appliance characteristics translate into real value for customers (what we refer to as the 4 S’s).
TwinFin is 10-100X faster than competitors like Oracle, Teradata and others. When analytic queries take seconds instead of hours to perform, customers get the opportunity to completely rethink their business processes and, in some cases, even launch entirely new businesses.
The appliance is unlike anything that DBAs and IT teams have experienced in the past. Whereas Oracle and Teradata data warehouses require armies of specialists to manage, Netezza offers performance out of the box, without requiring any tuning, indexing, aggregations, etc.
A single appliance scales to more than a petabyte of user data capacity, not just acting as a repository for information, but allowing complex analytics to be conducted at scale, on all the enterprise data.
By embedding analytics deep into the data warehouse, TwinFin powers high-performance advanced analytics hundreds or even thousands of times faster than was possible before.
Let’s look at some examples of Netezza (true) appliances in real customer environments.
A company is judged by the company it keeps. Those were just a few examples from over 500 Netezza customers. Our customers span a variety of vertical industries and sizes.
Predictability
XO Communications offers a variety of communications services including voice over internet protocol (VoIP), data and internet services, network transport, broadband wireless access, and hosted and managed services. Its high capacity IP network and advanced transport network support more than 50 percent of the Fortune 500 and many of the world’s largest telecommunications companies.
A key component of Netezza’s performance is the way in which its streaming architecture processes data. The Netezza architecture uniquely uses the FPGA as a turbocharger: a huge performance accelerator that not only allows the system to keep up with the data stream, but actually accelerates the data stream through compression before processing it at line rates, ensuring no bottlenecks in the I/O path.
You can think of the way that data streaming works in the Netezza as similar to an assembly line. The Netezza assembly line has various stages in the FPGA and CPU cores. Each of these stages, along with the disk and network, operates concurrently, processing different chunks of the data stream at any given point in time. The concurrency within each data stream further increases performance relative to other architectures.
Compressed data gets streamed from disk onto the assembly line at the fastest rate that the physics of the disk will allow. The data could also be cached, in which case it gets served right from memory instead of disk. The first stage in the assembly line, the Compress Engine within the FPGA core, picks up the data block and uncompresses it at wire speed, instantly transforming each block on disk into 4-8 blocks in memory. The result is a significant speedup of the slowest component in any data warehouse: the disk.
The uncompressed block is then passed on to the Project engine or stage, which filters out columns based on parameters specified in the SELECT clause of the SQL query being processed. The assembly line then moves the data block to the Restrict engine, which strips off rows that are not necessary to process the query, based on restrictions specified in the WHERE clause. The Visibility engine also feeds additional parameters to the Restrict engine, to filter out rows that should not be “seen” by a query, e.g. rows belonging to a transaction that is not committed yet. The Visibility engine is critical in maintaining ACID (Atomicity, Consistency, Isolation and Durability) compliance at streaming speeds in the Netezza.
The processor core picks up the uncompressed, filtered data block and performs fundamental database operations such as sorts, joins and aggregations on it. It also applies complex algorithms that are embedded in the snippet code for advanced analytics processing. It finally assembles all the intermediate results together from the entire data stream and produces a result for the snippet. The result is then sent over the network fabric to other S-Blades or the host, as directed by the snippet code.
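The assembly line described in these notes can be imitated with chained Python generators. This is a rough sketch under stated assumptions and not how the appliance is implemented: `zlib` and JSON stand in for the on-disk compression format, the `committed` flag stands in for transaction visibility metadata, the `NOTES` column and all function names are hypothetical, and the stages run lazily one row at a time rather than concurrently in hardware. One simplification is worth flagging: here the Project stage keeps the columns the WHERE clause needs as well as the SELECT columns, so that the later Restrict stage can still test them.

```python
import json
import zlib
from collections import defaultdict


def make_row(district, productgrp, nrx, month, specialty, committed=True):
    """Build one hypothetical table row, including a column the query never reads."""
    return {"DISTRICT": district, "PRODUCTGRP": productgrp, "NRX": nrx,
            "MONTH": month, "MARKET": 509123, "SPECIALTY": specialty,
            "NOTES": "unused column", "committed": committed}


SAMPLE = [
    make_row("D1", "P1", 10, "20091201", "GASTRO"),
    make_row("D1", "P1", 5, "20091201", "GASTRO"),
    make_row("D1", "P2", 7, "20091201", "CARDIO"),                   # wrong specialty
    make_row("D2", "P1", 3, "20091201", "GASTRO", committed=False),  # not visible
    make_row("D2", "P1", 4, "20091201", "GASTRO"),
    make_row("D2", "P2", 9, "20091101", "GASTRO"),                   # wrong month
]

NEEDED = ("DISTRICT", "PRODUCTGRP", "NRX", "MONTH", "MARKET", "SPECIALTY", "committed")


def stored_blocks(rows, rows_per_block=2):
    """Stand-in for compressed disk blocks."""
    for start in range(0, len(rows), rows_per_block):
        chunk = rows[start:start + rows_per_block]
        yield zlib.compress(json.dumps(chunk).encode("utf-8"))


def uncompress(blocks):
    """FPGA stage 1: expand each compressed block back into rows."""
    for block in blocks:
        yield from json.loads(zlib.decompress(block).decode("utf-8"))


def project(rows, columns=NEEDED):
    """FPGA stage 2: drop columns the query does not reference."""
    for row in rows:
        yield {name: row[name] for name in columns}


def restrict(rows):
    """FPGA stage 3: keep visible (committed) rows that satisfy the WHERE clause."""
    for row in rows:
        if (row["committed"] and row["MONTH"] == "20091201"
                and row["MARKET"] == 509123 and row["SPECIALTY"] == "GASTRO"):
            yield row


def aggregate(rows):
    """CPU stage: sum NRX per (DISTRICT, PRODUCTGRP)."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["DISTRICT"], row["PRODUCTGRP"])] += row["NRX"]
    return dict(totals)


def run_pipeline(rows):
    """Chain the stages in the order the notes describe."""
    return aggregate(restrict(project(uncompress(stored_blocks(rows)))))
```

Calling `run_pipeline(SAMPLE)` streams the six sample rows through all four stages; only the three committed GASTRO rows for December 2009 reach the aggregation step, which mirrors the point of the notes that most data is discarded before the CPU ever sees it.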