SlideShare uma empresa Scribd logo
1 de 34
Baixar para ler offline
Top 5 Factors to Consider When
Choosing a Big Data Solution
 Robin Schumacher, VP Products


©2012 DataStax                   1
•  VP Products, DataStax
    •  Director of Product Management MySQL, then
       EnterpriseDB
    •  VP Product Management at Embarcadero
       Technologies
    •  DBA with Oracle, Teradata, SQL Server, DB2,
       others…
    •  Database software reviewer for various
       magazines
    •  Author of 3 database books

©2012 DataStax                                       2
•  Define big data
       •  Identify “must have’s” of a big data solution
       •  Discuss difficulty in getting all of them from a
          business and technical perspective
       •  Brief tour of NoSQL, Cassandra and DataStax
          Enterprise




©2012 DataStax                                             3
What big data is and the
                 domains of data that need to
                 be considered.




©2012 DataStax                                  4
©2012 DataStax   5
“Big data technologies describe a new generation of technologies and
    architectures, designed to economically extract value from very large volumes of a
    wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.”



     "Big data is data that exceeds the processing capacity of conventional database
     systems. The data is too big, moves too fast, or doesn't fit the strictures of your
     database architectures. To gain value from this data, you must choose an
     alternative way to process it."



    ”Datasets whose size is beyond the ability of typical database software tools to
    capture, store, manage, and analyze "



      * All definitions have one thing in common: new technology is needed for big data…

©2012 DataStax                                                                              6
1.  Real-time – transactional, online, streaming, low latency
        data
    2.  Analytic – aggregated data from real-time feeds or other
        sources; many times batch in nature
    3.  Search – supporting data, both external and internal, used
        for locating desired information and/or objects (e.g.
        products, documents, etc.)




©2012 DataStax                                                       7
Research done by McKinsey & Company shows the eye-opening, 10-year
          category growth rate differences between businesses that smartly use
          their big data and those that do not.


©2012 DataStax                                                                  8
What are the top five things to
                 consider in a big data
                 solution?




©2012 DataStax                                     9
©2012 DataStax   10
The characteristics that define big data are:

    1.  Velocity – includes the speed at which data comes in, and
        the number of events/elements being stored
    2.  Variety – involves structured, semi-structured, unstructured
        data
    3.  Volume – can equate to TB-PB’s of data
    4.  Complexity – typically entails the difficulty distributing the
        data (e.g. multi-data centers, cloud, etc.) and managing the
        data traffic/movement (e.g. ETL, migrations, etc.)




©2012 DataStax                                                         11
•  Data has high rate of input
         •  Data has large quantity of elements/events



                 • Sensor data
                 • Media streaming
                 • Mobile devices
                 • Financial streams
                 • Web clickstream
                 • Traffic monitoring
                 • Patient care




©2012 DataStax                                           12
•  Includes structured, semi, and unstructured
         •  Necessitates new data model and file formats
         •  Involves, real-time, analytic, and search data




©2012 DataStax                                               13
•  TB’s to PB’s
         •  Also involves data maintenance functions (e.g.
            purging, etc.)




©2012 DataStax                                               14
The McKinsey report found that the average investment firm with fewer than 1,000 employees has
      3.8 petabytes of data stored, experiences a data growth rate of 40 percent per year, and stores
      structured, semi-structured, and unstructured data. Overall, McKinsey found that 15 out of 17
      industry sectors in the United States have more data stored per company than the U.S. Library of
      Congress (which had 235 terabytes of information at the time of McKinsey’s study)

©2012 DataStax                                                                                           15
•  Typically involves data distribution, movement,
            etc., across multiple data centers and
            geographies
         •  Can be on-premise, cloud, or hybrid




©2012 DataStax                                                16
Getting a big data technology that provides two out of three can be
       challenging; finding one that supplies all three can be very hard.

©2012 DataStax                                                               17
NoSQL, Cassandra, and
                 DataStax Enterprise for big
                 data.




©2012 DataStax                                 18
NoSQL is a broad class of next-generation database management
        systems that differ from the classic model of the relational database
        management system (RDBMS) in some significant ways, most important
        being they:

         •       Sport a less-rigid, more dynamic data model
         •       Look to provide user controlled trade-off’s to the CAP theorem
         •       Do not support ANSI SQL or operations such as joins
         •       Attempt to solve some or all of the challenges of big data




©2012 DataStax                                                                   19
A NoSQL solution like Apache Cassandra:
         •  Handles high velocity data with ease
         •  Uses schema that support broad varieties of data
         •  Scales from GB’s to PB’s with linear performance capabilities
         •  Is built to handle multi-location/data center use cases
         •  Is designed for continuous availability
         •  Offers quick installation and configuration for multi-node
            clusters
         •  Is open source and/or cost 80-90% less than RDBMS’s




©2012 DataStax                                                              20
Overview of DataStax
        •  Founded in April 2010
        •  Commercial leader in Apache Cassandra™, the
           popular open-source “big data” database
        •  140+ customers
        •  40+ employees
        •  Home to Apache Cassandra Chair & most
           committers
        •  Headquartered in San Francisco Bay area
        •  Funded by prominent venture firms




©2012 DataStax                                           21
* Uses Cassandra and Hadoop for data management
©2012 DataStax                                          22
Cassandra is:
    Nearly 4x better in writes
    Nearly 2x better in reads
    Over 12x better in reads/updates




    YCSB Benchmark
    Source: http://blog.cubrid.org/dev-platform/nosql-benchmarking/?utm_source=NoSQL+Weekly+List&utm_campaign=143fae86b2-
    NoSQL_Weekly_Issue_41_September_8_2011&utm_medium=email

©2012 DataStax                                                                                                              23
Stores financial options tick data into very fluid data model for storage and
                 analysis into Cassandra.



©2012 DataStax                                                                                   24
“The hundreds of millions of web pages that contain this information are
                 stored in a multi-terabyte cache that grows continually as we crawl the
                 web, analyzing new pages and finding new versions of existing pages.” –
                 Zoominfo Architect on using Cassandra

©2012 DataStax                                                                              25
“I can create a Cassandra cluster in any region of the world in 10
                 minutes. When marketing guys decide we want to move into a
                 certain part of the world, we’re ready.” - Netflix architect

©2012 DataStax                                                                        26
•       Fully integrated smart big data platform
         •       Production certified Cassandra
         •       Continuously available analytics with Hadoop
         •       Scalable enterprise search with Solr
         •       Built in workload isolation
         •       No costly and error-prone ETL operations
         •       Easy migration of RDBMS and log data
         •       Simple to install and grow
         •       OpsCenter management solution
         •       80-90% less cost than RDBMS vendors




©2012 DataStax                                                  27
•  DataStax OpsCenter is a visual management and
           monitoring solution for DataStax Enterprise
        •  Manage and monitor all Cassandra and Hadoop and Solr
           operations
        •  Visual alerts and notifications




©2012 DataStax                                                    28
1.  Does it handle high data velocity?
        2.  Can it tackle all types of data?
        3.  How well does it perform with large data volumes?
        4.  Can it handle complex distribution and implementation
            use cases (e.g. on-premise/cloud, multi-geo)?
        5.  How does it stack up in hitting the big data “bulls
            eye?” (i.e. cost, saleable performance, and operational
            ease are concerned)?




©2012 DataStax                                                        29
DataStax Enterprise is tailor made for high-velocity, multi-variety, large
       volume, and complex deployment use cases that involve big data.




©2012 DataStax                                                                      30
Recommended Reading




                 http://www.datastax.com/resources/whitepapers

©2012 DataStax                                                   31
Next Steps
         Download DataStax Enterprise and try it in your
         own environment.

          ›  Go to www.datastax.com/
              software
          ›  Download a copy of DataStax
              Enterprise
          ›  Installs and configures in
              minutes
          ›  Completely free for
              development use




©2012 DataStax                                             32
For More Information




©2012 DataStax                  33
Move Faster.




©2012 DataStax                  34

Mais conteúdo relacionado

Mais procurados

Node.js and The Internet of Things
Node.js and The Internet of ThingsNode.js and The Internet of Things
Node.js and The Internet of ThingsLosant
 
Internet of things (IOT) connects physical to digital
Internet of things (IOT) connects physical to digitalInternet of things (IOT) connects physical to digital
Internet of things (IOT) connects physical to digitalEslam Nader
 
Koha intro presentation
Koha intro presentationKoha intro presentation
Koha intro presentationHeather Braum
 
Decentralized Autonomous Organizations.pptx
Decentralized Autonomous Organizations.pptxDecentralized Autonomous Organizations.pptx
Decentralized Autonomous Organizations.pptxShawn L. Grubb MBA PMP
 
Ethereum Blockchain with Smart contract and ERC20
Ethereum Blockchain with Smart contract and ERC20Ethereum Blockchain with Smart contract and ERC20
Ethereum Blockchain with Smart contract and ERC20Truong Nguyen
 
Supporting trade finance with letters of credit on corda
Supporting trade finance with letters of credit on cordaSupporting trade finance with letters of credit on corda
Supporting trade finance with letters of credit on cordaR3
 
Bitcoin - Understanding and Assessing potential Opportunities
Bitcoin - Understanding and Assessing potential OpportunitiesBitcoin - Understanding and Assessing potential Opportunities
Bitcoin - Understanding and Assessing potential OpportunitiesQuasarVentures
 
Unit 1 IoT Fundamentals.pdf
Unit 1 IoT Fundamentals.pdfUnit 1 IoT Fundamentals.pdf
Unit 1 IoT Fundamentals.pdfZoyaAli844417
 
Distributed Ledger Technology PowerPoint Presentation Slides
Distributed Ledger Technology PowerPoint Presentation SlidesDistributed Ledger Technology PowerPoint Presentation Slides
Distributed Ledger Technology PowerPoint Presentation SlidesSlideTeam
 
Nonrelational Databases
Nonrelational DatabasesNonrelational Databases
Nonrelational DatabasesUdi Bauman
 
Cloud of things (IoT + Cloud Computing)
Cloud of things (IoT + Cloud Computing)Cloud of things (IoT + Cloud Computing)
Cloud of things (IoT + Cloud Computing)Zakaria Hossain
 

Mais procurados (20)

Node.js and The Internet of Things
Node.js and The Internet of ThingsNode.js and The Internet of Things
Node.js and The Internet of Things
 
Internet of things (IOT) connects physical to digital
Internet of things (IOT) connects physical to digitalInternet of things (IOT) connects physical to digital
Internet of things (IOT) connects physical to digital
 
Digital library
Digital libraryDigital library
Digital library
 
Ebook and ereaders
Ebook and ereadersEbook and ereaders
Ebook and ereaders
 
Koha intro presentation
Koha intro presentationKoha intro presentation
Koha intro presentation
 
Decentralized Autonomous Organizations.pptx
Decentralized Autonomous Organizations.pptxDecentralized Autonomous Organizations.pptx
Decentralized Autonomous Organizations.pptx
 
Ethereum Blockchain with Smart contract and ERC20
Ethereum Blockchain with Smart contract and ERC20Ethereum Blockchain with Smart contract and ERC20
Ethereum Blockchain with Smart contract and ERC20
 
Architectural reference model
Architectural reference modelArchitectural reference model
Architectural reference model
 
Blockchain
BlockchainBlockchain
Blockchain
 
Supporting trade finance with letters of credit on corda
Supporting trade finance with letters of credit on cordaSupporting trade finance with letters of credit on corda
Supporting trade finance with letters of credit on corda
 
Bitcoin - Understanding and Assessing potential Opportunities
Bitcoin - Understanding and Assessing potential OpportunitiesBitcoin - Understanding and Assessing potential Opportunities
Bitcoin - Understanding and Assessing potential Opportunities
 
NFT Explained
NFT ExplainedNFT Explained
NFT Explained
 
Git and Github Session
Git and Github SessionGit and Github Session
Git and Github Session
 
Unit 1 IoT Fundamentals.pdf
Unit 1 IoT Fundamentals.pdfUnit 1 IoT Fundamentals.pdf
Unit 1 IoT Fundamentals.pdf
 
IOT Unit 1.pptx
IOT Unit 1.pptxIOT Unit 1.pptx
IOT Unit 1.pptx
 
Blockchain
BlockchainBlockchain
Blockchain
 
Distributed Ledger Technology PowerPoint Presentation Slides
Distributed Ledger Technology PowerPoint Presentation SlidesDistributed Ledger Technology PowerPoint Presentation Slides
Distributed Ledger Technology PowerPoint Presentation Slides
 
Folksonomies & social tagging
Folksonomies & social taggingFolksonomies & social tagging
Folksonomies & social tagging
 
Nonrelational Databases
Nonrelational DatabasesNonrelational Databases
Nonrelational Databases
 
Cloud of things (IoT + Cloud Computing)
Cloud of things (IoT + Cloud Computing)Cloud of things (IoT + Cloud Computing)
Cloud of things (IoT + Cloud Computing)
 

Semelhante a Top 5 Factors to Consider in Big Data Solutions

The Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionThe Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionDATAVERSITY
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 
Big data - Cassandra
Big data - CassandraBig data - Cassandra
Big data - CassandraJen Wei Lee
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Getting Big Value from Big Data
Getting Big Value from Big DataGetting Big Value from Big Data
Getting Big Value from Big DataDataStax
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Denodo
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...DataStax
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Denodo
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItDenodo
 
LinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchLinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchSheetal Pratik
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeDenodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLTushar Shende
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Denodo
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Denodo
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 

Semelhante a Top 5 Factors to Consider in Big Data Solutions (20)

The Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionThe Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data Solution
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Big data - Cassandra
Big data - CassandraBig data - Cassandra
Big data - Cassandra
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Getting Big Value from Big Data
Getting Big Value from Big DataGetting Big Value from Big Data
Getting Big Value from Big Data
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Speak to Your Data
Speak to Your DataSpeak to Your Data
Speak to Your Data
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need It
 
LinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchLinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbench
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data Lake
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQL
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 

Mais de DataStax

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?DataStax
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...DataStax
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsDataStax
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphDataStax
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...DataStax
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache KafkaDataStax
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0DataStax
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...DataStax
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesDataStax
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudDataStax
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceDataStax
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...DataStax
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...DataStax
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...DataStax
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)DataStax
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsDataStax
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingDataStax
 

Mais de DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Último

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 

Último (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Top 5 Factors to Consider in Big Data Solutions

  • 1. Top 5 Factors to Consider When Choosing a Big Data Solution Robin Schumacher, VP Products ©2012 DataStax 1
  • 2. •  VP Products, DataStax •  Director of Product Management MySQL, then EnterpriseDB •  VP Product Management at Embarcadero Technologies •  DBA with Oracle, Teradata, SQL Server, DB2, others… •  Database software reviewer for various magazines •  Author of 3 database books ©2012 DataStax 2
  • 3. •  Define big data •  Identify “must have’s” of a big data solution •  Discuss difficulty in getting all of them from a business and technical perspective •  Brief tour of NoSQL, Cassandra and DataStax Enterprise ©2012 DataStax 3
  • 4. What big data is and the domains of data that need to be considered. ©2012 DataStax 4
  • 6. “Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.” "Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it." ”Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze " * All definitions have one thing in common: new technology is needed for big data… ©2012 DataStax 6
  • 7. 1.  Real-time – transactional, online, streaming, low latency data 2.  Analytic – aggregated data from real-time feeds or other sources; many times batch in nature 3.  Search – supporting data, both external and internal, used for locating desired information and/or objects (e.g. products, documents, etc.) ©2012 DataStax 7
  • 8. Research done by McKinsey & Company shows the eye-opening, 10-year category growth rate differences between businesses that smartly use their big data and those that do not. ©2012 DataStax 8
  • 9. What are the top five things to consider in a big data solution? ©2012 DataStax 9
  • 11. The characteristics that define big data are: 1.  Velocity – includes the speed at which data comes in, and the number of events/elements being stored 2.  Variety – involves structured, semi-structured, unstructured data 3.  Volume – can equate to TB-PB’s of data 4.  Complexity – typically entails the difficulty distributing the data (e.g. multi-data centers, cloud, etc.) and managing the data traffic/movement (e.g. ETL, migrations, etc.) ©2012 DataStax 11
  • 12. •  Data has high rate of input •  Data has large quantity of elements/events • Sensor data • Media streaming • Mobile devices • Financial streams • Web clickstream • Traffic monitoring • Patient care ©2012 DataStax 12
  • 13. •  Includes structured, semi, and unstructured •  Necessitates new data model and file formats •  Involves, real-time, analytic, and search data ©2012 DataStax 13
  • 14. •  TB’s to PB’s •  Also involves data maintenance functions (e.g. purging, etc.) ©2012 DataStax 14
  • 15. The McKinsey report found that the average investment firm with fewer than 1,000 employees has 3.8 petabytes of data stored, experiences a data growth rate of 40 percent per year, and stores structured, semi-structured, and unstructured data. Overall, McKinsey found that 15 out of 17 industry sectors in the United States have more data stored per company than the U.S. Library of Congress (which had 235 terabytes of information at the time of McKinsey’s study) ©2012 DataStax 15
  • 16. •  Typically involves data distribution, movement, etc., across multiple data centers and geographies •  Can be on-premise, cloud, or hybrid ©2012 DataStax 16
  • 17. Getting a big data technology that provides two out of three can be challenging; finding one that supplies all three can be very hard. ©2012 DataStax 17
  • 18. NoSQL, Cassandra, and DataStax Enterprise for big data. ©2012 DataStax 18
  • 19. NoSQL is a broad class of next-generation database management systems that differ from the classic model of the relational database management system (RDBMS) in some significant ways, most important being they: •  Sport a less-rigid, more dynamic data model •  Look to provide user controlled trade-off’s to the CAP theorem •  Do not support ANSI SQL or operations such as joins •  Attempt to solve some or all of the challenges of big data ©2012 DataStax 19
  • 20. A NoSQL solution like Apache Cassandra: •  Handles high velocity data with ease •  Uses schema that support broad varieties of data •  Scales from GB’s to PB’s with linear performance capabilities •  Is built to handle multi-location/data center use cases •  Is designed for continuous availability •  Offers quick installation and configuration for multi-node clusters •  Is open source and/or cost 80-90% less than RDBMS’s ©2012 DataStax 20
  • 21. Overview of DataStax •  Founded in April 2010 •  Commercial leader in Apache Cassandra™, the popular open-source “big data” database •  140+ customers •  40+ employees •  Home to Apache Cassandra Chair & most committers •  Headquartered in San Francisco Bay area •  Funded by prominent venture firms ©2012 DataStax 21
  • 22. * Uses Cassandra and Hadoop for data management ©2012 DataStax 22
  • 23. Cassandra is: Nearly 4x better in writes Nearly 2x better in reads Over 12x better in reads/updates YCSB Benchmark Source: http://blog.cubrid.org/dev-platform/nosql-benchmarking/?utm_source=NoSQL+Weekly+List&utm_campaign=143fae86b2- NoSQL_Weekly_Issue_41_September_8_2011&utm_medium=email ©2012 DataStax 23
  • 24. Stores financial options tick data into very fluid data model for storage and analysis into Cassandra. ©2012 DataStax 24
  • 25. “The hundreds of millions of web pages that contain this information are stored in a multi-terabyte cache that grows continually as we crawl the web, analyzing new pages and finding new versions of existing pages.” – Zoominfo Architect on using Cassandra ©2012 DataStax 25
  • 26. “I can create a Cassandra cluster in any region of the world in 10 minutes. When marketing guys decide we want to move into a certain part of the world, we’re ready.” - Netflix architect ©2012 DataStax 26
  • 27. •  Fully integrated smart big data platform •  Production certified Cassandra •  Continuously available analytics with Hadoop •  Scalable enterprise search with Solr •  Built in workload isolation •  No costly and error-prone ETL operations •  Easy migration of RDBMS and log data •  Simple to install and grow •  OpsCenter management solution •  80-90% less cost than RDBMS vendors ©2012 DataStax 27
  • 28. •  DataStax OpsCenter is a visual management and monitoring solution for DataStax Enterprise •  Manage and monitor all Cassandra and Hadoop and Solr operations •  Visual alerts and notifications ©2012 DataStax 28
  • 29. 1.  Does it handle high data velocity? 2.  Can it tackle all types of data? 3.  How well does it perform with large data volumes? 4.  Can it handle complex distribution and implementation use cases (e.g. on-premise/cloud, multi-geo)? 5.  How does it stack up in hitting the big data “bulls eye?” (i.e. cost, saleable performance, and operational ease are concerned)? ©2012 DataStax 29
  • 30. DataStax Enterprise is tailor made for high-velocity, multi-variety, large volume, and complex deployment use cases that involve big data. ©2012 DataStax 30
  • 31. Recommended Reading http://www.datastax.com/resources/whitepapers ©2012 DataStax 31
  • 32. Next Steps Download DataStax Enterprise and try it in your own environment. ›  Go to www.datastax.com/ software ›  Download a copy of DataStax Enterprise ›  Installs and configures in minutes ›  Completely free for development use ©2012 DataStax 32