Mike Miller, CoFounder, Chief Scientist

@mlmilleratmit
mike@cloudant.com
My Background

Cloudant CoFounder, Chief Scientist


Assistant Professor, Particle Physics
(U. Washington, Affiliate)

Background: machine learning,
analysis, big data, globally distributed
systems




                             Cloudant, 9-26-2012   2
The face of big data




   http://abstract.cs.washington.edu/~shwetak/
The face of big data




The face of big data




 “The future is stranger and sooner than you think”
              Reid Hoffman, LinkedIn/Greylock
Perfect Storm

•   Big Data
•   Parallel Processing
•   HTML5/JS
•   Mobile
•   9M Trained Developers
Focus on your Application
   not data operations




If your data is stuck in the warehouse...
            ... you’re losing




Data Layer for the Web
Founded (2009) by leading MIT data
scientists

Funded by Y Combinator & Avalon

Global network of 20+ data centers
-- Application Data Network (ADN)

Built on leading NoSQL standard:
most durable data store on planet

10,000 users and growing.

         Cloudant: Akamai of dynamic content
Cloudant Product Line
•   Application State
    Hyper-Scalable Document Store (JSON+HTTP)
    MVCC
    Secondary indexes for flexible query

•   Application Data Security
    Accounts/API keys, data sharing, permission roles

•   Application Analytics
    Fully Integrated (Incremental) MapReduce engine

•   Application Search
    Fully Integrated (Incremental) Lucene + Geospatial

•   Application Object Storage
    images, audio, video...

•   Application State Distribution
    cloud <==> tablet <==> PC <==> mobile

API Compatible
Cloudant Install

  You do this:




  We give you:


                 That’s It

API Examples




     Write a doc...from the browser
      No client install necessary
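
The screenshot for this slide did not survive extraction. Since the document store is described elsewhere in the deck as JSON+HTTP, a write can be sketched as a single HTTP PUT with a JSON body. The sketch below is in Python rather than browser JavaScript, and it only builds the request without sending it. The `/{db}/{doc_id}` URL layout is an assumption based on CouchDB-style HTTP APIs; the account URL, database name, and document are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def build_put_doc_request(base_url, db, doc_id, doc):
    """Build (but do not send) an HTTP PUT that stores `doc` under `doc_id`."""
    # Assumed URL layout: {base_url}/{db}/{doc_id}
    url = f"{base_url}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical account, database, and document, for illustration only.
req = build_put_doc_request(
    "https://example-account.cloudant.com",
    "guestbook",
    "visitor-001",
    {"name": "Ada", "message": "hello"},
)
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)` against a real account with credentials; nothing beyond an HTTP client is needed on the caller's side, which is the point the slide makes.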



API Examples
Create Secondary Indexes




Query Those Indexes
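
The two screenshots on this slide are missing. As a rough sketch of both steps, assuming a CouchDB-style view API: a secondary index is defined by a design document holding a JavaScript map function, and it is queried with a GET on the view's URL. The design document name `lookup` echoes the example URL in this deck; the view name `by_name`, the field `name`, and the account URL are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_design_doc(ddoc, view_name, map_source):
    """Return a design document that defines one view (secondary index)."""
    return {
        "_id": f"_design/{ddoc}",
        "views": {view_name: {"map": map_source}},
    }


def build_view_query_url(base_url, db, ddoc, view_name, key):
    """Return the URL that looks up `key` in the view."""
    # View keys are passed JSON-encoded in the query string.
    query = urlencode({"key": json.dumps(key)})
    return f"{base_url}/{db}/_design/{ddoc}/_view/{view_name}?{query}"


# Hypothetical map function: index every document by its `name` field.
MAP_BY_NAME = "function (doc) { if (doc.name) { emit(doc.name, null); } }"

design_doc = build_design_doc("lookup", "by_name", MAP_BY_NAME)
query_url = build_view_query_url(
    "https://example-account.cloudant.com", "guestbook", "lookup", "by_name", "Ada"
)
print(json.dumps(design_doc, indent=2))
print(query_url)
```

Under these assumptions, the design document would be stored like any other document, and the query is an ordinary HTTP GET.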




http://examples.cloudant.com/lobby-search/_design/lookup/index.html
Global Data Network




 Cloudant scales within & between data centers
 Availability, low-latency


Anatomy of the Data Layer

[Diagram labels: PUT {document}; US-EAST; “Node”; Single-tenant cluster; Multi-tenant cluster; Filtered Replication & Sync; Secondary Data Centers (for DR & distributed access): AP-JP, EU-NL; Disconnected Devices; Edge Database Cluster; Single-Tenant or Multi-Tenant]

Horizontally Scalable DB
•   Fault tolerant
•   Always consistent
•   Schemaless (NoSQL)
•   Automatic sharding
•   Distributed, parallel analytics
•   Incremental, chainable MapReduce
•   Full-text search
https://cloudant.com/blog/cloudant-labs-on-google-spanner/

Why It Matters



>1. Visualization Wins




   http://sosolimited.com/blog/2012/07/from-tweets-to-lightshow/
>2. Prepare For Success




 Three #1 apps, from 6 to 90 servers in weeks
>3. Scale Invariance




>3. Scale Invariance

mobile/tablet
desktop
Cloud

Goal: Megabytes to Petabytes
>3. Scale Invariance




             ‘Carry Small, Live Large’
single user experience at vastly different scales

>4. No Preferred Frame




 So why do you have a global ‘write master’?
>4. No Preferred Frame
This simple document...




...establishes Continuous Pipe from Europe to US
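
The document shown on the slide is missing from this transcript. A minimal sketch of what such a document might look like, assuming CouchDB-style replication fields (`source`, `target`, `continuous`); the two database URLs are hypothetical stand-ins for a European and a US database.

```python
import json


def build_replication_doc(source_url, target_url, continuous=True):
    """Return a replication document that copies changes from source to target."""
    return {
        "source": source_url,
        "target": target_url,
        # continuous=True keeps the pipe open instead of copying once.
        "continuous": continuous,
    }


# Hypothetical database URLs, for illustration only.
eu_to_us = build_replication_doc(
    "https://example-eu.cloudant.com/appdata",
    "https://example-us.cloudant.com/appdata",
)
print(json.dumps(eu_to_us, indent=2))
```

In CouchDB-style systems a document like this is typically saved to a `_replicator` database or posted to a `_replicate` endpoint; which of the two the slide used is not recoverable from the text. Swapping the two arguments would describe the opposite direction.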


>4. No Preferred Frame

And you can do the reverse...




                                        ...at the same time


>4. No Preferred Frame




         Write local, live global
What could you do with relaxed constraints?
>4. No Preferred Frame

[Figure: Data Import. Actual customer data, France to Amsterdam. Curves: Data Size [GB], Disk Size [GB], Documents [M]. Left axis: Size [GB], 0 to 18; right axis: Doc Count [Million], 0 to 18; horizontal axis: Time [sec], 0 to 14000.]

One click (continuous) Import
Big and Getting Bigger




Big and Getting Bigger
• And of course, we are hiring
 Languages
 erlang, scala, c, javascript, python, clojure, html5, iOS, Android, ruby/chef

 Sample problems in the Seattle office

 Create file format optimized for (huge) structured time-series data
 Integrate Cubism into two-tier application stack
 Profile creation of 100M databases (real customer)
 PIG / HIVE integration
 Prototype read-in-place Hadoop connector




More Related Content

What's hot

Prepare Your Data For The Cloud
Prepare Your Data For The CloudPrepare Your Data For The Cloud
Prepare Your Data For The CloudIndicThreads
 
SQL Server Managing Test Data & Stress Testing January 2011
SQL Server Managing Test Data & Stress Testing January 2011SQL Server Managing Test Data & Stress Testing January 2011
SQL Server Managing Test Data & Stress Testing January 2011Mark Ginnebaugh
 
Distributed parallel architecture for big data
Distributed parallel architecture for big dataDistributed parallel architecture for big data
Distributed parallel architecture for big datakamicool13
 
The Database Environment Chapter 13
The Database Environment Chapter 13The Database Environment Chapter 13
The Database Environment Chapter 13Jeanie Arnoco
 
Swib12 workshop lod_beginners
Swib12 workshop lod_beginnersSwib12 workshop lod_beginners
Swib12 workshop lod_beginnersdr0i
 
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...IRJET Journal
 
Scalable data pipeline
Scalable data pipelineScalable data pipeline
Scalable data pipelineGreenM
 
Co existence or Competition ? - RDBMS and Hadoop
Co existence or Competition ? - RDBMS and HadoopCo existence or Competition ? - RDBMS and Hadoop
Co existence or Competition ? - RDBMS and HadoopFlytxt
 
Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases IJECEIAES
 
Research on big data
Research on big dataResearch on big data
Research on big dataRoby Chen
 
Product overview 6.0 v.1.0
Product overview 6.0 v.1.0Product overview 6.0 v.1.0
Product overview 6.0 v.1.0Gianluigi Riccio
 

What's hot (15)

Prepare Your Data For The Cloud
Prepare Your Data For The CloudPrepare Your Data For The Cloud
Prepare Your Data For The Cloud
 
Data ware house
Data ware houseData ware house
Data ware house
 
SQL Server Managing Test Data & Stress Testing January 2011
SQL Server Managing Test Data & Stress Testing January 2011SQL Server Managing Test Data & Stress Testing January 2011
SQL Server Managing Test Data & Stress Testing January 2011
 
Distributed parallel architecture for big data
Distributed parallel architecture for big dataDistributed parallel architecture for big data
Distributed parallel architecture for big data
 
Bicod2017
Bicod2017Bicod2017
Bicod2017
 
The Database Environment Chapter 13
The Database Environment Chapter 13The Database Environment Chapter 13
The Database Environment Chapter 13
 
Swib12 workshop lod_beginners
Swib12 workshop lod_beginnersSwib12 workshop lod_beginners
Swib12 workshop lod_beginners
 
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...
IRJET- Generate Distributed Metadata using Blockchain Technology within HDFS ...
 
Scalable data pipeline
Scalable data pipelineScalable data pipeline
Scalable data pipeline
 
Asd 2015
Asd 2015Asd 2015
Asd 2015
 
Co existence or Competition ? - RDBMS and Hadoop
Co existence or Competition ? - RDBMS and HadoopCo existence or Competition ? - RDBMS and Hadoop
Co existence or Competition ? - RDBMS and Hadoop
 
Intro To IDMS
Intro To IDMSIntro To IDMS
Intro To IDMS
 
Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases
 
Research on big data
Research on big dataResearch on big data
Research on big data
 
Product overview 6.0 v.1.0
Product overview 6.0 v.1.0Product overview 6.0 v.1.0
Product overview 6.0 v.1.0
 

Viewers also liked

Birmingham Meetup
Birmingham MeetupBirmingham Meetup
Birmingham MeetupIBM
 
SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsMike Broberg
 
Mobile App Development With IBM Cloudant
Mobile App Development With IBM CloudantMobile App Development With IBM Cloudant
Mobile App Development With IBM CloudantIBM Cloud Data Services
 
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM Bluemix
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM BluemixOffline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM Bluemix
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM BluemixIBM
 
I See NoSQL Document Stores in Geospatial Applications
I See NoSQL Document Stores in Geospatial ApplicationsI See NoSQL Document Stores in Geospatial Applications
I See NoSQL Document Stores in Geospatial ApplicationsIBM Cloud Data Services
 
Cloud Data Services: A Brand New Ballgame for Business
Cloud Data Services: A  Brand New Ballgame for BusinessCloud Data Services: A  Brand New Ballgame for Business
Cloud Data Services: A Brand New Ballgame for BusinessIBM Cloud Data Services
 
IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM
 
Socket.IO - Alternative Ways for Real-time Application
Socket.IO - Alternative Ways for Real-time ApplicationSocket.IO - Alternative Ways for Real-time Application
Socket.IO - Alternative Ways for Real-time ApplicationVorakamol Choonhasakulchok
 
IBM Relay 2015: Opening Keynote
IBM Relay 2015: Opening Keynote IBM Relay 2015: Opening Keynote
IBM Relay 2015: Opening Keynote IBM
 
IBM Relay 2015: Securing the Future
IBM Relay 2015: Securing the Future IBM Relay 2015: Securing the Future
IBM Relay 2015: Securing the Future IBM
 
Using Service Discovery and Service Proxy
Using Service Discovery and Service ProxyUsing Service Discovery and Service Proxy
Using Service Discovery and Service ProxyIBM
 
IBM RTP Dojo Launch
IBM RTP Dojo LaunchIBM RTP Dojo Launch
IBM RTP Dojo LaunchIBM
 

Viewers also liked (15)

Birmingham Meetup
Birmingham MeetupBirmingham Meetup
Birmingham Meetup
 
SQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 QuestionsSQL to NoSQL: Top 6 Questions
SQL to NoSQL: Top 6 Questions
 
Mobile App Development With IBM Cloudant
Mobile App Development With IBM CloudantMobile App Development With IBM Cloudant
Mobile App Development With IBM Cloudant
 
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM Bluemix
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM BluemixOffline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM Bluemix
Offline-First Mobile Web Apps with PouchDB, IBM Cloudant, and IBM Bluemix
 
I See NoSQL Document Stores in Geospatial Applications
I See NoSQL Document Stores in Geospatial ApplicationsI See NoSQL Document Stores in Geospatial Applications
I See NoSQL Document Stores in Geospatial Applications
 
Practical Use of a NoSQL
Practical Use of a NoSQLPractical Use of a NoSQL
Practical Use of a NoSQL
 
Cloud Data Services: A Brand New Ballgame for Business
Cloud Data Services: A  Brand New Ballgame for BusinessCloud Data Services: A  Brand New Ballgame for Business
Cloud Data Services: A Brand New Ballgame for Business
 
IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer IBM Relay 2015: Cloud is All About the Customer
IBM Relay 2015: Cloud is All About the Customer
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
 
Socket.IO - Alternative Ways for Real-time Application
Socket.IO - Alternative Ways for Real-time ApplicationSocket.IO - Alternative Ways for Real-time Application
Socket.IO - Alternative Ways for Real-time Application
 
IBM Relay 2015: Opening Keynote
IBM Relay 2015: Opening Keynote IBM Relay 2015: Opening Keynote
IBM Relay 2015: Opening Keynote
 
IBM Relay 2015: Securing the Future
IBM Relay 2015: Securing the Future IBM Relay 2015: Securing the Future
IBM Relay 2015: Securing the Future
 
Using Service Discovery and Service Proxy
Using Service Discovery and Service ProxyUsing Service Discovery and Service Proxy
Using Service Discovery and Service Proxy
 
IBM RTP Dojo Launch
IBM RTP Dojo LaunchIBM RTP Dojo Launch
IBM RTP Dojo Launch
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
 

Similar to Scalability 09262012

ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataAltinity Ltd
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?MarketingArrowECS_CZ
 
Say Goodbye to DIY Data Centers
Say Goodbye to DIY Data CentersSay Goodbye to DIY Data Centers
Say Goodbye to DIY Data CentersRackspace
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSenturus
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2Raul Chong
 
Cloud Opportunities with Virtualization
Cloud Opportunities with VirtualizationCloud Opportunities with Virtualization
Cloud Opportunities with VirtualizationKellyn Pot'Vin-Gorman
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWKent Graziano
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesDoKC
 
Machine Learning for z/OS
Machine Learning for z/OSMachine Learning for z/OS
Machine Learning for z/OSCuneyt Goksu
 
Storage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthStorage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthMellanox Technologies
 
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...NuoDB
 
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...MayaData Inc
 
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...IRJET Journal
 
My Other Computer is a Data Center (2010 v21)
My Other Computer is a Data Center (2010 v21)My Other Computer is a Data Center (2010 v21)
My Other Computer is a Data Center (2010 v21)Robert Grossman
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...actualtechmedia
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Denodo
 
How Service Mesh Fits into the Modern Data Stack
How Service Mesh Fits into the Modern Data StackHow Service Mesh Fits into the Modern Data Stack
How Service Mesh Fits into the Modern Data StackFabian Hardt
 
Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...NuoDB
 
What’s New in Documentum 7.3
What’s New in Documentum 7.3What’s New in Documentum 7.3
What’s New in Documentum 7.3Michael Mohen
 

Similar to Scalability 09262012 (20)

2020 – A Decade of Change
2020 – A Decade of Change2020 – A Decade of Change
2020 – A Decade of Change
 
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
 
Say Goodbye to DIY Data Centers
Say Goodbye to DIY Data CentersSay Goodbye to DIY Data Centers
Say Goodbye to DIY Data Centers
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
Cloud Opportunities with Virtualization
Cloud Opportunities with VirtualizationCloud Opportunities with Virtualization
Cloud Opportunities with Virtualization
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on Kubernetes
 
Machine Learning for z/OS
Machine Learning for z/OSMachine Learning for z/OS
Machine Learning for z/OS
 
Storage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthStorage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving Growth
 
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...
NuoDB + MayaData: How to Run Containerized Enterprise SQL Applications in the...
 
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...
How to Run Containerized Enterprise SQL Applications in the Cloud with NuoDB ...
 
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
 
My Other Computer is a Data Center (2010 v21)
My Other Computer is a Data Center (2010 v21)My Other Computer is a Data Center (2010 v21)
My Other Computer is a Data Center (2010 v21)
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
How Service Mesh Fits into the Modern Data Stack
How Service Mesh Fits into the Modern Data StackHow Service Mesh Fits into the Modern Data Stack
How Service Mesh Fits into the Modern Data Stack
 
Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...Building Cloud-Native Applications with a Container-Native SQL Database in th...
Building Cloud-Native Applications with a Container-Native SQL Database in th...
 
What’s New in Documentum 7.3
What’s New in Documentum 7.3What’s New in Documentum 7.3
What’s New in Documentum 7.3
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 


Scalability 09262012

  • 1. Mike Miller, CoFounder, Chief Scientist (@mlmilleratmit, mike@cloudant.com)
  • 2. My Background: Cloudant CoFounder, Chief Scientist. Assistant Professor, Particle Physics (U. Washington, Affiliate). Background: machine learning, analysis, big data, globally distributed systems. Cloudant, 9-26-2012
  • 3. The face of big data (http://abstract.cs.washington.edu/~shwetak/)
  • 4. The face of big data
  • 5. The face of big data: “The future is stranger and sooner than you think” (Reid Hoffman, LinkedIn/Greylock)
  • 6. Perfect Storm: Parallel Processing, Big Data, HTML5/JS, Mobile, 9M Trained Developers
  • 7. Focus on your Application, not data operations
  • 8. If your data is stuck in the warehouse... you’re losing
  • 9. Data Layer for the Web: Founded (2009) by leading MIT data scientists. Funded by Y Combinator & Avalon. Global network of 20+ data centers: Application Data Network (ADN). Built on leading NoSQL standard: most durable data store on planet. 10,000 users and growing. Cloudant: Akamai of dynamic content
  • 10. Cloudant Product Line
      • Application State: Hyper-Scalable Document Store (JSON+HTTP), MVCC, secondary indexes for flexible query
      • Application Data Security: accounts/API keys, data sharing, permission roles
      • Application Analytics: Fully Integrated (Incremental) MapReduce engine
      • Application Search: Fully Integrated (Incremental) Lucene + Geospatial, API Compatible
      • Application Object Storage: images, audio, video...
      • Application State Distribution: cloud <==> tablet <==> PC <==> mobile
  • 11. Cloudant Install: You do this: We give you: That’s It
  • 12. API Examples: Write a doc... from the browser. No client install necessary
  • 13. API Examples: Create Secondary Indexes. Query those indexes
  • 15. Global Data Network: Cloudant scales within & between data centers. Availability, low-latency
  • 16. Anatomy of the Data Layer
      [Diagram: PUT {document} to a “Node” in US-EAST; Secondary Data Centers (for DR & distributed access) in AP-JP and EU-NL; Filtered Replication & Sync to Disconnected Devices (Edge Database); Single-tenant cluster and Multi-tenant cluster]
      Horizontally Scalable DB Cluster (Single-Tenant or Multi-Tenant):
      • Fault tolerant
      • Always consistent
      • Schemaless (NoSQL)
      • Automatic sharding
      • Distributed, parallel analytics
      • Incremental, chainable MapReduce
      • Full-text search
  • 18. Why It Matters
  • 19. (1) Visualization Wins: http://sosolimited.com/blog/2012/07/from-tweets-to-lightshow/
  • 20. (2) Prepare For Success: Three #1 apps, from 6 to 90 servers in weeks
  • 21. (3) Scale Invariance
  • 22. (3) Scale Invariance: mobile/tablet, desktop, Cloud. Goal: Megabytes to Petabytes
  • 23. (3) Scale Invariance: ‘Carry Small, Live Large’. Single user experience at vastly different scales
  • 24. (4) No Preferred Frame: So why do you have a global ‘write master’?
  • 25. (4) No Preferred Frame: This simple document... establishes Continuous Pipe from Europe to US
  • 26. (4) No Preferred Frame: And you can do the reverse... at the same time
  • 27. (4) No Preferred Frame: Write local, live global. What could you do with relaxed constraints?
  • 28. (4) No Preferred Frame: Data Import. Actual Customer Data, France to Amsterdam. [Chart: Data Size [GB], Disk Size [GB], and Documents [M] versus Time [sec]] One click (continuous) Import
  • 29. Big and Getting Bigger
  • 30. Big and Getting Bigger: And of course, we are hiring
      Languages: erlang, scala, c, javascript, python, clojure, html5, iOS, Android, ruby/chef
      Sample problems in the Seattle office:
      • Create file format optimized for (huge) structured time-series data
      • Integrate Cubism into two-tier application stack
      • Profile creation of 100M databases (real customer)
      • PIG / HIVE integration
      • Prototype read-in-place Hadoop connector
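
Slide 12 ("Write a doc... from the browser") showed its code as an image that the transcript does not capture. The following is a minimal sketch of what such a write could look like, assuming a CouchDB-style JSON-over-HTTP API in which a document is stored with `PUT /{database}/{doc_id}`. The account name, database name, document ID, and document fields are hypothetical, and the request is only built, never sent.

```python
import json
import urllib.request
from urllib.parse import quote

# Hypothetical account and database, used only for illustration.
BASE_URL = "https://example-account.cloudant.com"
DATABASE = "demo"


def build_put_doc_request(doc_id: str, doc: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request that would store `doc` as JSON."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{DATABASE}/{quote(doc_id, safe='')}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical document; the slide's own example is not recoverable.
request = build_put_doc_request(
    "talk-2012-09-26",
    {"speaker": "Mike Miller", "topic": "scalability"},
)
print(request.get_method(), request.full_url)
```

Because the whole exchange is plain HTTP plus JSON, the same request can be issued from a browser without a client library, which is the point the slide makes. Sending it for real would also require credentials, which are omitted here.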
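
Slide 13 ("Create Secondary Indexes", "Query those indexes") likewise lost its code in the transcript. One plausible shape, assuming CouchDB-style MapReduce views, is a design document whose map function emits the key to index on, queried through a `_view` URL. The design document name, view name, indexed field, and account URL below are all hypothetical.

```python
import json
from urllib.parse import quote, urlencode

# Hypothetical design document: one view that indexes documents by "speaker".
# The map function is JavaScript stored as a string; "_count" is a built-in reduce.
design_doc = {
    "_id": "_design/talks",
    "views": {
        "by_speaker": {
            "map": "function (doc) { if (doc.speaker) { emit(doc.speaker, 1); } }",
            "reduce": "_count",
        }
    },
}


def view_query_url(base_url: str, database: str, ddoc: str, view: str, key) -> str:
    """Return the URL that would look up `key` in the given view.

    View keys are JSON values, so the key is JSON-encoded before URL-encoding.
    """
    params = urlencode(
        {"key": json.dumps(key), "reduce": "false"},
        quote_via=quote,
    )
    return f"{base_url}/{database}/_design/{ddoc}/_view/{view}?{params}"


url = view_query_url(
    "https://example-account.cloudant.com",  # hypothetical account
    "demo",
    "talks",
    "by_speaker",
    "Mike Miller",
)
print(url)
```

Storing `design_doc` creates the index and a GET on the printed URL would query it. Neither call is made in this sketch.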
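
The "simple document" on slides 25 and 26 is not reproduced in the transcript. As a sketch, assuming CouchDB-style replication documents (a JSON object with `source`, `target`, and `continuous` fields, stored in a replicator database), the Europe-to-US pipe and its reverse might look like the following. Both database URLs and both document IDs are hypothetical.

```python
import json

# Hypothetical database URLs for the two regions.
EU_DB = "https://example-eu.cloudant.com/demo"
US_DB = "https://example-us.cloudant.com/demo"


def replication_doc(doc_id: str, source: str, target: str) -> dict:
    """Describe a continuous, one-way replication from `source` to `target`."""
    return {
        "_id": doc_id,
        "source": source,
        "target": target,
        "continuous": True,  # keep streaming changes instead of copying once
    }


# One document per direction; together they give two-way sync.
eu_to_us = replication_doc("eu-to-us", EU_DB, US_DB)
us_to_eu = replication_doc("us-to-eu", US_DB, EU_DB)

print(json.dumps(eu_to_us, indent=2))
print(json.dumps(us_to_eu, indent=2))
```

With one such document in each direction, neither side acts as a global write master, which matches the "write local, live global" idea on slide 27.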