Web-scale data architectures

A survey of next-generation data storage and retrieval
Web-scale architectures

An argument for computing in the cloud
Or how I learned to stop worrying and love *aaS
Software as a SERVICE
COMMUNITY participation
SIMPLE user interfaces
DATA
Seller reputation
Product recommendation
What mattered to these companies?

• MVC architecture?
• Java vs. Python vs. [insert your favorite language]?
• Ajax vs. Flash?
The common theme?...

MASSIVE AMOUNTS
OF DATA
In the beginning…


DATA
Forget SQL…

YOU ONLY WANT TO STORE DATA
Web 2.0 is *pushing the envelope*…

•   Scale
•   CPU-intensive text analytics
•   Search outside the column
•   7x24 operation
Web Application Heresies?

•   REST and Resource-oriented data
•   Cloud computing
•   Map/Reduce will be next decade’s MVC
•   Semi-structured data
•   Grassroots, flexible schemas…microformats
•   Distributed hash tables
•   Offline browser clients

                                   ( * Adapted from Sam Ruby)
Save. See. Share. Secure.
FOUR PILLARS OF DATA MANAGEMENT*
                    ( * According to Damien Katz )
Massively distributed and scalable. Standards driven.

THE INTERNET AS A PLATFORM
The Three Levels of Platforms you will meet on the Internet *


ACCESS. PLUG-IN. RUNTIME.
               * Marc Andreessen, http://blog.pmarca.com/2007/09/the-three-kinds.html
Common. Code lives outside the runtime.

LEVEL 1: ACCESS API
Will become more common. More difficult for both platform and app developers.

LEVEL 2: PLUG-IN API
Rare. Constrained assumptions. Platform hosts code.

LEVEL 3: RUNTIME ENVIRONMENT
Lowers barriers to entry. Enables situational applications. Isolates concerns to app.

LEVEL 3
Who is building Level 3 platforms?

  Ning          Social Application Platform
  Salesforce    Sforce/AppExchange
  Google        Mashup Editor, et al.
  Second Life   Scriptable 3D world
  Amazon        Elastic Compute Cloud (EC2)
  Akamai        EdgeComputing
  Yahoo!        Pipes
  IBM           Mashup Maker
“IN THE LONG RUN, ALL CREDIBLE LARGE-SCALE
INTERNET COMPANIES WILL PROVIDE LEVEL 3
PLATFORMS”
         * Marc Andreessen, http://blog.pmarca.com/2007/09/the-three-kinds.html
WEB-SCALE DATA + PROCESSING
“…ORGANIZE THE WORLD’S
INFORMATION…”
Google File System. MapReduce Algorithm. Chubby Lock Server.

BIGTABLE
MapReduce Defined
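A minimal single-process sketch of the MapReduce idea, assuming the classic word-count example (the example and the names `map_phase`, `shuffle`, and `reduce_phase` are illustrative and not taken from the slides). A real framework would run the map and reduce phases in parallel across many machines; this version only shows the data flow.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = map_reduce(["the web scales out", "the data scales"])
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the work can be split across machines without coordination, which is what makes the model attractive at web scale.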
Column-oriented databases
      Id         Last_name   First_name   Salary
      1          Smith       Joe          40000
      2          Jones       Mary         50000
      3          Johnson     Cathy        44000



Row-oriented serialization:
1,Smith,Joe,40000;2,Jones,Mary,50000;3,Johnson,Cathy,44000;

Column-oriented serialization:
1,2,3;Smith,Jones,Johnson;Joe,Mary,Cathy;40000,50000,44000;
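The two serializations of the employee table can be reproduced with a short sketch. The function names are hypothetical; the data and the separator conventions (`,` between values, `;` between records or columns) come from the slide.

```python
rows = [
    (1, "Smith", "Joe", 40000),
    (2, "Jones", "Mary", 50000),
    (3, "Johnson", "Cathy", 44000),
]


def serialize_row_oriented(rows):
    """Store each record contiguously: all fields of row 1, then row 2, ..."""
    return "".join(",".join(str(v) for v in row) + ";" for row in rows)


def serialize_column_oriented(rows):
    """Store each column contiguously: all ids, then all last names, ..."""
    columns = zip(*rows)  # transpose rows into columns
    return "".join(",".join(str(v) for v in col) + ";" for col in columns)


row_layout = serialize_row_oriented(rows)
column_layout = serialize_column_oriented(rows)
```

A query that aggregates one field, such as the total salary, only needs to read the final segment of the column layout, whereas the row layout forces a scan over every record.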
Google Apps Based on BigTable

•   Google Reader    •   Google Docs
•   Google Maps      •   Google Calendar
•   Google Print     •   Google Page Creator
•   Google Earth     •   Google Notebook
•   Blogger.com      •   Google Mashup Editor
•   Google Code      •   Etc.
•   Orkut
•   YouTube
Distributed Hash Table
Versioning
Vector Clocks
Quorum

CASE STUDY: AMAZON
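One common way to build a distributed hash table is consistent hashing, sketched below. This is an illustration under stated assumptions, not Amazon's implementation: the `HashRing` class, the node names, and the choice of SHA-256 and 8 virtual nodes per server are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring that maps keys to nodes."""

    def __init__(self, nodes, vnodes=8):
        # Each node is placed on the ring several times ("virtual nodes")
        # so that keys spread more evenly.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._owners = [owner for _, owner in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def node_for(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        index = bisect.bisect_right(self._points, self._hash(key))
        return self._owners[index % len(self._owners)]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")
```

The useful property is that adding a node only moves the keys that now fall to the new node; every other key keeps its owner, so the cluster can grow without a full reshuffle.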
S3 → Simple Storage Service
SQS → Simple Queue Service
EC2 → Elastic Compute Cloud




       Dynamo
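The versioning, vector clock, and quorum ideas listed for the Amazon case study can be sketched in a few functions. This is a simplified illustration, not Dynamo's code: clocks are plain dictionaries from node name to counter, and every name below is hypothetical.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions as equal, ordered, or concurrent (in conflict)."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"


def merge(a, b):
    """Clock for a reconciled version: the per-node maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def quorum_overlaps(n, r, w):
    """With N replicas, R-replica reads and W-replica writes overlap when R + W > N."""
    return r + w > n


v1 = increment({}, "x")    # first write, handled by node x
v2 = increment(v1, "x")    # second write, also via node x
v3 = increment(v2, "y")    # one client updates v2 via node y
v4 = increment(v2, "z")    # another client updates v2 via node z
relation = compare(v3, v4)
reconciled = merge(v3, v4)
```

Here `v3` and `v4` both descend from `v2` but not from each other, so neither can be discarded automatically; the application has to reconcile them, and the merged clock records that the result has seen both branches.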
NING’S ARCHITECTURE
http://docs.ning.com/page/page/show?id=492524:Page:26
Common themes?

•   Flexible schema
•   Highly distributed
•   HTTP is the database driver
•   JSON , XML, HTML, and JavaScript
•   Full text search
One Size Fits All
AN IDEA WHOSE TIME HAS COME AND GONE?


                              ( * Michael Stonebraker )
The first is that there will be a dedicated core, those that are heavily invested, either
monetarily or professionally, in the status quo, and they will resist any change.

The second is that change doesn't care about your investment.


TWO RULES FOR ANY CHANGE IN
TECHNOLOGY *
                     ( * Joe Gregorio, http://bitworking.org/news/217/Ch-ch-changes )
N>1
scale out
 not up
WEB-SCALE DATA + PROCESSING
GOOGLE + MYSQL
CouchDB

• Green implementation, no legacy
• Designed to:
  – implement the four pillars of data management
  – leverage recent paradigm shifts
• Level 3 Data Platform
CouchDB: Feature Summary

Robust Data Storage   Replication
REST API              User Authentication
Views                 Built on Erlang/OTP
Append-only writes    MVCC with optimistic concurrency
Etags                 Full text search
Map/Reduce            (Your feature here. It’s open source!)
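Two of the listed features, append-only writes and MVCC with optimistic concurrency, can be illustrated with a toy in-memory store. This is a sketch of the general technique only, not CouchDB's storage engine; the `RevisionedStore` class, its integer revisions, and `ConflictError` are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a writer's expected revision is no longer the latest."""


class RevisionedStore:
    """Toy document store: an append-only log plus an index of latest versions."""

    def __init__(self):
        self._log = []      # append-only list of (doc_id, revision, body)
        self._latest = {}   # doc_id -> position of the newest entry in the log

    def get(self, doc_id):
        """Return (revision, body) for the newest version of a document."""
        _, revision, body = self._log[self._latest[doc_id]]
        return revision, body

    def put(self, doc_id, body, expected_rev=None):
        """Write a new version only if the caller saw the current revision."""
        current_rev = self.get(doc_id)[0] if doc_id in self._latest else None
        if expected_rev != current_rev:
            raise ConflictError(f"{doc_id}: expected {expected_rev}, found {current_rev}")
        new_rev = (current_rev or 0) + 1
        # Old versions are never overwritten; the new one is appended.
        self._log.append((doc_id, new_rev, dict(body)))
        self._latest[doc_id] = len(self._log) - 1
        return new_rev


store = RevisionedStore()
first = store.put("doc", {"n": 1})
second = store.put("doc", {"n": 2}, expected_rev=first)
```

No locks are taken: readers always see a complete version, and a writer holding a stale revision gets a conflict and must re-read before retrying.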
CouchDB: REST API

• Easy retrieval using our favorite, scalable
  architecture: HTTP
• Exchange in industry-standard formats:
  (XML/JSON)
• Simple and intuitive interface
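A rough sketch of what "HTTP as the interface" looks like from a client. It assumes a CouchDB server on its default port at `http://localhost:5984`, and the database name `slides` and document id `web-scale` are invented for illustration. The helper functions only build the requests; nothing is sent until one is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: a local CouchDB instance


def put_document(db, doc_id, doc):
    """Build a PUT request that stores a JSON document under /db/doc_id."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_document(db, doc_id):
    """Build a GET request that retrieves the document at /db/doc_id."""
    return urllib.request.Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


write_request = put_document("slides", "web-scale", {"title": "Web-scale data architectures"})
read_request = get_document("slides", "web-scale")

# To talk to a running server (not executed here):
# with urllib.request.urlopen(read_request) as response:
#     document = json.load(response)
```

Since every document is an ordinary HTTP resource, standard tooling such as caches, proxies, and load balancers applies without a special database driver.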
APACHE HADOOP
The Hadoop stack (from a DBMS perspective)
  MapReduce             Java framework to write parallel scans and aggregations

  HBase                 Simple database

  HDFS                  Distributed file system



IBM Impliance
  Muse Query Language   Declarative query language

  MapReduce+            Enhancements to MapReduce
  Muse Data Model       Semi-structured data model

  HBase Core            Database storage, transactions
  HDFS                  Distributed file system
“Luckily, there are only a handful of companies…in the world that need to operate at
[this] scale.”

DOES EVERYBODY NEED THIS DEGREE
OF SCALE?
( * Dare Obasanjo, http://www.25hoursaday.com/weblog/2007/10/06/ThoughtsOnAmazonsInternalStorageSystemDynamo.aspx )
“BEWARE OF FOCUSING TOO MUCH ON THE
APPS OF THE PAST WHEN LOOKING AT
PLATFORMS OF THE FUTURE”
Linux, XEN virtualization, Apache Hadoop

IBM & GOOGLE UNIVERSITY
SPONSORSHIP
Understand and communicate HTTP Resource vs. RDBMS differences
Research, explore, and push the limits of the MapReduce programming model
Discover where distributed hash tables may make sense over RDBMS

ACADEMIC PROPOSAL
View ourselves as a Level 3 Platform by…
Take the runtime out of the developer's control
Leverage IBM’s Impliance project for massive data scaling

PROJECT ZERO PROPOSAL
