Cloud Computing Ping Yeh June 14, 2008
Evolution of Computing with the Network
Network is computer (client - server): Separation of Functionalities
Tightly coupled computing resources (CPU, storage, data, etc.), usually connected within a LAN and managed as a single resource: Commodity, Open Source
Global Resource Sharing
Ownership Model
Cluster and grid images are from Fermilab and CERN, respectively.
The Next Step: Cloud Computing
Services and data are in the cloud, accessible with any device connected to the cloud with a browser.
A key technical issue for developers: Scalability
Applications on the Web
[Diagram: Your user, The Cloud, Your Coolest Web Application]
internet splat map: http://flickr.com/photos/jurvetson/916142/, CC-by 2.0
baby picture: http://flickr.com/photos/cdharrison/280252512/, CC-by-sa 2.0
I asked the kid under the pine tree, "Where might your master be?" "He is picking herbs in the mountain," he said, "the cloud is too deep to know where." Jia Dao, "Didn't meet the master," written around 800 AD
picture: http://flickr.com/photos/soylentgreen23/313880255/, CC-by 2.0
How many users do you want to have? The Cloud Your Coolest Web Application
Google Growth
Nov. '98: 10,000 queries on 25 computers
Apr. '99: 500,000 queries on 300 computers
Sep. '99: 3,000,000 queries on 2100 computers
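A quick back-of-the-envelope check on the growth figures, as a sketch only: it divides the numbers given on the slide and nothing more. The slide does not state the time period the query counts refer to, and the variable and function names here are chosen for illustration.

```python
# (label, queries, computers) as given on the "Google Growth" slide.
growth = [
    ("Nov. '98", 10_000, 25),
    ("Apr. '99", 500_000, 300),
    ("Sep. '99", 3_000_000, 2100),
]


def queries_per_computer(queries, computers):
    """Average load carried by one machine."""
    return queries / computers


first, last = growth[0], growth[-1]
query_growth = last[1] / first[1]      # 300x more queries
computer_growth = last[2] / first[2]   # 84x more computers

for label, queries, computers in growth:
    load = queries_per_computer(queries, computers)
    print("%s: %.0f queries per computer" % (label, load))
print("queries grew %.0fx, computers grew %.0fx" % (query_growth, computer_growth))
```

Queries grew faster than the machine count over these ten months, so each machine had to carry more load, which is one way to read "Scalability matters".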
Scalability matters
Counting the numbers
Client / Server: One : Many
Personal Computer: One : One
Cloud Computing: Many : Many
Developer transition
What Powers Cloud Computing?
google.stanford.edu (circa 1997)
google.com (1999), "cork boards"
Google Data Center (circa 2000)
google.com (new data center 2001)
google.com (3 days later)
Current Design
How to develop a web application that scales?
Google's solution/replacement:
Storage: Google File System (published paper)
Database: BigTable (published paper)
Data Processing: MapReduce (published paper)
Serving: Google AppEngine (opened on 2008/5/28)
hadoop: open source implementation
Google File System
[Architecture diagram: applications with GFS clients; GFS masters (with replicas) holding the file namespace, where /foo/bar maps to chunk 2ef7 and further chunks; chunkservers storing chunks C0, C1, C2, C3, C5, with the same chunk present on more than one chunkserver]
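The diagram can be read as two lookups: the master maps a path such as /foo/bar to an ordered list of chunk handles, and each chunk lives on more than one chunkserver. A minimal sketch of that idea, assuming hypothetical class and function names (`Master`, `ChunkServer`, `write_file`, `read_file`), a tiny chunk size, and two replicas per chunk; none of these are Google's actual API or settings.

```python
import itertools

CHUNK_SIZE = 4   # bytes per chunk; tiny on purpose (assumption for the demo)
REPLICAS = 2     # copies kept of each chunk (assumption for the demo)


class ChunkServer:
    """Stores chunk contents keyed by chunk handle."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}          # chunk handle -> bytes


class Master:
    """Holds metadata only: the file namespace and chunk locations."""

    def __init__(self, servers):
        self.servers = servers
        self.namespace = {}       # path -> ordered list of chunk handles
        self.locations = {}       # chunk handle -> list of ChunkServer
        self._ids = itertools.count()

    def allocate_chunk(self, path):
        n = next(self._ids)
        handle = "chunk-%04x" % n
        # Place replicas on consecutive servers, round-robin.
        replicas = [self.servers[(n + i) % len(self.servers)]
                    for i in range(REPLICAS)]
        self.namespace.setdefault(path, []).append(handle)
        self.locations[handle] = replicas
        return handle, replicas

    def lookup(self, path):
        return [(h, self.locations[h]) for h in self.namespace[path]]


def write_file(master, path, data):
    """Client side: get a chunk from the master, send bytes to chunkservers."""
    for offset in range(0, len(data), CHUNK_SIZE):
        handle, replicas = master.allocate_chunk(path)
        for server in replicas:
            server.chunks[handle] = data[offset:offset + CHUNK_SIZE]


def read_file(master, path):
    """Client side: ask the master where chunks are, then read them."""
    parts = []
    for handle, replicas in master.lookup(path):
        # Use the first replica that still holds the chunk.
        server = next(s for s in replicas if handle in s.chunks)
        parts.append(server.chunks[handle])
    return b"".join(parts)


servers = [ChunkServer("cs%d" % i) for i in range(3)]
master = Master(servers)
write_file(master, "/foo/bar", b"hello, cloud")
servers[0].chunks.clear()   # simulate losing one chunkserver
print(read_file(master, "/foo/bar"))
```

Because every chunk has a second copy, the read still succeeds after one chunkserver loses its data; file contents never pass through the master in this sketch.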
GFS Usage @ Google
BigTable
Data model: (row, column, timestamp) → cell contents
[Figure: row "www.cnn.com", column "contents:", timestamps t3, t11, t17, cell contents "<html> …"]
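The data model on this slide can be sketched as a map keyed by (row, column, timestamp). The class below is a toy in-memory illustration, assuming hypothetical names (`TinyTable`, `put`, `get`) and integer timestamps; it is not the BigTable API.

```python
class TinyTable:
    """Toy model of (row, column, timestamp) -> cell contents."""

    def __init__(self):
        self._cells = {}   # (row, column) -> {timestamp: contents}

    def put(self, row, column, timestamp, contents):
        self._cells.setdefault((row, column), {})[timestamp] = contents

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp`.

        With timestamp=None the newest version overall is returned;
        None is returned when no matching version exists.
        """
        versions = self._cells.get((row, column), {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        if not eligible:
            return None
        return versions[max(eligible)]


table = TinyTable()
# Three versions of the same cell, mirroring t3, t11, t17 in the figure.
table.put("www.cnn.com", "contents:", 3, "<html> version at t3")
table.put("www.cnn.com", "contents:", 11, "<html> version at t11")
table.put("www.cnn.com", "contents:", 17, "<html> version at t17")

print(table.get("www.cnn.com", "contents:"))       # newest version
print(table.get("www.cnn.com", "contents:", 12))   # version as of t12
```

Keeping the timestamp in the key means an update adds a version rather than overwriting the cell, so older contents stay readable.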
Why not just use commercial DB?
System Structure (BigTable Cell)
Bigtable client, with Bigtable client library: Open(), read/write, metadata ops
Bigtable master: performs metadata ops + load balancing
Bigtable tablet servers: serve data
Cluster scheduling system: handles failover, monitoring
GFS: holds tablet data, logs
Lock service: holds metadata, handles master-election
BigTable Summary
Distributed Data Processing
Pseudo Code for Phase 1 and 2

Phase 1 (the bodies of findBucket and findTime are not shown on the slide):

    import sys

    def findBucket(requestTime):
        # return minute of the week

    numRequest = [0] * (1440*7)  # an array of 1440*7 zeros
    for filename in sys.argv[2:]:
        for line in open(filename):
            minuteBucket = findBucket(findTime(line))
            numRequest[minuteBucket] += 1
    outFile = open(sys.argv[1], 'w')
    for i in range(1440*7):
        outFile.write("%d %d\n" % (i, numRequest[i]))
    outFile.close()

Phase 2:

    import sys

    numRequest = [0] * (1440*7)  # an array of 1440*7 zeros
    for filename in sys.argv[2:]:
        for line in open(filename):
            col = line.split()
            i, count = int(col[0]), int(col[1])
            numRequest[i] += count
    # write out numRequest[] like phase 1
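The two phases can be tried end to end with a small runnable sketch. It assumes a log format the slide does not specify: each line starts with an ISO 8601 timestamp. It also assumes Monday 00:00 is bucket 0, and it works on lists of lines instead of files so that it is self-contained. The function names are illustrative.

```python
from datetime import datetime

MINUTES_PER_WEEK = 1440 * 7


def find_time(line):
    # Assumption: the first whitespace-separated field is an ISO 8601 timestamp.
    return datetime.fromisoformat(line.split()[0])


def find_bucket(request_time):
    # Minute of the week, with Monday 00:00 as bucket 0 (assumed convention).
    return (request_time.weekday() * 1440
            + request_time.hour * 60
            + request_time.minute)


def phase1(lines):
    """Count requests per minute-of-week bucket for one batch of log lines."""
    counts = [0] * MINUTES_PER_WEEK
    for line in lines:
        counts[find_bucket(find_time(line))] += 1
    return counts


def phase2(partial_counts):
    """Merge the per-batch counts produced by phase 1."""
    total = [0] * MINUTES_PER_WEEK
    for counts in partial_counts:
        for bucket, count in enumerate(counts):
            total[bucket] += count
    return total


# Two batches, as if they came from two machines.
logs_a = ["2008-06-14T10:30:05 GET /", "2008-06-14T10:30:59 GET /a"]
logs_b = ["2008-06-14T10:30:10 GET /", "2008-06-16T00:01:00 GET /"]

total = phase2([phase1(logs_a), phase1(logs_b)])
busiest = max(range(MINUTES_PER_WEEK), key=lambda b: total[b])
print(busiest, total[busiest])
```

Phase 1 can run on every batch independently, so only the small per-bucket arrays need to be brought together in phase 2.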
Task Management
Technical Issues: Performance, Robustness, Reusability
MapReduce – A New Model and System: Two phases of data processing
MapReduce Programming Model
MapReduce Version of Pseudo Code

    def findBucket(requestTime):
        # return minute of the week

    class LogMinuteCounter(MapReduction):
        def Map(key, value, output):
            # key is location, value is one log line
            minuteBucket = findBucket(findTime(value))
            output.collect(str(minuteBucket), "1")

        def Reduce(key, iter, output):
            # iter walks over all values collected for this key
            sum = 0
            while not iter.done():
                sum += int(iter.value())  # value() and next() are assumed iterator methods
                iter.next()
            output.collect(key, str(sum))
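To see how the Map and Reduce functions above fit together, here is a single-process sketch of the map, group-by-key, and reduce steps. It is an illustration only, not the MapReduce framework: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, and the log format (a leading ISO 8601 timestamp) is an assumption.

```python
from collections import defaultdict
from datetime import datetime


def map_fn(key, value):
    # key: location of the record (unused); value: one log line.
    # Assumption: the line starts with an ISO 8601 timestamp.
    t = datetime.fromisoformat(value.split()[0])
    minute_bucket = t.weekday() * 1440 + t.hour * 60 + t.minute
    yield str(minute_bucket), "1"


def reduce_fn(key, values):
    # values: every "1" emitted for this minute bucket.
    yield key, str(sum(int(v) for v in values))


def run_mapreduce(records, map_fn, reduce_fn):
    """Run map, group intermediate pairs by key, then reduce, all in memory."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    results = {}
    for out_key in sorted(groups):
        for final_key, final_value in reduce_fn(out_key, groups[out_key]):
            results[final_key] = final_value
    return results


records = [
    (0, "2008-06-14T10:30:05 GET /"),
    (1, "2008-06-14T10:30:59 GET /a"),
    (2, "2008-06-16T00:01:00 GET /"),
]
result = run_mapreduce(records, map_fn, reduce_fn)
print(result)
```

The user code is only `map_fn` and `reduce_fn`; grouping by key sits in `run_mapreduce`, which stands in for the part a real framework would distribute across machines.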
MapReduce Framework
Task Granularity And Pipelining
MapReduce: Adoption at Google
[Charts: MapReduce Programs in Google's Source Tree; New MapReduce Programs Per Month, annotated "Summer intern effect"]
MapReduce: Uses at Google
MapReduce Summary
A Data Playground: MapReduce + BigTable + GFS = Data playground
Query Frequency Over Time
Learning From Data Searching for Britney Spears...
Open Source Cloud Software: Project Hadoop
Industrial interest in Hadoop
AppEngine
Academic Cloud Computing Initiative
Summary
The era of Cloud Computing is here!
news, people, book search, photo, product search, video, maps, e-mails, mobile, blogs, groups, calendar, scholar, Earth, Sky, web, desktop, translate, messages
Photo by mr.hero on panoramio (http://www.panoramio.com/photo/1127015)

Google Cloud Computing on Google Developer 2008 Day
