SlideShare uma empresa Scribd logo
1 de 38
thinking in key-value stores http://developers.hover.in Bhasker V Kode co-founder  & CTO at hover.in at AWS Chennai November 10th, 2009
summary of tech at hover.in ,[object Object],[object Object],[object Object],http://developers.hover.in
typical webapp use-cases ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
typical crawler use-cases ,[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in # flowcontrol
http://developers.hover.in “ The domino effect is a chain reaction that occurs when a small change causes a similar change nearby, which then will cause another similar change.” via wikipedia page on Domino Effect
[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
spawning, in practice ,[object Object],[object Object],http://developers.hover.in
http://developers.hover.in “ the small portions of the program which cannot be parallelized will limit the overall speed-up available ” -  Amdahl's  Law wrt parallel computing...
http://developers.hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in Further , A & B need'nt be serial, but a batch of A's & a parallel batch of B's running tasks serially. Flowcontrol implemented by tail-recursive servers handling bursty AND slow traffic well.  flowcontrol 1 flowcontrol 2 where  is free cpu / time for handling  2x  B tasks
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in # distributed vs local
http://developers.hover.in # replication vs location transparency
http://developers.hover.in how it works at hover.in:  I  , “ node X ” will rather do tasks locally since this data is designated to me, rather than rpc'ing all over like crazy ! The meta-data of  which node does what calculated by   (a) statically assigned  (our choice)   (b) or a hash fn  (c) dynamically reassigned  (maybe later) which is  made available to nodes  by (a) replication or (b) location transparency  (our choice) Node1 Node2 NodeN
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in # persistent data vs cyclic queues
revisiting  typical webapp use-cases ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
fixed-length datastructures to the rescue http://developers.hover.in
http://developers.hover.in fixed-length datastores to the rescue ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],NEW!
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
http://developers.hover.in # in-memory vs disk
http://developers.hover.in trends in key-value stores ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
in-memory is the new embedded ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
http://developers.hover.in popular key-value stores ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],http://developers.hover.in
[object Object],[object Object],[object Object],http://developers.hover.in
7 rules of in-memory capacity planning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://developers.hover.in
interesting key-value store features ,[object Object],[object Object],[object Object],http://developers.hover.in
interesting key-value store features - 2 ,[object Object],[object Object],[object Object],http://developers.hover.in
http://developers.hover.in crawler infrastructure at hover.in ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
brief introduction to hover.in choose words from your blog, & decide what content / ad you want when you hover* over it * or other events like click,right click,etc http://developers.hover.in or... the worlds first user-engagement platform for brands via in-text broadcasting or lets web publishers push client-side event handling to the cloud ,  to run various rich applications called  hoverlets demo at  http://start.hover.in/  and  http://hover.in/demo more at  http://hover.in  ,  http://developers.hover.in/blog/
summary of our erlang modules ,[object Object],http://developers.hover.in
thank you http://developers.hover.in
references ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://developers.hover.in

Mais conteúdo relacionado

Mais procurados

Functional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksFunctional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksHuafeng Wang
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...PROIDEA
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresSteve Loughran
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopHadoop User Group
 
Presto in the cloud
Presto in the cloudPresto in the cloud
Presto in the cloudQubole
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Sadayuki Furuhashi
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoopChristophe Marchal
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015N Masahiro
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdSATOSHI TAGOMORI
 
Spark stream - Kafka
Spark stream - Kafka Spark stream - Kafka
Spark stream - Kafka Dori Waldman
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 

Mais procurados (20)

Functional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming FrameworksFunctional Comparison and Performance Evaluation of Streaming Frameworks
Functional Comparison and Performance Evaluation of Streaming Frameworks
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
January 2011 HUG: Pig Presentation
January 2011 HUG: Pig PresentationJanuary 2011 HUG: Pig Presentation
January 2011 HUG: Pig Presentation
 
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...
 
Prestogres internals
Prestogres internalsPrestogres internals
Prestogres internals
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with Hadoop
 
Presto in the cloud
Presto in the cloudPresto in the cloud
Presto in the cloud
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
 
January 2011 HUG: Howl Presentation
January 2011 HUG: Howl PresentationJanuary 2011 HUG: Howl Presentation
January 2011 HUG: Howl Presentation
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoop
 
Powering a Virtual Power Station with Big Data
Powering a Virtual Power Station with Big DataPowering a Virtual Power Station with Big Data
Powering a Virtual Power Station with Big Data
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015
 
Hdfs high availability
Hdfs high availabilityHdfs high availability
Hdfs high availability
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentd
 
Spark stream - Kafka
Spark stream - Kafka Spark stream - Kafka
Spark stream - Kafka
 
Achieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on TezAchieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on Tez
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 

Destaque

Pebs14 hubbub bike summit
Pebs14   hubbub bike summitPebs14   hubbub bike summit
Pebs14 hubbub bike summitBirgit Hess
 
Scientist and inventor
Scientist and inventorScientist and inventor
Scientist and inventorLee W.
 
Data Networks: Next-Generation Optical Access toward 10 Gb/s Everywhere
Data Networks: Next-Generation Optical Access toward 10 Gb/s EverywhereData Networks: Next-Generation Optical Access toward 10 Gb/s Everywhere
Data Networks: Next-Generation Optical Access toward 10 Gb/s EverywhereXi'an Jiaotong-Liverpool University
 
Short-Run Digital Book Printing
Short-Run Digital Book PrintingShort-Run Digital Book Printing
Short-Run Digital Book PrintingRISO printer
 
Thermal bimorph valve operated microthruster.
Thermal bimorph valve operated microthruster.Thermal bimorph valve operated microthruster.
Thermal bimorph valve operated microthruster.SAI SIVA
 
Secure develpment 2014
Secure develpment 2014Secure develpment 2014
Secure develpment 2014Ariel Evans
 
Power point presentation
Power point presentationPower point presentation
Power point presentationsaki-t
 
A brief introduction to Spintronics
A brief introduction to SpintronicsA brief introduction to Spintronics
A brief introduction to SpintronicsRam Kumar K R
 
National river linking project
National river linking projectNational river linking project
National river linking projectRaj_Anand
 
Project oxygen
Project oxygenProject oxygen
Project oxygenlinkoravi
 
Hyderabadi and Andhra Cuisine
Hyderabadi and Andhra CuisineHyderabadi and Andhra Cuisine
Hyderabadi and Andhra Cuisineshweta_1712
 
Presentation on smartphone by Apu
Presentation on smartphone by ApuPresentation on smartphone by Apu
Presentation on smartphone by ApuAshiqul Islam Apu
 
E-books Are Not the Future of Books
E-books Are Not the Future of BooksE-books Are Not the Future of Books
E-books Are Not the Future of BooksXavier Badosa
 

Destaque (20)

Hubbub deck short
Hubbub deck shortHubbub deck short
Hubbub deck short
 
Technology Advancement Core: Our Process
Technology Advancement Core: Our ProcessTechnology Advancement Core: Our Process
Technology Advancement Core: Our Process
 
Pebs14 hubbub bike summit
Pebs14   hubbub bike summitPebs14   hubbub bike summit
Pebs14 hubbub bike summit
 
Scientist and inventor
Scientist and inventorScientist and inventor
Scientist and inventor
 
Bbr security
Bbr securityBbr security
Bbr security
 
Data Networks: Next-Generation Optical Access toward 10 Gb/s Everywhere
Data Networks: Next-Generation Optical Access toward 10 Gb/s EverywhereData Networks: Next-Generation Optical Access toward 10 Gb/s Everywhere
Data Networks: Next-Generation Optical Access toward 10 Gb/s Everywhere
 
Short-Run Digital Book Printing
Short-Run Digital Book PrintingShort-Run Digital Book Printing
Short-Run Digital Book Printing
 
Thermal bimorph valve operated microthruster.
Thermal bimorph valve operated microthruster.Thermal bimorph valve operated microthruster.
Thermal bimorph valve operated microthruster.
 
Powerpoint
PowerpointPowerpoint
Powerpoint
 
Secure develpment 2014
Secure develpment 2014Secure develpment 2014
Secure develpment 2014
 
Power point presentation
Power point presentationPower point presentation
Power point presentation
 
SHMcloud vision
SHMcloud visionSHMcloud vision
SHMcloud vision
 
A brief introduction to Spintronics
A brief introduction to SpintronicsA brief introduction to Spintronics
A brief introduction to Spintronics
 
National river linking project
National river linking projectNational river linking project
National river linking project
 
Project Oxyzen
Project OxyzenProject Oxyzen
Project Oxyzen
 
Project oxygen
Project oxygenProject oxygen
Project oxygen
 
Shaan-E-Awadh
Shaan-E-AwadhShaan-E-Awadh
Shaan-E-Awadh
 
Hyderabadi and Andhra Cuisine
Hyderabadi and Andhra CuisineHyderabadi and Andhra Cuisine
Hyderabadi and Andhra Cuisine
 
Presentation on smartphone by Apu
Presentation on smartphone by ApuPresentation on smartphone by Apu
Presentation on smartphone by Apu
 
E-books Are Not the Future of Books
E-books Are Not the Future of BooksE-books Are Not the Future of Books
E-books Are Not the Future of Books
 

Semelhante a Key-value stores for efficient crawling and indexing

Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesRobert Lemke
 
Infrastructure as code, using Terraform
Infrastructure as code, using TerraformInfrastructure as code, using Terraform
Infrastructure as code, using TerraformHarkamal Singh
 
21 Www Web Services
21 Www Web Services21 Www Web Services
21 Www Web Servicesroyans
 
Apache Traffic Server
Apache Traffic ServerApache Traffic Server
Apache Traffic Serversupertom
 
lessons from managing a pulsar cluster
 lessons from managing a pulsar cluster lessons from managing a pulsar cluster
lessons from managing a pulsar clusterShivji Kumar Jha
 
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxDowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxLex Avstreikh
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkDatabricks
 
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Provectus
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and ThenSATOSHI TAGOMORI
 
Technology Stack Discussion
Technology Stack DiscussionTechnology Stack Discussion
Technology Stack DiscussionZaiyang Li
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueAmazon Web Services
 
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022HostedbyConfluent
 
Presentation on Japanese doc sprint
Presentation on Japanese doc sprintPresentation on Japanese doc sprint
Presentation on Japanese doc sprintGo Chiba
 
Headless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoHeadless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoSander Mangel
 
Globo.com & Varnish
Globo.com & VarnishGlobo.com & Varnish
Globo.com & Varnishlokama
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developerPaul Czarkowski
 

Semelhante a Key-value stores for efficient crawling and indexing (20)

Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in Kubernetes
 
Infrastructure as code, using Terraform
Infrastructure as code, using TerraformInfrastructure as code, using Terraform
Infrastructure as code, using Terraform
 
21 Www Web Services
21 Www Web Services21 Www Web Services
21 Www Web Services
 
Apache Traffic Server
Apache Traffic ServerApache Traffic Server
Apache Traffic Server
 
lessons from managing a pulsar cluster
 lessons from managing a pulsar cluster lessons from managing a pulsar cluster
lessons from managing a pulsar cluster
 
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxDowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache Spark
 
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 
Technology Stack Discussion
Technology Stack DiscussionTechnology Stack Discussion
Technology Stack Discussion
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS Glue
 
Scaling PHP apps
Scaling PHP appsScaling PHP apps
Scaling PHP apps
 
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
 
Advanced JavaScript
Advanced JavaScriptAdvanced JavaScript
Advanced JavaScript
 
Presentation on Japanese doc sprint
Presentation on Japanese doc sprintPresentation on Japanese doc sprint
Presentation on Japanese doc sprint
 
Headless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoHeadless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in Magento
 
Globo.com & Varnish
Globo.com & VarnishGlobo.com & Varnish
Globo.com & Varnish
 
Caching By Nyros Developer
Caching By Nyros DeveloperCaching By Nyros Developer
Caching By Nyros Developer
 
Presemtation Tier Optimizations
Presemtation Tier OptimizationsPresemtation Tier Optimizations
Presemtation Tier Optimizations
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
 

Mais de Bhasker Kode

Kafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift
Kafka & Storm - FifthElephant 2015 by @bhaskerkode, HelpshiftKafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift
Kafka & Storm - FifthElephant 2015 by @bhaskerkode, HelpshiftBhasker Kode
 
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)Bhasker Kode
 
Recursion & Erlang, FunctionalConf 14, Bangalore
Recursion & Erlang, FunctionalConf 14, BangaloreRecursion & Erlang, FunctionalConf 14, Bangalore
Recursion & Erlang, FunctionalConf 14, BangaloreBhasker Kode
 
Parsing binaries and protocols with erlang
Parsing binaries and protocols with erlangParsing binaries and protocols with erlang
Parsing binaries and protocols with erlangBhasker Kode
 
hover.in at CUFP 2009
hover.in at CUFP 2009hover.in at CUFP 2009
hover.in at CUFP 2009Bhasker Kode
 
in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09Bhasker Kode
 
erlang at hover.in , Devcamp Blr 09
erlang at hover.in , Devcamp Blr 09erlang at hover.in , Devcamp Blr 09
erlang at hover.in , Devcamp Blr 09Bhasker Kode
 
end user programming & yahoo pipes
end user programming & yahoo pipesend user programming & yahoo pipes
end user programming & yahoo pipesBhasker Kode
 
Overview of End-user Programming | Talk at DesignCamp Bangalore
Overview of End-user Programming | Talk at DesignCamp BangaloreOverview of End-user Programming | Talk at DesignCamp Bangalore
Overview of End-user Programming | Talk at DesignCamp BangaloreBhasker Kode
 

Mais de Bhasker Kode (9)

Kafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift
Kafka & Storm - FifthElephant 2015 by @bhaskerkode, HelpshiftKafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift
Kafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift
 
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)
Instrumenting Go (Gopherconindia Lightning talk by Bhasker Kode)
 
Recursion & Erlang, FunctionalConf 14, Bangalore
Recursion & Erlang, FunctionalConf 14, BangaloreRecursion & Erlang, FunctionalConf 14, Bangalore
Recursion & Erlang, FunctionalConf 14, Bangalore
 
Parsing binaries and protocols with erlang
Parsing binaries and protocols with erlangParsing binaries and protocols with erlang
Parsing binaries and protocols with erlang
 
hover.in at CUFP 2009
hover.in at CUFP 2009hover.in at CUFP 2009
hover.in at CUFP 2009
 
in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09
 
erlang at hover.in , Devcamp Blr 09
erlang at hover.in , Devcamp Blr 09erlang at hover.in , Devcamp Blr 09
erlang at hover.in , Devcamp Blr 09
 
end user programming & yahoo pipes
end user programming & yahoo pipesend user programming & yahoo pipes
end user programming & yahoo pipes
 
Overview of End-user Programming | Talk at DesignCamp Bangalore
Overview of End-user Programming | Talk at DesignCamp BangaloreOverview of End-user Programming | Talk at DesignCamp Bangalore
Overview of End-user Programming | Talk at DesignCamp Bangalore
 

Último

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 

Último (20)

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Key-value stores for efficient crawling and indexing

  • 1. thinking in key-value stores http://developers.hover.in Bhasker V Kode co-founder & CTO at hover.in at AWS Chennai November 10th, 2009
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 8. http://developers.hover.in “ The domino effect is a chain reaction that occurs when a small change causes a similar change nearby, which then will cause another similar change.” via wikipedia page on Domino Effect
  • 9.
  • 10.
  • 11.
  • 12. http://developers.hover.in “ the small portions of the program which cannot be parallelized will limit the overall speed-up available ” - Amdahl's Law wrt parallel computing...
  • 13.
  • 14. http://developers.hover.in Further , A & B need'nt be serial, but a batch of A's & a parallel batch of B's running tasks serially. Flowcontrol implemented by tail-recursive servers handling bursty AND slow traffic well. flowcontrol 1 flowcontrol 2 where is free cpu / time for handling 2x B tasks
  • 15.
  • 17. http://developers.hover.in # replication vs location transparency
  • 18. http://developers.hover.in how it works at hover.in: I , “ node X ” will rather do tasks locally since this data is designated to me, rather than rpc'ing all over like crazy ! The meta-data of which node does what calculated by (a) statically assigned (our choice) (b) or a hash fn (c) dynamically reassigned (maybe later) which is made available to nodes by (a) replication or (b) location transparency (our choice) Node1 Node2 NodeN
  • 19.
  • 21.
  • 22. fixed-length datastructures to the rescue http://developers.hover.in
  • 23.
  • 24.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35. brief introduction to hover.in choose words from your blog, & decide what content / ad you want when you hover* over it * or other events like click,right click,etc http://developers.hover.in or... the worlds first user-engagement platform for brands via in-text broadcasting or lets web publishers push client-side event handling to the cloud , to run various rich applications called hoverlets demo at http://start.hover.in/ and http://hover.in/demo more at http://hover.in , http://developers.hover.in/blog/
  • 36.
  • 38.