SlideShare uma empresa Scribd logo
1 de 51
ElephantDB


             Nathan Marz
              BackType
             @nathanmarz
Specialized database for
  exporting key/value
  data from Hadoop
ElephantDB
                            Server

  File      File

                         ElephantDB   Key/value queries
                            Server
    File       File


Distributed Filesystem
                         ElephantDB
                            Server
First, some context
BackType’s Challenges
BackType’s Challenges

 Complex analytics
BackType’s Challenges

 Complex analytics
on lots of data (> 30TB)
BackType’s Challenges

 Complex analytics
on lots of data (> 30TB)
      in realtime
What is a data system?
Data system

                     View 1
Raw data


                     View 2




                     View 3
Data system

                   # Tweets /
                      URL
Tweets


                   Influence
                     scores



                   Trending
                    topics
Our approach

    Speed Layer




    Batch Layer
Batch layer

                   MapReduce   Batch
Master dataset
                               View 1



                   MapReduce   Batch
                               View 2



                               Batch
                               View 3
                   MapReduce
(this is too slow in practice)
Incremental batch layer
                                                Batch
                                                View 1




New data
             Batch                   View       Batch
                                                View 2
                                  maintenance
            workflow
           Query              Append            Batch
                                                View 3

                   All data
Batch views are always out of
    date by a few hours
Speed layer

                     DB




Messages



                     DB
Compensate for high latency
 of updates to batch layer
Only needs to worry about
  last few hours of data
Application-level Queries

  Batch Layer   Query

                        Merge

  Speed Layer   Query
Batch layer

                 MapReduce   Batch
Master dataset

                                      ElephantDB
                             View 1



                 MapReduce   Batch
                                      ElephantDB
                             View 2



                             Batch    ElephantDB
                             View 3
                 MapReduce
ElephantDB

• Random reads
• Batch writes
• Random writes
Why ElephantDB?

• Simplicity
• Ease of use
• Trivially scalable
• “Just works”
Creating an ElephantDB
index is disassociated from
     serving an index
ElephantDB indexing
                data flow

                              sharded key/value
key/value pairs                   database
                  MapReduce                       Distributed filesystem
Index creation


• ElephantDB just used as a library
• No dependencies on a live ElephantDB
  cluster
ElephantDB serving
     data flow
Distributed filesystem
       Shard            ElephantDB
                           Server
       Shard

       Shard            ElephantDB
                           Server
       Shard
ElephantDB serving


• Each server in “ring” serves a subset of the
  data
• Download shards, open, serve
Terminology
• “Domain”: related set of key/value pairs
• “Shard”: Subset of a domain
• “Ring”: Cluster of servers that work
  together to serve same set of domains
• “Local persistence”: Regular key/value
  database that implements a shard (like
  Berkeley DB)
ElephantDB domain




• Versioned
• domain-spec.yaml contains metadata
  (number of shards, how domains are
  stored)
ElephantDB version




• Folder for each shard
ElephantDB Shard




• Files for local persistence
• Berkeley DB JE in this example
Writing to ElephantDB




        Cascading
Writing to ElephantDB




        Cascalog
MapReduce flow
  (key, value)            (shard, list<key, value>)

                                              Stream into LP

        hash mod      Group by shard

                                            Local
                                       Persistence on             Shard on DFS
                                                         Upload
(shard, key, value)                          LFS
Incremental ElephantDB
  (key, value)            (shard, list<key, value>)

                                              Stream into LP      Old shard on
        hash mod      Group by shard               Download           DFS

                                            Local
                                                                  New shard on
                                       Persistence on
                                                         Upload      DFS
(shard, key, value)                          LFS
Incremental ElephantDB

• Avoid reindexing domain from scratch
• Massive performance benefits in creating/
  updating versions
• ElephantDB ring still has to download
  domain from scratch
Incremental ElephantDB
Configuring a ring
Querying ElephantDB
Consuming ElephantDB
  from MapReduce

• Read from shards in DFS, not from live
  ElephantDB servers
• “Indexed key/value file format on Hadoop”
Demo
One-click deploy
CAP Theorem

In a distributed database, pick at most two:


              Consistency
              Availability
         Partition Tolerance
ElephantDB and CAP


      Availability

   Partition Tolerance
ElephantDB and CAP


But, a very predictable kind of consistency
Comparison to HBase

• HBase can be written to in batch
• Supports random writes
• So why ElephantDB?
Comparison to HBase

• HBase has an enormous complexity cost
• HBase has many moving parts
• Does not “just work”
Comparison to HBase


• HBase is not highly available
• ElephantDB has ability to revert to older
  versions of indexes
Implementation details

• Written in Clojure
• Thrift interface
• Local Persistences are pluggable
Questions?

   Twitter: @nathanmarz

Email: nathan.marz@gmail.com

 Web: http://nathanmarz.com

Mais conteúdo relacionado

Mais procurados

Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and FutureDataWorks Summit
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)NTT DATA Technology & Innovation
 
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみたtokita-r
 
PostgreSQL のイケてるテクニック7選
PostgreSQL のイケてるテクニック7選PostgreSQL のイケてるテクニック7選
PostgreSQL のイケてるテクニック7選Tomoya Kawanishi
 
ログ解析基盤におけるストリーム処理パイプラインについて
ログ解析基盤におけるストリーム処理パイプラインについてログ解析基盤におけるストリーム処理パイプラインについて
ログ解析基盤におけるストリーム処理パイプラインについてcyberagent
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Introducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneIntroducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneSease
 
Hive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたHive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたRecruit Technologies
 
Hive Bucketing in Apache Spark
Hive Bucketing in Apache SparkHive Bucketing in Apache Spark
Hive Bucketing in Apache SparkTejas Patil
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesDatabricks
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep divet3rmin4t0r
 
Integrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinIntegrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinDatabricks
 
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpn
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpnCassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpn
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpnhaketa
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introductionsudhakara st
 

Mais procurados (20)

Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and Future
 
Hiveを高速化するLLAP
Hiveを高速化するLLAPHiveを高速化するLLAP
Hiveを高速化するLLAP
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)
ポスト・ラムダアーキテクチャの切り札? Apache Hudi(NTTデータ テクノロジーカンファレンス 2020 発表資料)
 
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた
【GOJAS Meetup-10】Splunk:SmartStoreを使ってみた
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
PostgreSQL のイケてるテクニック7選
PostgreSQL のイケてるテクニック7選PostgreSQL のイケてるテクニック7選
PostgreSQL のイケてるテクニック7選
 
ログ解析基盤におけるストリーム処理パイプラインについて
ログ解析基盤におけるストリーム処理パイプラインについてログ解析基盤におけるストリーム処理パイプラインについて
ログ解析基盤におけるストリーム処理パイプラインについて
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Introducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneIntroducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache Lucene
 
Hive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみたHive on Spark の設計指針を読んでみた
Hive on Spark の設計指針を読んでみた
 
Hive Bucketing in Apache Spark
Hive Bucketing in Apache SparkHive Bucketing in Apache Spark
Hive Bucketing in Apache Spark
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
 
Integrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinIntegrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther Kundin
 
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpn
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpnCassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpn
Cassandra導入事例と現場視点での苦労したポイント cassandra summit2014jpn
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Hadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返りHadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返り
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introduction
 

Destaque

Splout SQL - Web latency SQL views for Hadoop
Splout SQL - Web latency SQL views for HadoopSplout SQL - Web latency SQL views for Hadoop
Splout SQL - Web latency SQL views for Hadoopdatasalt
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
ARCHIVE - XCC 2.0, New presentation is here: http://www.slideshare.net/timet...
ARCHIVE -  XCC 2.0, New presentation is here: http://www.slideshare.net/timet...ARCHIVE -  XCC 2.0, New presentation is here: http://www.slideshare.net/timet...
ARCHIVE - XCC 2.0, New presentation is here: http://www.slideshare.net/timet...TIMETOACT GROUP
 
OnTime für Salesforce - Integration von IBM Domino Kalender Kontakten und Ma...
OnTime für Salesforce  - Integration von IBM Domino Kalender Kontakten und Ma...OnTime für Salesforce  - Integration von IBM Domino Kalender Kontakten und Ma...
OnTime für Salesforce - Integration von IBM Domino Kalender Kontakten und Ma...Andreas Rosen
 
The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processingnathanmarz
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineeringnathanmarz
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itnathanmarz
 
Dynamic Wellness JourneyCare Goal setting and research
Dynamic Wellness JourneyCare Goal setting and researchDynamic Wellness JourneyCare Goal setting and research
Dynamic Wellness JourneyCare Goal setting and researchaltonbaird
 
C. V Hanaa Ahmed
C. V Hanaa AhmedC. V Hanaa Ahmed
C. V Hanaa AhmedHanaa Ahmed
 
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...wepc2016
 
I love free_nsta2010
I love free_nsta2010I love free_nsta2010
I love free_nsta2010Jan Coley
 
Congelamiento de precios productos en cooperativa obrera
Congelamiento de precios   productos en cooperativa obreraCongelamiento de precios   productos en cooperativa obrera
Congelamiento de precios productos en cooperativa obreraDiario Elcomahueonline
 
Η Σπάρτη
Η ΣπάρτηΗ Σπάρτη
Η Σπάρτηvasso76
 
Congelamiento de Precios - Productos en supermercados libertad
Congelamiento de Precios - Productos en supermercados libertadCongelamiento de Precios - Productos en supermercados libertad
Congelamiento de Precios - Productos en supermercados libertadDiario Elcomahueonline
 

Destaque (20)

Splout SQL - Web latency SQL views for Hadoop
Splout SQL - Web latency SQL views for HadoopSplout SQL - Web latency SQL views for Hadoop
Splout SQL - Web latency SQL views for Hadoop
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
ARCHIVE - XCC 2.0, New presentation is here: http://www.slideshare.net/timet...
ARCHIVE -  XCC 2.0, New presentation is here: http://www.slideshare.net/timet...ARCHIVE -  XCC 2.0, New presentation is here: http://www.slideshare.net/timet...
ARCHIVE - XCC 2.0, New presentation is here: http://www.slideshare.net/timet...
 
OnTime für Salesforce - Integration von IBM Domino Kalender Kontakten und Ma...
OnTime für Salesforce  - Integration von IBM Domino Kalender Kontakten und Ma...OnTime für Salesforce  - Integration von IBM Domino Kalender Kontakten und Ma...
OnTime für Salesforce - Integration von IBM Domino Kalender Kontakten und Ma...
 
The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processing
 
Voldemort
VoldemortVoldemort
Voldemort
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineering
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop it
 
Cafe Con Leche
Cafe Con LecheCafe Con Leche
Cafe Con Leche
 
Dynamic Wellness JourneyCare Goal setting and research
Dynamic Wellness JourneyCare Goal setting and researchDynamic Wellness JourneyCare Goal setting and research
Dynamic Wellness JourneyCare Goal setting and research
 
C. V Hanaa Ahmed
C. V Hanaa AhmedC. V Hanaa Ahmed
C. V Hanaa Ahmed
 
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...
Day 1: Digital parliamentarians: Tools, opportunities and challenges for elec...
 
Banja Luka
Banja LukaBanja Luka
Banja Luka
 
Green it
Green itGreen it
Green it
 
I love free_nsta2010
I love free_nsta2010I love free_nsta2010
I love free_nsta2010
 
Play station 4 camilo q
Play station 4 camilo q Play station 4 camilo q
Play station 4 camilo q
 
Congelamiento de precios productos en cooperativa obrera
Congelamiento de precios   productos en cooperativa obreraCongelamiento de precios   productos en cooperativa obrera
Congelamiento de precios productos en cooperativa obrera
 
Η Σπάρτη
Η ΣπάρτηΗ Σπάρτη
Η Σπάρτη
 
Del mito a la actualidad
Del mito a la actualidadDel mito a la actualidad
Del mito a la actualidad
 
Congelamiento de Precios - Productos en supermercados libertad
Congelamiento de Precios - Productos en supermercados libertadCongelamiento de Precios - Productos en supermercados libertad
Congelamiento de Precios - Productos en supermercados libertad
 

Semelhante a ElephantDB

Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messagesyarapavan
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconYiwei Ma
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统yongboy
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase强 王
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetCloudera, Inc.
 
Clojure at BackType
Clojure at BackTypeClojure at BackType
Clojure at BackTypenathanmarz
 
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...lucenerevolution
 
Transforming the house hunting experience
Transforming the house hunting experienceTransforming the house hunting experience
Transforming the house hunting experienceLucidworks (Archived)
 
Impala presentation
Impala presentationImpala presentation
Impala presentationtrihug
 
[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)baggioss
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesOReillyStrata
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)Amazon Web Services
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systemsnathanmarz
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdSATOSHI TAGOMORI
 
No sql solutions - 공개용
No sql solutions - 공개용No sql solutions - 공개용
No sql solutions - 공개용Byeongweon Moon
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentationMapR Technologies
 
Scaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseScaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseAge Mooij
 

Semelhante a ElephantDB (20)

Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messages
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qcon
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 
Impala for PhillyDB Meetup
Impala for PhillyDB MeetupImpala for PhillyDB Meetup
Impala for PhillyDB Meetup
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
 
Clojure at BackType
Clojure at BackTypeClojure at BackType
Clojure at BackType
 
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...
Transforming house hunting with solr at trulia - By Kanarsky Alexander and Al...
 
Transforming the house hunting experience
Transforming the house hunting experienceTransforming the house hunting experience
Transforming the house hunting experience
 
Impala presentation
Impala presentationImpala presentation
Impala presentation
 
[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
 
RuG Guest Lecture
RuG Guest LectureRuG Guest Lecture
RuG Guest Lecture
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systems
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentd
 
No sql solutions - 공개용
No sql solutions - 공개용No sql solutions - 공개용
No sql solutions - 공개용
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentation
 
Apache Drill
Apache DrillApache Drill
Apache Drill
 
Scaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseScaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBase
 

Mais de nathanmarz

Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easynathanmarz
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineeringnathanmarz
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrongnathanmarz
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypenathanmarz
 
Cascalog workshop
Cascalog workshopCascalog workshop
Cascalog workshopnathanmarz
 
Cascalog at Strange Loop
Cascalog at Strange LoopCascalog at Strange Loop
Cascalog at Strange Loopnathanmarz
 
Cascalog at Hadoop Day
Cascalog at Hadoop DayCascalog at Hadoop Day
Cascalog at Hadoop Daynathanmarz
 
Cascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User GroupCascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User Groupnathanmarz
 

Mais de nathanmarz (12)

Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easy
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrong
 
Storm
StormStorm
Storm
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackType
 
Cascalog workshop
Cascalog workshopCascalog workshop
Cascalog workshop
 
Cascalog at Strange Loop
Cascalog at Strange LoopCascalog at Strange Loop
Cascalog at Strange Loop
 
Cascalog at Hadoop Day
Cascalog at Hadoop DayCascalog at Hadoop Day
Cascalog at Hadoop Day
 
Cascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User GroupCascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User Group
 
Cascalog
CascalogCascalog
Cascalog
 
Cascading
CascadingCascading
Cascading
 

Último

Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Último (20)

Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

ElephantDB

Notas do Editor

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n
  34. \n
  35. \n
  36. \n
  37. \n
  38. \n
  39. \n
  40. \n
  41. \n
  42. \n
  43. \n
  44. \n
  45. \n
  46. \n
  47. \n
  48. \n
  49. \n
  50. \n