Lucene Revolution
                         San Fran 2011
                         Using SolrCloud for real




Thursday, May 26, 2011
Whoggly?
        • I’m Jon, a happy Lucene hacker since 2004
        • We do Logging as a Service (SaaS with a single focus)
                 ‣   Consolidation, Archiving, Search, Alerting (soon!)
                 ‣   You stream your logs to us, we index them, you search
        • Full public launch on Feb 2, 2011
        • Every customer has their own index
                 ‣   ~1500 customers, ~8k shards, ~7B docs, ~3TB index
        • Search finds the splinter in the log-jam
                 ‣   What happened? When? How often?
                         All your logs are belong to you   2


Time is on our side...
        • We’re not solving a typical search problem
                 ‣   Endless stream of data (all your logs)
                 ‣   Time is our “score”
                 ‣   Write once, update never
                 ‣   Large number of very small documents (events)
                 ‣   Search is mostly about what just happened
        • So, simple index life-cycle with “natural” sharding
                 ‣   For us, a shard is a slice of a single customer’s index, bounded by its
                     start and end time
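A minimal sketch of the "natural" time sharding described above, assuming a shard is nothing more than a per-customer (start, end) interval. The `Shard` class and the `shards_for_query` helper are hypothetical names for illustration, not Loggly's or Solr's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """Hypothetical model: one time-slice of one customer's index."""
    customer: str
    start: int  # inclusive, epoch seconds
    end: int    # exclusive, epoch seconds


def shards_for_query(shards, customer, q_start, q_end):
    """Return the customer's shards whose time range overlaps [q_start, q_end)."""
    hits = [
        s for s in shards
        if s.customer == customer and s.start < q_end and q_start < s.end
    ]
    # Newest first, since search is mostly about what just happened.
    return sorted(hits, key=lambda s: s.start, reverse=True)


shards = [
    Shard("acme", 0, 3600),
    Shard("acme", 3600, 7200),
    Shard("acme", 7200, 10800),
    Shard("other", 0, 3600),
]
recent = shards_for_query(shards, "acme", 7000, 9000)
print([(s.start, s.end) for s in recent])
```

Because time is the only sharding key, a query with a time range touches only the slices it overlaps, and other customers' shards are never considered.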

Why SolrCloud?
        • The wheel existed, and keeps getting better
                 ‣   Thanks to the community. Big Thanks to Mark & Yonik
        • Solr: Multi-core, Plugins, Facets, migration, caching, ...
                 ‣   Our Solr instances have hundreds of cores (one per shard)
                 ‣   We’ve made very few changes to core Solr code
        • ZooKeeper: state of all nodes & shards, so...
                 ‣   Automatic cluster config
                 ‣   Cluster & Shard changes visible everywhere, almost instantly
                 ‣   Any node can service any request (except for indexing)

Cluster Management
         • Solr instances register & deregister themselves in ZooKeeper
                  ‣      always know what Solr instances are available
         • All Solr configs in ZK except for solr.xml
                  ‣      all instances use same schema, etc, so simple management
                  ‣      solr.xml is “special”, instance specific
         • We’ve added our own persistent data
                  ‣      Loggly-specific config, and some performance data for Solr
                  ‣      Other app configs, “live” status


Index Management
         • One Collection (“index”) per customer
                  ‣      Sharded by time, multiple shards per customer
         • Shards migrated from node to node using replication
                  ‣      Minor changes to existing Solr replication code
         • Shards merged by us, never automatically
                  ‣      We merge to create longer time-slices
                  ‣      Doing it manually makes merge load more predictable
         • Completely distributed management, no “Master”
                  ‣      Simple, Robust, “need to know”
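As a rough illustration of merging shards into longer time-slices, the sketch below plans merges over adjacent (start, end) slices. The function name `plan_merges` and the `target_span` parameter are assumptions made for this example; the slides do not describe how merges are actually chosen.

```python
def plan_merges(slices, target_span):
    """Group adjacent (start, end) slices into longer slices of at most target_span.

    `slices` must be sorted by start time. Two slices are merged only when one
    ends exactly where the next begins and the result fits in target_span.
    """
    merged = []
    for start, end in slices:
        if merged:
            cur_start, cur_end = merged[-1]
            if cur_end == start and end - cur_start <= target_span:
                merged[-1] = (cur_start, end)  # extend the current time-slice
                continue
        merged.append((start, end))
    return merged


hourly = [(0, 3600), (3600, 7200), (7200, 10800), (10800, 14400), (20000, 23600)]
print(plan_merges(hourly, target_span=7200))
```

Producing a plan first and executing it separately is one way to keep merge load predictable, since the operator decides when each merge actually runs.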


SolrCloud, meet reality
         • Day 1 (patch to trunk, 18 months ago), almost everything JFW’ed
         • The exceptions...
                  ‣      We had to fix a couple of TODO’s
                  ‣      We select shards for search a little differently
                  ‣      We hit a performance wall
                  ‣      We added some utilities to the ZK controller/client
         • None of these were difficult
                  ‣      Today, everything does JFW (for us)


TODO’s
         • Very little missing, even 18 months ago...
         • FacetComponent.countFacets()
             ‣ // TODO: facet dates

                  ‣      We did it. Date faceting has since been added to trunk (not our code)
         • ZkController.unregister()
             ‣ // TODO : perhaps mark the core down in zk?

                  ‣      We did it: remove the shard from ZK, walk up the tree removing its
                         parents if they’re empty
         • One more, but no spoilers...
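The unregister behaviour described above (remove the shard, then walk up the tree removing empty parents) can be sketched against an in-memory stand-in for ZooKeeper. The set-of-paths model, the `/collections/...` layout, and the `remove_and_prune` name are all assumptions for illustration; this is not the ZkController code.

```python
def remove_and_prune(tree, path, root="/collections"):
    """Delete `path`, then delete each ancestor that is left with no children.

    `tree` is a set of absolute znode-style paths standing in for ZooKeeper.
    Pruning stops at `root`, which is never removed.
    """
    tree.discard(path)
    parent = path.rsplit("/", 1)[0]
    while parent and parent != root:
        has_children = any(p.startswith(parent + "/") for p in tree)
        if has_children:
            break
        tree.discard(parent)
        parent = parent.rsplit("/", 1)[0]
    return tree


tree = {
    "/collections",
    "/collections/acme",
    "/collections/acme/shards",
    "/collections/acme/shards/s1",
    "/collections/other",
    "/collections/other/shards",
    "/collections/other/shards/s1",
    "/collections/other/shards/s2",
}
remove_and_prune(tree, "/collections/acme/shards/s1")   # last shard: parents go too
remove_and_prune(tree, "/collections/other/shards/s1")  # sibling s2 keeps parents
print(sorted(tree))
```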

Shard selection
         • QueryComponent.checkDistributed() selects shards for each
           “slice”, based on the Collection of the core being queried.
         • We changed some things for version 1...
                  ‣      use a “loggly” core, and pass in the collection
                  ‣      select the biggest shard when overlaps occur
                  ‣      select the “best” copy of a shard on multiple machines
         • Now we use a plugin variant of the admin handler
                  ‣      avoids special case core
                  ‣      lets us do shard-level short-circuiting for search (not facets)
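The two selection rules above (prefer the biggest shard when time ranges overlap, and pick the "best" copy when a shard lives on several machines) might look roughly like this. The sketch assumes "biggest" means the longest time span and "best" means the lowest load figure; the slides do not define either, and `Replica`, `pick_best_copies`, and `drop_overlaps` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical record for one copy of a shard on one node."""
    shard: str
    node: str
    start: int
    end: int
    load: float  # lower is better; stands in for whatever "best" means


def pick_best_copies(replicas):
    """Keep one replica per shard: the copy with the lowest load."""
    best = {}
    for r in replicas:
        if r.shard not in best or r.load < best[r.shard].load:
            best[r.shard] = r
    return list(best.values())


def drop_overlaps(candidates):
    """When time ranges overlap, keep the shard covering the longest span."""
    chosen = []
    for r in sorted(candidates, key=lambda r: r.end - r.start, reverse=True):
        if all(r.end <= c.start or c.end <= r.start for c in chosen):
            chosen.append(r)
    return sorted(chosen, key=lambda r: r.start)


replicas = [
    Replica("day1", "n1", 0, 86400, 0.7),
    Replica("day1", "n2", 0, 86400, 0.2),
    Replica("hour3", "n1", 7200, 10800, 0.1),   # overlaps day1
    Replica("day2-h1", "n3", 86400, 90000, 0.5),
]
selected = drop_overlaps(pick_best_copies(replicas))
print([(r.shard, r.node) for r in selected])
```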

Performance Fun
         • ZkStateReader...
                  ‣      // TODO: - possibly: incremental update rather than reread everything?

                  ‣      Yep, EVERYTHING in ZooKeeper, every time we update anything
                           ‣   to be fair, we’re kind of a pathological case
                  ‣      Watchers for every collection, every shard
         • The Perfect storm
                  ‣      Full read of ZK with 1000’s of shards ain’t fast
                  ‣      We weren’t using scheduled updates correctly
                  ‣      We’re updating frequently, triggering lots of Watcher updates
         • Up to 20 second wait for CloudState


“Just avoid holding it in that way”
       • Watch fewer things
                ‣   Every node doesn’t have to know about every shard change
       • Incremental updates
                ‣   When a watch triggers, we rebuild only that data
       • On-demand Collection updates
                ‣   We read individual Collection info whenever it’s needed
       • Wait for CloudState is now tens of milliseconds
                ‣   200-400 CloudState updates / minute
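To make the difference between a full reread and the incremental, on-demand approach concrete, here is a toy cache that counts reads. `CloudStateCache` and its methods are hypothetical, and `read_collection` stands in for a ZooKeeper read; it is a sketch of the idea, not the ZkStateReader implementation.

```python
class CloudStateCache:
    """Toy cache contrasting a full reread with per-collection updates."""

    def __init__(self, read_collection):
        # read_collection(name) -> dict; stands in for a ZooKeeper read.
        self._read = read_collection
        self._state = {}
        self.reads = 0

    def _load(self, name):
        self.reads += 1
        self._state[name] = self._read(name)

    def full_refresh(self, names):
        """The slow path: reread everything, whatever changed."""
        for name in names:
            self._load(name)

    def on_watch_fired(self, name):
        """Incremental path: rebuild only the collection whose watch triggered."""
        if name in self._state:
            self._load(name)

    def get(self, name):
        """On-demand path: read a collection the first time it is needed."""
        if name not in self._state:
            self._load(name)
        return self._state[name]


backing = {f"customer{i}": {"shards": i % 5 + 1} for i in range(1500)}
cache = CloudStateCache(lambda name: dict(backing[name]))
cache.get("customer7")
backing["customer7"]["shards"] = 9
cache.on_watch_fired("customer7")
print(cache.get("customer7"), cache.reads)
```

In this toy model one change costs one read, where `full_refresh` over the same 1500 collections would cost 1500.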



ZK Utilities
         • Cleanup
                  ‣      nuke: rm -rf for the ZooKeeper tree
                  ‣      purgeShards: delete all shards for collection X
                  ‣      purgeNode: delete all references to node Y
         • upload: upload a file or directory
         • loggly_node: our config secret sauce
                  ‣      based on ZkNodeProps
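The cleanup utilities listed above can be illustrated against a dictionary standing in for the ZooKeeper tree. The path layout, the `node` field in each entry, and the function names below are assumptions for the sake of the example; the real utilities operate on a live ZooKeeper ensemble.

```python
def nuke(tree, path):
    """Remove `path` and everything beneath it, like rm -rf on a znode."""
    doomed = [p for p in tree if p == path or p.startswith(path + "/")]
    for p in doomed:
        del tree[p]
    return len(doomed)


def purge_shards(tree, collection):
    """Delete all shards recorded for one collection (hypothetical layout)."""
    return nuke(tree, f"/collections/{collection}/shards")


def purge_node(tree, node):
    """Delete every entry whose data names `node`."""
    doomed = [p for p, data in tree.items() if data.get("node") == node]
    for p in doomed:
        del tree[p]
    return len(doomed)


tree = {
    "/collections/acme/shards": {},
    "/collections/acme/shards/s1": {"node": "solr1"},
    "/collections/acme/shards/s2": {"node": "solr2"},
    "/collections/other/shards": {},
    "/collections/other/shards/s1": {"node": "solr1"},
    "/live_nodes/solr1": {"node": "solr1"},
}
removed_for_node = purge_node(tree, "solr1")
removed_for_acme = purge_shards(tree, "acme")
print(removed_for_node, removed_for_acme, sorted(tree))
```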




Loggly magic
         • We’ve spent most of our time on Shard Management
                  ‣      Plugins FTW
         • Minimal changes to Solr itself
                  ‣      0MQ streaming data input
                  ‣      More logging, to verify our Shard Management is working
         • Lots of admin API extensions
                  ‣      Lets our other apps tell Solr what to expect
                  ‣      Lets Solr tell other apps what’s going on
                  ‣      Lets us fix things by hand when things go wrong

Loggly ZK magic
         • We use ZK to store indexing performance data
                  ‣      lets us load balance our indexers
         • We use ZK in our other apps
                  ‣      we have “live_nodes”++ for ALL apps
                  ‣      one-offs are easy when the entire state of the system is available
                          ‣   JSON output + Ruby + REST = easy-peasy
         • ZK is very robust
                  ‣      transitioned 5-node ZK cluster to all new hosts with 0 impact
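The indexer load balancing mentioned above could, under some assumptions, reduce to picking the live indexer with the most headroom. The shape of the performance record (`live`, `docs_per_sec`, `capacity`) and the `pick_indexer` name are invented for this sketch; the slides only say that indexing performance data is stored in ZK.

```python
def pick_indexer(perf):
    """Choose the live indexer with the most spare capacity.

    `perf` maps node name -> {"live": bool, "docs_per_sec": float, "capacity": float},
    a hypothetical shape for the performance data kept in ZooKeeper.
    """
    live = {n: p for n, p in perf.items() if p["live"]}
    if not live:
        raise RuntimeError("no live indexers")
    return max(live, key=lambda n: live[n]["capacity"] - live[n]["docs_per_sec"])


perf = {
    "indexer1": {"live": True, "docs_per_sec": 9000.0, "capacity": 10000.0},
    "indexer2": {"live": True, "docs_per_sec": 4000.0, "capacity": 10000.0},
    "indexer3": {"live": False, "docs_per_sec": 0.0, "capacity": 10000.0},
}
print(pick_indexer(perf))
```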


Wish List
         • Standalone jar for the Zk* classes
                  ‣      Simplify access to the Solr specific data in other apps
                  ‣      Re-use SolrCloud wrappers around ZK (which are nice)
         • Better control of watchers
                  ‣      watching too many things considered harmful
         • Plugin shard selection for search
                  ‣      Hacking existing QueryComponent or replacing it entirely are both
                         kind of brute-force.
                  ‣      Maybe this is just us, though

Other Stuff We Use
         • Amazon’s AWS for infrastructure (EC2, S3)
         • syslog-ng for syslog/TLS input services
         • 0MQ for event queuing & work distribution
         • MongoDB for statistics and API /stat methods
         • Node.js for HTTP/HTTPS input services
         • Django/Python for middleware/app




ACK/FIN
          One Last Thing... http://logg.ly/jobs

 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
 
What's new in solr june 2014
What's new in solr june 2014What's new in solr june 2014
What's new in solr june 2014
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DC
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLK
 

Using Solr Cloud to Tame an Index Explosion

  • 1. Lucene Revolution, San Fran 2011: Using SolrCloud for real (Thursday, May 26, 2011)
  • 2. Whoggly?
        • I’m Jon, a happy Lucene hacker since 2004
        • We do Logging as a Service (SaaS with a single focus)
                 ‣   Consolidation, Archiving, Search, Alerting (soon!)
                 ‣   You stream your logs to us, we index them, you search
        • Full public launch on Feb 2, 2011
        • Every customer has their own index
                 ‣   ~1500 customers, ~8k shards, ~7B docs, ~3TB index
        • Search finds the splinter in the log-jam
                 ‣   What happened? When? How often?
        All your logs are belong to you
  • 4. Time is on our side...
        • We’re not solving a typical search problem
                 ‣   Endless stream of data (all your logs)
                 ‣   Time is our “score”
                 ‣   Write once, update never
                 ‣   Large number of very small documents (events)
                 ‣   Search is mostly about what just happened
        • So, simple index life-cycle with “natural” sharding
                 ‣   For us, a shard is a slice of a single customer’s index, based on start and end time
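The time-sliced sharding on this slide can be sketched roughly as below. This is only an illustration, not Loggly's code: the `Shard` record, the `shards_for_range` helper, and the use of epoch seconds for slice boundaries are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """Hypothetical shard record: one time slice of one customer's index."""
    customer: str
    start: int  # inclusive, epoch seconds
    end: int    # exclusive, epoch seconds


def shards_for_range(shards, customer, query_start, query_end):
    """Return the customer's shards whose slice overlaps [query_start, query_end)."""
    hits = [
        s for s in shards
        if s.customer == customer and s.start < query_end and s.end > query_start
    ]
    # Newest first, since search is mostly about what just happened.
    return sorted(hits, key=lambda s: s.start, reverse=True)


shards = [
    Shard("acme", 0, 3600),
    Shard("acme", 3600, 7200),
    Shard("acme", 7200, 10800),
    Shard("other", 0, 10800),
]
recent = shards_for_range(shards, "acme", 7000, 9000)
```

Because documents are written once and never updated, a query's time range alone decides which shards need to be searched; older slices can be skipped without consulting them.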
  • 5. Why SolrCloud?
        • The wheel existed, and keeps getting better
                 ‣   Thanks to the community. Big Thanks to Mark & Yonik
        • Solr: Multi-core, Plugins, Facets, migration, caching, ...
                 ‣   Our Solr instances have hundreds of cores (one per shard)
                 ‣   We’ve made very few changes to core Solr code
        • Zookeeper: state of all nodes & shards, so...
                 ‣   Automatic cluster config
                 ‣   Cluster & Shard changes visible everywhere, almost instantly
                 ‣   Any node can service any request (except for indexing)
  • 6. Cluster Management
        • Solr instances register & deregister themselves in ZooKeeper
                 ‣   always know what Solr instances are available
        • All Solr configs in ZK except for solr.xml
                 ‣   all instances use same schema, etc, so simple management
                 ‣   solr.xml is “special”, instance specific
        • We’ve added our own persistent data
                 ‣   Loggly-specific config, and some performance data for Solr
                 ‣   Other app configs, “live” status
  • 7. Index Management
        • One Collection (“index”) per customer
                 ‣   Sharded by time, multiple shards per customer
        • Shards migrated from node to node using replication
                 ‣   Minor changes to existing Solr replication code
        • Shards merged by us, never automatically
                 ‣   We merge to create longer time-slices
                 ‣   Doing it manually makes merge load more predictable
        • Completely distributed management, no “Master”
                 ‣   Simple, Robust, “need to know”
  • 8. SolrCloud, meet reality
        • Day 1 (patch to trunk, 18 months ago), almost everything JFW’ed
        • The exceptions...
                 ‣   We had to fix a couple of TODO’s
                 ‣   We select shards for search a little differently
                 ‣   We hit a performance wall
                 ‣   We added some utilities to the ZK controller/client
        • None of these were difficult
                 ‣   Today, everything does JFW (for us)
  • 9. TODO’s
        • Very little missing, even 18 months ago...
        • FacetComponent.countFacets()
                 ‣   // TODO: facet dates
                 ‣   We did it. Since been added to trunk (not our code)
        • ZkController.unregister()
                 ‣   // TODO : perhaps mark the core down in zk?
                 ‣   We did it: remove the shard from ZK, walk up the tree removing its parents if they’re empty
        • One more, but no spoilers...
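The "remove the shard, then walk up the tree removing empty parents" step can be sketched as follows. This is not the actual `ZkController.unregister()` change: a plain Python set of paths stands in for ZooKeeper, and the `/collections/...` layout and the `remove_and_prune` name are hypothetical.

```python
def remove_and_prune(tree, path, root="/collections"):
    """Delete `path` from `tree`, then delete each ancestor left without children.

    `tree` is a set of absolute znode-style paths standing in for ZooKeeper.
    Pruning stops at `root`, which is never removed.
    """
    tree.discard(path)
    parent = path.rsplit("/", 1)[0]
    while parent and parent != root:
        has_children = any(p.startswith(parent + "/") for p in tree)
        if has_children:
            break  # a sibling still lives here, so this ancestor stays
        tree.discard(parent)
        parent = parent.rsplit("/", 1)[0]
    return tree


tree = {
    "/collections",
    "/collections/acme",
    "/collections/acme/shards",
    "/collections/acme/shards/s1",
    "/collections/beta",
    "/collections/beta/shards",
    "/collections/beta/shards/s1",
    "/collections/beta/shards/s2",
}
remove_and_prune(tree, "/collections/acme/shards/s1")  # last shard: parents go too
remove_and_prune(tree, "/collections/beta/shards/s2")  # sibling remains: parents stay
```

Against a real ZooKeeper ensemble the same walk would need to tolerate races, since another node may add a child between the emptiness check and the delete.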
  • 10. Shard selection
        • QueryComponent.checkDistributed() selects shards for each “slice”, based on the Collection of the core being queried.
        • We changed some things for version 1...
                 ‣   use a “loggly” core, and pass in the collection
                 ‣   select the biggest shard when overlaps occur
                 ‣   select the “best” copy of a shard on multiple machines
        • Now we use a plugin variant of the admin handler
                 ‣   avoids special case core
                 ‣   lets us do shard-level short-circuiting for search (not facets)
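One way to read "select the biggest shard when overlaps occur" is sketched below. It assumes that "biggest" means the longest time slice (it could equally mean document count), and the tuple layout and `pick_non_overlapping` helper are hypothetical rather than taken from Loggly's QueryComponent changes.

```python
def pick_non_overlapping(shards):
    """Greedy choice: prefer the longest time slice, skip anything it overlaps.

    `shards` is a list of (name, start, end) tuples with `end` exclusive.
    """
    chosen = []
    by_span = sorted(shards, key=lambda s: s[2] - s[1], reverse=True)
    for name, start, end in by_span:
        overlaps = any(start < c_end and end > c_start for _, c_start, c_end in chosen)
        if not overlaps:
            chosen.append((name, start, end))
    # Return the selection in time order.
    return sorted(chosen, key=lambda s: s[1])


candidates = [
    ("hour-0", 0, 3600),
    ("hour-1", 3600, 7200),
    ("merged-0-1", 0, 7200),  # result of merging the two hourly shards
    ("hour-2", 7200, 10800),
]
selected = pick_non_overlapping(candidates)
```

Here the merged shard wins over the two hourly shards it covers, so each event is searched once even while the pre-merge shards are still registered.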
  • 11. Performance Fun
        • ZkStateReader...
                 ‣   // TODO: - possibly: incremental update rather than reread everything?
                 ‣   Yep, EVERYTHING in ZooKeeper, every time we update anything
                 ‣   to be fair, we’re kind of a pathological case
                 ‣   Watchers for every collection, every shard
        • The Perfect storm
                 ‣   Full read of ZK with 1000’s of shards ain’t fast
                 ‣   We weren’t using scheduled updates correctly
                 ‣   We’re updating frequently, triggering lots of Watcher updates
        • Up to 20 second wait for CloudState
  • 12. “Just avoid holding it in that way”
        • Watch fewer things
                 ‣   Every node doesn’t have to know about every shard change
        • Incremental updates
                 ‣   When a watch triggers, we rebuild only that data
        • On-demand Collection updates
                 ‣   We read individual Collection info whenever it’s needed
        • Wait for CloudState is now 10’s of milliseconds
                 ‣   200-400 CloudState updates / minute
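The combination of incremental and on-demand updates can be sketched as a small cache. This is a toy model under stated assumptions, not the modified ZkStateReader: `CloudStateCache`, `on_watch_fired`, and the dict-backed store are hypothetical names standing in for the real watcher callbacks and ZooKeeper reads.

```python
class CloudStateCache:
    """Toy cache of per-collection state, refreshed one collection at a time."""

    def __init__(self, fetch_collection):
        self._fetch = fetch_collection  # callable: collection name -> state
        self._state = {}
        self.reads = 0  # number of backing-store reads, for illustration

    def get(self, name):
        """Read a collection lazily, only when something asks for it."""
        if name not in self._state:
            self._state[name] = self._fetch(name)
            self.reads += 1
        return self._state[name]

    def on_watch_fired(self, name):
        """Drop only the changed collection; everything else stays cached."""
        self._state.pop(name, None)


store = {"acme": ["s1"], "beta": ["s1", "s2"]}
cache = CloudStateCache(lambda name: list(store[name]))
cache.get("acme")
cache.get("beta")
store["acme"].append("s2")      # a shard is added for one customer
cache.on_watch_fired("acme")    # only that collection is invalidated
acme_after = cache.get("acme")  # re-read on demand
cache.get("beta")               # still served from cache
```

The cost of a change then scales with the size of the one collection that changed instead of with the whole tree, which is the difference between rereading thousands of shards and rereading a handful.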
  • 13. ZK Utilities
        • Cleanup
                 ‣   nuke: rm -rf for the ZooKeeper tree
                 ‣   purgeShards: delete all shards for collection X
                 ‣   purgeNode: delete all references to node Y
        • upload: upload a file or directory
        • loggly_node: our config secret sauce
                 ‣   based on ZkNodeProps
  • 14. Loggly magic
        • We’ve spent most of our time on Shard Management
                 ‣   Plugins FTW
        • Minimal changes to Solr itself
                 ‣   0MQ streaming data input
                 ‣   More logging, to verify our Shard Management is working
        • Lots of admin API extensions
                 ‣   Lets our other apps tell Solr what to expect
                 ‣   Lets Solr tell other apps what’s going on
                 ‣   Lets us fix things by hand when things go wrong
  • 15. Loggly ZK magic
        • We use ZK to store indexing performance data
                 ‣   lets us load balance our indexers
        • We use ZK in our other apps
                 ‣   we have “live_nodes”++ for ALL apps
                 ‣   one-offs are easy when the entire state of the system is available
                 ‣   json output + ruby + REST = easy-peasy
        • ZK is very robust
                 ‣   transitioned 5-node ZK cluster to all new hosts with 0 impact
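Load balancing indexers from shared performance data could look something like the sketch below. The slides do not say which metrics are stored, so the `pending_events` field, the node names, and the `pick_indexer` helper are all assumptions made for illustration.

```python
def pick_indexer(perf_by_node):
    """Return the name of the indexer with the smallest reported backlog.

    Ties are broken by node name so the choice is deterministic.
    """
    return min(
        perf_by_node,
        key=lambda node: (perf_by_node[node]["pending_events"], node),
    )


# Hypothetical snapshot of per-indexer data as it might be read from ZooKeeper.
perf_by_node = {
    "indexer-a": {"pending_events": 1200},
    "indexer-b": {"pending_events": 300},
    "indexer-c": {"pending_events": 300},
}
target = pick_indexer(perf_by_node)
```

Since every app can read the same snapshot, any producer can make this choice locally without asking a central coordinator.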
  • 16. Wish List
        • Standalone jar for the Zk* classes
                 ‣   Simplify access to the Solr specific data in other apps
                 ‣   Re-use SolrCloud wrappers around ZK (which are nice)
        • Better control of watchers
                 ‣   watching too many things considered harmful
        • Plugin shard selection for search
                 ‣   Hacking existing QueryComponent or replacing it entirely are both kind of brute-force.
                 ‣   Maybe this is just us, though
  • 17. Other Stuff We Use
        • Amazon’s AWS for infrastructure (EC2, S3)
        • Syslog-NG for syslog/TLS input services
        • 0MQ for event queuing & work distribution
        • MongoDB for statistics and API /stat methods
        • Node.js for HTTP/HTTPS input services
        • Django/Python for middleware/app
  • 18. ACK/FIN
        One Last Thing... http://logg.ly/jobs