These slides were presented at the Great Indian Developer Summit 2014 in Bangalore. See http://www.developermarch.com/developersummit/session.html?insert=ShalinMangar2
"SolrCloud" is the name given to Apache Solr's feature set for fault-tolerant, highly available, and massively scalable search. SolrCloud has enabled organizations to scale impressively into the billions of documents with sub-second search!
2. A subset of optional features in Solr that enable and
simplify horizontal scaling of a search index using
sharding and replication.
Goals:
performance, scalability, high availability,
simplicity, and elasticity
What is SolrCloud?
3. Terminology
● ZooKeeper: Distributed coordination service that
provides centralized configuration, cluster state
management, and leader election
● Node: JVM process bound to a specific port on a machine;
hosts the Solr web application
● Collection: Search index distributed across multiple
nodes; each collection has a name, shard count, and
replication factor
● Replication Factor: Number of copies of a document in
a collection
4. • Shard: Logical slice of a collection; each shard has a name, hash
range, leader, and replication factor. Documents are assigned to
one and only one shard per collection using a hash-based
document routing strategy
• Replica: Solr index that hosts a copy of a shard in a collection;
behind the scenes, each replica is implemented as a Solr core
• Leader: Replica in a shard that assumes special duties needed to
support distributed indexing in Solr; each shard has one and only
one leader at any time, and leaders are elected using ZooKeeper
Terminology
6. Collection == Distributed Index
A collection is a distributed index defined by:
• named configuration stored in ZooKeeper
• number of shards: documents are distributed
across N partitions of the index
• document routing strategy: how documents get
assigned to shards
• replication factor: how many copies of each
document in the collection
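As an illustration of these four properties, the sketch below creates a collection through the Collections API using SolrJ. This is a minimal, hedged example: it assumes a recent SolrJ (5+, where the CloudSolrServer of the Solr 4.x era in these slides became CloudSolrClient), a ZooKeeper at localhost:2181, a config set already uploaded to ZooKeeper under the name "myconf", and the collection name "logs"; all of these values are hypothetical.

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateCollection {
    public static void main(String[] args) throws Exception {
        // Connect through ZooKeeper so the client sees live cluster state
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            // name, config set in ZK, number of shards, replication factor
            CollectionAdminRequest.createCollection("logs", "myconf", 2, 2)
                    .process(client);
        }
    }
}

With numShards=2 and replicationFactor=2, the cluster ends up hosting four cores: a leader and one replica for each of the two shards.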
8. ● Collection has a fixed number of shards
- existing shards can be split
● When to shard?
- Large number of docs
- Large document sizes
- Parallelization during indexing and queries
- Data partitioning (custom hashing)
Sharding
9. ● Each shard covers a hash range
● Default: Hash ID into 32-bit integer, map to range
- leads to (roughly) balanced shards
● Custom hashing (example in a few slides)
● Tri-level: app!user!doc
● Implicit: no hash range set for shards
Document Routing
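To make the default compositeId routing concrete, here is a simplified sketch of how a shardKey!docID value maps onto the 32-bit hash ring. Solr itself hashes with MurmurHash3 and combines the high 16 bits of the shard-key hash with the low 16 bits of the document-ID hash; the placeholder hash function below is an assumption for illustration only and will not reproduce Solr's actual hash values.

public class CompositeIdSketch {
    // Placeholder hash; Solr actually uses MurmurHash3 (x86, 32-bit)
    static int hash(String s) {
        return s.hashCode();
    }

    // Composite hash: high 16 bits from the shard key, low 16 bits from the doc id
    static int routeHash(String id) {
        int bang = id.indexOf('!');
        if (bang < 0) {
            return hash(id); // plain id: full 32-bit hash
        }
        int keyBits = hash(id.substring(0, bang)) & 0xFFFF0000;
        int docBits = hash(id.substring(bang + 1)) >>> 16;
        return keyBits | docBits;
    }

    // Map the hash onto N equal slices of the unsigned 32-bit ring
    static int shardFor(String id, int numShards) {
        long h = routeHash(id) & 0xFFFFFFFFL;
        long rangeSize = (1L << 32) / numShards;
        return (int) Math.min(numShards - 1, h / rangeSize);
    }

    public static void main(String[] args) {
        // Both ids share the "httpd" shard key, so both print the same shard index
        System.out.println(shardFor("httpd!21", 4));
        System.out.println(shardFor("httpd!33", 4));
    }
}

With a power-of-two shard count, slice boundaries are aligned to multiples of 2^16, so ids sharing a shard key (and therefore the same high 16 hash bits) always land on the same shard.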
10. • Why replicate?
- High-availability
- Load balancing
● How does it work in SolrCloud?
- Near-real-time, not master-slave
- Leader forwards to replicas in parallel,
waits for response
- Error handling during indexing is tricky
Replication
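Replicas can be added to a live collection to absorb more query load. A hedged SolrJ sketch, reusing the hypothetical "logs" collection and assuming a shard named "shard1" (Solr's default naming):

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class AddReplica {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            // Place one more copy of shard1; Solr picks a node unless told otherwise
            CollectionAdminRequest.addReplicaToShard("logs", "shard1")
                    .process(client);
        }
    }
}

The new replica recovers its index from the shard leader before it starts serving traffic.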
13. 1. Get cluster state from ZK
2. Route document directly to
leader (hash on doc ID)
3. Persist document to durable
storage (tlog)
4. Forward to healthy replicas
5. Acknowledge write success to
client
Distributed Indexing
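On the client side, SolrJ's ZK-aware CloudSolrClient performs steps 1-2 itself: it watches cluster state in ZooKeeper and sends each document directly to the leader of its target shard. A minimal sketch with the same hypothetical cluster and collection as before:

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IndexDocument {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("logs");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "42");
            doc.addField("level_s", "INFO");

            // Goes straight to the shard leader, which persists to its tlog,
            // forwards to healthy replicas, then acknowledges the write
            client.add(doc);
            client.commit();
        }
    }
}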
14. ● Additional responsibilities during indexing only! Not a
master node
● Leader is a replica (handles queries)
● Accepts update requests for the shard
● Increments the _version_ on the new or updated doc
● Sends updates (in parallel) to all replicas
Shard Leader
15. Distributed Queries
1. Query client can be ZK-aware or just
query via a load balancer
2. Client can send query to any node in the
cluster
3. Controller node distributes the query to
a replica for each shard to identify
documents matching the query
4. Controller node sorts the results from
step 3 and issues a second query for all
fields for a page of results
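The two phases in steps 3-4 are transparent to clients; a single SolrJ query triggers the scatter-gather internally. A sketch, again with hypothetical names:

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DistributedQuery {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("logs");

            // One request: Solr fans it out to a replica of each shard,
            // merges and sorts the per-shard results, then fetches one page of fields
            SolrQuery q = new SolrQuery("level_s:ERROR");
            q.setStart(0);
            q.setRows(10);

            QueryResponse rsp = client.query(q);
            System.out.println("matches: " + rsp.getResults().getNumFound());
        }
    }
}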
16. Scalability / Stability Highlights
● All nodes in the cluster perform indexing and execute
queries; no master node
● Distributed indexing: no SPoF, high throughput via
direct updates to leaders, automated failover to a new
leader
● Distributed queries: add replicas to scale out QPS;
parallelize complex query computations; fault tolerance
● Indexing / queries continue so long as there is 1 healthy
replica per shard
17. SolrCloud and CAP
● A distributed system should be: Consistent, Available, and
Partition tolerant
● CAP says pick 2 of the 3! (slightly more nuanced than that
in reality)
● SolrCloud favors consistency over write-availability (CP)
● All replicas in a shard have the same data
● Active replica sets concept (writes accepted so long as a
shard has at least one active replica available)
18. SolrCloud and CAP
• No tools to detect or fix consistency issues in Solr
– Reads go to one replica; no concept of quorum
– Writes must fail if consistency cannot be
guaranteed (SOLR-5468)
19. ZooKeeper
● Is a very good thing ... clusters are a zoo!
● Centralized configuration management
● Cluster state management
● Leader election (shard leader and overseer)
● Overseer distributed work queue
● Live Nodes
– Ephemeral znodes used to signal a server is gone
● Needs 3 nodes for quorum in production
20. ZooKeeper: Centralized Configuration
● Store config files in ZooKeeper
● Solr nodes pull config during core initialization
● Config sets can be “shared” across collections
● Changes are uploaded to ZK and then collections should be reloaded
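After uploading changed config files to ZooKeeper (for example with the zkcli.sh upconfig command that ships with Solr), the reload is a single Collections API call. A hedged SolrJ sketch for the hypothetical "logs" collection:

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class ReloadAfterConfigChange {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            // Every core of every shard re-reads the config set from ZooKeeper
            CollectionAdminRequest.reloadCollection("logs").process(client);
        }
    }
}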
22. Overseer
● What does it do?
– Persists collection state change events to ZooKeeper
– Controller for Collection API commands
– Ordered updates
– One per cluster (for all collections); elected using leader election
● How does it work?
– Asynchronous (pub/sub messaging)
– ZooKeeper as distributed queue recipe
– Automated failover to a healthy node
– Can be assigned to a dedicated node (SOLR-5476)
23. Custom Hashing
● Route documents to specific shards based on a shard key
component in the document ID
● Send all log messages from the same system to the
same shard
● Direct queries to specific shards: q=...&_route_=httpd
{
  "id" : "httpd!2",
  "level_s" : "ERROR",
  "lang_s" : "en",
  ...
},
Hash: shardKey!docID
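Tying the example together: the sketch below indexes the document from this slide under the compositeId "httpd!2", then restricts a query to the shard owning the httpd key via the _route_ parameter (the trailing "!" marks the value as a shard-key prefix). Cluster and collection names are hypothetical, as before.

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class RoutedQuery {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("logs");

            // The "httpd!" prefix routes all httpd docs to the same shard
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "httpd!2");
            doc.addField("level_s", "ERROR");
            doc.addField("lang_s", "en");
            client.add(doc);
            client.commit();

            // _route_ sends the query only to the shard(s) owning this key
            SolrQuery q = new SolrQuery("level_s:ERROR");
            q.set("_route_", "httpd!");
            System.out.println(client.query(q).getResults().getNumFound());
        }
    }
}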
24. Custom Hashing Highlights
● Co-locate documents having a common property in the same
shard
- e.g. docs having IDs httpd!21 and httpd!33 will
be in the same shard
• Scale up the replicas for specific shards to address high query
and/or indexing volume from specific apps
• Not as much control over the distribution of keys
- httpd, mysql, and collectd may all land in the same shard
• Can split unbalanced shards when using custom hashing
25. • Can split shards into two sub-shards
• Live splitting! No downtime needed!
• Requests start being forwarded to sub-shards
automatically
• Expensive operation: Use as required during low
traffic
Shard Splitting
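A SPLITSHARD call via SolrJ, hedged as before ("logs" and "shard1" are hypothetical). The parent shard keeps serving requests while the two sub-shards are built, and traffic switches over once they become active:

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class SplitShard {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            // shard1's hash range is divided between sub-shards shard1_0 and shard1_1
            CollectionAdminRequest.splitShard("logs")
                    .setShardName("shard1")
                    .process(client);
        }
    }
}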
26. Other features / highlights
• Near-Real-Time Search: Documents are visible within a
second or so after being indexed
• Partial Document Update: Just update the fields you need to
change on existing documents
• Optimistic Locking: Ensure updates are applied to the correct
version of a document
• Transaction log: Better recoverability; peer-sync between nodes
after hiccups
• HTTPS
• Use HDFS for storing indexes
• Use MapReduce for building the index (SOLR-1301)
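Partial updates and optimistic locking combine naturally: send only the changed field as an atomic "set" operation, and include the _version_ value from a prior read so Solr rejects the write (HTTP 409 conflict) if the document changed in between. A hedged SolrJ sketch; the id and the version value are stand-ins:

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdate {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("logs");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "httpd!2");
            // Atomic update: change only level_s, leave other fields intact
            doc.addField("level_s", Collections.singletonMap("set", "FATAL"));
            // Optimistic locking: must match the version read earlier;
            // 1234567890123456789L is a stand-in for a real _version_ value
            doc.addField("_version_", 1234567890123456789L);

            client.add(doc);
            client.commit();
        }
    }
}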
28. Attributions
• Tim Potter's slides on “Introduction to SolrCloud” at
Lucene/Solr Exchange 2014
– http://twitter.com/thelabdude
• Erik Hatcher's slides on “Solr: Search at the speed of
light” at JavaZone 2009
– http://twitter.com/ErikHatcher