2. Introductions
• @WalmartLabs – Building Walmart Global eCommerce from the ground up
• Data Foundation Team – Build, manage, and provide tools for all OLTP operations
3. Large Scale eCommerce problems
• Our customers love to shop online 24x7, and we love them for that
• Reads outnumber writes by many orders of magnitude, and reads have to be blazing fast (every millisecond has a monetary value attached to it, according to some studies)
• Scaling up only takes you so far; you have to scale out
• Low-latency analytics absolutely canNOT run on OLTP data stores
• No full table scans
• Too many RDBMS column indexes lead to slow writes
5. Very large scale and always available means..
• There is really NO way around Brewer’s CAP theorem
Source: http://blog.mccrory.me/2010/11/03/cap-theorem-and-the-clouds/
• Embrace “eventual” consistency and asynchrony
• Clearly articulate “eventual” to business stakeholders. Computer “eventual” and human “eventual” are entirely different scales.
7. Typical data flow into EC data stores
[Architecture diagram: web service clients call the EC web service and an orchestrator service fronting the resource tiers. Writes flow into Kafka and an event-driven updater, with a Kafka consumer for Solr and a Kafka consumer for Hadoop. A batch layer processes data on Hadoop and loads it into the faster serving datastore; jobs are fired and results pulled back. Reads, which make up 70-80% of total load, are served from SolrCloud.]
8. Challenges
• Messaging system: Kafka was already being used and supported by our Big Fast Data team
• Virtualization
– Shared CPU and memory among compute tenants is generally bad for search engine infrastructure. If your use case takes off, you will eventually move to dedicated hardware.
– We started with big dedicated bare-metal hardware
– Virtualization requires complete lifecycle management
• Serialization format
– Our choice: Avro (schema + data)
• Hierarchical Object to Flat
– If you are familiar with ElasticSearch, you’d say “No problem…maybe”
– If you are already using HBase or Cassandra or similar, you’d say “No problem…maybe”
– For Solr people, let’s talk about schema.xml and plugin-based flattening
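The core of the flattening problem is turning a nested domain object into the flat key/value document Solr expects. A minimal sketch of one common approach (dot-separated key paths; this is illustrative, not the actual WalmartLabs plugin):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Flattener {
    // Recursively flatten a nested map into dotted keys, e.g.
    // {address={city=X}} becomes {address.city=X}. A hypothetical
    // sketch of the plugin-based flattening the slide alludes to.
    public static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flatten("", nested, flat);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flatten(String prefix, Map<String, Object> src,
                                Map<String, Object> dest) {
        for (Map.Entry<String, Object> e : src.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                flatten(key, (Map<String, Object>) e.getValue(), dest);
            } else {
                dest.put(key, e.getValue());
            }
        }
    }
}
```

Each dotted key can then be mapped onto a Solr field (static or dynamic). Repeated nested structures (lists of objects) need extra conventions and are where the “…maybe” comes in.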
9. SolrCloud 101
• Solr is the web app wrapper on Lucene
• SolrCloud is distributed search, where a bunch of Solr nodes coordinate using ZooKeeper
Source: SolrCloud Wiki
10. Solr schema.xml choices
• Let each team build their own schema.xml from scratch
– This would require each customer team to intimately learn search engines, Solr, etc.
– This would also mean that each time there is a change in schema.xml, everything must be re-indexed.
• Leverage Solr’s dynamic fields and create a naming convention
– This gives the customer a kick-start
– schema.xml doesn’t need to change often and can be used mostly unchanged from team to team
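A suffix-based naming convention over Solr dynamic fields might look like the following (field names and types here are illustrative examples in the style of Solr’s stock schema, not the actual Walmart schema):

```xml
<!-- Illustrative schema.xml fragment: the suffix encodes the field type -->
<dynamicField name="*_s"  type="string"       indexed="true" stored="true"/>
<dynamicField name="*_i"  type="int"          indexed="true" stored="true"/>
<dynamicField name="*_l"  type="long"         indexed="true" stored="true"/>
<dynamicField name="*_t"  type="text_general" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date"         indexed="true" stored="true"/>
```

A team can then index fields like `order_id_l` or `status_s` with no schema change and no re-index of existing data.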
11. Best possible (unrealistic) scenario
• No writes
• No scoring, sorting, faceting
• 100% document cache hit ratio
• 99.6% of 192GB physical memory usage
• 2000+ select/sec
• 0.3 ms/query
14. Getting Worse..
• Hundreds of ms/query with close to 100% Doc cache hit ratio
15. Most common causes of slowdowns
• GC pauses. Cure: trial-and-error with help from experts
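A common first step in that trial-and-error is turning on GC logging so pauses can be correlated with slow queries. These HotSpot flags (Java 8-era syntax, matching this deck’s timeframe) are a typical starting point; heap sizes, collector choice, and paths are illustrative and workload-specific:

```shell
# Hedged example: GC logging flags often used when diagnosing Solr pauses.
# Heap size, collector, and log path are placeholders, not recommendations.
java -Xms16g -Xmx16g \
     -XX:+UseG1GC \
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
     -Xloggc:/var/log/solr/gc.log \
     -jar start.jar
```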
16. More naïve mistakes..
• ZooKeeper on the same machine as Solr
– We did not experience this, as we knew this going in
• Frequent commits (in our case, DB-style: 1 doc/update + commit)
– DON’T commit after every update. Solr commit is very different
from DBMS commit. It opens up a new searcher and warms it up in
the background. “Too many on-deck searchers” warning is a telltale
sign
– Batch as many docs as your application can tolerate in a single
update post
– We chose batching docs for 1 sec
• IO contention (Log level too high)
– Easy fix
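The 1-second batching described above can be sketched as a buffer that flushes on whichever comes first, batch size or age (a minimal illustration, not the actual Walmart code; the flusher callback stands in for one Solr update POST):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal sketch of time/size-bounded batching: buffer documents and
// flush either when the batch is full or when the oldest buffered doc
// is ~1 second old, instead of committing after every single update.
public class DocBatcher<T> {
    private final int maxBatch;
    private final long maxAgeMillis;
    private final Consumer<List<T>> flusher; // e.g. one Solr update POST
    private final List<T> buffer = new ArrayList<>();
    private long oldest = -1;

    public DocBatcher(int maxBatch, long maxAgeMillis, Consumer<List<T>> flusher) {
        this.maxBatch = maxBatch;
        this.maxAgeMillis = maxAgeMillis;
        this.flusher = flusher;
    }

    public synchronized void add(T doc, long nowMillis) {
        if (buffer.isEmpty()) oldest = nowMillis;
        buffer.add(doc);
        if (buffer.size() >= maxBatch || nowMillis - oldest >= maxAgeMillis) {
            flush();
        }
    }

    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            flusher.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

Commit policy then lives entirely on the Solr side (e.g. autoCommit/autoSoftCommit), so the application never issues explicit commits at all.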
17. ZooKeeper
• Prefer an odd number of nodes for the ensemble, as quorum is N/2 + 1
• More nodes are not necessarily better
– 3 nodes is too low, as you can handle only 1 failure
– 5 nodes is a good balance between HA and write speed. More nodes mean slower writes and slower quorums.
– We had to go with 9: 3 nodes in each of 3 clouds, which protects us from a complete outage in one cloud.
• Pay close attention to ZooKeeper availability, as SolrCloud will only function for a little while after ZK is dead
• CloudSolrServer (SolrJ client) completely relies on ZooKeeper for talking to SolrCloud
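The ensemble-sizing argument above is just integer arithmetic, sketched here: an ensemble of N nodes needs floor(N/2) + 1 votes, so it tolerates N minus quorum failures, and adding one node to an odd ensemble raises write cost without raising failure tolerance.

```java
// Quorum arithmetic from the slide: quorum(N) = N/2 + 1 (integer
// division), tolerated failures = N - quorum(N). Note quorum(4) == 3
// tolerates only 1 failure, same as a 3-node ensemble: hence "prefer odd".
public class ZkQuorum {
    public static int quorum(int nodes)            { return nodes / 2 + 1; }
    public static int toleratedFailures(int nodes) { return nodes - quorum(nodes); }
}
```

For the 9-node ensemble above: quorum is 5, so losing one 3-node cloud (and even one more node) still leaves a majority.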
18. How do you do Disaster Recovery?
• SolrCloud is a CP system (CAP theorem)
• You should not add a replica from another data center; every write will get excruciatingly slow
• Use Kafka or other messaging system to send data cross-DC
• Get used to cross-DC eventual consistency. Monitor for tolerance
thresholds
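One simple shape for that tolerance-threshold monitoring is comparing the latest write timestamp in the source DC against the latest update applied in the remote DC (a hypothetical sketch; the class name and threshold are illustrative, not from the talk):

```java
// Hypothetical sketch: flag cross-DC replication lag that exceeds a
// business-agreed tolerance threshold.
public class ReplicationLagMonitor {
    private final long toleranceMillis;

    public ReplicationLagMonitor(long toleranceMillis) {
        this.toleranceMillis = toleranceMillis;
    }

    /** @param sourceTs  timestamp of the latest write in the source DC
     *  @param appliedTs timestamp of the latest update applied remotely */
    public boolean withinTolerance(long sourceTs, long appliedTs) {
        return sourceTs - appliedTs <= toleranceMillis;
    }
}
```

The threshold itself is the “eventual” number agreed with business stakeholders, per the earlier slide.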
19. Metrics Monitoring
• We poll metrics from MBeans and push them to Graphite servers
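The poll-and-push step amounts to reading a value off an MBean and emitting a Graphite plaintext-protocol line ("path value timestamp"). A minimal sketch using a standard JVM MBean (the metric path is illustrative; the talk's actual metrics were Solr MBeans, and in production the line is written on a schedule to the Graphite server's TCP port, typically 2003):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class GraphitePoller {
    // Graphite's plaintext protocol: "metric.path value epoch-seconds"
    public static String graphiteLine(String path, long value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds;
    }

    // Example poll: current JVM heap usage from the standard MemoryMXBean.
    // The "solr.host1..." path is a hypothetical naming convention.
    public static String heapUsedLine(long epochSeconds) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long used = mem.getHeapMemoryUsage().getUsed();
        return graphiteLine("solr.host1.jvm.heapUsed", used, epochSeconds);
    }
}
```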